Proceedings of the National Academy of Sciences of the United States of America. 2017 Feb 27;114(11):E2195–E2204. doi: 10.1073/pnas.1616173114

Aberrant splicing in maize rough endosperm3 reveals a conserved role for U12 splicing in eukaryotic multicellular development

Christine M Gault a,b,1,2, Federico Martin a,b,1,3, Wenbin Mei c, Fang Bai b, Joseph B Black b, W Brad Barbazuk a,c,d, A Mark Settles a,b,d,4
PMCID: PMC5358371  PMID: 28242684

Significance

The last eukaryotic common ancestor had two spliceosomes. The major spliceosome acts on nearly all introns, whereas the minor spliceosome removes rare, U12-type introns. Based on in vitro RNA-splicing assays, the RGH3/ZRSR2 RNA-splicing factor has functions in both spliceosomes. Here, we show that the maize rgh3 mutant allele primarily disrupts U12 splicing, similar to human ZRSR2 mutants, indicating a conserved in vivo function in the minor spliceosome. These mutant alleles block cell differentiation leading to overaccumulation of stem cells in endosperm and blood, respectively. We found extensive conservation between maize and human U12-type intron-containing genes, demonstrating that a common genetic architecture controls at least a subset of cell differentiation pathways in both plants and animals.

Keywords: minor spliceosome, cell differentiation, stem cell, myelodysplastic syndrome, maize endosperm

Abstract

RNA splicing of U12-type introns functions in human cell differentiation, but it is not known whether this class of introns has a similar role in plants. The maize ROUGH ENDOSPERM3 (RGH3) protein is orthologous to the human splicing factor, ZRSR2. ZRSR2 mutations are associated with myelodysplastic syndrome (MDS) and cause U12 splicing defects. Maize rgh3 mutants have aberrant endosperm cell differentiation and proliferation. We found that most U12-type introns are retained or misspliced in rgh3. Genes affected in rgh3 and ZRSR2 mutants identify cell cycle and protein glycosylation as common pathways disrupted. Transcripts with retained U12-type introns can be found in polysomes, suggesting that splicing efficiency can alter protein isoforms. The rgh3 mutant protein disrupts colocalization with a known ZRSR2-interacting protein, U2AF2. These results indicate conserved function for RGH3/ZRSR2 in U12 splicing and a deeply conserved role for the minor spliceosome to promote cell differentiation from stem cells to terminal fates.


Most eukaryotic transcripts contain introns that are removed by dynamic ribonucleoprotein complexes known as spliceosomes (1). Spliceosomes include hundreds of RNA-splicing factors that influence splice site selection. Many eukaryotic lineages have two different spliceosomes (2). The major spliceosome removes the common U2-type introns, and the minor spliceosome removes rare U12-type introns. U2- and U12-type introns are recognized by different consensus sequences at the splice sites and branch point (3). The significance of maintaining separate splicing machinery for these rare introns is not well defined. U12-type intron splicing efficiency increases in response to cell stress signaling in HeLa cells (4). Reduction in U12-type splicing causes developmental defects in Arabidopsis, Drosophila, zebrafish, and humans (5–11).

Human ZRSR2 is required for the second transesterification reaction in U2 splicing and takes part in the initial assembly of the minor spliceosome in human cell extracts (12, 13). Somatic mutations in ZRSR2 are found in patients with myelodysplastic syndrome (MDS) (14–16). MDS is a blood cell differentiation disorder that causes an increase in undifferentiated blasts, abnormal myeloid cells, and a decrease in fully differentiated myeloid cell types (17). MDS subtypes with ZRSR2 mutations more frequently progress to acute myeloid leukemia, and consequently, loss of ZRSR2 is considered a driver toward cancer (17). RNA-sequencing (RNA-seq) analysis of ZRSR2 mutants found reduced U12 splicing with U2-type introns largely unaffected (18).

The hypomorphic rough endosperm3 (rgh3) allele in maize disrupts the ZRSR2 ortholog, seed development, and plant viability (19). Endosperm cell differentiation is defective and delayed in rgh3 mutants, allowing mutant cells to proliferate in tissue culture at late developmental stages when WT endosperm is unable to grow (19) (Fig. S1). Based on a survey of alternatively spliced transcripts in maize, only a few genes have been identified with altered transcript isoform abundance in rgh3 mutants (19).

Fig. S1.

Extended proliferation of rgh3 endosperm tissue cultures. (A) Normal sibling and rgh3 mutant endosperm culture plates at 30–35 d of culture. Endosperm tissues were plated from kernels at 7, 10, and 16 d after pollination (DAP). (B) Frequency of callus growth from rgh3 and normal sibling endosperm tissues after 30 d of culture. Results are from 50 endosperm tissues for each developmental stage and genotype. These data are independent replicates of figure 4 and figure S2 of ref. 19.

Results

Missplicing of U12-Type Introns in rgh3.

We determined the genome-wide effect of rgh3 on mRNA splicing with RNA-seq and isoform expression analysis from homozygous rgh3 and WT sibling root and shoot tissues. Cufflinks (20) predicted 46 genes had altered isoform use in rgh3 (Table S1). Localized differences in read coverage depth between rgh3 and WT libraries were used to design semiquantitative RT-PCR assays for nine of the genes identified as differentially spliced (Fig. 1A and Fig. S2). Sequencing of the amplified products revealed that intron retention occurs more frequently in rgh3 mutants at seven of these loci (Fig. 1B and Figs. S3 and S4). Some rgh3 amplifications exhibited slow-migrating bands due to heteroduplex formation of differently sized RT-PCR products (Fig. S5). Six of the validated splicing differences have U12-type introns as defined by the ERISdb plant splice site database (21). The intron retained in the seventh gene, GRMZM2G133028, is likely to be a U12-type intron. Although the 5′-splice site and branch point in GRMZM2G133028 are somewhat diverged from the U12-type consensus, the six additional plant species analyzed within ERISdb have U12-type introns at the orthologous intron position in this conserved, plant-specific gene (Fig. S6).

Table S1.

Genes with differentially expressed transcript isoforms based on Cufflinks statistics

| Gene | ERISdb annotation | WT vs. rgh3 comparison | (JSD)^1/2 | Cufflinks test statistic | FDR-corrected P value | RT-PCR |
| GRMZM2G000665 | U12 intron | Shoots | 0.831 | 4.59E-05 | 1.86E-03 | |
| GRMZM2G007981 | | Shoots | 0.833 | 2.11E-04 | 1.86E-03 | |
| GRMZM2G012841 | | Shoots | 0.401 | 3.12E-12 | 1.86E-03 | |
| GRMZM2G059671 | | Shoots | 0.831 | 1.73E-10 | 1.86E-03 | |
| GRMZM2G061596 | | Shoots | 0.832 | 1.49E-06 | 1.86E-03 | |
| GRMZM2G083655 | | Shoots | 0.832 | 1.16E-08 | 1.86E-03 | |
| | | Roots | 0.831 | 8.87E-09 | 2.97E-03 | |
| GRMZM2G096600 | | Shoots | 0.831 | 7.01E-05 | 1.86E-03 | No difference |
| GRMZM2G097170 | U12 intron | Shoots | 0.833 | 4.42E-09 | 1.86E-03 | |
| | | Roots | 0.832 | 2.67E-04 | 2.97E-03 | |
| GRMZM2G098423 | | Shoots | 0.831 | 0.00E+00 | 1.86E-03 | |
| GRMZM2G110277 | | Shoots | 0.801 | 1.10E-11 | 1.86E-03 | |
| GRMZM2G111954 | | Shoots | 0.833 | 2.03E-11 | 1.86E-03 | |
| GRMZM2G119640 | | Shoots | 0.683 | 2.22E-16 | 1.86E-03 | |
| GRMZM2G130432 | U12 intron | Shoots | 0.733 | 8.71E-12 | 1.86E-03 | Misspliced |
| | | Roots | 0.656 | 2.99E-10 | 2.65E-02 | |
| GRMZM2G137847 | | Shoots | 0.833 | 3.75E-12 | 1.86E-03 | |
| GRMZM2G175398 | | Shoots | 0.828 | 3.40E-08 | 1.86E-03 | |
| GRMZM2G350312 | | Shoots | 0.830 | 4.38E-06 | 1.86E-03 | |
| GRMZM2G476538 | | Shoots | 0.833 | 1.78E-08 | 1.86E-03 | |
| GRMZM2G480607 | | Shoots | 0.682 | 1.05E-09 | 1.86E-03 | |
| GRMZM5G820727 | U12 intron | Shoots | 0.685 | 4.80E-14 | 1.86E-03 | Misspliced |
| GRMZM2G000842 | | Roots | 0.797 | 2.47E-09 | 2.97E-03 | |
| GRMZM2G007453 | | Roots | 0.832 | 0.00E+00 | 2.97E-03 | |
| GRMZM2G048846 | | Roots | 0.220 | 9.10E-08 | 2.97E-03 | |
| GRMZM2G057646 | | Roots | 0.184 | 3.67E-08 | 2.97E-03 | |
| GRMZM2G077233 | | Roots | 0.827 | 8.49E-12 | 2.97E-03 | |
| GRMZM2G127911 | | Roots | 0.833 | 1.17E-07 | 2.97E-03 | |
| GRMZM2G138566 | | Roots | 0.831 | 1.23E-08 | 2.97E-03 | |
| GRMZM2G336533 | | Roots | 0.831 | 4.81E-07 | 2.97E-03 | |
| GRMZM2G454550 | | Roots | 0.831 | 3.51E-10 | 2.97E-03 | |
| GRMZM2G408305 | U12 intron | Shoots | 0.725 | 5.60E-09 | 3.54E-03 | Misspliced |
| GRMZM2G068710 | | Shoots | 0.588 | 1.69E-09 | 3.85E-03 | |
| GRMZM2G097568 | U12 intron | Shoots | 0.777 | 8.50E-08 | 3.85E-03 | Misspliced |
| | | Roots | 0.783 | 2.31E-10 | 4.09E-03 | |
| GRMZM2G113619 | | Shoots | 0.288 | 3.05E-05 | 3.85E-03 | |
| GRMZM2G166659 | | Roots | 0.120 | 6.45E-08 | 6.29E-03 | |
| GRMZM2G045257 | | Shoots | 0.535 | 7.73E-07 | 6.64E-03 | |
| GRMZM5G848692 | | Roots | 0.282 | 1.51E-06 | 1.52E-02 | |
| GRMZM2G306935 | U12 intron | Roots | 0.716 | 2.18E-10 | 1.63E-02 | Misspliced |
| GRMZM2G011636 | U12 intron | Roots | 0.664 | 1.88E-10 | 2.00E-02 | Misspliced |
| GRMZM2G021549 | | Roots | 0.770 | 9.30E-08 | 2.00E-02 | |
| GRMZM2G041418 | | Roots | 0.789 | 6.74E-06 | 2.00E-02 | |
| GRMZM2G133028 | U2 intron* | Shoots | 0.687 | 6.75E-07 | 2.62E-02 | Misspliced |
| | | Roots | 0.731 | 4.63E-07 | 2.65E-02 | |
| GRMZM2G103152 | | Roots | 0.656 | 7.43E-10 | 2.65E-02 | No difference |
| GRMZM2G111912 | | Shoots | 0.433 | 1.58E-06 | 2.72E-02 | |
| GRMZM2G036837 | | Roots | 0.322 | 2.62E-08 | 3.71E-02 | |
| GRMZM2G032348 | | Roots | 0.706 | 5.83E-07 | 3.77E-02 | |
| GRMZM2G014709 | | Roots | 0.580 | 1.03E-10 | 3.88E-02 | |
| GRMZM2G065655 | | Shoots | 0.771 | 3.84E-03 | 4.66E-02 | |

JSD is the Jensen–Shannon divergence.

*The retained intron in GRMZM2G133028 has a 5′-splice site similar to the U12-type consensus and a potential U12-type branch point (Fig. S6).
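The square root of the JSD reported in Table S1 can be illustrated with a minimal sketch. It assumes the divergence is computed with the natural logarithm over the relative isoform abundances of one gene; the function name and the example isoform fractions are hypothetical and are not taken from the Cufflinks source. Under that assumption the largest possible value is the square root of ln 2, about 0.833, which is consistent with the largest values in the table and would correspond to genotypes that share no isoform.

```python
import math

def js_distance(p, q):
    """Square root of the Jensen-Shannon divergence (natural log) between
    two discrete distributions, e.g. isoform fractions of one gene."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    # Normalize so that each vector sums to 1.
    sp, sq = sum(p), sum(q)
    p = [x / sp for x in p]
    q = [x / sq for x in q]
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; terms with a_i == 0 contribute 0.
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)

    jsd = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(jsd, 0.0))

# Hypothetical isoform fractions (spliced, intron-retained) for one gene.
wt = [1.0, 0.0]
mutant = [0.0, 1.0]
print(round(js_distance(wt, mutant), 3))  # no shared isoform: sqrt(ln 2)
print(round(js_distance(wt, wt), 3))      # identical isoform use
```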

Fig. 1.

Minor splicing is compromised in rgh3. (A) WT and rgh3 root RNA-seq read depth for GRMZM2G011636. Transcript models show the U12-type intron with increased read depth in rgh3. A brace symbol indicates the region amplified in B. (B) RT-PCR of normal (Rgh3 +) and rgh3 mutant (Rgh3 −) RNA from root, shoot, kernel, embryo, starchy endosperm (SE), basal endosperm transfer cell layer (BETL), and endosperm culture (EC) tissues. Schematic shows sequenced amplification products with forward and reverse primers indicated by arrows. (C–F) Intron read depth and PSO metrics with U2-type introns indicated in orange and U12-type introns indicated in blue. (C) Scatterplot showing intron read depth normalized for gene expression. Black diagonal line indicates equivalent read depth, and dotted lines are twofold differences. (D) Distribution of Welch’s t test P values. Lines were fit with a sliding window average of 0.3 log units. Gray line, P = 0.05. (E) Scatterplot of PSO metrics. Introns with fewer than 10 exon–exon junction reads in both WT and rgh3 are not plotted. (F) Distribution of splicing differences between WT and rgh3 based on ΔPSO metrics.

Fig. S2.

Genes identified by Cufflinks with differential splicing in rgh3 mutants. Root RNA-seq read depth for three WT and three rgh3 mutant replicates are shown. Annotated transcript models have U12-type introns labeled. A brace symbol indicates the region amplified in RT-PCR validation experiments shown in Fig. S3. Partial gene models are shown for (A) GRMZM2G408305, (B) GRMZM2G097568, and (D) GRMZM2G133028 to adequately resolve intron read depth for the region amplified. Full gene models are shown for (C) GRMZM2G306935, (E) GRMZM2G130432, and (F) GRMZM5G820727.

Fig. S3.

RT-PCR validation of rgh3 splicing defects predicted by Cufflinks in diverse tissue types. Gene-specific primers amplified cDNA from roots, shoots, whole kernels, embryos, starchy endosperm (SE), endosperm tissue enriched for the basal endosperm transfer cell layer (BETL), and endosperm culture (EC). For root and shoot, the Rgh3 + labels indicate homozygous WT Rgh3 tissues. The Rgh3 + labels for all other tissues indicate phenotypically WT samples that are either homozygous or heterozygous for the WT Rgh3 gene. Rgh3 – labels indicate homozygous mutant samples. Panel order is the same as in Fig. S2. (A–G) Transcript diagrams and RT-PCR products are shown for the following: (A) GRMZM2G408305, (B) GRMZM2G097568, (C) GRMZM2G306935, (D) GRMZM2G133028, (E) GRMZM2G130432, (F) GRMZM5G820727, and (G) a ubiquitin control. All transcript diagrams are based on cloned, sequenced products and drawn to the same scale as in A. Arrows indicate forward and reverse primers used for RT-PCR.

Fig. S4.

Genes identified by Cufflinks as having differential splicing in rgh3 mutants that failed to validate in RT-PCR experiments. (A and B) RNA-seq read depth for three WT and three rgh3 mutant replicates from root libraries are shown for the full gene models of (A) GRMZM2G096600 and (B) GRMZM2G103152. A brace symbol indicates the region amplified in RT-PCR validation experiments. (C) RT-PCR analysis of the same RNA used for two of the three RNA-seq libraries. Size markers as well as the expected size of the annotated transcript variants and a genomic fragment are indicated.

Fig. S5.

Mixed PCR products from alternatively spliced cDNA form slow-migrating heteroduplex products. Purified plasmids containing inserts of transcript fragments with the intron spliced out (S) and the intron retained (R) were amplified using gene-specific primers to show expected sizes. The S and R plasmids for each gene also were combined in a 1:1 ratio as template for the S/R reaction. The S/R templates amplify two bands of expected size as well as additional slow-migrating, heteroduplex PCR products similar to those observed in RT-PCRs that have significant levels of misspliced or intron retention products such as GRMZM2G011636 (Fig. 1B) or GRMZM2G106613 (Fig. 2D).

Fig. S6.

Intron affected in GRMZM2G133028 is a U12-type intron. Sequence comparison of the U12-like intron in GRMZM2G133028 with U12-type introns of orthologs from ERISdb. The last five codons of the upstream exon, the 5′-splice site, the branch point, and the downstream exon are shown for the following: Zea mays GRMZM2G133028 (Zm), Oryza sativa LOC_Os01g15790 (Os), Arabidopsis thaliana AT4G36440 (At), Vitis vinifera Vv04s0023g02560 (Vv), Glycine max GLYMA02G05950 (Gm), Selaginella moellendorffii SELMODRAFT_176312/ SELMODRAFT_102953 (Sm), and Physcomitrella patens PP1S6_51V6 /PP1S83_251V6 (Pp). Peptide sequences are shown below the exons. The U12-type 5′-splice site is underlined, and the U12-type branch point in ERISdb is in red text. The homologous branch point in Zea mays is in blue text. Additional sequences consistent with a U12-type branch point are underlined and bold.

U12-type intron-containing genes account for <2% of the B73 filtered gene set. With 20% of Cufflinks predictions involving U12-type introns, we hypothesized that U12-type introns are misspliced on a global scale in rgh3. To allow for novel transcript isoforms to be detected, we analyzed intron read counts normalized by gene expression for all introns in the filtered gene set of the maize genome (Fig. 1C). Of the 446 nonredundant U12-type introns in ERISdb, 340 were expressed in the seedling libraries. Significantly more RNA-seq reads mapped to 240 U12-type introns in rgh3 libraries vs. WT libraries (Fig. 1D and Dataset S1). This represents 71% of the maize U12-type introns tested. Only three U12-type introns showed significantly fewer reads in rgh3 libraries, which is below the expected false-positive rate for the number of t tests completed. By contrast, 4% of the remaining 113,345 nonredundant introns within expressed genes had differences in the number of normalized reads mapped in rgh3 vs. WT (Fig. 1D and Dataset S2). Globally, fewer U2-type introns showed significant differences than the expected number of false-positive t tests, indicating that rgh3 specifically affects genes with U12-type introns.
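The per-intron comparison described above can be sketched as follows. This is an illustration only: it assumes that each intron read count is divided by the read count of its gene in the same library, and it computes Welch's t statistic and the Welch–Satterthwaite degrees of freedom with the standard library. The function names and the read counts are hypothetical, and obtaining a P value would additionally require the t distribution, which is not included here.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom for
    two samples with possibly unequal variances."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)  # sample variances (n - 1 denominator)
    se2 = va / na + vb / nb
    t = (mean(a) - mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

def normalize(intron_reads, gene_reads):
    """Intron read count scaled by the read count of its gene (an assumed
    normalization; the exact scheme is not given in this section)."""
    return [i / g for i, g in zip(intron_reads, gene_reads)]

# Hypothetical counts for one U12-type intron, three replicates per genotype.
wt = normalize([4, 6, 5], [1000, 1200, 1100])
mut = normalize([40, 55, 47], [1050, 1150, 1000])
t, df = welch_t(mut, wt)
print(f"t = {t:.2f}, df = {df:.2f}")
```

A positive t in this orientation means more normalized intron reads in the mutant, which is the direction reported for 240 of the 340 expressed U12-type introns.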

To quantify the extent of intron splicing defects, we calculated percent spliced out (PSO) for individual introns using exon–exon junction and intron reads (Fig. 1E). Based on Fisher’s exact tests of read counts, splicing defects were detected for 77% of U12-type introns (Dataset S1). The median difference between WT and rgh3 PSO (ΔPSO) was 62%, indicating extensive retention of U12-type introns in rgh3 (Fig. 1F). More than 13% of all other introns also showed statistically significant increased retention in rgh3, but the median ΔPSO for these introns was only 4%, indicating minor impacts on U2 splicing (Dataset S2).
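A minimal sketch of the PSO comparison follows. It assumes that PSO is the number of exon–exon junction reads expressed as a percentage of junction plus intron reads, and that the Fisher's exact test is applied to the 2 × 2 table of junction and intron read counts in WT and rgh3; any additional normalization used in the original analysis is not shown in this section. The function names and read counts are hypothetical.

```python
from math import comb

def pso(junction_reads, intron_reads):
    """Percent spliced out, assumed here to be junction reads as a
    percentage of junction plus intron reads."""
    total = junction_reads + intron_reads
    if total == 0:
        raise ValueError("no reads cover this intron")
    return 100.0 * junction_reads / total

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]],
    summing every table that is no more probable than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    p = sum(prob(x) for x in range(lo, hi + 1)
            if prob(x) <= p_obs * (1 + 1e-9))
    return min(p, 1.0)

# Hypothetical read counts for one U12-type intron.
wt_junction, wt_intron = 95, 5
mut_junction, mut_intron = 30, 70
delta_pso = pso(wt_junction, wt_intron) - pso(mut_junction, mut_intron)
p_value = fisher_exact_two_sided(wt_junction, wt_intron,
                                 mut_junction, mut_intron)
print(f"delta PSO = {delta_pso:.1f}, P = {p_value:.3g}")
```

With ΔPSO defined as WT minus rgh3, a positive value indicates that a smaller fraction of transcripts has the intron removed in the mutant.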

An additional nine genes showing significant differences in U12-type intron read depth and PSO were randomly selected for RT-PCR (Fig. 2 A–C and Fig. S7). All of these genes had splicing defects in rgh3 with three patterns of altered splicing: intron retention (Fig. 2D and Fig. S8 A–D), missplicing of the U12-type intron concomitant with retention of a downstream U2-type intron (Fig. 2E and Fig. S8E), and activation of cryptic, U2-type, 5′- and 3′-splice sites at the U12-type intron (Fig. 2F and Fig. S8 E and F). There were no differences in splice site consensus sequences between the U12-type introns that are misspliced in rgh3 and those with no significant difference in rgh3 mutants (Fig. S9). The range of splicing defects found in rgh3 is also observed in human bone marrow samples from MDS patients with ZRSR2 mutations (18). We conclude that rgh3 mutants, like human ZRSR2 mutants, are impaired in minor spliceosome function.

Fig. 2.

Diverse rgh3 splicing defects are associated with U12-type introns. (A–C) WT and rgh3 root RNA-seq read depth for (A) GRMZM2G106613, (B) GRMZM2G083620, and (C) GRMZM2G587327. Transcript models show U12-type introns with a brace symbol indicating the regions amplified in D–F. (D–F) RT-PCR of normal (Rgh3 +) and rgh3 mutant (Rgh3 −) RNA from root, shoot, kernel, embryo, starchy endosperm (SE), basal endosperm transfer cell layer (BETL), and endosperm culture (EC) tissues. Schematics show sequenced amplification products with forward and reverse primers indicated by arrows. (D) GRMZM2G106613 showed U12-type intron retention in rgh3. (E) GRMZM2G083620 had adjacent U12- and U2-type intron retention. (F) GRMZM2G587327 activated cryptic U2-type splice sites in rgh3.

Fig. S7.

RNA-seq read depth for experimentally validated genes with significantly more U12-type intron normalized expression. All panels show RNA-seq read depth for three WT and three rgh3 mutant replicates from root libraries. Annotated transcript models are shown with the U12-type intron labeled. A brace symbol indicates the region amplified in RT-PCR validation experiments shown in Fig. S8. Full gene models are shown for the following: (A) GRMZM2G131321, (B) GRMZM2G074015, (C) GRMZM2G040401, and (D) GRMZM2G033430. Partial gene models are shown for (E) GRMZM2G416751 and (F) GRMZM2G153434 to illustrate intron read depth adequately in the region amplified.

Fig. S8.

RT-PCR validation of rgh3 U12 splicing defects identified by normalized intron read depth analysis. (A–G) Gene-specific primers (arrows) amplified cDNA from roots, shoots, whole kernels, embryos, starchy endosperm (SE), endosperm tissue enriched for the basal endosperm transfer cell layer (BETL), and endosperm culture (EC). The seedling cDNA was derived from the same RNA used for RNA-seq as in Fig. 1. The cDNA for the seed and EC tissues are the same as in Fig. S3. Transcript diagrams and RT-PCR products are shown for (A) GRMZM2G131321, (B) GRMZM2G074015, (C) GRMZM2G040401, (D) GRMZM2G033430, (E) GRMZM2G416751, (F) GRMZM2G153434, and (G) ubiquitin. All transcript diagrams are based on cloned, sequenced RT-PCR product and are drawn to the same scale with 100 bp indicated in A. Arrows on the gene model schematics show forward- and reverse-primer binding sites.

Fig. S9.

Consensus splice sites for U12-type introns showing differential splicing in rgh3. Sequence logos are shown for 5′-splice sites, branch point sequences, 3′-splice sites, and exons downstream of U12-type introns that had significantly increased intron read depth in rgh3 (A) and nonsignificant U12-type introns (B). The last 3 bp of the upstream exon did not show any conserved nucleotides and is not shown.

Predicted Functions of rgh3 Misspliced Genes.

Pfam domains in the 230 maize genes with increased intron read depth in rgh3 showed no significant enrichment or deenrichment compared with all 308 expressed U12-type intron-containing genes in the RNA-seq experiment (Dataset S3). These analyses suggest most biological processes that are dependent upon U12 splicing are affected in rgh3. Many U12-type introns are found within genes involved in DNA replication, DNA repair, transcription, RNA processing, and translation (22, 23). Arabidopsis U12-type introns were used for prior cross-kingdom comparisons, but only 58% of maize U12-type intron-containing genes have an Arabidopsis ortholog that also contains a U12-type intron (21).

Pfam domains were analyzed at a global level to identify U12-dependent biological processes affected in rgh3 (Fig. 3A and Dataset S3). More than one-half of the domains with annotated roles in translation, endomembrane dynamics, and unknown functions were enriched in the 230 rgh3 misspliced genes relative to all genes tested for intron read depth differences. When all U12-type intron-containing genes are compared genome-wide, more than one-half of domains with roles in cell cycle, RNA processing, and protein folding/degradation also are enriched (Fig. 3A). These analyses support conserved functions for maize U12-type intron-containing genes with U12 spliceosome targets in other species.

Fig. 3.

U12 splicing defects in rgh3-affected genes involved in cell differentiation and growth. (A) Heat map of Pfam domains found in U12-type intron-containing genes. Red domains are enriched in genes with rgh3 U12 splicing defects relative to all maize genes tested for splicing defects. Blue indicates additional domains enriched in U12-type intron-containing genes relative to all maize genes. Gray and white indicate no enrichment of the domain relative either to genes tested for splicing defects or to all maize genes, respectively. (B) Cell cycle schematic showing human homolog gene symbols for maize U12-type intron-containing genes. Bold indicates U12 splicing defects in rgh3. Asterisks indicate splicing defects in human ZRSR2 mutants. (C–E) Endosperm expression of rgh3 misspliced genes (24). Cluster analysis of all genes (open triangles) and rgh3 misspliced genes (blue squares) is plotted for embryo (C) and endosperm (D). E/L, early/late developmental expression. Const., constitutive expression. (E) Example endosperm expression profiles for maize/human homologs with U12-type introns. (F) Cumulative frequency plot for U12-type intron positions in maize proteins (orange circles) with human homologs (blue squares) containing U12-type introns. Orange (maize) and blue (human) lines plot the expected normal distribution of each protein set. (G and H) Scatterplots of expression levels for misspliced genes with human homologs that also have U12-type introns (orange circles) and all other misspliced genes (blue squares). Dotted black line shows 1:1 ratio of WT to rgh3 expression. Dotted gray lines show a twofold ratio change.

Endosperm cells in rgh3 show aberrant differentiation of embryo surrounding region (ESR) and basal endosperm transfer layer (BETL) cells (19), which is analogous to blood cell differentiation defects in MDS patients (18). BETL and ESR differentiation occurs early in seed development. To examine developmental expression of rgh3-affected genes, we reanalyzed a transcriptome profile of maize seed development (24). Based on reported clusters of expression profiles, the 230 misspliced genes in rgh3 were overrepresented in early endosperm but not in early embryo development (Fig. 3 C and D). Approximately 78% of misspliced genes have a peak of expression between 6 and 12 d after pollination (Fig. 3E). These data are consistent with U12-type intron-containing genes playing a role in endosperm cell differentiation.

To explore mechanistic similarities between U12-dependent cell differentiation pathways in maize and human, we identified human homologs of maize genes with U12-type introns. BLASTP searches of human RefSeq proteins identified 233 maize U12-type intron-containing genes with a human homolog (Dataset S4). The 233 maize genes correspond to 154 human genes due to differences in gene redundancy in the maize and human genomes. We found two biological processes that account for 35% of the U12-type intron-containing maize genes with human homologs. First, 50 maize U12-type intron-containing genes have roles in cell cycle. Of these cell cycle genes, 29 are differentially spliced in rgh3, which correspond to 24 human homologs (Fig. 3B). Second, 33 maize genes with U12-type introns have predicted roles in protein glycosylation including the following: synthesis and transport of UDP-xylose, synthesis of dolichyl-diphosphooligosaccharide, secretion of glycosylated protein complexes, and the unfolded protein response pathway (Table S2). Twenty-five of the protein glycosylation genes are misspliced in rgh3 and correspond to 10 human homologs.

Table S2.

Maize–human homologs with predicted roles in protein glycosylation

| Maize gene | U12 splicing defect in rgh3 | Human gene symbol | Predicted function |
UDP-xylose synthesis and transport
| GRMZM2G032003 | Yes | UGP2 | UDP-glucose pyrophosphorylase 2 |
| GRMZM2G098370 | | UGP2 | |
| GRMZM2G007195 | Yes | UXS1 | UDP-glucuronate decarboxylase 1 |
| GRMZM2G007404 | Yes | UXS1 | |
| GRMZM2G347717 | Yes | UXS1 | |
| GRMZM2G359234 | Yes | UXS1 | |
| GRMZM2G370048 | Yes | UXS1 | |
| GRMZM2G381473 | Yes | UXS1 | |
| GRMZM2G000632 | Yes | GALE | UDP-galactose-4-epimerase |
| GRMZM2G040397 | Yes | GALE | |
| GRMZM2G145460 | Yes | GALE | |
| GRMZM5G830983 | Yes | GALE | |
| GRMZM2G301172 | | UMPS | Uridine monophosphate synthetase |
| GRMZM2G063253 | Yes | SLC35E3 | UDP-xylose transporter |
| GRMZM2G063511 | Yes | SLC35E3 | |
| GRMZM2G068714 | Yes | SLC35E3 | |
| GRMZM2G081848 | Yes | SLC35E3 | |
| GRMZM2G116053 | Yes | SLC35E3 | |
| GRMZM2G122618 | Yes | SLC35E3 | |
| GRMZM2G048434 | | SLC35E3 | |
| GRMZM5G828581 | | SLC35E3 | |
Protein glycosylation
| GRMZM2G426275 | | MGAT1 | UDP-N-acetylglucosamine:α-3-d-mannoside β-1,2-N-acetylglucosaminyltransferase I |
| GRMZM2G133421 | | RFT1 | Dolichol-PP-Man(5)GlcNAc(2) transporter |
| GRMZM2G152194 | Yes | ALG12 | Dolichol-PP-Man(7)GlcNAc(2) α-1,6-mannosyltransferase |
| GRMZM2G164304 | Yes | MPDU1 | Mannose-P-dolichol utilization defect 1 |
| GRMZM2G462325 | Yes | DDOST | Dolichyl-diphosphooligosaccharide-protein glycosyltransferase |
| GRMZM2G048762 | | PIGM | Phosphatidylinositol glycan anchor biosynthesis, class M |
| GRMZM2G000937 | Yes | PIGB | Phosphatidylinositol glycan anchor biosynthesis, class B |
| GRMZM2G164175 | Yes | PIGB | |
Glycoprotein turnover and secretion
| GRMZM2G117388 | Yes | DERL2 | Degradation of misfolded glycoproteins |
| GRMZM2G143817 | | DERL2 | |
| GRMZM2G061922 | Yes | TRAPPC2 | Sedlin, collagen secretion |
| GRMZM2G097568 | Yes | TRAPPC2 | |

To find genes with altered splicing in both rgh3 and human ZRSR2 mutants, we identified the current gene symbols for human U12-intron–containing genes in U12DB (Dataset S5) (25). This revealed 36 human genes that are homologs of 57 maize genes with U12-type introns in both species. Maize/human homology enriches for U12 splicing defects in both rgh3 and ZRSR2 mutants (Table S3). Fifty of these maize genes were tested in the RNA-seq analysis, and 96% (48 of 50) had evidence of splicing defects in rgh3, as determined by intron read depth or junction read tests. Of the 758 human genes in U12DB, only 216 (28%) genes had significant splicing defects reported in ZRSR2 mutants, but 47% (17 of 36) of human U12-type intron-containing genes with maize homologs have splicing defects (18).
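The proportions in this comparison can be reproduced with a short calculation. The counts are those stated in the paragraph above; the helper name and the fold-difference summary are illustrative additions and not a statistic reported by the authors.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

# Counts as stated in the text.
all_u12db = percent(216, 758)       # human U12DB genes misspliced in ZRSR2 mutants
with_maize = percent(17, 36)        # subset with maize U12-type homologs
maize_tested = percent(48, 50)      # tested maize homologs misspliced in rgh3

# Illustrative summary: how much more often the conserved subset is affected.
fold = with_maize / all_u12db

print(f"All U12DB genes: {all_u12db:.0f}%")
print(f"With maize homolog: {with_maize:.0f}%")
print(f"Maize homologs misspliced in rgh3: {maize_tested:.0f}%")
print(f"Fold difference: {fold:.2f}")
```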

Table S3.

Human–maize gene pairs both containing U12-type introns that are misspliced in rgh3

| Human gene | Maize gene | Maize–human U12 intron position | Splicing defect in ZRSR2 MDS (18) | Biological process |
| GPN2 | GRMZM2G093716 | Both U12 introns at same residues | Yes | Embryonic development |
| ALG12 | GRMZM2G152194 | Same residue | Yes | Protein glycosylation |
| MAEA | GRMZM2G177026 | Same residue | Yes | Cell differentiation |
| SACM1L | GRMZM2G047894 | Same residue | Yes | Metabolism |
| SACM1L | GRMZM2G171080 | Same residue | Yes | Metabolism |
| SACM1L | GRMZM2G418916 | Same residue | Yes | Metabolism |
| TAPT1 | GRMZM2G347645 | Same residue | Yes | Embryonic development |
| TRAPPC2 | GRMZM2G061922 | Same residue | Yes | Endomembrane |
| TRAPPC2 | GRMZM2G097568 | Same residue | Yes | Endomembrane |
| SLC9A8 | GRMZM2G067747 | First U12 intron, same residue | Yes | Transport |
| WDR91 | GRMZM2G158179 | First U12 intron, same residue | Yes | Unknown |
| SMYD2 | GRMZM2G457881 | Second U12 intron, conserved position, divergent protein motif | Yes | Cell cycle |
| DERL2 | GRMZM2G117388 | Divergent | Yes | Protein processing |
| E2F3 | GRMZM2G041701 | Divergent | Yes | Cell cycle |
| E2F3 | GRMZM2G052515 | Divergent | Yes | Cell cycle |
| EXO1 | GRMZM2G096920 | Divergent | Yes | Cell cycle |
| FRA10AC1 | GRMZM2G001444 | Divergent | Yes | RNA processing |
| IPO4 | GRMZM2G408305 | Divergent | Yes | Protein targeting |
| IPO9 | GRMZM2G457415 | Divergent | Yes | Protein targeting |
| PIGB | GRMZM2G000937 | Divergent | Yes | Protein glycosylation |
| PIGB | GRMZM2G164175 | Divergent | Yes | Protein glycosylation |
| SLC35E3 | GRMZM2G048434 | Divergent | Yes | Protein glycosylation |
| SLC35E3 | GRMZM2G063253 | Divergent | Yes | Protein glycosylation |
| SLC35E3 | GRMZM2G063511 | Divergent | Yes | Protein glycosylation |
| SLC35E3 | GRMZM2G068714 | Divergent | Yes | Protein glycosylation |
| SLC35E3 | GRMZM2G081848 | Divergent | Yes | Protein glycosylation |
| SLC35E3 | GRMZM2G116053 | Divergent | Yes | Protein glycosylation |
| SLC35E3 | GRMZM2G122618 | Divergent | Yes | Protein glycosylation |
| SLC35E3 | GRMZM5G828581 | Divergent | Yes | Protein glycosylation |
| NAPG | GRMZM2G145175 | First U12 intron, same residue | | Endomembrane |
| BRCC3 | GRMZM2G096491 | Same residue | | Cell cycle |
| BRCC3 | GRMZM2G152436 | Same residue | | Cell cycle |
| EXOSC1 | GRMZM5G841900 | Same residue | | RNA processing |
| EXOSC5 | GRMZM2G083620 | Same residue | | RNA processing |
| FAM96B | GRMZM2G159389 | Same residue | | Metabolism |
| FAM96B | GRMZM2G162266 | Same residue | | Metabolism |
| POLE2 | GRMZM2G154267 | Same residue | | Cell cycle |
| XRCC5 | GRMZM2G137968 | Same residue | | Cell cycle |
| PQLC2 | GRMZM2G024733 | Shifted by 1 codon | | Transport |
| PQLC2 | GRMZM2G153434 | Shifted by 1 codon | | Transport |
| BTAF1 | GRMZM2G168096 | Divergent | | Transcription |
| GTF2H3 | GRMZM2G027209 | Divergent | | Transcription |
| NCBP2 | GRMZM2G034804 | Divergent | | RNA processing |
| NCBP2 | GRMZM2G052341 | Divergent | | RNA processing |
| SETD2 | GRMZM2G033694 | Divergent | | Chromatin structure |
| SETD2 | GRMZM2G130910 | Divergent | | Chromatin structure |
| SMC3 | GRMZM2G456570 | Divergent | | Cell cycle |
| SMYD3 | GRMZM2G080462 | Divergent | | Cell cycle |

Mutations in three of these genes, GPN2, MAEA, and TAPT1, have documented roles in animal and plant development. The Arabidopsis homolog of GPN2 encodes the QQT1 protein and is required for embryos to complete periclinal divisions to establish epidermal and internal cell layers (26). The mouse ortholog of MAEA is required for final differentiation of erythrocytes (27). The vertebrate TAPT1 gene is required for skeletal patterning and normal function of the primary cilium (28). The Arabidopsis TAPT1 ortholog, POD1, is required for apical–basal patterning of the early embryo and endomembrane protein sorting (29).

Our analysis also revealed divergent loss of U12-type introns for genes with conserved biological functions (Datasets S4 and S5). The protein glycosylation enzymes, ALG12 and PIGB, have U12-type introns in both maize and human. U12-type introns are also found in other protein glycosylation genes including the following: ALG3, ALG6, ALG8, PIGN, and PIGP in human as well as the maize homologs of DDOST, MPDU1, MGAT1, RFT1, and PIGM. The DNA origin of replication complex subunit, ORC3, in humans contains a U12-type intron, whereas the maize ORC4 homolog contains a U12-type intron. For ubiquitin-specific peptidases, U12-type introns are found in human USP7, USP10, and USP14 as well as in maize homologs of USP36 and USP42. Similarly, molybdenum cofactor (Moco) biosynthesis genes contain U12-type introns in maize. In humans, two Moco cofactor-requiring enzymes, aldehyde oxidase (AOX1) and xanthine dehydrogenase (XDH), contain U12-type introns. Loss of Moco in maize does not affect endosperm development but does cause seedling lethality (30, 31). There are at least 25 additional examples where different members of protein complexes or biochemical pathways have U12-type introns in human and maize including TRAPP and adaptor related protein complexes, Rab, importins, DNA polymerase, activating signal cointegrator-1, TFIIA, RNA polymerase III, the exosome, ribosomal proteins, and the autophagy pathway. These homologies suggest a common genetic architecture of minor spliceosome targets in human and maize where splicing defects in U12-type introns disrupt conserved biological processes and result in stem cell-like phenotypes in ZRSR2 mutant MDS cells and rgh3 endosperm cells.

Conservation of U12-Type Intron Positions in Maize and Human.

Divergence of gene sets with U12-type introns in a given species is almost exclusively due to loss of U12-type introns or mutation toward a U2-type intron (22, 32). Protein alignments of the human and maize isoforms were used to identify U12-type intron positions for maize/human homologs (Fig. S10 and Dataset S4). Approximately one-half of the maize/human homologs (19 of 36 human genes and 26 of 57 maize genes) have at least one U12-type intron in a conserved position within the protein coding sequence (Fig. S11 and Table S3). The U12-type introns are randomly distributed within the coding sequences of both maize and human homologs (Fig. 3F). The intron positions fit the normal distribution (Shapiro–Wilk P > 0.05) with very little skew (human = −0.04; maize = −0.03) but have a low kurtosis (human = −0.53; maize = −0.91). The near-uniform distribution suggests a diversity of coding sequence impacts with most defective transcripts expected to trigger nonsense-mediated decay (NMD). However, rgh3 splicing defects do not appear to significantly alter transcript abundance. For the rgh3 misspliced genes, 86 and 79% of shoot and root transcripts accumulate within a twofold range of WT, respectively (Fig. 3 G and H). Near-equivalent transcript levels were also observed for misspliced genes with a U12DB human homolog.
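As an illustration of the shape statistics quoted above, the following minimal sketch computes skewness and excess kurtosis for a set of normalized intron positions. It assumes population (uncorrected) moment estimators, which may differ from the estimator used in the analysis, and the input values are hypothetical. For reference, a normal distribution has an excess kurtosis of 0 and a continuous uniform distribution has an excess kurtosis of −1.2, so the reported values of −0.53 and −0.91 fall between the two.

```python
import statistics


def skew_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(values)
    mean = statistics.fmean(values)
    # Second, third, and fourth central moments (divided by n, not n - 1).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skew, excess_kurtosis


# Hypothetical input: 50 evenly spaced normalized positions between 0 and 1.
positions = [(i + 0.5) / 50 for i in range(50)]
skew, kurt = skew_and_excess_kurtosis(positions)
print(f"skew={skew:.3f} excess_kurtosis={kurt:.3f}")
```

Evenly spaced positions are symmetric about 0.5, so the skew is essentially zero and the excess kurtosis is close to the uniform value.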

Fig. S10.

Protein sequence alignment of maize and human E2F3 homologs reveals U12-type intron position divergence. Clustal Omega was used to align the maize and human protein isoforms as well as sequences truncated at the last codon 5′ of the U12-type intron. Truncated proteins were removed from the alignment, and the 5′-codon is indicated by an underlined residue in the alignment. The position of the human U12-type intron is indicated by the blue arrow callout (Hs), and the maize intron is indicated by the orange arrow callout (Zm).

Fig. S11.

Protein sequence alignment of maize and human MAEA homologs identifies conserved U12-type intron positions. Clustal Omega was used to align the maize and human protein isoforms as well as sequences truncated at the last codon 5′ of the U12-type intron. Truncated proteins were removed from the alignment, and the 5′-codon is indicated by an underlined residue in the alignment. The position of the human U12-type intron is indicated by the blue arrow callout (Hs), whereas the maize intron is indicated by the orange arrow callout (Zm).

RT-PCR from purified nuclei and polysomes indicates that some misspliced transcripts in rgh3 are likely to be translated. Intron retention and misspliced transcripts copurified with polysomes when the U12-type intron is the last intron of the transcript, such as in maize homologs of the E2F3 cell-cycle transcription factor and TRAPPC2 (Fig. 4 A and B). The U12 splicing defects in GRMZM2G033430 and PQLC2 also copurify with polysomes in rgh3 mutants (Fig. 4C and Fig. S12A). These splice variants may not be NMD targets, because the termination codons are relatively close to the last exon–exon junction. By contrast, transcripts with multiple exon–exon junctions downstream of the U12 splicing defect as well as control transcripts expected to be retained in the nucleus were enriched in nuclei and excluded from polysomes (Fig. 4 D–F and Fig. S12 B–E).

Fig. 4.

A subset of rgh3 misspliced transcripts may be translated. (A–H) RT-PCR analysis using RNA extracted from total, nuclei, and polysome fractions of normal and rgh3 samples. U12-type intron regions were amplified from the following: (A) GRMZM2G052515, (B) GRMZM2G097568, (C) GRMZM2G033430, (D) GRMZM2G177026, and (E) GRMZM2G093716. The larger two mRNA variants of Rsp31B (F) and mir156 premiRNA (G) are expected to be excluded from polysomes, whereas actin (H) serves as loading control. All transcript diagrams are drawn to the same scale as in A with U12-type introns indicated. Arrows show forward and reverse primers used for RT-PCR. (I) Transcript and protein diagrams for maize E2F3 genes and their human E2F3a homolog. Retention of the maize U12-type introns introduces premature termination codons (red octagon) that would have minimal effect on E2F protein domains. Retention and cryptic splicing of the U12-type intron in human are predicted NMD targets, whereas skipping exons 4–5 is predicted to produce a nonfunctional protein.

Fig. S12.

U12-type intron retention transcripts are primarily retained in the nucleus. Semiquantitative RT-PCR analysis using RNA extracted from total tissue, nuclear enriched, and polysome fractions of normal and rgh3 seedlings. (A) GRMZM2G153434, (B) GRMZM2G106613, (C) GRMZM2G083620, (D) GRMZM2G131321, and (E) GRMZM2G074015. Actin (F) was used as loading control. All transcript diagrams are based on cloned, sequenced RT-PCR products and are drawn to the same scale with 100 bp indicated in A. Arrows on the gene model schematics show forward- and reverse-primer binding sites.

Missplicing of E2F3 was identified as a candidate for mediating human myeloid malignancies observed in ZRSR2 mutant cells (18). U12-type intron retention in the maize E2F3 homologs truncates the C-terminal transactivation domain similar to the endogenous GRMZM2G041701_T02 isoform (Fig. 4I). The predicted E2F proteins from the intron retention transcripts contain all domains necessary for cell proliferation and the endocycle in plants (33). By contrast, the splicing defects observed in human E2F3 are primarily expected to be NMD targets, except for a transcript that skips exons 3–4, which flank the U12-type intron (18). If translated, the human exon skip transcript would produce a nonfunctional E2F3 protein lacking the DP-dimerization and transactivation domains. This difference in U12-type intron position may contribute to the contrasting cell proliferation phenotypes with rgh3 being proliferative in culture and ZRSR2 mutants arresting the cell cycle.

Mutant RGH3 Proteins Disrupt Localization with U2AF2.

It is surprising that rgh3 disrupts a larger proportion of U12-type introns than ZRSR2 mutations in MDS patients. The rgh3 mutant is predicted to encode a weak allele, whereas most ZRSR2 mutations are loss-of-function alleles (14–16, 19). The rgh3-umu1 allele has a Mutator transposon insertion that partially splices from the transcript to delete 12 aa and insert 47 aa coded by the transposon sequence in the N-terminal acidic domain of the RGH3 protein (Fig. 5A). The normal Rgh3 allele produces multiple splice variants. Only the Rgh3α isoform encodes a full-length ortholog of ZRSR2 (19). Both the mutant and splice isoforms coding for protein truncations are likely to be expressed as protein in maize, because RGH3 antibodies cross-react with multiple proteins that migrate similarly to in vitro transcribed/translated cDNA clones of the alternatively spliced variants (Fig. S13). We investigated subcellular localization of RGH3 protein variants to gain additional insight into the nature of the rgh3 mutant allele.

Fig. 5.

Multiple RGH3 protein domains are necessary for colocalization with U2AF2. (A) Protein domain schematic of RGH3 isoforms and UHM domain deletion tested in colocalization assays. (B–D) Transient colocalization of U2AF2 with mutant RGH3umu1α allele (B), RGH3ΔUHM domain deletion (C), and the RGH3ε isoform (D). (E–H) Transient expression of BiFC constructs: cYFP-U2AF2 with RGH3α-nYFP (E), nYFP-U2AF2 with U2AF1-cYFP (F), nYFP-U2AF2 with cYFP-RGH3umuα (G), and cYFP-U2AF2 with nYFP-RGH3ΔUHM (H). White arrowheads point to nucleolus. [Scale bar: 5 µm (in all microscopy images).]

Fig. S13.

Alternatively spliced Rgh3 variants produce truncated proteins in vivo. (A) Schematic of RGH3 protein isoforms based on full-length cDNA sequences. Predicted protein molecular weights are given on the Left. (B) Western blot analysis with an N-terminal anti-RGH3 peptide antibody. The panel on the Left shows in vitro transcription/translation reactions charged with full-length cDNA clones coding for different RGH3 protein isoforms. The arrowhead points to RGH3α and RGH3umu1α full-length proteins. The arrow points to a nonspecific, cross-reactive protein in the wheat germ extract. All RGH3 isoforms migrate through SDS/PAGE slower than predicted. The panel on the Right shows 24 DAP seed tissue protein extracts. Multiple protein bands cross-react with the anti-RGH3 antibody with the arrowhead indicating proteins correlating in mobility to RGH3α and RGH3umu1α in vitro-transcribed/translated isoforms. The asterisk indicates potential RGH3 truncated isoforms. The RGH3β and RGH3γ isoforms are the most common alternative splice variants expressed from the rgh3 locus and migrate similarly to the major low–molecular-weight proteins recognized by the anti-RGH3 antibody.

Human ZRSR2 interacts with the major spliceosome subunit, U2AF2, through a U2AF homology motif (UHM) that is related to RNA recognition motifs but mediates protein–protein interactions (13, 34). These interactions are likely conserved in maize (19). Colocalization is observed when maize RGH3α and U2AF2 are transiently expressed in Nicotiana benthamiana as GFP and RFP fusions (Fig. S14A). U2AF2 is the large subunit of U2 auxiliary factor (U2AF), and maize U2AF2 also colocalizes with the small subunit of U2AF, U2AF1 (Fig. S14B). These data are consistent with maize RGH3α and U2AF2 acting in the same subnuclear compartment.

Fig. S14.

Colocalization analysis of multiple RGH3 natural and engineered isoforms with U2AF2. Engineered and truncated protein variants of RGH3 were fused to GFP or RFP and transiently coexpressed with GFP-U2AF2 or U2AF2-RFP in N. benthamiana. (A) RGH3α colocalizes with U2AF2 throughout the nucleoplasm as well as the nucleolus (white arrowhead). This is an independent experiment from figure 9D in ref. 19. (B) U2AF1 and U2AF2 subunits colocalize in the nucleoplasm. (C and D) RGH3umu1α and RGH3ΔUHM show intermittent colocalization with U2AF2 within structures of the nucleoplasm (red arrowhead). (E and F) WT, truncated RGH3 protein variants are concentrated in the nucleolus and fail to colocalize with U2AF2-RFP. [Scale bar: 5 µm (in all images).]

The mutant, RGH3umu1α, protein has aberrant subnuclear localization with low levels of diffuse signal in the nucleoplasm (Fig. 5B) instead of localizing to spliceosomal speckles as seen for RGH3α (Fig. S14A). When RGH3umu1α is coexpressed with U2AF2, the proteins typically localize to different subnuclear compartments. Some cells showed partial overlap of RFP and GFP fusions, indicating reduced colocalization (Fig. S14C). An in-frame deletion of the UHM domain (RGH3ΔUHM) showed intermittent colocalization like RGH3umu1α (Fig. 5C and Fig. S14D). Splice isoforms coding for protein truncations of the RS domain (RGH3ε) or the UHM and RS domain (RGH3β, RGH3γ) did not show overlap with U2AF2 (Fig. 5D and Fig. S14 E and F).

Bimolecular fluorescent complementation (BiFC) assays of U2AF2 with RGH3α, U2AF1, RGH3umu1α, and RGH3ΔUHM all resulted in YFP signal in the nucleus (Fig. 5 E–H and Fig. S15). Similar to colocalization experiments, U2AF2 and RGH3α showed YFP signal in nuclear speckles and the nucleolus, whereas U2AF1 and U2AF2 had YFP signal in nuclear speckles. By contrast, BiFC signals from U2AF2 with RGH3umu1α or RGH3ΔUHM appear aggregated in larger subnuclear foci. Reconstitution of YFP in BiFC assays is irreversible with transient interactions able to give stable YFP signal (35). These data show that U2AF1, U2AF2, and RGH3 colocalize as predicted from human protein–protein interaction studies. Mutations or truncations affecting the acidic, UHM, or RS-domain all disrupt the dynamic colocalization of RGH3 with U2AF2 equivalently. Combined with the RNA-seq results, these localization experiments suggest that rgh3-umu1 is more likely a strong allele and indicate an important role for the acidic domain of ZRSR2/RGH3 in U12 splicing.

Fig. S15.

Additional BiFC images supporting colocalization of RGH3 variants and U2AF2. BiFC signal is observed when the N- and C-terminal segments of the split YFP are swapped in two alternate combinations of the RGH3α and U2AF2 fusions as well as one alternate combination of RGH3ΔUHM and U2AF2 fusions. (Scale bars: 5 µm.)

Discussion

RGH3/ZRSR2 Are U12 Splicing Factors in Vivo.

Our data provide an independent genetic analysis of ZRSR2 function in a distantly related species from humans. RGH3/ZRSR2 has a conserved role in splicing U12-type intron-containing genes. ZRSR2 and rgh3 mutants exhibit U12-type intron retention, activation of cryptic 5′- and 3′-splice sites within U12-type introns, and retention of U2-type introns that are adjacent to misspliced U12-type introns.

A near-exclusive in vivo function for ZRSR2/RGH3 in U12 splicing contradicts biochemical experiments that conclusively show ZRSR2 copurifies and is required in both spliceosomes (12, 13). It is possible that copurification of ZRSR2 and U2AF in human cell extracts is due to independent binding of common pre-mRNA species. However, ZRSR2 interacts with U2AF2 in yeast two-hybrid assays (13). Yeast lacks U12-type introns, suggesting direct protein–protein interactions are more likely between these spliceosome subunits. In maize, colocalization of RGH3 with U2AF2 appears to be an indicator of RGH3 function in the minor spliceosome. Potentially, interactions between the major and minor spliceosomes promote efficient splicing of either class of intron.

Minor Splicing as a Regulatory Process.

Minor splicing factors are at low abundance, and U12 splicing can be a rate-limiting step to produce protein-coding mRNA (36). For example, U6atac levels in HeLa cells affect splicing efficiency as well as expression level of genes with U12-type introns, and the minor spliceosome was proposed to regulate cellular responses to stimuli (4). Under this model, minor spliceosome activity determines the balance of coding mRNA vs. NMD or translation into alternative protein isoforms (4). We found little evidence of splicing defects resulting in expression level changes of maize U12-type intron-containing genes. Similarly, Drosophila U6atac mutants have little impact on U12-type intron-containing gene expression levels (37). In maize, reduced U12 splicing generally leads to predicted NMD targets being retained in the nucleus. A smaller subset of U12-type intron retention transcripts in rgh3 are associated with polysomes and are likely to be translated into alternative protein isoforms.

Although different molecular mechanisms seem to act downstream of U12 splicing efficiency in maize and humans, there are a substantial number of homologous genes with U12-type introns. We found 233 maize U12-type intron-containing genes with easily identified human homologs representing 154 human genes. Nearly 25% of the maize–human homolog pairs contain U12-type introns in both species (57 of 233 for maize genes or 36 of 154 human homologs). There are also overlapping functions among nonhomologous genes subject to U12 splicing. These nonhomologous overlapping genes are subunits of the same protein complexes, members of conserved gene families, or members of the same metabolic pathways. These overlaps suggest selection for specific biological processes, such as cell cycle and protein glycosylation, to be dependent upon U12 splicing efficiency and support the idea that the minor spliceosome could be regulatory.

Minor Splicing Is Required for Cell Differentiation.

ZRSR2 and rgh3 mutants both disrupt cell differentiation programs. MDS patients with ZRSR2 mutations accumulate myeloid blast precursors (18). Maize rgh3 endosperm retains proliferative capacity and shows cell fate switching to aleurone in the basal endosperm transfer cell layer and the embryo-surrounding region (19). Mutations in other minor spliceosome factors lead to developmental defects in many species. In humans, Taybi–Linder syndrome is caused by mutations in U4atac, resulting in severe bone abnormalities and microcephaly (6, 7). Reduced U12 splicing efficiency may also be the primary molecular cause of spinal muscular atrophy, which affects the peripheral nervous system (38, 39). Zebrafish mutations in RNPC3 lead to defects in endodermal organ development and aberrant intestinal epithelium morphology (8). Drosophila mutants in U6atac and U12 snRNA are lethal in third-instar larvae, when adult metamorphosis occurs (5). Artificial microRNA (amiRNA) down-regulation of Arabidopsis U12 splicing factors results in defective leaf morphology and arrested inflorescence development (9–11). These phenotypes all suggest that reduced U12 splicing is not immediately lethal to the cell but rather disrupts essential developmental processes.

Despite the many U12 splicing phenotypes reported, no unifying developmental function has been ascribed to minor splicing. By focusing on individual tissues such as endosperm or blood, a common function for Rgh3 and ZRSR2 in cell differentiation becomes clear (18, 19). Extending to a more general interpretation of U12 splicing mutant phenotypes, the data suggest a role for U12 splicing to promote differentiation of a subset of cell types in both plants and animals. It is unlikely for U12 splicing to be needed in all differentiation processes, because the minor spliceosome has been lost in multiple eukaryotic lineages (2).

Conservation of Minor Splicing During Evolution.

The losses of minor splicing during evolution raise the possibility that the minor spliceosome evolved in a convergent manner to function in plant and animal cell differentiation. However, there appears to be more selective pressure to maintain U12 splicing in multicellular eukaryotes. Although the minor spliceosome is missing in some lineages with multicellular development, a larger fraction of unicellular eukaryotic genomes have lost U12 splicing (2, 40). The few unicellular species that retain the minor spliceosome, such as Acanthamoeba castellanii, tend to have amoeboid cellular organization (2, 40). The A. castellanii genome sequence revealed a high frequency of horizontal gene transfer, including potential eukaryote-to-eukaryote gene transfer that would confound deep evolutionary comparisons (41).

Volvocine green algae illustrate convergent evolutionary innovation in multicellular development. Volvocines do not have a minor spliceosome and independently evolved multicellular species relative to higher plants (2, 42). Recent genome comparisons within this clade found that cell cycle genes are expanded in multicellular species with mutation of retinoblastoma as the primary change needed for unicellular species to become multicellular (41). In addition, genes encoding glycosylated extracellular matrix proteins are expanded in Volvox carteri, which has two distinct cell types. Both processes are highly represented in maize U12-type intron-containing genes. However, neither retinoblastoma nor the volvocine-type extracellular matrix proteins have U12-type introns in any species (21, 25).

Importantly, the U12-type introns affected in rgh3 and ZRSR2 mutants were maintained since the divergence of plants and animals (32). The number of U12-type introns in the last eukaryotic common ancestor is not known, but all lineages with a minor spliceosome have a small fraction of genes with U12-type introns. Assuming random loss of introns, the number of homologous genes that still have U12-type introns in human and maize is large, especially with one-half of the homologs having divergent U12-type intron positions. These observations argue for selective pressure to maintain a subset of genes with U12-type introns to carry out essential roles in cell differentiation.

Materials and Methods

RNA-Seq.

Normal and rgh3 kernels in the W22 inbred background were sown in soil (29 °C/20 °C, 12 h/12 h, day/night). Individual seedlings were harvested when about 5 cm high. Normal seedlings were genotyped to select homozygous WT individuals. Total mRNA was extracted from three biological replicates of rgh3 and WT roots and shoots using TRIzol Reagent (Life Technologies) with RNase-free glycogen (Fermentas) as an RNA carrier. Nonstrand-specific TruSeq (Illumina) cDNA libraries were built from 2 μg of RNA input with a 200-bp median insert length. Twelve libraries were pooled using the KAPA Library Quantification Kit (Kapa Biosystems) and sequenced on the HiSeq 2000 platform with 100-bp paired-end reads.

Read Mapping.

Bases with a quality score <20 were trimmed with the FASTQ/A utility in the FASTX toolkit (hannonlab.cshl.edu/fastx_toolkit/). Low-quality reads were discarded if >20% of bases had a quality score <20. Reads from repetitive elements in the MIPS database (43) were removed using Bowtie (44), version 0.12.7. Single-nucleotide polymorphisms (SNPs) between W22 and B73 RefGen_v2 genome were identified by mapping WT libraries to the B73 RefGen_v2 genome (45) (Dataset S6). Mosaik 2, version 2.1.33, was run with a hash size of 14, up to 100 hash positions per seed, and allowing 2% mismatch (46). Duplicate reads were removed using Picard, version 1.54 (broadinstitute.github.io/picard/). W22 SNPs were identified with FreeBayes, version 0.9.5, assuming 2% pairwise differences with B73 and requiring a 99% confidence statistic that an individual variant exists (47). FreeBayes parameters included at least 10 mapped reads with >40% of reads supporting the variant and at least one read having a minimum quality score of 20.
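The two quality rules described above (trimming bases with a quality score <20 and discarding reads in which >20% of bases fall below that score) can be sketched as follows. This is not the FASTX implementation. It assumes Phred+33 quality encoding and trimming from the 3′ end only, neither of which is specified in the text, and the function names are hypothetical.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def trim_3prime(seq, qual, min_q=20, offset=33):
    """Remove trailing bases whose quality score is below min_q."""
    scores = phred_scores(qual, offset)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], qual[:end]


def passes_filter(qual, min_q=20, max_low_fraction=0.20, offset=33):
    """Keep a read unless more than max_low_fraction of its bases fall below min_q."""
    scores = phred_scores(qual, offset)
    if not scores:
        return False  # an empty read carries no usable information
    low = sum(1 for s in scores if s < min_q)
    return low / len(scores) <= max_low_fraction


# Hypothetical read: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
trimmed_seq, trimmed_qual = trim_3prime("ACGTACGTAC", "IIIIIIII##")
print(trimmed_seq, passes_filter("IIIIIII###"))
```

In this sketch a read with exactly 20% low-quality bases is kept, because the stated rule discards reads only when the fraction exceeds 20%.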

GSNAP, version 2012-5-24, with an IIT file containing W22/B73 SNPs, was used to uniquely align all RNA-seq libraries with B73 RefGen_v2 (48). Alignment parameters allowed 2% mismatch and mate pairs to deviate 800 bp from the expected paired-end length of 200 bp. GSNAP was guided by the ZmB73_5b filtered gene set annotations but was permitted to discover novel splicing events. Transcript isoforms were assembled using Cufflinks, version 1.3.0 (20). Each library was assembled independently with a minimum intron size of 10 bp and an overlap radius of 10 bp. Isoform expression analysis was restricted to the ZmB73_5b filtered gene set. Differential splicing was tested for roots and shoots independently using Cuffdiff with the bias detection and correction algorithm and upper-quartile normalization.

Tests for Altered Splicing of Individual Introns.

Nonredundant introns were identified from the ZmB73_5b annotation. HTSeq, version 0.6.1p1 (49), tallied intron read counts, which were normalized by the fragments per kilobase of transcript per million mapped reads (FPKM) of the gene model in each library. Differences in intron reads were determined by a two-tailed t test with the Welch modification for degrees of freedom with no correction for multiple testing. Genes that were not expressed in every RNA-seq library were excluded from the analysis. Genomic coordinates for U12-type introns were identified by BLASTN searches against the B73 RefGen_v2 genome using sequences from the ERISdb database (21, 45).
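A minimal sketch of the per-intron comparison described above is given below: intron read counts are divided by the FPKM of the gene model in the same library, and the resulting WT and mutant replicate values are compared with Welch's unequal-variance t statistic. The counts and FPKM values are hypothetical, and the sketch returns only the statistic and the Welch–Satterthwaite degrees of freedom. Converting these to a two-tailed P value requires the t distribution, for example `scipy.stats.ttest_ind` with `equal_var=False`.

```python
import math
import statistics


def normalize_intron_counts(intron_counts, gene_fpkms):
    """Divide each library's intron read count by the gene FPKM in that library."""
    return [count / fpkm for count, fpkm in zip(intron_counts, gene_fpkms)]


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for mean(a) - mean(b)."""
    # Squared standard errors of the two sample means.
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical data: three biological replicates per genotype for one intron.
wt = normalize_intron_counts([10, 12, 14], [10.0, 10.0, 10.0])
mutant = normalize_intron_counts([50, 60, 70], [10.0, 10.0, 10.0])
t_stat, dof = welch_t(mutant, wt)
print(f"t={t_stat:.2f} df={dof:.2f}")
```

With three replicates per group the Welch degrees of freedom lie between 2 and 4, which limits the power of any single-intron test.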

PSO was calculated from exon–exon junction reads and reads mapping to the shortest nonredundant intron for each unique 5′- and 3′-splice site (50). Read counts were calculated with the intersectBed command from Bedtools, version 2.24.0 (51). Fisher’s exact test was calculated for each intron with the summed reads across all WT and all mutant libraries and a false-discovery rate of 0.05 using the Benjamini–Hochberg method.
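The testing scheme above can be sketched with a two-sided Fisher's exact test on a 2 × 2 table of summed reads, followed by Benjamini–Hochberg adjustment across introns. The table layout (rows for WT and mutant, columns for junction and intron reads) and the two-sided alternative are assumptions made for illustration, and the read counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more probable than the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical (WT junction, WT intron, mutant junction, mutant intron) read sums.
tables = [(90, 10, 40, 60), (50, 50, 48, 52), (80, 5, 20, 70)]
raw = [fisher_exact_two_sided(*t) for t in tables]
significant = [q <= 0.05 for q in benjamini_hochberg(raw)]
print(significant)
```

Summing reads across replicates before testing, as described in the text, increases the counts in each cell but does not model replicate-to-replicate variation.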

RT-PCR.

RNA from 12 d after pollination (DAP) whole kernels was isolated by freezing rgh3 and normal kernels in liquid nitrogen with extraction using TRIzol Reagent followed by RNeasy RNA cleanup with on-column DNase I digestion (Qiagen). Embryo, starchy endosperm, and basal endosperm transfer cell-enriched tissue was hand dissected from 12 DAP whole kernels at −18 °C (52). RNA from these tissues was extracted with the Arcturus PicoPure RNA isolation kit (Applied Biosystems) with on-column DNase I digestion. Endosperm cultures were established as described (19). RNA was extracted from endosperm culture, seedling roots, and seedling shoots using the RNeasy Plant Mini Kit (Qiagen) with on-column DNase I digestion. Total RNA was reverse-transcribed using the SuperScript III First-Strand Synthesis System (Invitrogen) with oligo(dT)20 as primer.

Gene-specific primers were designed to distinguish multiple splice variants from individual genes using one primer pair. For genes predicted by Cufflinks to have splicing differences in rgh3, annotated alternative transcripts along with raw read coverage depth, visualized by the Integrated Genome Browser (53), were used to target specific regions for RT-PCR. Amplification conditions consisted of the following: 1-min denaturation at 95 °C; 30–31 cycles of 95 °C for 1 min, 55 °C for 1 min, 72 °C for 30 s, and a final extension step of 72 °C for 5 min. RT-PCR products were cloned into the pCR4-TOPO vector for Sanger sequencing.

Conserved Protein Domain Enrichment Analysis.

All predicted protein isoforms in the maize filtered gene set were queried with HMMER 3.0 against the Pfam database using an inclusion threshold of 0.01 (54, 55). Hypergeometric P values were calculated for three samples and populations (Dataset S3): (i) rgh3 misspliced genes relative to all expressed U12-type intron-containing genes, (ii) rgh3 misspliced genes relative to all expressed maize genes, and (iii) all U12-type intron-containing genes relative to all genes in the filtered gene set. A gene was considered expressed if at least one intron was tested for normalized intron read depth differences between rgh3 and WT samples. Pfam domains were manually categorized into biological functions based on the annotation of each domain.
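The hypergeometric P value for a single Pfam domain can be illustrated as below. The sketch assumes an upper-tail (enrichment) test, that is, the probability of observing at least the sample count by chance. The text does not state which tail was used, and the parameter names and counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, with_domain, sample, sample_with_domain):
    """Upper-tail hypergeometric probability P(X >= sample_with_domain).

    population: number of genes in the background set
    with_domain: background genes carrying the Pfam domain
    sample: number of genes in the tested set
    sample_with_domain: tested genes carrying the Pfam domain
    """
    denom = comb(population, sample)
    upper = min(sample, with_domain)
    numerator = sum(
        comb(with_domain, x) * comb(population - with_domain, sample - x)
        for x in range(sample_with_domain, upper + 1)
    )
    return numerator / denom


# Hypothetical counts: 10 background genes, 4 with the domain, 3 sampled, 2 hits.
p_value = hypergeom_enrichment_p(10, 4, 3, 2)
print(round(p_value, 4))
```

The same function applies to each of the three sample and population pairings listed above by changing which gene sets supply the four counts.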

Human Homolog Analysis.

The B73 RefGen_v3 maize protein isoforms for the 408 genes in ERISdb with U12-type introns were downloaded from MaizeGDB (21, 56). Each protein isoform was queried against the human RefSeq protein database using BLASTP. Human homologs were defined as the best match with a minimum bit score of 80 (Dataset S4). National Center for Biotechnology Information (NCBI) Gene database unique identifiers were retrieved for each human RefSeq accession along with the corresponding HUGO Gene Nomenclature Committee (HGNC) human gene symbols (57). Human gene symbols containing U12-type introns were identified using the “intron FASTA” query at U12DB to download the complete set of U12-type introns (25). ENSEMBL gene identifiers were parsed from the FASTA file, and identifiers that were still current were converted to NCBI Gene database unique identifiers and human gene symbols using DAVID (58, 59). Predicted protein sequences from retired ENSEMBL gene identifiers were used to query the human RefSeq protein database using BLASTP. Intron sequences that did not have an ENSEMBL identifier were queried against the NCBI human genomic plus transcript database using BLASTN. The complete list of U12DB intron identifiers with current NCBI Gene database unique identifiers and human gene symbols is given in Dataset S5. The human gene symbols from maize–human homologs were used to cross-reference the human gene symbols identified from U12DB to determine maize and human genes that both contained a U12-type intron (Table S3).
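The homolog definition above (best match with a minimum bit score of 80) can be sketched as a filter over BLASTP results. The example assumes the standard 12-column tabular BLAST output, in which the query identifier, subject identifier, and bit score are columns 1, 2, and 12. The output format actually used is not stated in the text, and the identifiers below are hypothetical.

```python
def best_hits(tabular_lines, min_bit_score=80.0):
    """Return {query: (subject, bit_score)} keeping the top-scoring hit per query."""
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject, bits = fields[0], fields[1], float(fields[11])
        if bits < min_bit_score:
            continue  # below the homolog threshold
        if query not in best or bits > best[query][1]:
            best[query] = (subject, bits)
    return best


# Hypothetical tabular records: qseqid, sseqid, pident, length, mismatch,
# gapopen, qstart, qend, sstart, send, evalue, bitscore.
records = [
    "maize_P01\thuman_A\t45.0\t200\t110\t2\t1\t200\t5\t204\t1e-50\t180",
    "maize_P01\thuman_B\t38.0\t150\t93\t1\t10\t159\t3\t152\t1e-20\t95",
    "maize_P02\thuman_C\t30.0\t80\t56\t0\t1\t80\t1\t80\t1e-05\t42",
]
print(best_hits(records))
```

Queries whose best hit falls below the threshold, such as the third record here, are left without a human homolog.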

To determine U12-type intron positions relative to maize protein sequences, the 5′-exon at ERISdb was queried with BLASTX against the maize protein isoforms from the B73 RefGen_v3 annotation. Human protein isoforms were downloaded from the Consensus CDS protein set at the NCBI CCDS database (60). The 5′-exon sequence given at U12DB was used for pattern match searches of the consensus transcripts. The relative amino acid positions of the U12-type intron in the maize and human homologs were determined using Clustal Omega (61). All maize protein isoforms related to each human protein were aligned along with protein isoforms that were truncated at the last complete codon in the exon 5′ of the U12-type intron. The U12-type intron position was scored as identical if the C-terminal residues of the maize and human truncated isoforms were at the same position in the alignment. Normalized U12-type intron positions were determined based on the amino acid residues for each protein isoform and then averaged by gene. When a protein had more than one U12-type intron, normalized positions were calculated independently for each intron. U12-type intron retention models were constructed for E2F gene family members using the first annotated transcript and protein isoform for each gene.
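The normalization step can be illustrated with a short sketch that expresses each intron position as a fraction of protein length and then averages across isoforms of the same gene, keeping multiple introns in one gene separate. It assumes that the normalized position is the index of the last residue encoded 5′ of the intron divided by the isoform length. The text does not give the exact formula, and the gene names and values are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def average_normalized_positions(records):
    """Average normalized intron positions over isoforms.

    records: iterable of (gene, intron_id, residue_index, protein_length),
    one entry per protein isoform. Returns {(gene, intron_id): mean position}.
    """
    grouped = defaultdict(list)
    for gene, intron_id, residue_index, protein_length in records:
        # Position of the intron as a fraction of the isoform length (0 to 1).
        grouped[(gene, intron_id)].append(residue_index / protein_length)
    return {key: fmean(values) for key, values in grouped.items()}


# Hypothetical isoform records for two genes.
isoforms = [
    ("geneA", "intron1", 50, 200),
    ("geneA", "intron1", 60, 200),
    ("geneB", "intron1", 30, 120),
]
positions_by_gene = average_normalized_positions(isoforms)
print(positions_by_gene)
```

Keying the result on both gene and intron keeps each U12-type intron independent when a protein has more than one.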

Isolation of Polysomes.

Normal and rgh3 seedlings were collected at 3- to 5-cm stage. For each biological replicate, seedlings from two self-pollinated families were combined, flash frozen, and pulverized in liquid nitrogen. Total, nuclei, and polysomal RNA fractions were extracted from 2.5–3 mL of ground tissue using the conventional isolation of plant polysomes protocol as described (62). Initial pellets to clarify the tissue extract were saved as the nuclear fraction, and 500 µL of the clarified supernatant was saved as the total RNA fraction. RNA from all fractions was extracted with Qiagen RNeasy kits and treated with Ambion TURBO DNA-free kit (Thermo Fisher Scientific) following the manufacturer’s instructions. RT-PCR was completed from 1 μg of RNA using M-MLV reverse transcriptase (Promega) to synthesize cDNA as described (63). PCR cycles were optimized for each gene.

Subcellular Localization.

The RGH3-GFP and U2AF2 fluorescent protein fusion constructs were previously described (19). Rgh3ΔUHM was cloned from Rgh3α by amplifying an N-terminal fragment from the start codon to the end of the first zinc finger using primers Rgh3ΔUHM-1 and Rgh3ΔUHM-2 (Table S4). A C-terminal fragment was amplified from the second zinc finger to the C-terminal stop codon using primers Rgh3ΔUHM-3 and Rgh3ΔUHM-4. The fragments were ligated by overlap extension PCR (64). U2AF2 and U2AF1 were amplified from leaf cDNA of normal 14-d-old seedlings. The RT-PCR products were subcloned into pENTR vector (Invitrogen) and then cloned into pB7-WGF2, pB7-RWG2, or pB7-FWG2 (65, 66) as described (19). BiFC constructs were created by transferring coding sequences from pENTR vector clones to pSAT4-DEST or pSAT5-DEST vectors (67, 68) by LR clonase reactions following manufacturer’s instructions (Invitrogen). The recombined transcription cassettes were digested with I-CeuI or I-SceI and ligated into the pPZP-RCS2 binary vector (67). Transient expression in Nicotiana benthamiana and microscopy analyses were completed as described with some modifications (19). Binary vectors were transformed into Agrobacterium tumefaciens strain GV3101. Protein expression was visualized 24–48 h after transient transformation. YFP was excited at 514 nm and detected with an emission band of 525–565 nm.

Table S4.

Primers used in this study

Primer name Sequence
RT-PCR primers
 Actin-L CATGAGGCCACGTACAACTCCATC
 Actin-R TCATACTCTCCCTTGGAGATCCAC
 GRMZM2G011636-L CTTCCATTGTCGGAGGGATTAG
 GRMZM2G011636-R GGTCACACAAAGTCAAATAGCAAAC
 GRMZM2G021272-L GGATTTGTTTGTCGCCAACT
 GRMZM2G021272-R AGCCCGTATAGCATCATTGC
 GRMZM2G033430-L CATGTGCGTGTCAGTTACAAATATC
 GRMZM2G033430-R GATGAGAGGTCCACCATCC
 GRMZM2G040401-L ATGCTTTCATAGGTTTAGGCTTCC
 GRMZM2G040401-R CCTTGGCGCTTGTCTTTTC
 GRMZM2G052515-L CTCCTCCAAGGCCTACACAG
 GRMZM2G052515-R ATCCTCCGCGTTGAATTTCT
 GRMZM2G074015-L TGGGAGTCAGCCATTCTTCTATA
 GRMZM2G074015-R CCTGGAGCAAAGTACTGGATAC
 GRMZM2G083620-L GTTGTGGGTGATGATGGTTCT
 GRMZM2G083620-R CTGCTCTTCTGCTCTGCTAG
 GRMZM2G093716-L GGAAAGTTGCGGTTGTCAAT
 GRMZM2G093716-R TCTGGCATTTGAGTGAAGGA
 GRMZM2G096600-L CAAGTTGCCTGAAGAAATACTGC
 GRMZM2G096600-R TCTGGTGGAAAGAAGACTCCT
 GRMZM2G097568-L TGTTCAGGATCTAGCATGGACA
 GRMZM2G097568-R GCAGGTAAAGCGGGTTGA
 GRMZM2G103152-L TGTCTTTGCTGAAAAGGAGAACTTA
 GRMZM2G103152-R ATGCTCAATGCCATCAATAACAGA
 GRMZM2G106613-L CCCAGAAAAACACCAATGCTTC
 GRMZM2G106613-R ATGGCACGCATCTTTGCT
 GRMZM2G130432-L GCATTTAGGCGGCGTGT
 GRMZM2G130432-R AGTTATCTGTGTCAGCACATTGATC
 GRMZM2G131321-L AGCGAAGGTTACCCCAAAG
 GRMZM2G131321-R GGAGGGTGCGTGAATAGG
 GRMZM2G133028-L AGTTACTGCTATCATCGTTGTTCC
 GRMZM2G133028-R ACCTGAACCAGTTAGAAAAGAGTG
 GRMZM2G153434-L TAGAATCGGGAGGATGTGTTTTG
 GRMZM2G153434-R CCAAGCAGCAACCAGTGA
 GRMZM2G177026-L AGAATTGCAGCGTGTCACAG
 GRMZM2G177026-R AGCTTCGAGTGGTGCTGTTT
 GRMZM2G306935-L GTTTGTCGGTGGTTTATTTGTCAG
 GRMZM2G306935-R CCTTTTGTGCCAGCAATGTG
 GRMZM2G408305-L GTGGACATTGATGATGCCGATA
 GRMZM2G408305-R AGCCTGACATCTTCATGGAAATAAC
 GRMZM2G416751-L GACTGGACCTGGTCTGTG
 GRMZM2G416751-R CCATTTGAAGCATCCTCAAGC
 GRMZM2G587327-L CGACTCCTGGGTCTTCAC
 GRMZM2G587327-R CCTTTGCCTGACGACGTT
 GRMZM5G820727-L CTTGAGCGAAGAGCTTCAGAA
 GRMZM5G820727-R CGTTTCATTGTTGTCTGTAACTTCC
 mir156-L GCACACACACAACCTGTTCA
 mir156-R CAGATGGGCTTGATGAGTGA
 Rsp31B-L GGATTTGTTTGTCGCCAACT
 Rsp31B-R AGCCCGTATAGCATCATTGC
 Ubiquitin-L TAAGCTGCCGATGTGCCTGCG
 Ubiquitin-R CTGAAAGACAGAACATAATGAGCACA
Recombinant cloning
 Rgh3ΔUHM-1 ACAAGATGGAAGGCGGCCATATGCGG
 Rgh3ΔUHM-2 CCGCATATGGCCGCCTTCCATCTTGTTGATTTATCAGGGTAAAAGTG
 Rgh3ΔUHM-3 CCCTGATAAATCAACAAGATGGAAGGCGGCC
 Rgh3ΔUHM-4 TGATTTATCAGGGTAAAAGTG
 U2AF1-L CACCATGGCTGAGCATCTTGCGTCCATCTTTG
 U2AF1-R TTTCACCTGAGCTGCCTCCCGCTCGCGGTTC
 U2AF2-L CACCATGTCCGAGTACGACGAGCGCTACC
 U2AF2-R TTTAAGCCCGATTGCCAATATTGCATAGGG
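
For readers who want to screen the primers in Table S4 before reusing them, a rough sketch such as the following computes length, GC fraction, and a rule-of-thumb melting temperature. The Wallace rule (2 °C per A/T, 4 °C per G/C) is an assumption chosen here for simplicity: the source does not state how the primers were designed or which annealing temperatures were used, and the rule is only a coarse approximation for short oligonucleotides. The function names are hypothetical.

```python
# A few RT-PCR primers copied from Table S4 (5'->3'); extend as needed.
PRIMERS = {
    "Actin-L": "CATGAGGCCACGTACAACTCCATC",
    "Actin-R": "TCATACTCTCCCTTGGAGATCCAC",
    "GRMZM2G130432-L": "GCATTTAGGCGGCGTGT",
    "GRMZM2G130432-R": "AGTTATCTGTGTCAGCACATTGATC",
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the primer."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb melting temperature in deg C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(f"{name}\t{len(seq)} nt\tGC={gc_fraction(seq):.2f}\tTm~{wallace_tm(seq)} C")
```

Left and right primers of a pair with very different estimates would be candidates for a closer look with a nearest-neighbor calculator before the pair is reused.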

Protein Analyses.

Rabbit polyclonal N-terminal RGH3 antibodies were raised against the synthetic peptide SAQEVLDKVAQETPNFGTE (Bio Synthesis). Total protein from seed tissue was extracted as described (69) except that the fresh tissue extracts were cleared with a single filtration step. For in vitro transcription/translation, ORFs from the described Rgh3 variants (19) were amplified with Phusion polymerase (Finnzymes) using primers containing an N-terminal 6×His tag and restriction sites for SgfI (Promega) and PmeI (New England Biolabs). PCR products were digested, ligated into pF3AWG vectors (Promega), and expressed using the TnT SP6 High-Yield Wheat Germ Protein Expression System (Promega) following the manufacturer’s instructions.

Supplementary Material

Supplementary File
pnas.1616173114.sd02.xlsx (14.3MB, xlsx)
Supplementary File
pnas.1616173114.sd03.xlsx (57.1KB, xlsx)
Supplementary File
pnas.1616173114.sd04.xlsx (27.5KB, xlsx)
Supplementary File
pnas.1616173114.sd05.xlsx (69.4KB, xlsx)

Acknowledgments

We thank Yuqing Xiong and Byung-Ho Kang for technical assistance in dissecting endosperm tissues. This work was supported by National Science Foundation (NSF) Graduate Research Fellowship DGE-0802270 (to C.M.G.), NSF Awards MCB-1412218 and IOS-1031416, the University of Florida Genetics Institute Center for RNA Research, and the Vasil–Monsanto Endowment.

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Data deposition: The RNA-seq data files reported in this paper have been deposited in the Gene Expression Omnibus (GEO) database, www.ncbi.nlm.nih.gov/geo (accession no. GSE57466).

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1616173114/-/DCSupplemental.

References

  • 1. Braunschweig U, Gueroussov S, Plocik AM, Graveley BR, Blencowe BJ. Dynamic integration of splicing within gene regulatory pathways. Cell. 2013;152(6):1252–1269. doi: 10.1016/j.cell.2013.02.034.
  • 2. Dávila López M, Rosenblad MA, Samuelsson T. Computational screen for spliceosomal RNA genes aids in defining the phylogenetic distribution of major and minor spliceosomal components. Nucleic Acids Res. 2008;36(9):3001–3010. doi: 10.1093/nar/gkn142.
  • 3. Turunen JJ, Niemelä EH, Verma B, Frilander MJ. The significant other: Splicing by the minor spliceosome. Wiley Interdiscip Rev RNA. 2013;4(1):61–76. doi: 10.1002/wrna.1141.
  • 4. Younis I, et al. Minor introns are embedded molecular switches regulated by highly unstable U6atac snRNA. eLife. 2013;2:e00780. doi: 10.7554/eLife.00780.
  • 5. Otake LR, Scamborova P, Hashimoto C, Steitz JA. The divergent U12-type spliceosome is required for pre-mRNA splicing and is essential for development in Drosophila. Mol Cell. 2002;9(2):439–446. doi: 10.1016/s1097-2765(02)00441-0.
  • 6. Edery P, et al. Association of TALS developmental disorder with defect in minor splicing component U4atac snRNA. Science. 2011;332(6026):240–243. doi: 10.1126/science.1202205.
  • 7. He H, et al. Mutations in U4atac snRNA, a component of the minor spliceosome, in the developmental disorder MOPD I. Science. 2011;332(6026):238–240. doi: 10.1126/science.1200587.
  • 8. Markmiller S, et al. Minor class splicing shapes the zebrafish transcriptome during development. Proc Natl Acad Sci USA. 2014;111(8):3062–3067. doi: 10.1073/pnas.1305536111.
  • 9. Kim WY, et al. The Arabidopsis U12-type spliceosomal protein U11/U12-31K is involved in U12 intron splicing via RNA chaperone activity and affects plant development. Plant Cell. 2010;22(12):3951–3962. doi: 10.1105/tpc.110.079103.
  • 10. Jung HJ, Kang H. The Arabidopsis U11/U12-65K is an indispensible component of minor spliceosome and plays a crucial role in U12 intron splicing and plant development. Plant J. 2014;78(5):799–810. doi: 10.1111/tpj.12498.
  • 11. Xu T, Kim BM, Kwak KJ, Jung HJ, Kang H. The Arabidopsis homolog of human minor spliceosomal protein U11-48K plays a crucial role in U12 intron splicing and plant development. J Exp Bot. 2016;67(11):3397–3406. doi: 10.1093/jxb/erw158.
  • 12. Shen H, Zheng X, Luecke S, Green MR. The U2AF35-related protein Urp contacts the 3′ splice site to promote U12-type intron splicing and the second step of U2-type intron splicing. Genes Dev. 2010;24(21):2389–2394. doi: 10.1101/gad.1974810.
  • 13. Tronchère H, Wang J, Fu XD. A protein related to splicing factor U2AF35 that interacts with U2AF65 and SR proteins in splicing of pre-mRNA. Nature. 1997;388(6640):397–400. doi: 10.1038/41137.
  • 14. Yoshida K, et al. Frequent pathway mutations of splicing machinery in myelodysplasia. Nature. 2011;478(7367):64–69. doi: 10.1038/nature10496.
  • 15. Damm F, et al.; Groupe Francophone des Myélodysplasies. Mutations affecting mRNA splicing define distinct clinical phenotypes and correlate with patient outcome in myelodysplastic syndromes. Blood. 2012;119(14):3211–3218. doi: 10.1182/blood-2011-12-400994.
  • 16. Thol F, et al. Frequency and prognostic impact of mutations in SRSF2, U2AF1, and ZRSR2 in patients with myelodysplastic syndromes. Blood. 2012;119(15):3578–3584. doi: 10.1182/blood-2011-12-399337.
  • 17. Cazzola M, Della Porta MG, Malcovati L. The genetic basis of myelodysplasia and its clinical relevance. Blood. 2013;122(25):4021–4034. doi: 10.1182/blood-2013-09-381665.
  • 18. Madan V, et al. Aberrant splicing of U12-type introns is the hallmark of ZRSR2 mutant myelodysplastic syndrome. Nat Commun. 2015;6:6042. doi: 10.1038/ncomms7042.
  • 19. Fouquet R, et al. Maize rough endosperm3 encodes an RNA splicing factor required for endosperm cell differentiation and has a nonautonomous effect on embryo development. Plant Cell. 2011;23(12):4280–4297. doi: 10.1105/tpc.111.092163.
  • 20. Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010;28(5):511–515. doi: 10.1038/nbt.1621.
  • 21. Szcześniak MW, Kabza M, Pokrzywa R, Gudyś A, Makałowska I. ERISdb: A database of plant splice sites and splicing signals. Plant Cell Physiol. 2013;54(2):e10. doi: 10.1093/pcp/pct001.
  • 22. Burge CB, Padgett RA, Sharp PA. Evolutionary fates and origins of U12-type introns. Mol Cell. 1998;2(6):773–785. doi: 10.1016/s1097-2765(00)80292-0.
  • 23. Chang WC, Chen YC, Lee KM, Tarn WY. Alternative splicing and bioinformatic analysis of human U12-type introns. Nucleic Acids Res. 2007;35(6):1833–1841. doi: 10.1093/nar/gkm026.
  • 24. Chen J, et al. Dynamic transcriptome landscape of maize embryo and endosperm development. Plant Physiol. 2014;166(1):252–264. doi: 10.1104/pp.114.240689.
  • 25. Alioto TS. U12DB: A database of orthologous U12-type spliceosomal introns. Nucleic Acids Res. 2007;35(Database issue):D110–D115. doi: 10.1093/nar/gkl796.
  • 26. Lahmy S, et al. QQT proteins colocalize with microtubules and are essential for early embryo development in Arabidopsis. Plant J. 2007;50(4):615–626. doi: 10.1111/j.1365-313X.2007.03072.x.
  • 27. Soni S, Bala S, Kumar A, Hanspal M. Changing pattern of the subcellular distribution of erythroblast macrophage protein (Emp) during macrophage differentiation. Blood Cells Mol Dis. 2007;38(1):25–31. doi: 10.1016/j.bcmd.2006.09.005.
  • 28. Symoens S, et al. Genetic defects in TAPT1 disrupt ciliogenesis and cause a complex lethal osteochondrodysplasia. Am J Hum Genet. 2015;97(4):521–534. doi: 10.1016/j.ajhg.2015.08.009.
  • 29. Li HJ, et al. POD1 regulates pollen tube guidance in response to micropylar female signaling and acts in early embryo patterning in Arabidopsis. Plant Cell. 2011;23(9):3288–3302. doi: 10.1105/tpc.111.088914.
  • 30. Porch TG, Tseung CW, Schmelz EA, Settles AM. The maize Viviparous10/Viviparous13 locus encodes the Cnx1 gene required for molybdenum cofactor biosynthesis. Plant J. 2006;45(2):250–263. doi: 10.1111/j.1365-313X.2005.02621.x.
  • 31. Suzuki M, et al. The maize viviparous15 locus encodes the molybdopterin synthase small subunit. Plant J. 2006;45(2):264–274. doi: 10.1111/j.1365-313X.2005.02620.x.
  • 32. Lin CF, Mount SM, Jarmołowski A, Makałowski W. Evolutionary dynamics of U12-type spliceosomal introns. BMC Evol Biol. 2010;10:47. doi: 10.1186/1471-2148-10-47.
  • 33. Magyar Z, et al. Arabidopsis E2FA stimulates proliferation and endocycle separately through RBR-bound and RBR-free complexes. EMBO J. 2012;31(6):1480–1493. doi: 10.1038/emboj.2012.13.
  • 34. Kielkopf CL, Rodionova NA, Green MR, Burley SK. A novel peptide recognition mode revealed by the X-ray structure of a core U2AF35/U2AF65 heterodimer. Cell. 2001;106(5):595–605. doi: 10.1016/s0092-8674(01)00480-9.
  • 35. Hu CD, Chinenov Y, Kerppola TK. Visualization of interactions among bZIP and Rel family proteins in living cells using bimolecular fluorescence complementation. Mol Cell. 2002;9(4):789–798. doi: 10.1016/s1097-2765(02)00496-3.
  • 36. Patel AA, McCarthy M, Steitz JA. The splicing of U12-type introns can be a rate-limiting step in gene expression. EMBO J. 2002;21(14):3804–3815. doi: 10.1093/emboj/cdf297.
  • 37. Pessa HK, et al. Gene expression profiling of U12-type spliceosome mutant Drosophila reveals widespread changes in metabolic pathways. PLoS One. 2010;5(10):e13215. doi: 10.1371/journal.pone.0013215.
  • 38. Lotti F, et al. An SMN-dependent U12 splicing event essential for motor circuit function. Cell. 2012;151(2):440–454. doi: 10.1016/j.cell.2012.09.012.
  • 39. Doktor TK, et al. RNA-sequencing of a mouse-model of spinal muscular atrophy reveals tissue-wide changes in splicing of U12-dependent introns. Nucleic Acids Res. 2016;45(1):395–416. doi: 10.1093/nar/gkw731.
  • 40. Russell AG, Charette JM, Spencer DF, Gray MW. An early evolutionary origin for the minor spliceosome. Nature. 2006;443(7113):863–866. doi: 10.1038/nature05228.
  • 41. Clarke M, et al. Genome of Acanthamoeba castellanii highlights extensive lateral gene transfer and early evolution of tyrosine kinase signaling. Genome Biol. 2013;14(2):R11. doi: 10.1186/gb-2013-14-2-r11.
  • 42. Hanschen ER, et al. The Gonium pectorale genome demonstrates co-option of cell cycle regulation during the evolution of multicellularity. Nat Commun. 2016;7:11370. doi: 10.1038/ncomms11370.
  • 43. Nussbaumer T, et al. MIPS PlantsDB: A database framework for comparative plant genome research. Nucleic Acids Res. 2013;41(Database issue):D1144–D1151. doi: 10.1093/nar/gks1153.
  • 44. Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10(3):R25. doi: 10.1186/gb-2009-10-3-r25.
  • 45. Monaco MK, et al. Gramene 2013: Comparative plant genomics resources. Nucleic Acids Res. 2014;42(Database issue):D1193–D1199. doi: 10.1093/nar/gkt1110.
  • 46. Lee WP, et al. MOSAIK: A hash-based algorithm for accurate next-generation sequencing short-read mapping. PLoS One. 2014;9(3):e90581. doi: 10.1371/journal.pone.0090581.
  • 47. Garrison E, Marth G. Haplotype-based variant detection from short-read sequencing. 2012. arXiv:1207.3907.
  • 48. Wu TD, Nacu S. Fast and SNP-tolerant detection of complex variants and splicing in short reads. Bioinformatics. 2010;26(7):873–881. doi: 10.1093/bioinformatics/btq057.
  • 49. Anders S, Pyl PT, Huber W. HTSeq—a Python framework to work with high-throughput sequencing data. Bioinformatics. 2015;31(2):166–169. doi: 10.1093/bioinformatics/btu638.
  • 50. Katz Y, Wang ET, Airoldi EM, Burge CB. Analysis and design of RNA sequencing experiments for identifying isoform regulation. Nat Methods. 2010;7(12):1009–1015. doi: 10.1038/nmeth.1528.
  • 51. Quinlan AR, Hall IM. BEDTools: A flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841–842. doi: 10.1093/bioinformatics/btq033.
  • 52. Xiong Y, Li Q, Kang B, Chourey P. Discovery of genes expressed in basal endosperm transfer cells in maize using 454 transcriptome sequencing. Plant Mol Biol Report. 2011;29(4):835–847.
  • 53. Nicol JW, Helt GA, Blanchard SG, Jr, Raja A, Loraine AE. The Integrated Genome Browser: Free software for distribution and exploration of genome-scale datasets. Bioinformatics. 2009;25(20):2730–2731. doi: 10.1093/bioinformatics/btp472.
  • 54. Eddy SR. A new generation of homology search tools based on probabilistic inference. Genome Inform. 2009;23(1):205–211.
  • 55. Finn RD, et al. The Pfam protein families database: Towards a more sustainable future. Nucleic Acids Res. 2016;44(D1):D279–D285. doi: 10.1093/nar/gkv1344.
  • 56. Andorf CM, et al. MaizeGDB update: New tools, data and interface for the maize model organism database. Nucleic Acids Res. 2016;44(D1):D1195–D1201. doi: 10.1093/nar/gkv1007.
  • 57. Gray KA, Seal RL, Tweedie S, Wright MW, Bruford EA. A review of the new HGNC gene family resource. Hum Genomics. 2016;10:6. doi: 10.1186/s40246-016-0062-6.
  • 58. Huang D-W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009;4(1):44–57. doi: 10.1038/nprot.2008.211.
  • 59. Huang D-W, Sherman BT, Lempicki RA. Bioinformatics enrichment tools: Paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Res. 2009;37(1):1–13. doi: 10.1093/nar/gkn923.
  • 60. Farrell CM, et al. Current status and new features of the Consensus Coding Sequence database. Nucleic Acids Res. 2014;42(Database issue):D865–D872. doi: 10.1093/nar/gkt1059.
  • 61. Sievers F, et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol. 2011;7:539. doi: 10.1038/msb.2011.75.
  • 62. Mustroph A, Juntawong P, Bailey-Serres J. Isolation of plant polysomal mRNA by differential centrifugation and ribosome immunopurification methods. Methods Mol Biol. 2009;553:109–126. doi: 10.1007/978-1-60327-563-7_6.
  • 63. Bai F, Reinheimer R, Durantini D, Kellogg EA, Schmidt RJ. TCP transcription factor, BRANCH ANGLE DEFECTIVE 1 (BAD1), is required for normal tassel branch angle formation in maize. Proc Natl Acad Sci USA. 2012;109(30):12225–12230. doi: 10.1073/pnas.1202439109.
  • 64. Horton RM, Hunt HD, Ho SN, Pullen JK, Pease LR. Engineering hybrid genes without the use of restriction enzymes: Gene splicing by overlap extension. Gene. 1989;77(1):61–68. doi: 10.1016/0378-1119(89)90359-4.
  • 65. Karimi M, Inzé D, Depicker A. GATEWAY vectors for Agrobacterium-mediated plant transformation. Trends Plant Sci. 2002;7(5):193–195. doi: 10.1016/s1360-1385(02)02251-3.
  • 66. Karimi M, Depicker A, Hilson P. Recombinational cloning with plant gateway vectors. Plant Physiol. 2007;145(4):1144–1154. doi: 10.1104/pp.107.106989.
  • 67. Tzfira T, et al. pSAT vectors: A modular series of plasmids for autofluorescent protein tagging and expression of multiple genes in plants. Plant Mol Biol. 2005;57(4):503–516. doi: 10.1007/s11103-005-0340-5.
  • 68. Citovsky V, et al. Subcellular localization of interacting proteins by bimolecular fluorescence complementation in planta. J Mol Biol. 2006;362(5):1120–1131. doi: 10.1016/j.jmb.2006.08.017.
  • 69. Abdalla KO, Thomson JA, Rafudeen MS. Protocols for nuclei isolation and nuclear protein extraction from the resurrection plant Xerophyta viscosa for proteomic studies. Anal Biochem. 2009;384(2):365–367. doi: 10.1016/j.ab.2008.09.049.
