PLOS Genetics. 2020 Mar 18;16(3):e1008637. doi: 10.1371/journal.pgen.1008637

Transcriptional regulation of genes bearing intronic heterochromatin in the rice genome

Nino A Espinas 1,2, Le Ngoc Tu 1, Leonardo Furci 1, Yasuka Shimajiri 1,3, Yoshiko Harukawa 1, Saori Miura 1, Shohei Takuno 4, Hidetoshi Saze 1,*
Editor: Ortrun Mittelsten Scheid 5
PMCID: PMC7145194  PMID: 32187179

Abstract

Intronic regions of eukaryotic genomes accumulate many Transposable Elements (TEs). Intronic TEs often trigger the formation of transcriptionally repressive heterochromatin, even within transcription-permissive chromatin environments. Although TE-bearing introns are widely observed in eukaryotic genomes, their epigenetic states, impacts on gene regulation and function, and their contributions to genetic diversity and evolution, remain poorly understood. In this study, we investigated the genome-wide distribution of intronic TEs and their epigenetic states in the Oryza sativa genome, where TEs comprise 35% of the genome. We found that over 10% of rice genes contain intronic heterochromatin, most of which are associated with TEs and repetitive sequences. These heterochromatic introns are longer and highly enriched in promoter-proximal positions. On the other hand, introns also accumulate hypomethylated short TEs. Genes with heterochromatic introns are implicated in various biological functions. Transcription of genes bearing intronic heterochromatin is regulated by an epigenetic mechanism involving the conserved factor OsIBM2, mutation of which results in severe developmental and reproductive defects. Furthermore, we found that heterochromatic introns evolve rapidly compared to non-heterochromatic introns. Our study demonstrates that heterochromatin is a common epigenetic feature associated with actively transcribed genes in the rice genome.

Author summary

Intronic regions of eukaryotic genomes accumulate many Transposable Elements (TEs) and repeats. These intronic repeats are often targeted by epigenetic silencing mechanisms and form a repressive heterochromatin structure, even within transcriptionally active genes. However, the distribution of TEs in the intragenic regions, and their contributions to genetic diversity and evolution in plant genomes, remain poorly understood. In this study, we investigated the genome-wide distribution of intronic TEs and their epigenetic states in the Oryza sativa genome, where TEs comprise 35% of the genome. We found that over 10% of rice genes contain introns associated with repressive heterochromatin. Genes with heterochromatic introns are implicated in various biological functions. The conserved protein OsIBM2 is required for proper transcription of a group of heterochromatin-containing genes. We also found that heterochromatic introns evolve rapidly compared to non-heterochromatic introns. Our study indicates that heterochromatin is a common feature in transcribed genes in the rice genome.

Introduction

Genomes of eukaryotes contain substantial numbers of transposable elements (TEs), which shape genomic structures and epigenomic landscapes [1, 2]. In plants, genomic TE contents are strongly correlated with genome size expansion [3]. Since TE insertions in genes disrupt coding sequences and regulatory elements, TEs are evolutionarily purged from genic regions and accumulate in the gene-poor pericentromeric regions of chromosomes, especially in species with small genomes, such as Arabidopsis thaliana [2, 4]. However, in plants with larger genomes, TEs are also distributed across the gene-rich chromosome arm regions, and often affect transcription of surrounding genes [2, 5–8].

Due to their harmful effects in the genome, TEs are often epigenetically modified and transcriptionally silenced by genome defense mechanisms [9, 10]. In plants, interdependent chromatin modifications including DNA methylation, histone modifications, and RNA interference (RNAi) play key roles in transcriptional repression of TEs. DNA cytosine methylation is found in both CG and non-CG (CHG, CHH; H = A, T, C) contexts in plant genomes, which is important in TE silencing. CG methylation is maintained through DNA replication by Methyltransferase 1 (MET1) [11–14]. In addition, DNA methylation is directed by RNAi-based RNA-dependent DNA methylation, where small interfering RNAs (siRNAs) recruit de novo DNA methyltransferase to target sequences [15]. Furthermore, histone modifications, including histone H3 Lys9 methylation (H3K9me), are tightly linked to non-CG methylation, and are associated with repressive chromatin states [16]. Chromatin with these modifications results in the formation of a condensed repressive chromatin structure called heterochromatin [16–18], commonly associated with most TE sequences. The chromatin remodeler Decrease in DNA Methylation 1 (DDM1) is required for the maintenance of heterochromatin [19–21].

The formation of heterochromatin on TEs in genic regions causes transcriptional repression of surrounding genes in plant genomes [22, 23]. For example, heterochromatin associated with TEs and repetitive sequences in promoter regions often causes transcriptional repression of downstream genes [24–27]. Many TEs are also present in intronic regions, especially in large plant genomes [28–31], likely due to less adverse effects on coding sequences compared to exonic insertions. Enigmatically, intronic TE sequences can also be targeted by repressive chromatin modifications, thus forming heterochromatic structures within transcription-permissive chromatin environments [32]. In Arabidopsis thaliana, nuclear proteins, including INCREASE IN BONSAI METHYLATION 2 (IBM2)/ANTI-SILENCING1/SHOOT GROWTH1, are required for proper transcription of heterochromatin-containing genes [33–36]. IBM2 contains a Bromo-Adjacent Homology (BAH) domain and an RNA recognition motif, and in the Arabidopsis ibm2 mutant, genes containing heterochromatic introns show a transcription defect due to premature termination of transcripts at the heterochromatic regions. Intronic heterochromatin tends to repress expression of associated genes in both animals and plants [37–42]. However, in some circumstances establishment and maintenance of heterochromatin within intronic regions are critical for transcriptional control of the associated genes required for environmental responses and development [28, 43, 44]. For example, in A. thaliana, maintenance of H3K9 methylation and DNA methylation of intronic TEs is important for transcription of the RPP7 gene, which confers resistance against a plant pathogen [45]. In winter wheat, vernalization induces DNA hypermethylation of the intron of VRN-A1, which promotes expression of the gene [43]. On the other hand, in oil palm, loss of DNA methylation of an intronic TE arising during tissue culture alters the splicing pattern of the associated gene, resulting in a developmental abnormality of the fruit [46]. These observations suggest that heterochromatin formation in intragenic TEs, especially intronic TEs, may have functional relevance to transcriptional regulation of associated genes, and may also profoundly influence gene diversification and evolution. Indeed, plant introns often encode regulatory elements for recruitment of transcription factors that alter chromatin states, which lead to both transcriptional repression and activation of developmental genes [47–50]. However, epigenetic states of introns at a genome-wide scale, their impacts on gene regulation, functions of genes bearing intronic heterochromatin, and their contribution to genetic diversity in plant genomes are not well understood.

The Oryza sativa genome is an ideal model for investigating interactions between genes and TEs, since 35% of the genome consists of TEs that are widely distributed in genic regions [51–53]. In this study, we investigated the genome-wide distribution of intronic heterochromatin and its impact on transcriptional control of genes in the rice genome. We found that over 10% of rice genes contain introns associated with repressive heterochromatin, and that these genes are involved in various biological processes. Transcription of genes bearing intronic heterochromatin, as well as of other genes without heterochromatic introns, is affected by a loss of function of the conserved factor OsIBM2, which is essential for development and reproduction of rice. Rapid evolution of heterochromatic introns suggests their potential impacts on the evolution of gene sequences.

Results

Accumulation of heterochromatic introns in the rice genome

To investigate the epigenetic states of intronic regions in the rice genome (Oryza sativa L. ssp. japonica cv. Nipponbare), we performed whole-genome bisulfite sequencing (WGBS) analysis using mature rice leaf tissue. We specifically focused on detecting repressive heterochromatic states in intronic regions. Non-CG DNA methylation, especially CHG methylation, is well correlated with the heterochromatic state of histone modifications such as H3K9 methylation in plant genomes [20, 31, 54, 55]. Therefore, CHG methylation level was used as a proxy to define a heterochromatic state of chromatin within intronic regions (see Methods for more details). We identified 5,809 introns within 4,227 gene models that contain heterochromatic domains in the rice genome (Fig 1A, S1 and S2 Tables). This is about 11% of gene models in the rice genome (IRGSP-1.0; 37,866 gene models), and is 10-fold more abundant than in Arabidopsis thaliana (S1A Fig). Heterochromatic introns accumulate CG, CHG, and CHH methylation, as well as H3K9 di-methylation (H3K9me2; S1B Fig), similar to transposable elements (TEs; Fig 1B and 1C), indicating that they contain canonical heterochromatin. Loci containing heterochromatic introns are not biased toward repeat-rich pericentromeric regions, but are rather scattered throughout the rice chromosome arms (Fig 1D). RNA-seq analysis of the leaf tissue demonstrated that many of these loci are transcribed in the presence of intronic heterochromatin (Fig 1E), indicating that heterochromatic introns can co-exist within transcriptionally active genes in the rice genome.
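As a rough illustration of using CHG methylation as a proxy, the sketch below computes a read-weighted CHG methylation level for one intron and compares it with a cutoff. The function names, the toy read counts, and the cutoff of 0.3 are assumptions made for illustration only; the criteria actually used to call heterochromatic domains are given in the Methods and are not reproduced here.

```python
from typing import List, Tuple

# Hypothetical cutoff, chosen only for this illustration (not from the paper).
HYPOTHETICAL_CHG_CUTOFF = 0.3


def weighted_methylation(sites: List[Tuple[int, int]]) -> float:
    """Read-weighted methylation level.

    `sites` holds one (methylated_reads, total_reads) pair per CHG cytosine.
    The level is the sum of methylated reads divided by the sum of all reads.
    """
    total_reads = sum(total for _, total in sites)
    if total_reads == 0:
        raise ValueError("no read coverage at any CHG site")
    return sum(meth for meth, _ in sites) / total_reads


def looks_heterochromatic(sites: List[Tuple[int, int]],
                          cutoff: float = HYPOTHETICAL_CHG_CUTOFF) -> bool:
    """Flag an intron whose CHG methylation level reaches the cutoff."""
    return weighted_methylation(sites) >= cutoff


# Toy data: three CHG sites in a methylated intron, two in an unmethylated one.
methylated_intron = [(8, 10), (6, 10), (9, 12)]
unmethylated_intron = [(0, 10), (1, 10)]

print(weighted_methylation(methylated_intron), looks_heterochromatic(methylated_intron))
print(weighted_methylation(unmethylated_intron), looks_heterochromatic(unmethylated_intron))
```

Weighting by read depth keeps poorly covered cytosines from dominating the estimate, which is one reason this form is often preferred over a simple mean of per-site ratios.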

Fig 1. Intronic heterochromatin in the rice genome.


(A) Rice IRGSP-1.0 gene models (n = 37,866) that contain a heterochromatic domain in an intron (n = 4,227). (B) DNA methylation levels in CG (mCG), CHG (mCHG) and CHH (mCHH) contexts for indicated genome features. (C) Metaplots of DNA methylation in CG (blue), CHG (light blue) and CHH (orange) context for indicated genome features. (D) (Top) Density of repeats, genes, and intron sequences in 1-Mb bins in rice chromosomes. (Bottom) Density of introns with heterochromatic domains as above. (E) Representative rice genome loci containing heterochromatic domains within introns. Tracks: Top to bottom; RNAseq (Reads per Million are indicated at top left), mCG ratio (0 to 1), mCHG ratio (0 to 1), mCHH ratio (0 to 1), H3K9me2 (RPM; 0 to 1), TE annotation (blue), repeats (orange), gene model (purple), introns containing heterochromatic domain (black). Black arrows indicate the orientation of coding sequence.

Heterochromatin is enriched in promoter-proximal introns

In general, introns in the rice genome are longer than those in the A. thaliana genome (S2A Fig), which may be due to abundant repeat sequences in introns (13.9% of total intron sequence; S2B Fig). In particular, rice introns associated with CHG methylation tend to be longer (Fig 2A, S2C Fig) [34]. It has been reported that the first intron is generally longer than later introns in most eukaryotic genomes, including that of rice [56] (Fig 2B). We found that heterochromatic introns are longer irrespective of their positions (Fig 2B). However, formation of heterochromatic introns is significantly biased toward the 5′-ends of rice genes (p < 1.0e-6 by a permutation test, Fig 2C), which is associated with accumulation of TEs in promoter-proximal introns (S2C Fig). This suggests that preferential targeting of TEs toward the 5′-ends of rice genes might be a trigger for the formation of heterochromatin in promoter-proximal introns.

Fig 2. Heterochromatin is enriched in promoter-proximal introns.


(A) Boxplots for length of normal and heterochromatic introns. Heterochromatic introns are significantly longer than introns without heterochromatic domains (p-value < 2.2e-16, Wilcoxon exact test). (B) (left) Intron position and length for all introns. (right) Intron position and length for all introns (white), non-heterochromatic introns (pink) and heterochromatic introns (red). (C) Enrichment of heterochromatin in promoter-proximal introns. Fraction of relative positions for all introns (n = 151,045), and heterochromatic introns (n = 6,086; the average position of heterochromatic introns was 3.02) are shown. Identical introns annotated in different positions in different splicing variants were independently counted.

Many intronic TEs are short and hypomethylated in CHG context

Next, we investigated how the presence of TEs affects heterochromatin formation within intronic regions. A set of manually curated TE annotations (n = 29,100, S3 Table) was analyzed for their locations in the genome. We found that about 7% (2,122/29,100) of TEs are located within intragenic regions (Fig 3A, S3A Fig), and that most of them (82%; 1,751/2,122) are present in introns (Fig 3B). TEs annotated as miniature inverted-repeat transposable elements (MITEs) are particularly enriched in intragenic regions (S3A Fig), consistent with previous studies [57, 58], while no strong orientation bias against the associated genes was observed in any of the TE families (S3B Fig). As expected, most heterochromatic introns (84%; 4,886/5,809) are associated with TEs and other repeat sequences (Fig 3C). Interestingly, however, about 50% (980/1,967) of TE-containing introns do not overlap with the heterochromatic introns (Fig 3C), suggesting that intronic TEs are not always associated with heterochromatin. On the other hand, 16% (923/5,809) of heterochromatic introns are not associated with TEs or with repeat annotations, which is likely due to a spreading of heterochromatic modifications from neighboring chromatin (S3C Fig) [59, 60]. DNA methylation of intronic TEs seems to be maintained in the same manner as intergenic TEs, since methylation in intronic TEs is affected by mutations of the maintenance methylase OsMET1 and the chromatin remodeler OsDDM1 (S3D Fig) [12, 21]. However, we found that a fraction of intronic TEs is hypomethylated, especially in the CHG context, while the distribution of CG and CHH methylation levels among TEs is comparable between intergenic and intronic TEs, irrespective of the TE families (Fig 3D, S4 Fig). CHG-hypomethylated TEs are generally shorter than CHG-hypermethylated TEs (S5 Fig), and they are more abundant in introns (Fig 3E). A similar trend was observed in an analysis using a comprehensive MITE dataset [61] (153,751 TE sequences), which showed that short, CHG-hypomethylated MITEs (S6A and S6B Fig) are enriched in introns. Shorter TEs are likely degenerated or truncated TE sequences, on which relaxed epigenetic silencing may have resulted in a reduction of CHG methylation.
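The proportions quoted in this section follow directly from the stated counts. The short sketch below recomputes them as a consistency check; the dictionary labels and the helper name `percent` are our own, and only the counts come from the text.

```python
# Counts as quoted in the text: (part, whole).
counts = {
    "TEs located in intragenic regions": (2122, 29100),
    "intragenic TEs located in introns": (1751, 2122),
    "heterochromatic introns with TEs or repeats": (4886, 5809),
    "TE-containing introns without heterochromatin": (980, 1967),
    "heterochromatic introns without TEs or repeats": (923, 5809),
}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


for label, (part, whole) in counts.items():
    print(f"{label}: {part}/{whole} = {percent(part, whole):.1f}%")

# The two classes of heterochromatic introns should partition the total.
print(4886 + 923 == 5809)
```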

Fig 3. Intronic TEs are short and hypomethylated.


(A) Classification of all TEs (n = 29,100) in the rice genome. “Other TEs” refers to TE annotations overlapping to both gene and intergenic regions. (B) Classification of intragenic TEs (n = 2,122) in the rice genome. “Exon-intron” refers to TE annotations overlapping to both exon and intron. “Exon/intron” refers to TE annotations included in an exon of a gene/transcript model as well as in an intron of other gene/transcript models. (C) Venn diagram showing the number of overlapping introns containing heterochromatic domains (blue), TEs (red) and other repeats (yellow). (D) Histograms of the number of intergenic and intronic TEs and their methylation levels (0 to 1) in CG, CHG, and CHH contexts. TEs with methylation data at ≥ 5 Cs were included in the analysis. (E) Density plots showing length (log10) and methylation levels (0 to 1) of intergenic and intronic TEs in CG, CHG, and CHH contexts. TEs with methylation data at ≥ 5 Cs in each context were included in the analysis.

Heterochromatic introns are associated with genes involved in various biological functions

Rice genes with heterochromatic introns encode various proteins with enzymatic activities, including oxidoreductases and hydrolases, as well as with nucleotide-binding activities (S7A Fig). Gene Ontology (GO) enrichment analysis indicates that genes with heterochromatic introns are implicated in diverse functions, such as lipid/carbohydrate metabolic processes, post-embryonic and reproductive developmental processes, and the cell death pathway, which is a manifestation of plant defense responses against pathogens (Fig 4A). On the other hand, GO terms such as nitrogen biosynthetic/metabolic processes were depleted among these genes (S7B Fig). Our transcriptome analysis of mature rice leaf tissue showed that the expression levels of genes with heterochromatic introns were generally lower than those of genes without heterochromatin (Fig 4B). We further examined expression patterns of rice genes in various developmental stages as well as in responses to environmental stimuli, using public microarray data in an expression atlas of rice genes and rice RNA-seq data [62, 63]. We calculated entropy values of gene expression patterns as a measure of specificity [64], which showed that genes with heterochromatic introns tend to have tissue-specific expression patterns, and are also responsive to plant hormones and environmental stresses (Fig 4C, S7C Fig). However, the overall effect sizes of the differences between genes with and without heterochromatic introns were relatively small (r < 0.1), suggesting that expression profiles of genes with heterochromatic introns are not too different from those of genes without heterochromatin.
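To make the entropy measure concrete, the sketch below computes the Shannon entropy of a gene's expression profile across tissues: a gene expressed evenly everywhere has high entropy, and a gene expressed in a single tissue has an entropy of zero. Whether the cited measure [64] uses exactly this form and normalization is an assumption, and the expression values are toy numbers.

```python
import math
from typing import Sequence


def expression_entropy(values: Sequence[float]) -> float:
    """Shannon entropy (in bits) of an expression profile.

    Each value is the expression level of one gene in one tissue or
    condition. Lower entropy means more tissue-specific expression.
    """
    if any(v < 0 for v in values):
        raise ValueError("expression values must be non-negative")
    total = sum(values)
    if total <= 0:
        raise ValueError("gene is not expressed in any sample")
    # Zero-expression samples contribute nothing (0 * log 0 is taken as 0).
    probs = [v / total for v in values if v > 0]
    return -sum(p * math.log2(p) for p in probs)


broad = [5.0, 5.0, 5.0, 5.0]      # same level in four tissues
specific = [10.0, 0.0, 0.0, 0.0]  # expressed in one tissue only

print(expression_entropy(broad))     # 2.0 bits, the maximum for four tissues
print(expression_entropy(specific))  # zero entropy: fully tissue-specific
```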

Fig 4. Genes with heterochromatic introns tend to show tissue-specific, and environment-responsive expression patterns.


(A) Gene Ontology enrichment for genes containing heterochromatic introns (2,449 of the 4,227 genes were included in the enrichment analysis). p-values were obtained by Fisher test with Hochberg adjustments (FDR < 0.05). GO terms (odds ratio; 95% Confidence Interval): catalytic activity (1.43; 1.31, 1.56), hydrolase activity (1.29; 1.14, 1.46), transporter activity (1.44; 1.17, 1.76), lipid metabolic process (2.10; 1.67, 2.63), carbohydrate metabolic process (1.60; 1.29, 1.97), death (1.85; 1.36, 2.49), post-embryonic development (2.79; 1.68, 4.50), cell death (1.85; 1.36, 2.49), secondary metabolic process (2.40; 1.55, 3.66), cell differentiation (3.21; 1.72, 5.73), reproductive developmental process (2.44; 1.50, 3.86), reproductive structure development (2.37; 1.44, 3.78), flower development (3.02; 1.59, 5.46), cellular developmental process (2.52; 1.42, 4.27), reproduction (1.78; 1.19, 2.59), membrane (1.23; 1.10, 1.38), extracellular space (3.01; 1.62, 5.34). (B) Expression levels of genes with heterochromatic introns in the leaf tissue measured by RNA-seq (Transcript per million; TPM > 0). p-values by Wilcoxon test are indicated. Effect size r = 0.046. (C) Tissue specificity of normal genes (pink) and heterochromatin-containing genes (red) in the rice developmental process, and specificities for jasmonic acid (JA) and abscisic acid (ABA) treatments, and stress treatments (Cold, Flood), measured by entropy values. p-values from the Wilcoxon test are indicated. Effect size (r) in each analysis: Development; 0.095, JA; 0.024, ABA; 0.031, Cold; 0.031, Flood; 0.037.

To understand the effects of heterochromatic introns on gene regulation in response to environmental signals, we searched for insertion/deletion polymorphisms in the intronic regions between Nipponbare (NB) and the indica-rice cultivar KASALATH (KAS) using whole-genome re-sequencing data [65]. In particular, we sought genes showing expression changes in response to JA, a plant hormone essential for development and also for both biotic and abiotic responses [66] (Fig 4C). Based on the genome re-sequencing data [67] and a public expression profile [63], we selected 12 JA-responsive loci (4 up-regulated, and 8 down-regulated loci after JA treatment; S8 and S9 Figs) that have large intronic deletions in the KASALATH genome corresponding to the regions showing a heterochromatic state in the Nipponbare genome (S9 Fig). Consistent with the public expression profile, the JA-inducible gene OsAOS2 [68, 69] as well as the 12 selected loci in NB showed expression changes in the root tissues upon JA treatment (S8 Fig). Several loci (4 out of 8 loci showing down-regulation in NB by JA treatment) in KAS showed reduced responses to JA treatment (p>0.05; t-test), whereas other loci, including up-regulated genes, showed essentially similar responses between NB and KAS, to variable degrees (S8 Fig). Thus, the impact of heterochromatic introns on gene response and expression remains to be elucidated.

A conserved epigenetic machinery regulates transcription of genes containing heterochromatin, and is essential for rice development

It has been shown in A. thaliana that transcription of genes with heterochromatic introns is regulated by a nuclear protein complex [33]. One of the proteins in the complex is INCREASE IN BONSAI METHYLATION 2 (IBM2), which contains a Bromo-Adjacent Homology (BAH) domain and an RNA recognition motif (RRM) (S10 Fig) [34–36]. The rice homolog of IBM2 is encoded as a single-copy gene in the rice genome [34]. To examine whether it has a conserved function for transcription of genes with heterochromatic introns, we knocked down the transcript of the homologous gene (Os01g0610300; named as OsIBM2) using RNA interference (RNAi), targeting the 3′ end of the gene (Fig 5A). Among several independent T1 transformants, lines #2 and #16 showed a marked reduction of the transcript and were further investigated (Fig 5B). In addition, the CRISPR-Cas9 system was employed to obtain OsIBM2 knock-out lines, which generated independent deletion mutant lines targeting either the BAH domain-encoding region (g1#5 and g1#27) or a 3′ region downstream of the RRM-encoding region (g2#24) (Fig 5C). Both RNAi and CRISPR-targeted mutants showed severe dwarfism and sterility (Fig 5D, S11A–S11C Fig). Particularly, mutants with deletions in the BAH domain (g1#5 and g1#27) could not produce homozygous mutant seeds, suggesting an embryonic lethality of these alleles. Heterozygous mutants with a deletion in the 3′ region (osibm2_g2#24) could produce homozygous seeds, but the homozygous plants showed complete sterility (S11C Fig), indicating that OsIBM2 is essential for development and reproduction. Previous studies in Arabidopsis ibm2 have shown that transcription downstream of heterochromatic introns is reduced due to premature termination of transcripts within heterochromatic introns [34]. Therefore, we analyzed changes in accumulation of transcripts upstream and downstream of introns with both heterochromatic and non-heterochromatic states (total of 126,068 introns) in the rice genome. Our transcriptome analysis of the leaf tissues from both RNAi and CRISPR-Cas9 mutant lines detected 454 differentially expressed genes (DEGs) shared among the RNAi_#2, RNAi_#16 and osibm2_g2#24 lines, which showed changes in transcripts downstream of introns compared with wild type (Fig 5E, S4 Table). Among DEGs, genes containing heterochromatic introns were significantly enriched (93 genes out of 454 DEGs (20.5%); p = 2.9e-17, Fisher’s exact test). DEGs with heterochromatic introns showed a significant reduction of transcripts in the 3′ downstream of the heterochromatic intron (p = 1.0e-6, Tukey-Kramer test; S12B Fig), which was due to premature polyadenylation in the intronic regions (Fig 5F and 5G, S13, S14A–S14C Figs), similar to the phenotypes of the Arabidopsis ibm2 [34, 70]. On the other hand, DEGs with normal introns showed smaller changes in their 3′ transcription (S12A and S12B Fig), suggesting that the mutation in OsIBM2 results in transcription defects predominantly at heterochromatin-containing DEGs. We also searched for differentially expressed TEs in osibm2. We detected only a few of them (23 TEs; 22 LTR, 1 DNA/En-Spm; 12 up-regulated, 11 down-regulated; S14D Fig), including 8 intronic TEs (3 TEs were associated with the DEGs containing heterochromatin; S14E Fig); some of these expression changes of TEs might be due to epigenetic changes during tissue culture transformation. The number of DEGs with and without heterochromatin that were detected by the RNA-seq analysis may have been underestimated, considering the partial loss of function of mutant alleles as well as the tissue-specific/environment-responsive expression profiles of heterochromatin-containing genes (Fig 4 and Fig 5). Indeed, additional RT-PCR analysis using RNAs from endosperm/embryo of osibm2 showed that several heterochromatin-containing genes primarily expressed during reproductive development [71–76] were severely affected in osibm2 (S11D Fig), even though they were not detected as DEGs in the RNA-seq of leaf tissues.
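The enrichment of heterochromatin-containing genes among DEGs can be illustrated with a one-sided Fisher's exact test, which is the upper tail of a hypergeometric distribution. In the sketch below, the background of all 37,866 gene models with 4,227 heterochromatin-containing genes is an assumption; the gene universe actually used for the reported p-value is not stated in this section, so the result is not expected to reproduce p = 2.9e-17. The function name is our own.

```python
from math import comb


def fisher_upper_tail(k: int, n: int, K: int, N: int) -> float:
    """One-sided Fisher's exact test p-value, P(X >= k).

    X is hypergeometric: n genes are drawn from a universe of N genes,
    K of which carry the feature; k of the drawn genes carry it.
    """
    upper = min(n, K)
    # Exact integer arithmetic; the single division happens at the end.
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


# Counts from the text: 93 of 454 DEGs contain heterochromatic introns.
# Assumed background: 4,227 of 37,866 gene models (illustrative choice).
p_value = fisher_upper_tail(k=93, n=454, K=4227, N=37866)
expected = 454 * 4227 / 37866  # DEGs expected to carry the feature by chance

print(f"expected by chance: {expected:.1f}, observed: 93, p = {p_value:.3g}")
```

Working with exact integers avoids the overflow that floating-point binomial coefficients would hit at this scale.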

Fig 5. OsIBM2 is required for rice development and transcription of heterochromatin-bearing genes.


(A) Gene structure of Os01g0610300 (OsIBM2). Exons and untranslated regions are shown with black and white boxes, respectively. Regions designed for two gRNAs and hairpin RNA (RNAi) are also indicated. (B) qRT-PCR analysis of the expression of OsIBM2 in 95-day-old leaf blade tissue of wild-type Nipponbare (NB), RNAi-GFP control line, and four RNAi-IBM2 transgenic lines. Expression levels in each sample were normalized by ACT1 expression levels, and the average of OsIBM2/ACT1 in NB was set as 1. Bars represent means of three biological replicates ± S.E.M (n = 3). (C) Cas9 gRNAs and targeted deletions obtained in independent osibm2 mutants. PAM: Protospacer Adjacent Motif. (D) Three-month-old rice plants of osibm2_g2#24 and their segregating wild type siblings (WT; T4). (E) Venn diagrams of overlapping genes showing altered expression in RNAi #2, #16, and osibm2_g2#24. P-values for significance of overlaps were tested with Fisher’s exact test. (F) Representative rice genome loci showing altered expression patterns in mutants of OsIBM2. Tracks: Top to bottom; RNAseq (Reads per Million are indicated in top left), mCG ratio (0 to 1), mCHG ratio (0 to 1), mCHH ratio (0 to 1), H3K9me2 (RPM; 0 to 1), TE annotation (blue), repeats (orange), gene model (purple). The black arrow indicates the orientation of coding sequence. Red bars indicate primer positions for qPCR in S13B Fig. OsIBM2 locus is shown as a validation of RNAi knock-down. (G) 3′ Rapid Amplification of cDNA Ends (RACE) of Os03g0332100 containing intronic heterochromatin. Upper panel: Structure of Os03g0332100 locus and polyadenylated mRNA variants detected by 3′ RACE. Exons and spliced introns confirmed by sequencing analysis are shown as black/red boxes and lines, respectively. Primer positions used for 3′ RACE are indicated by arrows. Lower panel: Gel picture of DNA fragments amplified by 3′ RACE. Two biological replicates for each genotype were examined.
DNA fragments indicated by arrowheads were cloned and sequenced for at least 8 clones, and the representative sequences supported with more than 3 clones are shown in the upper panel. NB: Nipponbare; osibm2: osibm2_g2#24; WT: wild type segregants of osibm2; (A)n: polyadenylation.

In Arabidopsis, the histone H3K9 demethylase gene, IBM1, contains heterochromatin in the 7th intron due to an insertion of organelle genome sequence, and Arabidopsis ibm2 reduces expression of IBM1, which results in genome-wide accumulation of H3K9me2 and non-CG methylation at genic regions [77, 78]. We therefore examined, by WGBS analysis of the CRISPR mutant osibm2_g2#24, whether OsIBM2 regulates non-CG methylation in the rice genome. We found that DNA methylation patterns in CG and non-CG contexts were nearly identical in genic as well as intergenic regions between osibm2 and WT (S15 Fig). In the rice genome, two IBM1 homologs, OsJMJ718 (MSU ID: Os09g22540; RAP ID: Os09g0393200) and OsJMJ719 (MSU ID: Os02g01940; RAP ID: Os02g0109400, Os02g0109501), have been identified (S16A and S16B Fig) [79]. We found that one of the IBM1 homologs, OsJMJ718, contains heterochromatin in the last intron (S16A Fig), although it was not identified as a commonly affected gene among OsIBM2 mutants (significant transcript changes were detected in RNAi_#2 and #16; q < 0.01). The less significant effects of the OsIBM2 mutation on OsJMJ718 expression and genome-wide non-CG methylation may be due to a partial loss of function of OsIBM2 in the mutants, or to functional redundancy of OsJMJ718 and OsJMJ719. Alternatively, the OsJMJ718 transcript may be more resistant to the effects of heterochromatin that is downstream of the jmjC domain-coding sequence, compared with the A. thaliana IBM1, which has the heterochromatic intron in the middle of the jmjC domain-coding sequence (S16C Fig).

Rapid evolution of heterochromatic intron

To understand how heterochromatin formation affects gene evolution in the rice genome, we further investigated the pattern of nucleotide substitutions in rice genes with heterochromatic introns. We first tested whether the degrees of selective constraints are similar between genes with and without heterochromatic introns in O. sativa. To this end, we compared the genome sequence of O. sativa with that of a close wild relative, O. meridionalis [80]. We predicted the orthologs in O. meridionalis and calculated the rate of nucleotide substitutions (Materials and Methods). Although our previous study of the A. thaliana genome did not find a significant difference [28], we found a relaxation of selective constraints in genes with heterochromatic introns in the rice genome (Fig 6A), where the ratio of nonsynonymous substitution rates to synonymous substitution rates (KA/KS) was 0.473 (n = 928), compared to 0.384 (n = 10,456) in genes without heterochromatic introns (P < 1.0e-5 by a permutation test). This indicates that heterochromatic introns would be deleterious for genes under high levels of selective constraint.

Fig 6. Patterns of nucleotide substitution rates between Oryza sativa and Oryza meridionalis.


(A) Frequency distributions of KA/KS values. Blue and red plots represent genes with and without heterochromatic introns, respectively. (B) Frequency distributions of KI values. Orange and light blue plots represent heterochromatic introns and heterochromatin-free introns, respectively.

We further investigated the pattern of nucleotide substitutions in introns (KI). Even though we excluded repeat sequences for the inter-species comparison, KI values of heterochromatic introns showed higher base substitution rates (0.0325; n = 627) than non-heterochromatic introns (0.0242; n = 35,354; P < 1.0e-5 by a permutation test) (Fig 6B). This indicates that heterochromatic introns have evolved more rapidly than heterochromatin-free introns, suggesting an acceleration of intronic sequence divergence associated with heterochromatin formation.
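The logic of the permutation tests used for the KA/KS and KI comparisons can be sketched as follows: group labels are shuffled many times, and the p-value is the fraction of shuffles in which the difference in group means is at least as large as the observed one. The function name, the one-sided form, the number of permutations, and the toy values are all assumptions for illustration; the procedure actually used is described in Materials and Methods.

```python
import random
from typing import Sequence


def permutation_pvalue(group_a: Sequence[float], group_b: Sequence[float],
                       n_perm: int = 10000, seed: int = 0) -> float:
    """One-sided permutation test for mean(group_a) > mean(group_b)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Toy substitution-rate values (hypothetical, not data from the study).
with_het = [0.47, 0.52, 0.61, 0.44, 0.58]
without_het = [0.31, 0.38, 0.35, 0.40, 0.33, 0.36]

print(permutation_pvalue(with_het, without_het))
```

With this add-one estimator the smallest reportable p-value is 1/(n_perm + 1), so a bound such as P < 1.0e-5 requires at least 100,000 permutations.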

Discussion

In this study, we revealed the genome-wide distribution of heterochromatic introns in the rice genome, which contains heterochromatic introns in approximately 11% of the genes. The underlying molecular mechanisms that allow the presence of repressive heterochromatin within actively transcribed regions are still unclear [37]. However, our study demonstrated that the conserved epigenetic factor OsIBM2 is critical for production of proper mRNA through heterochromatic introns in dozens of loci in the rice genome (Fig 5, S11D Fig). In addition, many genes without heterochromatic introns are also affected by OsIBM2 mutation in the leaf tissue (Fig 5), suggesting the profound impact of the loss of function of OsIBM2. Rice mutants of major epigenetic regulators, including OsMET1, OsCMT3, OsDRM and OsDDM1, have been shown to exhibit severe developmental defects such as embryonic/seedling lethality and sterility [14, 20, 81–83], while phenotypes of mutants of these genes in A. thaliana are relatively mild and plants are essentially viable [11, 84–87]. The difference likely stems from the genome structure of rice, where abundant TEs are distributed along gene-rich chromosome arms [51]. The close association of TEs with genes would make genes more susceptible to epigenetic changes in nearby TEs. In A. thaliana, ibm2 plants are still fertile [34], while rice osibm2 results in severe developmental defects and sterility (Fig 5, S11 Fig). This suggests that in the rice genome, in addition to the maintenance of heterochromatic states by DNA methyltransferases, transcriptional regulation of genes by OsIBM2 affects rice development and reproduction. For plant genomes harboring abundant intragenic heterochromatin, gene regulation mechanisms involving IBM2 would be more vital [29–31].

Insertion of TEs in intronic regions often results in repression of associated genes due to accumulation of repressive epigenetic marks. In rice, insertions of MITEs in an intron cause repression of the Elongated Uppermost Internode (EUI) gene, which is due to siRNA production from the intronic TEs [88]. In Arabidopsis and Capsella natural strains, insertions of TEs in the intron of the flowering repressor gene Flowering Locus C (FLC) downregulate its expression and induce early-flowering phenotypes [41, 59, 89]. Consistent with these reports, we found that rice genes with heterochromatic introns tend to show lower expression in the leaf tissue (Fig 4B). Alternatively, genes expressed at lower levels may tolerate those insertions [90]. On the other hand, our analysis of JA-responsive loci with insertion/deletion polymorphisms in heterochromatic introns suggested that responsiveness of genes to the hormone is largely unaffected by the presence/absence of heterochromatin in introns (S8 and S9 Figs). However, further comprehensive analyses are required to fully understand the impacts of intronic heterochromatin on gene regulation during environmental responses.

Longer first introns are a universal feature of eukaryotic gene structure [56]. The first intron sequence is more conserved than the later introns in animal genomes [91, 92]. In plants, enhancement of gene expression by intronic sequences, known as intron-mediated enhancement (IME), is associated with specific sequence motifs enriched in the first intron [93]. In rice, the first introns are required for the higher expression of tubulin genes [94, 95]. Intriguingly, intronic heterochromatin is significantly enriched in first and second introns, which is associated with the accumulation of TEs in these introns (Fig 2, S2C Fig). Many TEs are known to target the 5′ end of genes [7, 96, 97], while insertions into the exons in the 5′ ends of genes would be selected against, which may result in the accumulation of TEs in promoter-proximal introns. Insertion of TEs and formation of repressive chromatin may physically disrupt or override transcription enhancer functions of the promoter-proximal introns, which may contribute to lower expression of the associated genes (Fig 4B). Additionally, the inserted TEs may provide novel regulatory sequences such as transcription factor binding sites [6, 22], allowing genes to acquire tissue-specific or environment-responsive expression properties (Fig 4). The degree of selective constraints and tissue specificity are negatively correlated in Arabidopsis species [98]. Consistent with this, we observed that genes with heterochromatic introns tend to be expressed in a tissue-specific manner, and to show a lower degree of selective constraints than the other genes (Fig 6). We also observed that heterochromatic intron sequences show higher evolutionary rates (Fig 6), likely due to the higher mutation rates of methylated cytosine residues [99]. Thus, the formation of heterochromatin in intronic regions may contribute to the divergence of gene sequences.

The association of repetitive elements with genes is most prominent in disease-resistance gene (R gene) loci in plant genomes, which would accelerate gene diversification by enhancing recombination, and by shuffling and duplication of the sequences [6]. Indeed, R-genes are significantly overrepresented in genes with heterochromatic introns (119 out of 689 R-genes; 2.8% of 4,227 genes with heterochromatic introns; p = 1.66e-6, Fisher’s exact test). Also, our GO analysis showed that heterochromatic introns are enriched in genes involved in the cell death pathway, which is provoked during plant immune responses mediated by R-genes [100, 101]. Acquiring repressive chromatin by TE insertions within intronic regions may also contribute to reduced expression of R genes, which may be advantageous for the prevention of autoimmune responses in the absence of pathogens [102].

A recent study showed that wild rice genomes tend to accumulate TEs in genic regions, while cultivated rice genomes show depletion of TEs from genic regions including introns [103]. This has likely occurred independently in the genomes of several cultivars [103]. This convergent loss of genic TE sequences in cultivar genomes may be a result of selective pressure against long heterochromatic TEs in the genic regions during domestication and selection (Fig 3). Alternatively, under uniform growing conditions in a nutrition-rich environment, inbreeding cultivar genomes may have gradually lost environment-responsive regulatory elements associated with genic TEs. In contrast, longer introns with TE insertions in wild rice genomes may be adaptive for dynamic transcription changes in the fluctuating natural environment. Indeed, recent studies in budding yeast demonstrated that the presence of introns promotes survival under starvation conditions, while the introns are dispensable in a nutrient-rich environment [104, 105]. Intron sequences in plant genomes may have more profound impacts on genome evolution and plant adaptation than previously thought.

Methods

Rice genome annotations

Locus, transcript, and repeat annotations of the Oryza sativa genome (version IRGSP v1.0; IRGSP-1.0_representative_2015-03-31_2) were retrieved from RAP-DB (http://rapdb.dna.affrc.go.jp/) [106]. We identified TEs in the Japonica rice genome using RepeatMasker (ver. 4.0.5; http://www.repeatmasker.org). Repbase library (ver. 20140131) [107] was downloaded and used as a repeat library. We ran RepeatMasker with the default parameters and screened putative TE segments. We first excluded non-TE repeats such as simple repeats, rRNAs and satellite DNAs. We then filtered out hits meeting any of the following criteria: 1) the hit region covers <70% of the total length of the repeat in the library, 2) the hit region is <100 bp long, 3) nucleotide divergence between the hit region and the repeat in the library is >20%. The list of TEs is in S3 Table. MITE annotation was retrieved from the P-MITE database [61], and used for a BLASTN [108] search of the IRGSP genome with a cutoff e-value of 1e-40. MITE sequences with identical lengths to query sequences having no mismatch and no gap (153,751 sequences) were used for further analysis. ChIP-seq data for H3K9me2 were obtained from [109]. BS-seq data for osmet1 and osddm1 were obtained from [12] and [21], respectively. Rice seed core collections (World Rice Core Collection; WRC) were obtained from Genebank Project, National Agriculture and Food Research Organization (NARO; https://www.gene.affrc.go.jp/databases-core_collections_wr.php).
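The three filtering criteria applied to RepeatMasker hits can be sketched as a simple predicate. This is an illustrative sketch only: the `RepeatHit` record, its field names, and the class-label prefixes are hypothetical and are not taken from the authors' pipeline, and parsing of RepeatMasker output is not shown.

```python
from dataclasses import dataclass

@dataclass
class RepeatHit:
    # Hypothetical record for one RepeatMasker hit (field names are assumptions).
    repeat_class: str      # e.g. "LTR/Gypsy" or "Simple_repeat"
    hit_length: int        # length of the matched genomic region (bp)
    library_length: int    # full length of the repeat in the library (bp)
    divergence_pct: float  # nucleotide divergence from the library repeat (%)

# Assumed class-label prefixes for the non-TE repeats excluded first.
NON_TE_PREFIXES = ("Simple_repeat", "rRNA", "Satellite")

def is_retained_te(hit, min_cov=0.70, min_len=100, max_div=20.0):
    """Return True if the hit survives all filters described in the text."""
    if hit.repeat_class.startswith(NON_TE_PREFIXES):
        return False
    coverage = hit.hit_length / hit.library_length
    return (coverage >= min_cov
            and hit.hit_length >= min_len
            and hit.divergence_pct <= max_div)

def filter_hits(hits):
    """Keep only hits that pass the coverage, length, and divergence filters."""
    return [h for h in hits if is_retained_te(h)]
```

Here coverage is taken as hit length divided by library repeat length, which is one reading of criterion 1.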

Rice transgenic lines

All rice plants used in this study were grown in growth chambers under short-day conditions (10 hours light/14 hours dark cycles) at 30°C during daytime and 25°C during the night. For RNAi knock-down of the OsIBM2 mRNA, about 500 bp of the cDNA sequence of OsIBM2 was cloned into the pANDA vector [110]. A partial GFP sequence was used as a control RNAi vector. Wild-type Nipponbare calli were transformed with the RNAi vector at InPlanta Innovations (Yokohama, Japan) or in our laboratory, and more than 15 independent T1 transformants for each vector were obtained. For CRISPR-Cas9 knock-out of OsIBM2, two guide RNAs (S5 Table) were designed and cloned into pHUE411 (Addgene #62203) by GoldenGate Mix (NEB), and transformed into rice calli with a standard Agrobacterium transformation method. Gene targeting events were detected by digestion with HpaII (gRNA1) or HaeIII (gRNA2), and were confirmed by Sanger sequencing. For osibm2_g2#24, the absence of the pHUE411 vector and fixation of the mutation (Fig 5C) were confirmed at T3. Segregating wild type (WT) and homozygous T4 plants were used for further analyses.

All oligonucleotides used in this study are listed in S5 Table.

Bisulfite sequencing and data analysis

For Whole Genome Bisulfite-Sequencing (WGBS) analyses, we used genomic DNA of Nipponbare, osibm2_g2#24 (T4), and wild-type segregants of osibm2_g2#24 (T4) isolated from three-month-old mature leaf tissues with Nucleon PhytoPure (GE). Illumina sequencing libraries (125 bp paired-end for Nipponbare, 150 bp paired-end for osibm2_g2#24 and wild type) were constructed using the PBAT method [111] and sequenced at OIST Sequencing Center (SQC). Raw reads were trimmed by Trimmomatic [112] with parameters; HEADCROP:10 SLIDINGWINDOW:4:20 MINLEN:50. Remaining paired reads were mapped to rice genome IRGSP v1.0 with Bismark (v0.19.0) [113] with parameters; -N 1 --pbat -ambiguous -R 10 -un --score_min L,0,-0.6. Unmapped reads together with dropped single-end reads from trimming were further mapped to the rice genome as single-end reads with parameters; -N 1 --pbat -ambiguous -R 10 --score_min L,0,-0.6 for R1, and -N 1 -ambiguous -R 10 -un --score_min L,0,-0.6 for R2. Methylation reports from paired and single reads were merged with bedtools [114]. Only uniquely mapped reads were used for further analysis, and Cs covered by fewer than 3 reads, as well as Cs covered by more than 100 reads (unnaturally high coverage; top ~0.01% of covered Cs), were excluded. Methylcytosines were identified by binomial test [115], with the bisulfite conversion rate estimated by mapping sequencing reads to the rice chloroplast genome. Methylation levels were calculated using the ratio of #C/(#C + #T) as described in [116]. A domain containing ≥ 5 consecutive mCHG sites with an average methylation level ≥ 0.5 was considered a heterochromatic domain. Boxplots, sequence density, and metaplots for DNA methylation were generated with deeptools [117], Microsoft Excel, and R. A summary of WGBS analyses is shown in S6 Table.
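The per-cytosine steps described above (coverage filter, methylation level, binomial test, and heterochromatic domain calling) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: all function names are hypothetical, the binomial test is written as a one-sided upper tail against a non-conversion error rate, and "consecutive" is interpreted as a run of methylated CHG sites not interrupted by an unmethylated CHG site.

```python
from math import comb

def passes_coverage(n_c, n_t, min_cov=3, max_cov=100):
    """Keep cytosines covered by at least 3 and at most 100 reads."""
    return min_cov <= n_c + n_t <= max_cov

def methylation_level(n_c, n_t):
    """Methylation level as #C / (#C + #T)."""
    return n_c / (n_c + n_t)

def binomial_pvalue(n_c, n_t, error_rate):
    """P(X >= n_c) for X ~ Binomial(n_c + n_t, error_rate).

    Assumption: error_rate is the non-conversion rate estimated
    from reads mapped to the chloroplast genome.
    """
    n = n_c + n_t
    return sum(comb(n, k) * error_rate**k * (1 - error_rate)**(n - k)
               for k in range(n_c, n + 1))

def heterochromatic_domains(chg_sites, min_sites=5, min_mean=0.5):
    """Call domains from CHG sites sorted by position.

    chg_sites: list of (position, methylation_level, is_methylated).
    Returns (start, end) for each run of >= min_sites consecutive
    methylated CHG sites whose mean methylation level is >= min_mean.
    """
    domains, run = [], []
    # A trailing unmethylated sentinel flushes the final run.
    for site in list(chg_sites) + [(None, 0.0, False)]:
        if site[2]:
            run.append(site)
            continue
        if len(run) >= min_sites and sum(s[1] for s in run) / len(run) >= min_mean:
            domains.append((run[0][0], run[-1][0]))
        run = []
    return domains
```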

GO analysis

GO analysis for enrichment was performed using the AgriGO website [118] and significant terms were extracted by Fisher’s exact test with Hochberg adjustment (FDR<0.05). GO term depletion analysis was performed with TopGO (https://rdrr.io/bioc/topGO/) using Fisher’s exact test. Protein classes were determined using the Panther database [119].
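For readers unfamiliar with the enrichment statistic, a one-sided Fisher's exact test for a single GO term reduces to the upper tail of a hypergeometric distribution. The sketch below is illustrative only and does not reproduce AgriGO or TopGO; the function and parameter names are assumptions.

```python
from math import comb

def fisher_enrichment_p(k, n, K, N):
    """One-sided enrichment p-value P(X >= k).

    N: genes in the background set
    K: background genes annotated with the GO term
    n: genes in the query list (e.g. genes with heterochromatic introns)
    k: query genes annotated with the GO term
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum hypergeometric probabilities for k, k+1, ..., min(n, K) annotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total
```

The resulting p-values, one per GO term, would then be adjusted for multiple testing before applying the FDR < 0.05 threshold.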

Expression data analysis

Microarray data and RNA-seq data were retrieved from the RiceXpro database [62] and TENOR [63]. For gene expression of developmental stages, gene expression profiles of 48 rice developmental stages/tissues were used for calculation of the entropy value of each gene. For gene expression profiles of stress/hormone treatment conditions, gene expression data at the following time points were used for calculation of the entropy value of each gene: Jasmonic Acid, ABA, Cold, Drought treatments; 0, 1, 3, 6, 12, 24 hours, Flood treatment; 0, 1, 3, 6, 12, 24, 72 hours, Osmotic stress; 0, 1, 3, 6, 12 hours, High/Low phosphate treatments; 0, 1, 5, 10 days. Entropy (modified H) was calculated with the ROKU function in the TCC package in R [120].
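As a rough illustration of how entropy captures expression specificity, the sketch below computes plain Shannon entropy over an expression profile. It is not the ROKU implementation: ROKU computes a modified entropy after additional preprocessing of the expression vector, which is not reproduced here, and the function name is hypothetical.

```python
from math import log2

def expression_entropy(values):
    """Shannon entropy (bits) of a non-negative expression profile.

    Low entropy means expression is concentrated in few conditions
    (specific); high entropy means expression is spread evenly.
    """
    if any(v < 0 for v in values):
        raise ValueError("expression values must be non-negative")
    total = sum(values)
    if total <= 0:
        raise ValueError("profile must contain some expression")
    # Zero entries contribute nothing to the sum (0 * log 0 is taken as 0).
    probs = [v / total for v in values if v > 0]
    return -sum(p * log2(p) for p in probs)
```

For a profile over 48 tissues, the value ranges from 0 (expressed in a single tissue) to log2(48) (uniform expression).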

Genome sequencing data analysis for indel identification

Genome resequencing data of the KASALATH genome were retrieved from [65] and mapped to IRGSP-1.0 using bowtie2-2.2.2 [121]. Candidate loci for intronic deletions in the KASALATH genome were searched based on the INDEL data retrieved from [122]. The presence of deletions was confirmed by PCR.

Jasmonic Acid treatment and gene expression analysis

The rice strains, Nipponbare and KASALATH (WRC 2), were germinated on plates. After 6 days, seedlings were transferred to 15 mL plastic tubes and grown hydroponically in 1/10 Murashige-Skoog (MS) media for 4 days in a growth chamber as described above. Plants were transferred to 1/10 MS media containing 100 μM (final concentration) Jasmonic Acid (JA, SIGMA) and 0.02% DMSO, or 0.02% DMSO alone as a mock treatment. After 6 hours of treatment, total RNA was extracted from the roots using the Maxwell 16 LEV Plant RNA kit (Promega), and Quantitative RT-PCR (qRT-PCR) was performed for analysis of gene expression.

RNA-seq analysis

For RNA-seq analysis, total RNA from the leaf tissues was isolated with the Maxwell 16 LEV Plant RNA kit (Promega). Two biological replicates for Nipponbare (NB), GFP-RNAi control lines (T2), RNAi #2 lines (T2), osibm2_g2#24 lines (T4), and wild-type segregants of osibm2_g2#24 (WT; T4) were prepared. An additional NB line was used for comparison with the single RNAi #16 (T2) line. Illumina RNA-Seq libraries (150 bp paired-end) were prepared and sequenced at the OIST Sequencing Center. Raw reads were trimmed with Trimmomatic with the following parameters; HEADCROP:10 LEADING:15 TRAILING:15 SLIDINGWINDOW:10:15 MINLEN:25. Remaining paired reads were mapped to rice genome IRGSP v1.0 with Hisat2 [123] with parameters; --min-intronlen 20 --max-intronlen 20000. Exon and splicing junction information was specified by the annotation retrieved from RAP-DB to prepare a genome index for Hisat2. A summary of RNA-seq analysis is shown in S7 Table. Reads mapped to rDNA and tRNA were removed with bedtools. For visualization of RNA-seq read tracks, read duplication was removed with samtools [124], and Reads per million (RPM) for 1 bp bin was calculated with deeptools. The read tracks were visualized in Integrated Genome Browser [125]. For estimation of expression level, reads mapped on transcript annotations were counted with the featureCounts function in Rsubread [126] with parameters; allowMultiOverlap = TRUE, minOverlap = 1, fracOverlap = 0, countMultiMappingReads = FALSE, and used for Transcript Per Million (TPM) calculations for each gene model (Fig 4B). Expression changes in RNAi (RNAi_#2 and _#16) and CRISPR knock-out lines (osibm2_g2#24) were analyzed based on the methods in [34]. Transcripts mapped to pre- and post-intron regions (n = 126,068) in each gene model were counted by featureCounts. Ratios of read counts (mapped reads in pre-intron: mapped reads in post-intron) of two biological replicates of each genotype (RNAi_#2 lines vs RNAi_GFP lines, osibm2_g2#24 lines (T4) vs wild-type segregants of osibm2_g2#24 (WT; T4)) were tested to detect changes in the expression pattern, by employing logistic regression analysis with p-value correction by the Benjamini-Hochberg (BH) method for multiple testing. Changes in gene expression between RNAi_#16 (one replicate) and control NB were detected by binomial test with p-value correction as above. Data sets with q ≤ 0.01 were considered as significantly changed in downstream transcription (both up- and down-regulated loci in 3′ region). A relative 5′/3′ ratio of transcripts mapped to up- and down-stream of introns was calculated as described previously [34]. Differential expression analysis of TEs was performed by DESeq2 [127] using mapped read data by Hisat2 (osibm2_g2#24 lines (T4) vs wild-type segregants of osibm2_g2#24 (WT; T4)).
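Two of the calculations named above, TPM normalization and Benjamini-Hochberg correction, can be sketched in a few lines. This is a generic illustration under stated assumptions rather than the authors' scripts: function names are hypothetical, and transcript lengths are assumed to be given in base pairs.

```python
def tpm(counts, lengths_bp):
    """Transcripts Per Million from read counts and transcript lengths."""
    # Length-normalize first, then scale so the values sum to one million.
    rates = [c / l for c, l in zip(counts, lengths_bp)]
    scale = sum(rates)
    if scale == 0:
        return [0.0] * len(rates)
    return [r / scale * 1e6 for r in rates]

def benjamini_hochberg(pvalues):
    """Return BH-adjusted q-values in the same order as the input p-values."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues
```

In this scheme, an intron would be reported as showing altered downstream transcription when its q-value is at most 0.01.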

Quantitative RT-PCR (qRT-PCR) and 3′ RACE were performed as described in [34].

Nucleotide substitution analysis

To reveal patterns of nucleotide substitutions in genes with heterochromatic introns, we compared nucleotide sequences of O. sativa and O. meridionalis [80]. Putative orthologs were identified using GenomeThreader [128] with mRNAs of O. sativa to find orthologs in O. meridionalis, with the following parameters; -minmatchlen 18 -seedlength 16 -exdrop 2. When multiple orthologs were detected for an mRNA, it was discarded. If no ortholog was detected, we incremented the parameter -exdrop by one. This process was repeated until a single ortholog was detected or until the parameter -exdrop exceeded 5. We further screened orthologs in which the exon-intron structures were conserved between the orthologs in 80% of their nucleotide sequences after alignments with CLUSTALW2 [129]. Nonsynonymous and synonymous nucleotide substitution rates (KA and KS, respectively) were calculated using the Nei and Gojobori method [130]. We discarded genes with KS > 0.1. We also calculated nucleotide substitution rates in introns as p-distance.
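The p-distance used for introns is the proportion of aligned sites that differ between the two species. A minimal sketch is given below; the function name is hypothetical, and excluding alignment columns containing a gap or an ambiguous N is an assumption about the treatment of such sites.

```python
def p_distance(seq_a, seq_b, skip_chars="-N"):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap or N are skipped
    (assumed handling; the paper does not state it).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip_chars or b in skip_chars:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    return differences / compared
```

Averaged over introns, values of this kind correspond to the KI figures reported in the Results (for example 0.0325 versus 0.0242).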

Supporting information

S1 Fig. Heterochromatic introns in Arabidopsis thaliana and rice genomes.

(A) Arabidopsis thaliana genes (TAIR10) containing intron with heterochromatic domains. (B) Heatmap showing accumulation of H3K9 di-methylation on genome features in the rice genome. Data from [109] were used for the analysis.

(PDF)

S2 Fig. Length of introns in Arabidopsis thaliana and rice genomes.

(A) A comparison of intron length between Arabidopsis thaliana (n = 127,836; average 169.0 bp) and Oryza sativa (n = 126,068; average 446.9 bp). (B) Fraction of repetitive elements in intronic regions of the rice genome. (C) Enrichment of heterochromatin and TEs in promoter-proximal introns. Fractions of all introns (n = 151,045), heterochromatic introns (n = 6,086), and TE-containing introns (n = 1,982) are shown in the relative positions. Identical intronic regions annotated in different positions in different splicing variants were independently counted.

(PDF)

S3 Fig. TE families in rice introns.

(A) Fraction of TE families in the intronic regions of the Oryza sativa genome. (B) Orientation of intronic TE insertion against gene annotations in each TE family. No significant orientation bias was observed in the TE families (p > 0.01; two-sided binomial test). (C) Metaplots of DNA methylation in CG, CHG and CHH contexts for heterochromatic introns with TEs and repeats (n = 4,886), heterochromatic introns without repeats (n = 923), and non-heterochromatic introns (n = 145,235). (D) Heatmap of methylation profiles of intronic TEs in wild-type O. sativa and mutants of OsMET1 (met1) and of OsDDM1 (ddm1) in CG, CHG, and CHH contexts.

(PDF)

S4 Fig. DNA methylation of rice intergenic and intronic TEs.

Histograms of the number of representative intergenic and intronic TE families (>20 copies in each category) and their methylation levels (0 to 1) in CG, CHG, and CHH contexts. TEs with methylation data at ≥ 5 Cs were analyzed.

(PDF)

S5 Fig. Length and DNA methylation of intronic TEs.

Boxplots showing length of representative intergenic and intronic TE families (>10 copies in each category) and their methylation levels in CG (high; mCG ≥ 0.9, low; mCG < 0.9), CHG (high; mCHG ≥ 0.2, low; mCHG < 0.2), and CHH (high; mCHH ≥ 0.1, low; mCHH < 0.1). * p < 0.05, ** p < 0.01, *** p < 0.001, Wilcoxon exact test. N.S.: no significance, p ≥ 0.05. TEs with methylation data at ≥ 5 Cs were analyzed.

(PDF)

S6 Fig. DNA methylation of MITEs in rice introns.

(A) Histograms of the number of representative intergenic and intronic MITEs (data retrieved from the P-MITE database [61]) and their methylation levels (0 to 1) in CG, CHG, and CHH contexts. TEs with methylation data at ≥ 5 Cs were used in the analysis. (B) Density plots showing length (log10) and methylation levels (0 to 1) of intergenic and intronic MITEs in CG, CHG, and CHH contexts.

(PDF)

S7 Fig. Protein classes and expression changes of genes containing heterochromatic introns.

(A) Protein classes defined by the Panther database [119]. 1,407 of 4,227 genes containing heterochromatic introns matching the database are indicated. (B) Gene Ontology depletion for genes containing heterochromatic introns. P-values were obtained by Fisher's exact test, and terms with FDR < 0.05 are indicated. (C) Expression changes of all genes and genes with or without heterochromatic introns by various stress treatments. Specificity of the responses to given treatments was measured as entropy values. P-values from Wilcoxon exact test are indicated. Effect size (r) in each analysis: Low phosphate; 0.024, High phosphate; 0.020, Drought; 0.007, Osmotic stress; 0.009.

(PDF)

S8 Fig. JA response of genes in Nipponbare (NB) and KASALATH (KAS) with structural variations in heterochromatic intron.

(A) Heatmap showing expression levels of the indicated genes after Jasmonic Acid (JA) treatment in the Nipponbare root. Expression data were obtained from TENOR [63]. (B) Quantitative RT-PCR (qRT-PCR) analysis of genes before (pre-treatment), and after JA (JA treatment). OsAOS2 was included as a control for JA-dependent induction of expression. Relative expression levels in each sample were normalized by UBQ1 expression levels, and the average of expression values in pre-treatment NB samples was set as 1, and plotted as dots (n = 6) with blue (NB) and yellow (KAS). The large dots and bars represent means of 6 biological replicates ± standard deviation (S. D.). P-values were obtained by t-test.

(PDF)

S9 Fig. Structural variations of heterochromatic introns in Nipponbare and KASALATH strains.

Insertion/deletion polymorphisms in Nipponbare and KASALATH. Tracks; Top to bottom: mCG ratio (0 to 1), mCHG ratio (0 to 1), mCHH ratio (0 to 1), genome-resequencing data coverage (0 to 30) [65], repeats (orange), TE annotation (blue), gene model (purple). Structural variations detected by PCR are indicated under the tracks as gel pictures. Red arrows indicate the primer positions used for PCR amplifications shown in the gel panel. The region used for qRT-PCR is indicated as a red bar.

(PDF)

S10 Fig. Amino acid alignment of homologs of OsIBM2.

Amino acid alignment of homologs of OsIBM2 in plants based on [34]. Bromo-Adjacent Homology (BAH) domain and RNA-Recognition Motif (RRM) are framed with a blue line. Arrows indicate regions designed for guide RNAs used for CRISPR-Cas9 mediated deletion. At; Arabidopsis thaliana: Zm; Zea mays: Os; Oryza sativa: Sb; Sorghum bicolor: Pt; Populus trichocarpa: Rc; Ricinus communis.

(PDF)

S11 Fig. Developmental phenotypes of osibm2 mutants.

(A) Whole plant picture of three-month-old Nipponbare (left), RNAi_#2 line (middle) and RNAi_GFP control line (right). (B) Close-up pictures of seeds set in Nipponbare and RNAi lines (T1). (C) A close-up picture of seeds set in osibm2_g2#24 and their segregating wild-type siblings (WT; T4). White bar: 1 cm. (D) RT-PCR analysis of gene expression in endosperm and embryo of Nipponbare and osibm2. RNAs from ~10 DAF (Days After Fertilization) developing endosperm and embryo of osibm2_g2#24 (T2) were used for the analysis.

(PDF)

S12 Fig. Expression changes in genes containing heterochromatic introns in osibm2.

(A) (Top) DNA methylation levels of differentially expressed genes (DEGs) with heterochromatic introns (n = 93), DEGs without heterochromatic introns (n = 361), and non-DEGs (n = 20,293) in Nipponbare background. (Middle) DNA methylation difference in osibm2 (osibm2_g2#24) and wild type at loci as above. (Bottom) H3K9 methylation levels at loci as above. (B) 5′/3′ ratio of transcripts mapped to up- and down-stream of introns relative to wild type. RNA-seq data from osibm2_g2#24 and WT (wild-type segregants of osibm2) were used. In each locus, the 5′/3′ ratio of a representative transcript variant with TPM > 1 was used for calculation. Bars represent the means of DEGs with heterochromatic introns (n = 68), DEGs without heterochromatic introns (n = 335), and 300 randomly selected non-DEG loci ± S.E.M. P-values were obtained by Tukey-Kramer test.

(PDF)

S13 Fig. Expression changes of genes in osibm2.

(A) Representative rice genome loci showing altered expression patterns in mutants of OsIBM2. Tracks; Top to bottom: RNA-seq (Reads per Million are indicated at top left), mCG ratio (0 to 1), mCHG ratio (0 to 1), mCHH ratio (0 to 1), H3K9me2 (RPM; 0 to 1), TE annotation (blue), repeats (orange), gene model (purple). The black arrow indicates the orientation of the coding sequence. (B) Quantitative RT-PCR (qRT-PCR) analysis of expression of genes containing heterochromatic introns in osibm2_g2#24 (osibm2) and WT (wild-type segregants of osibm2). Primer positions are indicated in Fig 5F and S13A Fig as red bars. Expression levels in each sample were normalized by UBQ1 expression levels, and the average of OsIBM2/UBQ1 in WT was set as 1. Bars represent the means of three biological replicates ± S.D. (n = 3).

(PDF)

S14 Fig. 3′ Rapid Amplification of cDNA Ends (RACE) of genes containing heterochromatin in mutants of OsIBM2.

(A) 3′ RACE of Os01g0650200. Upper panel: Structure of the Os01g0650200 locus and polyadenylated mRNA variants detected by 3′ RACE. Exons and spliced introns confirmed by sequencing analysis are shown as black/red boxes and lines, respectively. Primer positions used for 3′ RACE are indicated by arrowheads. Lower panel: Gel picture of DNA fragments amplified by 3′ RACE. Two biological replicates for each genotype were examined. DNA fragments indicated by arrowheads were cloned, at least 8 clones were sequenced for each, and the representative sequences supported by more than 3 clones are shown in the upper panel. The black arrow indicates the orientation of the coding sequence. NB: Nipponbare; osibm2: osibm2_g2#24; WT: wild-type segregants of osibm2; (A)n: polyadenylation. (B) 3′ RACE of Os06g0360600 as in (A). (C) 3′ RACE of Os08g0567200 as in (A). (D) The number of TEs showing expression changes in osibm2_g2#24 (osibm2). 22 LTR TEs and 1 DNA/En-Spm TE showed significant changes (q < 0.05), comprising up-regulation (12 TEs) and down-regulation (11 TEs). (E) Rice genome loci showing altered expression patterns of intronic TEs in mutants of OsIBM2. Tracks; Top to bottom: RNA-seq (Reads per Million are indicated at top left), mCG ratio (0 to 1), mCHG ratio (0 to 1), mCHH ratio (0 to 1), H3K9me2 (RPM; 0 to 1), TE annotation (blue), repeats (orange), gene model (purple). The black arrow indicates the orientation of the coding sequence.

(PDF)

S15 Fig. DNA methylation in osibm2.

(A) Genome-wide DNA methylation in osibm2_g2#24 (osibm2, T4) and their wild type segregating siblings (WT, T4) in CG, CHG and CHH contexts for each chromosome. Average methylation levels in 1 MB bins were plotted. (B) Metaplots of DNA methylation in osibm2_g2#24 (osibm2) and their wild-type segregating siblings (WT) in CG, CHG and CHH contexts for indicated genome features.

(PDF)

S16 Fig. Rice homologs of the Arabidopsis H3K9 demethylase IBM1.

Genome loci for OsJMJ718 (Os09g0393200) (A) and OsJMJ719 (Os02g0109400, Os02g0109501) (B). RNA-seq, DNA methylation and H3K9me2 tracks are shown as in S13 Fig. (C) An alignment of amino acid sequences of A. thaliana IBM1 (At_IBM1) and OsJMJ718. The amino acid sequence of the N-terminal part of OsJMJ718 is predicted based on RNA-seq reads in this study. The alignment was generated by CLUSTAL W [131]. Jumonji-C (JmjC) domains predicted by SMART [132] are circled with blue lines. Positions of heterochromatic introns are indicated by red arrowheads.

(PDF)

S1 Table. Genes containing heterochromatic introns.

(XLSX)

S2 Table. Chromosomal positions of heterochromatic introns.

(XLSX)

S3 Table. Transposon annotation used in this study.

(XLSX)

S4 Table. Genes showing expression changes in osibm2 mutants.

(XLSX)

S5 Table. Primers used in the study.

(XLSX)

S6 Table. A summary table for Whole Genome Bisulfite Sequencing (WGBS) analysis.

(XLSX)

S7 Table. A summary table for RNA-seq analysis.

(XLSX)

S1 Data. Numerical data used to generate Figures.

(XLSX)

Acknowledgments

We thank the Genebank project, NARO, for rice seed collection. We thank OIST SQC for BS-seq and RNA-seq analysis, and Drs. Yoshiki Habu, and Reina Komiya for critical reading of the manuscript. We also thank Dr. Steven D. Aird for editing the manuscript.

Data Availability

All the sequence data reported in this study have been deposited in the DDBJ Sequence Read Archive under accession ID DRA008322. All other data are within the manuscript and its Supporting Information files.

Funding Statement

This work was supported by MEXT Grant-in-Aid for Scientific Research on Innovative Area (http://www.mext.go.jp/a_menu/shinkou/hojyo/1218181.htm) Grant Number 19H05272 to HS, and also supported by Okinawa Institute of Science and Technology Graduate University (https://www.oist.jp) to HS. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Kazazian HH Jr. Mobile elements: drivers of genome evolution. Science. 2004;303(5664):1626–32. Epub 2004/03/16. doi:10.1126/science.1089670.
2. Bennetzen JL, Wang H. The contributions of transposable elements to the structure, function, and evolution of plant genomes. Annu Rev Plant Biol. 2014;65:505–30. Epub 2014/03/04. doi:10.1146/annurev-arplant-050213-035811.
3. Tenaillon MI, Hollister JD, Gaut BS. A triptych of the evolution of plant transposable elements. Trends Plant Sci. 2010;15(8):471–8. Epub 2010/06/15. doi:10.1016/j.tplants.2010.05.003.
4. Hollister JD, Gaut BS. Epigenetic silencing of transposable elements: a trade-off between reduced transposition and deleterious effects on neighboring gene expression. Genome Res. 2009;19(8):1419–28. Epub 2009/05/30. doi:10.1101/gr.091678.109.
5. Makarevitch I, Waters AJ, West PT, Stitzer M, Hirsch CN, Ross-Ibarra J, et al. Transposable elements contribute to activation of maize genes in response to abiotic stress. PLoS Genet. 2015;11(1):e1004915. Epub 2015/01/09. doi:10.1371/journal.pgen.1004915.
6. Galindo-Gonzalez L, Mhiri C, Deyholos MK, Grandbastien MA. LTR-retrotransposons in plants: Engines of evolution. Gene. 2017;626:14–25. Epub 2017/05/10. doi:10.1016/j.gene.2017.04.051.
7. Naito K, Zhang F, Tsukiyama T, Saito H, Hancock CN, Richardson AO, et al. Unexpected consequences of a sudden and massive transposon amplification on rice gene expression. Nature. 2009;461(7267):1130–4. Epub 2009/10/23. doi:10.1038/nature08479.
8. Lisch D. How important are transposons for plant evolution? Nat Rev Genet. 2013;14(1):49–61. Epub 2012/12/19. doi:10.1038/nrg3374.
9. Quadrana L, Colot V. Plant Transgenerational Epigenetics. Annu Rev Genet. 2016;50:467–91. Epub 2016/10/13. doi:10.1146/annurev-genet-120215-035254.
10. Slotkin RK, Martienssen R. Transposable elements and the epigenetic regulation of the genome. Nat Rev Genet. 2007;8(4):272–85. Epub 2007/03/17. doi:10.1038/nrg2072.
11. Saze H, Mittelsten Scheid O, Paszkowski J. Maintenance of CpG methylation is essential for epigenetic inheritance during plant gametogenesis. Nat Genet. 2003;34(1):65–9. Epub 2003/04/02. doi:10.1038/ng1138.
12. Hu LJ, Li N, Xu CM, Zhong SL, Lin XY, Yang JJ, et al. Mutation of a major CG methylase in rice causes genome-wide hypomethylation, dysregulated genome expression, and seedling lethality. Proc Natl Acad Sci USA. 2014;111(29):10642–7. doi:10.1073/pnas.1410761111. WOS:000339310700060.
13. Kankel MW, Ramsey DE, Stokes TL, Flowers SK, Haag JR, Jeddeloh JA, et al. Arabidopsis MET1 cytosine methyltransferase mutants. Genetics. 2003;163(3):1109–22. WOS:000182046900023.
14. Yamauchi T, Johzuka-Hisatomi Y, Terada R, Nakamura I, Iida S. The MET1b gene encoding a maintenance DNA methyltransferase is indispensable for normal development in rice. Plant Mol Biol. 2014;85(3):219–32. doi:10.1007/s11103-014-0178-9. WOS:000336030800002.
15. Matzke MA, Mosher RA. RNA-directed DNA methylation: an epigenetic pathway of increasing complexity. Nat Rev Genet. 2014;15(6):394–408. Epub 2014/05/09. doi:10.1038/nrg3683.
16. Law JA, Jacobsen SE. Establishing, maintaining and modifying DNA methylation patterns in plants and animals. Nat Rev Genet. 2010;11(3):204–20. Epub 2010/02/10. doi:10.1038/nrg2719.
17. Wendte JM, Schmitz RJ. Specifications of Targeting Heterochromatin Modifications in Plants. Mol Plant. 2018;11(3):381–7. doi:10.1016/j.molp.2017.10.002. WOS:000426964100005.
18. Martienssen R, Moazed D. RNAi and heterochromatin assembly. Cold Spring Harb Perspect Biol. 2015;7(8):a019323. Epub 2015/08/05. doi:10.1101/cshperspect.a019323.
19. Zemach A, Kim MY, Hsieh PH, Coleman-Derr D, Eshed-Williams L, Thao K, et al. The Arabidopsis nucleosome remodeler DDM1 allows DNA methyltransferases to access H1-containing heterochromatin. Cell. 2013;153(1):193–205. Epub 2013/04/02. doi:10.1016/j.cell.2013.02.033.
20. Tan F, Zhou C, Zhou QW, Zhou SL, Yang WJ, Zhao Y, et al. Analysis of Chromatin Regulators Reveals Specific Features of Rice DNA Methylation Pathways. Plant Physiol. 2016;171(3):2041–54. doi:10.1104/pp.16.00393. WOS:000381303300043.
21. Numa H, Yamaguchi K, Shigenobu S, Habu Y. Gene Body CG and CHG Methylation and Suppression of Centromeric CHH Methylation are Mediated by DECREASE IN DNA METHYLATION1 in Rice. Mol Plant. 2015;8(10):1560–2. Epub 2015/08/19. doi:10.1016/j.molp.2015.08.002.
22. Hirsch CD, Springer NM. Transposable element influences on gene expression in plants. Biochim Biophys Acta. 2017;1860(1):157–65. Epub 2016/05/29. doi:10.1016/j.bbagrm.2016.05.010.
23. Mirouze M, Vitte C. Transposable elements, a treasure trove to decipher epigenetic variation: insights from Arabidopsis and crop epigenomes. J Exp Bot. 2014;65(10):2801–12. doi:10.1093/jxb/eru120. WOS:000338005600021.
24. Henderson IR, Jacobsen SE. Tandem repeats upstream of the Arabidopsis endogene SDC recruit non-CG DNA methylation and initiate siRNA spreading. Gene Dev. 2008;22(12):1597–606. doi:10.1101/gad.1667808. WOS:000256797300006.
25. Soppe WJJ, Jacobsen SE, Alonso-Blanco C, Jackson JP, Kakutani T, Koornneef M, et al. The late flowering phenotype of fwa mutants is caused by gain-of-function epigenetic alleles of a homeodomain gene. Mol Cell. 2000;6(4):791–802. doi:10.1016/s1097-2765(05)00090-0. WOS:000090136700004.
26. Manning K, Tor M, Poole M, Hong Y, Thompson AJ, King GJ, et al. A naturally occurring epigenetic mutation in a gene encoding an SBP-box transcription factor inhibits tomato fruit ripening. Nat Genet. 2006;38(8):948–52. doi:10.1038/ng1841. WOS:000239325700027.
  • 27.Gehring M, Bubb KL, Henikoff S. Extensive demethylation of repetitive elements during seed development underlies gene imprinting. Science. 2009;324(5933):1447–51. Epub 2009/06/13. 10.1126/science.1171609 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Le TN, Miyazaki Y, Takuno S, Saze H. Epigenetic regulation of intragenic transposable elements impacts gene transcription in Arabidopsis thaliana. Nucleic Acids Res. 2015;43(8):3911–21. Epub 2015/03/31. 10.1093/nar/gkv258 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Seymour DK, Koenig D, Hagmann J, Becker C, Weigel D. Evolution of DNA methylation patterns in the Brassicaceae is driven by differences in genome organization. PLoS Genet. 2014;10(11):e1004785 Epub 2014/11/14. 10.1371/journal.pgen.1004785 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Nystedt B, Street NR, Wetterbom A, Zuccolo A, Lin YC, Scofield DG, et al. The Norway spruce genome sequence and conifer genome evolution. Nature. 2013;497(7451):579–84. Epub 2013/05/24. 10.1038/nature12211 . [DOI] [PubMed] [Google Scholar]
  • 31.West PT, Li Q, Ji L, Eichten SR, Song J, Vaughn MW, et al. Genomic distribution of H3K9me2 and DNA methylation in a maize genome. PLoS One. 2014;9(8):e105267 Epub 2014/08/15. 10.1371/journal.pone.0105267 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.To TK, Saze H, Kakutani T. DNA Methylation within Transcribed Regions. Plant Physiol. 2015;168(4):1219–25. Epub 2015/07/06. 10.1104/pp.15.00543 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Duan CG, Wang X, Zhang L, Xiong X, Zhang Z, Tang K, et al. A protein complex regulates RNA processing of intronic heterochromatin-containing genes in Arabidopsis. Proc Natl Acad Sci U S A. 2017;114(35):E7377–E84. Epub 2017/08/16. 10.1073/pnas.1710683114 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Saze H, Kitayama J, Takashima K, Miura S, Harukawa Y, Ito T, et al. Mechanism for full-length RNA processing of Arabidopsis genes containing intragenic heterochromatin. Nat Commun. 2013;4:2301 Epub 2013/08/13. 10.1038/ncomms3301 . [DOI] [PubMed] [Google Scholar]
  • 35.Coustham V, Vlad D, Deremetz A, Gy I, Cubillos FA, Kerdaffrec E, et al. SHOOT GROWTH1 maintains Arabidopsis epigenomes by regulating IBM1. PLoS One. 2014;9(1):e84687 Epub 2014/01/10. 10.1371/journal.pone.0084687 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Wang X, Duan CG, Tang K, Wang B, Zhang H, Lei M, et al. RNA-binding protein regulates plant DNA methylation by controlling mRNA processing at the intronic heterochromatin-containing gene IBM1. Proc Natl Acad Sci U S A. 2013;110(38):15467–72. Epub 2013/09/05. 10.1073/pnas.1315399110 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Saze H. Epigenetic regulation of intragenic transposable elements: a two-edged sword. J Biochem. 2018;164(5):323–8. 10.1093/jb/mvy060 WOS:000449471000001. [DOI] [PubMed] [Google Scholar]
  • 38.Wei L, Gu L, Song X, Cui X, Lu Z, Zhou M, et al. Dicer-like 3 produces transposable element-associated 24-nt siRNAs that control agricultural traits in rice. Proc Natl Acad Sci U S A. 2014;111(10):3877–82. Epub 2014/02/21. 10.1073/pnas.1318131111 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Liu N, Lee CH, Swigut T, Grow E, Gu B, Bassik MC, et al. Selective silencing of euchromatic L1s revealed by genome-wide screens for L1 regulators. Nature. 2018;553(7687):228–32. Epub 2017/12/07. 10.1038/nature25179 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Lorincz MC, Dickerson DR, Schmitt M, Groudine M. Intragenic DNA methylation alters chromatin structure and elongation efficiency in mammalian cells. Nat Struct Mol Biol. 2004;11(11):1068–75. Epub 2004/10/07. 10.1038/nsmb840 . [DOI] [PubMed] [Google Scholar]
  • 41.Liu J, He Y, Amasino R, Chen X. siRNAs targeting an intronic transposon in the regulation of natural flowering behavior in Arabidopsis. Genes Dev. 2004;18(23):2873–8. Epub 2004/11/17. 10.1101/gad.1217304 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Kum R, Tsukiyama T, Inagaki H, Saito H, Teraishi M, Okumoto Y, et al. The active miniature inverted-repeat transposable element mPing posttranscriptionally produces new transcriptional variants in the rice genome. Mol Breeding. 2015;35(8). ARTN 159 10.1007/s11032-015-0353-y WOS:000360005100010. [DOI] [Google Scholar]
  • 43.Khan AR, Enjalbert J, Marsollier AC, Rousselet A, Goldringer I, Vitte C. Vernalization treatment induces site-specific DNA hypermethylation at the VERNALIZATION-A1 (VRN-A1) locus in hexaploid winter wheat. BMC Plant Biol. 2013;13:209 Epub 2013/12/18. 10.1186/1471-2229-13-209 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Osabe K, Harukawa Y, Miura S, Saze H. Epigenetic Regulation of Intronic Transgenes in Arabidopsis. Sci Rep. 2017;7:45166 Epub 2017/03/25. 10.1038/srep45166 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Tsuchiya T, Eulgem T. An alternative polyadenylation mechanism coopted to the Arabidopsis RPP7 gene through intronic retrotransposon domestication. Proc Natl Acad Sci U S A. 2013;110(37):E3535–43. Epub 2013/08/14. 10.1073/pnas.1312545110 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Ong-Abdullah M, Ordway JM, Jiang N, Ooi SE, Kok SY, Sarpan N, et al. Loss of Karma transposon methylation underlies the mantled somaclonal variant of oil palm. Nature. 2015;525(7570):533–7. Epub 2015/09/10. 10.1038/nature15365 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Xie Y, Zhang Y, Han J, Luo J, Li G, Huang J, et al. The Intronic cis Element SE1 Recruits trans-Acting Repressor Complexes to Repress the Expression of ELONGATED UPPERMOST INTERNODE1 in Rice. Mol Plant. 2018;11(5):720–35. Epub 2018/03/11. 10.1016/j.molp.2018.03.001 . [DOI] [PubMed] [Google Scholar]
  • 48.Questa JI, Song J, Geraldo N, An HL, Dean C. Arabidopsis transcriptional repressor VAL1 triggers Polycomb silencing at FLC during vernalization. Science. 2016;353(6298):485–8. 10.1126/science.aaf7354 WOS:000380583600040. [DOI] [PubMed] [Google Scholar]
  • 49.Yuan WY, Luo X, Li ZC, Yang WN, Wang YZ, Liu R, et al. A cis cold memory element and a trans epigenome reader mediate Polycomb silencing of FLC by vernalization in Arabidopsis. Nat Genet. 2016;48(12):1527–34. 10.1038/ng.3712 WOS:000389011100013. [DOI] [PubMed] [Google Scholar]
  • 50.Hong RL, Hamaguchi L, Busch MA, Weigel D. Regulatory elements of the floral homeotic gene AGAMOUS identified by phylogenetic footprinting and shadowing. Plant Cell. 2003;15(6):1296–309. 10.1105/tpc.009548 WOS:000185078300004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Song X, Cao X. Transposon-mediated epigenetic regulation contributes to phenotypic diversity and environmental adaptation in rice. Curr Opin Plant Biol. 2017;36:111–8. Epub 2017/03/09. 10.1016/j.pbi.2017.02.004 . [DOI] [PubMed] [Google Scholar]
  • 52.Feschotte C, Jiang N, Wessler SR. Plant transposable elements: Where genetics meets genomics. Nat Rev Genet. 2002;3(5):329–41. 10.1038/nrg793 WOS:000175350000011. [DOI] [PubMed] [Google Scholar]
  • 53.Matsumoto T, Wu JZ, Kanamori H, Katayose Y, Fujisawa M, Namiki N, et al. The map-based sequence of the rice genome. Nature. 2005;436(7052):793–800. 10.1038/nature03895 WOS:000231116500034. [DOI] [PubMed] [Google Scholar]
  • 54.Du J, Zhong X, Bernatavichute YV, Stroud H, Feng S, Caro E, et al. Dual binding of chromomethylase domains to H3K9me2-containing nucleosomes directs DNA methylation in plants. Cell. 2012;151(1):167–80. Epub 2012/10/02. 10.1016/j.cell.2012.07.034 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Roudier F, Ahmed I, Berard C, Sarazin A, Mary-Huard T, Cortijo S, et al. Integrative epigenomic mapping defines four main chromatin states in Arabidopsis. EMBO J. 2011;30(10):1928–38. 10.1038/emboj.2011.103 WOS:000291645400009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Bradnam KR, Korf I. Longer First Introns Are a General Property of Eukaryotic Gene Structure. Plos One. 2008;3(8). ARTN e3093 10.1371/journal.pone.0003093 WOS:000264796800003. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Oki N, Yano K, Okumoto Y, Tsukiyama T, Teraishi M, Tanisaka T. A genome-wide view of miniature inverted-repeat transposable elements (MITEs) in rice, Oryza sativa ssp japonica. Genes Genet Syst. 2008;83(4):321–9. 10.1266/ggs.83.321 WOS:000261872300004. [DOI] [PubMed] [Google Scholar]
  • 58.Lu C, Chen J, Zhang Y, Hu Q, Su W, Kuang H. Miniature inverted-repeat transposable elements (MITEs) have been accumulated through amplification bursts and play important roles in gene expression and species diversity in Oryza sativa. Mol Biol Evol. 2012;29(3):1005–17. Epub 2011/11/19. 10.1093/molbev/msr282 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Quadrana L, Bortolini Silveira A, Mayhew GF, LeBlanc C, Martienssen RA, Jeddeloh JA, et al. The Arabidopsis thaliana mobilome and its impact at the species level. Elife. 2016;5 Epub 2016/06/04. 10.7554/eLife.15716 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Choi JY, Purugganan MD. Evolutionary Epigenomics of Retrotransposon-Mediated Methylation Spreading in Rice. Mol Biol Evol. 2018;35(2):365–82. Epub 2017/11/11. 10.1093/molbev/msx284 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Chen J, Hu Q, Zhang Y, Lu C, Kuang H. P-MITE: a database for plant miniature inverted-repeat transposable elements. Nucleic Acids Res. 2014;42(Database issue):D1176–81. Epub 2013/11/01. 10.1093/nar/gkt1000 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Sato Y, Takehisa H, Kamatsuki K, Minami H, Namiki N, Ikawa H, et al. RiceXPro Version 3.0: expanding the informatics resource for rice transcriptome. Nuc Acids Res. 2013;41(D1):D1206–D13. 10.1093/nar/gks1125 WOS:000312893300171. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.Kawahara Y, Oono Y, Wakimoto H, Ogata J, Kanamori H, Sasaki H, et al. TENOR: Database for Comprehensive mRNA-Seq Experiments in Rice. Plant Cell Physiol. 2016;57(1):e7 Epub 2015/11/19. 10.1093/pcp/pcv179 . [DOI] [PubMed] [Google Scholar]
  • 64.Schug J, Schuller WP, Kappen C, Salbaum JM, Bucan M, Stoeckert CJ. Promoter features related to tissue specificity as measured by Shannon entropy. Genome Biol. 2005;6(4). ARTN R33 10.1186/gb-2005-6-4-r33 WOS:000228436000010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Fawcett JA, Kado T, Sasaki E, Takuno S, Yoshida K, Sugino RP, et al. QTL map meets population genomics: an application to rice. PLoS One. 2013;8(12):e83720 Epub 2014/01/01. 10.1371/journal.pone.0083720 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Wasternack C, Hause B. Jasmonates: biosynthesis, perception, signal transduction and action in plant stress response, growth and development. An update to the 2007 review in Annals of Botany. Ann Bot-London. 2013;111(6):1021–58. 10.1093/aob/mct067 WOS:000319433300002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Mansueto L, Fuentes RR, Borja FN, Detras J, Abriol-Santos JM, Chebotarov D, et al. Rice SNP-seek database update: new SNPs, indels, and queries. Nucleic Acids Res. 2017;45(D1):D1075–D81. Epub 2016/12/03. 10.1093/nar/gkw1135 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 68.Mei CS, Qi M, Sheng GY, Yang YN. Inducible overexpression of a rice allene oxide synthase gene increases the endogenous jasmonic acid level, PR gene expression, and host resistance to fungal infection. Mol Plant Microbe In. 2006;19(10):1127–37. 10.1094/Mpmi-19-1127 WOS:000240692300009. [DOI] [PubMed] [Google Scholar]
  • 69.Ogawa S, Kawahara-Miki R, Miyamoto K, Yamane H, Nojiri H, Tsujii Y, et al. OsMYC2 mediates numerous defence-related transcriptional changes via jasmonic acid signalling in rice. Biochem Bioph Res Co. 2017;486(3):796–803. 10.1016/j.bbrc.2017.03.125 WOS:000399966700030. [DOI] [PubMed] [Google Scholar]
  • 70.Le TN, Osabe K, Miyazaki Y, Saze H. Epigenetic regulation of intragenic repeats in plant genomes. Genes Genet Syst. 2016;91(6):317–. WOS:000405886000006. 10.1266/ggs.91.317 [DOI] [PubMed] [Google Scholar]
  • 71.Yang Q, Liang C, Zhuang W, Li J, Deng H, Deng Q, et al. Characterization and identification of the candidate gene of rice thermo-sensitive genic male sterile gene tms5 by mapping. Planta. 2007;225(2):321–30. Epub 2006/08/10. 10.1007/s00425-006-0353-6 . [DOI] [PubMed] [Google Scholar]
  • 72.Itabashi E, Iwata N, Fujii S, Kazama T, Toriyama K. The fertility restorer gene, Rf2, for Lead Rice-type cytoplasmic male sterility of rice encodes a mitochondrial glycine-rich protein. Plant J. 2011;65(3):359–67. Epub 2011/01/27. 10.1111/j.1365-313X.2010.04427.x . [DOI] [PubMed] [Google Scholar]
  • 73.Kubo T, Takano-kai N, Yoshimura A. RFLP mapping of genes for long kernel and awn on chromosome 3 in rice. Rice Genet Newsl. 2001;18:26–8. [Google Scholar]
  • 74.Kang HG, Park S, Matsuoka M, An G. White-core endosperm floury endosperm-4 in rice is generated by knockout mutations in the C-type pyruvate orthophosphate dikinase gene (OsPPDKB). Plant J. 2005;42(6):901–11. Epub 2005/06/09. 10.1111/j.1365-313X.2005.02423.x . [DOI] [PubMed] [Google Scholar]
  • 75.Hirano HY, Sano Y. Molecular Characterization of the Waxy Locus of Rice (Oryza-Sativa). Plant Cell Physiol. 1991;32(7):989–97. 10.1093/oxfordjournals.pcp.a078186 WOS:A1991GN75000009. [DOI] [Google Scholar]
  • 76.Kawakatsu T, Yamamoto MP, Touno SM, Yasuda H, Takaiwa F. Compensation and interaction between RISBZ1 and RPBF during grain filling in rice. Plant Journal. 2009;59(6):908–20. 10.1111/j.1365-313X.2009.03925.x WOS:000269708400005. [DOI] [PubMed] [Google Scholar]
  • 77.Miura A, Nakamura M, Inagaki S, Kobayashi A, Saze H, Kakutani T. An Arabidopsis jmjC domain protein protects transcribed genes from DNA methylation at CHG sites. EMBO J. 2009;28(8):1078–86. Epub 2009/03/06. 10.1038/emboj.2009.59 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78.Inagaki S, Miura-Kamio A, Nakamura Y, Lu F, Cui X, Cao X, et al. Autocatalytic differentiation of epigenetic modifications within the Arabidopsis genome. EMBO J. 2010;29(20):3496–506. Epub 2010/09/14. 10.1038/emboj.2010.227 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Lu FL, Li GL, Cui X, Liu CY, Wang XJ, Cao XF. Comparative analysis of JmjC domain-containing proteins reveals the potential histone demethylases in Arabidopsis and rice. J Integr Plant Biol. 2008;50(7):886–96. 10.1111/j.1744-7909.2008.00692.x WOS:000257708300014. [DOI] [PubMed] [Google Scholar]
  • 80.Zhang QJ, Zhu T, Xia EH, Shi C, Liu YL, Zhang Y, et al. Rapid diversification of five Oryza AA genomes associated with rice adaptation. Proc Natl Acad Sci USA. 2014;111(46):E4954–E62. 10.1073/pnas.1418307111 WOS:000345153300010. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Cheng C, Tarutani Y, Miyao A, Ito T, Yamazaki M, Sakai H, et al. Loss of function mutations in the rice chromomethylase OsCMT3a cause a burst of transposition. Plant J. 2015;83(6):1069–81. Epub 2015/08/06. 10.1111/tpj.12952 . [DOI] [PubMed] [Google Scholar]
  • 82.Moritoh S, Eun CH, Ono A, Asao H, Okano Y, Yamaguchi K, et al. Targeted disruption of an orthologue of DOMAINS REARRANGED METHYLASE 2, OsDRM2, impairs the growth of rice plants by abnormal DNA methylation. Plant Journal. 2012;71(1):85–98. 10.1111/j.1365-313X.2012.04974.x WOS:000305407000008. [DOI] [PubMed] [Google Scholar]
  • 83.Higo H, Tahir M, Takashima K, Miura A, Watanabe K, Tagiri A, et al. DDM1 (Decrease in DNA Methylation) genes in rice (Oryza sativa). Molecular Genetics and Genomics. 2012;287(10):785–92. 10.1007/s00438-012-0717-5 WOS:000309240500002. [DOI] [PubMed] [Google Scholar]
  • 84.Kakutani T, Jeddeloh JA, Flowers SK, Munakata K, Richards EJ. Developmental abnormalities and epimutations associated with DNA hypomethylation mutations. Proc Natl Acad Sci USA. 1996;93(22):12406–11. 10.1073/pnas.93.22.12406 WOS:A1996VP93700065. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Bartee L, Malagnac F, Bender J. Arabidopsis cmt3 chromomethylase mutations block non-CG methylation and silencing of an endogenous gene. Gene Dev. 2001;15(14):1753–8. 10.1101/gad.905701 WOS:000170020000002. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86.Lindroth AM, Cao XF, Jackson JP, Zilberman D, McCallum CM, Henikoff S, et al. Requirement of CHROMOMETHYLASE3 for maintenance of CpXpG methylation. Science. 2001;292(5524):2077–80. 10.1126/science.1059745 WOS:000169284700048. [DOI] [PubMed] [Google Scholar]
  • 87.Cao XF, Jacobsen SE. Role of the Arabidopsis DRM methyltransferases in de novo DNA methylation and gene silencing. Curr Biol. 2002;12(13):1138–44. Pii S0960-9822(02)00925-9 10.1016/s0960-9822(02)00925-9 WOS:000176916900026. [DOI] [PubMed] [Google Scholar]
  • 88.Wei LY, Gu LF, Song XW, Cui XK, Lu ZK, Zhou M, et al. Dicer-like 3 produces transposable element-associated 24-nt siRNAs that control agricultural traits in rice. Proc Natl Acad Sci USA. 2014;111(10):3877–82. 10.1073/pnas.1318131111 WOS:000332564800056. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89.Niu XM, Xu YC, Li ZW, Bian YT, Hou XH, Chen JF, et al. Transposable elements drive rapid phenotypic variation in Capsella rubella. Proc Natl Acad Sci USA. 2019;116(14):6908–13. 10.1073/pnas.1811498116 WOS:000463069900067. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90.Maumus F, Quesneville H. Ancestral repeats have shaped epigenome and genome composition for millions of years in Arabidopsis thaliana. Nat Commun. 2014;5 ARTN 4104 10.1038/ncomms5104 WOS:000338838200018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91.Chamary JV, Hurst LD. Similar rates but different modes of sequence evolution in introns and at exonic silent sites in rodents: Evidence for selectively driven codon usage. Mol Biol Evol. 2004;21(6):1014–23. 10.1093/molbev/msh087 WOS:000221599300006. [DOI] [PubMed] [Google Scholar]
  • 92.Keightley PD, Gaffney DJ. Functional constraints and frequency of deleterious mutations in noncoding DNA of rodents. Proc Natl Acad Sci USA. 2003;100(23):13402–6. 10.1073/pnas.2233252100 WOS:000186573700053. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93.Parra G, Bradnam K, Rose AB, Korf I. Comparative and functional analysis of intron-mediated enhancement signals reveals conserved features among plants. Nuc Acids Res. 2011;39(13):5328–37. 10.1093/nar/gkr043 WOS:000293020000009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 94.Jeon JS, Lee S, Jung KH, Jun SH, Kim C, An G. Tissue-preferential expression of a rice alpha-tubulin gene, OsTubA1, mediated by the first intron. Plant Physiol. 2000;123(3):1005–14. 10.1104/pp.123.3.1005 WOS:000088213300023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 95.Morello L, Bardini M, Sala F, Breviario D. A long leader intron of the Ostub16 rice beta-tubulin gene is required for high-level gene expression and can autonomously promote transcription both in vivo and in vitro. Plant J. 2002;29(1):33–44. 10.1046/j.0960-7412.2001.01192.x WOS:000173544800004. [DOI] [PubMed] [Google Scholar]
  • 96.Liu SZ, Yeh CT, Ji TM, Ying K, Wu HY, Tang HM, et al. Mu Transposon Insertion Sites and Meiotic Recombination Events Co-Localize with Epigenetic Marks for Open Chromatin across the Maize Genome. Plos Genet. 2009;5(11). ARTN e1000733 10.1371/journal.pgen.1000733 WOS:000272419500028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 97.Vollbrecht E, Duvick J, Schares JP, Ahern KR, Deewatthanawong P, Xu L, et al. Genome-Wide Distribution of Transposed Dissociation Elements in Maize. Plant Cell. 2010;22(6):1667–85. 10.1105/tpc.109.073452 WOS:000280505300004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 98.Yang L, Gaut BS. Factors that Contribute to Variation in Evolutionary Rate among Arabidopsis Genes. Mol Biol Evol. 2011;28(8):2359–69. 10.1093/molbev/msr058 WOS:000293304700017. [DOI] [PubMed] [Google Scholar]
  • 99.Turner BM. Epigenetic responses to environmental change and their evolutionary implications. Philos T R Soc B. 2009;364(1534):3403–18. 10.1098/rstb.2009.0125 WOS:000270800800009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 100.Meyers BC, Kaushik S, Nandety RS. Evolving disease resistance genes. Curr Opin Plant Biol. 2005;8(2):129–34. Epub 2005/03/09. 10.1016/j.pbi.2005.01.002 . [DOI] [PubMed] [Google Scholar]
  • 101.Espinas NA, Saze H, Saijo Y. Epigenetic Control of Defense Signaling and Priming in Plants. Front Plant Sci. 2016;7 ARTN 1201 10.3389/fpls.2016.01201 WOS:000381206400001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 102.Hosaka A, Kakutani T. Transposable elements, genome evolution and transgenerational epigenetic variation. Curr Opin Plant Biol. 2018;49:43–8. 10.1016/j.gde.2018.02.012 WOS:000433211500007. [DOI] [PubMed] [Google Scholar]
  • 103.Li X, Guo K, Zhu X, Chen P, Li Y, Xie G, et al. Domestication of rice has reduced the occurrence of transposable elements within gene coding regions. BMC Genomics. 2017;18(1):55 Epub 2017/01/11. 10.1186/s12864-016-3454-z [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 104.Parenteau J, Maignon L, Berthoumieux M, Catala M, Gagnon V, Abou Elela S. Introns are mediators of cell response to starvation. Nature. 2019;565(7741):612–7. Epub 2019/01/18. 10.1038/s41586-018-0859-7 . [DOI] [PubMed] [Google Scholar]
  • 105.Morgan JT, Fink GR, Bartel DP. Excised linear introns regulate growth in yeast. Nature. 2019;565(7741):606–11. Epub 2019/01/18. 10.1038/s41586-018-0828-1 . [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 106.Sakai H, Lee SS, Tanaka T, Numa H, Kim J, Kawahara Y, et al. Rice Annotation Project Database (RAP-DB): An Integrative and Interactive Database for Rice Genomics. Plant Cell Physiol. 2013;54(2):E6–+. 10.1093/pcp/pcs183 WOS:000315218700006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 107.Jurka J, Kapitonov VV, Pavlicek A, Klonowski P, Kohany O, Walichiewicz J. Repbase update, a database of eukaryotic repetitive elements. Cytogenet Genome Res. 2005;110(1–4):462–7. 10.1159/000084979 WOS:000231064600047. [DOI] [PubMed] [Google Scholar]
  • 108.Johnson M, Zaretskaya I, Raytselis Y, Merezhuk Y, McGinnis S, Madden TL. NCBI BLAST: a better web interface. Nucleic Acids Res. 2008;36(Web Server issue):W5–9. Epub 2008/04/29. 10.1093/nar/gkn201 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 109.Lu L, Chen JF, Robb SMC, Okumoto Y, Stajich JE, Wessler SR. Tracking the genome-wide outcomes of a transposable element burst over decades of amplification. Proc Natl Acad Sci USA. 2017;114(49):E10550–E9. 10.1073/pnas.1716459114 WOS:000417339700009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 110.Miki D, Shimamoto K. Simple RNAi vectors for stable and transient suppression of gene function in rice. Plant Cell Physiol. 2004;45(4):490–5. 10.1093/pcp/pch048 WOS:000221037200015. [DOI] [PubMed] [Google Scholar]
  • 111.Miura F, Enomoto Y, Dairiki R, Ito T. Amplification-free whole-genome bisulfite sequencing by post-bisulfite adaptor tagging. Nucleic Acids Res. 2012;40(17):e136 Epub 2012/06/01. 10.1093/nar/gks454 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 112.Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–20. 10.1093/bioinformatics/btu170 WOS:000340049100004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 113.Krueger F, Andrews SR. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics. 2011;27(11):1571–2. 10.1093/bioinformatics/btr167 WOS:000291062400018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 114.Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841–2. 10.1093/bioinformatics/btq033 WOS:000275243500019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 115.Takuno S, Gaut BS. Body-methylated genes in Arabidopsis thaliana are functionally important and evolve slowly. Mol Biol Evol. 2012;29(1):219–27. Epub 2011/08/05. 10.1093/molbev/msr188 . [DOI] [PubMed] [Google Scholar]
  • 116.Stroud H, Greenberg MV, Feng S, Bernatavichute YV, Jacobsen SE. Comprehensive analysis of silencing mutants reveals complex regulation of the Arabidopsis methylome. Cell. 2013;152(1–2):352–64. Epub 2013/01/15. 10.1016/j.cell.2012.10.054 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 117.Ramirez F, Dundar F, Diehl S, Gruning BA, Manke T. deepTools: a flexible platform for exploring deep-sequencing data. Nuc Acids Res. 2014;42(W1):W187–W91. 10.1093/nar/gku365 WOS:000339715000031. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 118.Tian T, Liu Y, Yan HY, You Q, Yi X, Du Z, et al. agriGO v2.0: a GO analysis toolkit for the agricultural community, 2017 update. Nuc Acids Res. 2017;45(W1):W122–W9. 10.1093/nar/gkx382 WOS:000404427000019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 119.Mi HY, Lazareva-Ulitsky B, Loo R, Kejariwal A, Vandergriff J, Rabkin S, et al. The PANTHER database of protein families, subfamilies, functions and pathways. Nuc Acids Res. 2005;33:D284–D8. 10.1093/nar/gki078 WOS:000226524300058. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 120.Sun JQ, Nishiyama T, Shimizu K, Kadota K. TCC: an R package for comparing tag count data with robust normalization strategies. BMC Bioinformatics. 2013;14 Artn 219 10.1186/1471-2105-14-219 WOS:000321835900001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 121.Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9(4):357–U54. 10.1038/nmeth.1923 WOS:000302218500017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 122.Alexandrov N, Tai S, Wang W, Mansueto L, Palis K, Fuentes RR, et al. SNP-Seek database of SNPs derived from 3000 rice genomes. Nucleic Acids Res. 2015;43(Database issue):D1023–7. Epub 2014/11/29. 10.1093/nar/gku1039 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 123.Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nat Methods. 2015;12(4):357–60. Epub 2015/03/10. 10.1038/nmeth.3317 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 124.Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25(16):2078–9. Epub 2009/06/10. 10.1093/bioinformatics/btp352 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 125.Freese NH, Norris DC, Loraine AE. Integrated genome browser: visual analytics platform for genomics. Bioinformatics. 2016;32(14):2089–95. Epub 2016/05/07. 10.1093/bioinformatics/btw069 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 126.Liao Y, Smyth GK, Shi W. The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote. Nuc Acids Res. 2013;41(10). ARTN e108 10.1093/nar/gkt214 WOS:000319806600005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 127.Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12). ARTN 550 10.1186/s13059-014-0550-8 WOS:000346609500022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 128.Gremme G, Brendel V, Sparks ME, Kurtz S. Engineering a software tool for gene structure prediction in higher organisms. Inform Software Tech. 2005;47(15):965–78. 10.1016/j.infsof.2005.09.005 WOS:000234322400003. [DOI] [Google Scholar]
  • 129.Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, et al. Clustal W and clustal X version 2.0. Bioinformatics. 2007;23(21):2947–8. 10.1093/bioinformatics/btm404 WOS:000251197700021. [DOI] [PubMed] [Google Scholar]
  • 130.Nei M, Gojobori T. Simple Methods for Estimating the Numbers of Synonymous and Nonsynonymous Nucleotide Substitutions. Mol Biol Evol. 1986;3(5):418–26. WOS:A1986E136000004. 10.1093/oxfordjournals.molbev.a040410 [DOI] [PubMed] [Google Scholar]
  • 131.Thompson JD, Higgins DG, Gibson TJ. Clustal-W—Improving the Sensitivity of Progressive Multiple Sequence Alignment through Sequence Weighting, Position-Specific Gap Penalties and Weight Matrix Choice. Nuc Acids Res. 1994;22(22):4673–80. 10.1093/nar/22.22.4673 WOS:A1994PU19900018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 132.Letunic I, Bork P. 20 years of the SMART protein domain annotation resource. Nuc Acids Res. 2018;46(D1):D493–D6. 10.1093/nar/gkx922 WOS:000419550700075. [DOI] [PMC free article] [PubMed] [Google Scholar]

Decision Letter 0

Wendy A Bickmore, Ortrun Mittelsten Scheid

18 Jul 2019

Dear Dr Saze,

Thank you very much for submitting your Research Article entitled 'Genome-wide distribution of intronic heterochromatin impacts gene transcription and sequence divergence in the rice genome' to PLOS Genetics. Your manuscript was fully evaluated at the editorial level and by independent peer reviewers. The reviewers appreciated the attention to an important problem, but raised some substantial concerns about the current manuscript. Based on the reviews, we will not be able to accept this version of the manuscript, but we would be willing to review again a much-revised version. We cannot, of course, promise publication at that time.

As you will see from the detailed comments of the reviewers, they all consider the work potentially interesting but, in its current state, not suitable for publication. They have a number of suggestions for shifting the focus within the paper, for specifying or adding details, and for considering alternative interpretations. Specifically, they raise several points about the statistical handling of the data; addressing these issues may require adjusting some of the statements accordingly. In addition, they suggest some improvements to the data presentation. We hope that these comments are helpful in revising the manuscript.

Should you decide to revise the manuscript for further consideration here, your revisions should address the specific points made by each reviewer. We will also require a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

If you decide to revise the manuscript for further consideration at PLOS Genetics, please aim to resubmit within the next 60 days, unless it will take extra time to address the concerns of the reviewers, in which case we would appreciate an expected resubmission date by email to plosgenetics@plos.org.

If present, accompanying reviewer attachments are included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see our guidelines.

Please be aware that our data availability policy requires that all numerical data underlying graphs or summary statistics are included with the submission, and you will need to provide this upon resubmission if not already present. In addition, we do not permit the inclusion of phrases such as "data not shown" or "unpublished results" in manuscripts. All points should be backed up by data provided with the submission.

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool.  PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

PLOS has incorporated Similarity Check, powered by iThenticate, into its journal-wide submission system in order to screen submitted content for originality before publication. Each PLOS journal undertakes screening on a proportion of submitted articles. You will be contacted if needed following the screening process.

To resubmit, use the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.

[LINK]

We are sorry that we cannot be more positive about your manuscript at this stage. Please do not hesitate to contact us if you have any concerns or questions.

Yours sincerely,

Ortrun Mittelsten Scheid

Associate Editor

PLOS Genetics

Wendy Bickmore

Section Editor: Epigenetics

PLOS Genetics

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors characterise heterochromatic introns (defined as introns with 5 or more consecutive mCHG with average methylation >=0.5) in the rice genome, and demonstrate that the OsIBM2 gene plays a similar role to its Arabidopsis counterpart in transcription through heterochromatic regions, but, unlike in Arabidopsis, OsIBM2 mutations have severe phenotypic consequences. The manuscript is clear and interesting. My comments are therefore mostly technical. The conclusions about regulatory roles of intronic heterochromatin can be de-emphasised, while the data from the OsIBM2 knock-downs and knock-out are very nice and could have a greater place in the abstract.
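The criterion summarised by Reviewer #1 (5 or more consecutive mCHG with average methylation of at least 0.5) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the per-site cutoff `min_site_level` used to decide whether a CHG site counts as methylated, and the choice to evaluate only maximal runs are all assumptions made for illustration.

```python
from typing import List, Sequence


def has_heterochromatic_run(chg_levels: Sequence[float],
                            min_run: int = 5,
                            min_mean: float = 0.5,
                            min_site_level: float = 0.0) -> bool:
    """Return True if a maximal run of >= min_run consecutive methylated CHG
    sites has a mean methylation level >= min_mean.

    chg_levels: per-site CHG methylation levels (0..1) in genomic order.
    min_site_level: hypothetical cutoff; a site is "methylated" if its
    level is strictly greater than this value.
    """
    run: List[float] = []
    # A sentinel below any valid cutoff closes the final run.
    for level in list(chg_levels) + [-1.0]:
        if level > min_site_level:
            run.append(level)
            continue
        if len(run) >= min_run and sum(run) / len(run) >= min_mean:
            return True
        run = []
    return False


# Five consecutive methylated sites with a mean of 0.7 satisfy the criterion.
print(has_heterochromatic_run([0.9, 0.8, 0.7, 0.6, 0.5]))
# An unmethylated site splits the run into stretches of 2 and 3 sites.
print(has_heterochromatic_run([0.9, 0.8, 0.0, 0.7, 0.6, 0.5]))
```

Because only maximal runs are tested, a long weakly methylated run that contains a strongly methylated stretch of five sites would be missed; a sliding-window variant would behave differently.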

Major comments:

1. The evidence for the regulatory roles of intronic heterochromatin is too thin to include it as a conclusion in the abstract. The sentence in the summary (“may have regulatory roles...”) is more accurate, and should be preferred.

2. Part of the evidence comes from a GO analysis. GO enrichment results are in general difficult to interpret, but it is even more difficult here as the fold-enrichments and number of genes of each category are not reported. The broad term “catalytic activity” is significantly enriched, but how many of the 2,500 genes with heterochromatic introns are concerned?

3. Additional evidence comes from the entropy of gene expression. Although there are significant differences between genes with and without heterochromatic introns, the magnitudes of the difference seem very small. The text should reflect this.

4. The evidence around the JA responsiveness is very weak. From just 4 cultivars and 3 genes with no consistent response patterns across cultivars, I believe the carefully-worded conclusion that “these results suggest that structural variations of heterochromatic introns may have impacts on gene responses to environmental signals” is still too strong: it may also have no impact. One would need a lot more cultivars to start deriving trends, and/or a lot more JA-responsive genes. I recommend keeping the results in the manuscript but noting that they do not provide evidence. The authors rightly note that non-intronic sequence polymorphisms may also explain the altered JA response; trans-factors may also be at play.

In presenting this data, plotting individual points rather than mean + sd would be preferable. The use of asterisks is better kept for statistical significance than category of structural variation. Please also give the results of the multiple-testing-corrected statistical tests.

Minor comments:

1. Please provide a summary table for the sequencing (WGBS and RNA-seq): number of reads sequenced and mapped, bisulfite conversion rate, median cytosine coverage, etc. It will help in quickly assessing study design and robustness of the underlying data.

2. The mapping strategy for the bisulfite data may cause a small amount of double-counting: if the reads of a pair overlap each other, and map to a SNP (or PCR error), they would not be aligned in paired end (no mismatches allowed) but would both be successfully aligned in the second round as single-end reads with 1 mismatch. The cytosines in the overlapping region would then be counted twice. This should be rare and is unlikely to affect the results, but I’d recommend relaxing the mapping parameters in the paired-end round. Here again the summary table of mapping efficiency/... would help in evaluating the mapping strategy.

3. Cytosines with low coverage (0-2) are excluded, but it may also be worth excluding outliers at the upper end: regions of unnaturally high coverage may be chloroplast insertions in the genome assembly or, particularly relevant to this study, repeated elements that are not well represented in the assembly.

4. It is unusual to deduplicate RNA-seq reads; in the absence of UMIs it risks introducing biases (for a discussion see https://sequencing.qcfail.com/articles/libraries-can-contain-technical-duplication/ and https://www.biostars.org/p/55648/). Did the libraries require this? If they did, any results obtained after deduplication should be taken with a pinch of salt.

5. It is also unusual to collapse biological RNA-seq replicates (l. 524-528). Please keep the replication in the analysis even if it means relaxing the significance threshold to look at an OK number of genes.

6. TPM usually refers to Transcripts Per Million rather than Tags Per Million. If the authors mean tags per million, the cpm (counts per million) unit would be clearer.

7. The plots in Fig 4B and C show the same thing, only with a different axis scale. It’s a good idea to plot log values of cpms/tpms rather than non-logged counts, as in Fig 4C. If the style of Fig 4B is desired, a violin plot would advantageously summarise all the information in one graph.

8. In the context of their study, the authors discuss the documented link between R genes and TEs. Could this be specifically addressed? For instance, are R genes over-represented in the set of genes with intronic-heterochromatin?
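The unit distinction raised in minor comment 6 can be made concrete with a short sketch contrasting counts per million (CPM) with transcripts per million (TPM). The gene names, counts and lengths are invented for illustration and are not taken from the study.

```python
from typing import Dict


def cpm(counts: Dict[str, float]) -> Dict[str, float]:
    """Counts per million: scale raw counts by library size only."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("library has no counts")
    return {gene: c / total * 1e6 for gene, c in counts.items()}


def tpm(counts: Dict[str, float], lengths_bp: Dict[str, float]) -> Dict[str, float]:
    """Transcripts per million: normalise by length first, then by the
    sum of the length-normalised rates."""
    rates = {gene: counts[gene] / lengths_bp[gene] for gene in counts}
    total_rate = sum(rates.values())
    if total_rate <= 0:
        raise ValueError("library has no counts")
    return {gene: r / total_rate * 1e6 for gene, r in rates.items()}


# Hypothetical two-gene library: equal read counts, fourfold length difference.
example_counts = {"geneA": 100, "geneB": 100}
example_lengths = {"geneA": 1000, "geneB": 4000}

print(cpm(example_counts))                   # identical values for both genes
print(tpm(example_counts, example_lengths))  # geneA is fourfold higher than geneB
```

The two units agree only when all features have the same length, which is why stating which one is meant matters when expression levels are compared between genes.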

Reviewer #2: This manuscript provides a comprehensive analysis of introns with heterochromatin in rice. Given that we know very little about this in larger genomes, and that the authors did comparative analysis with other rice cultivars, and that they included mutant analysis, I was initially excited about these results. The idea that intronic TE insertions may be an important source of regulatory information is indeed intriguing, but I do not think the authors have made the case, nor have they considered numerous alternative interpretations of their data. The mutant analysis was particularly disappointing, as only a tiny proportion of any of the genes’ expression was affected in the expected way and a more comprehensive analysis of the RNAseq of the mutants was missing. In the end, the authors ended up with anecdotes that may or may not support what is clearly a favored hypothesis. However, particularly when it comes to TEs and epigenetics, we must always consider the null hypothesis, which is that most TEs most of the time do not provide a selective advantage to their host. The authors have made some intriguing arguments based on their data, but, in the end, it is not convincing.

Line 68: “often acquire regulatory functions for surrounding genes”. Unclear what this means.

Line 113: “introns at genome-wide” should read “introns at a genome-wide”

Line 159: or preferential targeting? Many TEs appear to preferentially target the 5’ ends of genes.

Line 165: Hopefully, this data set will be made available.

Line 178: This is very likely the case. Are these repetitive or single copy? Do they form hairpins? Are they helitrons, which are notoriously difficult to identify?

Line 246: I’m not sure I find these data particularly convincing. It is a small number of genes and, as the authors state, there certainly could be other causal sequence polymorphisms. It would have been more convincing with more examples, perhaps of other enriched terms? Also, the error bars from the RT-PCR are pretty large.

Line 279: I’m having a hard time interpreting this. “detected 198 genes both with and without heterochromatic introns that showed changes in transcripts downstream of introns (27%; 54/198 genes with heterochromatic intron, expression of which commonly changed in RNAi_#2, #16 and osibm2_g#24 lines; Table S3)”. So a total of 198 showed changes and some were with and some were without heterochromatic introns. And only 27% of the total were heterochromatic. Is this statistically significant? Since 11% of introns are heterochromatic, I guess it could be, but the argument should be made here. And the percent of heterochromatic introns that are affected was only 54/4,150, or 1.3%. I may be missing something, but this result does not appear to represent a trend.

Line 283: But the same could be true for the non-heterochromatin intron genes.

Line 288: Why not do the control, which are genes whose downstream expression went down even though they didn’t have any heterochromatin?

Line 289: I disagree, what the data suggests is that a very small fraction of genes with and without heterochromatin are affected when OsIBM2 is knocked down.

Line 295: Which allele was this?

Line 326: In plants with large genomes, many gene models do not actually correspond to genes. Rather, they represent Pack-MULEs and helitrons that capture gene fragments. If they were inadvertently counted, then this would explain an overall relaxation in selection. Alternatively, genes under relaxed selective constraints may simply be able to tolerate these insertions.

Line 332: Don't you mean exon here? You can’t get Ka/Ks from introns.

Line 343: I’m not convinced of this. The mutants could cause lethality for a variety of reasons, and only a minority of genes that were changed downstream of the introns had heterochromatic introns, as the authors point out. Further, I’m guessing that expression of a very large number of genes besides these was affected in the mutant.

Line 359: Alternatively, some genes, such as those expressed at lower levels or under certain conditions may simply tolerate these insertions.

Line 376: There is also ample evidence that many TEs target the 5’ ends of genes, and insertions into the exons in the 5’ ends of genes would be selected against.

Line 389: I don’t think so. It just shows that neutral sequences evolve rapidly, and methylation makes them evolve more rapidly.
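The question raised at Line 279 (whether 54 of 198 affected genes is more than expected against a background of about 11%) can be checked with an exact one-sided binomial test. The sketch below uses only the numbers quoted in that comment. Treating 11% as the null proportion is an assumption taken from the reviewer's remark; the appropriate background (for example, the fraction of testable genes that carry a heterochromatic intron) may differ, and this is not the authors' analysis.

```python
from math import comb


def binom_upper_tail(k: int, n: int, p: float) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    if not (0 <= k <= n) or not (0.0 <= p <= 1.0):
        raise ValueError("need 0 <= k <= n and 0 <= p <= 1")
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


affected_total = 198    # genes with altered downstream transcripts (from the comment)
affected_hetero = 54    # of those, genes with a heterochromatic intron
background = 0.11       # assumed null proportion (reviewer's approximate figure)
hetero_genes = 4150     # denominator used in the reviewer's second calculation

expected = affected_total * background
p_value = binom_upper_tail(affected_hetero, affected_total, background)

print(f"observed {affected_hetero}, expected under the null {expected:.2f}")
print(f"one-sided exact binomial p-value: {p_value:.3g}")
print(f"fraction of heterochromatic-intron genes affected: {affected_hetero / hetero_genes:.3f}")
```

The two quantities answer different questions: the tail probability asks whether heterochromatic-intron genes are over-represented among the affected genes, while 54/4,150 describes how small a share of such genes is affected at all. Both points in the comment can therefore hold at once.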

Reviewer #3: The manuscript by Espinas and colleagues reports on the description of heterochromatic introns (HIs) in the rice genome. The authors have performed WGBS to reveal introns with a heterochromatic signature (DNA methylation in all sequence contexts) and found around 4000 genes harboring such HIs. They further analyzed the epigenetic control of these introns, and obtained mutants (RNAi lines and a CRISPR mutant) for IBM2. The mutants are sterile, suggesting embryonic lethality. However, only 200 common genes are affected in the 3 lines analyzed. While the study is of interest, I think the manuscript would benefit from a more complete description of some results.

1. From Figure 3C it seems that repeats are enriched (compared to TEs) at HIs. Could the authors specify how these repeats were annotated? Concerning the authors' own annotation of TEs, how does it compare with published ones? (What % of the genome is covered?).

2. Figure 4: what are the functional categories depleted for HIs?

3. Study of polymorphisms at HIs: why did the authors limit their analysis to 3 varieties when 3,000 rice genomes are available? Did they compare with the 12 rice species with assembled genomes?

4. The functional evidence based on only one gene (line 245) should be interpreted with great caution. Only CRISPR excision of the HI or introgression line could properly address this question. This validation would be beyond the scope of this study.

5. The IBM2 mutant analysis is described neither in the abstract nor in the introduction. It should be mentioned, as it represents an important part of the manuscript. Did the authors investigate the putative target genes involved in the sterile phenotype? What about TE expression in this mutant? I think this part of the work will be of interest to many rice researchers and could be described in more detail. For instance, how to explain the strong phenotype given that osibm2 has no impact on DNA methylation? Could the difference in overexpressed genes in the 3 lines be stochastic and depend on the level of remaining IBM2 expression (line 16 clearly the most affected)?

Minor comments:

Line 105 reference to be edited.

Line 106: please specify that this epiallele (Karma) is revealed under in vitro culture conditions

Os11g0229300 could be shown in Figure 1.

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Quentin Gouil

Reviewer #2: No

Reviewer #3: Yes: Marie Mirouze

Decision Letter 1

Wendy A Bickmore, Ortrun Mittelsten Scheid

3 Jan 2020

* Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out. *

Dear Dr Saze,

Thank you very much for submitting the revision of your Research Article entitled 'Transcriptional regulation of genes bearing intronic heterochromatin in the rice genome' to PLOS Genetics. Two of the previous reviewers agreed to re-review the revision, and both agree that the manuscript has substantially improved. The main concern has been addressed, but there are some aspects that still need minor revisions. As both reviewers state, providing the TE annotation data in an accessible format is essential. The entropy analysis in connection with the GO terms should be better specified, and some specific questions need to be addressed, as you will see from the detailed comments. There are also several suggestions for text edits.

We therefore ask you to modify the manuscript according to the review recommendations before we can consider your manuscript for acceptance. Your revisions should address the specific points made by each reviewer.

In addition we ask that you:

1) Provide a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript.

2) Upload a Striking Image with a corresponding caption to accompany your manuscript if one is available (either a new image or an existing one from within your manuscript). If this image is judged to be suitable, it may be featured on our website. Images should ideally be high resolution, eye-catching, single panel square images. For examples, please browse our archive. If your image is from someone other than yourself, please ensure that the artist has read and agreed to the terms and conditions of the Creative Commons Attribution License. Note: we cannot publish copyrighted images.

We hope to receive your revised manuscript within the next 30 days. If you anticipate any delay in its return, we would ask you to let us know the expected resubmission date by email to plosgenetics@plos.org.

If present, accompanying reviewer attachments should be included with this email; please notify the journal office if any appear to be missing. They will also be available for download from the link below. You can use this link to log into the system when you are ready to submit a revised version, having first consulted our Submission Checklist.

To resubmit, you will need to go to the link below and 'Revise Submission' in the 'Submissions Needing Revision' folder.

[LINK]

Please let us know if you have any questions while making these revisions.

Yours sincerely,

Ortrun Mittelsten Scheid

Associate Editor

PLOS Genetics

Wendy Bickmore

Section Editor: Epigenetics

PLOS Genetics

Reviewer's Responses to Questions

Comments to the Authors:

Please note here if the review is uploaded as an attachment.

Reviewer #1: The authors have made a substantial effort to address the comments. The data are better presented and the toned-down conclusions are more accurate.

I would still like to see a mention of effect size, and not only statistical significance, in the description of the GO enrichment and entropy analyses. For example, an 18% enrichment in cell death terms seems quite small. Overall it seems that the GO terms of heterochromatic intron-containing genes are not too different from those of the rest of the genes. For the entropy (and overall expression), the changes also appear to be modest. For the R gene enrichment, please specify the “background” R gene percentage.
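The effect size requested here is usually reported as a fold enrichment relative to the background proportion, alongside the p-value. A minimal sketch follows; the counts are invented solely to reproduce an 18% enrichment and are not values from the manuscript.

```python
from math import log2


def fold_enrichment(hits_in_set: int, set_size: int,
                    hits_in_background: int, background_size: int) -> float:
    """Ratio of the term's proportion in the gene set to its proportion
    in the background gene list."""
    if set_size <= 0 or background_size <= 0 or hits_in_background <= 0:
        raise ValueError("set sizes and background hits must be positive")
    return (hits_in_set / set_size) / (hits_in_background / background_size)


# Hypothetical counts: 11.8% of the gene set versus 10% of the background.
fe = fold_enrichment(hits_in_set=118, set_size=1000,
                     hits_in_background=1000, background_size=10000)
print(f"fold enrichment: {fe:.2f} (log2: {log2(fe):.2f})")
```

Reporting the background proportion next to the fold value lets readers judge whether a statistically significant term is also a large shift, which is the same information requested for the R gene comparison.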

As requested by reviewer 2, the TE annotation should be deposited rather than relying on direct requests.

There is a small formatting error on the p-value on l.161.

Reviewer #2: Overall, the authors have done a good job addressing the concerns I had. However, I do have some remaining issues that should be addressed before publication.

General note: Since the locations of the TEs is essential for replicating these experiments, I do not think that making the TEs “available on request” is sufficient. The data should be provided as a supplemental data set in an easily convertible format (not a PDF). Also, please be sure to address my confusion concerning entropy values for the various conditions.

Line 39. Not sure what “basal functions” means here.

Line 53. And yet most genes with h-introns are not affected by the mutation.

Line 66. “ species with small genome” should be small genomes

Line 95. You should say what IBM2 is here.

Line 144. should read, “similar to”

Line 210. Figs. should be Fig.

Line 223. I find this a bit confusing. How does the entropy for the treatments work? Were multiple tissues for each treatment examined? If not, how was entropy calculated? By just comparing treated and untreated? If you are talking about “responsiveness”, wouldn’t you want to look at changes in expression (up or down)?

Line 229. looked for, or sought, rather than sought for.

Line 275. So expression of the downstream exons relative to upstream exons, or relative to w.t.?

Line 276. Given that the degree of knockdown for the two RNAi lines is nearly identical, why is there a massive difference between the two with respect to differential expression of downstream exons? Also, just to be clear, these are all DEGs with respect to the ratio of 5’ to 3’ exons, not all the DEGs?

Line 281. Since I can’t tell where the primers are and the “truncated” transcript looks like a smear, I am not convinced that figure 5G shows premature polyadenylation. Figure 5F is confusing, since it shows that 610300 (OsIBM) has a loss of 5’ exons in plants targeting the 3’ exons. However, the overall trend, for downstream exons to be lost (presumably through premature polyadenylation) is clear and convincing, and the examples in Figure S14 are more convincing (perhaps one of these should be used in place of the one in the main figure).

Line 287. I’m assuming that none of these were associated with premature polyadenylation?

Line 316. Indeed, the TE insertion is even missing from the A. lyrata homolog of IBM1, so the global effects of IBM2 are almost certainly contingent on a random insertion into IBM1 in A. thaliana.

Line 333. should read constraint. Interestingly, the same appears to be true for upstream insertions of TEs in maize.

Line 341. What if you masked the methylated regions?

Line 352. This is hardly surprising given that many genes are presumably indirectly affected by the mutant.

Line 363. Not to quibble, but I often find this kind of argument frustrating. The implication is that heterochromatin formed at TEs is “functional” because premature polyadenylation at the TE occurs in the IBM2 mutant. However, I would argue that the IBM2 protein simply masks the presence of the TE so that it can be tolerated. What would convince me otherwise would be evidence that the TE insertion in wild type plants has an effect on any aspect of the phenotype.

Line 383. I never bought that argument, unless you think that maize has a far more sophisticated regulatory apparatus than does arabidopsis.

Line 420. Selective pressure against?

Line 425. Yes, but aren’t these wild rices at least partly outcrossing?

Line 520. How could you get entropy values for this if you only looked at one tissue?
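The concern behind the Line 223 and Line 520 comments can be seen from how an expression entropy is typically computed. The sketch below uses the common Shannon-entropy definition over a gene's expression profile across tissues or conditions. That this matches the manuscript's definition is an assumption, since the formula is not given in this section.

```python
import math
from typing import Sequence


def expression_entropy(values: Sequence[float]) -> float:
    """Shannon entropy (bits) of one gene's expression across samples.

    Each value is converted to a proportion of the gene's total expression.
    Low entropy means expression concentrated in few samples; the maximum,
    log2(number of samples), means uniform expression.
    """
    total = sum(values)
    if total <= 0:
        raise ValueError("gene has no expression in any sample")
    entropy = 0.0
    for v in values:
        if v > 0:
            p = v / total
            entropy -= p * math.log2(p)
    return entropy


print(expression_entropy([5.0, 5.0, 5.0, 5.0]))  # uniform over 4 samples -> 2.0
print(expression_entropy([8.0, 0.0, 0.0, 0.0]))  # confined to one sample -> 0.0
print(expression_entropy([10.0]))                # a single sample -> 0.0
```

With a single tissue the profile has one entry and the entropy is zero for every gene, so the measure carries no information. Under this definition, entropy values for a treatment series therefore require several conditions or tissues per gene.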

**********

Have all data underlying the figures and results presented in the manuscript been provided?

Large-scale datasets should be made available via a public repository as described in the PLOS Genetics data availability policy, and numerical data that underlies graphs or summary statistics should be provided in spreadsheet form as supporting information.

Reviewer #1: No: The TE annotation, central to the manuscript, should be deposited as pointed out by reviewer 2. "Upon request" is not sufficient, as outlined by the PLOS Genetics guidelines.

Reviewer #2: No: They need to provide the TE data as a supplemental, not "on request".

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Quentin Gouil

Reviewer #2: No

Decision Letter 2

Wendy A Bickmore, Ortrun Mittelsten Scheid

28 Jan 2020

Dear Dr Saze,

We were glad to read that the reviewers' comments were helpful to improve the manuscript. Thank you for the careful revision and considering the input. We are pleased to inform you that your manuscript entitled "Transcriptional regulation of genes bearing intronic heterochromatin in the rice genome" has been editorially accepted for publication in PLOS Genetics. Congratulations!

Before your submission can be formally accepted and sent to production you will need to complete our formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Please note: the accept date on your published article will reflect the date of this provisional accept, but your manuscript will not be scheduled for publication until the required changes have been made.

Once your paper is formally accepted, an uncorrected proof of your manuscript will be published online ahead of the final version, unless you’ve already opted out via the online submission form. If, for any reason, you do not want an earlier version of your manuscript published online or are unsure if you have already indicated as such, please let the journal staff know immediately at plosgenetics@plos.org.

In the meantime, please log into Editorial Manager at https://www.editorialmanager.com/pgenetics/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production and billing process. Note that PLOS requires an ORCID iD for all corresponding authors. Therefore, please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field.  This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager.

If you have a press-related query, or would like to know about one way to make your underlying data available (as you will be aware, this is required for publication), please see the end of this email. If your institution or institutions have a press office, please notify them about your upcoming article at this point, to enable them to help maximise its impact. Inform journal staff as soon as possible if you are preparing a press release for your article and need a publication date.

Thank you again for supporting open-access publishing; we are looking forward to publishing your work in PLOS Genetics!

Yours sincerely,

Ortrun Mittelsten Scheid

Associate Editor

PLOS Genetics

Wendy Bickmore

Section Editor: Epigenetics

PLOS Genetics

www.plosgenetics.org

Twitter: @PLOSGenetics

----------------------------------------------------

Comments from the reviewers (if applicable):

----------------------------------------------------

Data Deposition

If you have submitted a Research Article or Front Matter that has associated data that are not suitable for deposition in a subject-specific public repository (such as GenBank or ArrayExpress), one way to make that data available is to deposit it in the Dryad Digital Repository. As you may recall, we ask all authors to agree to make data available; this is one way to achieve that. A full list of recommended repositories can be found on our website.

The following link will take you to the Dryad record for your article, so you won't have to re‐enter its bibliographic information, and can upload your files directly: 

http://datadryad.org/submit?journalID=pgenetics&manu=PGENETICS-D-19-00869R2

More information about depositing data in Dryad is available at http://www.datadryad.org/depositing. If you experience any difficulties in submitting your data, please contact help@datadryad.org for support.

Additionally, please be aware that our data availability policy requires that all numerical data underlying display items are included with the submission, and you will need to provide this before we can formally accept your manuscript, if not already present.

----------------------------------------------------

Press Queries

If you or your institution will be preparing press materials for this manuscript, or if you need to know your paper's publication date for media purposes, please inform the journal staff as soon as possible so that your submission can be scheduled accordingly. Your manuscript will remain under a strict press embargo until the publication date and time. This means an early version of your manuscript will not be published ahead of your final version. PLOS Genetics may also choose to issue a press release for your article. If there's anything the journal should know or you'd like more information, please get in touch via plosgenetics@plos.org.

Acceptance letter

Wendy A Bickmore, Ortrun Mittelsten Scheid

11 Mar 2020

PGENETICS-D-19-00869R2

Transcriptional regulation of genes bearing intronic heterochromatin in the rice genome

Dear Dr Saze,

We are pleased to inform you that your manuscript entitled "Transcriptional regulation of genes bearing intronic heterochromatin in the rice genome" has been formally accepted for publication in PLOS Genetics! Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out or your manuscript is a front-matter piece, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Genetics and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Kaitlin Butler

PLOS Genetics

On behalf of:

The PLOS Genetics Team

Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom

plosgenetics@plos.org | +44 (0) 1223-442823

plosgenetics.org | Twitter: @PLOSGenetics

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. Heterochromatic introns in Arabidopsis thaliana and rice genomes.

    (A) Arabidopsis thaliana genes (TAIR10) containing introns with heterochromatic domains. (B) Heatmap showing accumulation of H3K9 di-methylation on genome features in the rice genome. Data from [109] were used for the analysis.

    (PDF)

    S2 Fig. Length of introns in Arabidopsis thaliana and rice genomes.

    (A) A comparison of intron length between Arabidopsis thaliana (n = 127,836; average 169.0 bp) and Oryza sativa (n = 126,068; average 446.9 bp). (B) Fraction of repetitive elements in intronic regions of the rice genome. (C) Enrichment of heterochromatin and TEs in promoter-proximal introns. Fractions of all introns (n = 151,045), heterochromatic introns (n = 6,086), and TE-containing introns (n = 1,982) are shown at their relative positions. Identical intronic regions annotated at different positions in different splicing variants were counted independently.

    (PDF)

    S3 Fig. TE families in rice introns.

    (A) Fraction of TE families in the intronic regions of the Oryza sativa genome. (B) Orientation of intronic TE insertions relative to gene annotations in each TE family. No significant orientation bias was observed in the TE families (p > 0.01; two-sided binomial test). (C) Metaplots of DNA methylation in CG, CHG, and CHH contexts for heterochromatic introns with TEs and repeats (n = 4,886), heterochromatic introns without repeats (n = 923), and non-heterochromatic introns (n = 145,235). (D) Heatmap of methylation profiles of intronic TEs in wild-type O. sativa and mutants of OsMET1 (met1) and of OsDDM1 (ddm1) in CG, CHG, and CHH contexts.

    (PDF)

    S4 Fig. DNA methylation of rice intergenic and intronic TEs.

    Histograms of the number of representative intergenic and intronic TE families (>20 copies in each category) and their methylation levels (0 to 1) in CG, CHG, and CHH contexts. TEs with methylation data at ≥ 5 Cs were analyzed.

    (PDF)

    S5 Fig. Length and DNA methylation of intronic TEs.

    Boxplots showing the length of representative intergenic and intronic TE families (>10 copies in each category) and their methylation levels in CG (high, mCG ≥ 0.9; low, mCG < 0.9), CHG (high, mCHG ≥ 0.2; low, mCHG < 0.2), and CHH (high, mCHH ≥ 0.1; low, mCHH < 0.1) contexts. * p < 0.05, ** p < 0.01, *** p < 0.001, Wilcoxon exact test. N.S.: not significant, p ≥ 0.05. TEs with methylation data at ≥ 5 Cs were analyzed.

    (PDF)

    S6 Fig. DNA methylation of MITEs in rice introns.

    (A) Histograms of the number of representative intergenic and intronic MITEs (data retrieved from the P-MITE database [61]) and their methylation levels (0 to 1) in CG, CHG, and CHH contexts. TEs with methylation data at ≥ 5 Cs were used in the analysis. (B) Density plots showing length (log10) and methylation levels (0 to 1) of intergenic and intronic MITEs in CG, CHG, and CHH contexts.

    (PDF)

    S7 Fig. Protein classes and expression changes of genes containing heterochromatic introns.

    (A) Protein classes defined by the Panther database [119]. The 1,407 of 4,227 genes containing heterochromatic introns that matched the database are indicated. (B) Gene Ontology depletion for genes containing heterochromatic introns. P-values were obtained by Fisher test, and terms with FDR < 0.05 are indicated. (C) Expression changes of all genes and genes with or without heterochromatic introns under various stress treatments. Specificity of the responses to given treatments was measured as entropy values. P-values from Wilcoxon exact test are indicated. Effect size (r) in each analysis: low phosphate, 0.024; high phosphate, 0.020; drought, 0.007; osmotic stress, 0.009.

    (PDF)

    S8 Fig. JA response of genes in Nipponbare (NB) and KASALATH (KAS) with structural variations in heterochromatic introns.

    (A) Heatmap showing expression levels of the indicated genes after Jasmonic Acid (JA) treatment in the Nipponbare root. Expression data were obtained from TENOR [63]. (B) Quantitative RT-PCR (qRT-PCR) analysis of genes before (pre-treatment) and after JA treatment (JA treatment). OsAOS2 was included as a control for JA-dependent induction of expression. Relative expression levels in each sample were normalized to UBQ1 expression levels, with the average of expression values in pre-treatment NB samples set as 1, and are plotted as dots (n = 6) in blue (NB) and yellow (KAS). The large dots and bars represent means of 6 biological replicates ± standard deviation (S.D.). P-values were obtained by t-test.

    (PDF)

    S9 Fig. Structural variations of heterochromatic introns in Nipponbare and KASALATH strains.

    Insertion/deletion polymorphisms in Nipponbare and KASALATH. Tracks, top to bottom: mCG ratio (0 to 1), mCHG ratio (0 to 1), mCHH ratio (0 to 1), genome-resequencing data coverage (0 to 30) [65], repeats (orange), TE annotation (blue), gene model (purple). Structural variations detected by PCR are indicated under the tracks as gel pictures. Red arrows indicate the primer positions used for PCR amplifications shown in the gel panel. The region used for qRT-PCR is indicated as a red bar.

    (PDF)

    S10 Fig. Amino acid alignment of homologs of OsIBM2.

    Amino acid alignment of homologs of OsIBM2 in plants based on [34]. Bromo-Adjacent Homology (BAH) domain and RNA-Recognition Motif (RRM) are framed with a blue line. Arrows indicate regions designed for guide RNAs used for CRISPR-Cas9 mediated deletion. At, Arabidopsis thaliana; Zm, Zea mays; Os, Oryza sativa; Sb, Sorghum bicolor; Pt, Populus trichocarpa; Rc, Ricinus communis.

    (PDF)

    S11 Fig. Developmental phenotypes of osibm2 mutants.

    (A) Whole plant picture of three-month-old Nipponbare (left), RNAi_#2 line (middle) and RNAi_GFP control line (right). (B) Close-up pictures of seeds set in Nipponbare and RNAi lines (T1). (C) A close-up picture of seeds set in osibm2_g2#24 and their segregating wild-type siblings (WT; T4). White bar: 1 cm. (D) RT-PCR analysis of gene expression in endosperm and embryo of Nipponbare and osibm2. RNAs from ~10 DAF (Days After Fertilization) developing endosperm and embryo of osibm2_g2#24 (T2) were used for the analysis.

    (PDF)

    S12 Fig. Expression changes in genes containing heterochromatic introns in osibm2.

    (A) (Top) DNA methylation levels of differentially expressed genes (DEGs) with heterochromatic introns (n = 93), DEGs without heterochromatic introns (n = 361), and non-DEGs (n = 20,293) in Nipponbare background. (Middle) DNA methylation difference between osibm2 (osibm2_g2#24) and wild type at loci as above. (Bottom) H3K9 methylation levels at loci as above. (B) 5′/3′ ratio of transcripts mapped upstream and downstream of introns relative to wild type. RNA-seq data from osibm2_g2#24 and WT (wild-type segregants of osibm2) were used. In each locus, the 5′/3′ ratio of a representative transcript variant with TPM >1 was used for calculation. Bars represent the means of DEGs with heterochromatic introns (n = 68), DEGs without heterochromatic introns (n = 335), and 300 randomly selected non-DEG loci ± S.E.M. P-values were obtained by Tukey-Kramer test.

    (PDF)

    S13 Fig. Expression changes of genes in osibm2.

    (A) Representative rice genome loci showing altered expression patterns in mutants of OsIBM2. Tracks, top to bottom: RNA-seq (Reads per Million are indicated at top left), mCG ratio (0 to 1), mCHG ratio (0 to 1), mCHH ratio (0 to 1), H3K9me2 (RPM; 0 to 1), TE annotation (blue), repeats (orange), gene model (purple). The black arrow indicates the orientation of coding sequence. (B) Quantitative RT-PCR (qRT-PCR) analysis of expression of genes containing heterochromatic introns in osibm2_g2#24 (osibm2) and WT (wild-type segregants of osibm2). Primer positions are indicated in Fig 5F and S13A Fig as red bars. Expression levels in each sample were normalized by UBQ1 expression levels, and the average of OsIBM2/UBQ1 in WT was set as 1. Bars represent the means of three biological replicates ± S.D. (n = 3).

    (PDF)

    S14 Fig. 3′ Rapid Amplification of cDNA Ends (RACE) of genes containing heterochromatin in mutants of OsIBM2.

    (A) 3′ RACE of Os01g0650200. Upper panel: Structure of Os01g0650200 locus and polyadenylated mRNA variants detected by 3′ RACE. Exons and spliced introns confirmed by sequencing analysis are shown as black/red boxes and lines, respectively. Primer positions used for 3′ RACE are indicated by arrowheads. Lower panel: Gel picture of DNA fragments amplified by 3′ RACE. Two biological replicates for each genotype were examined. DNA fragments indicated by arrowheads were cloned, at least 8 clones were sequenced, and the representative sequences supported by more than 3 clones are shown in the upper panel. The black arrow indicates the orientation of coding sequence. NB: Nipponbare; osibm2: osibm2_g2#24; WT: wild-type segregants of osibm2; (A)n: polyadenylation. (B) 3′ RACE of Os06g0360600 as in (A). (C) 3′ RACE of Os08g0567200 as in (A). (D) The number of TEs showing expression changes in osibm2_g2#24 (osibm2). 22 LTR TEs and 1 DNA/En-Spm TE showed significant changes (q < 0.05), comprising both up-regulation (12 TEs) and down-regulation (11 TEs). (E) Rice genome loci showing altered expression patterns of intronic TEs in mutants of OsIBM2. Tracks, top to bottom: RNA-seq (Reads per Million are indicated at top left), mCG ratio (0 to 1), mCHG ratio (0 to 1), mCHH ratio (0 to 1), H3K9me2 (RPM; 0 to 1), TE annotation (blue), repeats (orange), gene model (purple). The black arrow indicates the orientation of coding sequence.

    (PDF)

    S15 Fig. DNA methylation in osibm2.

    (A) Genome-wide DNA methylation in osibm2_g2#24 (osibm2, T4) and their wild-type segregating siblings (WT, T4) in CG, CHG, and CHH contexts for each chromosome. Average methylation levels in 1 Mb bins were plotted. (B) Metaplots of DNA methylation in osibm2_g2#24 (osibm2) and their wild-type segregating siblings (WT) in CG, CHG, and CHH contexts for indicated genome features.

    (PDF)

    S16 Fig. Rice homologs of the Arabidopsis H3K9 demethylase IBM1.

    Genome loci for OsJMJ718 (Os09g0393200) (A) and OsJMJ719 (Os02g0109400, Os02g0109501) (B). RNA-seq, DNA methylation and H3K9me2 tracks are shown as in S13 Fig. (C) An alignment of amino acid sequences of A. thaliana IBM1 (At_IBM1) and OsJMJ718. The amino acid sequence of the N-terminal part of OsJMJ718 is predicted based on RNA-seq reads in this study. The alignment was generated by CLUSTAL W [131]. Jumonji-C (JmjC) domains predicted by SMART [132] are circled with blue lines. Positions of heterochromatic introns are indicated by red arrowheads.

    (PDF)

    S1 Table. Genes containing heterochromatic introns.

    (XLSX)

    S2 Table. Chromosomal positions of heterochromatic introns.

    (XLSX)

    S3 Table. Transposon annotation used in this study.

    (XLSX)

    S4 Table. Genes showing expression changes in osibm2 mutants.

    (XLSX)

    S5 Table. Primers used in the study.

    (XLSX)

    S6 Table. A summary table for Whole Genome Bisulfite Sequencing (WGBS) analysis.

    (XLSX)

    S7 Table. A summary table for RNA-seq analysis.

    (XLSX)

    S1 Data. Numerical data used to generate Figures.

    (XLSX)

    Attachment

    Submitted filename: Comments to the Authors191213.docx

    Attachment

    Submitted filename: Responses to reviewer.docx

    Data Availability Statement

    All the sequence data reported in this study have been deposited in the DDBJ Sequence Read Archive under accession ID DRA008322. All other data are within the manuscript and its Supporting Information files.


    Articles from PLoS Genetics are provided here courtesy of PLOS
