Proceedings of the National Academy of Sciences of the United States of America. 2020 Dec 21;117(52):33700–33710. doi: 10.1073/pnas.2011361117

DNA methylation mutants in Physcomitrella patens elucidate individual roles of CG and non-CG methylation in genome regulation

Katherine Domb a,1, Aviva Katz a,1, Keith D Harris a, Rafael Yaari a, Efrat Kaisler a, Vu H Nguyen a, Uyen V T Hong a, Ofir Griess a, Karina G Heskiau a, Nir Ohad a,2, Assaf Zemach a,2
PMCID: PMC7777129  PMID: 33376225

Significance

While DNA methylation in plants occurs at CG, CHG, and CHH sequence contexts, their comparative roles in genome regulation remain elusive. Here, we utilized Physcomitrella patens as a plant model suitable to study this question. By profiling transcriptomes in a series of context-specific methylation mutants, we found redundancy between asymmetric (CHH) methylation and symmetric methylation in silencing transposons, and that CHH methylation regulates the expression of CG/CHG-depleted transposons. Specific elimination of CG methylation had a significantly smaller effect on transcription than elimination of non-CG methylation. Additionally, we found CHG methylation as a stronger silencer than CG methylation. These results disentangle the transcriptional roles of the individual methylation contexts and elucidate the crucial role of non-CG methylation in genome regulation.

Keywords: CG methylation, non-CG methylation, CHG methylation, CHH methylation, transposons

Abstract

Cytosine (DNA) methylation in plants regulates the expression of genes and transposons. While methylation in plant genomes occurs at CG, CHG, and CHH sequence contexts, the comparative roles of the individual methylation contexts remain elusive. Here, we present Physcomitrella patens as the second plant system, besides Arabidopsis thaliana, with viable mutants with an essentially complete loss of methylation in the CG and non-CG contexts. In contrast to A. thaliana, P. patens has more robust CHH methylation, similar CG and CHG methylation levels, and minimal cross-talk between CG and non-CG methylation, making it possible to study context-specific effects independently. Our data found CHH methylation to act in redundancy with symmetric methylation in silencing transposons and to regulate the expression of CG/CHG-depleted transposons. Specific elimination of CG methylation did not dysregulate transposons or genes. In contrast, exclusive removal of non-CG methylation massively up-regulated transposons and genes. In addition, comparing two exclusively but equally CG- or CHG-methylated genomes, we show that CHG methylation acts as a greater transcriptional regulator than CG methylation. These results disentangle the transcriptional roles of CG and non-CG, as well as symmetric and asymmetric methylation in a plant genome, and point to the crucial role of non-CG methylation in genome regulation.


Cytosine (DNA) methylation is a prominent DNA modification in many eukaryotes (1–6). In plants, DNA methylation regulates the expression of genes and transposons (7–10). In land plants, transposable elements (TEs) are preferentially or exclusively methylated (1, 6) and natural or artificial TE hypomethylation is associated with transcriptional and/or transpositional activities (11–19). The effect of DNA methylation on genes involves silencing as well as activation (20–24). Transcription start site (TSS) regions are generally depleted of methylation, and when methylated, the level of methylation is negatively associated with gene expression (1, 6). Intragenic methylation was found to regulate alternative or antisense transcription (25–27). Gene expression can be affected by the methylation of adjacent TEs (10, 20, 24); thus, the role of methylation in gene regulation can be associated with TE content and dispersion in the genome.

DNA methylation in plants occurs at three sequence contexts, CG, CHG, and CHH (H = A, C, or T). CG sites are symmetrically methylated (cytosines on both CG strands) by MET1, a plant DNMT1 homolog (3, 6, 9). Non-CG sites are methylated by either chromomethylases (CMTs), domains rearranged methyltransferases (DRMs), or by plant DNMT3s (14, 15, 28–33). CMTs are close homologs of MET1, with a chromodomain inside their methyltransferase domain (34). DRM’s methyltransferase domain is rearranged and homologous to that of plant and animal DNMT3 (35). In the early terrestrial plant, Physcomitrella patens, CHG and CHH sites are methylated mainly by CMT and DNMT3, respectively (29, 30). In flowering plants, DNMT3s became extinct, and their non-CG methylation is mediated by CMTs and DRMs (14, 15, 30). The chromatin targeting mechanism of plant DNMT3 orthologs is still unknown; however, similarly to CMTs, moss DNMT3 preferentially methylates heterochromatic TEs enriched by the histone mark H3K9me2 (30, 36). In contrast, DRMs are generally targeted to euchromatic TEs via the RNA-directed DNA methylation pathway (RdDM) (14, 15, 30).
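The three sequence contexts defined above can be made concrete with a short sketch. It is an illustration only, not the authors' pipeline: the function names are hypothetical, and the input is assumed to be a plain DNA string in which only A, C, G, and T are informative. Each cytosine is classified on both strands by the two bases that follow it.

```python
# Illustrative sketch: classify cytosines as CG, CHG, or CHH (H = A, C, or T).
# Function names are hypothetical; input is assumed to be a plain DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement; characters other than ACGT are kept as-is."""
    return seq.translate(COMPLEMENT)[::-1]


def classify_strand(seq):
    """Return (position, context) for every classifiable C on one strand (5'->3')."""
    calls = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        downstream = seq[i + 1:i + 3]  # up to two bases 3' of the cytosine
        if not downstream or not set(downstream) <= set("ACGT"):
            continue  # no downstream base, or an ambiguous base such as N
        if downstream[0] == "G":
            calls.append((i, "CG"))
        elif len(downstream) == 2 and downstream[1] == "G":
            calls.append((i, "CHG"))
        elif len(downstream) == 2:
            calls.append((i, "CHH"))
        # a C followed by a single non-G base at the 3' end cannot be classified
    return calls


def count_contexts(seq):
    """Count CG, CHG, and CHH cytosines on both strands of a sequence."""
    seq = seq.upper()
    counts = {"CG": 0, "CHG": 0, "CHH": 0}
    for strand in (seq, reverse_complement(seq)):
        for _, context in classify_strand(strand):
            counts[context] += 1
    return counts
```

Because CG and CHG are palindromic in the relevant positions, every CG or CHG site on one strand has a partner cytosine on the other strand, which is why these contexts are called symmetric, whereas CHH sites have no such partner.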

The preferential targeting of all three methylation contexts to transposons, and the activation of the latter in plants hypomethylated in either of the methylation contexts, suggests that all three methylation contexts are important for TE silencing. However, despite the extensive understanding of these contexts in terms of methylation mechanisms, the relative contribution of each of the methylation contexts to TE regulation, especially at the whole-genome level, is still unknown. To investigate the relative roles of CG and non-CG methylation in genome regulation requires a plant system that: 1) has similar levels and patterns of CG and non-CG methylation; 2) has a limited cross-talk between the different methylation contexts; and 3) would tolerate strong hypomethylation phenotypes.

To date, methylation mutants of the handful of investigated plant species were either lethal, hypomorphic, or not context specific (i.e., depleted in multiple methylation contexts) (17, 18, 28, 37–43). This includes Arabidopsis thaliana, for which viable CG and non-CG null methylation mutants exist. However, its confounding contextual methylation phenotypes make it impossible to elucidate context-specific effects using these mutants.

Here, we present the moss P. patens as a suitable model to investigate regulatory roles of individual methylation contexts. Firstly, CG and CHG are similarly methylated in P. patens (1, 30, 44), both in terms of methylation level and symmetry. Secondly, CG, CHG, and CHH methylation can be separately eliminated in P. patens with minimal cross-talk between CG and non-CG methylation (30). Thirdly, mutated plants eliminated in either single or multiple methylation contexts are viable in P. patens (29, 30, 45). In addition to our recently characterized P. patens mutants eliminated in either CG, CHG, or CHH (30), we generated two mutants, a null CHG/CHH methylation mutant, as well as a null CG/CHH methylation mutant, a first of its kind in plants. Fundamentally, as CG and CHG are similarly methylated in P. patens and this pattern remained unchanged in the mutants, we achieved two equally and exclusively CG and CHG methylated epigenomes. By profiling the transcriptomes of the single- and double-context P. patens methylation mutants, our results allow us to disentangle the transcriptional roles of CG and non-CG, symmetric and asymmetric, as well as symmetric-CG and symmetric-CHG methylation at a whole-genome level. We discovered that in P. patens, transcription regulation by CG methylation can be offset by non-CG methylation but not vice versa; that asymmetric methylation acts in redundancy to symmetric methylation; and that CHG methylation is a stronger transcriptional silencer than CG methylation. These results illustrate the crucial role of non-CG methylation in the regulation of a plant genome.

Results

A. thaliana Is Not a Suitable Model to Elucidate the Individual Roles of CG and Non-CG Methylation.

To compare the role of CG and non-CG methylation in A. thaliana, we profiled TE methylation and expression in the methylome and transcriptome of three met1 and two ddcc mutant datasets, either previously published (14, 15, 46, 47) or produced by us (SI Appendix, Tables S1 and S2). Despite some differences in genetic and experimental conditions (different alleles, tissues, and laboratories), the number and identity of up-regulated TEs were significantly similar (hypergeometric P value <10⁻¹⁰) among the two sets of ddcc samples as well as between the three sets of met1 (SI Appendix, Fig. S1A), suggesting consistency of TE regulation in these methylation mutants. Our analyses show that the CG hypomethylated mutants (met1) lead to a broader and stronger up-regulation of TE expression than a null non-CG methylation (ddcc) mutant (Fig. 1 A, B, and G and SI Appendix, Fig. S1A). While these results imply that MET1 (the CG methylase) is more crucial to silencing TEs than the non-CG methylases (DRM1, DRM2, CMT2, and CMT3 altogether), they do not necessarily imply that CG methylation is more efficient than non-CG methylation in silencing TEs. First, the level of CG methylation in A. thaliana is substantially higher than non-CG methylation (Fig. 1C), including within met1 up-regulated TEs (SI Appendix, Fig. S1B). Hence, the relatively greater TE activation in A. thaliana met1 plants in comparison to ddcc could be related to the robustness of CG methylation versus the limited non-CG methylation signal. Second, CG hypomethylation in met1 up-regulated TEs is associated with significant reduction in non-CG methylation (Fig. 1D), which is localized preferentially to TE edges (Fig. 1E and SI Appendix, Fig. S1C) or spread out throughout the entire element (Fig. 1 F and G). Thus, the activation of TEs in met1 cannot be explained solely by CG hypomethylation.
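The hypergeometric P value used above to compare sets of up-regulated TEs can be sketched as follows. This is a generic formulation, not necessarily the exact procedure used in the study: the function name and argument names are hypothetical, and the test asks how likely an overlap at least as large as the observed one would be if the second set were drawn at random from all annotated TEs.

```python
from math import comb


def hypergeom_overlap_pvalue(population, set_a, set_b, overlap):
    """P(X >= overlap) for the overlap between two sets drawn from one population.

    population: total number of TEs considered (assumed common background)
    set_a, set_b: sizes of the two up-regulated TE sets
    overlap: number of TEs found in both sets
    """
    max_overlap = min(set_a, set_b)
    total = comb(population, set_b)
    tail = sum(
        comb(set_a, k) * comb(population - set_a, set_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Toy numbers only (not from the paper): 10 TEs, sets of 4 and 3, overlap of 2.
p_value = hypergeom_overlap_pvalue(10, 4, 3, 2)
```

A small P value indicates that two mutant datasets share more up-regulated TEs than expected by chance, which is the sense in which the replicate datasets are described as significantly similar.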

Fig. 1.


TE methylation and expression in A. thaliana met1 and ddcc mutants. (A) Venn diagram of up-regulated TEs in met1 (14) and ddcc (15) mutants. (B) RNA log fold change (logFC) of TEs up-regulated in both met1 and ddcc mutants. (C) Box plots of WT DNA methylation in methylated TEs separated to CG, CHG, and CHH contexts in A. thaliana TEs. ns: not significant. (D) Box plots of percent methylation change between averaged methylation (CG, CHG, or CHH) in WT versus mutant calculated for 50-bp windows within TEs commonly and exclusively up-regulated in met1 and ddcc mutants. Asterisks indicate significant difference (Welch’s two-sample t test, P value <0.05). (E) Patterns of percent methylation change in TEs up-regulated in met1 mutant. met1 up-regulated TEs were aligned at the 5′ or 3′ ends, and CHG or CHH methylation change between met1 and WT was averaged within 50-bp intervals. The dashed lines represent points of alignment. (F–H) Genomic snapshots of a met1-specific up-regulated TE (F), a TE up-regulated in met1 and ddcc (G), and a ddcc-specific up-regulated TE (H). CG, CHG, and CHH methylation levels in 10-bp window resolution are shown in blue, green, and red, respectively. Gray bars represent the presence of covered sites in each methylation context in the given 10-bp window. Expression is normalized reads coverage (RPM) in 50-bp windows. (I) Patterns of percent CG methylation change between ddcc and WT along TEs commonly or exclusively up-regulated in met1 and ddcc.

The presence of ddcc-specific up-regulated TEs could have suggested that, at some TEs, non-CG methylation and CG methylation are crucial and trivial for silencing, respectively (Fig. 1H). However, analysis of CG methylation in ddcc across the sequence of different activated TE subgroups revealed it to be particularly depleted at the edges (primarily 5-prime ones) of only ddcc-activated TEs and not in those specifically activated in met1 (Fig. 1 H and I). Thus, we cannot exclude the possible contribution of CG hypomethylation to the activation of TEs in the null non-CG methylation mutant, ddcc.

To summarize, while Arabidopsis provides viable mutants with near-complete hypomethylation in either CG or non-CG contexts, the confounding effects of their methylation in wild-type (WT) and mutant plants challenge our ability to discover their relative transcriptional roles.

Total and Specific Elimination of CG Methylation in P. patens Does Not Activate TE Expression.

To investigate the relative functional roles of CG and non-CG methylation, we next profiled transcriptomes in P. patens plants mutated in either MET, CMT, or in DNMT3b (termed dnmt3), which are eliminated in CG, CHG, and CHH methylation, respectively, with minimal influence between CG and non-CG methylation (29, 30, 45) (Fig. 2A). We also profiled transcription in a plant mutated in the two P. patens DRM genes (named drm), which has a trivial methylation phenotype (30) (Fig. 2A). First, we found a small subset (120) of transcribed hypomethylated TEs in wild type (SI Appendix, Fig. S2), substantiating the role of methylation in TE regulation in P. patens (44). Consistent with their substantial hypomethylation phenotypes, cmt, dnmt3, and met mutants caused a considerable activation of TEs (130 to 760 TEs), whereas drm, in line with its minute hypomethylation effect, activated only 4 TEs (Fig. 2 A and B and Dataset S1). Up-regulated TEs in cmt, dnmt3, and met included different proportions of TE subtypes with partial overlap, where met had the lowest activation effect (130 TEs), dnmt3 showed a slightly higher effect (183 TEs), and cmt had the greatest TE activation response (760 TEs) (Fig. 2B and SI Appendix, Fig. S3A). Overall, these results suggest that DNA methylation restricts TE activity in P. patens, where each DNMT controls a distinct subset of TEs, and that CMT has the strongest effect among all PpDNMTs.

Fig. 2.


TE up-regulation in CG, CHG, or CHH null-methylation mutants in P. patens. (A) Box plots of CG, CHG, and CHH methylation in methylated TEs quantified in 50-bp windows in P. patens WT and indicated mutants. drm is mutated in DRM1 and DRM2. dnmt3 is mutated in DNMT3b. (B) Venn diagram of up-regulated annotated TEs (intact or fragmented) in met, cmt, dnmt3, and drm mutants. (C, E, G, and I) Box plots of percent methylation change (Top) and CG, CHG, and CHH sites frequency (Bottom) in 50-bp windows within up-regulated or unaffected TEs in each mutant. Unaffected TEs were selected by first identifying 4,187 TEs that are capable of being activated, i.e., that were transcriptionally up-regulated in any of the mutants investigated in this study (including in met/dnmt3 and cmt/dnmt3 mutants described in Fig. 6), then subtracting the TEs specifically up-regulated in the indicated mutant. Asterisks indicate significant difference (Welch’s two-sample t test, P value <0.05). ns: not significant. (D, F, H, and J) Genome browser snapshots of TEs up-regulated in drm (D), cmt (F), met (H), and dnmt3 (J) mutants.

To better understand the role of CG and non-CG methylation in TE regulation, we next analyzed the CG, CHG, and CHH frequency and methylation level within up-regulated TEs in each of the mutants. Firstly, we found that drm up-regulated TEs were already lowly methylated and expressed in wild type (SI Appendix, Fig. S3B). Hypomethylation and expression were enhanced in drm (Fig. 2 C and D). This finding is consistent with the preference of PpDRMs to methylate actively transcribed TEs (30). Secondly, we found that in cmt a total CHG hypomethylation (98.4%) is associated with substantial CHH hypomethylation (57%) within cmt up-regulated TEs (Fig. 2 E and F), suggesting that TE activation in cmt cannot be attributed solely to the loss in CHG methylation. Thirdly, we discovered that in comparison to the general CG-specific hypomethylation effect observed in met (Fig. 2A), within its 130 up-regulated TEs, methylation is reduced also at non-CG sites by about 40% (Fig. 2 G and H). Moreover, in TEs up-regulated specifically in met mutant but not in the other mutants, non-CG hypomethylation in met reaches over 85% for CHG and over 70% for CHH (SI Appendix, Fig. S3C). Thus, except for a small number of TEs that were hypomethylated in all three methylation contexts, a total and specific elimination of CG methylation in met did not affect TE expression, suggesting functional redundancy with non-CG methylation.
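The percent hypomethylation values quoted above are computed per 50-bp window within TEs. The sketch below shows one plausible way to derive such values; it is an assumption, since the exact formula is not spelled out here. It takes per-cytosine methylation levels (0 to 1) for one context, averages them in fixed windows, and reports the loss in the mutant relative to WT. All names are hypothetical.

```python
from collections import defaultdict


def window_means(sites, window=50):
    """Average per-site methylation levels (0..1) within fixed-size windows.

    sites: iterable of (position, methylation_level) for one sequence context.
    Returns {window_index: mean_level}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for position, level in sites:
        index = position // window
        sums[index] += level
        counts[index] += 1
    return {index: sums[index] / counts[index] for index in sums}


def percent_methylation_loss(wt_sites, mutant_sites, window=50):
    """Percent methylation lost in the mutant relative to WT, per window.

    Assumed definition: 100 * (WT - mutant) / WT. Windows with zero WT
    methylation, or without coverage in the mutant, are skipped.
    Returns {window_start: percent_loss}.
    """
    wt = window_means(wt_sites, window)
    mutant = window_means(mutant_sites, window)
    loss = {}
    for index, wt_level in wt.items():
        if wt_level > 0 and index in mutant:
            loss[index * window] = 100.0 * (wt_level - mutant[index]) / wt_level
    return loss
```

Under this definition, a value near 100% corresponds to an essentially complete loss of methylation in that context, while a value near 0% means the window is as methylated in the mutant as in WT.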

CHH Methylation Is Crucial for Silencing CG and CHG Depleted TEs.

In the dnmt3 mutant, methylation in up-regulated TEs was lost at CHH sites but remained close to normal in CG and CHG sites (Fig. 2 I and J). Interestingly, we discovered that dnmt3 up-regulated TEs are significantly depleted of CG and CHG sites (Fig. 2 I and J). Additionally, up-regulated TEs in dnmt3 were relatively enriched for intact TEs (43% out of its total up-regulated TEs), which were composed of mainly two types of long terminal repeat (LTR) elements, Gypsy-11 and Copia-2 (Fig. 3A and SI Appendix, Fig. S3A). Copia-2s were transcribed in dnmt3 throughout their entire sequences whereas Gypsy-11s were preferentially transcribed downstream to the 5′-LTR (Fig. 3B). Additionally, our data show that the TE up-regulation in dnmt3 is enriched among young elements that are significantly depleted of symmetric CG and CHG sites in comparison to unaffected TEs of similar age (Fig. 3 C and D).

Fig. 3.


CHH methylation silences CG- and CHG-depleted TEs. (A) Pie chart of TE families up-regulated in dnmt3 mutant. (B) RNA-seq reads distribution along dnmt3 up-regulated Copia-2 (Top) or Gypsy-11 (Bottom) in WT and dnmt3. (C) Bar graph of the number of dnmt3 up-regulated intact Copia-2 (Top) and Gypsy-11 (Bottom) over five ascending age quantiles. (D) Boxplots of frequency of symmetric sites (CG and CHG) in dnmt3 up-regulated Copia-2 (Top) or Gypsy-11 (Bottom) versus unaffected TEs of the same subtype and age group, separated to LTR and internal sequences. Asterisks indicate significant difference (Welch’s two-sample t test, P value <0.05). ns: not significant. (E) Boxplots of CG plus CHG sites frequency per kilobase of Copia-2 (Top) or Gypsy-11 (Bottom) over five ascending age quantiles. TE age was calculated based on divergence between 5′ and 3′ LTRs of intact TEs. (F) Boxplots of CHH sites frequency per kilobase of Copia-2 (Top) or Gypsy-11 (Bottom) over five ascending age quantiles. Note the relative smaller range in CHH frequency in comparison to that of CG + CHG. In Gypsy-11 CHH frequency drops only in the fourth quantile, in contrast to symmetric sites where it gradually drops in each of the quantiles. (G) Proportion of CG, CHG, and CHH sites in A. thaliana TE reference genome and among mutagenized sites in 107 mutational accumulation (MA) lines following 25 generations [Weng et al. (50)] or in 5 MA lines following 30 generations [Ossowski et al. (49)] of inbreeding (Fisher exact test P value <0.05). N = the number of cytosines included in each of the datasets.

Methylated cytosines can be mutated via deamination (48). By quantifying the frequency of symmetric and asymmetric sites over TE age, we found CG and CHG sites to have a wider range than CHH sites, i.e., ∼60% and ∼8% change in frequency for symmetric and asymmetric sites, respectively (Fig. 3 E and F), suggesting a faster mutability rate for symmetric versus asymmetric sites. To test this hypothesis, we next checked the frequency of CG, CHG, and CHH mutations in the A. thaliana genome following 25 or 30 generations of inbreeding (49, 50). In these mutagenized datasets, we found symmetric (CG and CHG) and asymmetric CHH sites to be significantly enriched and depleted, respectively, in comparison to their normal frequency in TEs (Fig. 3G), representing a 2- to 13 times higher mutation rate for cytosines in symmetric versus CHH sites. These results suggest that lowly methylated CHH sites in plants are mutated at a lower rate than highly methylated CG and CHG sites and that in P. patens CHH methylation is crucial for the silencing of young CG/CHG-depleted TEs.
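The Fig. 3 caption states that TE age was calculated from the divergence between the 5′ and 3′ LTRs of intact TEs and that elements were grouped into five ascending age quantiles. A minimal sketch of that idea follows. It assumes the two LTRs are already aligned (equal length, gaps as "-") and uses the simple proportion of mismatches as the divergence measure; the actual metric and binning procedure used in the study may differ, and the function names are hypothetical.

```python
def ltr_divergence(ltr5, ltr3):
    """Proportion of mismatched positions between two pre-aligned LTRs.

    Gap positions ('-') in either sequence are ignored.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("aligned LTRs must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return mismatches / compared


def age_quantiles(divergences, n_bins=5):
    """Assign each TE to a bin 1..n_bins by ascending divergence (1 = youngest).

    divergences: {te_id: divergence}. Bins are rank based, so they hold
    roughly equal numbers of elements.
    """
    ranked = sorted(divergences, key=divergences.get)
    n = len(ranked)
    return {te: (rank * n_bins) // n + 1 for rank, te in enumerate(ranked)}
```

The rationale is that the two LTRs of an element are identical at insertion and diverge independently afterward, so lower divergence indicates a younger insertion.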

Generating Comparable CG- and CHG-Specifically Methylated Epigenomes.

Up-regulated TEs in wild type, or in the single mutants met, cmt, and dnmt3 were typically hypomethylated in multiple contexts, i.e., CG and/or CHG and CHH, or hypomethylated in CHH and depleted of CG and CHG sites (Fig. 2 C–J). These results suggest a functional redundancy between symmetrical and asymmetrical methylation in P. patens. To test this hypothesis and to explore the greater potential of DNA methylation in TE regulation in P. patens, we next aimed at generating mutants depleted of methylation in multiple contexts. We succeeded in generating viable cmt/dnmt3 and met/dnmt3 double mutants (Fig. 4A and SI Appendix, Fig. S4). However, following numerous attempts, we failed to generate a met/cmt double mutant. Neither knocking out MET on a cmt background nor mutating CMT on a met background yielded a met/cmt double mutant, suggesting that a complete loss of CG and CHG methylation is lethal. Methylome analysis of cmt/dnmt3 found it to be entirely depleted of CHG and CHH methylation (Fig. 4B). In comparison, met/dnmt3 methylome was entirely depleted of CG and CHH methylation (Fig. 4B). Consequently, met/dnmt3 and cmt/dnmt3 were CHG- and CG-specifically methylated, respectively. Notably, CG and CHG methylation profiles in genes, TEs, and along chromosomes, in cmt/dnmt3 and met/dnmt3, respectively, were similar and close to their normal levels in wild type (Fig. 4 B and C). Furthermore, both CG and CHG methylation remained symmetrical, i.e., similarly methylated on both strands (Fig. 4D). Finally, as CG and CHG frequencies are generally similar in P. patens TEs (Fig. 4E), total methylation in both mutants is comparable (Fig. 4F).

Fig. 4.


Generating comparable CG- and CHG-specifically methylated P. patens epigenomes. (A) Pictures of representative WT and mutant plants at the gametophytic stage. Statistical analysis of the phenotype can be found in SI Appendix, Fig. S4D. (B) Patterns of CG, CHG, and CHH methylation in WT and met/dnmt3 and cmt/dnmt3 mutants, along TEs (Top) and genes (Bottom). (C) Average CG methylation in cmt/dnmt3 and CHG methylation in met/dnmt3 along P. patens chromosome 1 averaged in 100-kb sliding window. (D) Density scatterplot correlating CG and CHG methylation within the same sites on each of the strands. (E) Patterns of CG, CHG, and CHH sites frequency in all P. patens TEs. (F) Patterns of total DNA methylation in P. patens TEs in WT and in met/dnmt3 and cmt/dnmt3 mutants.

To summarize, the close-to-normal and specific CG methylation in cmt/dnmt3 (Fig. 4 B and C) suggests a weak feedback of non-CG (CHG and CHH) methylation on CG methylation. Similarly, close-to-normal and specific CHG methylation in met/dnmt3 (Fig. 4 B and C) suggests a weak influence of CHH and CG methylation on CHG methylation. Finally, and most importantly, we succeeded in generating two distinct epigenomes consisting of either CG-specific or CHG-specific methylation, at comparable levels, frequencies, and chromosomal distributions (Fig. 4). These unique epigenomes can be utilized for studying the interplay between symmetrical (CG/CHG) and asymmetrical (CHH) methylation, as well as between the two symmetrical contexts, CG and CHG.

Redundancy between Symmetric (CG/CHG) and Asymmetric (CHH) Methylation in Silencing TEs.

To test for functional redundancy between symmetric and asymmetric methylation, we next profiled transcriptomes and analyzed TE regulation in the double mutants, met/dnmt3 and cmt/dnmt3 (Fig. 5). In correspondence with their strong hypomethylation profiles (∼80% reduction in total methylation), both double mutants had a substantially stronger activation of TEs in comparison to their single mutants (Fig. 5 A and B and Dataset S1). Most up-regulated TEs in the double mutants belong to Gypsy-11 (SI Appendix, Fig. S5A), a TE subtype that is relatively enriched for younger elements (44) (SI Appendix, Fig. S5B). In the met/dnmt3 mutant, 16 and 22 times more TEs were activated in comparison to single dnmt3 and met mutants, respectively (Fig. 5A). In the cmt/dnmt3 mutant, 21 and 5 times more TEs were activated in comparison to dnmt3 and cmt single mutants, respectively (Fig. 5B). Additionally, most up-regulated TEs in the single mutants (>90%) were found to be up-regulated in the relevant double mutants (Fig. 5 A and B). Lastly, all groups of activated TEs in the single mutants, i.e., which were activated in either or both single mutants, showed significantly enhanced activation in the relevant double mutant, which was associated with significantly greater hypomethylation (Fig. 5 C–F). Overall, these synergistic silencing effects indicate strong functional redundancy between symmetric (CG and CHG) methylation and asymmetric CHH methylation in controlling TE expression.

Fig. 5.


Synergistic transposon silencing by symmetric and asymmetric methylation. (A and B) Venn diagrams of all up-regulated TEs (intact or fragmented) in single (dnmt3 and met or cmt) and double (met/dnmt3 or cmt/dnmt3) mutants. (C and D) Box plots of percent of total methylation change between WT and indicated mutants (Top) and RNA log fold change (Bottom) of TEs commonly up-regulated in single and double mutants. Asterisks indicate significant difference (paired t test, P value <0.05). (E and F) Genome browser snapshots illustrating single examples of synergistically up-regulated TEs in the double mutants.

CHG Methylation Plays a Greater Role Than CG Methylation in Silencing TEs.

The presence of thousands of up-regulated TEs in cmt/dnmt3 and met/dnmt3 suggests that each of the symmetrical methylated contexts (CG and CHG) in isolation is not sufficient to silence a substantial number of TEs in the P. patens genome. At the same time, these double mutants provide us the unique opportunity to analyze the transcriptional alterations in comparable methylomes that are either completely depleted of CG and CHH methylation or of CHG and CHH methylation. This allows us to investigate the strength of the remaining comparable CG and CHG methylation on genome regulation.

Our differential expression analysis found more up-regulated TEs in cmt/dnmt3 (3,901) than in met/dnmt3 (2,866) (Fig. 6A). Additionally, among the 2,580 commonly up-regulated TEs, i.e., activated in both mutants, expression in cmt/dnmt3 was significantly higher (∼5×) than in met/dnmt3 (Fig. 6B and SI Appendix, Fig. S6A). To verify the role of methylation in TE activation, we profiled methylation levels and frequency in various up-regulated subgroups (Fig. 6 C–I).
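The TE counts reported in the text can be cross-checked with simple arithmetic. The short sketch below uses only numbers stated in this article (130, 183, and 760 up-regulated TEs in met, dnmt3, and cmt; 2,866 and 3,901 in met/dnmt3 and cmt/dnmt3; 2,580 shared between the double mutants). The variable names are illustrative.

```python
# Up-regulated TE counts as reported in the text.
single = {"met": 130, "dnmt3": 183, "cmt": 760}
double = {"met/dnmt3": 2866, "cmt/dnmt3": 3901}
shared_in_doubles = 2580

# TEs up-regulated in only one of the two double mutants.
specific = {name: total - shared_in_doubles for name, total in double.items()}

# Fold increase of each double mutant over its corresponding single mutants.
fold = {
    ("met/dnmt3", "dnmt3"): double["met/dnmt3"] / single["dnmt3"],
    ("met/dnmt3", "met"): double["met/dnmt3"] / single["met"],
    ("cmt/dnmt3", "dnmt3"): double["cmt/dnmt3"] / single["dnmt3"],
    ("cmt/dnmt3", "cmt"): double["cmt/dnmt3"] / single["cmt"],
}

for pair, value in fold.items():
    print(pair, round(value, 1))
print(specific)
```

The mutant-specific counts come out as 286 and 1,321, matching the subgroup sizes given for met/dnmt3- and cmt/dnmt3-specific TEs, and the fold values round to 16, 22, 21, and 5, matching the ratios quoted for the double versus single mutants.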

Fig. 6.


CHG methylation is superior to CG methylation in silencing TEs. (A) Venn diagram of all up-regulated TEs (intact or fragmented) in met/dnmt3 and cmt/dnmt3 mutants. (B) Box plots represent logarithmic fold change (LogFC) of expression between WT and either of the indicated mutants within the 2,580 TEs that are up-regulated in both met/dnmt3 and cmt/dnmt3 mutants. LogFC in cmt/dnmt3 is significantly higher than in met/dnmt3 (paired t test, P value <0.05). (C) Box plots of CG, CHG, and CHH sites frequency in 50-bp windows within TEs up-regulated exclusively in either met/dnmt3 mutant or cmt/dnmt3 mutant or in both double mutants. Asterisks indicate significant difference (paired t test, P value <0.05). ns: not significant. (D–F) Box plots of WT and mutant CG, CHG, and CHH methylation level in 50-bp windows within TEs up-regulated exclusively in either met/dnmt3 mutant (D) or in cmt/dnmt3 mutant (E), or that were commonly up-regulated in both mutants (F). Asterisks indicate significant difference (paired t test, P value <0.05) between total methylation level in the mutants. (G–I) Methylation and expression profiles of single LTR-TEs that are either exclusively up-regulated in met/dnmt3 (G), specifically up-regulated in cmt/dnmt3 (H), or commonly up-regulated in both mutants (I). CG, CHG, and CHH methylation are in blue, green, and red, respectively. Normalized RNA read levels are in gray. (J) Heatmaps of expression (log transformed), CG/CHG site frequency, and methylation levels averaged across 500-bp windows within indicated intact TE subtypes that were commonly up-regulated in cmt/dnmt3 and met/dnmt3. Maximum levels of the expression scale are for each of the TE subtypes. Metaplots show the average signal in the heatmaps above them.

First, we found that mutant-specific up-regulated TEs have a greater reduction in their total DNA methylation level in the relevant mutant (Fig. 6 D and E). For example, total DNA methylation in met/dnmt3-specifically up-regulated TEs (n = 286) is significantly lower in met/dnmt3 than in cmt/dnmt3 (Fig. 6D). Similarly, total DNA methylation in cmt/dnmt3-specifically up-regulated TEs (n = 1,321) is significantly lower in cmt/dnmt3 than in met/dnmt3 (Fig. 6E). These differences in total DNA methylation are associated with significant differences in CG and CHG content, i.e., met/dnmt3- and cmt/dnmt3-specifically up-regulated TEs were enriched and depleted for CG sites, respectively (Fig. 6 C and H). Additionally, within met/dnmt3-specifically up-regulated TEs, CHG sites were substantially hypomethylated in met/dnmt3 (Fig. 6 D and G). Together, these results suggest that differences in CG and CHG content and methylation relate to the up-regulation of specific TE subsets in each of the double mutants. In contrast to mutant-specific up-regulated TEs, within the commonly activated TEs (n = 2,580), CG and CHG frequencies and methylation levels are similar (Fig. 6 C and F), but associated with significantly stronger expression in cmt/dnmt3 than in met/dnmt3 (Fig. 6 B and I and SI Appendix, Fig. S6A).

We next checked for particular effects across intact and commonly up-regulated TEs separated into the four most enriched TE subtypes, i.e., Gypsy-11, Copia-2, Gypsy-4, and Gypsy-6 (SI Appendix, Fig. S5A). Heatmaps and metaplots show differences in CG and CHG frequency patterns between the different TE subtypes, while TE expression is consistently stronger in cmt/dnmt3 over met/dnmt3 (Fig. 6J and SI Appendix, Fig. S6B). In Gypsy-4, we found the stronger expression in cmt/dnmt3 to be associated with a localized CG hypomethylation around the LTR sequences, whereas CHG methylation was not reduced in those regions in met/dnmt3 (Fig. 6J). In contrast to Gypsy-4, in Gypsy-6, Gypsy-11, and Copia-2, CG and CHG methylation levels are similar in WT and were hardly affected in cmt/dnmt3 and met/dnmt3, respectively (Fig. 6J). The stronger TE effect in cmt/dnmt3 over met/dnmt3 has been confirmed in a second RNA sequencing (RNA-seq) experiment (SI Appendix, Fig. S6 C and D). These results suggest that CHG methylation plays a greater role than CG methylation in silencing transposons in P. patens.

Non-CG Methylation Is Crucial for Gene Regulation in P. patens.

While DNA methylation in P. patens is mostly targeted to TEs, transposon methylation could influence the expression of nearby genes (14, 20). In comparison to TEs, where internal methylation autosilences the elements, in genes, methylation commonly regulates expression (activation and suppression) via external genic elements (51–54). Additionally, genetic interaction networks are more robust in genes than in transposons. Accordingly, it is expected that mutants defective in DNA methylation will have complex differential gene expression profiles composed of up and down regulations which are directly and indirectly influenced by DNA hypomethylation.

Our data show that the numbers of differentially expressed genes correlate with the number of activated transposons in the various methylation mutants, while a significantly higher linear correlation was found between the number of activated TEs and the number of activated genes (r = 0.94, P < 0.006) versus the number of suppressed genes (r = 0.79, P < 0.06) (SI Appendix, Fig. S7A). In line with its trivial methylation and TE activation effects, no genes were differentially expressed in the drm mutant (SI Appendix, Fig. S7B). In all other mutants, we detected both activated and suppressed genes (SI Appendix, Fig. S7B). Most up-regulated genes were not methylated internally or expressed continuously from adjacent TEs (SI Appendix, Fig. S7C). In comparison, we found adjacent intergenic regions, especially promoters, of up-regulated genes to be more methylated in wild-type than intergenic regions of unaffected genes (SI Appendix, Fig. S7D) and to be enriched with DMRs in the mutants (SI Appendix, Fig. S7E). Conversely, promoter regions of suppressed genes had mostly lower methylation in wild type than the background of unaffected genes (SI Appendix, Fig. S7D). These results suggest that a portion of the differentially expressed genes in the methylation mutants is regulated by DNA methylation.
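The linear correlation coefficients quoted above (r) can be computed as in the following sketch, a plain implementation of the Pearson correlation. The per-mutant gene counts are not all listed in the main text, so the example vectors below are toy values chosen only to exercise the function, not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [value - mean_x for value in x]
    dy = [value - mean_y for value in y]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Toy vectors (hypothetical, not from the paper): one value per mutant.
activated_tes = [1, 2, 3, 4]
activated_genes = [2, 4, 6, 8]
r = pearson_r(activated_tes, activated_genes)
```

In the analysis described above, each mutant would contribute one point, with the number of activated TEs on one axis and the number of activated or suppressed genes on the other.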

We detected only 50 up-regulated genes in the met mutant (SI Appendix, Fig. S7B). Similar to the effect in TEs, met up-regulated genes were associated with a loss of non-CG methylation. The first 500 bp upstream of the TSS of met up-regulated genes were particularly hypomethylated in the CHG and CHH contexts (SI Appendix, Fig. S7F). Out of the 14 met up-regulated genes associated with hypo-DMRs, only 3 were hypomethylated specifically in CG, i.e., not in non-CG (SI Appendix, Fig. S7G). Thus, total and specific elimination of CG methylation hardly activated genes in P. patens. In comparison to the met mutant, dnmt3, cmt, and cmt/dnmt3 mutants were depleted mainly in non-CG methylation and hardly in CG methylation (SI Appendix, Fig. S7G) but led to a greater number of up-regulated genes (194 to 1,804 genes; SI Appendix, Fig. S7B), including those associated with hypo-DMRs (SI Appendix, Fig. S7G).

Double mutants showed a higher number of activated genes than the sum of the single mutants (Fig. 6 B and H). While cmt/dnmt3 and met/dnmt3 show specific and comparable CG and CHG methylation levels, respectively, at gene-adjacent regions (Fig. 4B), cmt/dnmt3 had 2.4 times more up-regulated genes than met/dnmt3 and 4.7 times more up-regulated genes associated with promoter hypo-DMRs (SI Appendix, Fig. S7 B, G, and H). The prominent differential gene expression in cmt/dnmt3 versus met/dnmt3 is correlated to its altered morphological phenotype (Fig. 4A and SI Appendix, Fig. S4D).

Overall, these results suggest that, similarly to TEs, a total and specific elimination of CG methylation in P. patens hardly affects gene expression. In contrast, we show that non-CG methylation is crucial for gene regulation and that symmetric (CG and CHG) and asymmetric (CHH) methylation function synergistically in regulating gene expression.

Discussion

While transposons in plants are consistently highly methylated in CG sites (>80%), non-CG sites are methylated between 20 and 90% for CHG and between 2 and 30% for CHH (6). Such a diverse level of methylation among plants emphasizes the importance of studying DNA methylation in plant species with distinct methylation levels. Thus far, A. thaliana was the only plant species with available null methylation mutants in CG as well as in non-CG methylation that have been transcriptionally characterized. CG, CHG, and CHH methylation in A. thaliana TEs averages at about 80%, 40%, and 15%, respectively. Accordingly, A. thaliana represents a plant species with a relatively low non-CG methylation level. Here, we introduced P. patens as the second plant system with viable null CG and non-CG methylation mutants associated with transcriptome data. In comparison to A. thaliana, CG, CHG, and CHH methylation in P. patens TEs averages around 80%, 80%, and 30%, respectively. Thus, P. patens represents a plant system with a higher level of non-CG methylation as well as equivalent CG and non-CG (CHG) symmetrically methylated contexts.

In addition to our recently characterized P. patens mutants eliminated in either CG, CHG, or CHH methylation, in this study we generated two mutants eliminated of either CG and CHH or CHG and CHH methylation, thus exclusively methylated at either CHG or CG, respectively. Fundamentally, these mutants maintained close to normal and similar levels of CHG and CG methylation, allowing the comparison between the particular roles of each of these methylation contexts at the whole-genome level.

Consistent with previous findings in plants, our results verify that DNA methylation in P. patens plays a significant role in gene and TE expression. Firstly, we show that, in wild type, the few expressed TEs are hypomethylated. Secondly, in correlation with its trivial methylation effect, we found only four activated TEs and no differentially expressed genes in the drm1/drm2 double mutant. Thirdly, strongly hypomethylated mutants showed substantial transcriptional effects, which increased in combinatorial mutants.

With regard to contextual methylation, our data show that, except for a minute group of regions depleted of all three methylation contexts, a specific elimination of CG methylation in the rest of the P. patens met genome did not alter TE or gene expression (Fig. 5 and SI Appendix, Fig. S7B). Specific removal of CHH methylation activated a particular group of TEs depleted in CG and CHG sites. In contrast to single context hypomethylation, removal of CG and CHH methylation caused a synergistic differential expression and activation of thousands of genes and TEs, respectively (Fig. 5F and SI Appendix, Fig. S7B). Similarly, elimination of CHG and CHH caused synergistic transcriptional alterations and activation of thousands of genes and TEs, respectively. Moreover, between the two CG and CHG exclusively methylated genomes, substantially broader and stronger differential gene expression and activation of TEs occurred in the CG-methylated genome rather than the CHG-methylated one (Fig. 6 and SI Appendix, Fig. S7).

These results imply a strong functional redundancy between asymmetric (CHH) methylation and either of the symmetric methylation contexts (CG and CHG). They also suggest that non-CG methylation in P. patens can function independently of CG methylation but not vice versa (i.e., CG methylation alone is not sufficient to silence P. patens TEs and to maintain normal gene expression). Finally, they demonstrate that between the two symmetrically methylated contexts, CHG methylation has a greater role than CG methylation in the transcriptional regulation of the P. patens genome (Fig. 7A).

Fig. 7.

Model for TE transcriptional control by CG, CHG, and CHH methylation. (A) Illustration of TE expression in context-specific null-methylated P. patens mutants. TEs remain silenced under total and specific elimination of either CG or CHH methylation. In contrast, total elimination of CG and CHH methylation up-regulates TE expression. Total removal of non-CG methylation (i.e., of CHG and CHH methylation) causes greater TE expression (in the number of TEs and their expression level). (B) Model for CHH methylation in silencing TEs that are pseudohypomethylated at CG and CHG sites. Highly methylated cytosines of CG and CHG sites are more rapidly deaminated than lowly methylated CHH. Thus, while mutagenesis eventually leads to permanent silencing of TEs, CHH methylation plays a role in silencing young CG/CHG-depleted TEs.

In A. thaliana, elimination of non-CG methylation in the ddcc mutant activates a small group of TEs, which is about six times smaller than the one activated in the CG null-methylated met1 mutants (Fig. 1A and SI Appendix, Fig. S1A). These results could have suggested that non-CG methylation is not sufficient for TE silencing and less efficient than CG methylation. However, our results showing that non-CG methylation in P. patens TEs is crucial for TE regulation support the alternative hypothesis, that the lower TE activation effect in ddcc is the consequence of the lower non-CG methylation in A. thaliana TEs. Additionally, in A. thaliana essentially all TEs in the met1 mutant are also depleted in non-CG methylation, whereas in the P. patens met mutant most TEs lose only CG methylation, and only a minority group that lose also non-CG methylation are activated. Thus, in both species, removal of CG methylation is associated with TE activation only when it is accompanied by reduction in non-CG methylation. Accordingly, while currently untestable, it is plausible that even in A. thaliana, exclusive reduction of CG methylation might not be sufficient to activate TEs and could be offset by non-CG methylation.

Symmetrical sequence contexts allow the efficient maintenance of methylation upon DNA replication in a semiconservative manner. Our data show that CHG can be symmetrically methylated, efficiently maintained, and act as a strong transcriptional controller. Accordingly, one could ask why CG methylation has become the prominent methylation context in eukaryotes. One possible explanation could be variations in their sequences upon DNA replication. While newly synthesized CGs remain consistently CGs, newly replicated CHG sites would be of either CAG or CTG for CWG (W = A or T), and either CCG or CGG for CSG (S = C or G). Once CG methylation had been adopted in early eukaryotic genomes, e.g., by developing specialized readout mechanisms, it may have been too late to switch to a different methylation context. In contrast, expanding methylation to new sequence contexts later in evolution, e.g., CHG and CHH in plants, could have enhanced existing methylation functions or developed new ones, such as linking it to the heterochromatic H3K9me2 mark via the binding of the SET- or RING-associated (SRA) domain of histone methylases to non-CG methylated sites and the reciprocal binding of CMTs to H3K9me2 (15, 55).
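The strand argument in the preceding paragraph can be illustrated with a minimal sketch. The helper name `reverse_complement` and the `opposite` dictionary are ours and are not part of the study; the code only shows that a CG site reads as CG on both strands, whereas each CHG variant pairs with a different trinucleotide on the opposite strand.

```python
# Illustrative sketch (not from the study): compare each methylation context
# with the sequence read 5'->3' on the opposite strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(site: str) -> str:
    """Return the opposite-strand sequence of `site`, read 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(site))


# Map every example site to its opposite-strand partner.
opposite = {site: reverse_complement(site) for site in ("CG", "CAG", "CTG", "CCG")}

for site, partner in opposite.items():
    print(site, "->", partner)
```

In this sketch, CG maps to itself, CAG and CTG map to each other, and CCG maps to CGG, which matches the CWG and CSG pairs described above.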

Another interesting question raised by our findings is, if non-CG methylation is efficient in regulating transcription, and possibly stronger than CG methylation, how is non-CG methylation not consistently robust among plant genomes? A possible answer could be that non-CG methylation might be dictated by TE number and content (e.g., genomes with more potentially active TEs would require higher non-CG methylation). Differences in chromatin features among species, such as the unique enrichment of H3K27me3 in Marchantia polymorpha TEs (56), could influence DNA methylation patterns. Additionally, maintaining a lower methylation level could be an advantage under certain conditions. For example, having lower methylation allows genomes the ability to increase it upon demand in particular cells or developmental stages, e.g., increasing non-CG methylation in Arabidopsis root columella and gametophytes has been suggested to ensure the silencing of TEs in meristematic and germ cells, respectively (11, 57).

Our study suggests another role for maintaining methylation at a lower level. Methylation is mutagenic as it accelerates deamination of methylated cytosines to thymines. While harmful mutations in TE sequences are beneficial by leading to permanent silencing, insufficient or harmless mutations of methylated cytosines could place the host at risk due to pseudohypomethylation of the elements which could activate them or affect the expression of nearby genes. Here, we found that CHH methylation in P. patens is crucial for silencing young CG/CHG-depleted TEs. Additionally, our data suggest that cytosines of lowly methylated CHH sites are less prone to mutagenesis in comparison to that of highly methylated CG or CHG sites. These findings suggest an important role for lower methylation at CHH sites in safeguarding the silencing of TEs that are mutagenized (i.e., pseudohypomethylated) in their highly methylated CG and CHG sites (Fig. 7B).

To summarize, our study in P. patens disentangles the effects of CG, CHG, and CHH methylation and elucidates the crucial role of symmetric and asymmetric non-CG methylation in genome regulation.

Materials and Methods

Generation of Transgenic Mutant Lines and Growth Conditions.

A ΔPpmetΔPpdnmt3b double deletion mutant was generated by replacing the genomic region coding for PpDNMT3b in a ΔPpmet single deletion mutant background, and a ΔPpcmtΔPpdnmt3b double mutant was generated by replacing the genomic region coding for PpCMT in the ΔPpdnmt3b single deletion mutant using the same constructs as were used for generating single mutants as described previously (29, 30, 45). Mutants were verified by the absence of bisulfite sequencing (BS-seq) reads in the knockout regions (SI Appendix, Fig. S3 A–C).

Three independent experiments have been conducted for knocking out PpCMT in the background of ΔPpmet or knocking out PpMET in the background of the ΔPpcmt line, but these attempts yielded no ΔPpmet ΔPpcmt double mutants, suggesting that such a combination is lethal. P. patens WT and mutant plants were propagated on BCDAT (solutions B, C, D, and ammonium tartrate) media (58) at 25 °C under a 16-h light and 8-h dark regime (59). To ensure harvesting of clean protonema tissue, protonema tissue was ground prior to bud development and plated on BCDAT medium; following 7 d of growth, the tissue was ground and plated for another 7-d incubation, at the end of which DNA or RNA was extracted.

Arabidopsis WT (Col-0) and ddcc mutant (drm1-2 SALK_021316; drm2-2 SALK_150863; cmt2-3 SALK_012874; cmt3-12 SALK_148381) seeds were sown on soil in pots, stratified at 4 °C for 48 h, and grown under 22 °C and a long-day (16 h light/8 h darkness) regime.

P. patens Plant Size Analysis.

Images of Petri dishes containing P. patens 27-d-old plants were processed using ImageJ v1.52i (Fiji) to assess the area of each plant. Mann–Whitney U test was performed for statistical evaluation of size variation between WT and double mutant plants, using GraphPad Prism software.

BS-Seq Library Preparation.

Genomic DNA was extracted from 7-d-old protonema using DNeasy Plant Mini Kit (Qiagen) according to manufacturer’s instructions. About 1 μg of purified genomic DNA was fragmented by sonication, end repaired, and ligated to custom synthesized methylated adaptors. Adaptor-ligated libraries were subjected to sodium bisulfite treatment using the Methylation-Gold Bisulfite Kit (ZYMO) as outlined in the manufacturer’s instructions. The converted libraries were then amplified by PCR, gel purified to select the 350- to 400-bp size range, quantified using real-time quantitative PCR with TaqMan probe, and validated with a Bioanalyzer (Agilent). The libraries were amplified on cBot to generate clusters on the flow cell (TruSeq PE Cluster Kit V3–cBot–HS, Illumina) and sequenced as paired end on the HiSeq 2000 System (TruSeq SBS KIT-HS V3, Illumina) (SI Appendix, Table S3).

BS-Seq Data Analysis.

The quality of the reads was evaluated using FastQC v0.11.8 (Babraham Bioinformatics). Identical duplicate reads and low-quality reads were filtered out using Prinseq-lite software v0.20.4 (60). BS-seq data processing was performed as described previously (11). Briefly, we used custom Perl scripts to convert all of the Cs in the “forward” reads (and in the scaffold) to Ts, and all of the Gs in the “reverse” reads and scaffold to As. The converted reads were aligned to the converted scaffold using Bowtie2 v2.3.4.1 (61), allowing two mismatches and multimapping reads with up to 10 hits. We then used Perl scripts to recover the original sequence information and, for each C (on either strand), counted the number of times it was sequenced as a C or a T. We calculated fractional methylation within a 50-bp sliding window for each sequence context separately and regardless of context, for use in downstream analyses. The percent of DNA methylation change (such as in Fig. 1F) was calculated by dividing the difference in methylation level between two samples by the level of methylation in the sample with the higher methylation level. Two-dimensional (2D) density plots for visualizing the CG/CHG symmetry level (such as in Fig. 4C) were generated using the hist2d function with norm = LogNorm() parameter from the matplotlib.pyplot Python package. For identifying DMRs, fractional methylation in 50-bp windows across the genome was compared between WT and each of the mutants. DMRs were called for windows with minimal coverage of three reads and Fisher’s exact test P value <0.05. For hypo-DMRs and gene proximity analysis, CG and CHG DMRs with at least 40% methylation decrease between WT and mutant, and CHH DMRs with at least 20% methylation decrease between WT and mutant were considered. Public raw BS-seq reads of A. thaliana WT, met1, and ddcc (SI Appendix, Table S1) were downloaded from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) and processed as described above.
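As a rough illustration of the calculations described in this section, the following minimal sketch computes fractional methylation from C and T read counts, the percent methylation change between two samples, and a DMR call for one window. It is not the custom Perl pipeline used in the study. All function names are hypothetical, and the sketch makes three assumptions: the C and T counts are pooled across the cytosines of a window before testing, window coverage means the total number of pooled reads, and the percent change is reported as an absolute value.

```python
from math import comb


def fractional_methylation(c_reads: int, t_reads: int) -> float:
    """Fraction of reads sequenced as C (methylated) among all C + T reads."""
    total = c_reads + t_reads
    return c_reads / total if total else float("nan")


def percent_methylation_change(level_a: float, level_b: float) -> float:
    """Difference between two levels divided by the higher level, in percent."""
    higher = max(level_a, level_b)
    if higher == 0:
        return 0.0
    return 100.0 * abs(level_a - level_b) / higher


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


def is_dmr(wt: tuple[int, int], mut: tuple[int, int],
           min_coverage: int = 3, alpha: float = 0.05) -> bool:
    """Call a window as a DMR from pooled (C, T) read counts in WT and mutant."""
    if sum(wt) < min_coverage or sum(mut) < min_coverage:
        return False  # assumption: coverage is the pooled C + T read count
    return fisher_exact_two_sided(wt[0], wt[1], mut[0], mut[1]) < alpha
```

For example, `is_dmr((9, 1), (1, 9))` compares a window with 9 C and 1 T reads in WT against 1 C and 9 T reads in a mutant. The additional 40% (CG, CHG) and 20% (CHH) decrease thresholds used for hypo-DMRs are not sketched here, because the text does not state whether they are absolute or relative differences.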

RNA-Seq Library Preparation.

Total RNA from P. patens wild-type and mutant plants, two to three biological replicates per genotype (SI Appendix, Table S2), was extracted from 7-d-old protonema using the SV Total RNA Isolation System (Promega) and DNase treatment (Thermo Fisher Scientific). Total RNA from Arabidopsis plants was extracted from 1-mo-old rosette leaves using RNeasy Plant Mini Kit (Qiagen) according to manufacturer’s instructions. The poly-A fraction (mRNA) was purified from 500 ng of total input RNA, followed by fragmentation and the generation of double-stranded cDNA. Then, Agencourt Ampure XP beads cleanup (Beckman Coulter), end repair, A base addition, adaptor ligation, and PCR amplification steps were performed. Libraries were quantified by Qubit (Thermo Fisher Scientific) and TapeStation (Agilent). The libraries were sequenced as single-end reads on HiSeq 2500.

RNA-Seq Data Analysis.

The quality of the reads was evaluated using FastQC v0.11.8 (Babraham Bioinformatics). Adaptor and quality trimming and poly-A tail removal were performed using TrimGalore v0.6.0 (Babraham Bioinformatics) with Cutadapt v1.15. The preprocessed reads were aligned to the reference genome and read counts per gene/TE were obtained using STAR v2.7.0f (62). Sequencing depth and additional information are shown in SI Appendix, Table S2. Additionally, aligned reads were counted in a 50-bp sliding window genome-wide and normalized to library size (reads per million reads), for use in downstream analysis, e.g., genome browser snapshots and expression profiling across aligned TEs.
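The window-based normalization mentioned above can be sketched as follows. This is a simplified illustration and not the code used in the study. The function names are hypothetical, and it assumes that each read is assigned to a non-overlapping 50-bp bin by its start coordinate and that library size is the total number of reads.

```python
from collections import Counter


def count_reads_in_windows(read_starts: list[int], window: int = 50) -> Counter:
    """Count reads per window, keyed by the window's start coordinate."""
    return Counter((start // window) * window for start in read_starts)


def reads_per_million(window_counts: list[int], library_size: int) -> list[float]:
    """Scale per-window read counts to reads per million reads in the library."""
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    scale = 1_000_000 / library_size
    return [count * scale for count in window_counts]


# Toy example with five reads (hypothetical coordinates).
counts = count_reads_in_windows([0, 10, 49, 50, 120])
rpm = reads_per_million([counts[0], counts[50], counts[100]], library_size=5)
```

Normalizing to library size in this way lets window-level signals from libraries of different sequencing depth be shown on a common scale, e.g., in genome browser snapshots.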

For differential expression analysis based on data from two to three biological replicates per genotype, we used voom from the edgeR/limma Bioconductor R packages; the differential expression model accounted for a possible batch effect. Genes with a raw read count below 10 were omitted from the analysis. Genes and TEs with twofold change in expression level and false discovery rate (FDR) below 0.1 were considered differentially expressed.
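The thresholds in the preceding paragraph can be expressed as a small filter. The sketch below is not the voom/limma model itself, which was run in R. It only shows how results from such a model might be classified, assuming "twofold change" means an absolute log2 fold change of at least 1. The `DEResult` class, the `classify` function, and the feature IDs are hypothetical.

```python
from dataclasses import dataclass
from math import log2


@dataclass
class DEResult:
    """One gene or TE with its model output (hypothetical container)."""
    feature_id: str
    log2_fold_change: float  # mutant versus wild type
    fdr: float               # multiple-testing-adjusted P value


def classify(result: DEResult, min_fold: float = 2.0, max_fdr: float = 0.1) -> str:
    """Label a feature as 'up', 'down', or 'unchanged' using the stated cutoffs."""
    if result.fdr >= max_fdr:
        return "unchanged"
    if abs(result.log2_fold_change) < log2(min_fold):
        return "unchanged"
    return "up" if result.log2_fold_change > 0 else "down"


# Toy inputs; the values are invented for illustration only.
examples = [
    DEResult("feature_a", 1.5, 0.01),
    DEResult("feature_b", -2.0, 0.05),
    DEResult("feature_c", 3.0, 0.20),
    DEResult("feature_d", 0.5, 0.001),
]
labels = {r.feature_id: classify(r) for r in examples}
```

The raw-count filter (fewer than 10 reads) is left out of the sketch because the text does not specify how counts were aggregated across samples.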

Public raw RNA-seq reads of A. thaliana WT, met1, and ddcc (SI Appendix, Table S1) were downloaded from NCBI SRA and processed as described above.

Gene and TE Annotations.

Gene annotations for P. patens (v3.3) and A. thaliana (Araport11) were downloaded from Phytozome https://phytozome.jgi.doe.gov/.

A. thaliana TE annotations were downloaded from TAIR https://www.arabidopsis.org/, TAIR10 genome release. P. patens TE annotations, produced by REPET software (63, 64), were downloaded from Co-Ge https://genomevolution.org/coge/GenomeView.pl?gid=33928. A separate annotation for intact LTR retrotransposons produced with LTRharvest (65) was kindly provided by Stefan Rensing, University of Marburg, Marburg, Germany (44). REPET-based annotation divided P. patens LTR/Gypsy TEs into three main groups (RLG1, RLG2, and RLG3) and LTR/Copia into two main groups (RLC4 and RLC5), while we detected within some of them, for example RLG3, subgroups with different sequence properties, and sensitivity to different DNA methylation context loss. Information from P. patens Repeatmasked assembly (v3.3) downloaded from Phytozome was used to increase the resolution of LTR-TE families.

To assess the ages of LTR retrotransposon families based on LTR similarity level, REannotate software (66) was used for RepeatMasker output processing. LTR retrotransposon age was calculated assuming a neutral mutation rate of 9.4 × 10⁻⁹ synonymous substitutions per synonymous site per year (67).
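The text gives the mutation rate but not the dating formula. A minimal sketch, assuming the commonly used relation T = K / (2r), where K is the sequence divergence between the two LTRs of one element and r is the substitution rate per site per year, could look like this. The function name is hypothetical, and whether REannotate applies a distance correction before this step is not addressed here.

```python
NEUTRAL_RATE = 9.4e-9  # substitutions per site per year, as stated in the text


def ltr_insertion_age(divergence: float, rate: float = NEUTRAL_RATE) -> float:
    """Estimate insertion age in years, assuming T = K / (2 * r).

    `divergence` (K) is the fraction of differing sites between the two LTRs.
    The factor of 2 reflects that both LTRs accumulate substitutions
    independently after insertion.
    """
    if divergence < 0:
        raise ValueError("divergence must be non-negative")
    if rate <= 0:
        raise ValueError("rate must be positive")
    return divergence / (2 * rate)


# An element whose LTRs differ at 1.88% of sites (hypothetical value).
age_years = ltr_insertion_age(0.0188)
```

Under this assumption, identical LTRs give an age of zero, and higher divergence corresponds to older insertions.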

Metaanalyses across Genes and TEs.

Metaanalysis of DNA methylation, CG/CHG/CHH site abundance, RNA-seq reads, and DMRs relative to gene and TE edges was performed by first aligning genes or TEs at either their 5′ or 3′ ends. Then, for each gene or TE, a score (mean methylation level or total number of sites) or the presence of a DMR (1 or 0) was calculated in a sliding window (50 or 100 bp) upstream and downstream of the point of alignment, and a mean value for each window was calculated across all included elements. Each element generally contributed data until reaching the end of its sequence, another annotated element, or the indicated selected length.
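The averaging procedure described above can be sketched as follows. This is an illustrative simplification, not the code used in the study. The function names are hypothetical, windows are taken as non-overlapping, and each element is given as a list of per-position scores starting at the alignment point, with `None` marking positions without data (e.g., no cytosine).

```python
from statistics import mean
from typing import Optional, Sequence


def element_window_means(scores: Sequence[Optional[float]], window: int,
                         n_windows: int) -> list[Optional[float]]:
    """Mean score per window for one element; None if the window has no data."""
    means: list[Optional[float]] = []
    for i in range(n_windows):
        chunk = [s for s in scores[i * window:(i + 1) * window] if s is not None]
        means.append(mean(chunk) if chunk else None)
    return means


def meta_profile(elements: Sequence[Sequence[Optional[float]]], window: int,
                 n_windows: int) -> list[Optional[float]]:
    """Average the per-element window means across all aligned elements."""
    per_element = [element_window_means(e, window, n_windows) for e in elements]
    profile: list[Optional[float]] = []
    for i in range(n_windows):
        # Elements that ended before this window simply do not contribute.
        column = [row[i] for row in per_element if row[i] is not None]
        profile.append(mean(column) if column else None)
    return profile


# Two toy elements aligned at their 5' ends, with 2-position windows.
toy_profile = meta_profile([[1.0, 1.0, 0.0, 0.0], [0.5, None, 1.0]],
                           window=2, n_windows=2)
```

Averaging per element first, and then across elements, gives each element equal weight in a window regardless of how many scored positions it has there.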

Heatmaps.

For heatmap illustrating methylation and expression changes within commonly up-regulated TEs in met/dnmt3 and cmt/dnmt3 mutants, a matrix of mean CG, CHG, and CHH methylation in WT and mutants and expression logarithmic fold change was used. Heatmaps were generated using the clustermap() function from Python package Seaborn v0.9.0. Clustering was performed using scipy.cluster.hierarchy.linkage() function, with “average” linkage method. For heatmaps illustrating methylation and expression changes along TEs from four subgroups of up-regulated intact LTR TEs, CG, CHG, and CHH methylation in wild type and mutants, CG and CHG site frequency, and log-transformed expression levels were calculated for each TE in 500-bp sliding windows inside each TE, as described above. Methylation and expression data were organized in descending order of TE expression in cmt/dnmt3 mutant and plotted using the clustermap() function.

Supplementary Material

Supplementary File
pnas.2011361117.sd01.xlsx (308.1KB, xlsx)

Acknowledgments

This work was supported by the Israeli Centers for Research Excellence Program of the Planning and Budgeting Committee, Israel Science Foundation Grant 757/12, Israel Science Foundation Grant 1636/15, and the European Research Council (ERC, 679551) (to A.Z.); and Israel Science Foundation Grant 767/09 (to N.O.). We thank Prof. Stefan Rensing (University of Marburg) for providing TE annotations, Prof. Vicki Plaks (University of California, San Francisco) for editing, and Prof. Avi Levy (Weizmann Institute) and Prof. Daniel Zilberman (John Innes Centre) for scientific comments.

Footnotes

The authors declare no competing interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2011361117/-/DCSupplemental.

Data Availability.

Methylome and transcriptome data have been deposited in Gene Expression Omnibus (GSE142054).

References

  • 1.Zemach A., McDaniel I. E., Silva P., Zilberman D., Genome-wide evolutionary analysis of eukaryotic DNA methylation. Science 328, 916–919 (2010). [DOI] [PubMed] [Google Scholar]
  • 2.Feng S., et al. , Conservation and divergence of methylation patterning in plants and animals. Proc. Natl. Acad. Sci. U.S.A. 107, 8689–8694 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Huff J. T., Zilberman D., Dnmt1-independent CG methylation contributes to nucleosome positioning in diverse eukaryotes. Cell 156, 1286–1297 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Bewick A. J., Vogel K. J., Moore A. J., Schmitz R. J., Evolution of DNA methylation across insects. Mol. Biol. Evol. 34, 654–665 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Bewick A. J., et al. , Diversity of cytosine methylation across the fungal tree of life. Nat. Ecol. Evol. 3, 479–490 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Niederhuth C. E., et al. , Widespread natural variation of DNA methylation within angiosperms. Genome Biol. 17, 194 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Goll M. G., Bestor T. H., Eukaryotic cytosine methyltransferases. Annu. Rev. Biochem. 74, 481–514 (2005). [DOI] [PubMed] [Google Scholar]
  • 8.Du J., Johnson L. M., Jacobsen S. E., Patel D. J., DNA methylation pathways and their crosstalk with histone methylation. Nat. Rev. Mol. Cell Biol. 16, 519–532 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Schmitz R. J., Lewis Z. A., Goll M. G., DNA methylation: Shared and divergent features across eukaryotes. Trends Genet. 35, 818–827 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Kim M. Y., Zilberman D., DNA methylation as a system of plant genomic immunity. Trends Plant Sci. 19, 320–326 (2014). [DOI] [PubMed] [Google Scholar]
  • 11.Ibarra C. A., et al. , Active DNA demethylation in plant companion cells reinforces transposon methylation in gametes. Science 337, 1360–1364 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Benoit M., et al. , Environmental and epigenetic regulation of Rider retrotransposons in tomato. PLoS Genet. 15, e1008370 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Miura A., et al. , Mobilization of transposons by a mutation abolishing full DNA methylation in Arabidopsis. Nature 411, 212–214 (2001). [DOI] [PubMed] [Google Scholar]
  • 14.Zemach A., et al. , The Arabidopsis nucleosome remodeler DDM1 allows DNA methyltransferases to access H1-containing heterochromatin. Cell 153, 193–205 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Stroud H., et al. , Non-CG methylation patterns shape the epigenetic landscape in Arabidopsis. Nat. Struct. Mol. Biol. 21, 64–72 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Griffin P. T., Niederhuth C. E., Schmitz R. J., A comparative analysis of 5-azacytidine-and zebularine-induced DNA demethylation. G3 (Bethesda) 6, 2773–2780 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Hu L., et al. , Mutation of a major CG methylase in rice causes genome-wide hypomethylation, dysregulated genome expression, and seedling lethality. Proc. Natl. Acad. Sci. U.S.A. 111, 10642–10647 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Anderson S. N., et al. , Subtle perturbations of the maize methylome reveal genes and transposons silenced by chromomethylase or RNA-directed DNA methylation pathways. G3 (Bethesda) 8, 1921–1932 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Corem S., et al. , Redistribution of CHH methylation and small interfering RNAs across the genome of tomato ddm1 mutants. Plant Cell 30, 1628–1644 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Hirsch C. D., Springer N. M., Transposable element influences on gene expression in plants. Biochim. Biophys. Acta. Gene Regul. Mech. 1860, 157–165 (2017). [DOI] [PubMed] [Google Scholar]
  • 21.Zemach A., et al. , Local DNA hypomethylation activates genes in rice endosperm. Proc. Natl. Acad. Sci. U.S.A. 107, 18729–18734 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Harris C. J., et al. , A DNA methylation reader complex that enhances gene transcription. Science 362, 1182–1186 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Williams B. P., Pignatta D., Henikoff S., Gehring M., Methylation-sensitive expression of a DNA demethylase gene serves as an epigenetic rheostat. PLoS Genet. 11, e1005142 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Zhang H., Lang Z., Zhu J.-K., Dynamics and function of DNA methylation in plants. Nat. Rev. Mol. Cell Biol. 19, 489–506 (2018). [DOI] [PubMed] [Google Scholar]
  • 25.Choi J., Lyons D. B., Kim M. Y., Moore J. D., Zilberman D., DNA methylation and histone H1 jointly repress transposable elements and aberrant intragenic transcripts. Mol. Cell 77, 310–323.e7 (2020). [DOI] [PubMed] [Google Scholar]
  • 26.Rigal M., Kevei Z., Pélissier T., Mathieu O., DNA methylation in an intron of the IBM1 histone demethylase gene stabilizes chromatin modification patterns. EMBO J. 31, 2981–2993 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Wang X., et al. , RNA-binding protein regulates plant DNA methylation by controlling mRNA processing at the intronic heterochromatin-containing gene IBM1. Proc. Natl. Acad. Sci. U.S.A. 110, 15467–15472 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Gouil Q., Baulcombe D. C., DNA methylation signatures of the plant chromomethyltransferases. PLoS Genet. 12, e1006526 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Noy-Malka C., et al. , A single CMT methyltransferase homolog is involved in CHG DNA methylation and development of Physcomitrella patens. Plant Mol. Biol. 84, 719–735 (2014). [DOI] [PubMed] [Google Scholar]
  • 30.Yaari R., et al. , RdDM-independent de novo and heterochromatin DNA methylation by plant CMT and DNMT3 orthologs. Nat. Commun. 10, 1613 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Lindroth A. M., Requirement of CHROMOMETHYLASE3 for maintenance of CpXpG methylation. Science 292, 2077–2080 (2001). [DOI] [PubMed] [Google Scholar]
  • 32.Cao X., Jacobsen S. E., Role of the arabidopsis DRM methyltransferases in de novo DNA methylation and gene silencing. Curr. Biol. 12, 1138–1144 (2002). [DOI] [PubMed] [Google Scholar]
  • 33.Fu F. F., Dawe R. K., Gent J. I., Loss of RNA-directed DNA methylation in maize chromomethylase and DDM1-type nucleosome remodeler mutants. Plant Cell 30, 1617–1627 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Henikoff S., Comai L., A DNA methyltransferase homolog with a chromodomain exists in multiple polymorphic forms in Arabidopsis. Genetics 149, 307–318 (1998). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Cao X., et al. , Conserved plant genes with similarity to mammalian de novo DNA methyltransferases. Proc. Natl. Acad. Sci. U.S.A. 97, 4979–4984 (2000). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Harris K. D., Zemach A., Contingous and stochastic CHH methylation by plant DRM2 and CMT2 revealed by single-read methylome analysis. Genome Biol. 21, 194 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Ikeda Y., et al. , Loss of CG methylation in Marchantia polymorpha causes disorganization of cell division and reveals unique DNA methylation regulatory mechanisms of non-CG methylation. Plant Cell Physiol. 59, 2421–2431 (2018). [DOI] [PubMed] [Google Scholar]
  • 38.Tan F., et al. , Analysis of chromatin regulators reveals specific features of rice DNA methylation pathways. Plant Physiol. 171, 2041–2054 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Cheng C., et al. , Loss of function mutations in the rice chromomethylase OsCMT3a cause a burst of transposition. Plant J. 83, 1069–1081 (2015). [DOI] [PubMed] [Google Scholar]
  • 40.Yang Y., et al. , Critical function of DNA methyltransferase 1 in tomato development and regulation of the DNA methylome and transcriptome. J. Integr. Plant Biol. 61, 1224–1242 (2019). [DOI] [PubMed] [Google Scholar]
  • 41.Wang Z., Baulcombe D. C., Transposon age and non-CG methylation. Nat. Commun. 11, 1221 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Wang Z., et al. , Polymerase IV plays a crucial role in pollen development in Capsella. Plant Cell 32, 950–966 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Li Q., et al. , Genetic perturbation of the maize methylome. Plant Cell 26, 4602–4616 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Lang D., et al. , The Physcomitrella patens chromosome-scale assembly reveals moss genome structure and evolution. Plant J. 93, 515–533 (2018). [DOI] [PubMed] [Google Scholar]
  • 45.Yaari R., et al. , DNA methyltransferase 1 is involved in (m)CG and (m)CCG DNA methylation and is essential for sporophyte development in Physcomitrella patens. Plant Mol. Biol. 88, 387–400 (2015). [DOI] [PubMed] [Google Scholar]
  • 46.Zhang R., et al. , A high quality Arabidopsis transcriptome for accurate transcript-level analysis of alternative splicing. Nucleic Acids Res. 45, 5061–5073 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Rigal M., et al. , Epigenome confrontation triggers immediate reprogramming of DNA methylation and transposon silencing in Arabidopsis thaliana F1 epihybrids. Proc. Natl. Acad. Sci. U.S.A. 113, E2083–E2092 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Holliday R., Grigg G. W., DNA methylation and mutation. Mutat. Res. 285, 61–67 (1993). [DOI] [PubMed] [Google Scholar]
  • 49. Ossowski S., et al., The rate and molecular spectrum of spontaneous mutations in Arabidopsis thaliana. Science 327, 92–94 (2010).
  • 50. Weng M. L., et al., Fine-grained analysis of spontaneous mutation spectrum and frequency in Arabidopsis thaliana. Genetics 211, 703–714 (2019).
  • 51. Ricci W. A., et al., Widespread long-range cis-regulatory elements in the maize genome. Nat. Plants 5, 1237–1249 (2019).
  • 52. Lu Z., et al., The prevalence, evolution and chromatin signatures of plant regulatory elements. Nat. Plants 5, 1250–1259 (2019).
  • 53. Angeloni A., Bogdanovic O., Enhancer DNA methylation: Implications for gene regulation. Essays Biochem. 63, 707–715 (2019).
  • 54. Hosaka A., et al., Evolution of sequence-specific anti-silencing systems in Arabidopsis. Nat. Commun. 8, 2161 (2017).
  • 55. Johnson L. M., et al., The SRA methyl-cytosine-binding domain links DNA and histone methylation. Curr. Biol. 17, 379–384 (2007).
  • 56. Montgomery S. A., et al., Chromatin organization in early land plants reveals an ancestral association between H3K27me3, transposons, and constitutive heterochromatin. Curr. Biol. 30, 573–588.e7 (2020).
  • 57. Kawakatsu T., et al., Unique cell-type-specific patterns of DNA methylation in the root meristem. Nat. Plants 2, 16058 (2016).
  • 58. Nishiyama T., Hiwatashi Y., Sakakibara I., Kato M., Hasebe M., Tagged mutagenesis and gene-trap in the moss, Physcomitrella patens by shuttle mutagenesis. DNA Res. 7, 9–17 (2000).
  • 59. Frank W., Decker E. L., Reski R., Molecular tools to study Physcomitrella patens. Plant Biol. 7, 220–227 (2005).
  • 60. Schmieder R., Edwards R., Quality control and preprocessing of metagenomic datasets. Bioinformatics 27, 863–864 (2011).
  • 61. Langmead B., Salzberg S. L., Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
  • 62. Dobin A., et al., STAR: Ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
  • 63. Flutre T., Duprat E., Feuillet C., Quesneville H., Considering transposable element diversification in de novo annotation approaches. PLoS One 6, e16526 (2011).
  • 64. Quesneville H., et al., Combined evidence annotation of transposable elements in genome sequences. PLoS Comput. Biol. 1, 0166–0175 (2005).
  • 65. Ellinghaus D., Kurtz S., Willhoeft U., LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics 9, 18 (2008).
  • 66. Pereira V., Automated paleontology of repetitive DNA with reannotate. BMC Genomics 9, 614 (2008).
  • 67. Rensing S. A., et al., An ancient genome duplication contributed to the abundance of metabolic genes in the moss Physcomitrella patens. BMC Evol. Biol. 7, 130 (2007).

Associated Data

Supplementary Materials

Supplementary File
pnas.2011361117.sd01.xlsx (308.1KB, xlsx)

Data Availability Statement

Methylome and transcriptome data have been deposited in Gene Expression Omnibus (GSE142054).


Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
