Author manuscript; available in PMC: 2017 Sep 21.
Published in final edited form as: Microbiol Spectr. 2017 Jul;5(4):10.1128/microbiolspec.FUNK-0042-2017. doi: 10.1128/microbiolspec.FUNK-0042-2017

Repeat-Induced Point Mutation (RIP) and Other Genome Defense Mechanisms in Fungi

EUGENE GLADYSHEV 1
PMCID: PMC5607778  NIHMSID: NIHMS875123  PMID: 28721856

Abstract

Transposable elements (TEs) have colonized the genomes of nearly all organisms, including fungi. Although the effects of TEs may sometimes be beneficial to their hosts, their overall impact is considered deleterious. As a result, the activity of TEs needs to be counterbalanced by the host genome defenses. In fungi, the primary genome defense mechanisms include Repeat-Induced Point mutation (RIP) and Methylation Induced Premeiotically (MIP), Meiotic Silencing by Unpaired DNA (MSUD), Sex-Induced Silencing (SIS), cosuppression (also known as somatic quelling) and cotranscriptional RNA surveillance. Recent studies in the filamentous fungus Neurospora crassa have shown that the process of repeat recognition for RIP apparently involves interactions between coaligned double-stranded segments of chromosomal DNA. These studies have also shown that RIP can be mediated by the conserved pathway that establishes transcriptional (heterochromatic) silencing of repetitive DNA. In the light of these new findings, RIP emerges as a specialized case of the general phenomenon of heterochromatic silencing of repetitive DNA.

DEFENDING THE GENOME AGAINST MOBILE DNA

Mobile DNA, comprising both active and decaying copies of transposable elements (TEs), is present in nearly all living organisms. Although fungal genomes tend to be significantly smaller than the genomes of plants and animals, they still can vary dramatically with respect to their TE loads (1). While TEs have been proposed to provide some beneficial functions to their hosts, e.g., by promoting genetic diversity and accelerating adaptive evolution (2–5), their overall impact is considered deleterious (6). Insertional mutagenesis, gene misexpression, and genome instability represent some well-known examples of the deleterious effects associated with TEs. Importantly, by being able to move between vertical genetic lineages, TEs can still proliferate in a population of sexually reproducing individuals despite causing substantial fitness defects (6).

The ability of TEs to multiply exponentially in the host genome is opposed by the host genome defense systems. Known defense mechanisms differ significantly with respect to their specific modes of TE recognition and suppression, yet all of them face the same basic challenge of being able to recognize TEs among many other potential genomic targets. Specific recognizable features of TEs include (i) frequently aberrant modes of replication and transcription of TEs, (ii) the presence of multiple TE copies in the host genome, and (iii) the occurrence of polymorphic TE insertion sites in the host population. Once detected, TEs can be silenced by a number of molecular approaches. The canonical examples of transcriptional and cotranscriptional silencing include RNA-directed DNA (cytosine) methylation in plants (7), piRNA-directed H3K9 methylation in the Drosophila germ line (8), and various cotranscriptional modalities involving RNA processing and quality control (9–14). TE-encoded RNA can also be restricted posttranscriptionally by RNA interference (RNAi) and RNA editing. In addition to these transcription-related processes, TEs can also be inactivated by the physical elimination of their DNA, as it occurs during chromatin diminution in selected eukaryotic groups including ciliates, foraminifera, and some animals (15, 16).

GENOME DEFENSE MECHANISMS IN FUNGI

In fungi, TEs are opposed by several genome defense mechanisms including Repeat-Induced Point mutation (RIP) and Methylation Induced Premeiotically (MIP), Meiotic Silencing by Unpaired DNA (MSUD), Sex-Induced Silencing (SIS), somatic quelling (cosuppression), and several cotranscriptional RNA surveillance processes (exemplified by Spliceosome-Coupled And Nuclear RNAi, SCANR).

Cotranscriptional RNA Surveillance Processes: SCANR

TEs are often associated with aberrant transcription, which is recognized by cotranscriptional RNA surveillance and which triggers silencing either by a localized assembly of heterochromatin or by an RNAi response (9–14). Because cotranscriptional RNA surveillance can act on individual TE copies in vegetative cells, it provides the first line of defense and allows the host to silence TEs that do not normally produce dsRNA and do not exist as tandem repeat arrays. One such mechanism was recently described in the basidiomycete Cryptococcus neoformans by Phillip Dumesic, Hiten Madhani, and colleagues (13), who discovered and characterized the Spliceosome-Coupled And Nuclear RNAi complex (SCANR). Their results led to a kinetic model of posttranscriptional silencing, by which the presence of suboptimal introns in TE-encoded pre-mRNAs promotes inefficient splicing and induces spliceosome stalling, thus providing SCANR with sufficient time to initiate the RNAi (13).

Somatic Quelling (Cosuppression)

The phenomenon of somatic quelling was first described by Nicoletta Romano and Giuseppe Macino in Neurospora crassa (17). Generally known as cosuppression in plants and animals, somatic quelling represents a paradigmatic example of an RNAi-mediated posttranscriptional defense mechanism. In somatic quelling, small interfering RNAs are produced specifically from chromosomal loci that contain repetitive DNA (18). In addition to the canonical RNAi factors (the RNA-dependent RNA polymerase QDE-1, the Argonaute protein QDE-2, and Dicer-like proteins DCL-1 and DCL-2), somatic quelling also requires RPA (Replication Protein A), QDE-3 (RecQ/SGS1 DNA helicase), and the key recombination factors Rad51, Rad52, and Rad54 (19). In the light of these and other results, a model was proposed by which single-stranded DNA intermediates formed during aberrant recombinational DNA repair can serve as templates for the production of primary single-stranded “aberrant” RNAs (aRNAs) using the DNA-dependent RNA polymerase activity of QDE-1. Single-stranded aRNAs would be converted to double-stranded RNAs by the second, RNA-dependent RNA polymerase activity of QDE-1. In their turn, dsRNAs would then be processed into mature small RNAs (qiRNAs) associated with the Argonaute protein QDE-2 (19).

Interestingly, in N. crassa, repetitive transgenes that can potentially produce qiRNAs may also become associated with heterochromatic H3K9me3 and DNA methylation by a poorly understood process that does not involve RNAi (20–23). It is tempting to speculate that both RNAi-dependent and RNAi-independent mechanisms cooperate to silence expression of deleterious repetitive elements in vegetatively growing cells.

Meiotic Silencing

The phenomenon by which “unpairable” chromosomal segments become reversibly silenced in meiosis was first discovered by Robert Metzenberg and colleagues in N. crassa (24, 25). Together, their pioneering studies demonstrated that the presence of a gene-sized DNA segment on only one of the two homologous (parental) chromosomes could trigger potent yet transient silencing of all its copies (both “pairable” and “unpairable”) in the meiotic cell. This phenomenon was named “Meiotic silencing by unpaired DNA” or MSUD (25). In N. crassa, MSUD acts as a bona fide defense mechanism against TEs (26) and meiotic-drive elements called spore killers (27). This silencing process can be triggered by a chromosomal segment of “unpairable” DNA as short as 1 to 2 kbp, and increasing the segment length to 4–5 kbp leads to a much stronger effect (27). In its scope of detected targets, MSUD appears to be complementary to RIP (below). Both processes operate during the sexual cycle, yet while RIP specifically inactivates TEs that exist in a given parental genome as multiple copies, MSUD can detect even a single TE copy as long as this copy is present in only one parental genome.

Processes analogous to MSUD were also described in animals (28). The earliest observations of meiotic silencing in mammalian males pertained to the phenomenon of meiotic sex chromosome inactivation (MSCI). In MSCI, the XY bivalent becomes progressively condensed and associated with repressive chromatin modifications during the first meiotic prophase (28). Subsequent studies by several groups showed that meiotic silencing in animals was not restricted to the sex chromosomes (29–32). Because much larger chromosomal regions were required to trigger silencing (as compared with MSUD in N. crassa), the animal process was named “Meiotic silencing of unsynapsed chromatin” or MSUC. In both animals and fungi, meiotic silencing is activated in prophase I, when pairs of homologous chromosomes become progressively condensed and synapsed (28). Whereas meiotic silencing in animals operates primarily at the transcriptional level and involves changes in the pattern of histone modifications over large chromosomal domains, meiotic silencing in fungi is mediated at the posttranscriptional level by RNAi (27, 33).

How MSUD can recognize “unpairable” DNA remains a great mystery. Intriguingly, MSUD fails to detect two identical chromosomal segments as “unpairable” if these segments are present on homologous chromosomes at nonallelic positions separated by a few thousand base pairs (34). This result suggests that as long as two DNA segments with the same or nearly the same nucleotide sequences have an opportunity to interact with one another in the context of the juxtaposed homologous chromosomes, they are likely to be recognized as homologous and thus suppress the MSUD response. Once the distance between such nonallelic segments is increased, the probability of their interaction decreases, triggering MSUD (34). These important results hint at a possibility that relatively short chromosomal segments behave as spatially constrained yet quasi-autonomous entities (27). In this case, each segment may be evaluated for the presence of a pairable partner independently from its neighboring genomic sequences. The nature of the constraint that limits interactions between homologous segments present at widely separated nonallelic positions is unknown. Yet it has been speculated that it might be imposed by the basic organization of meiotic chromosomes that involves chromatin loops tethered to the chromosomal axis (27). In the light of these considerations, MSUD may emerge as a powerful new tool for probing the structure of condensed meiotic chromosomes.

The majority of factors required for MSUD in N. crassa (including an RNA-dependent RNA polymerase SAD-1, the Dicer-like protein DCL-1, an Argonaute protein SMS-2, and an RNA helicase SAD-3) are concentrated in the perinuclear space and, therefore, are physically removed from the presumptive site of homology recognition involving meiotic chromosomes (35, 36). The mechanism by which the presence of a locally “unpairable” DNA segment in the nucleus triggers the RNAi response in the perinuclear space remains deeply mysterious (27, 33). Recent work by Tom Hammond, Patrick Shiu, and colleagues in N. crassa identified SAD-6 as the first conserved nuclear factor involved in MSUD (34). SAD-6 belongs to a SWI/SNF subfamily of chromatin-remodeling proteins that also includes ATRX (34), yet, because MSUD still occurs in the absence of SAD-6, this factor could play a redundant or a secondary role in meiotic silencing.

Sex-Induced Silencing

Sex-Induced Silencing (SIS) was discovered by Joseph Heitman and colleagues in C. neoformans (37). In SIS, tandem repeat arrays are detected and silenced by RNAi during the premeiotic stage of the sexual cycle. While the same tandem repeat array can trigger an RNAi response in C. neoformans prior to entering the sexual phase, the efficiency of this somatic process appears to be much lower than in SIS (37, 38). Interestingly but not unexpectedly, all the canonical RNAi components required for SIS (RNA-dependent RNA polymerase as well as Dicer and Argonaute proteins) become transcriptionally upregulated specifically during the sexual phase when SIS occurs (37). Once initiated, SIS can be propagated during vegetative growth for many generations in the absence of any apparent repressive epigenetic modifications that are normally associated with heritable silencing in fungi (37), suggesting that RNAi alone can permit stable inheritance of the silenced state.

Intriguingly, SIS in the basidiomycete C. neoformans appears to share some basic features with both RIP (below) and somatic quelling (above) in ascomycete fungi. For example, similarly to somatic quelling, SIS involves RNAi and is triggered by the presence of tandem repeat arrays. On the other hand, similarly to RIP, SIS is strongly activated during the premeiotic phase. In general, the premeiotic phase appears to play a special role both in basidiomycetes and ascomycetes, perhaps, by allowing the parental genomes to be thoroughly searched for TEs before these genomes become mixed, recombined and inherited by the next generation.

Repeat-Induced Point Mutation

Soon after transformation techniques were developed in N. crassa (39), it was noticed that transgenic repetitive DNA became unstable specifically during the sexual phase in this organism (40). At about the same time, Eric Selker and Judith Stevens reported strong cytosine-to-thymine (C-to-T) mutation and concomitant methylation of nearly all remaining cytosines in a naturally occurring tandem duplication of a 5S RNA gene (41). A very peculiar association of strong C5-cytosine methylation with repeated DNA encouraged the idea of a causal relationship, whereby the presence of repeats provided a signal for cytosine methylation and mutation (41). Pioneering work by Eric Selker and colleagues demonstrated that the repetitive nature of DNA indeed provided such a signal in a process named “Repeat-Induced Point mutation” (RIP). Subsequent work by Eric Selker and colleagues defined some key properties of RIP in N. crassa (42–44). Specifically, it was shown that RIP

  • occurs specifically in haploid parental nuclei that continue to divide by mitosis in preparation for karyogamy and ensuing meiosis

  • detects duplications of chromosomal DNA above a certain length threshold (~0.4 kbp), irrespective of their transcriptional status, origin, and relative as well as absolute positions in the genome (although closely positioned repeats are detected much more readily than widely separated repeats)

  • mutates cytosines on both strands of each DNA duplex in a pairwise fashion (e.g., preferentially affecting none or both repeated sequences)

  • occasionally spreads from duplicated sequences into neighboring nonrepetitive regions

  • remains unaffected in mei-2 crosses that exhibit a strong defect in meiotic pairing of homologous chromosomes

Since its discovery in N. crassa nearly 30 years ago, the existence of RIP has been experimentally confirmed in several ascomycete species including Magnaporthe oryzae, Podospora anserina, Leptosphaeria maculans, and Fusarium graminearum (45). Further bioinformatic analysis of the sequenced genomes of filamentous fungi revealed widespread signatures of RIP-like mutation (46–48). Intriguingly, elevated cytosine-to-thymine mutation of repetitive sequences was also reported in several basidiomycete fungi, suggesting that the evolutionary appearance of RIP had preceded the separation of Ascomycota and Basidiomycota (49). Yet, while being a potentially ancient phenomenon, RIP does not seem to be strongly conserved even between related species. For example, while RIP is particularly strong in N. crassa, it cannot be detected at all in its relative Sordaria macrospora (50).

Methylation Induced Premeiotically

Soon after the discovery of RIP, Christophe Goyon and Godeleine Faugeron (51) provided the first account of a related premeiotic silencing phenomenon that occurred in a distant relative of N. crassa, the ascomycete Ascobolus immersus. Unlike RIP mutations, the silenced phenotype could be spontaneously reversed during subsequent vegetative growth by the loss of cytosine methylation (51). These and other pioneering observations suggested that in A. immersus premeiotic silencing was mediated by cytosine methylation alone and did not involve permanent changes in the DNA sequence. By analogy with RIP, this process was named “Methylation-Induced Premeiotically” or MIP (52).

Further work by Jean-Luc Rossignol, Godeleine Faugeron, and colleagues provided strong evidence that both MIP and RIP represented two manifestations of the same basic process (43, 52, 53). Just like RIP, MIP detects gene-sized duplications of chromosomal DNA involving either native or foreign sequences (54). For a duplication of a given length, closely positioned repeats were always detected more efficiently than widely separated repeats. Similarly to RIP (55), MIP also appears to sense repeats in a pairwise fashion: if three repeat copies were present in the same nucleus, normally two or, less frequently, three copies would become inactivated, whereas silencing of only one copy was rarely observed (54). Further analysis of the homology length requirements for MIP showed that closely positioned repeats as short as 317 bp (56) and widely separated repeats as short as 1.2 kbp (57) could be detected by MIP. Very similar length requirements had been reported for RIP (42–44), and in both processes cytosine modifications occurred primarily over the extent of the repeated region.

CYTOSINE METHYLTRANSFERASES MASC1 AND RID AS THE CANONICAL MEDIATORS OF MIP AND RIP

The discovery of Masc1 (Methyltransferase from Ascobolus 1) by Fabienne Malagnac and colleagues offered the first mechanistic insight into the nature of MIP and, consequently, RIP (58). Masc1 was identified by cloning a short fragment of the masc1 gene using a degenerate PCR approach. Because Masc1 appeared to play an essential role during the sexual stage in A. immersus, its role in MIP could only be examined in heterozygous conditions. In those situations, MIP could still be reduced by more than two orders of magnitude, thus showing a strong dependence on Masc1. Interestingly, cytosine methylation was still maintained in postmeiotic cells carrying the silenced copy of masc1 (58). This finding suggested that another, Masc1-independent mechanism was responsible for propagating cytosine methylation induced by Masc1 in the premeiotic cells. Because Masc1 played an essential role during the sexual cycle and because it specifically affected DNA repeats, it was proposed to regulate timely expression of developmental genes associated with naturally occurring instances of repetitive DNA (58). By this proposal, MIP could fulfill both the genome-defense and regulatory functions.

N. crassa has only one Masc1 homolog, and studies by Michael Freitag, Eric Selker, and colleagues showed that it also played a critical role in RIP (59). Because of its essential role in RIP in N. crassa, the protein was named “RIP Defective” or RID. Unlike the absence of Masc1, the absence of RID does not result in any apparent defect during the sexual cycle, as homozygous ridΔ/Δ crosses develop normally and still produce large numbers of viable progeny spores (59). This finding suggests that, unlike Masc1 in A. immersus, RID may play a more specialized role in N. crassa, perhaps restricted only to its genome defense function during RIP. One RID homolog is also present in S. macrospora, which does not appear to have active RIP but which still contains a surprisingly low number of TEs, hinting at a possibility that a related RID-dependent genome defense mechanism could be involved (50).

Subsequently, Masc1/RID homologs were identified in nearly all ascomycetes, where they appear to play a conserved role during the sexual stage (60). For example, another RID homolog, DmtA, was recently implicated in regulation of asexual development and secondary metabolism in the ascomycete Aspergillus flavus (61). Among all known representatives of this ancient clade of putative cytosine methyltransferases, its founding member Masc1 still appears to have the most prototypical structure. The predicted amino acid sequence of Masc1 includes a compact catalytic domain of about 300 amino acids and an N-terminal region of about 200 amino acids. Neurospora RID, in addition to the catalytic and the N-terminal regions, also includes a long C-terminal extension of 260 amino acids that is also present in the homolog of RID in S. macrospora (50). The N-terminal regions of both Masc1 and RID can be fully aligned with the corresponding region of the mammalian DNMT1 and contain a weak signature of a bromo-adjacent homology (BAH) domain (62). Interestingly, Masc1/RID proteins comprise a very ancient group of proteins that emerged long before the separation of animals, plants, and fungi (58, 63), yet these proteins can only be found at the present moment in the subphylum of Pezizomycotina.

Intriguingly, all Masc1/RID proteins appear to have the structure of Motif VI that deviates from the structure of Motif VI in all other prokaryotic and eukaryotic C5-cytosine methyltransferases (63). The canonical amino acid sequence of Motif VI contains the ENV (glutamate-asparagine-valine) triad. The glutamate residue is absolutely essential for catalysis (64) and it is also conserved in the Masc1/RID proteins (Fig. 1). The adjacent asparagine residue (ENV) interacts with the proline residue of the catalytic triad PCQ (in Motif IV) and thus plays an important architectural role by controlling the positions of the ENV and PCQ triads with respect to one another (65, 66). The valine residue of ENV also appears to be functionally important, because changing it to alanine in the model prokaryotic methyltransferase M.HhaI inactivates the enzyme (67). Yet, in all Masc1/RID proteins, the NV diad is replaced with either QT (e.g., in Neurospora RID) or ET (e.g., in Ascobolus Masc1), hinting at the possibility that these enzymes might have evolved unique catalytic and/or substrate requirements. Interestingly, the catalytic activity of the purified Masc1 protein could not be detected in vitro (58), thus further supporting the idea that Masc1/RID proteins may be different from the other C5-cytosine methyltransferases and/or may require auxiliary factors for catalysis.

FIGURE 1.

The structure of Motif VI in Masc1/RID proteins is not canonical. The canonical Motif VI contains the absolutely conserved NV diad (asparagine-valine). This diad is present in all C5-cytosine methyltransferases except Masc1/RID. The asparagine residue of NV physically interacts with the proline residue of the catalytic triad PCQ (in Motif IV) and thus plays a critical role by controlling the positions of these segments with respect to one another in the native structure of the protein. The valine residue of NV is also functionally important, as its substitution for alanine is known to inactivate the catalytic activity of M.HhaI. Yet in all Masc1/RID proteins the NV diad is replaced with either QT (e.g., in Neurospora RID) or ET (e.g., in Ascobolus Masc1), hinting at the possibility that Masc1/RID proteins might have unique catalytic and/or substrate requirements.

UNDERSTANDING THE HOMOLOGY REQUIREMENTS FOR RIP

The apparently pairwise nature of repeat mutation during RIP (55, 68) implied that the process of repeat recognition might also involve a pairwise comparison of DNA sequences. The discovery that RIP proceeded normally in the absence of SPO11 and MEI-3 (the only RecA homolog in Neurospora) excluded the most obvious recombination-mediated pairing mechanism (69, 70), yet it did not provide any new leads toward understanding the nature of the homology search process during RIP.

A Quantitative Measure of RIP Mutation

Additional insights into the mechanism of repeat recognition for RIP were obtained by systematically investigating the homology requirements of this process. As a starting point, a sensitive assay for RIP was developed in N. crassa, by which the propensity of engineered repeats to induce RIP mutation could be accurately quantified (69, 70). By this approach, each tested repeat construct was first integrated into a standard chromosomal locus of a standard Neurospora strain. It was then exposed to RIP during the sexual phase, while the occurrence of C-to-T (and G-to-A) mutations was revealed by sequencing the entire construct in a sample of randomly selected progeny spore clones (62). This quantitative approach was made possible by some specific features of Neurospora reproductive biology, by which individual DNA molecules were subjected to RIP over the course of several days, while the final molecular product of RIP was released from the perithecium as a spore clone that could be recovered by PCR and sequenced. A bioinformatic pipeline was developed to detect and analyze the occurrence of C-to-T and G-to-A mutations along each individual RIP product. By this approach, new aspects of RIP could be revealed by evaluating the dependence of RIP mutation on the configuration and homology relationships of strategically designed repeat units.
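The mutation-calling step of such a pipeline can be illustrated with a minimal sketch. The code below is not the published pipeline; it assumes that each sequenced product has already been aligned to the unmutated construct without gaps, and the function names (rip_mutations, mean_mutations) are hypothetical. It records every position at which the reference carries C and the product carries T, or the reference carries G and the product carries A (the same C-to-T change read from the opposite strand).

```python
from typing import List, Tuple


def rip_mutations(reference: str, product: str) -> List[Tuple[int, str]]:
    """List (position, kind) for each C-to-T or G-to-A difference.

    Assumption: both sequences are already aligned and have equal length.
    Positions are 0-based offsets into the reference.
    """
    if len(reference) != len(product):
        raise ValueError("sequences must be pre-aligned to equal length")
    hits = []
    for pos, (ref, obs) in enumerate(zip(reference.upper(), product.upper())):
        if ref == "C" and obs == "T":
            hits.append((pos, "C>T"))
        elif ref == "G" and obs == "A":
            # A C-to-T change on the opposite strand reads as G-to-A here.
            hits.append((pos, "G>A"))
    return hits


def mean_mutations(reference: str, products: List[str]) -> float:
    """Mean number of RIP-type mutations per sequenced product."""
    if not products:
        raise ValueError("at least one product is required")
    return sum(len(rip_mutations(reference, p)) for p in products) / len(products)


if __name__ == "__main__":
    ref = "ACGTCCGA"          # toy reference, not a real construct
    mutated = "ATGTTCAA"      # toy product carrying three RIP-type changes
    print(rip_mutations(ref, mutated))
    print(mean_mutations(ref, [mutated, ref]))
```

Differences of any other kind are ignored in this sketch; a real analysis would also need to handle sequencing errors and alignment gaps.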

The assay also incorporated a number of features that permitted reliable quantitative readout of RIP (62). First, RIP activity was expressed and analyzed as the total number of mutations. Second, RIP activity was maximized by assaying pairs of closely positioned repeats. Third, the analysis was focused on meiotic spores produced relatively late in the cross, as such spores were associated with intrinsically higher levels of RIP. Fourth, noting that RIP mutation could vary substantially between different strains and between spore clones in the same cross, the extraneous variation in RIP activity was minimized by (i) integrating all repeat constructs into the same locus (csr-1) of the same strain (FGSC#9720), and (ii) crossing each transformant (as a male parent) to the same female strain (FGSC#4200). This particular pair of strains (FGSC#4200 × FGSC#9720) was chosen because they consistently produced a very large number of viable progeny spores, thus providing an adequate statistical population from which individual RIP products could be sampled. In this case, per-spore variability in RIP levels could be addressed by sampling a substantial number of spores from these large populations (62). Finally, the variation in RIP levels between experiments involving different repeat constructs was further reduced by analyzing constructs in which one (and always the same) repeat copy was held constant, while the other repeat copy could be manipulated as desired (both with respect to its length and orientation as well as its base-pair sequence).

Quantifying RIP Response to Perfect Homologies of Varying Length

During RIP, the presence of homologous DNA sequences results in characteristic mutations over these same sequences. In order to understand the basic relationship between the length of homology and the number of ensuing mutations, the response of RIP to repeats of graded length (ranging from 155 to 802 base pairs) was measured (69). It was discovered that 155 base pairs of perfect homology could trigger very low but still detectable levels of RIP. The fact that this homology length was significantly shorter than the previously published threshold (71) suggested that the assay indeed provided a very sensitive readout of RIP activity. Intriguingly, further analysis showed that the mean number of RIP mutations over a region of interest shared between all the constructs was strongly proportional to the length of homology, specifically for repeats ranging between 220 and 520 base pairs (69). This finding suggested that the number of mutations could be used as a sensitive and accurate measure of DNA homology in the length range amenable to experimentation.

Recognition of Weak Interspersed Homologies by RIP

Recognition of partially homologous repeats by RIP in N. crassa was previously examined by Eric Selker and colleagues (72). In this pioneering work, instances of partial homology were generated by the process of RIP mutation itself. It was discovered that new cycles of RIP could be triggered by previously mutated repeats that retained more than 80% of sequence identity (72). This result suggested that the mechanism of repeat recognition for RIP required substantial DNA homology. It was consistent with an idea that DNA homology was sensed in the context of an uninterrupted alignment of single strands, as is the case for the canonical, recombination-mediated homology recognition.

Intriguingly, recent studies revealed that, despite being very sensitive to the overall amount of homology, RIP could easily “overlook” substantial interruptions in homology if such interruptions occurred in the middle of the repeated sequence (69). This result hinted at a possibility that the mechanism of homology recognition for RIP did not necessarily require a continuous sequence alignment. The next question raised by these results was the following: if RIP could readily detect trivial cases of interrupted homology, what would be the weakest (or the atomic) instance of interrupted homology that could still be recognized by RIP?

The advent of efficient DNA synthesis technologies permitted systematic analysis of the homology requirements for RIP (69). Because RIP was particularly sensitive to the overall amount of homology in the range of 220 to 520 base pairs (69), synthetic DNA segments of the relevant size could be made rapidly and inexpensively, and thus many such segments could be tested and analyzed for their ability to promote RIP (62, 69). By this approach, it became clear that homology patterns with overall sequence identity much less than 80% could still be detected by RIP. Systematic examination of many such patterns suggested that only specific homology configurations could be recognized. All “active” homology patterns consisted of short homologous units (3 base pairs or more) that had to be spaced at regular intervals of 11 or 12 base pairs along the overall extent of homology (69, 70). Figure 2 provides an example of one such pattern, in which homologous units of 4 base pairs are interspersed with the periodicity of 11 base pairs along the total length of 500 base pairs. Importantly, in order for this process to work, DNA segments had to be coaligned along their length. This result suggested that homology was detected between coaligned and rigid objects. When considered together with the periodicity requirement, these results suggested that these rigid objects could be the double helices of DNA. Thus, a model was proposed by which homologous DNA molecules could become engaged in multiple, interspersed, triplet-mediated, cooperative interactions taking place over the length of several hundred base pairs. Additional results suggested that RIP mutation could in fact be decoupled from the process of homology recognition, suggesting a possibility in which the DNA-modifying activities were recruited to DNA segments after they had been defined as homologous.
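To make the arithmetic of such a pattern concrete, the following sketch builds a partner sequence that matches a reference only in 4-bp units repeated every 11 base pairs, and then computes the resulting overall identity. It is an illustration under stated assumptions rather than the design procedure used in the cited studies: the toy reference sequence, the fixed substitution table MISMATCH, and the function names are all hypothetical, and repeat orientation is ignored.

```python
# Hypothetical substitution table: every base is swapped for a different one,
# so positions outside the homologous units are guaranteed mismatches.
MISMATCH = {"A": "C", "C": "G", "G": "T", "T": "A"}


def interspersed_partner(reference: str, unit: int = 4, period: int = 11) -> str:
    """Copy `unit` bases at the start of every `period`-bp frame; mismatch the rest."""
    if not 0 < unit <= period:
        raise ValueError("unit must be between 1 and period")
    return "".join(
        base if i % period < unit else MISMATCH[base]
        for i, base in enumerate(reference.upper())
    )


def identity(a: str, b: str) -> float:
    """Fraction of aligned positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    reference = "ACGT" * 125                  # toy 500-bp reference
    partner = interspersed_partner(reference)
    print(partner[:22])                        # first two 11-bp frames
    print(f"{identity(reference, partner):.3f}")
```

With 4 matching positions in every 11, the overall identity of such a pair is close to 4/11 (about 37%), which is far below the 80% figure discussed above even though the matches are distributed regularly along the entire 500-bp segment.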

FIGURE 2.

Recognition of interspersed homology during RIP in N. crassa. This assay detects and quantifies the occurrence of RIP mutations in response to engineered DNA repeats. Instances of DNA homology are created between two short segments of chromosomal DNA, one of which is normally represented by an endogenous sequence, while the sequence and orientation of the other segment can be manipulated as desired. In this situation, the number of RIP mutations provides a very sensitive readout of DNA homology perceived by the recombination-independent mechanism of repeat recognition for RIP. (A) Weak interspersed homology is formed between the endogenous 500-bp segment (blue) and a synthetic DNA segment (green) integrated at a nearby position as the replacement of the cyclosporin-resistant-1 (csr-1) gene. This particular pattern involves 4-bp units of homology spaced with the periodicity of 11 base-pairs and exists between “repeat units” in the inverted orientation. (B) Pairwise sequence comparisons showing all matches of 4 base pairs long. Two situations are presented: random homology (left panel) and interspersed homology (right panel). No cryptic homology can be seen except the intended pattern of weak interspersed homology (magenta box). (C) The occurrence of mutations induced by weak interspersed homology. Seventy progeny spores from the “XKO” cross (70), which had been previously found to contain at least one RIP mutation, were reanalyzed by sequencing of additional 255 base pairs in the “left” flank of the construct (corresponding to the single-copy coding/translated sequence of NCU00725).
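A pairwise comparison of the kind described for panel B can be sketched as a search for all shared words of a fixed length. The snippet below is a hedged illustration, not the code used to draw the figure; the function name kmer_matches and the toy sequences are assumptions. It returns the coordinates of every exact match of k base pairs between two sequences, which are the points that would be drawn in a dot plot.

```python
from typing import Dict, List, Tuple


def kmer_matches(a: str, b: str, k: int = 4) -> List[Tuple[int, int]]:
    """Return (i, j) for every k-mer starting at a[i] that also starts at b[j]."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Index all k-mers of b by their start positions.
    index: Dict[str, List[int]] = {}
    for j in range(len(b) - k + 1):
        index.setdefault(b[j:j + k], []).append(j)
    # Look up every k-mer of a in that index.
    return [
        (i, j)
        for i in range(len(a) - k + 1)
        for j in index.get(a[i:i + k], [])
    ]


if __name__ == "__main__":
    # Toy sequences sharing a single 4-bp word (ACGT).
    print(kmer_matches("ACGTAA", "TTACGT", k=4))
```

In a plot of these coordinates, a designed interspersed pattern would appear as regularly spaced points along one diagonal, whereas matches between unrelated sequences would be scattered.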

THE HETEROCHROMATIN-RELATED PATHWAY OF RIP

The Universal Phenomenon of Heterochromatic Silencing of Repetitive DNA

In the genomes of nearly all eukaryotic organisms, highly repetitive DNA sequences are organized in the form of constitutive heterochromatin, a specialized state of chromatin that remains condensed during interphase and is associated with conserved repressive epigenetic modifications, such as di- and trimethylated lysine-9 residues of histone H3 (H3K9me2/3). The apparent mechanistic connection between the repetitive nature of the genetic material and its heterochromatic nature was noted by Guido Pontecorvo in 1944, when he wrote that “a heterochromatic segment should arise every time that a minute euchromatic region undergoes repeated reduplications in the genotype and the replicas remain adjacent to each other on the chromosome” (73).

The idea that the heterochromatic state can be induced by homologous interactions between repeat elements was further advanced by Douglas Dorer and Steven Henikoff in 1994 (74). Working with Drosophila melanogaster, the authors generated a series of tandemly repeated mini-white transgenes and discovered that the level of silencing of these repetitive transgenes depended on the number of repeat copies. Based on these and other results, it was suggested that heterochromatin proteins could specifically recognize the higher-order chromatin structures produced by the stable, homology-dependent association of nearby repeat copies (74). By this proposal, and in accord with the original idea of Pontecorvo, heterochromatin could form on repetitive DNA of any particular sequence, simply because of its ability to pair with itself and produce higher-order structures. The authors also noted that such homology-directed pairing could underlie silencing of long tracts of simple trinucleotide repeats (74). Notably, this same idea, by which the expanded trinucleotide arrays could act as two closely positioned repeat copies, was also proposed by Christophe Goyon, Jean-Luc Rossignol, and colleagues in the context of their studies on MIP (56).

Subsequent studies in D. melanogaster by Sarah Elgin and colleagues, using some of the original mini-white arrays of Dorer and Henikoff, showed that the heterochromatic state of these repeats was dependent on the presence of functional RNAi (75). Furthermore, around the same time, the key role of RNAi in heterochromatic silencing was also demonstrated in fission yeast (76, 77) and in plants (78), suggesting that the presence of tandem repeats per se was not sufficient, at least in some situations, to program the formation of functional heterochromatin.

The Canonical Heterochromatin Pathway in Neurospora

In N. crassa, the bulk of H3K9me3 and C5-cytosine methylation can be found in the context of AT-rich DNA that was previously mutated by RIP (33). All mechanistic aspects of this pathway were elucidated largely by the work of Eric Selker and colleagues (33). In brief, the canonical pathway starts with RIP generating AT-rich DNA in premeiotic cells, which then provides a signal for the recruitment of heterochromatin factors in vegetative cells (33). The key heterochromatin mark, H3K9me3, is catalyzed by the conserved SUV39 methyltransferase DIM-5 (33). H3K9me3 is recognized by the Neurospora HP1, which directly recruits the cytosine methyltransferase DIM-2 to DNA (33). This pathway involves some additional factors (33), and, in some situations, it can deviate from its canonical form (79).

Both RIP and repeat-induced heterochromatic silencing may be considered as serving the same purpose of suppressing repetitive DNA. Yet, in Neurospora, the relationship between RIP (which occurs in premeiotic cells) and heterochromatic silencing (which occurs in vegetative cells) presented a puzzle (43). It had long been known that in N. crassa newly introduced highly repetitive transgenes could become associated with H3K9me3 and cytosine methylation in vegetative cells, prior to RIP and in the absence of RNAi (20–23, 80, 81). These results, which were obtained by several independent research groups, supported the idea that the presence of repetitive DNA could induce heterochromatin directly, in accord with the original proposal by Pontecorvo (73) and Dorer and Henikoff (74).

Neurospora RIP Can Be Mediated by the Heterochromatin Pathway

Recently, it was discovered that Neurospora RIP could occur in the absence of RID, suggesting that (i) RID was not essential for the process of repeat recognition per se, and (ii) RID was not the only mediator of RIP mutation (82). Surprisingly, it was further discovered that all RID-independent RIP required DIM-2 and the catalytic activity of DIM-5 (82). Further investigation revealed that the DIM-5/DIM-2 pathway of RIP (i) was triggered specifically by the presence of repetitive sequences, (ii) could respond to repeats present at closely positioned as well as widely separated genomic positions, and (iii) mutated repeats that had never been previously subjected to RIP. Importantly, the DIM-5/DIM-2-mediated RIP could also be activated by weak interspersed homologies, in which 3-bp or 4-bp units of homology were arrayed with an 11-bp periodicity, supporting the idea that both the RID- and DIM-5/DIM-2-dependent pathways of RIP are triggered by the same homology signal. In every case examined, DIM-5/DIM-2-mediated mutations typically spread from the repetitive elements into the neighboring single-copy regions, whereas RID-mediated mutations are largely restricted to the repeats per se (82).

A Putative Two-Step Mechanism of RIP Mutation

The original discovery of RID as a principal mediator of RIP (59) presented a puzzle: while RIP involves C-to-T mutation, RID resembles a canonical C5-cytosine methyltransferase. Although the predicted catalytic site of RID differs from all other C5-cytosine methyltransferases (above), so does the active site of Masc1, which mediates cytosine methylation, without any sign of mutation (58). Thus, it was originally proposed that the modification of cytosines during RIP might occur as a two-step process, in which C5-methylation by RID would be followed by N4-deamination of the C5-methylated cytosine by an unrelated enzymatic activity (59). An alternative proposal was also put forward by which RID alone could mediate C-to-T mutation, via modulation of its catalytic activity (59). This hypothesis was supported by earlier studies (83, 84), which implicated canonical cytosine methyltransferases (such as the prokaryotic site-specific methyltransferase M.HpaII) in catalyzed deamination of cytosines under conditions of extremely low levels of the methyl group donor S-adenosylmethionine (SAM).

The finding that DIM-2 can mediate RIP provides additional support for the two-step mechanism. In vegetative cells, DIM-2 normally catalyzes cytosine methylation in all dinucleotide contexts (85). Yet DIM-2-dependent RIP exhibits a very strong bias for CpA dinucleotides (82). While it is formally possible that the intrinsic substrate preference of DIM-2 differs between premeiotic and vegetative cells, the strong CpA bias of DIM-2-mediated RIP mutation argues in favor of two separate enzymatic activities that catalyze cytosine methylation and deamination. By this model, repeat-induced cytosine methylation (by either DIM-2 or RID) would be converted to mutation (5meC → T) by a deaminase activity that is specifically present during the premeiotic (germ line) phase. Interestingly, novel deaminase activities (involved in A → I editing of RNA) were recently identified in Neurospora and some other filamentous fungi (86, 87). These activities are only present during the sexual stage, as would be expected of the hypothesized deaminases involved in RIP. Nevertheless, it is important to note that in Neurospora vegetative cells, DIM-2 can also methylate coding, nonrepetitive parts of the genome (88). If cytosine methylation of genic regions also takes place in premeiotic cells, then the proposed two-step mechanism would further require the deamination step to be homology-dependent.
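A dinucleotide bias of the kind discussed above can be measured by tallying each C-to-T change according to the base on its 3′ side. The Python sketch below shows one way to do this for a pair of aligned, equal-length parental and progeny sequences. It is a minimal illustration under stated assumptions, not the analysis pipeline of the cited work: the function names and the toy sequences are hypothetical, and G-to-A changes on the given strand are counted as C-to-T changes on the complementary strand.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA sequence (uppercase A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]

def c_to_t_contexts(parent, progeny):
    """Count C-to-T mutations by dinucleotide context (mutated C plus its
    3' neighbor in the parent), scanning both strands. A mutated C at the
    very 3' end of a strand has no neighbor and is skipped."""
    if len(parent) != len(progeny):
        raise ValueError("sequences must be aligned and of equal length")
    counts = Counter()
    strands = ((parent, progeny), (revcomp(parent), revcomp(progeny)))
    for p, q in strands:
        for i in range(len(p) - 1):
            if p[i] == "C" and q[i] == "T":
                counts["Cp" + p[i + 1]] += 1
    return counts

# Toy example (hypothetical sequences): one C-to-T change in a CpA context.
parent = "GGCATTCGAA"
progeny = "GGTATTCGAA"
print(dict(c_to_t_contexts(parent, progeny)))  # {'CpA': 1}
```

Comparing such counts with the number of available CpA, CpC, CpG, and CpT sites in the parental sequence would give a normalized estimate of the bias; that normalization is omitted here to keep the sketch short.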

Hypothesis: Nucleation of Heterochromatin by Homologous dsDNA/dsDNA Interactions

The newly discovered role of the DIM-5/DIM-2 pathway in RIP suggests that homologous interactions between coaligned double-stranded segments of chromosomal DNA, which were previously proposed to mediate repeat recognition for RIP (69), can provide a specific signal for the recruitment of these heterochromatin factors. It has been known for a long time that in N. crassa the same pathway can mediate RNAi-independent heterochromatic silencing of repetitive transgenes in vegetative cells. Taken together, these results suggest that the heterochromatic silencing of transgenes in vegetative cells can also be programmed by interactions between homologous double-stranded DNA molecules. By this general model, the occurrence of homologous dsDNA/dsDNA interactions provides a physical signature of DNA homology.

In Neurospora vegetative cells, homology-directed heterochromatic silencing could play a backup role and/or could come into play in situations where repetitive DNA is newly introduced prior to RIP. Intriguingly, the structure and function of DIM-5 are conserved in other fungi, as well as in plants and animals, hinting at the possibility that homology-directed heterochromatic silencing may be a general phenomenon. This homology-directed process can function in combination with other, more canonical silencing mechanisms. In this case, transient H3K9me3 marks, induced by dsDNA/dsDNA interactions, can be amplified and propagated by additional mechanisms involving site-specific DNA binding proteins, such as Zeste (74, 89), or RNA (74). Because the proposed pairwise dsDNA/dsDNA interactions involving nearby repeats can occur in several successive rounds, and because multiple independent interactions may take place over a long repeat array at the same time, this process could be particularly efficient over large repetitive regions, such as those found in (peri)centromeric and (sub)telomeric regions of chromosomes.

CONCLUDING REMARKS

Because of their ability to move between genetic lineages and proliferate exponentially, TEs pose an existential threat to genome stability. The evolutionary race between TEs and the host genome defense systems has produced a large repertoire of solutions to the basic problem of keeping mobile repetitive DNA in check. Most of these solutions are not absolute, and none of them (with the exception of RIP in N. crassa) can neutralize all TEs at once. Yet the emergence of especially strong RIP in N. crassa has come at a cost, as this organism appears to lack gene duplications, a major source of evolutionary novelty (90).

The genomes of most eukaryotes contain large amounts of TEs and other forms of repetitive DNA that are normally silenced in the form of constitutive heterochromatin. The newly discovered role of the heterochromatic pathway in Neurospora RIP suggests that these two phenomena may be evolutionarily related. In this case, RIP may simply represent a specialized heterochromatin-related process that terminates in cytosine-to-thymine mutation instead of cytosine methylation. Previous studies have indicated that the mechanism of repeat recognition for RIP may involve interactions between coaligned double-stranded DNA molecules. A possibility now emerges that such direct dsDNA/dsDNA interactions may underlie a much more general phenomenon of heterochromatic silencing of repetitive DNA, possibly representing a primordial genome defense mechanism. Further analysis of homology-directed silencing phenomena will likely illuminate the fundamental properties of this mechanism in fungi and, potentially, in other eukaryotic organisms.

References
