Recombination-independent recognition of DNA homology for repeat-induced point mutation
DOI: 10.1007/s00294-016-0649-4
- Cite this article as:
- Gladyshev, E. & Kleckner, N. Curr Genet (2016). doi:10.1007/s00294-016-0649-4
Abstract
Numerous cytogenetic observations have shown that homologous chromosomes (or individual chromosomal loci) can engage in specific pairing interactions in the apparent absence of DNA breakage and recombination, suggesting that canonical recombination-mediated mechanisms may not be the only option for sensing DNA/DNA homology. One proposed mechanism for such recombination-independent homology recognition involves direct contacts between intact double-stranded DNA molecules. The strongest in vivo evidence for the existence of such a mechanism is provided by the phenomena of homology-directed DNA modifications in fungi, known as repeat-induced point mutation (RIP, discovered in Neurospora crassa) and methylation-induced premeiotically (MIP, discovered in Ascobolus immersus). In principle, Neurospora RIP can detect the presence of gene-sized DNA duplications irrespective of their origin, underlying nucleotide sequence, coding capacity, or relative and absolute positions in the genome. Once detected, both sequence copies are altered by numerous cytosine-to-thymine (C-to-T) mutations that extend specifically over the duplicated region. We have recently shown that Neurospora RIP does not require MEI-3, the only RecA/Rad51 protein in this organism, consistent with a recombination-independent mechanism. Using an ultra-sensitive assay for RIP mutation, we have defined additional features of this process. We have shown that RIP can detect short islands of homology of only three base-pairs as long as many such islands are arrayed with a periodicity of 11 or 12 base-pairs along a pair of DNA molecules. While the presence of perfect homology is advantageous, it is not required: chromosomal segments with overall sequence identity of only 35–36 % can still be recognized by RIP. Importantly, in order for this process to work efficiently, participating DNA molecules must be able to co-align along their lengths.
Based on these findings, we have proposed a model in which sequence homology is detected by direct interactions between slightly extended double-stranded DNAs. As a next step, it will be important to determine if the uncovered principles also apply to other processes that involve recombination-independent interactions between homologous chromosomal loci in vivo as well as to protein-free DNA/DNA interactions that were recently observed under biologically relevant conditions in vitro.
Keywords
DNA · Homology recognition · Homologous pairing · Recombination-independent · Mutation · RIP
Recognition of DNA homology in the absence of breakage and recombination
The ability of homologous chromosomes (or individual chromosomal loci) to specifically pair with one another in the apparent absence of DNA breakage and recombination is a prominent feature of chromosomal biology. A paradigmatic example of such recombination-independent pairing is provided by the persistent association of homologous chromosomes in somatic cells of Drosophila and other dipteran insects (Duncan 2002). Somatic pairing of homologous loci was reported in S. cerevisiae (Weiner and Kleckner 1994; Keeney and Kleckner 1996; Burgess et al. 1999; Burgess and Kleckner 1999; Cha et al. 2000; Dekker et al. 2002) and S. pombe (Scherthan et al. 1994; Molnar and Kleckner 2008). Numerous instances of transient locus-specific pairing have also been documented during mammalian development, where they have been implicated in the regulation of gene expression, including X-chromosome inactivation (Apte and Meller 2012).
Recombination-independent homologous pairing is also featured during meiosis, where it normally occurs prior to, and independent of, the Spo11-induced DNA breaks that initiate recombination. Meiotic recombination-independent pairing has been observed in S. cerevisiae (Weiner and Kleckner 1994; Cha et al. 2000). Substantial levels of meiotic recombination-independent pairing have also been discovered in M. musculus (Boateng et al. 2013; Ishiguro et al. 2014). In C. elegans, Spo11-independent homologous pairing occurs, most prominently, between specialized regions (“pairing centers”) near one end of each chromosome, and also between numerous interstitial sites along each chromosome pair (reviewed in Tsai and McKee 2011; Rog and Dernburg 2013). In Drosophila meiosis, recombination-independent pairing of homologous chromosomes normally precedes the formation of DNA breaks in females (Lake and Hawley 2012), and fully substitutes for recombinational mechanisms in males (McKee et al. 2012). More generally, there is a tendency for the early meiotic recombination-independent pairing of homologous telomeres or centromeres (Stewart and Dawson 2008; Klutstein and Cooper 2014).
In all of these cases, the informational basis by which homologous DNA sequences are recognized remains unclear (Barzel and Kupiec 2008). Many different models have been proposed, including transcription-dependent (Cook 1997), protein-mediated (e.g., Ishiguro et al. 2014), and RNA-mediated (e.g., Ding et al. 2012) co-localization, as well as direct interactions between intact DNA molecules (McGavin 1977; Keeney and Kleckner 1996).
Repeat-induced DNA modifications in filamentous fungi
Fig. 1 Phenomenon of repeat-induced point mutation (RIP). RIP can detect homology between DNA sequences exhibiting a wide range of particular base-pair compositions, transcriptional capacities, and relative as well as absolute positions in the genome (Galagan and Selker 2004). Both sequence copies undergo mutation by numerous C-to-T transitions specifically over the extent of shared homology. RIP occurs during the premeiotic stage, after fertilization but before karyogamy, in parental haploid nuclei that continue to divide by mitosis.
Fig. 2 RIP detects widely separated repeats as short as 406 base-pairs. a Unlinked duplications of the pan-2 gene spanning 534 (right) and 742 (left) base-pairs are mutated by RIP. Fragments of the pan-2 gene corresponding to the chromosomal regions 3265392–3265925 (534 bp) and 3265184–3265925 (742 bp) on Supercontig_12.6 (Chr. VI) were cloned into the plasmid pEAG66 (Gladyshev and Kleckner 2014) between the restriction sites SacII and StuI and integrated into the standard strain FGSC#9720 (Colot et al. 2006) as the replacement of the csr-1 gene. Crosses were set up and mutation of the ectopic pan-2 copies was analyzed exactly as previously described (Gladyshev and Kleckner 2014). Thirty random “late-arising” progeny spores were examined in each case. The number of spores with at least one RIP mutation was 3 (10 %) for the 534-bp repeat, and 12 (40 %) for the 742-bp repeat. b A 406-bp fragment of the csr-1 gene corresponding to the chromosomal region 7404971–7405376 on Supercontig_12.1 (Chr. I) was cloned into the plasmid pMF272 (Freitag et al. 2004) between the restriction sites NotI and EcoRI and integrated into the standard strain FGSC#9720 (Colot et al. 2006) near the his-3 gene, 2.7 Mbp away from the endogenous locus. Inactivation of the endogenous csr-1 gene by RIP produces cyclosporin-resistant progeny that can be selected by plating ejected spores en masse on sorbose agar with Cyclosporin A (5 μg/ml). 24 cyclosporin-resistant progeny were collected for analysis. Each sequenced csr-1 allele contained at least one mutation that could be attributed to RIP. No other sequence changes except those apparently produced by RIP could be detected, suggesting that mutation of the endogenous csr-1 gene can be used as a sensitive genetic test for RIP activity.
Neurospora RIP evolved to be particularly efficient; however, RIP-like mutation has been experimentally demonstrated in several species of filamentous fungi (reviewed in Hane et al. 2015), and signatures of RIP mutation have been detected in nearly all sequenced genomes of Pezizomycotina (Clutterbuck 2011; Amselem et al. 2015; Testa et al. 2016). Moreover, a similar homology-sensing process, known as Methylation Induced Premeiotically (MIP), occurs in a very distant relative of Neurospora, the filamentous fungus Ascobolus immersus (Rossignol and Faugeron 1994). During MIP, DNA repeats are similarly recognized during the premeiotic phase of the sexual stage, but instead of mutation, their cytosines undergo C5 methylation, which is then stably maintained through meiosis and during vegetative growth.
Fig. 3 Neurospora RIP is mediated by a putative C5-cytosine methylase RID (RIP Deficient). a Structure of RID includes a conserved C5-cytosine methyltransferase domain (shown in blue) flanked by the N-terminal and C-terminal regions. The C-terminal region is absent in Masc1 (a homolog of RID that mediates a closely related phenomenon of “methylation induced premeiotically” in the fungus Ascobolus immersus). The difference in length between the methyltransferase domains of DNMT1, M.HaeIII and RID/Masc1 is largely explained by the reduction of the target recognition domain (TRD) in RID/Masc1 (shown in cyan). The structure of DNMT1 includes other conserved domains in the N-terminal extension that are omitted here for clarity. GenBank accession numbers are NP_001124295 (hDNMT1), XP_011392925 (RID), AAC49849 (Masc1), and P20589 (M.HaeIII). b N-terminal region of RID (also conserved in Masc1) can be fully aligned with a corresponding segment of the mammalian DNMT1 that includes a bromo-adjacent homology domain BAH2. This region was proposed to mediate putative interactions of DNMT1 with other proteins (Song et al. 2011) and double-stranded DNA (Song et al. 2012). c Predicted atomic structure of the conserved portion of RID (amino-acid positions 61–580): 97 % of residues are modeled at >90 % confidence with Phyre2 (Kelley et al. 2015). Representative structures of DNMT1 (PDB accession number 3PT6, amino-acid positions 928–1602) and M.HaeIII (PDB accession number 1DCT) are also provided. Domains are colored in accord with a and b.
The three-dimensional atomic structure of the conserved portion of RID can be predicted with high confidence using existing structural data on mammalian DNMT1 (DNA-methyltransferase 1) and prokaryotic site-specific C5-cytosine methylases, such as M.HaeIII (Fig. 3c, the model is provided in PDB format as Supplementary Material). The catalytic domains of Masc1 and RID can be fully aligned with prokaryotic methylases except for the target recognition domain (TRD) which appears reduced in the Masc1/RID clade (Fig. 3a). The N-terminal regions of Masc1 and RID can be fully aligned with a corresponding segment of DNMT1 which includes a bromo-adjacent homology domain BAH2 (Fig. 3b). Notably, this region was proposed to mediate the interactions of DNMT1 with other proteins (Song et al. 2011) as well as double-stranded DNA (Song et al. 2012). Because BAH domains have been generally implicated in protein–protein interactions in the context of epigenetic silencing (Yang and Xu 2013), their presence in Masc1/RID hints at the possibility that the recruitment of Masc1/RID to repetitive DNA might also occur by an epigenetic mechanism.
RIP is recombination-independent
RIP and MIP can detect gene-sized duplications present at unrelated genomic positions. This fact suggests that both processes involve a general and efficient DNA homology search. Ever since the discovery of RIP and MIP, their relationship with recombinational mechanisms has repeatedly been questioned (Foss and Selker 1991; Irelan et al. 1994; Goyon et al. 1996; Watters et al. 1999). Two observations in Neurospora provided somewhat conflicting conclusions. On the one hand, RIP coincided with a period of increased intra-chromosomal recombination (Butler and Metzenberg 1993), suggesting a putative functional connection (also discussed in Watters et al. 1999). On the other hand, Neurospora RIP proceeded normally in crosses of mei-2, a mutant that showed a substantial defect in pairing of homologous chromosomes during meiosis (Schroeder and Raju 1991). However, because the identity of mei-2 was and remains unknown, and because the requirements for meiotic pairing are complex, it was hard to interpret these findings specifically.
Fig. 4 Widely separated repeats are detected by RIP in the absence of Spo11 and MEI-3. Each parental strain carries a pair of hygromycin-resistance cassettes (hph) replacing the spo11 and mei-3 genes. All parental hph cassettes have the original sequence introduced by the Neurospora Genome Project (Colot et al. 2006). RIP mutation of the hph cassettes replacing the mei-3 gene was assessed in the same progeny spores used previously to analyze mutation of closely positioned repeats (Gladyshev and Kleckner 2014).
A new sensitive assay for RIP
Fig. 5 Elucidating the homology requirements for RIP. a RIP mutation of closely positioned direct repeats of graded lengths (802, 524, 460, 400, 337, 279, 220, and 155 base-pairs) as reported by Gladyshev and Kleckner (2014). C-to-T and G-to-A mutations are counted together in the longest continuous region shared by all the repeat constructs (the invariant segment, shown in green). b Short interspersed islands of homology can be detected by RIP only when arrayed with an appropriate matching periodicity along the participating DNA segments (Gladyshev and Kleckner 2014). c Efficient recognition of homology for RIP requires the global co-alignment of participating DNA segments (Gladyshev and Kleckner 2014). The basic construct (ii) includes a 337-bp segment of perfect homology (dark blue) and an adjoining 500-bp segment of weak interspersed homology, in which 4-bp homologous units are arrayed with a matching periodicity of 11 base-pairs (light blue/magenta). Disrupting their global alignment significantly attenuates RIP. d When homologous interactions are weakened, the underlying DNA sequence plays a prominent role. i a 500-bp segment of interspersed homology, same as in (c), does not trigger much RIP by itself. ii another instance of the same homology pattern (featuring homologous units of 4 base-pairs arrayed with an 11-bp periodicity over the same total length of 500 base-pairs) triggers a nearly 100-fold stronger RIP response. This effect is attributed to the inclusion of 5′-GAC-3′/5′-GTC-3′ triplets (underlined) into the homologous units in (ii) but not (i) or (iii) (Gladyshev and Kleckner 2016).
This assay incorporates a number of features that permit accurate and unbiased measurement of RIP mutation. First, all repeat constructs are integrated by homologous recombination into the same locus (csr-1) of the same recipient strain (FGSC#9720), and the homokaryotic transformants are crossed as male parents to a standard wild-type strain of the opposite mating type (FGSC#4200), thus minimizing the influence of extraneous genetic variation. Second, one (and always the same) repeat copy is held constant and serves as a reference sequence, while the other repeat copy is altered as desired with respect to its length and base-pair sequence. Third, for each repeat construct of interest, at least 24 progeny spores are randomly sampled from the entire population of ejected “late” spores (many tens of thousands per cross). These random repeat-carrying progeny are then propagated as individual haploid clones. From each such clone, the entire repeat cassette is recovered by PCR and sequenced directly by the Sanger method (Sanger et al. 1977), without library construction that could potentially introduce spurious DNA base-pair changes.
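As a rough illustration of how mutations might be tallied from such sequencing reads, the sketch below compares a progeny sequence with its parental sequence and counts C-to-T and G-to-A changes separately from all other differences (a C-to-T change on one strand reads as G-to-A on the other, which is why the two are counted together in the assay). The function name and the assumption that both sequences are already aligned and of equal length are ours, for illustration only; this is not the published analysis pipeline.

```python
def count_rip_mutations(parental: str, progeny: str) -> dict:
    """Tally differences between two pre-aligned, equal-length sequences.

    Hypothetical helper: assumes no insertions or deletions, so that
    position i in one sequence corresponds to position i in the other.
    """
    if len(parental) != len(progeny):
        raise ValueError("sequences must be pre-aligned and of equal length")
    counts = {"C>T": 0, "G>A": 0, "other": 0}
    for p, q in zip(parental.upper(), progeny.upper()):
        if p == q:
            continue  # unchanged position
        if p == "C" and q == "T":
            counts["C>T"] += 1  # RIP-type change read on the top strand
        elif p == "G" and q == "A":
            counts["G>A"] += 1  # RIP-type change read on the bottom strand
        else:
            counts["other"] += 1  # not attributable to RIP
    return counts


# Toy sequences, invented for this example only.
example = count_rip_mutations("ACGTCCGA", "ATGTCTAA")
print(example)
```

A non-zero "other" count in real data would flag changes that cannot be attributed to RIP, which is the kind of check described in the text for the csr-1 test.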
Understanding the homology requirements for RIP
Using the above approach, we first examined RIP activity as a function of repeat length (Fig. 5a). Here, we found that as few as 155 base-pairs of perfect homology could trigger detectable mutation, and that 400 base-pairs of perfect homology already promoted strong mutation. Furthermore, for repeat lengths ranging between 220 and 520 base-pairs, the amount of perfect homology and the corresponding number of mutations formed a linear relationship on a log–log plot with a Pearson correlation coefficient of 0.9994 and a slope value of 4. This relationship implies that, in this particular context, the number of mutations is proportional to the fourth power of the repeat length. This strong, regular correspondence, while not understood, suggested that the number of mutations could be used as a sensitive measure of homology perceived by RIP.
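The step from a slope of 4 on a log–log plot to a fourth-power law can be written out explicitly. In the sketch below, N is the number of mutations, L is the repeat length, and c is the intercept of the fitted line; these symbols are ours and do not appear in the original reports.

```latex
\log N = 4\,\log L + c
\quad\Longrightarrow\quad
N = e^{c}\,L^{4},
\qquad
\frac{N(L_2)}{N(L_1)} = \left(\frac{L_2}{L_1}\right)^{4}
```

Under this relationship, and only within the length range where it was observed, doubling the repeat length would be expected to raise the number of mutations by a factor of 2^4 = 16.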
Further experiments revealed an unexpected result: despite its remarkable sensitivity to the overall repeat length, RIP easily “overlooked” substantial internal gaps in homology (Gladyshev and Kleckner 2014). For example, we found that RIP was unaffected by the insertion of 19 base-pairs of non-homology in the middle of perfect homology (Gladyshev and Kleckner 2014). This finding, in turn, raised a new question: if the presence of continuous homology was not absolutely required for high levels of RIP, then what was the weakest kind of discontinuous (imperfect) homology that could still promote RIP? We wondered, more specifically, whether the underlying mechanism of homology recognition for RIP might be revealed by the systematic analysis of appropriately designed imperfect homologies.
We pursued this possibility by creating a modified version of the basic repeat construct, considering the fact that interactions between imperfect homologies might be relatively weak. Noting that 220 base-pairs of perfect homology could trigger low yet predictable levels of RIP (Fig. 5a), we designed a new repeat system comprising the same 220 base-pairs of perfect homology (to provide a basal level of RIP activity) plus an additional 200 base-pairs of adjoining imperfect homology, the nature of which could be manipulated as desired (Fig. 5b). As in all our experiments, one repeat copy was represented by endogenous chromosomal DNA (the reference sequence), while the other repeat copy (which included both the 220-bp and the 200-bp parts) was integrated ectopically in close proximity (Fig. 5b).
Among many different ways of designing imperfect homologies, we began by using the basic structure of the DNA double helix, with a canonical pitch of 10.5 base-pairs per helical turn, as a guide. Interspersed homologies were synthesized, in which short homologous units of fixed length (from 2 to 9 base-pairs) were spaced with a periodicity of 11 base-pairs approximating one double-helical turn, with homologous units separated by appropriate regions of randomly defined non-homology. Strikingly, we discovered that homologous units of 3 base-pairs or longer could be readily detected by RIP. We further extended these findings by analyzing longer stand-alone interspersed homologies, where we found that an array of 4-bp homologous units, spaced with an 11-bp periodicity and corresponding to the overall sequence identity of only 36 %, could by itself induce substantial RIP (Fig. 5d, construct ii; Gladyshev and Kleckner 2016).
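The design principle described above can be sketched in a few lines of code. The example below builds a second sequence from a reference so that the first 4 positions of every 11-bp window match and the remaining 7 positions are forced to differ, then reports the resulting sequence identity (4/11, or about 36 %). The function names, the random toy reference, and the forced-mismatch rule are assumptions made for illustration; the actual constructs used "randomly defined non-homology" and real chromosomal sequence.

```python
import random

BASES = "ACGT"


def make_interspersed_copy(reference, unit=4, period=11, seed=0):
    """Return a copy that matches `reference` only in `unit`-bp blocks
    starting every `period` bp (hypothetical helper for illustration)."""
    rng = random.Random(seed)
    copy = []
    for i, base in enumerate(reference):
        if i % period < unit:
            copy.append(base)  # inside a homologous unit: keep the base
        else:
            # outside a unit: pick any base that differs from the reference
            copy.append(rng.choice([b for b in BASES if b != base]))
    return "".join(copy)


def identity(a, b):
    """Fraction of aligned positions at which two equal-length sequences match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Toy 220-bp reference (20 full periods of 11 bp), not a real locus.
ref_rng = random.Random(1)
reference = "".join(ref_rng.choice(BASES) for _ in range(220))
ectopic = make_interspersed_copy(reference)
print(round(identity(reference, ectopic), 3))  # 0.364, i.e. 4/11
```

Changing `unit` to 3 gives an identity of 3/11 (about 27 %), which shows how quickly overall identity falls as the homologous units shorten.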
Additional findings revealed that RIP could sense interspersed homologous units only when they were arrayed with a matching periodicity of 11/11 or 12/12 base-pairs, whereas matching periodicities of 10/10 and 13/13 base-pairs were ineffective (Fig. 5b). Moreover, RIP failed to detect homologous units that were spaced with two different (“mismatching”) periodicities along their respective DNA segments, including a mismatching case of 11/12 base-pairs, implying that homologous units had to be all in a proper phase relationship to be recognized by RIP (Fig. 5b).
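The phase argument can be made concrete with simple bookkeeping. If homologous units are spaced every p_a base-pairs on one DNA segment and every p_b base-pairs on the other, the k-th unit starts at k·p_a on the first and k·p_b on the second, so the two drift apart by k·(p_b − p_a) base-pairs. The sketch below tabulates that drift; the function name and the 200-bp default length are illustrative assumptions. It describes register only and does not explain why matching periodicities of 10 or 13 base-pairs are ineffective.

```python
def unit_offsets(period_a, period_b, total_length=200):
    """Offset (in bp) between corresponding homologous units on two segments.

    Hypothetical helper: unit k starts at k*period_a on one segment and
    k*period_b on the other; only units that fit on both are listed.
    """
    n_units = total_length // max(period_a, period_b)
    return [k * (period_b - period_a) for k in range(n_units)]


matched = unit_offsets(11, 11)     # every unit stays in register
mismatched = unit_offsets(11, 12)  # register slips by 1 bp per unit

print(matched[:5], mismatched[:5])
```

With 11/12 spacing, the units are already several base-pairs out of register after a handful of periods, so no single co-alignment of the two segments can bring all units into register at once.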
The matching-periodicity requirement suggested that homology recognition for RIP involved physical communication along the participating DNA segments and, thus, by implication, their global co-alignment. We provided further experimental support for this idea. First, when a pair of weakly homologous segments (comprising 4-bp homologous units interspersed with an 11-bp periodicity over the total length of 500 base-pairs) were placed next to a 337-bp region of perfect homology (provided in the same relative orientation), they were readily recognized by RIP (Fig. 5c, construct ii). Reversing their relative orientation with respect to the 337-bp region (by flipping the “right” 500-bp fragment) nearly precluded their recognition (Fig. 5c, construct i). Second, a similar outcome could be observed when their global alignment in line with the region of perfect homology was interrupted by the insertion of 22 base-pairs of unrelated sequence (Fig. 5c, construct iii).
These and other observations led us to propose that recombination-independent recognition of DNA homology for RIP involves localized interspersed interactions between co-aligned double-stranded DNA molecules, with a triplet of homologous base-pairs representing the fundamental recognition unit. In this context, the revealed periodicity requirement of 11 or 12 base-pairs is interesting, for two reasons. First, the 11-bp periodicity can be achieved naturally by negative supercoiling that can be created by the release of constrained supercoils as a result of nucleosome remodeling (e.g., discussed in Baranello et al. 2012). Second, 11-bp and 12-bp periodicities appear to be equally effective despite the fact that the energetic requirements for these two states are quite different.
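To put rough numbers on the difference between these two states, one can compare the average twist per base-pair at each helical repeat. The arithmetic below assumes uniform untwisting relative to the canonical 10.5 base-pairs per turn mentioned earlier, and ignores any stretching or bending; the symbols h (base-pairs per turn) and θ (twist per base-pair) are ours.

```latex
\theta(h) = \frac{360^\circ}{h}:\qquad
\theta(10.5) \approx 34.3^\circ,\quad
\theta(11) \approx 32.7^\circ,\quad
\theta(12) = 30.0^\circ

\frac{\theta(h) - \theta(10.5)}{\theta(10.5)} = \frac{10.5}{h} - 1:\qquad
h = 11 \;\Rightarrow\; -4.5\,\%,\qquad
h = 12 \;\Rightarrow\; -12.5\,\%
```

On this simple measure, a 12-bp repeat requires nearly three times as much untwisting as an 11-bp repeat, which is one way to see why the two states would be expected to differ in energetic cost.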
When homologous interactions are weak, DNA sequence plays a prominent role
In the course of analyzing various interspersed homologies, it became clear that different nucleotide sequences could promote dramatically different levels of RIP even if they formed identical patterns of homology (Gladyshev and Kleckner 2016). This aspect is illustrated by the comparison of two constructs, in which the same reference sequence is paired with two alternative sequences (Fig. 5d, compare constructs i and ii). Both cases feature homologous units of 4 base-pairs arrayed with an 11-bp periodicity over the same total length, but with different specific base-pairs involved in those units. These constructs induce RIP levels that differ by two orders of magnitude (Gladyshev and Kleckner 2016). To further understand the basis for this unexpected result, we compared the base-pair triplet compositions between several effective and ineffective interspersed homologies. We found that the stronger RIP activity was associated with a particular homologous trinucleotide, 5′-GAC-3′/5′-GTC-3′, and that the appropriate elimination of these triplets suppressed nearly all RIP activity (e.g., Fig. 5d, compare ii and iii). We note, however, that other triplets and/or sequence/homology features may be important, as some interspersed homologies without 5′-GAC-3′/5′-GTC-3′ triplets can, nonetheless, promote substantial levels of RIP (Gladyshev and Kleckner 2016).
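A comparison of this kind can be sketched as a simple scan for shared triplets. The function below counts positions at which two aligned sequences carry the same trinucleotide and that trinucleotide is GAC or GTC (the two are reverse complements, so together they describe one double-stranded triplet). The function name and the toy sequences are hypothetical, and this is not the analysis used in the cited study.

```python
def count_shared_triplets(reference, ectopic, triplets=("GAC", "GTC")):
    """Count aligned positions where both sequences share one of `triplets`.

    Hypothetical helper: both sequences are assumed to be pre-aligned and
    of equal length, so index i refers to the same position in each.
    """
    hits = 0
    for i in range(min(len(reference), len(ectopic)) - 2):
        window = reference[i:i + 3]
        # A hit requires homology (identical triplet in both copies)
        # and membership in the triplet set of interest.
        if window == ectopic[i:i + 3] and window in triplets:
            hits += 1
    return hits


# Toy sequences, invented for this example only.
shared = count_shared_triplets("AAGACTTGTCAA", "CCGACGGGTCCC")
print(shared)  # 2
```

Passing a different `triplets` tuple would allow the same scan to be repeated for other trinucleotides, in keeping with the caveat that other triplets or sequence features may also matter.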
Homology recognition is functionally separable from mutation specificity
In our studies of the homology requirements for RIP, patterns of interspersed homology were always analyzed for constructs involving two interacting sequences, one of which (the endogenous copy, also known as the reference sequence) was kept identical in all the constructs, while the other (ectopic) copy was varied as desired. By manipulating the nucleotide composition of the ectopic sequence, we could designate specific base-pairs in the reference sequence as homologous (to the ectopic copy). When we compared two instances of interspersed homology that induced similar levels of RIP but featured non-overlapping homologous units, we found that the reference sequence still exhibited the same stereotypical pattern of mutation, with C-to-T and G-to-A transitions at identical positions occurring at similar levels between the two constructs, and thus regardless of which particular base-pairs were able to participate in a homologous interaction (Gladyshev and Kleckner 2016).
This result implies that homology recognition and mutation are functionally separable. While the basis for this separation remains to be determined, it is tempting to consider that the two processes are also separable molecularly, where homology recognition by one set of factors leads to mutation by another set of factors (which would include RID).
Does homology recognition for RIP require specific molecules?
The putative C5-cytosine methylase RID has been implicated as a direct enzymatic mediator of RIP (Fig. 3, also discussed above; Freitag et al. 2002). It has remained unclear, however, if RID also promotes homology recognition per se, in addition to its cytosine-modifying role. Given the apparent functional uncoupling between homology sensing and mutation (Gladyshev and Kleckner 2016; also discussed above) and the lack of any obvious domain in the predicted structure of RID that could be dedicated to homology recognition (Fig. 3c), we speculate that the process of homology search for RIP is independent of RID. This idea is further supported by our recent finding that RIP can also be mediated by a canonical C5-cytosine methylase DIM-2 (Kouzminova and Selker 2001) in the absence of RID (Gladyshev and Kleckner, unpublished). In any case, it still remains to be determined whether the homology recognition aspect of RIP requires specific protein(s) or whether it occurs as a product of direct DNA/DNA interactions independent of other molecules.
Thus, two central questions remain. First, how is the sequence information between two DNA molecules compared? Does it, indeed, involve direct homologous interactions at the DNA level, as we propose, or, perhaps, some less direct forms of communication (e.g., mediated by proteins, RNA, or even some unknown type of small “adapter” molecules)? Second, how are the DNA sequences compared in what appears to be, superficially at least, an exhaustive search of all-against-all sequences in the genome? This situation contrasts with canonical homology searches mediated by RecA proteins, where each search is initiated at a separate DNA lesion (usually or always a double-strand break) that then scans the genome for its true target sequence. The recombination-independent mechanism for RIP, by comparison, apparently allows every chromosomal site to search for a corresponding target independently of its immediate neighbors. This feature accentuates the general problem faced by any homology-recognition process: to find the true target, any general mechanism needs to rapidly scan many potential sites without becoming trapped at numerous suboptimal targets (Kleckner and Weiner 1993). In addition, the homology comparison process may well have to accommodate the fact that the involved regions may be already bound by other molecules, e.g., nucleosomes, that need to be removed or remodeled to allow productive homology recognition to occur on the time scale that makes the all-to-all homology search possible.
Our results lead us to hypothesize that homology recognition for RIP involves interspersed cooperative interactions along the pairs of co-aligned DNA double helices. This hypothesis begets two interconnected questions: (1) how is such DNA/DNA pairing achieved at the atomic level, and (2) how is such pairing related to the recruitment of RID and potentially other factors that mediate RIP? In these contexts, it is interesting to consider that the genome-wide homology search for RIP, while being RecA-independent, could have some general features of a RecA-mediated process. In this latter case, the conformations of both participating DNA molecules (the incoming ssDNA and the target dsDNA) are altered substantially in the context of the RecA filament to facilitate homologous ssDNA/dsDNA pairing (Prentiss et al. 2015). Likewise, in the case of RIP, some hypothetical shape-distorting factors may randomly and transiently license subsets of chromosomal loci for dsDNA/dsDNA pairing, with many individual pairing reactions taking place concurrently and independently from one another.
Models of direct dsDNA/dsDNA interactions
DNA/DNA homology recognition involving direct interactions between DNA double helices has previously been considered by a broad range of models (Gladyshev and Kleckner 2016). In general, support for direct dsDNA/dsDNA pairing is provided by in vitro experiments under biologically relevant conditions, including the absence of divalent metal ions (Baldwin et al. 2008; Danilowicz et al. 2009; O’ Lee et al. 2016). Our results (Gladyshev and Kleckner 2014, 2016) appear to exclude models involving G-quartets (Sen and Gilbert 1988) and DNA triplexes (Sakamoto et al. 1999) that require specific nucleotide sequences, such as poly-G and polypurine/polypyrimidine tracts, respectively. Our results also exclude the electrostatic zipper model (Kornyshev and Leikin 2001), in which pairing of homologous double-stranded DNA molecules is based on their mutual electrostatic complementarity that cannot be achieved by the partially homologous sequences that still promote RIP (O’ Lee et al. 2015, 2016).
One proposed mechanism of direct dsDNA/dsDNA pairing is based on the well-known fact that the four canonical Watson–Crick base-pairs (A·T, T·A, G·C, and C·G) can make specific, self-complementary contacts at their major-groove edges (Kubitschek and Henderson 1966). This principle of self-complementarity theoretically allows the identical base-pairs to form stable tetrads without sacrificing the existing intra-duplex hydrogen bonds (McGavin 1977). Moreover, the four tetrads formed by the identical base-pairs (A·T~A·T, T·A~T·A, G·C~G·C, or C·G~C·G) all have equivalent dimensions, a property that may allow stacking in a continuous tetraplex irrespective of the underlying base-pair sequence (McGavin 1977).
Recently, the principle of base-pair self-complementarity has been applied to conjecture homologous pairing between long double-stranded DNA molecules (Mazur 2016). According to this newly proposed model, two DNA duplexes can form a tetraplex interaction involving only 3–4 consecutive self-complementary base-pairs, compatible with the uncovered homology requirements for RIP. This mechanism, however, appears to have two limitations. First, once the initial interaction is established, the next interaction can only occur some distance away, due to the rigid nature of the participating DNA molecules. Our preliminary results suggest that RIP fails to detect homologous units of 10 base-pairs interspersed with a matching periodicity of 55 or 66 or 77 base-pairs over the total length of 500 base-pairs. However, it remains possible that such contacts may still underlie the fundamental pairing configuration and that such configuration occurs transiently and dynamically at many different combinations of distant homologies along co-aligned DNA duplexes. Second, this mechanism appears to involve a large energetic requirement, but this obstacle is not absolute, as the reaction barrier can be reduced by altering the local geometry of the major groove without breaking or unstacking the involved base-pairs (Mazur 2016). We note that it remains entirely possible that RIP engages much more sophisticated homology-recognition mechanisms, analogous to the multi-protein complexes associated with RecA/Rad51-mediated reactions (Prentiss et al. 2015), or may involve additional adapter molecules, such as RNA (e.g., Ding et al. 2012) or inorganic ions (O’ Lee et al. 2016).
In conclusion, our work has demonstrated that recombination-independent recognition of DNA homology can occur between chromosomal sites that share a set of short interspersed homologous units and have overall sequence identity of only 25–36 %. Systematic analysis of the homology requirements for RIP has led us to propose a model of homology recognition that evokes interactions between co-aligned double-stranded DNA molecules. While the exact mechanisms of homology recognition for RIP remain to be elucidated, the complementary work done in vitro and in silico suggests that it may, indeed, involve direct dsDNA/dsDNA interactions. It is tempting to speculate that such interactions are not restricted to RIP (and MIP) and may underlie a variety of chromosomal phenomena that feature recombination-independent pairing and (epi)genetic modification of homologous DNA sequences. Given the emerging higher-order structures of copious repetitive DNA present at (peri)centromeric and (sub)telomeric regions of eukaryotic chromosomes (e.g., Yang and Li 2016), it is conceivable that homologous dsDNA/dsDNA interactions might also be particularly important in this broader context.
Acknowledgments
This work was supported by grants GM044794 and GM025326 from the National Institutes of Health to N.K. and by The Helen Hay Whitney Foundation, The Howard Hughes Medical Institute, and the Charles A. King Trust to E.G.