Abstract
For more than half a century, genotoxic agents have been used to induce mutations in the genome of model organisms to establish genotype-phenotype relationships. While inaccurate replication across damaged bases can explain the formation of single nucleotide variants, it remained unknown how DNA damage induces more severe genomic alterations. Here, we demonstrate for two of the most widely used mutagens, i.e. ethyl methanesulfonate (EMS) and photo-activated trimethylpsoralen (UV/TMP), that deletion mutagenesis is the result of polymerase Theta (POLQ)-mediated end joining (TMEJ) of double strand breaks (DSBs). This discovery allowed us to survey many thousands of available C. elegans deletion alleles to address the biology of this alternative end-joining repair mechanism. Analysis of ~7,000 deletion breakpoints and their cognate junctions reveals a distinct order of events. We found that nascent strands blocked at sites of DNA damage can engage in one or more cycles of primer extension using a more downstream located break end as a template. Resolution is accomplished when 3’ overhangs have matching ends. Our study provides a step-wise and versatile model for the in vivo mechanism of POLQ action, which explains the molecular nature of mutagen-induced deletion alleles.
Author Summary
DNA damage poses a threat to cell survival as it impedes accurate and efficient copying of DNA that precedes mitotic division. If left unrepaired, DNA damage can give rise to DNA double-strand breaks (DSBs), which are considered to be one of the most dangerous types of genomic insult. Repairing DSBs is vital to preserve genomic integrity and to promote cellular survival. Here, we demonstrate in C. elegans that exposure to mutagenic agents leads to replication-associated DSBs that require polymerase Theta (POLQ)-mediated end joining for their repair. We provide a mechanistic basis for understanding how mutagens that interfere with DNA replication induce the type of deletion mutations that have been generated in a great variety of organisms in order to establish genetic null alleles. Bioinformatic data mining of many thousands of deletion alleles has subsequently allowed us to establish DNA repair pathway-specific signatures and to uncover a surprisingly simple and elegant solution to how cells can repair a broken chromosome while keeping the loss of genetic information to a minimum.
Introduction
DNA mutations fuel the evolution of organisms, giving rise to speciation, and of cells within an organism, giving rise to cancer. Two replication-associated mechanisms are responsible for most if not all single nucleotide variants (SNVs) as well as small insertions/deletions (indels) at repetitive sequences: i) copying errors made by the replicative polymerases delta and epsilon, which are mostly undone by DNA mismatch repair, and ii) replication of damaged DNA by specialized so-called translesion synthesis (TLS) polymerases. TLS polymerases, in contrast to the replicative polymerases, have the ability to extend nascent DNA strands across non- or poorly coding damaged bases, often leading to mutation. It is, however, less well understood which mechanisms are responsible for other types of genomic alterations, such as deletions that are larger than a few bases.
A recent study that involved whole genome analysis of C. elegans animals that were propagated for many generations revealed that the vast majority of accumulating deletions larger than 1 bp required the activity of the A-family polymerase Theta (POLQ). Upon unperturbed growth, wild-type C. elegans genomes accumulate SNVs as well as deletions, but the latter class was strikingly absent in strains that were defective for POLQ [1]. Instead, much more dramatic chromosomal rearrangements were noticed, indicating that POLQ action protects the genome against deterioration but at the cost of a small genomic scar. A similar profile of mutagenesis was observed resulting from DNA double-strand break repair, which hinted towards DSBs as being a very prominent source of genome diversification during evolution, and towards error-prone DSB repair as the mechanism responsible for this type of genome alteration [1].
The first demonstration of POLQ acting on DSBs was made in Drosophila: in vivo processing of artificially-induced DSBs in POLQ-mutant flies deviated from that in wild-type flies [2]. POLQ deficiency did not increase sensitivity to ionizing radiation, yet it did greatly exacerbate hypersensitivity in flies impaired in homologous recombination. Apparently, a POLQ-dependent DSB-repair pathway can act as a backup in HR-compromised circumstances. Indeed, recent work on human POLQ revealed a strong synergistic relationship between the HR pathway and POLQ-mediated DSB repair [3,4]. The synthetic lethal nature of this genetic interaction may be of great clinical importance as it identifies POLQ as a druggable target for tumours carrying mutations in HR genes. Another indication that POLQ repairs DSBs in contexts where HR is compromised came from genetic studies performed in C. elegans. Here it was shown that POLQ-mediated repair is the only pathway (also in HR-proficient conditions) capable of repairing replication-associated DSBs that are induced when persistent DNA damage or stable secondary structures cause a permanent block to DNA replication [5,6]. It was subsequently shown that these DSBs result from inheritable ssDNA gaps opposite to the strand containing the damage, which could thus not serve as a template for HR [7].
Extensive analyses of repair products in both flies and worms provided a clear signature of POLQ-mediated DSB repair with two prominent features: i) the notion of microhomology at the repair junctions, a feature previously ascribed to non-canonical end-joining also called alternative end-joining [8,9], and ii) the occasional presence of so-called template inserts: deletions that contain, at the deletion junction, the inclusion of a DNA insert (hereafter called delins). These inserts are of variable length but their origin can be mapped to DNA regions that lie in very close proximity to the DSBs ends that produced the delins. Similar hallmarks can be found for POLQ-mediated DSB repair in human and mouse cells [4,10]. A recent in vitro study provided a molecular explanation for the prominent presence of microhomology at the DSB repair junctions: repair reactions with purified protein showed that two base pairs of complementarity is enough for human POLQ to pair and extend 3’ overhangs of partially double-stranded oligonucleotides [11].
Although it is now becoming increasingly clear that POLQ plays an evolutionarily conserved role in DSB repair, how POLQ acts in vivo to explain all the observed consequences remains to be elucidated. Over the last four decades, the C. elegans community has used EMS and UV/TMP to generate many thousands of deletion alleles, but the underlying mechanism has remained unknown. Here, we demonstrate that mutagen-induced replication breaks in C. elegans germ cells are exclusively repaired by POLQ. This publicly available allele collection, reflecting ~7,000 in vivo POLQ-mediated end joining reactions, allows us to analyse and describe the POLQ-mediated repair mechanism in great detail.
Results
POLQ-deficient animals are hypersensitive to EMS and UV/TMP
To investigate whether POLQ plays a general role in the processing of mutagen-induced DNA damage, we assayed embryonic survival in animals that were exposed to two of the most widely used mutagens in C. elegans: EMS, which causes alkylating damage, and TMP, which, upon exposure to UVA light, results in monoadducts and crosslinks. We found polq-1-deficient animals to produce more unviable embryos than wild-type animals when exposed to EMS (Fig 1A and S1 Fig), but not to the extent observed in animals that are defective for polymerase eta (polh-1), a translesion synthesis (TLS) polymerase that is involved in replicative bypass of DNA damage [12]. A similar mild hypersensitivity was observed when polq-1-mutant animals were incubated with TMP and subsequently exposed to UVA (Fig 1B and S1 Fig), in agreement with previously published work [13]. In addition to monitoring the survival of embryos, we monitored their ability to produce functional gametes. Complete or partial sterility of daughters from exposed mothers is another phenotype that is related to genotoxic stress, likely because germ cells, or their progenitors, are more susceptible to DNA damage-induced arrest, apoptosis, and mitotic catastrophe [14]. Indeed, at EMS or UV/TMP doses where the brood size of exposed mothers was only moderately affected in both wild-type and polq-1-mutant animals (Fig 1C and 1D), dramatic sterility was observed in polq-1 but not in wild-type progeny animals (Fig 1E and 1F): 99% versus 16% median reduction in brood for EMS-treated animals, and 65% versus 5% for UV/TMP-treated animals. These data establish a prominent role for POLQ in protecting germ cells against EMS- and UV/TMP-induced toxicity.
EMS and UV/TMP-induced deletions are dependent on POLQ
EMS and UV/TMP are widely used mutagens in C. elegans to create loss-of-function alleles [15]. Given the sensitivity of polq-1 animals towards these agents we wanted to investigate whether POLQ functionality is relevant for generating these alleles. EMS predominantly alkylates guanine which can be bypassed, leading predominantly to GC>AT transitions [15–17]. Deletions also result from EMS treatment through yet unknown biology [17]. UV/TMP treatment results in a different spectrum of mutations: for this mutagen, deletions dominate base pair substitutions [17,18], but also here, the underlying mechanism of deletion formation is unknown. To address the candidate role of POLQ in producing deletion alleles, we created libraries of mutagenized wild-type and polq-1-mutant animals and screened them for deletions. We used standard protocols that were previously used by numerous laboratories and consortia leading to the ~10,000 C. elegans deletion alleles that are currently available [19–21]. The general concept of these protocols is to find by PCR a smaller than wild-type product for a target of interest in pooled broods of mutagenized animals; then use a sib-selection strategy to isolate the mutant allele (S2 Fig and Methods section). Because the progeny of mutagenized polq-1-animals have a reduced brood size (Fig 1E and 1F), we screened the F1 generation, and not the F2, which allowed us to inspect the same number of animals for polq-1-mutant and wild-type genotypes. We screened the libraries for deletions using eight different amplicons, all ~1 kb in size. Positive pools were chased by PCR of less-complex pools and individual library addresses (in duplicate) to exclude false positives (See Methods for details). This strategy proved to be robust and specific as deletion alleles were readily detected in wild-type animals exposed to either EMS or UV/TMP, but not in mock-treated animals (Fig 2A and 2B and S2B and S2C Fig). 
In contrast, we did not find a single deletion allele in libraries of either EMS- or UV/TMP-mutagenized polq-1 animals (Fig 2A and 2B). From this data we conclude that EMS- and UV/TMP-induced deletion mutagenesis, in the size range of 50 bp up to ~1 kb, requires functional POLQ.
To further validate this conclusion we investigated UV/TMP-induced mutagenesis in a more unbiased fashion by catching loss-of-function mutations in an endogenous genomic target, unc-93. A dominant mutation in the transmembrane protein UNC-93, unc-93(e1500), causes worms to move uncoordinatedly. Loss of UNC-93 expression, or of one of its cofactors SUP-9 and SUP-10, results in a reversion to wild-type movement, which provides an easy phenotypic readout to monitor loss-of-function mutagenesis. We exposed POLQ-proficient and -deficient animals carrying the unc-93(e1500) allele to TMP with or without UVA irradiation to introduce crosslinks. Wild-type-moving animals were isolated from the brood of exposed animals and subsequently inspected for deletions in unc-93, sup-9 and sup-10. The mutants that did not, by DNA gel electrophoresis, reveal a deletion in any of the three genes are likely the result of single nucleotide variations (SNVs) and were not further analysed. In treated wild-type animals, we observed an increase in two distinct categories of deletions (Fig 2C and 2D): one class comprising small deletions of 50 bp to 1 kb, with a median size of ~100 bp (S2D Fig), and another class in which deletions are substantially larger, being >5 kb in size (Fig 2D). No deletions were found in the size range 1–5 kb. UV/TMP-treated polq-1-deficient animals were, however, devoid of small deletions, while the ratio of very large deletions further increased (Fig 2C and 2D). Based on these data and the PCR-based screenings of UV/TMP-treated mutant libraries, we conclude that the vast majority (if not all) of small deletions in the range of 50 bp up to at least 1 kb are the result of POLQ action. In its absence large deletions manifest, which, in agreement with our previous work, argues that POLQ prevents large genomic alterations at replication-blocking DNA lesions at the expense of relatively small deletions [1,5,6].
Replication approaches to one nucleotide from the damage
Above, we demonstrate that deletion alleles isolated from libraries of EMS- and UV/TMP-treated populations are the result of POLQ action. This notion allows us to systematically analyse a uniquely rich collection of ~2,000 EMS- and ~8,000 UV/TMP-induced deletion alleles that were generated by the C. elegans community to elucidate the in vivo mechanism of POLQ action. Fig 2E displays the sizes for all ~10,000 alleles, for which the sequence information was retrieved from WormBase [22]. The majority of alleles are between 50 bp and 1 kb and can be categorized into two groups: i) simple deletions, which make up the majority of events (~70–75%) in both the EMS and the UV/TMP dataset, and ii) deletions that are accompanied by an insertion of a small segment (median: 5 bp for both sets) of novel DNA; we refer to this class (~25–30%) of alleles as delins (Fig 2F–2H). We set out to characterize the ~5,000 deletions and ~1,800 delins, filtered by size (50–1,000 bp), in great detail.
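The size filter and the two allele classes described above can be expressed compactly. The following is a minimal sketch, not the pipeline used in this study: the `Allele` record, its field names and the `classify` helper are hypothetical, and the only rules taken from the text are the 50–1,000 bp window and the distinction between deletions with and without novel inserted DNA.

```python
from dataclasses import dataclass


@dataclass
class Allele:
    name: str          # hypothetical allele identifier
    deleted_bp: int    # number of reference bases removed
    inserted_seq: str  # novel DNA found at the junction ("" if none)


def classify(allele, min_size=50, max_size=1000):
    """Return 'simple', 'delins', or None if the deletion falls outside the window."""
    if not (min_size <= allele.deleted_bp <= max_size):
        return None
    return "delins" if allele.inserted_seq else "simple"


examples = [
    Allele("toy1", 120, ""),       # simple deletion
    Allele("toy2", 300, "GATTA"),  # deletion with a 5-bp insert
    Allele("toy3", 4200, ""),      # too large for the analysed window
]
labels = [classify(a) for a in examples]
```

In this toy input the three alleles are labelled simple, delins and excluded, respectively.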
First, we investigated the base composition of deletion junctions to further examine an earlier reported relationship in POLQ-mediated mutagenesis between the position of a deletion breakpoint and the position of a replication-blocking lesion: we previously found for deletions resulting from replication-blocking G-quadruplexes that one of the breakpoints maps close to the replication impediment [6]. This led to a model where deletions result from processing the 3’ hydroxyl ends of blocked nascent strands. DNA lesions induced by EMS and UV/TMP also have the potential to block replication, and we thus questioned whether cognate deletions carry, close to their breakpoints, the signature of EMS- or UV/TMP-inflicted base damage. More precisely, if one of the two breakpoints results from processing a stable but reactive nascent strand that was extended up to the damaged base, then the first nucleotide immediately downstream of the breakpoint (the -1 position) might reveal the nature of the replication impediment (see Fig 3A for a graphical illustration of this concept). Indeed, we found a clear non-random base composition at position -1: for EMS we found an overrepresentation of cytosine (Fig 3B and S3 Fig), which perfectly fits the damage spectrum of EMS predominantly ethylating guanines [16,17]. Blocked DNA synthesis, incapable of extending across a damaged guanine, would result in a 3’ hydroxyl end immediately upstream of a cytosine. Also for deletions induced by UV/TMP we found at the -1 position a clear mutagen-specific overrepresentation of a particular base, in this case an adenine (Fig 3C), which reflects TMP's reactivity towards thymines [23]. Strikingly, and in contrast to the EMS spectrum, we here also observed a non-random distribution at the +1 position, being a thymine. 
This outcome suggests that UV/TMP-induced deletions are preferentially induced at sites where replication is blocked by a thymine that is preceded by an adenine, a conclusion that is further supported by probing the datasets with pairs of nucleotides (S3 Fig). This prevalent signature is in perfect agreement with the preference of psoralens to intercalate into and react with 5’TA in duplexed DNA [24,25]. Without further genetic dissection, however, it is impossible to determine whether interstrand crosslinks at 5’TA sites or monoadducts (or DNA-protein complexes) formed at sites of preferred intercalation are responsible for POLQ-dependent deletion formation. Irrespective of which lesion is responsible, our data indicate that replication can proceed right up to the base that is damaged by the psoralen moiety.
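The base-composition analysis at the +1 and -1 positions amounts to a simple tally. The sketch below is illustrative only and makes several assumptions: each junction is given as a hypothetical pair of strings (the retained flank and the deleted sequence), both written 5’ to 3’ in the orientation of the nascent strand so that the flank ends at the breakpoint, and the opposite breakpoint would first have to be reverse-complemented into the same orientation.

```python
from collections import Counter


def breakpoint_composition(junctions):
    """Tally bases at +1 (last retained base) and -1 (first deleted base).

    junctions: iterable of (retained_flank, deleted_seq) string pairs,
    both read 5'->3' towards and across the breakpoint (assumed convention).
    """
    plus1, minus1 = Counter(), Counter()
    for flank, deleted in junctions:
        if flank and deleted:  # skip incomplete records
            plus1[flank[-1].upper()] += 1
            minus1[deleted[0].upper()] += 1
    return {"+1": plus1, "-1": minus1}


# Toy junctions (invented sequences, not taken from the dataset).
toy = [("ACGT", "ACC"), ("GGT", "AGG"), ("CCC", "GTT")]
composition = breakpoint_composition(toy)
```

In this toy input two of three junctions carry T at +1 and A at -1, the pattern reported above for UV/TMP. A real analysis would compare such counts with the base composition expected from the genome.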
Our analysis of ~7,000 mutagen-induced deletion alleles reveals a clear lesion-specific signature in POLQ-mediated deletion formation. Importantly, a single replication fork block triggers such a deletion, as we observed a damage signature at only one of both breakpoints (S4 Fig). The position of the damage with respect to the deletion junction supports a mechanistic model where the nascent strand blocked at the site of base damage is not subjected to extensive trimming but instead is reactive towards a POLQ-mediated end-joining reaction that has small sized deletions as an end-product. The putative mechanism responsible for generating the other reactive end at a 50–1,000 bp distance will be discussed later, but we will provide evidence that, with respect to reactivity, it is indistinguishable from the blocked nascent strand.
Single nucleotide priming is sufficient to initiate repair by POLQ
We reveal above that the terminal nucleotide of the nascent strand, blocked at the site of base damage, is retained in the repair product: it is the base immediately flanking the deletion. But does it also guide repair? To address this question we compiled all simple deletions from the UV/TMP dataset that had the signature T+1,A-1 composition at one of the two breakpoints, because only for this subclass (n = 1,248) is the identity of the terminal nucleotide of the nascent strand known, i.e. a thymine. We then tested the following prediction: if this 3’ thymine is guiding repair of the break, by providing a minimal primer for POLQ, a thymine should be overrepresented at the -1 position of the opposite flank (see Fig 4A for a graphical illustration). This is indeed what we found: Fig 4B shows that the composition of the donor sequence opposite to the blocked nascent strand is completely random apart from position -1, which is dominated by a thymine. A similar conclusion results if we use an approach that is blind to the replication-obstructing base and does not restrict the analysis to a single nucleotide. For each of the ~5,000 alleles we established the degree of homology between both breakpoints by scoring the degree of sequence identity in a 16-nt window, encompassing the 8 outermost nucleotides of the flanking sequence and the 8 nucleotides of the adjacent but deleted sequence (see Fig 4C for a schematic illustration of the approach). These plots were subsequently compiled to generate heat maps for the different categories of alleles. In both the UV/TMP-induced (n = 4,461) and the EMS-induced deletions (n = 662) crosstalk between both breakpoints is observed, but only for the nucleotide at the -1 position of the deletion and the +1 position of the opposing flank (Fig 4D). This outcome lends further support to the hypothesis that the terminal base of one end, upon minimal pairing with the opposing template, is guiding POLQ-mediated repair.
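One plausible way to compile such heat maps is to score, for every allele, the identity of each position in the window at one breakpoint against each position in the window at the other, and then average over alleles. The sketch below follows that reading; it is an assumption about the procedure, not a description of the method behind Fig 4, and the function names are hypothetical. Windows are plain strings of equal length (16 nt in the text, 2 nt in the toy example).

```python
def identity_matrix(left_window, right_window):
    """Cell [i][j] is 1 if base i of the left window equals base j of the right window."""
    if len(left_window) != len(right_window):
        raise ValueError("windows must have equal length")
    return [[int(a == b) for b in right_window] for a in left_window]


def mean_identity(window_pairs):
    """Average identity matrices over alleles; values near the random
    expectation (about 0.25 for equal base usage) indicate no crosstalk."""
    pairs = list(window_pairs)
    if not pairs:
        raise ValueError("no alleles supplied")
    n = len(pairs[0][0])
    total = [[0] * n for _ in range(n)]
    for left, right in pairs:
        m = identity_matrix(left, right)
        if len(m) != n:
            raise ValueError("all windows must share one length")
        for i in range(n):
            for j in range(n):
                total[i][j] += m[i][j]
    return [[value / len(pairs) for value in row] for row in total]


# Toy example with 2-nt windows instead of 16-nt windows.
heat = mean_identity([("AC", "CA"), ("AC", "AC")])
```

A single cell standing out from the background of such a matrix would correspond to the one-nucleotide crosstalk described above.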
Templated inserts and simple deletions have a common origin
Once priming has been established and extension has commenced there are two possible fates: i) continuation and further processing, in which case the outcome will be a deletion with single nucleotide identity at the junction, or ii) discontinuation. If, in the latter case, the extended end serves as a new nucleation site for yet another round of POLQ-mediated repair, templated inserts will result (Fig 5A). If so, delins are expected to have some features identical to those described above for simple deletions. To address this, and to further dissect the in vivo mechanism of POLQ-dependent mutagenesis, we characterized in great detail the ~25–30% of mutagen-induced deletion alleles that are accompanied by small insertions. First we placed them, based on their size and suspected origin, in different categories (Fig 5B): ~47–50% are so small (<5 bp) that their origin is untraceable, and another 5–10% are larger in size but their sequence does not provide enough certainty as to their origin. However, ~40–45% of delins (~700) have inserts with sufficient sequence information to reveal their source: apart from a small percentage (~3%) that comprise sequences mapping to distant sites on the same chromosome or to other chromosomes (S5 Fig), the majority (~37–44%) map very close to the deletion. These insertions are either completely or partially identical to parts of the flanking sequences and have been designated ‘templated inserts’ because of a presumed role for the flanking DNA to serve as a template for a repair reaction. Because the majority of templated inserts map a few bases away from the deletion junction (the template is located within the flank), a number of parameters can be investigated, centred around the questions: i) what defines the start of POLQ-mediated DNA synthesis, ii) what defines the end, and iii) how accurate is it?
With respect to the start, we focused on templated inserts that are 100% identical to sequences in their flanks to avoid possible ambiguity in interpretation. For both UV/TMP and EMS-induced alleles (n = 227 and 41, respectively) we found that templated inserts, similar to simple deletions, are primed by a single base pair. This priming becomes apparent when the base composition of one breakpoint is plotted to the base pairs that are neighbouring the sequence that served as a template for extension (Fig 5C and 5D). Overrepresentation of sequence identity is confined to one position, the +1 base of one breakpoint (the reactive end) and the base flanking the origin of the insert in the opposite breakpoint (the template), providing further confirmation that a single base pair is sufficient to drive POLQ-mediated repair. We found that ~85% of inserts originate from priming within 10 base pairs of the breakpoints (Fig 5E), which could point to homology search close to the end of the available sequence.
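Mapping a perfectly matching insert back to its flank, and measuring how far the template lies from the breakpoint, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used here: the flank is a hypothetical string written 5’ to 3’ and ending at the breakpoint, only exact matches are considered, the occurrence closest to the breakpoint is reported, and the `min_len` cut-off of 5 bases simply mirrors the observation that shorter inserts cannot be traced reliably.

```python
def template_distance(insert, flank, min_len=5):
    """Return the number of flank bases between the template match and the
    breakpoint, or None if the insert is too short or has no exact match."""
    insert, flank = insert.upper(), flank.upper()
    if len(insert) < min_len:
        return None  # too short to assign an origin with confidence
    start = flank.rfind(insert)  # rightmost match = closest to the breakpoint
    if start == -1:
        return None
    return len(flank) - (start + len(insert))


# Toy example: the 7-nt insert matches the flank two bases from its end.
distance = template_distance("GATTACA", "CCCGATTACATT")
```

Binning such distances over all mapped inserts would give the kind of distribution summarized in Fig 5E.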
Templated inserts result from template switching and reiterated priming
The observed similarities in the initiation steps of deletions that are simple and those that include a templated insert mean that the difference between both outcomes is the consequence of a downstream step, for instance, discontinuity of POLQ action. The determinants influencing discontinuity in the repair reaction are currently unknown, but it is a remarkably frequent event as ~25% of all alleles have insertions. From plotting the size of all inserts (Fig 5B), we infer that templated inserts do not have a minimal length: although it is impossible to reliably map inserts of only one or a few bases to the flanking sequences, we observe that the percentage of inserts that can be mapped is constant, yet high, over the complete range of small insert sizes. This notion argues that the very small, unmappable insertions are also flank-derived. Fig 5B also shows that while templated inserts are overall rather small (<25 bp), they do not have a preferred size. Instead, a gradual decline in length is observed, which may suggest that comprehensive extension prevents discontinuity. Still, we also found inserts where stretches of more than 20 consecutive bases have been templated, indicating that substantial base pairing can still be disrupted before the two opposite ends are irreversibly connected. Whether POLQ dissociates from the template in this process or whether POLQ facilitates template switching is an interesting question, as the latter option could serve to broaden the resolving potential of POLQ-mediated repair. Some delins have complex combinatorial inserts with two or more mostly overlapping templated inserts, arguing for reiterative steps of priming, extension and dissociation. In most of these cases (16 out of 17) only one flank provided the template, which hints towards directionality in POLQ-mediated resolution.
To complete repair of aborted reactions, it seems plausible that another round of priming and extension is required, analogous to the biology leading to simple deletions, only in this case, one end has been extended using the other end as a template. To test this hypothesis, we again created heat maps, but here compared the terminal bases of the origin of the template inserts as well as their flanking bases (as this constitutes the new reactive end), to the border of the same flank, which in this scenario is considered the opposing end (Fig 6A). We indeed found support for a single base pair priming reaction as also here a clear overrepresentation of single nucleotide identity is observed (Fig 6B and 6C). Our combined analysis thus supports a model, where simple deletions and template inserts result from the same chemistry, displaying the same features, the only difference being an aborted POLQ-mediated extension of a single base paired-primed intermediate.
Probing the entire collection of ~10,000 EMS- and UV/TMP-induced C. elegans deletion alleles for single nucleotide identity at break junctions and the presence of templated inserts suggests that POLQ-mediated end joining is responsible for the majority of deletions in a 50–3,000 bp range (S6 Fig).
POLQ activity is error prone
At present it is unknown what underlies the discontinuity in POLQ-mediated repair that leads to delins instead of simple deletions. One possibility is polymerase errors. POLQ is a relatively error-prone polymerase generating single base errors at rates 10- to more than 100-fold higher than other polymerase A family members [26]. Mismatches resulting from wrongly incorporated nucleotides may reduce POLQ’s processivity and promote dissociation and/or template switching. One observation provides strong support for such a scenario: the frequency of errors observed in templated inserts is extremely high as compared to mutations in the flanks of the simple deletions, while for both repair products the flank has served as a template for POLQ action. Although ~30% of all templated inserts are perfect, in the sense that they do not show mismatches, another 15% can be matched to the flank through a single run of consecutive bases if one mismatch or one slippage event is allowed (Fig 7A). It can thus be argued that at least 1 in 3 templated inserts suffers from a mutation, which translates to an error rate of ~1 in 30 base pairs during templated extension (average insert size = ~10 bp). In sharp contrast, we found only a few mutations in the flanks of ~4,500 UV/TMP-induced simple deletions. Assuming that here POLQ is required to extend the reactive end by at least 10 bp, we calculate an error rate of <1 in 3,000 bp for simple deletions. To explain the >100-fold higher mutation frequency in extension leading to templated inserts, we propose that POLQ errors in fact provoke template switching, and thus are causal to the formation of delins. A supporting observation is that mismatches are more frequently found closer to where the reaction is abrogated (Fig 7B).
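The arithmetic behind these error-rate estimates can be checked in a few lines. The sketch below uses only the figures quoted in the text; reading "1 in 3" as the share of one-mismatch inserts among inserts that map as a single run (15% out of 30% + 15%) is an interpretation, and the function name is hypothetical.

```python
def per_base_error_rate(fraction_of_events_with_error, bases_per_event):
    """Convert a per-event error fraction into a per-base error rate."""
    return fraction_of_events_with_error / bases_per_event


# Share of mapped single-run inserts that carry one mismatch or slippage event.
perfect, one_error = 0.30, 0.15
mutated_share = one_error / (perfect + one_error)        # about 1 in 3

# Templated inserts: ~1 in 3 mutated, ~10 bp synthesized on average.
insert_rate = per_base_error_rate(1 / 3, 10)             # about 1 in 30 bp

# Simple deletions: upper bound quoted in the text.
simple_rate_upper_bound = 1 / 3000

# Lower bound on the fold difference between the two classes.
fold_difference = insert_rate / simple_rate_upper_bound  # about 100
```

Because the simple-deletion figure is an upper bound, the ratio of about 100 is a lower bound, consistent with the ">100-fold" difference stated above.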
POLQ replication errors could result from replicating non-damaged or damaged DNA. The in vitro demonstrated bypass activity of POLQ may help to extend past base damage or abasic sites. We mostly found incorrect incorporation of adenines opposite to any nucleotide other than a thymine (Fig 7C), making up for half of all mismatches, which fits with the preferential incorporation of adenine that has been observed for POLQ in vitro [27].
Mutagen-induced deletions are the product of DSB repair
Finally, using this unique dataset of ~7,000 in vivo POLQ reactions we re-evaluated the assumption that POLQ acts to protect against mutagen-induced damage by acting on replication-associated DSBs. Despite having demonstrated that POLQ-mediated end joining is a stand-alone DSB-repair pathway that is able to process bona fide DSBs [1], it remained difficult to formally prove that a DSB is an intermediate in a repair reaction that produces simple deletions and templated inserts that were previously also found to accumulate in mutants defective for TLS polymerases. Through combining the features that characterize POLQ-mediated deletions, a mutagen, i.e. UV/TMP, that leaves a signature in the final product, and the sheer size of the collection analysed here, we are now able to establish that replication-associated deletion mutagenesis results from the processing of two opposing 3’ extendable ends, hence a DSB. Above, we have shown that a nascent strand blocked at a site of base damage can serve as a single nucleotide primer to be extended, using a donor sequence, located 50–1,000 bp away, as a template. In Fig 8, we show that there is an equal likelihood of finding the reciprocal event: that the sequence immediately upstream of the blocked fork has served as a template for a priming, reactive end that is located 50–1,000 bp more downstream. This argues that POLQ-mediated repair, as in repairing bona fide DSBs, here acts to connect two 3’ reactive ends. It is currently unknown whether POLQ-mediated repair of replication-associated DSBs necessitates end-resection to create sizable 3’ ssDNA regions (which then function as primer or as template). In vitro, human POLQ can extend ssDNA molecules intra-molecularly through a fold-back-stimulated templated reaction [28]. 
Here, by probing the delins for inserts that had a reverse-complement orientation with respect to their flanking matches we indeed found in vivo support for 3’ extension in which both the primer and the template reside on the same DSB end (S7 Fig).
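Probing inserts for reverse-complement matches can be sketched as below. This is a simplified illustration with hypothetical function names: it considers exact matches only, treats sequences as plain A/C/G/T strings, and checks the direct orientation before the reverse-complement one.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an A/C/G/T string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def insert_orientation(insert, flank):
    """Classify an insert as 'direct', 'reverse-complement', or None (no exact match)."""
    insert, flank = insert.upper(), flank.upper()
    if insert in flank:
        return "direct"
    if reverse_complement(insert) in flank:
        return "reverse-complement"
    return None


# Toy sequences (invented): "AAGC" only matches the flank as "GCTT".
orientation = insert_orientation("AAGC", "TTGCTTGG")
```

Under this scheme, a reverse-complement match to the insert's own flank is the footprint expected from fold-back extension of a single DSB end.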
Discussion
In this study, we have shown that EMS and UV/TMP-induced DSBs are predominantly repaired via POLQ-mediated repair and in-depth analysis of ~7,000 unique deletion footprints allowed us to unveil important characteristics of the in vivo repair mechanism. We found that mutagen-induced deletions are the product of alternative DSB repair in which one end is produced by the replication machinery that approached the damage up to one nucleotide. Base pairing of the terminal nucleotide of the blocked nascent strand to single stranded DNA at the opposite break end primes POLQ to polymerize, resulting in DNA tracts that are templated by the sequence immediately flanking the DSB. Further processing of the ensuing stable joints produces simple deletions. However, in case DNA synthesis is interrupted, likely resulting from POLQ errors, a primer-template switch is induced in which the newly formed terminal nucleotides again pair in order for POLQ-mediated extension to continue. We find that one or more cycles of such templated DNA synthesis and primer-template switching can fully explain the composition of deletions that are associated with inserts.
From a conservative point of view, POLQ-mediated repair is a surprisingly elegant solution to the problem of how to repair a DSB while keeping the loss of genetic information to an absolute minimum: the repair reaction does not depend on removal of nucleotides to create ligatable ends. It is thus an intriguing idea that nature, perhaps because of the polarity in DNA synthesis being in a 5’ to 3’ direction, has evolved DNA repair and recombination mechanisms that use or tolerate extensive 5’ but not 3’ end-resection; it is obvious that having both these activities prominently used inside nuclei would constitute a great threat to genomes. We have shown here for POLQ-mediated repair of DSBs that the 3’ end of a DNA molecule is very stable and acts as a nucleation site in the repair reaction. Using a specialized polymerase to extend and as such stabilize minimally paired 3’ ends, as opposed to trimming by exonucleases, provides a simple yet powerful and versatile solution to a complex problem. One striking aspect of C. elegans POLQ is the notion of single nucleotide homology. The degree of microhomology in (POLQ-dependent and potentially POLQ-independent) alternative end-joining in a number of other biological systems, such as mouse, human and also plants, appears to concern more bases, frequently 3 to 4 bp [4,10,29]. It is as yet unclear whether this difference reflects species-specific adaptation of the enzyme or differences in the context in which POLQ was studied: a recent in vitro study using purified human POLQ demonstrated pairing and extension of 3’ overhangs with just two nucleotides of homology [11]. Another perhaps more striking difference in POLQ-mediated repair between species is the composition of insertions that are found in between the break junctions. While insertions in C. elegans are mostly derived from a single proximal location, footprints in other species suggest that POLQ is more promiscuous, because inserts often originate from multiple locations, which is suggestive of iterative rounds of abortive repair [4,29]. It is currently unknown what causes this apparent discrepancy between POLQ-mediated repair in different species, but it is of interest to note that mammalian POLQ has evolved to include three additional loop regions in the polymerase domain. One of these loops, loop2, was recently implicated in non-templated terminal transferase activity [28]. The ability to add random nucleotides to the 3’ end of a DSB-repair intermediate may help to generate more opportunity for microhomology-mediated templated resolution.
We have previously shown that POLQ-mediated end joining is the primary pathway acting on DSBs that result from replication-blocking endogenous lesions [5–7]. An intriguing question concerns the size distribution of resulting deletions: as also shown here, one junction is defined by the replication fork impediment, but what defines the other end? Genetic and molecular dissection of replication-obstructing G-quadruplex structures has led to the model where a replication-stalling DNA lesion results in a ssDNA gap downstream of the impediment [5–7]. More recently, we provided evidence supporting the idea that it is this gap that is responsible for a DSB (with ends 50 to a few hundred bp apart) when the gapped strand is replicated in the next S-phase [7]. POLQ-mediated alternative end-joining, instead of HR, subsequently acts on these replication-associated DSBs, because HR cannot repair the break using the sister chromatid as the latter still contains the replication-blocking impediment (see [7] for details).
In this study, we demonstrate an identical genetic requirement for the repair of DSBs resulting from mutagen exposure; however, it is as yet uncertain which replication-blocking lesions are causative. EMS induces a plethora of lesions [30], some of which have been shown to be potent blocks of the replicative polymerases [31], whereas UV/TMP treatment generates psoralen monoadducts on thymines and interstrand crosslinks (ICLs) with a great preference for thymines. Whether deletions induced by UV/TMP are the result of ICLs or monoadducts is an outstanding question because the notion of preferential junction formation at 5’TA sites is not discriminatory. Although this outcome perfectly fits a scenario of replication up to the first damaged base of juxtaposed T-T ICLs, it also fits replication blocking at monoadducts that are preferentially induced at 5’TA sites. The hypersensitivity of C. elegans POLQ mutant animals towards alkylating and crosslinking agents (as also observed for POLQ/Mus308 mutant Drosophila) may seem to contradict an apparent lack of sensitivity in other systems, such as POLQ knockout mouse cells. We suspect this difference to primarily originate from the fact that C. elegans toxicity assays, especially those encompassing early embryonic cell divisions, are very sensitive to perturbations of DNA replication [12,32].
Exposure to mutagens, such as EMS and UV/TMP, is widely used to induce random mutations in a great variety of organisms other than C. elegans, such as Drosophila, zebrafish, Arabidopsis, tomato, and mouse. Although EMS-induced damage predominantly induces SNVs, in all these biological systems deletions have been observed ranging in size from a few base pairs to numerous kb [18,20,33–39], and it will be of great interest to investigate whether the causal involvement of POLQ-mediated repair is evolutionarily conserved.
In this work, we have linked a specific type of mutation, i.e. deletions of small size, to carcinogenic mutagens that are used in clinical settings. It is becoming increasingly important to establish causal relationships between the exact type and nature of DNA damaging agents and the genome alterations they induce, especially because of the growing interest in mutational signatures in cancer genomes. Recently, the altered genomes of cancer cells have been inspected not only for potentially cancer-promoting (driver) mutations but also for signatures that testify to the history of the tumour, with respect to genetic makeup and/or environmental exposure [40]. Currently, the majority of these signatures are based on single base substitutions and their surrounding DNA context, but cancer genomes are loaded with copy number variations, deletions and insertions, and also gross chromosomal rearrangements that likely result from mutagenic DNA repair processes [41,42]. It will be interesting to inspect cancer genomes, especially those evolving in cancer cells that are characterized by a defect in homologous recombination, for genomic scars that carry the signature of POLQ-mediated end joining, and so to determine the contribution of this mutagenic pathway to tumorigenesis.
Methods
C. elegans genetics
Standard methods and conditions for culturing C. elegans were used [15]. The alleles used in this study were: polh-1(lf31); polq-1(tm2026); fcd-2(tm1298). Bristol N2 was used as wild type in all experiments.
Nematode mutagenesis
Mutagenesis with EMS was performed at 12.5 mM, 25 mM, 50 mM or 100 mM according to standard protocols [15]. In brief, populations were synchronized by alkaline hypochlorite treatment and eggs were allowed to hatch overnight. L1 worms were plated out on 9 cm NGM agar plates seeded with E. coli (OP50) and grown at 20°C. Two days later, L4 worms were washed off the plates and treated for 4 hours with EMS dissolved in M9.
A similar staging protocol was used for UV/TMP mutagenesis. Subsequently, animals of the L4 stage were treated for one hour with 10 μg/ml TMP (Sigma, T6137, stock: 100 mg dissolved in 40 ml acetone) dissolved in M9. Animals were distributed onto non-seeded NGM plates and exposed to UVA irradiation (366 nm; CAMAG 29200 Universal UV LAMP) at a dose rate of 160 μW/cm² (Blak-Ray UV-meter model no. J221), after which the animals were transferred to standard OP50/NGM plates.
Sensitivity assays
Staged animals were exposed to either EMS or UV/TMP at the L4 larval stage and, per experimental condition, four plates each containing three worms were started. After a 24–36 hour period of egg laying, the mothers were removed. The number of (dead) eggs and hatched progeny (after 24 hours) was determined. All experiments were performed in triplicate. We determined the brood size by collecting eggs from individual hermaphrodites in sequential periods of 24 hours. For each period, the number of (dead) eggs and hatched progeny (after 24 hours) was determined, and the counts of all periods were then added.
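The counts described above lend themselves to a simple calculation. The following is a minimal Python sketch under the assumption that survival is expressed as the percentage of laid eggs that hatch; the text reports only which counts were taken, not the exact formula, and the function names are hypothetical.

```python
def survival_percentage(hatched: int, dead_eggs: int) -> float:
    """Percentage of laid eggs that hatched (assumed definition of survival)."""
    total = hatched + dead_eggs
    if total == 0:
        raise ValueError("no eggs were counted")
    return 100.0 * hatched / total


def brood_size(periods: list[tuple[int, int]]) -> int:
    """Sum hatched progeny and dead eggs over sequential 24-hour periods.

    Each element of `periods` is a (hatched, dead_eggs) pair for one period.
    """
    return sum(hatched + dead for hatched, dead in periods)


# Illustrative numbers only, not data from the study.
print(survival_percentage(75, 25))        # 75.0
print(brood_size([(100, 5), (80, 10)]))   # 195
```

Per condition, such values would be computed for each plate and then averaged over the plates and the three replicate experiments.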
Deletion library PCR assay
For each deletion library, ~80,000 animals were used for synchronization by hypochlorite treatment (0.5 M NaOH, 2% hypochlorite) and overnight starvation. Animals of the L4 stage were treated with EMS (50 mM), UV/TMP (50 J/m²) or mock-treated. P0 animals were removed by hypochlorite treatment 24 hours post-UV/TMP-treatment, and after overnight hatching ~100,000 F1 animals were transferred to ten 9 cm plates and grown for two days at 20°C. Then, animals were collected by rinsing the plates with M9 and distributed over ten 96-well plates such that each well contained ~100 worms in a 5 μl volume. To this, 10 μl of lysis buffer was added and animals were subsequently subjected to a standard lysis protocol to liberate the DNA. All ten plates were pooled into one master plate (using 10 μl of the original DNA mixture), which was used for another round of pooling by combining 10 μl from each of the eight wells in a column, finally yielding one row of 12 wells per library. Prior to performing nested PCRs for eight different genomic targets (see S1 Table), the DNA was digested with the thermostable restriction enzyme PspGI. Upon detection of a smaller-than-wild-type product in the pools, PCRs were repeated on the master plate and then on individual plates. The PCR products of the samples that remained positive during this deconvolution exercise (in duplicate) were sequenced. We considered a result a false positive if the samples of lower complexity failed to reproduce the PCR product.
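The pooling arithmetic of this protocol can be made explicit with a short Python sketch. The worm and plate numbers are taken from the text and the 8 by 12 layout is the standard 96-well geometry. The count of PCRs needed to trace one positive pool assumes that only the eight master-plate wells of the positive column, and then the ten corresponding wells of the individual plates, are re-tested; this is our reading of the deconvolution step and is not stated in the protocol. All names are hypothetical.

```python
# Illustrative pooling arithmetic; constant and function names are hypothetical.
WORMS_PER_WELL = 100   # ~100 F1 animals per well (from the protocol)
PLATES = 10            # ten 96-well plates per library
ROWS, COLS = 8, 12     # standard 96-well geometry


def pool_complexity() -> tuple[int, int, int]:
    """Animals represented per master-plate well, per column pool, per library."""
    master_well = PLATES * WORMS_PER_WELL   # ten plates pooled well by well
    column_pool = ROWS * master_well        # eight wells of a column combined
    library = COLS * column_pool            # twelve column pools per library
    return master_well, column_pool, library


def pcrs_to_resolve_one_hit() -> int:
    """PCRs per target to trace one positive pool to a single well (assumed scheme)."""
    return COLS + ROWS + PLATES             # 12 pools, 8 master wells, 10 plates


print(pool_complexity())          # (1000, 8000, 96000)
print(pcrs_to_resolve_one_hit())  # 30
```

The total of 96,000 animals agrees with the ~100,000 F1 animals per library, and under the assumed scheme a deletion is traced with 30 reactions per target rather than the 960 that testing every well individually would take.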
Bioinformatic analyses
The sequence information for publicly available deletion alleles was retrieved from WormBase (WS243). A custom Java program was written to analyse and annotate the WormBase alleles (available upon request). We included a number of additional stringency criteria: 1) the coordinates of the allele should match the information about the allele’s left and right border sequence, 2) insertions within deletions should be as minimal as possible, 3) insertions that contained one or more Ns were discarded. In addition, for cases where sequence homology at the junctions allowed for more than one possible mapping position, we placed the homology at the retained flank of the 5’ side. To identify the origin of the insertions in the delins alleles, we i) performed BLAST for insertions ≥15 nt, and ii) used a custom-made algorithm aimed at finding the longest common substring, i.e. the longest possible match between a stretch of the insertion (≥5 nt) and the sequence that is in close proximity to the junctions (≤50 nt of each flank and 50 nt within either side of the deletion). All deletion alleles used in our analyses can be found in S3 Table.
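The longest-common-substring search described above can be illustrated with a minimal Python sketch. This is not the authors' Java program; function names are hypothetical, the flank is assumed to have been extracted beforehand as a plain string, and the additional search of the reverse complement of the insertion is our assumption rather than something stated in the text.

```python
def longest_common_substring(insert: str, flank: str, min_len: int = 5):
    """Return (length, insert_start, flank_start) of the longest shared stretch,
    or None if the longest match is shorter than min_len."""
    best = (0, 0, 0)
    # prev[j] holds the length of the common suffix ending at insert[i-2], flank[j-1]
    prev = [0] * (len(flank) + 1)
    for i in range(1, len(insert) + 1):
        curr = [0] * (len(flank) + 1)
        for j in range(1, len(flank) + 1):
            if insert[i - 1] == flank[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best[0]:
                    best = (curr[j], i - curr[j], j - curr[j])
        prev = curr
    return best if best[0] >= min_len else None


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_insert_origin(insert: str, flank: str, min_len: int = 5):
    """Best match of the insert, in either orientation (assumption), to the flank."""
    candidates = []
    for label, query in (("forward", insert),
                         ("reverse_complement", reverse_complement(insert))):
        hit = longest_common_substring(query, flank, min_len)
        if hit is not None:
            candidates.append((hit[0], label, hit[1], hit[2]))
    # Ties are resolved in favour of the forward orientation (listed first).
    return max(candidates, key=lambda c: c[0]) if candidates else None


# Toy sequences, not alleles from the study.
print(find_insert_origin("TTGACCTAGG", "AAAGACCTAGCCC"))
# (7, 'forward', 2, 3)
```

In the toy example, seven nucleotides of the insert (GACCTAG) match the flank starting at position 3 (0-based). Matches shorter than 5 nt are rejected, mirroring the ≥5 nt threshold given in the text.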
Supporting Information
Acknowledgments
We thank the C. elegans Knockout Consortium, Shohei Mitani and the C. elegans Genetics Center for providing strains and sequence information of all deletion alleles.
Data Availability
Data are available from WormBase (WormBase.org) under the ID numbers provided in the manuscript (S3 Table).
Funding Statement
MT is supported by a starting grant from the European Research Council (203379, DSBrepair). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. van Schendel R, Roerink SF, Portegijs V, van den Heuvel S, Tijsterman M (2015) Polymerase Theta is a key driver of genome evolution and of CRISPR/Cas9-mediated mutagenesis. Nat Commun 6: 7394. doi:10.1038/ncomms8394
- 2. Chan SH, Yu AM, McVey M (2010) Dual roles for DNA polymerase theta in alternative end-joining repair of double-strand breaks in Drosophila. PLoS Genet 6: e1001005. doi:10.1371/journal.pgen.1001005
- 3. Ceccaldi R, Liu JC, Amunugama R, Hajdu I, Primack B, Petalcorin MI et al. (2015) Homologous-recombination-deficient tumours are dependent on Poltheta-mediated repair. Nature 518: 258–262. doi:10.1038/nature14184
- 4. Mateos-Gomez PA, Gong F, Nair N, Miller KM, Lazzerini-Denchi E, Sfeir A (2015) Mammalian polymerase theta promotes alternative NHEJ and suppresses recombination. Nature 518: 254–257. doi:10.1038/nature14157
- 5. Roerink SF, van Schendel R, Tijsterman M (2014) Polymerase theta-mediated end joining of replication-associated DNA breaks in C. elegans. Genome Res 24: 954–962. doi:10.1101/gr.170431.113
- 6. Koole W, van Schendel R, Karambelas AE, van Heteren JT, Okihara KL, Tijsterman M (2014) A Polymerase Theta-dependent repair pathway suppresses extensive genomic instability at endogenous G4 DNA sites. Nat Commun 5: 3216. doi:10.1038/ncomms4216
- 7. Lemmens B, van Schendel R, Tijsterman M (2015) Mutagenic consequences of a single G-quadruplex demonstrate mitotic inheritance of DNA replication fork barriers. Nat Commun 6: 8909. doi:10.1038/ncomms9909
- 8. McVey M, Lee SE (2008) MMEJ repair of double-strand breaks (director's cut): deleted sequences and alternative endings. Trends Genet 24: 529–538. doi:10.1016/j.tig.2008.08.007
- 9. Ceccaldi R, Rondinelli B, D'Andrea AD (2016) Repair Pathway Choices and Consequences at the Double-Strand Break. Trends Cell Biol 26: 52–64. doi:10.1016/j.tcb.2015.07.009
- 10. Yousefzadeh MJ, Wyatt DW, Takata K, Mu Y, Hensley SC, Tomida J et al. (2014) Mechanism of suppression of chromosomal instability by DNA polymerase POLQ. PLoS Genet 10: e1004654. doi:10.1371/journal.pgen.1004654
- 11. Kent T, Chandramouly G, McDevitt SM, Ozdemir AY, Pomerantz RT (2015) Mechanism of microhomology-mediated end-joining promoted by human DNA polymerase theta. Nat Struct Mol Biol. doi:10.1038/nsmb.2961
- 12. Roerink SF, Koole W, Stapel LC, Romeijn RJ, Tijsterman M (2012) A broad requirement for TLS polymerases eta and kappa, and interacting sumoylation and nuclear pore proteins, in lesion bypass during C. elegans embryogenesis. PLoS Genet 8: e1002800. doi:10.1371/journal.pgen.1002800
- 13. Muzzini DM, Plevani P, Boulton SJ, Cassata G, Marini F (2008) Caenorhabditis elegans POLQ-1 and HEL-308 function in two distinct DNA interstrand cross-link repair pathways. DNA Repair (Amst) 7: 941–950. doi:10.1016/j.dnarep.2008.03.021
- 14. Brauchle M, Baumer K, Gonczy P (2003) Differential activation of the DNA replication checkpoint contributes to asynchrony of cell division in C. elegans embryos. Curr Biol 13: 819–827. doi:10.1016/S0960-9822(03)00295-1
- 15. Brenner S (1974) The genetics of Caenorhabditis elegans. Genetics 77: 71–94.
- 16. Coulondre C, Miller JH (1977) Genetic studies of the lac repressor. IV. Mutagenic specificity in the lacI gene of Escherichia coli. J Mol Biol 117: 577–606.
- 17. Anderson P (1995) Mutagenesis. Methods Cell Biol 48: 31–58.
- 18. Flibotte S, Edgley ML, Chaudhry I, Taylor J, Neil SE, Rogula A et al. (2010) Whole-genome profiling of mutagenesis in Caenorhabditis elegans. Genetics 185: 431–441. doi:10.1534/genetics.110.116616
- 19. Wei A, Yuan A, Fawcett G, Butler A, Davis T, Xu SY et al. (2002) Efficient isolation of targeted Caenorhabditis elegans deletion strains using highly thermostable restriction endonucleases and PCR. Nucleic Acids Res 30: e110. doi:10.1093/nar/gnf109
- 20. Jansen G, Hazendonk E, Thijssen KL, Plasterk RH (1997) Reverse genetics by chemical mutagenesis in Caenorhabditis elegans. Nat Genet 17: 119–121. doi:10.1038/ng0997-119
- 21. Edgley M, D'Souza A, Moulder G, McKay S, Shen B, Gilchrist E et al. (2002) Improved detection of small deletions in complex pools of DNA. Nucleic Acids Res 30: e52. doi:10.1093/nar/gnf051
- 22. Harris TW, Antoshechkin I, Bieri T, Blasiar D, Chan J, Chen WJ et al. (2010) WormBase: a comprehensive resource for nematode research. Nucleic Acids Res 38: D463–D467. doi:10.1093/nar/gkp952
- 23. Muniandy PA, Liu J, Majumdar A, Liu ST, Seidman MM (2010) DNA interstrand crosslink repair in mammalian cells: step by step. Crit Rev Biochem Mol Biol 45: 23–49. doi:10.3109/10409230903501819
- 24. Esposito F, Brankamp RG, Sinden RR (1988) DNA sequence specificity of 4,5',8-trimethylpsoralen cross-linking. Effect of neighboring bases on cross-linking the 5'-TA dinucleotide. J Biol Chem 263: 11466–11472.
- 25. Boyer V, Moustacchi E, Sage E (1988) Sequence specificity in photoreaction of various psoralen derivatives with DNA: role in biological activity. Biochemistry 27: 3011–3018. doi:10.1021/bi00408a052
- 26. Arana ME, Seki M, Wood RD, Rogozin IB, Kunkel TA (2008) Low-fidelity DNA synthesis by human DNA polymerase theta. Nucleic Acids Res 36: 3847–3856. doi:10.1093/nar/gkn310
- 27. Seki M, Masutani C, Yang LW, Schuffert A, Iwai S, Bahar I et al. (2004) High-efficiency bypass of DNA damage by human DNA polymerase Q. EMBO J 23: 4484–4494. doi:10.1038/sj.emboj.7600424
- 28. Kent T, Mateos-Gomez PA, Sfeir A, Pomerantz RT (2016) Polymerase theta is a robust terminal transferase that oscillates between three different mechanisms during end-joining. Elife 5. doi:10.7554/eLife.13740
- 29. Kleinboelting N, Huep G, Appelhagen I, Viehoever P, Li Y, Weisshaar B (2015) The Structural Features of Thousands of T-DNA Insertion Sites Are Consistent with a Double-Strand Break Repair-Based Insertion Mechanism. Mol Plant 8: 1651–1664. doi:10.1016/j.molp.2015.08.011
- 30. Sega GA (1984) A review of the genetic effects of ethyl methanesulfonate. Mutat Res 134: 113–142. doi:10.1016/0165-1110(84)90007-1
- 31. Sedgwick B (2004) Repairing DNA-methylation damage. Nat Rev Mol Cell Biol 5: 148–157. doi:10.1038/nrm1312
- 32. Brauchle M, Baumer K, Gonczy P (2003) Differential activation of the DNA replication checkpoint contributes to asynchrony of cell division in C. elegans embryos. Curr Biol 13: 819–827. doi:10.1016/S0960-9822(03)00295-1
- 33. Wienholds E, Schulte-Merker S, Walderich B, Plasterk RH (2002) Target-selected inactivation of the zebrafish rag1 gene. Science 297: 99–102. doi:10.1126/science.1071762
- 34. Wienholds E, van EF, Kosters M, Mudde J, Plasterk RH, Cuppen E (2003) Efficient target-selected mutagenesis in zebrafish. Genome Res 13: 2700–2707. doi:10.1101/gr.1725103
- 35. Greene EA, Codomo CA, Taylor NE, Henikoff JG, Till BJ, Reynolds SH et al. (2003) Spectrum of chemically induced mutations from a large-scale reverse-genetic screen in Arabidopsis. Genetics 164: 731–740.
- 36. Nairz K, Zipperlen P, Dearolf C, Basler K, Hafen E (2004) A reverse genetic screen in Drosophila using a deletion-inducing mutagen. Genome Biol 5: R83. doi:10.1186/gb-2004-5-10-r83
- 37. Cooper JL, Greene EA, Till BJ, Codomo CA, Wakimoto BT, Henikoff S (2008) Retention of induced mutations in a Drosophila reverse-genetic resource. Genetics 180: 661–667. doi:10.1534/genetics.108.092437
- 38. Thompson O, Edgley M, Strasbourger P, Flibotte S, Ewing B, Adair R et al. (2013) The million mutation project: a new approach to genetics in Caenorhabditis elegans. Genome Res 23: 1749–1762. doi:10.1101/gr.157651.113
- 39. Shirasawa K, Hirakawa H, Nunome T, Tabata S, Isobe S (2016) Genome-wide survey of artificial mutations induced by ethyl methanesulfonate and gamma rays in tomato. Plant Biotechnol J 14: 51–60. doi:10.1111/pbi.12348
- 40. Helleday T, Eshtad S, Nik-Zainal S (2014) Mechanisms underlying mutational signatures in human cancers. Nat Rev Genet 15: 585–598. doi:10.1038/nrg3729
- 41. Nik-Zainal S, Alexandrov LB, Wedge DC, Van LP, Greenman CD, Raine K et al. (2012) Mutational processes molding the genomes of 21 breast cancers. Cell 149: 979–993. doi:10.1016/j.cell.2012.04.024
- 42. Nik-Zainal S, Davies H, Staaf J, Ramakrishna M, Glodzik D, Zou X et al. (2016) Landscape of somatic mutations in 560 breast cancer whole-genome sequences. Nature. doi:10.1038/nature17676
- 43. Wei A, Yuan A, Fawcett G, Butler A, Davis T, Xu SY et al. (2002) Efficient isolation of targeted Caenorhabditis elegans deletion strains using highly thermostable restriction endonucleases and PCR. Nucleic Acids Res 30: e110. doi:10.1093/nar/gnf109