Abstract
G-quadruplexes (G4s), a type of non-B DNA, play important roles in a wide range of molecular processes, including replication, transcription, and translation. Genome integrity relies on efficient and accurate DNA synthesis, and is compromised by various stressors, to which non-B DNA structures such as G4s can be particularly vulnerable. However, the impact of G4 structures on DNA polymerase fidelity is largely unknown. Using an in vitro forward mutation assay, we investigated the fidelity of human DNA polymerases delta (δ4, four-subunit), eta (η), and kappa (κ) during synthesis of G4 motifs representing those in the human genome. The motifs differ in sequence, topology, and stability, features that may affect DNA polymerase errors. Polymerase error rate hierarchy (δ4<κ<η) is largely maintained during G4 synthesis. Importantly, we observed unique polymerase error signatures during synthesis of VEGF G4 motifs, stable G4s which form parallel topologies. These statistically significant errors occurred within, immediately flanking, and encompassing the G4 motif. For pol δ4, the errors were deletions, insertions and complex errors within the G4 or encompassing the G4 motif and surrounding sequence. For pol η, the errors occurred in 3’ sequences flanking the G4 motif. For pol κ, the errors were frameshift mutations within G-tracts of the G4. Because these error signatures were not observed during synthesis of an antiparallel G4 and, to a lesser extent, a hybrid G4, we suggest that G4 topology and/or stability could influence polymerase fidelity. Using in silico analyses, we show that most polymerase errors are predicted to have minimal effects on predicted G4 stability. Our results provide a unique view of G4s not previously elucidated, showing that G4 motif heterogeneity differentially influences polymerase fidelity within the motif and flanking sequences. Thus, our study advances the understanding of how DNA polymerase errors contribute to G4 mutagenesis.
Keywords: non-B DNA, error signature, DNA replication, DNA polymerase η, DNA polymerase κ, DNA polymerase δ
1. Introduction
DNA polymerases are key determinants of genome evolution and mutagenesis. Human genome integrity is maintained by orchestrating numerous nuclear DNA polymerases, whose diverse functions are required for DNA replication, repair, recombination, and specialized functions such as translesion synthesis (TLS). DNA polymerase δ (pol δ) is the workhorse of genome duplication and carries out not only the bulk of DNA replication, but also functions in DNA repair and recombination (1, 2). Beyond TLS, polymerases η (pol η) and κ (pol κ) efficiently replicate sequences with non-B DNA potential that cause a natural obstacle to replicative polymerases (reviewed in (3)) and create genome-stabilizing interruption mutations within microsatellite sequences (4). Pol η is critical for replicating and maintaining common fragile sites (5, 6). Pol κ can efficiently synthesize AT repeats and quasipalindromes (7) and promotes recovery of stalled replication forks (8).
G-quadruplexes (G4s) are a type of non-B DNA structure formed through Hoogsteen base pairing of repeated guanines within tracts (G-tracts) separated by loop sequences and stabilized by monovalent cations (9-12). Genome-wide studies predict the human genome encodes between ~376,000 (13) and >700,000 potential G4 motifs (14). Transient G4 formation is implicated in the regulation of various cellular and molecular processes, including but not limited to, replication, epigenetic regulation, transcription, and translation (reviewed in (15)). Importantly, G4s are emerging as attractive therapeutic targets (reviewed in (16)), particularly as a synthetic lethal strategy for homologous recombination-deficient cancers (17, 18). However, G4 DNA structure formation also can have negative biological consequences. G4 structures impact DNA helicase and polymerase enzymes (reviewed in (19)), and replication through G4s may rely on regulated coordination between these enzymes (20). Notably, DNA replication in the absence of helicases that unwind G4 structures (such as FANCJ or DHX36) leads to replisome stalling (21). G4 motifs also are associated with genome instability (reviewed in (22, 23)). Increased levels of germline single nucleotide variants are found at or near G4 loci and G4 regions are associated with sites of point mutation, translocation, and copy number variant breakpoints in cancer genomes (24-28). These negative effects of G4 motifs (inhibition and instability) set up a potential paradox: because G4 structures play many prominent biological roles, it is crucial that G4 motifs be accurately and efficiently replicated in the genome in order to retain cellular functions.
Several potential mechanisms could underlie increased mutagenesis at genomic G4 loci (26). One source of mutagenesis and genome instability is DNA polymerase errors; however, the fidelity of DNA polymerases during G4 synthesis remains largely unknown. A few studies provide evidence for replication-associated polymerases in G4 bypass (e.g., Rev1 (29-31)) or for polymerases involved in repair upon G4-induced double-strand breaks (e.g., pol theta (pol θ) (32)). In vitro evidence suggests that the catalytic cores of human pol η and pol κ can efficiently bind and facilitate G4 synthesis (33, 34); however, stabilized G4s can induce stalling of these polymerases (33, 35). Recently, pol η was shown to interact with the DHX9 helicase and to localize to genomic G4 motifs (20). Thus, the roles of specialized polymerases in G4 synthesis and genome maintenance warrant further investigation.
To better understand the mechanisms underlying G4 motif maintenance and evolution, we measured DNA polymerase errors during in vitro G4 synthesis for the human replicative DNA pol δ, as well as two specialized polymerases – pols η and κ. We focused on three G4 motifs found in different regions of the human genome and differing in sequence, topology, and thermal stability. We describe distinct polymerase error signatures arising from G4 synthesis that include intra-G4 motif errors, errors in flanking sequences, and large-scale deletion and insertion errors. These signatures are dependent on G4 sequence and/or topology and on polymerase identity. In silico analysis revealed that the majority of polymerase errors are not predicted to substantially impact inherent G4 stability, suggesting that G4 function of the resulting mutated motif might be retained. Taken together, this study shows that some G4s differentially influence polymerase errors not only in the motif but also in the flanking sequences, thereby contributing to genome evolution.
2. Materials and methods
2.1. G4 Biophysical Characterization
All sequences were tested for G4 formation by circular dichroism (CD) as previously described (36). Each oligonucleotide was analyzed with or without denaturation (5 minutes at 90°C) in low-salt buffer (1 mM Na-phosphate, pH 7). After cooling back to room temperature, individual polymerase reaction buffer components were added sequentially in the following order: 25 mM K+-phosphate (pH 7.2); 5 mM MgCl2; 5 mM DTT; 100 μg/mL BSA; 100 mM KCl. CD spectra were measured after each addition and again after 24 hours. Native polyacrylamide gel electrophoresis (PAGE) was used to determine the molecularity of the studied oligos as described (36). Melting curves were measured according to (36) in 1 cm cells in 25 mM K+-phosphate (pH 7.2) with 100 mM KCl. The temperature was increased in 1°C steps and the samples were equilibrated for 2 min before each measurement. The melting curves were monitored by the decrease of absorbance at 297 nm and expressed as data normalized from 1 to 0 (1, native form; 0, denatured form). The Tm values were calculated as the mid-transition point of the normalized melting curves.
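The mid-transition calculation can be illustrated with a minimal Python sketch. It assumes that the native and denatured plateaus correspond to the highest and lowest absorbance readings and that Tm is read off by linear interpolation at a normalized value of 0.5; the function names and the readings are hypothetical and are not data from this study.

```python
def normalize_melting_curve(absorbance):
    """Scale absorbance so the native plateau is 1 and the denatured plateau is 0.

    Assumption: the plateaus are the maximum and minimum readings, because
    absorbance at 297 nm decreases as the G4 melts.
    """
    native, denatured = max(absorbance), min(absorbance)
    return [(a - denatured) / (native - denatured) for a in absorbance]


def melting_temperature(temps_c, absorbance):
    """Return the temperature at which the normalized signal crosses 0.5."""
    folded = normalize_melting_curve(absorbance)
    for i in range(len(folded) - 1):
        upper, lower = folded[i], folded[i + 1]
        if upper >= 0.5 > lower:
            # Linear interpolation between the two bracketing measurements.
            frac = (upper - 0.5) / (upper - lower)
            return temps_c[i] + frac * (temps_c[i + 1] - temps_c[i])
    raise ValueError("curve never crosses the mid-transition point")


# Hypothetical readings taken in 1 °C steps (illustration only).
temps = [70, 71, 72, 73, 74, 75, 76, 77, 78]
a297 = [0.50, 0.50, 0.49, 0.46, 0.40, 0.30, 0.22, 0.20, 0.20]
tm = melting_temperature(temps, a297)
print(f"Tm = {tm:.1f} °C")
```

Real melting curves often have sloping baselines, which this sketch does not correct for.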
2.2. Construction of G4 motif-containing ssDNA substrates
The pGEM-derived pRStu (chloramphenicol resistant, ChlorR) and pSStu (chloramphenicol sensitive, ChlorS) vectors containing the Herpes simplex virus-thymidine kinase (HSV-tk) sequence were previously described (37, 38). G4 motif inserts were constructed as described in these publications. Briefly, oligonucleotides including each G4 sequence (Table 1) were obtained from Integrated DNA Technologies (Coralville, Iowa). G4 sequences were inserted in-frame into the coding region of the HSV-tk sequence of the pSStu vector between nucleotides 111 and 112. The integrity of the tk gene in the final clones was verified by Sanger sequencing. HSV-tk phenotype was verified by growing vector-containing bacteria on selective media for both HSV-tk+ and HSV-tk− as described (39). Single-stranded DNA (ssDNA) was isolated after R408 helper phage infection of F’ E. coli strain JM109 (endA1, recA1, gyrA96, thi, hsdR17 (rK−, mK+), relA1, supE44, λ−, Δ(lac-proAB), (F’, traD36, proAB, lacIqZΔM15)), followed by chloroform:isoamyl alcohol and phenol extractions of purified phage particles.
Table 1.
G4 Motif Sequence a | ID b | Length (nts) | Location c | Topology d | Tm (°C) | Quadron Score |
---|---|---|---|---|---|---|
5’ GGGcgaaGGGGcgagccaGGGGtaaGGGG 3’ | FER1L4 | 29 | intron | antiparallel | 75.6 | 9.19 |
5’ GGGGcGGGccGGGGGcGGGGtcccggcGGGG 3’ | VEGF | 31 | promoter | parallel | 83.5 | 26.09 |
5’ GGGGcGGGccGGGGGcGGGG 3’ | VEGFmut | 20 | n.a. | parallel | 85.5 | 24.19 |
5’ GGGGcggacccGGGGcGGGGcaaGGG 3’ | TAGLN2 | 26 | enhancer | hybrid | 68 | 13.46 |
a G-tracts, uppercase; loops, lowercase
b ID of G4 based on gene in which it is located or near
c Based on human genome (Dec. 2013 Human GRCh38/hg38)
d Determined by circular dichroism
nts, nucleotides; Tm, thermal melting temperature; n.a., not applicable
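The motif architecture in Table 1 can be checked with a short Python sketch that splits each sequence into G-tracts and loops. It relies only on the case convention of the table (uppercase G for G-tracts, lowercase for loops); the helper name `split_motif` is hypothetical.

```python
import re

# Sequences copied from Table 1 (uppercase = G-tract, lowercase = loop).
MOTIFS = {
    "FER1L4": "GGGcgaaGGGGcgagccaGGGGtaaGGGG",
    "VEGF": "GGGGcGGGccGGGGGcGGGGtcccggcGGGG",
    "VEGFmut": "GGGGcGGGccGGGGGcGGGG",
    "TAGLN2": "GGGGcggacccGGGGcGGGGcaaGGG",
}


def split_motif(seq):
    """Return (G-tracts, loops) for a sequence written in the Table 1 convention."""
    g_tracts = re.findall(r"G+", seq)      # runs of uppercase G
    loops = re.findall(r"[acgt]+", seq)    # runs of lowercase bases
    return g_tracts, loops


for name, seq in MOTIFS.items():
    g_tracts, loops = split_motif(seq)
    print(name, len(seq),
          "G-tract lengths:", [len(t) for t in g_tracts],
          "loop lengths:", [len(loop) for loop in loops])
```

The lengths printed by this sketch can be compared directly with the Length column of Table 1.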
2.3. DNA polymerases and accessory proteins
Purified full-length human pols η and κ were purchased from Enzymax (Lexington, KY). Human pol δ4 and proliferating cell nuclear antigen (PCNA) were purified as previously described (40). Yeast replication factor C (RFC) was a gift from Dr. Linda Bloom (University of Florida).
2.4. DNA polymerase pausing assay
The primer-extension assay to assess sequence-specific DNA polymerase pausing was performed as described (38). Briefly, ssDNAs for each G4-containing vector (above) were combined with 32P-labeled GStu2 primer (5’-CTGCTCAGGCCTGACTTCCG-3’) at approximately a 1:1 molar ratio. GStu2 hybridizes to the Stu I site located at HSV-tk position 180, the same site used to construct the gapped heteroduplex substrate for the mutagenesis assay (see below). Hybridizations were completed in 1X SSC by heating the reaction components to 80°C, followed by slow cooling to room temperature. Pol η reactions were performed using approximately 50 fmol primer-template (2.5 nM) in 25 mM K+-phosphate (pH 7.2), 5 mM MgCl2, 5 mM DTT, 100 μg/mL non-acetylated BSA, and 1 mM dNTPs, the same reaction buffer used for gap-filling reactions (below). Primer-template was pre-incubated in buffer for 3 min at 37°C and reactions were initiated upon addition of 1–2 pmol (limiting enzyme amounts) or 7.5 pmol (excess enzyme amount) of human pol η. Aliquots were removed at 30, 60, and 120 min and quenched in an equal volume of stop dye. A control reaction was performed with approximately 50 fmol primer-template in reaction buffer without polymerase to determine any high molecular mass species present in the primer (- Pol). A second control reaction was performed with excess (5 units) Exo− Klenow polymerase (Thermo-Fisher) to assess the amount of primer that productively bound to ssDNA and could be used by the polymerase (% Hyb). Reaction products were denatured and separated on an 8% polyacrylamide gel, along with a sequencing ladder prepared in reactions containing primer-template and Sequenase 2.0 (Thermo-Fisher). Gels were imaged using a Typhoon FLA 9500 laser scanner (GE Healthcare/Cytiva).
2.5. Construction of gapped heteroduplex substrates for polymerase reactions
Gapped heteroduplex molecules were generated by hybridizing Mlu I and Stu I-digested pRStu double-stranded DNA with ssDNA derived from pSStu G4-containing vectors, as previously described for other sequences (38). Briefly, pRStu dsDNA was denatured at 85°C for 12 minutes, followed by incubation with the G4-containing ssDNA for 1 minute at 85°C. DNAs were placed on ice for 5 minutes, 0.5-2.0X SSC was added, and DNAs were renatured by incubating at 60°C for 30 minutes and cooling to room temperature. The gapped heteroduplex was purified by agarose gel purification, using GeneClean (MP Biomedicals). The resulting heteroduplex molecules contain a ssDNA region between the Mlu I and Stu I recognition sites, corresponding to a template sequence that includes the G4 motif plus 97 bases of flanking HSV-tk sequence.
2.6. In vitro HSV-tk forward mutation assay
The in vitro HSV-tk polymerase assay and the use of a common buffer optimized for the three different DNA polymerases studied here have been previously described (37, 41). Briefly, gel-purified gapped heteroduplex molecules were used as DNA substrates. In vitro gap-filling reactions contained ~0.075 pmol of substrate and were performed in buffer containing 25 mM K+-phosphate (pH 7.2), 5 mM MgCl2, 5 mM DTT, 100 μg/mL non-acetylated BSA, 1 mM dNTPs and 0.75–3 pmol of pol η or 1.5–6 pmol pol κ in 100 μL final volume. Polymerase concentrations and reaction times were determined empirically by titration. High dNTP and polymerase concentrations and longer reaction times were used to reveal all potential polymerase errors by promoting extension of pre-mutational intermediates and complete gap-filling synthesis through multiple binding events. Reactions were incubated at 37°C for 1 or 2 h for pol η and 2 h for pol κ. For accurate comparison of all three enzymes, pol δ4 reactions were performed in the same 25 mM K+-phosphate buffer. High salt and ATP are required for yeast RFC loading of PCNA (42). Therefore, we used our PCNA pre-loading model, which has been published previously (41). Gapped DNA substrate (0.075 pmol) was first preloaded with 94 fmol human PCNA and 375 fmol yeast RFC in a buffer containing 20 mM Tris (pH 7.5), 8 mM MgCl2, 5 mM DTT, 40 μg/mL non-acetylated BSA, 150 mM KCl, 5% glycerol, and 0.5 mM ATP at 37°C for 5 minutes. DNA substrates with preloaded PCNA were diluted into the 25 mM K+-phosphate buffer, with 200 μg/mL non-acetylated BSA and 0.75–3 pmol pol δ4 in 100 μL final volume and incubated at 37°C for 30-45 minutes. All reactions were stopped with 15 mM EDTA, and polymerases were inactivated at 68°C for 3 minutes. Overnight agarose gel electrophoresis with DNA standards was used to verify complete gap-filling synthesis.
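To make the reaction stoichiometry explicit, the amounts stated above can be converted to concentrations and enzyme-to-substrate molar ratios. The following Python sketch is plain unit arithmetic on the stated amounts; the function and variable names are hypothetical.

```python
def concentration_nM(amount_pmol, volume_uL):
    """Convert pmol in a volume of µL to nM (1 pmol/µL equals 1 µM)."""
    return amount_pmol / volume_uL * 1000.0


def molar_excess(enzyme_pmol, substrate_pmol):
    """Molar ratio of enzyme (or accessory protein) to DNA substrate."""
    return enzyme_pmol / substrate_pmol


SUBSTRATE_PMOL = 0.075   # gapped heteroduplex per reaction
REACTION_UL = 100.0      # final reaction volume

substrate_nM = concentration_nM(SUBSTRATE_PMOL, REACTION_UL)

# Polymerase amounts (pmol) as (low, high) ends of the stated ranges.
pol_ranges_pmol = {"pol η": (0.75, 3.0), "pol κ": (1.5, 6.0), "pol δ4": (0.75, 3.0)}
excess = {
    pol: tuple(molar_excess(amount, SUBSTRATE_PMOL) for amount in amounts)
    for pol, amounts in pol_ranges_pmol.items()
}

pcna_excess = molar_excess(0.094, SUBSTRATE_PMOL)   # 94 fmol PCNA
rfc_excess = molar_excess(0.375, SUBSTRATE_PMOL)    # 375 fmol RFC

print(f"substrate: {substrate_nM:.2f} nM")
for pol, (low, high) in excess.items():
    print(f"{pol}: {low:.0f}- to {high:.0f}-fold molar excess over substrate")
print(f"PCNA: {pcna_excess:.2f}-fold; RFC: {rfc_excess:.0f}-fold")
```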
Complete reaction products were subjected to HSV-tk− selection in the recA13, upp, tdk E. coli strain FT334, based on 5-fluoro-2’-deoxyuridine (FUdR) resistance, as described (37, 43). Chloramphenicol was used to select progeny of the DNA polymerase-synthesized strand. Observed HSV-tk− mutant frequencies were calculated as the number of FUdR-resistant, chloramphenicol-resistant (FUdRRChlorR) colonies divided by the total number of ChlorR colonies. Independent FUdRRChlorR mutants were obtained, using at least two independent reactions per G4 substrate, per polymerase, to generate unbiased mutational spectra. For each independent mutant, 600 ng of plasmid DNA was sequenced using Sanger sequencing (Azenta Life Sciences, South Plainfield, NJ, or in-house). Sequencing files were imported into SnapGene software (from Insightful Science; available at snapgene.com), and mutant sequences were aligned to the wildtype sequence and manually interrogated. The presence of large deletions was confirmed by agarose gel electrophoresis after restriction enzyme digest. Background ssDNA mutation frequencies and sequences with no detectable errors within the single-stranded template were subtracted from the observed mutant frequency to derive the polymerase error frequencies (37). To account for multiple detectable errors per mutant, corrected error frequencies for each independent reaction were calculated using the formula described previously (43).
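The frequency calculations can be sketched in Python under stated assumptions. The sketch assumes the conventional definition of mutant frequency for this type of selection (FUdR-resistant, chloramphenicol-resistant colonies divided by all chloramphenicol-resistant colonies) and shows only a simple background subtraction; it does not reproduce the published correction for mutants lacking detectable errors or for multiple errors per mutant. All names and colony counts are hypothetical.

```python
def mutant_frequency(fudr_chlor_resistant, chlor_resistant):
    """Assumed definition: doubly resistant colonies / all ChlorR colonies."""
    if chlor_resistant <= 0:
        raise ValueError("total ChlorR colony count must be positive")
    return fudr_chlor_resistant / chlor_resistant


def background_subtracted(observed_mf, background_mf):
    """Simplified polymerase error frequency: observed minus ssDNA background.

    Floored at zero so a noisy background cannot yield a negative frequency.
    """
    return max(observed_mf - background_mf, 0.0)


# Hypothetical colony counts (illustration only, not data from this study).
observed = mutant_frequency(120, 400_000)      # polymerase reaction products
background = mutant_frequency(15, 500_000)     # ssDNA background control
pol_ef = background_subtracted(observed, background)

print(f"observed MF   = {observed:.2e}")
print(f"background MF = {background:.2e}")
print(f"polymerase EF = {pol_ef:.2e}")
```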
2.7. Quadron analysis
We used the computational tool Quadron to identify sequences with G4-forming potential and predict structure stability (44). Quadron employs a machine-learning algorithm trained on a large-scale experimental G4-seq data set to identify sequences with G4-forming potential in a given nucleotide sequence, and returns a score that indicates thermodynamic stability (herein referred to as the Quadron score). To investigate the effect of mutations on G4 stability, Quadron was run on the wildtype motifs (Table 1) and mutant sequences corresponding to polymerase errors, using the G4 motif embedded in 1,101 bp (540 bp on the 5’ end, 561 bp on the 3’ end) of the flanking HSV-tk gene sequence.
2.8. Statistical analysis
All statistical analyses were performed using GraphPad Prism 9 software. Continuous variables were analyzed with appropriate parametric or non-parametric tests and post hoc tests, where indicated, as detailed in figure legends. Categorical variables were analyzed using Fisher’s Exact or Chi-square (χ2) tests where specified. Analyses of data that required 2x3 or 3x3 contingency tables and did not meet Chi-square requirements were performed using an online Fisher’s Exact test (45). A p-value of < 0.05 was considered statistically significant.
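For readers unfamiliar with Fisher’s Exact test, the 2x2 case can be written in a few lines of Python using only the standard library. This is an illustrative reimplementation, not the GraphPad Prism or online calculator used in this study, and it does not cover the 2x3 or 3x3 tables mentioned above; the counts are hypothetical.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's Exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(a + b + c + d, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell
        # when the row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(low, high + 1))
               if p <= observed * (1 + 1e-9))


# Hypothetical counts: errors of one type vs. all other errors in two substrates.
p = fisher_exact_2x2(3, 1, 1, 3)
print(f"p = {p:.4f}")
```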
3. Results
3.1. FER1L4, VEGF, VEGFmut, and TAGLN2 form G4 structures in vitro
To understand DNA polymerase fidelity during G4 motif synthesis, we sought to study G4 motifs that are representative of those found in the human genome. We investigated three G4 motifs, herein referred to by their genic location, as well as a mutated version of one of these motifs (Table 1). The G4 motifs vary in G-tract number and length, as well as loop number, length, and sequence (Table 1). We used CD to confirm G4 structure formation in the same buffer used for our DNA polymerase assays. Addition of 25 mM K+ phosphate buffer induced G4 formation (Figure 1A, yellow line), consistent with biophysical studies of the c-MYC promoter G4 sequence (46). G4 structure was fully formed after addition of all polymerase reaction buffer components (25 mM K+ phosphate, 5 mM MgCl2, 5 mM DTT, 100 μg/mL non-acetylated BSA) (Fig. 1A, dark blue line, Full RB). CD spectra showing the effect of each buffer component added sequentially are shown in Supplemental Figure S1A. We also performed the CD analyses after the full RB was supplemented with 100 mM KCl (Fig. 1A, teal line). However, these ‘high [K+] conditions’ did not alter the CD spectrum relative to the Full RB alone. Finally, we verified that the HSV-tk sequences flanking the inserted motifs (hereafter referred to as TK) display characteristic B-DNA CD spectra (Supplemental Fig. S1A).
G4s are characterized by their topology and molecularity, i.e., the number of DNA strands involved in structure formation (reviewed in (47-49)). G4 topology may be classified as parallel or antiparallel, depending on the direction of the DNA strands, or a combination of the two, also known as hybrid. Primary DNA sequence largely determines topology. The equilibrium between B-DNA and G4s can be affected by the G4 motif sequence, monovalent cation identity, and temperature (49). The FER1L4 motif is a four G-tract sequence located within an intron of the FER1L4 gene that forms an intramolecular (Supplemental Fig. S1B) antiparallel G4 structure, with a maximum absorbance peak at 295 nm and a minimum peak at 260 nm (Fig. 1A). The VEGF motif is a five G-tract sequence located within the promoter region of the VEGF gene (50) and forms both intra- and bimolecular parallel G4 structures (Supplemental Fig. S1B), with a maximum peak at 264 nm and a minimum peak at 245 nm (Fig. 1A). We created VEGFmut, a derivative of the native VEGF motif, that contains the central G-tracts of the native motif but one less G-tract and loop (Table 1). Thus, VEGFmut is comparable to the FER1L4 and TAGLN2 motifs in sequence (e.g., number of G-tracts) but has a parallel topology and is more stable (intra- and bimolecular parallel structure; Fig. 1A and Supplemental Fig. S1B). The TAGLN2 motif is a four G-tract sequence that forms an intramolecular hybrid (3+1) G4, generating three parallel strands and one antiparallel (Fig. 1A and Supplemental Fig. S1B). The G4 motifs also vary in thermal melting temperature (Tm), in order of increasing Tm: TAGLN2 < FER1L4 < VEGF < VEGFmut (Supplemental Fig. S1C and Table 1).
We further verified G4 structure formation using a biochemical approach. We created pGEM-derived vectors by inserting the G4 sequences in-frame into the 5’ coding region of the Herpes simplex virus-thymidine kinase (HSV-tk) sequence, as previously described for microsatellite sequences (37, 38). After purifying each G4-containing ssDNA substrate, we performed a DNA polymerase pausing (primer extension) assay with human DNA pol η. DNA synthesis reactions were carried out in the full RB (Fig. 1B). Using limiting enzyme concentrations, pol η displayed increased pausing within each G4 motif (Fig. 1B). When excess enzyme was used, providing the opportunity for multiple polymerase binding events, pol η completed synthesis through each G4 (Fig. 1B). These data are consistent with G4 structure formation in vitro causing transient polymerase stalling, but not absolute blockage to the polymerase. We conclude from these CD and biochemical analyses that the sequences under investigation form intra-molecular G4 structures during the DNA polymerase synthesis reactions used in our study.
3.2. DNA polymerase δ4, κ, and η error frequencies in the HSV-tk assay
We used our established in vitro HSV-tk forward mutation assay to examine DNA polymerase errors during G4 motif synthesis. This approach has been used extensively by us to characterize polymerase fidelity within unique and repetitive microsatellite sequences (37, 38, 43, 51). A strength of the HSV-tk assay is that all frameshift and larger scale insertions, deletions, and complex polymerase errors are detected and quantified, together with base substitutions that inactivate the HSV-tk protein. In addition, the surrounding HSV-tk target sequence is G+C rich and contains several short repetitive motifs (e.g., GG and GCGC) that serve as an internal control against which to compare error specificity at inserted sequences (52).
We examined the four-subunit human DNA pol δ4 as a representative replicative polymerase and pols η and κ as representative specialized polymerases. Gapped heteroduplex molecules containing each G4 motif plus 97 bases of the flanking unique TK sequence were created from the pausing assay pGEM ssDNA vectors (Figure 1B) and used as substrates in the DNA polymerase fidelity assay (Fig. 2A). Polymerase concentrations needed for complete gap filling synthesis of the mutational target were empirically determined (see Figure 1B, “excess” conditions). Only complete reaction products, as determined by gel electrophoresis (Supplemental Fig. S2), were used for subsequent HSV-tk− selection in E. coli. Complete gap-filling reveals all polymerase errors by promoting extension of pre-mutational intermediates through multiple binding events, and avoids ambiguous results caused by residual gap-filling by E. coli polymerases after bacterial transformation.
As expected, the observed mutation frequencies varied significantly among polymerases, with pol δ4 having the lowest mutation frequency (highest fidelity) and pol η having the highest mutation frequency (lowest fidelity) (Fig. 2B). The pol δ4 error frequency was only slightly elevated (1.1- to 3.5-fold) above the background mutation frequency (Supplemental Table S1), consistent with our previous results (43). The pol κ error frequency was 1.9- to 18-fold higher than that of pol δ4, while the pol η error frequency was 2.6- to 4.4-fold higher than that of pol κ, except for TAGLN2 where pol η and pol κ error frequencies were similar (Supplemental Table S1).
To examine mutational signatures, we generated polymerase error spectra from two to four reactions for each polymerase and each G4-containing substrate (Supplemental Fig. S3-S6). We observed three general classes of errors within the mutational target (Figure 2A): (1) errors contained wholly within the G4 motif (hereafter called intra-G4 errors); (2) errors contained wholly within the TK internal control sequence (TK errors); and (3) errors involving both the G4 and TK sequences (G4+TK errors). Table 2 summarizes the distribution of detectable errors (i.e., those creating a non-functional TK) by G4 motif and polymerase. Below, we discuss results from each of these classes separately.
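The three error classes can be expressed as a simple coordinate rule, sketched here in Python. The coordinate system (1-based, inclusive template positions) and the example G4 boundaries are hypothetical, and the sketch does not handle errors whose position cannot be assigned unambiguously.

```python
def classify_error(start, end, g4_start, g4_end):
    """Classify an error spanning template positions start..end (inclusive).

    Returns 'intra-G4' if the error lies wholly within the G4 motif, 'TK' if
    it lies wholly within the flanking HSV-tk sequence, and 'G4+TK' if it
    involves both.
    """
    if start > end:
        raise ValueError("start must not exceed end")
    if g4_start <= start and end <= g4_end:
        return "intra-G4"
    if end < g4_start or start > g4_end:
        return "TK"
    return "G4+TK"


# Hypothetical layout: a 29-nt G4 motif occupying positions 50-78 of the target.
G4_START, G4_END = 50, 78
examples = {
    "1-nt deletion in a G-tract": (60, 60),
    "base substitution in TK": (12, 12),
    "deletion spanning the 5' junction": (45, 55),
    "deletion removing the whole motif": (40, 90),
}
for label, (start, end) in examples.items():
    print(label, "->", classify_error(start, end, G4_START, G4_END))
```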
Table 2.
Error class | Pol δ4 FER1L4 | Pol δ4 VEGF | Pol δ4 VEGFmut | Pol δ4 TAGLN2 | Pol κ FER1L4 | Pol κ VEGF | Pol κ VEGFmut | Pol κ TAGLN2 | Pol η FER1L4 | Pol η VEGF | Pol η VEGFmut | Pol η TAGLN2 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
TK a | 36 (0.95) | 25 (0.54) | 20 (0.30) | n.d. | 39 (0.76) | 133 (0.83) | 103 (0.82) | 26 (0.50) | 64 (0.67) | 118 (0.78) | 95 (0.82) | 37 (0.62) |
Intra-G4 b | 2 (0.05) | 16 (0.35) | 18 (0.27) | n.d. | 11 (0.22) | 24 (0.15) | 19 (0.15) | 10 (0.19) | 28 (0.29) | 23 (0.15) | 10 (0.09) | 12 (0.20) |
TK+G4 c | 0 | 5 (0.11) | 29 (0.43) | n.d. | 1 (0.02) | 4 (0.02) | 3 (0.02) | 16 (0.31) | 3 (0.03) | 10 (0.07) | 11 (0.09) | 11 (0.18) |
Total | 38 | 46 | 67 | n.d. | 51 | 161 | 125 | 52 | 95 | 151 | 116 | 60 |
Counts of detectable polymerase errors (i.e., errors that resulted in a non-functional thymidine kinase), with proportions in parentheses.
a Errors confined to the HSV-tk sequence
b Errors confined to the G4 motif
c Large-scale deletions and errors that could not be unambiguously assigned to TK or intra-G4
n.d., not determined
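The proportions in Table 2 follow directly from the counts, as the following Python sketch shows. The counts are copied from the table; the dictionary layout and function name are hypothetical.

```python
# Counts from Table 2 as (TK, intra-G4, TK+G4).
# Pol δ4 with TAGLN2 was not determined and is therefore omitted.
TABLE2_COUNTS = {
    ("pol δ4", "FER1L4"): (36, 2, 0),
    ("pol δ4", "VEGF"): (25, 16, 5),
    ("pol δ4", "VEGFmut"): (20, 18, 29),
    ("pol κ", "FER1L4"): (39, 11, 1),
    ("pol κ", "VEGF"): (133, 24, 4),
    ("pol κ", "VEGFmut"): (103, 19, 3),
    ("pol κ", "TAGLN2"): (26, 10, 16),
    ("pol η", "FER1L4"): (64, 28, 3),
    ("pol η", "VEGF"): (118, 23, 10),
    ("pol η", "VEGFmut"): (95, 10, 11),
    ("pol η", "TAGLN2"): (37, 12, 11),
}


def proportions(counts):
    """Return each class count as a fraction of the column total (2 decimals)."""
    total = sum(counts)
    return tuple(round(c / total, 2) for c in counts)


for (pol, motif), counts in TABLE2_COUNTS.items():
    print(pol, motif, "total:", sum(counts), "proportions:", proportions(counts))
```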
3.3. Intra-G4 errors are sequence and polymerase dependent
To determine whether unique polymerase error signatures are present within G4 motifs, we designated three types of intra-G4 motif errors – defined as G-tract, Loop, or G-tract & Loop (Fig. 3A and Supplemental Table S2). G-tract & Loop errors include both a G-tract and adjacent loop sequence. (Errors that included both G4 and TK sequences were excluded from this analysis.) Both the FER1L4 and TAGLN2 motifs include three GGGG mononucleotide G-tracts, but these are not frameshift error hotspots for any of the polymerases (Fig. 3B, 3E; Supplemental Figures S3, S6; Table S2). However, for TAGLN2, the proportion of the three error types varied between polymerases in a statistically significant manner (p=0.0208, Fisher’s Exact test). Notably, pol κ errors included complex errors involving both G-tract & Loop sequences (Fig. 3E and Supplemental Tables S2 and S3). Another unusual error observed within TAGLN2 is the insertion of a guanine within a CCC sequence of the third loop (Supplemental Figure S6).
We measured statistically significant differences in error distribution among the three polymerases for the two VEGF motifs (p<0.00001, Fisher’s Exact test) (Fig. 3C, D). As one might expect when synthesizing G4 motifs with longer G-tracts (e.g., GGGGG in VEGF/VEGFmut), pols η and κ become increasingly error-prone for one-base frameshifts (indels) (Supplemental Figures S4, S5). Pol κ indel errors within G-tracts account for up to 95% of intra-G4 errors and generally shorten the G-tracts. Up to ~60% of intra-G4 pol η errors are within the G-tracts, frequently extending the G-tract by one nucleotide. However, this error specificity was not true for pol δ4 (Table S2; Supplemental Fig. S4, S5). Instead, within the VEGF motifs, 75%-94% of pol δ4 errors were ≥ 2 base deletions/insertions and complex mutations that span both a G-tract & Loop, rather than single base indels (Fig. 3C,D; Supplemental Fig. S4; Supplemental Table S2). Taken together, these data support the notion that G4 structure, and not only the G-rich sequence per se, contributes to intra-G4 polymerase errors in a polymerase-dependent manner.
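The assignment of an intra-G4 error to the G-tract, Loop, or G-tract & Loop category can be sketched in Python from the case-coded motif sequence. The 0-based coordinates and function names are hypothetical, and insertions are represented simply by the template positions they affect.

```python
def annotate(seq):
    """Label each motif position 'G' (uppercase G-tract) or 'L' (lowercase loop)."""
    return ["G" if base.isupper() else "L" for base in seq]


def intra_g4_error_type(seq, start, end):
    """Classify an error covering 0-based motif positions start..end (inclusive)."""
    if not 0 <= start <= end < len(seq):
        raise ValueError("error must lie within the motif")
    labels = set(annotate(seq)[start:end + 1])
    if labels == {"G"}:
        return "G-tract"
    if labels == {"L"}:
        return "Loop"
    # Any span touching both element types, including a span that bridges
    # two G-tracts, necessarily includes loop sequence.
    return "G-tract & Loop"


VEGFMUT = "GGGGcGGGccGGGGGcGGGG"   # sequence from Table 1
print(intra_g4_error_type(VEGFMUT, 10, 10))   # one base within the GGGGG tract
print(intra_g4_error_type(VEGFMUT, 8, 9))     # the two-base 'cc' loop
print(intra_g4_error_type(VEGFMUT, 7, 9))     # last G of GGG plus the 'cc' loop
```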
3.4. G4 motifs influence Pol η errors in HSV-tk flanking sequences
When comparing mutational spectra across G4 motifs, we observed a difference in the number and specificity of polymerase errors that occurred in the flanking TK sequences. To determine if this observation was unique to G4s, we compared the pol η and pol κ G4 mutational spectra (Fig. 4A and Supplemental Fig. S7A) to that of substrates containing microsatellite inserts that we previously studied using the HSV-tk assay – (GT)10, (TC)11, (T)8, and (A)8 (4, 43, 53). These microsatellites were inserted at the same HSV-tk position as the G4 motifs studied here, but cannot form G-quadruplex structures. We designated 5’ and 3’ TK flanking windows around the inserts to examine how the polymerase error distribution (within the inserts versus the flanking TK sequence) is affected by the identity of the insert sequence. Window lengths were set as the average G4 motif length (27 nt) so we could accurately compare these data to each G4 motif (see below). We counted all pol η and pol κ errors within the designated window, excluding those errors which could not be unambiguously assigned to a single window. Comparison of pol η error distribution between the insert and flanking regions revealed that each microsatellite (short tandem repeat, STR) insert is an error hotspot (Fig. 4B), and the difference in distribution among the STR sequences is statistically significant (p = 0.0001, Chi-square).
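The window definition can be illustrated with a short Python sketch. The window length is the rounded mean of the four motif lengths in Table 1; the coordinates, the assumption that positions increase from the 5’ flank to the 3’ flank, and the function name are hypothetical.

```python
from math import floor
from statistics import mean

# Motif lengths (nt) from Table 1.
G4_LENGTHS = {"FER1L4": 29, "VEGF": 31, "VEGFmut": 20, "TAGLN2": 26}

# The mean is 26.5 nt; round half up to 27 (Python's round() would give 26).
WINDOW = floor(mean(G4_LENGTHS.values()) + 0.5)


def assign_window(start, end, insert_start, insert_end, window=WINDOW):
    """Return the window that wholly contains an error at start..end.

    Positions are 1-based and inclusive. Returns None when the error spans
    more than one window or falls outside all three windows.
    """
    regions = {
        "5' flank": (insert_start - window, insert_start - 1),
        "insert": (insert_start, insert_end),
        "3' flank": (insert_end + 1, insert_end + window),
    }
    for name, (low, high) in regions.items():
        if low <= start and end <= high:
            return name
    return None


# Hypothetical insert occupying positions 112-131 (a 20-nt motif).
print(WINDOW)
print(assign_window(100, 100, 112, 131))
print(assign_window(140, 141, 112, 131))
print(assign_window(130, 135, 112, 131))
```

Errors for which the function returns None correspond to those excluded from the window counts.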
Next, we performed the same analysis for the G4 motifs. Contingency table analysis showed a significant association between G4 motif and error distribution (p<0.0001, Chi-square; Fig. 4C). The differences in pol η error count and specificity at the 3’ TK flank were particularly striking. In general, the 3’ TK sequence flanking the VEGF and VEGFmut motifs had a higher proportion of pol η errors compared to FER1L4 and TAGLN2. For instance, pol η’s signature C to T and T to C transition errors were increased in TK sequences adjacent to VEGF or VEGFmut, compared to FER1L4 and TAGLN2. Interestingly, deletions of more than one nucleotide made by pol η, categorized as large deletions, were also prominent in 3’ TK sequences surrounding VEGF and VEGFmut compared to FER1L4 and TAGLN2 (Fig. 4A).
Lastly, we asked whether the pol η error distributions differed between G4 motifs and the microsatellites. For this, we calculated the average pol η error counts for each sequence window among the STRs and compared them to each G4 (Fig. 4C). We found that substrates containing the more inherently stable, parallel G4s (i.e., VEGF and VEGFmut) had significantly different error distributions compared to the average STR, with errors predominantly at the 3’ flank (p<0.0001, Chi-square, Fig. 4C). These data suggest that the sequence composition of the G4 motif, and by extension the topology and/or stability of the structure, influence the proportion of pol η errors made in the 3’ flanking sequences as the polymerase approaches the G4 motif.
We performed the same analysis for pol κ errors (Supplemental Fig. S7). The distribution of pol κ errors between the STR insert and flanking regions differed significantly. However, when comparing all G4 motifs, pol κ error distributions were not different (p=0.0953, Chi-square; Supplemental Fig. S7C). When comparing pol κ error distributions within G4 motifs to the average STR, only VEGFmut showed a marginal difference (p<0.05, Chi-square, Supplemental Fig. S7C). Taken together, these data suggest that elevated pol η, but not pol κ, errors in the 3’ flanking sequences of G4 motifs are dependent on G4 stability and/or topology.
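The contingency table analyses in this section compare error counts across sequence windows. A minimal, standard-library Python sketch of a Chi-square test of independence is given below; it is an illustrative reimplementation rather than the GraphPad Prism analysis used here, its p-value formula is restricted to even degrees of freedom (which covers 2x3 and 3x3 tables), and the counts are hypothetical.

```python
from math import exp


def chi_square(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


def p_value_even_dof(stat, dof):
    """Upper-tail Chi-square probability; closed form valid only for even dof."""
    if dof <= 0 or dof % 2:
        raise ValueError("this closed form requires a positive, even dof")
    half = stat / 2.0
    term, total = 1.0, 1.0
    for k in range(1, dof // 2):
        term *= half / k          # builds (stat/2)^k / k!
        total += term
    return exp(-half) * total


# Hypothetical error counts: rows = two substrates,
# columns = 5' flank, insert, 3' flank.
counts = [[10, 20, 30],
          [20, 20, 20]]
stat, dof = chi_square(counts)
print(f"chi-square = {stat:.3f}, dof = {dof}, p = {p_value_even_dof(stat, dof):.4f}")
```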
3.5. G4 sequences promote large-scale polymerase errors encompassing the surrounding sequences.
Large-scale (> 2 bases) and/or complex polymerase errors (i.e., combinations of multiple error types) that involve both the G4 and TK sequences also highlight the influence of G4 motifs on polymerase fidelity. These large errors occurred most often in VEGF, VEGFmut, and TAGLN2, G4 motifs with parallel or hybrid topology, but infrequently in the FER1L4 sequence, a G4 motif with antiparallel topology (Table 2; Supplemental Table S3). In our previous studies of human pol δ4 fidelity, among the 355 mutations analyzed, we observed large deletions infrequently (2.5%) and large insertions rarely (0.28%) (43; K.A.E. and S.E.H., unpublished data). In contrast, here we readily observed pol δ4 mutations of the TK+G4 class (Table 2), and both large deletions and insertions were created during synthesis of the VEGF motifs (43%, VEGFmut and 11%, VEGF). As an example, pol δ4 repeatedly created insertions of the sequence GGCGGGGTCCC during synthesis of VEGFmut, which expanded the G4 motif (Supplemental Table S3). Pol δ4 did not create such errors during synthesis of the FER1L4 motif. Interestingly, pols η and κ also created a substantial proportion of large-scale G4+TK errors but did so within the TAGLN2 motif (Table 2; Supplemental Table S3).
To examine mechanisms underlying larger-scale errors, we used the mfold program (54) to predict hairpin structures within our substrates, with the caveat that mfold cannot predict G4s. We found that each G4-containing substrate has the potential to form one or more hairpin structures (Supplemental Fig. S8). The G4 sequences are involved in hairpin formation, and the G4-containing hairpins have lower free energy (ΔG) than hairpins predicted in the absence of a G4 (TK only). However, most large-scale deletion errors (Supplemental Table S3) could not be mapped directly to any one potential hairpin endpoint. This does not discount the possibility of transient stalling by the polymerase due to hairpin formation, resulting in deletions due to misalignment and slippage (55).
3.6. Polymerase error frequencies at single G4 motifs
We asked whether the polymerase error frequencies within G4 motifs, with their distinctive sequences and topologies, are elevated relative to their surrounding sequences. For this analysis, we compared two classes of mutations: intra-G4 errors and errors contained wholly within the TK internal control sequence (TK errors). To account for the differences in G4 and TK target sequence size, we calculated polymerase error frequencies per nucleotide for each G4 motif and its respective TK control sequence. Although some base substitutions will not result in an FUdR-resistant, Chlor-resistant mutant in our assay, the degree to which we underestimated the frequency of base substitutions is similar for the G4 motifs and TK sequences. For most G4 motifs examined, polymerase error frequencies per nucleotide were not higher within the G4 motif than within the surrounding TK sequence, with the notable exception of pol δ4 and VEGFmut (Supplemental Fig. S9).
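The size normalization described above amounts to dividing each error frequency by the length of its target sequence. The following minimal sketch shows the arithmetic. The 97-base TK length is the value reported in this section; the G4 motif length, both error frequencies, and the function name are hypothetical placeholders, not values from this study.

```python
def per_nucleotide_frequency(error_frequency, target_length_nt):
    """Normalize an error frequency by target size (errors per nucleotide)."""
    if target_length_nt <= 0:
        raise ValueError("target length must be a positive number of nucleotides")
    return error_frequency / target_length_nt

TK_LENGTH_NT = 97   # unique TK sequence length reported in this section
G4_LENGTH_NT = 22   # hypothetical G4 motif length, for illustration only

# Hypothetical error frequencies for the two mutation classes
intra_g4_error_frequency = 4.4e-4
tk_error_frequency = 19.4e-4

g4_per_nt = per_nucleotide_frequency(intra_g4_error_frequency, G4_LENGTH_NT)
tk_per_nt = per_nucleotide_frequency(tk_error_frequency, TK_LENGTH_NT)

# A ratio near 1 means errors are no more frequent per base inside the G4
ratio = g4_per_nt / tk_per_nt
print(f"G4: {g4_per_nt:.2e} per nt; TK: {tk_per_nt:.2e} per nt; ratio = {ratio:.2f}")
```

In this illustration the raw TK error frequency is more than fourfold higher than the intra-G4 frequency, yet the per-nucleotide values are equal, which is why the comparison is made only after normalization.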
We also investigated whether K+ concentration influences DNA polymerase errors in FER1L4 (antiparallel) and TAGLN2 (hybrid), the two G4 motifs with the lowest thermostability. Pol η reactions were performed as described above by supplementing the 25 mM K+-phosphate buffer with 100 mM KCl for a total of 125 mM K+ (high [K+]). Note that the addition of 100 mM KCl to our full reaction buffer did not change the CD spectrum (Figure 1A). Within the FER1L4 motif, the high [K+] pol η error frequency (Supplemental Table S4) was not statistically different from the 25 mM K+-phosphate error frequency (Supplemental Table S1) (p=0.5, two-tailed unpaired t-test with Welch’s correction). The polymerase error frequency per nucleotide in the high [K+] reaction was similar within the TK and G4 sequences (Supplemental Table S4; Supplemental Fig. S10A), and we observed no increase in large-scale errors. Similarly, pol η error frequencies per nucleotide for TAGLN2 were similar in 25 mM K+ and high [K+] reaction conditions (Supplemental Table S1 and Table 3). Again, large-scale deletions and complex errors that span portions of the TK sequence and G4 occurred at similar frequencies (Supplemental Fig. S10B). Together, these data show that the increased K+ concentrations had a negligible effect on pol η fidelity within the G4 motifs examined. We conclude that polymerase error frequencies within these single G4 motifs are not higher than within the surrounding 97 bases of unique TK sequence.
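For readers unfamiliar with the unpaired t-test with Welch’s correction used above, the sketch below computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom for two groups with unequal variances. The replicate values and the function name are hypothetical and serve only to show the calculation; converting the statistic to a two-tailed p-value requires a Student's t distribution function from a statistics package, which is not included here.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two independent samples that may have unequal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group (sample variance / n)
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, dof

# Hypothetical replicate error frequencies (x 10^-4), for illustration only
low_k = [1.0, 2.0, 3.0]    # stand-in for 25 mM K+ reactions
high_k = [2.0, 4.0, 6.0]   # stand-in for 125 mM K+ reactions

t_stat, dof = welch_t(low_k, high_k)
print(f"t = {t_stat:.3f}, dof = {dof:.2f}")
```

Unlike the pooled-variance t-test, the degrees of freedom here are generally not an integer, because each group's variance is weighted separately.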
3.7. Predicted impact of DNA polymerase errors on G4 stability
Finally, we asked, what biological significance might polymerase errors have, specifically on G4 stability? To assess this, we used Quadron, an in silico method that incorporates experimental evidence into G4 motif prediction and generates a score for each sequence as a proxy for stability (44). In general, the higher the score, the more stable the G4. Using Quadron, each WT sequence (i.e., G4 + TK sequence) and each mutant sequence was assigned a score (Supplemental Table S5). We then compared the score of each mutant to the respective WT score and calculated the change in Quadron score (Δ Quadron) by subtracting the WT score from the mutant score. Polymerase errors generated during synthesis of the FER1L4 G4 sequence returned predicted stability scores similar to WT (Fig. 5A), suggesting that most errors made by the polymerase will not substantially affect G4 stability. Overall, the Δ Quadron scores for VEGF, VEGFmut, and TAGLN2 had a wider distribution compared to FER1L4, and this was statistically significant for the VEGF motif (p<0.01, Kruskal-Wallis Test with Dunn’s multiple comparisons; Fig. 5A). The majority of errors returned a Quadron score and were predicted to still form a G4. Of importance, some mutant sequences eliminated potential G4 structure, as defined by a Quadron score of zero. These errors included partial or complete deletion of the G4 or the loss of a single G-tract. All three polymerases were capable of making errors that abolished G4 potential (Fig. 5A).
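The Δ Quadron calculation described above can be sketched as follows. The scores, mutant labels, and function names are hypothetical and are not taken from Supplemental Table S5; the sketch only assumes, as stated above, that Δ Quadron is the mutant score minus the WT score and that a score of zero denotes loss of G4 potential. It does not run Quadron itself.

```python
def delta_quadron(wt_score, mutant_score):
    """Change in Quadron score: mutant score minus WT score."""
    return mutant_score - wt_score

def summarize_mutants(wt_score, mutant_scores):
    """Return per-mutant delta scores and the mutants with a score of zero,
    which are treated here as having lost G4-forming potential."""
    deltas = {
        name: delta_quadron(wt_score, score)
        for name, score in mutant_scores.items()
    }
    abolished = [name for name, score in mutant_scores.items() if score == 0]
    return deltas, abolished

# Hypothetical scores for illustration only
wt_score = 30.0
mutant_scores = {
    "1-base deletion in G-tract": 27.5,
    "1-base deletion in loop": 31.0,
    "complete G4 deletion": 0.0,
}

deltas, abolished = summarize_mutants(wt_score, mutant_scores)
for name, delta in deltas.items():
    print(f"{name}: delta Quadron = {delta:+.1f}")
print("G4 potential abolished:", abolished)
```

A negative Δ Quadron indicates a predicted decrease in stability relative to WT, and a positive value indicates a predicted increase.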
We also compared the Quadron scores by polymerase (Fig. 5B). Both pol η and pol κ created errors in the FER1L4 motif that showed similar distributions around the WT score, suggesting that neither polymerase is better at maintaining the formation and WT stability of this antiparallel, thermally unstable G4. However, when looking at the parallel, thermally stable G4, VEGFmut, pol κ maintained G4 formation and stability, while pol η created mutants that either maintained G4 formation or eliminated G4 formation. These data suggest that pol κ may be more suited than pol η for maintaining G4 stability at certain G4 motifs.
4. Discussion
This study focused on one contributor to the mutagenesis and evolution of G-quadruplex sequences, namely, DNA polymerase errors. We analyzed sequences that reflect the variation in sequence/topology that is inherent to G4 motifs in the human genome. Our results from a forward mutation assay present a unique view of G4s not previously elucidated: the impact of G4s on human DNA polymerase error signatures. We measured specific effects of G4s on polymerase errors within, immediately flanking and encompassing the VEGF G4 motifs (Fig. 3 and Fig. 4). An increased frequency of large-scale errors was induced especially by G4 motifs containing parallel strands (i.e., parallel and hybrid topologies) (Supplemental Table S3). The predicted consequences of polymerase errors are wide-ranging, from simple modification of G4 structure to loss of G4 formation (Fig. 5). Taken together, our study highlights the potential contribution of G4 sequence and/or topology to polymerase errors, with those motifs forming parallel strands and having higher stability producing unique polymerase error signatures. The variable impact of diverse G4 motifs on polymerase fidelity is expected to contribute to G4 mutagenesis in the human genome.
We observed statistically significant differences in the proportion of pol η errors at 3’ flanking regions of VEGF motifs (Fig. 4). These data are consistent with deletions in dog-1 animals arising primarily at the 3’ region flanking a G4 (56). Additionally, Rev1 knockout human cells displayed an increase in supF errors when a G4 motif was inserted 3’ to the target, especially when G4s were stabilized by pyridostatin (31). Therefore, both in vitro and in vivo evidence support the conclusion that G4s induce an increase in flanking sequence mutations. Genome regions containing G4 motifs also are associated with copy number variant breakpoints (27) and translocation breakpoints (28) in various cancers. We detected large-scale errors, sometimes spanning the length of the G4 and a portion of the TK sequence, primarily within VEGF and TAGLN2 motifs (Supplemental Table S3). All three polymerases created such errors, although the frequency and class (i.e., intra-G4, TK+G4) varied depending on the G4 motif and polymerase (Table 2). Previous in vivo studies point to a similar impact of G4s on genome stability. Large deletions were increased at G4 motifs in C. elegans deficient in the FANCJ ortholog dog-1 (32, 56) and dog-1 mdf-1 deficient animals (57). Our data are consistent with this trend and show that the presence of a G4 is sufficient to induce these types of polymerase errors, independent of the status of other replication and repair proteins.
To our knowledge, the impact of G4 topology on DNA mutagenesis is not known. Antiparallel G4s have lateral and diagonal loops, which are positioned differently about the structure, compared to either the propeller loops of parallel G4s or the combination of propeller and lateral loops of the hybrid structure (49). Mechanistically, such structural differences could distort the sequence immediately flanking the G4, especially the 3’ flank, and may impact how polymerases engage with different types of G4s. For instance, as the polymerase approaches the G4, the DNA substrate may be presented in a different conformation that compromises polymerase fidelity. More research is needed to determine how polymerases engage with G4 motifs and the immediate 3’ sequence in order to understand how G4s induce polymerase errors.
The genomic context of G4 motifs is an important consideration for mutagenesis. Non-B secondary structures other than G4s could arise by involving the surrounding sequence and may add structural heterogeneity to the G4 (55, 58, 59). We noted that the G4-containing constructs used in our in vitro assay all have the potential to form stable hairpins between the G4 and surrounding sequence (Supplemental Fig. S8). Because VEGF, FER1L4, and TAGLN2 are not in their native genomic context, we used mfold to detect potential secondary structures in each native sequence from the human genome (Dec. 2013 Human GRCh38/hg38). This analysis showed that the G4 sequences can form hairpin structures in their native genomic context with ΔG values similar to those predicted in the TK context (data not shown). Such hairpins might contribute to the polymerase insertion errors that we observed, which could be mapped as whole or partial duplications within our target sequences. Such insertions may arise through previously proposed mechanisms of misalignment and slippage due to polymerase stalling at non-B DNA secondary structures (55, 60). A common theme among these mutants was the insertion of the sequence GGCGGGGTCCC, where the underlined bases appear to be duplicated from the 3’ G-tract and loop, which pol δ4 would first encounter. One possible mechanism may be transient stalling at the formed G4, resulting in slippage and possibly hairpin formation in the nascent strand. This may occur more than once, as pol δ4 tries to synthesize through the G4.
We used the computational tool Quadron (44) as a proxy to examine the potential impact of polymerase errors on G4 stability. This analysis showed that most DNA polymerase errors generated Quadron scores similar to the wildtype sequence for each G4 motif (Fig. 5A; Supplemental Table S5). G4 motifs with Quadron scores of 19 or greater are considered to be “stable” and those with scores less than 19 to be “unstable” (44). With this definition in mind, the majority of polymerase errors made within the “stable” VEGF motifs are not predicted to greatly destabilize the G4, as the mutant scores remain at or above the threshold of 19. As expected, most indels and base substitutions that disrupted a G-tract or lengthened a loop sequence slightly decreased the Quadron score, whereas errors that lengthened a G-tract or shortened a loop slightly increased the score. Certain G-tract positions, as well as loop bases, are critical to maintaining inherent G4 stability, and even a single point mutation at a precise position has been shown to affect G4 structure and function (62). For instance, a single nucleotide variant (SNV) that mutated the second guanine of the first G-tract in the IRF8 promoter inhibited G4 formation and substantially decreased gene expression, compared to an SNV in a loop that had a less pronounced effect on both measures (63). Moreover, a point mutation may not always impact structure but can still impact function. In the VEGF promoter, an SNV in the fourth guanine of the middle G-tract did not completely inhibit the ability of a G4 to form; however, topological and expression changes were apparent (63). In yet another study, changes in topology due to a single base or multiple base substitutions have been reported for the telomeric G4 sequence (62), with a dependence on the position of the substituted base.
Despite these known negative biological effects of some G4 mutations, our study suggests that at least some G4 motifs can be efficiently synthesized so as to retain G4 biological function over time, a net result of polymerase errors that cause only minor changes to inherent stability.
The in silico analyses also showed that both pols η and κ create errors that are predicted to largely maintain the formation and stability of the FER1L4 motif, an antiparallel, thermally unstable G4 (Fig. 5B). In contrast, for the thermally stable, parallel VEGFmut motif, pol κ errors maintained formation and stability while a larger proportion of pol η errors eliminated the G4 potential of this motif. Also, pol κ created large-scale errors less frequently than pol η (Table S3). Though more motifs would need to be analyzed, these data suggest that synthesis by pol κ may be favorable for retaining G4s in the genome over time. The pol κ catalytic core binds parallel G4s and demonstrates enhanced activity when positioned upstream of the G4 in vitro (34). Limited ex vivo evidence exists to support specialized polymerase recruitment to G4s. U2OS cells depleted for either pol κ or pol η display decreased survival upon telomestatin-induced G4 stabilization (61). These studies are intriguing, but insufficient to establish a defined role for these polymerases. Polymerase partitioning based on G4 topology has not been explored, to our knowledge. However, the dCTP terminal transferase, Rev1, preferentially binds parallel G4s but not other topologies in vitro (31). Therefore, scenarios may exist in which specialized polymerases, such as pol κ, are recruited to different G4 motifs based on topology and/or stability.
Conclusions.
G4s are implicated in a variety of cellular processes, are evolutionarily conserved, and are associated with genome instability. However, G4s are inherently polymorphic both at the sequence and structural level, and much remains unknown about how the cell accurately and efficiently replicates these varied non-B DNA structures. Ours is the first comprehensive study of human DNA polymerase errors during G4 synthesis. Establishing the generality of our results will require the analysis of additional genomic G4 motifs and polymerases. This caveat aside, differential impacts of G4s on polymerase fidelity based on topology are a recurring theme in our study. We demonstrate that G4 sequences and/or their topology contribute to unique DNA polymerase error signatures, and our study serves as a foundation for future investigation into these parameters. The biological implications of this are two-fold: (1) There is an intrinsic interplay between G4 structure and DNA polymerase fidelity, and (2) DNA polymerase errors within and surrounding G4s may have downstream consequences on G4 function, regulation, and ligand binding. Thus, our study provides a unique and critical understanding of how polymerases navigate G4s during genome duplication.
Supplementary Material
Acknowledgments
We thank Alexandra Nusawardhana for technical assistance. This research was supported by the Penn State Jake Gittlen Cancer Research Foundation.
Funding sources:
NIH grants R01 CA237153 (to K.A.E.), R01 GM136684 (to K.D.M.), and ES014737 (to M.Y.L.), and the Czech Science Foundation (grant 21-00580S to E.K.).
Footnotes
Conflict of Interest Statement
The authors declare no conflict of interest with this research.
References
- 1. Lee M, Zhang S, Wang X, Chao H, Zhao H, Darzynkiewicz Z, et al. Two forms of human DNA polymerase δ: Who does what and why? DNA Repair. 2019;81:102656. doi: 10.1016/j.dnarep.2019.102656.
- 2. Fuchs J, Cheblal A, Gasser SM. Underappreciated Roles of DNA Polymerase δ in Replication Stress Survival. Trends in Genetics. 2021;37(5):476–87. doi: 10.1016/j.tig.2020.12.003.
- 3. Eckert KA, Tsao W-C. Detours to Replication: Functions of Specialized DNA Polymerases during Oncogene-induced Replication Stress. International Journal of Molecular Sciences. 2018;19(10):3255. doi: 10.3390/ijms19103255.
- 4. Ananda G, Hile S, Breski A, Wang Y, Kelkar Y, Makova K, et al. Microsatellite Interruptions Stabilize Primate Genomes and Exist as Population-Specific Single Nucleotide Polymorphisms within Individual Human Genomes. PLOS Genetics. 2014;10(7):e1004498. doi: 10.1371/journal.pgen.1004498.
- 5. Bergoglio V, Boyer A, Walsh E, Naim V, Legube G, Lee M, et al. DNA synthesis by Pol η promotes fragile site stability by preventing under-replicated DNA in mitosis. The Journal of Cell Biology. 2013;201(3):395–408. doi: 10.1083/jcb.201207066.
- 6. Twayana S, Bacolla A, Barreto-Galvez A, De-Paula R, Drosopoulos W, Kosiyatrakul S, et al. Translesion polymerase eta both facilitates DNA replication and promotes increased human genetic variation at common fragile sites. Proceedings of the National Academy of Sciences of the United States of America. 2021;118(48):e2106477118. doi: 10.1073/pnas.2106477118.
- 7. Walsh E, Wang X, Lee M, Eckert K. Mechanism of replicative DNA polymerase delta pausing and a potential role for DNA polymerase kappa in common fragile site replication. Journal of Molecular Biology. 2013;425(2):232–43. doi: 10.1016/j.jmb.2012.11.016.
- 8. Tonzi P, Yin Y, Lee C, Rothenberg E, Huang T. Translesion polymerase kappa-dependent DNA synthesis underlies replication fork recovery. eLife. 2018;7:e41426. doi: 10.7554/eLife.41426.
- 9. Sen D, Gilbert W. Formation of parallel four-stranded complexes by guanine-rich motifs in DNA and its implications for meiosis. Nature. 1988;334(6180):364–6. doi: 10.1038/334364a0.
- 10. Rhodes D, Lipps HJ. G-quadruplexes and their regulatory roles in biology. Nucleic Acids Research. 2015;43:8627–37. doi: 10.1093/nar/gkv862.
- 11. Battacharryya D, Mirihana Arachchilage G, Basu S. Metal Cations in G-Quadruplex Folding and Stability. Frontiers in Chemistry. 2016;4(38). doi: 10.3389/fchem.2016.00038.
- 12. Sen D, Gilbert W. A sodium-potassium switch in the formation of four-stranded G4-DNA. Nature. 1990;344(6265):410–4. doi: 10.1038/344410a0.
- 13. Huppert JL, Balasubramanian S. Prevalence of quadruplexes in the human genome. Nucleic Acids Research. 2005;33:2908–16. doi: 10.1093/nar/gki609.
- 14. Chambers VS, Marsico G, Boutell JM, Di Antonio M, Smith GP, Balasubramanian S. High-throughput sequencing of DNA G-quadruplex structures in the human genome. Nature Biotechnology. 2015;33(8):877–81. doi: 10.1038/nbt.3295.
- 15. Masai H, Tanaka T. G-quadruplex DNA and RNA: Their roles in regulation of DNA replication and other biological functions. Biochemical and biophysical research communications. 2020;531(1):25–38. doi: 10.1016/j.bbrc.2020.05.132.
- 16. Savva L, Georgiades SN. Recent Developments in Small-Molecule Ligands of Medicinal Relevance for Harnessing the Anticancer Potential of G-Quadruplexes. Molecules. 2021;26(4):841. doi: 10.3390/molecules26040841.
- 17. Xu H, Di Antonio M, McKinney S, Mathew V, Ho B, O’Neil NJ, et al. CX-5461 is a DNA G-quadruplex stabilizer with selective lethality in BRCA1/2 deficient tumours. Nature Communications. 2017;8(1):14432. doi: 10.1038/ncomms14432.
- 18. Zimmer J, Tacconi E, Folio C, Badie S, Porru M, Klare K, et al. Targeting BRCA1 and BRCA2 Deficiencies with G-Quadruplex-Interacting Compounds. Molecular cell. 2016;61(3):449–60. doi: 10.1016/j.molcel.2015.12.004.
- 19. Estep K, Butler T, Ding J, Brosh R. G4-Interacting DNA Helicases and Polymerases: Potential Therapeutic Targets. Current Medicinal Chemistry. 2019;26(16):2881–97. doi: 10.2174/0929867324666171116123345.
- 20. Tang F, Wang YG, Zi, Guo S, Wang Y. Polymerase η Recruits DHX9 Helicase to Promote Replication across Guanine Quadruplex Structures. Journal of the American Chemical Society. 2022. doi: 10.1021/jacs.2c05312.
- 21. Sato K, Martin-Pintado N, Post H, Altelaar M, Knipscheer P. Multistep mechanism of G-quadruplex resolution during DNA replication. Science Advances. 2021;7(39):eabf8653. doi: 10.1126/sciadv.abf8653.
- 22. Stein M, Eckert K. Impact of G-Quadruplexes and Chronic Inflammation on Genome Instability: Additive Effects during Carcinogenesis. Genes. 2021;12(11):1779. doi: 10.3390/genes12111779.
- 23. Bacolla A, Ye Z, Ahmed Z, Tainer JA. Cancer mutational burden is shaped by G4 DNA, replication stress and mitochondrial dysfunction. Progress in Biophysics and Molecular Biology. 2019;147:47–61. doi: 10.1016/j.pbiomolbio.2019.03.004.
- 24. Du X, Gertz EM, Wojtowicz D, Zhabinskaya D, Levens D, Benham CJ, et al. Potential non-B DNA regions in the human genome are associated with higher rates of nucleotide mutation and expression variation. Nucleic Acids Research. 2014;42:12367–79. doi: 10.1093/nar/gku921.
- 25. Georgakopoulos-Soares I, Morganella S, Jain N, Hemberg M, Nik-Zainal S. Noncanonical secondary structures arising from non-B DNA motifs are determinants of mutagenesis. Genome Research. 2018;28(9):1264–71. doi: 10.1101/gr.231688.117.
- 26. Guiblet W, Cremona M, Harris R, Chen D, Eckert K, Chiaromonte F, et al. Non-B DNA: a major contributor to small- and large-scale variation in nucleotide substitution frequencies across the genome. Nucleic Acids Research. 2021;49(3):1497–516. doi: 10.1093/nar/gkaa1269.
- 27. De S, Michor F. DNA secondary structures and epigenetic determinants of cancer genome evolution. Nature Structural and Molecular Biology. 2011;18:950–5. doi: 10.1038/nsmb.2089.
- 28. Bacolla A, Tainer J, Vasquez K, Cooper D. Translocation and deletion breakpoints in cancer genomes are associated with potential non-B DNA-forming sequences. Nucleic Acids Research. 2016;44(12):5673–88. doi: 10.1093/nar/gkw261.
- 29. Sarkies P, Reams C, Simpson L, Sale J. Epigenetic instability due to defective replication of structured DNA. Molecular cell. 2010;40(5):703–13. doi: 10.1016/j.molcel.2010.11.009.
- 30. Eddy S, Ketkar A, Zafar MK, Maddukuri L, Choi J-Y, Eoff RL. Human Rev1 polymerase disrupts G-quadruplex DNA. Nucleic Acids Research. 2014;42(5):3272–85. doi: 10.1093/nar/gkt1314.
- 31. Ketkar A, Smith L, Johnson C, Richey A, Berry M, Maddukuri L, et al. Human Rev1 relies on insert-2 to promote selective binding and accurate replication of stabilized G-quadruplex motifs. Nucleic Acids Research. 2021;49(4):2065–84. doi: 10.1093/nar/gkab041.
- 32. Koole W, van Schendel R, Karambelas AE, van Heteren JT, Okihara KL, Tijsterman M. A Polymerase Theta-dependent repair pathway suppresses extensive genomic instability at endogenous G4 DNA sites. Nature Communications. 2014;5:3216. doi: 10.1038/ncomms4216.
- 33. Eddy S, Maddukuri L, Ketkar A, Zafar M, Henninger E, Pursell Z, et al. Evidence for the kinetic partitioning of polymerase activity on G-quadruplex DNA. Biochemistry. 2015;54(20):3218–30. doi: 10.1021/acs.biochem.5b00060.
- 34. Eddy S, Tillman M, Maddukuri L, Ketkar A, Zafar MK, Eoff RL. Human Translesion Polymerase κ Exhibits Enhanced Activity and Reduced Fidelity Two Nucleotides from G-Quadruplex DNA. Biochemistry. 2016;55(37):5218–29. doi: 10.1021/acs.biochem.6b00374.
- 35. Edwards DN, Machwe A, Wang Z, Orren DK. Intramolecular Telomeric G-quadruplexes Dramatically Inhibit DNA Synthesis by Replicative and Translesion Polymerases, Revealing their Potential to Lead Genetic Change. PLoS One. 2014;9(1):e80664. doi: 10.1371/journal.pone.0080664.
- 36. Kejnovská I, Stadlbauer P, Trantírek L, Renčiuk D, Gajarský M, Krafčík D, et al. G-Quadruplex formation by DNA sequences deficient in guanines: Two tetrad parallel quadruplexes do not fold intramolecularly. Chemistry - A European Journal. 2021;27(47):12115–25. doi: 10.1002/chem.202100895.
- 37. Eckert K, Mowery A, Hile S. Misalignment-mediated DNA polymerase beta mutations: comparison of microsatellite and frame-shift error rates using a forward mutation assay. Biochemistry. 2002;41(33):10490–8. doi: 10.1021/bi025918c.
- 38. Hile S, Eckert K. DNA polymerase kappa produces interrupted mutations and displays polar pausing within mononucleotide microsatellite sequences. Nucleic Acids Research. 2008;36(2):688–96. doi: 10.1093/nar/gkm1089.
- 39. Eckert KA, Ingle CA, Drinkwater NR. N-Ethyl-N-nitrosourea induces A:T to C:G transversion mutations as well as transition mutations in SOS-induced Escherichia coli. Carcinogenesis. 1989;10(12):2261–7. doi: 10.1093/carcin/10.12.2261.
- 40. Zhou Y, Meng X, Zhang S, Lee E, Lee M. Characterization of human DNA polymerase delta and its subassemblies reconstituted by expression in the MultiBac system. PloS one. 2012;7(6):e39156. doi: 10.1371/journal.pone.0039156.
- 41. Barnes R, Hile S, Lee M, Eckert K. DNA polymerases eta and kappa exchange with the polymerase delta holoenzyme to complete common fragile site synthesis. DNA Repair. 2017;57:1–11. doi: 10.1016/j.dnarep.2017.05.006.
- 42. Hingorani MM, Coman MM. On the Specificity of Interaction between the Saccharomyces cerevisiae Clamp Loader Replication Factor C and Primed DNA Templates during DNA Replication *. Journal of Biological Chemistry. 2002;277(49):47213–24. doi: 10.1074/jbc.M206764200.
- 43. Hile SE, Wang X, Lee MYWT, Eckert KA. Beyond translesion synthesis: polymerase κ fidelity as a potential determinant of microsatellite stability. Nucleic Acids Research. 2012;40(4):1636–47. doi: 10.1093/nar/gkr889.
- 44. Sahakyan AB, Chambers VS, Marsico G, Santner T, Di Antonio M, Balasubramanian S. Machine learning model for sequence-driven DNA G-quadruplex formation. Scientific Reports. 2017;7(1):14535. doi: 10.1038/s41598-017-14017-4.
- 45. Soper DS. Fisher's Exact Test Calculator [Software] 2022. [cited 2022 January]. Available from: https://www.danielsoper.com/statcalc.
- 46. Gray RD, Trent JO, Arumugam S, Chaires JB. Folding Landscape of a Parallel G-Quadruplex. The Journal of Physical Chemistry Letters. 2019;10:1146–51. doi: 10.1021/acs.jpclett.9b00227.
- 47. Burge S, Parkinson GN, Hazel P, Todd AK, Neidle S. Quadruplex DNA: sequence, topology and structure. Nucleic Acids Research. 2006;34(19):5402–15. doi: 10.1093/nar/gkl655.
- 48. Kwok CK, Merrick CJ. G-Quadruplexes: Prediction, Characterization, and Biological Application. Cell Press Reviews. 2017;35:997–1013. doi: 10.1016/j.tibtech.2017.06.012.
- 49. Jana J, Weisz K. Thermodynamic Stability of G-Quadruplexes: Impact of Sequence and Environment. Chembiochem. 2021;22:2848–56. doi: 10.1002/cbic.202100127.
- 50. Sun D, Guo K, Shin Y-J. Evidence of the formation of G-quadruplex structures in the promoter region of the human vascular endothelial growth factor gene. Nucleic Acids Research. 2011;39(4):1256–65. doi: 10.1093/nar/gkq926.
- 51. Abdulovic A, Hile S, Kunkel T, Eckert K. The in vitro fidelity of yeast DNA polymerase δ and polymerase ε holoenzymes during dinucleotide microsatellite DNA synthesis. DNA Repair. 2011;10(5):497–505. doi: 10.1016/j.dnarep.2011.02.003.
- 52. Kelkar Y, Strubczewski N, Hile S, Chiaromonte F, Eckert K, Makova K. What is a microsatellite: a computational and experimental definition based upon repeat mutational behavior at A/T and GT/AC repeats. Genome Biology and Evolution. 2010;2:620–35. doi: 10.1093/gbe/evq046.
- 53. Baptiste B, Jacob K, Eckert K. Genetic evidence that both dNTP-stabilized and strand slippage mechanisms may dictate DNA polymerase errors within mononucleotide microsatellites. DNA Repair. 2015;29:91–100. doi: 10.1016/j.dnarep.2015.02.016.
- 54. Zuker M. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Research. 2003;31(13):3406–15. doi: 10.1093/nar/gkg595.
- 55. Lovett S. Encoded errors: mutations and rearrangements mediated by misalignment at repetitive DNA sequences. Molecular microbiology. 2004;52(5):1243–53. doi: 10.1111/j.1365-2958.2004.04076.x.
- 56. Lemmens B, van Schendel R, Tijsterman M. Mutagenic consequences of a single G-quadruplex demonstrate mitotic inheritance of DNA replication fork barriers. Nature Communications. 2015;6:8909. doi: 10.1038/ncomms9909.
- 57. Tarailo-Graovac M, Wong T, Qin Z, Flibotte S, Taylor J, Moerman D, et al. Spectrum of variations in dog-1/FANCJ and mdf-1/MAD1 defective Caenorhabditis elegans strains after long-term propagation. BMC Genomics. 2015;16(1):210. doi: 10.1186/s12864-015-1402-y.
- 58. Monsen R, DeLeeuw L, Dean W, Gray R, Chakravarthy S, Hopkins J, et al. Long promoter sequences form higher-order G-quadruplexes: an integrative structural biology study of c-Myc, k-Ras and c-Kit promoter sequences. Nucleic Acids Research. 2022;50(7):4127–47. doi: 10.1093/nar/gkac182.
- 59. Arora A, Nair D, Maiti S. Effect of flanking bases on quadruplex stability and Watson-Crick duplex competition. The FEBS journal. 2009;276(13):3628–40. doi: 10.1111/j.1742-4658.2009.07082.x.
- 60. Murat P, Guilbaud G, Sale J. DNA polymerase stalling at structured DNA constrains the expansion of short tandem repeats. Genome biology. 2020;21(1):209. doi: 10.1186/s13059-020-02124-x.
- 61. Betous R, Rey L, Wang G, Pillaire M-J, Puget N, Cazaux C, et al. Role of TLS DNA Polymerases eta and kappa in Processing Naturally Occurring Structured DNA in Human Cells. Molecular Carcinogenesis. 2009;48(4):369–78. doi: 10.1002/mc.20509.
- 62. Sattin G, Artese A, Nadai M, Costa G, Parrotta L, Alcaro S, et al. Conformation and stability of intramolecular telomeric G-quadruplexes: sequence effects in the loops. PloS one. 2013;8(12):e84113. doi: 10.1371/journal.pone.0084113.
- 63. Gong J, Wen C, Tang M, Duan R, Chen J, Zhang J, et al. G-quadruplex structural variations in human genome associated with single-nucleotide variations and their impact on gene activity. Proceedings of the National Academy of Sciences of the United States of America. 2021;118(21):e2013230118. doi: 10.1073/pnas.2013230118.