Abstract
Background: Replication impediments can produce helicase-polymerase uncoupling, allowing lagging strand synthesis to continue for as much as 6 kb from the site of the impediment. Materials and Methods: We developed a cloning procedure designed to recover fragments from the lagging strand near the helicase halt site. Results: A total of 62% of clones from a p53-deficient tumor cell line (PC3) and 33% of the clones from a primary cell line (HPS-19I) were within 5 kb of a G-quadruplex forming sequence. Analyses of a RACK7 gene sequence that was cloned multiple times from the PC3 line revealed multiple deletions in a region about 1 kb from the cloned region that was present in a non-B conformation. Sequences from the region formed G-quadruplex and i-motif structures under physiological conditions. Conclusion: Defects in components of non-B structure suppression systems (e.g. p53 helicase targeting) promote replication-linked damage selectively targeted to sequences prone to G-quadruplex and i-motif formation.
Keywords: CMG helicase uncoupling, replication stress, G-quadruplex, i-motif
Non-B DNA structure formation must be suppressed during replication in order for replication to proceed properly. Many such structures occur at DNA sequences that exhibit GC-skew, where cytidine residues are largely confined to one strand and guanosine residues are largely confined to the other. In addition to R-Loops (1), these sequences are capable of forming G-quadruplexes (2,3), triple helices (4) and i-motifs (5). At physiological pH, certain sequences are capable of forming several different structures (6,7); however, the most commonly seen structures in these regions are the R-Loop, the G-quadruplex and, less frequently, the triple helix (7) and the i-motif (8). DNA damage linked to R-loop formation is generally associated with transcription blocks (9,10), but can also be associated with replication. There is also considerable evidence that G-quadruplex formation can cause both epigenetic (11,12) and genetic damage associated with DNA replication when G-quadruplex suppression systems are compromised (13-15). Here it is important to point out that G4-seq (60) results confirm the earlier report (16) that the frequency of G-quadruplex forming sequences increases with biological complexity, suggesting that these sequences have been under positive selective pressure during the evolution of higher organisms. To facilitate this increased frequency, the systems that promote replication through G-quadruplex structures or unwind these structures during replication and transcription of DNA appear to have evolved increased sophistication and redundancy in higher organisms. For example, the Y-family polymerases REV1, Pol κ and Pol η are known to process G-quadruplex motifs when replication is slowed by them (17). Further, several helicases, including Pif1 (15) and the superfamily 2 helicases WRN (18), BLM (19) and FANCJ (20), have all been shown to unwind G-quadruplex DNA structures in vitro.
Even so, WRN (21), BLM (22), Pif1 (23), and FANCJ (24) are also capable of unwinding RNA:DNA hybrids in vitro.
In one of the most carefully studied instances, loss of function in the FANCJ homolog dog-1 in C. elegans produces selective deletion mutagenesis at sites where persistent G-quadruplex and i-motif structures can occur (14). In this case, deletion was apparently produced via a polymerase theta end-joining mechanism (25,26). These data and those of Paeschke et al. (15) suggest that when non-B suppression is compromised, the resulting structures can produce strong impediments to leading strand progress during replication. Leading strand impediments are known to cause the Cdc45-Mcm2-7-GINS (CMG) helicase complex and lagging strand synthesis to become uncoupled from leading strand synthesis (27), permitting lagging strand synthesis to continue for an extended distance beyond a stalled replication fork (28). We hypothesized that the structure formed by non-B structure-induced uncoupling could be exploited to identify sites of recurring non-B structure. As can be seen in Figure 1, short double-stranded DNA fragments originating from the lagging strand at or near the CMG halt site should be produced after shear fragmentation or mild chemical fragmentation of genomic DNA.
Figure 1. Fork stalling at a replication impediment permits cloning of adjacent DNA. (A) DNA damage, a non-B DNA structure or an R-Loop can halt the leading strand. However, the CMG helicase complex and lagging strand synthesis can continue for a considerable distance (up to 6 kb) downstream from the site of the halted leading strand (27, 28). (B) If the unrepaired lesion persists for a time, shear and/or chemically induced cleavage (red arrows) within the lesion can lead to a variety of clonable fragments from the intermediate (C). T4 end repair (D) permits LMPCR amplification and cloning of duplex fragments that identify the region downstream from the leading strand halt site. The region with non-B structure potential is shown in red.
Materials and Methods
Cell culture and DNA isolation. Human prostate cancer cells (PC3) were cultured as previously described (30). Human prostate stromal cells (HPS-19I) were a gift of Prof. David R. Rowley, Baylor University College of Medicine, Houston, TX, USA. They were cultured as described (31). DNA was isolated from each cell line using the QIAamp® Blood Mini kit (Qiagen Sciences, Frederick, MD, USA) according to the instructions supplied by the manufacturer.
Karyotyping. Cells were harvested, fixed with Carnoy fixative, dropped onto glass slides and aged overnight at 60˚C. Following standard GTG banding, 50 metaphases were captured with GenASIs Bandview software (Applied Spectral Imaging, Carlsbad, CA, USA). A 24-color SKY spectral karyotypic analysis was performed using a standard protocol. Briefly, after RNAse and pepsin treatment, DNA in metaphases was denatured at 70˚C for 2 min, followed by three ice-cold ethanol washes with 70%, 80%, and 90% ethanol. Denatured SKY™ 24-color probes were hybridized to the slide preparation for 48 h at 37˚C. After a wash with formamide-containing SSC and incubations with 1xSSC and 4xSSC, images were captured under Vectashield DAPI using the HiSKY system (Applied Spectral Imaging, Carlsbad, CA).
Preparation of random libraries with clones from uncoupled replication forks. Ligation-mediated PCR was used to amplify DNA fragments spontaneously generated during DNA isolation, or after bisulfite treatment of genomic DNA at 37˚C (29) to promote chemical breakage of the DNA (32) at single stranded regions. Under these conditions, bisulfite cannot attack duplex DNA so the final treatment with sodium hydroxide was omitted to preserve duplex structure.
Two DNA preparations from each cell line were treated with CIP (Calf Intestinal Phosphatase, New England Biolabs, USA) in a 30 μl reaction containing 10 U of CIP in 50 mM potassium acetate, 20 mM Tris-acetate, 10 mM magnesium acetate and 100 μg/ml BSA, pH 7.9, for 1 h at 37˚C to prevent the formation of ligation concatemers. However, concatemers were not observed when CIP treatment was omitted, indicating that CIP treatment was unnecessary. Two oligodeoxynucleotides were employed to form the duplex linker: linker_1 (5’-AGAAGCTTGAATTCGAGCAGTCAG-3’) was annealed to linker_2 (5’-CTGCTCGAATTCAAGCTTCT-3’). A total of 2.5 μl of 100 μM of each oligodeoxynucleotide was brought to a final volume of 45 μl and incubated for 2 min at 94˚C, 5 min at 70˚C, and 5 min at 50˚C. The duplex thus formed was allowed to cool to room temperature. The duplex was then diluted to a final volume of 250 μl to give a 1 μM duplex linker, which was ligated to 15 μl of isolated DNA that had been treated with T4 polymerase to repair ends for blunt-end ligation. The ligation used 5 units of T4 ligase (New England Biolabs), 1.5 μl 10x ligase buffer (New England Biolabs) and 1 μl of 1 μM linker, in a final volume of 15 μl. The reaction was then incubated overnight at 4˚C. The ligated DNA was then treated with the Qiagen Reaction cleanup kit, and the DNA was eluted in 15 μl of pure water. The eluted DNA was then used for PCR with 0.25 U of Hotstar Taq (Qiagen), 10 μl of 10x Taq Buffer (Qiagen), 8 μl 25 mM MgCl2, 1.6 μl of 10 mM dNTPs (Roche) and 1 μl 100 μM linker_2 in a final volume of 100 μl. The PCR conditions were 55˚C 2 min, 72˚C 5 min, 94˚C 10 min, 24 cycles of 94˚C 1 min, 55˚C 1 min, 72˚C 1 min, and then 72˚C for 5 min, with a 4˚C hold. After amplification, 2 μl of the PCR reaction was cloned using the TOPO® TA cloning® Kit (Invitrogen, Waltham, MA, USA) following the manufacturer’s instructions. The resulting colonies were grown in liquid culture, and plasmids were isolated and sequenced as previously described (11).
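The linker design and dilution described above can be checked with a short sketch. It assumes only the two linker sequences and the volumes quoted in the text; the helper names are hypothetical. As written, linker_2 is the reverse complement of the first 20 nucleotides of linker_1, which would leave one blunt end and a 4-nt 3’ overhang on linker_1.

```python
# Sketch: sanity-check the duplex linker and its dilution (helper names are illustrative).
LINKER_1 = "AGAAGCTTGAATTCGAGCAGTCAG"
LINKER_2 = "CTGCTCGAATTCAAGCTTCT"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def diluted_concentration_um(stock_um: float, stock_ul: float, final_ul: float) -> float:
    """C1*V1 = C2*V2: concentration after bringing stock_ul of stock to final_ul."""
    return stock_um * stock_ul / final_ul


# Region of linker_1 that pairs with linker_2, and the unpaired 3' tail.
duplex_region = reverse_complement(LINKER_2)
overhang = LINKER_1[len(duplex_region):]

print(LINKER_1.startswith(duplex_region))        # linker_2 pairs with the 5' end of linker_1
print(overhang)                                  # unpaired 3' tail of linker_1
print(diluted_concentration_um(100, 2.5, 250))   # final duplex concentration in uM
```

The dilution line reproduces the 1 μM duplex concentration stated in the protocol (2.5 μl of each 100 μM oligodeoxynucleotide in 250 μl).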
Cleavage frequency. A scanned profile of the molecular weight distribution of purified PC3 DNA separated by capillary electrophoresis was obtained by using a DNA 7500 microfluidic chip (Agilent Technologies). The number average molecular weight of the DNA preparation was determined from the scan as previously described (30).
Direct PCR from the RACK7 region (bisulfite-treated PC3 DNA). Each of the colored regions in Figure 2 was amplified from PC3 cell line DNA after non-denaturing bisulfite treatment (29) to detect possible open DNA structures. In each case, 1 μg of native DNA from the cell line was treated with bisulfite using the EZ DNA Methylation Kit (Zymo Research), as previously described (32), except that the initial sodium hydroxide denaturation step was omitted and the temperature of the 16 h incubation period was reduced from 55˚C to 37˚C so as to preserve the native secondary structure of the isolated DNA. Initial exposure of the DNA was to the bisulfite reagent adjusted to pH 5.3, as described in (11). The reaction was cleaned up using the Zymo spin columns. Apart from these modifications, the manufacturer’s instructions were followed exactly. The bisulfite-treated DNA was then eluted in 15 μl of pure water, and all of it was used for PCR amplification. Pretreatment of DNA with RNAse H was carried out in some cases as described in (33).
Figure 2. RACK7 Sequence in the vicinity of the multiply cloned region. The sequence for RACK7 from Homo sapiens chromosome 20 depicted here is that given in the human genome sequence (GRCh38.p12 Primary Assembly). Native DNA regions examined for cytosine conversion by bisulfite under non-denaturing conditions are highlighted in color. Turquoise: 5’ region tested for bisulfite conversion under non-denaturing conditions. Red: Region of intense bisulfite conversion. Yellow: Central region tested for bisulfite conversion under non-denaturing conditions. This region also contains the seven overlapping independently cloned sequences cloned in the screen. The arrows highlighted in gray indicate the common 3’ breakpoint present in six of the seven clones. Breakpoints for the seventh clone are highlighted in red. The remaining six arrows indicate 5’ breakpoints for the clones with a common 3’ end. Bright Green: 3’ region tested for bisulfite conversion.
PCR primers for the region 3’ to the cloned region (Bright Green in Figure 2) were as follows: Forward: 5’-CGTTCCATTGGAACAGCACCCTGACTATG-3’; Reverse: 5’-ATTTATGGTTTGGGAGTTATTGGGGAAGAC-3’. PCR primers for the cloned region (Yellow in Figure 2) were: Forward: 5’-ACTCCTGAGAGTACTGTACAG-3’; Reverse: 5’-GTCTACCTTCTAATTCTGCCT-3’. PCR primers for the region 5’ to the cloned region (Turquoise in Figure 2) were: Forward: 5’-GGTTGGCCAGTTGTGTACCTCACAGTGG-3’; Reverse: 5’-GCTCCAGCCTGGGCAACAAAGCAACAGT-3’. Forward and reverse primers were used to form a solution containing 9 μM of each primer. The PCR amplification master mix contained 0.25 U of Hotstar Taq (Qiagen), 2.5 μl of 10x Taq Buffer (Qiagen), 0.95 μl 5X Q solution (Qiagen), 2 μl 25 mM MgCl2, 0.8 μl of 10 mM dNTPs (Roche) and 2.5 μl 9 μM primers in a final volume of 9 μl. Each reaction contained 9 μl master mix and 200 ng of target DNA brought to a final volume of 25 μl with Molecular Biology Water (Sigma). The PCR conditions were 95˚C 10 min, 35 cycles of 95˚C 1 min, 64.5˚C 1 min, 72˚C 1 min, and then 72˚C for 5 min, with a 4˚C hold.
Direct sequencing of the RACK7 region (native DNA). DNA from the 5’ region (Turquoise in Figure 2) of RACK7 was amplified without bisulfite treatment from PC3 DNA specimens as follows. PCR primers for this region were: Forward: 5’-GGTTGGCCAGTTGTGTACCTCACAGTGG-3’; Reverse: 5’-GCTCCAGCCTGGGCAACAAAGCAACAGT-3’. PCR reactions contained 12.5 μl 2X KAPA2G Fast Multiplex mix (Kapabiosystems, Wilmington, MA, USA), 0.55 μl of each 9 μM primer and 200 ng target DNA, made up to a final volume of 25 μl with Molecular Biology Water (Sigma). PCR cycling conditions were 95˚C for 3 min, followed by 35 cycles of 95˚C for 1 min, 60˚C for 30 sec, 72˚C for 38 sec, followed by a final extension at 72˚C for 26 sec.
Cloning and sequencing of RACK7 region PCR products. DNA amplicons from 20 μl of each RACK7 region PCR reaction were separated on 2% agarose gels (40 mM Tris-Acetate, 10 mM EDTA pH 7.8), stained with 4 μg/ml ethidium bromide and destained in water. Excised fragments of the expected length (710 bp for the turquoise 5’ region in Figure 2, 640 bp for the yellow cloned region in Figure 2, and 430 bp for the bright green 3’ region in Figure 2) were purified using the Qiagen Gel Extraction Kit as described by the manufacturer. Lastly, 2 μl of the purified amplicon was cloned using the TOPO® TA cloning® Kit (Invitrogen, Waltham, MA, USA) following the manufacturer’s instructions. The resulting colonies were grown in liquid culture, and plasmids were isolated and sequenced as previously described (11).
DNA sequence alignment software. Cloned DNA sequences in FASTA format were aligned against the reference sequence for RACK7 from Homo sapiens chromosome 20, GRCh38.p12 Primary Assembly, using two software packages, MAFFT (34) and T-Coffee (35), with similar results showing variable-length deletions and TCC→TTC or TAC mutations. Sequence data sets depicted in Figure 3 were constructed from MView (36) renderings of the sequence alignments obtained with MAFFT.
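Once a clone has been aligned to the reference, C→T conversions on the C-rich strand can be tallied column by column. The following is a minimal sketch of that bookkeeping, not the procedure used by MAFFT, T-Coffee or MView; the function name and the toy sequences are hypothetical and are not taken from the data set.

```python
# Sketch: count C->T conversions between a reference and one aligned clone.
# Both strings must be rows of the same alignment (equal length, '-' for gaps).

def count_c_to_t(reference: str, read: str) -> tuple[int, int]:
    """Return (converted, informative) for reference cytosines.

    converted   = reference C aligned to T in the read
    informative = reference C aligned to either C or T (gaps and other bases ignored)
    """
    if len(reference) != len(read):
        raise ValueError("aligned sequences must have equal length")
    converted = 0
    informative = 0
    for ref_base, read_base in zip(reference.upper(), read.upper()):
        if ref_base != "C":
            continue
        if read_base == "T":
            converted += 1
            informative += 1
        elif read_base == "C":
            informative += 1
    return converted, informative


# Toy example (hypothetical sequences): two conversions, one deleted cytosine.
ref_row = "TCCCTCCCTTCC"
clone_row = "TTCT--CCTTCC"
print(count_c_to_t(ref_row, clone_row))
```

Cytosines that fall inside a deletion are excluded from the denominator, so a clone carrying a long gap is not scored as unconverted at those positions.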
Figure 3. RACK7 sequences obtained from PC3 DNA in the region 5’ to the independently cloned sequences. Each panel is colored to highlight tetrameric repeat elements in the concatenated repeat defining the region (Gray: TTCC; Bright Green: CCTG; Yellow: TCCC). Panel A: Native sequences obtained after bisulfite treatment under non-denaturing conditions, colored to show bisulfite-converted C→T sites in Turquoise. Note that the region of intense bisulfite conversion in this panel lies within a region of GC-skew defined by the concatemer in the genomic sequence (*) with the general sequence (TCCC)5-(CCTG)8-CC-(TCCC)9-(TTCC)9. The highest degree of C→T conversion occurs in the (CCTG)8-CC-(TCCC)9 region. Panel B: Native sequences obtained after RNAse H treatment followed by bisulfite treatment under non-denaturing conditions, colored to show bisulfite-converted C→T sites in Turquoise. Panel C: Native sequences obtained without RNAse H or bisulfite treatment. Focused TCC→TTC mutations in the native sequences are colored in Turquoise. (*) indicates the genomic sequence at the top of each panel. Each of the isolated sequences contained a deletion (shown as a gap in the panels) relative to the genomic sequence.
Results
A total of 96 clones were sequenced and mapped to chromosomal locations in the human genome: 43 from HPS-19I and 53 from PC3 (Table I and Table II). Motifs with non-B structure potential were observed within 5 kb of the cloned sequence in 60% of the clones from HPS-19I and 74% of the clones from PC3. However, clones within 5 kb of a sequence conforming to the G-quadruplex folding rule (2) were twice as frequent in PC3 as in HPS-19I (Table III). This was consistent with the karyotypes of the two cell lines: HPS-19I possessed a normal karyotype (Figure 4), suggesting a normal phenotype, while the hyperdiploid and heavily rearranged karyotype of the p53-deficient PC3 cell line (Figures 5 and 6) indicated that ongoing chromosomal damage might be occurring in this tumor cell line. Moreover, clones containing a chromosomal rearrangement (a fusion between Chromosome 20 and Chromosome 7) and clones originating from the same chromosomal location were observed only in the PC3 cell line (Table II).
Table I. HPS-19I cell line.
*B+: DNA treated with bisulfite under native conditions prior to isolation. B-: DNA not treated with bisulfite under native conditions prior to isolation. C+: DNA treated with CIP under native conditions prior to isolation. C-: DNA not treated with CIP under native conditions prior to isolation. **GQ: DNA sequence conforming to the G-Quadruplex folding rule (2).
Table II. PC3 cell line.
*B+: DNA treated with bisulfite under native conditions prior to isolation. B-: DNA not treated with bisulfite under native conditions prior to isolation. C+: DNA treated with CIP under native conditions prior to isolation. C-: DNA not treated with CIP under native conditions prior to isolation. **GQ: DNA sequence conforming to the G-Quadruplex folding rule (2).
Table III. Summary of cloning results.
*See Tables I and II for clone details. §The difference between the values for the two cell lines is statistically significant in a two-sided t-test at the p<0.01 level.
Figure 4. G-Banding karyotype of the HPS-19I cell line. The karyotype of the low passage human prostate stromal cell line, HPS-19I, was normal, with a mean chromosome number of 46 [2n]. In eighteen mitotic cells analyzed, including one tetraploid cell, there was no significant numerical chromosomal abnormality and no structural aberration of any chromosome detectable within the limits of cytogenetic resolution.
In the PC3 cell line, two clones mapped to the same chromosomal location on chromosome 7, two clones mapped to the same location on chromosome 11, and seven clones mapped to the same chromosomal location in an intron of the RACK7 gene on chromosome 20 (Figure 2). Four of the seven clones were obtained from untreated DNA. Three of the seven clones were obtained from DNA that had been treated with sodium bisulfite to facilitate breakage. We estimated the probability that clones could originate from the same location by random shear as follows. Using the method described in (30), we determined the spontaneous cleavage frequency for the PC3 DNA preparation to be 1/8,800 bp from the scanned molecular weight profile of the purified PC3 DNA used in the preparation of the PC3 DNA library (not shown).
For a genome of length Λ, it has been shown (38,39) that the weight fraction (FW) between DNA fragment lengths L1 and L2 that is produced by random breakage at a frequency f is given by:

Since the procedure used for pCR™2.1-TOPO® cloning is designed to clone fragments of less than 1,000 bp, the inserts we observed were all in this range. The weight fraction predicted by random breakage in this range for a cleavage frequency f=1/8,800 is about 0.006 and could represent almost 1% of the genome. Thus, the twenty independent isolates obtained without bisulfite or CIP treatment could be produced by random breakage. At the measured cleavage frequency, a hyperdiploid PC3 genome (Figure 5) comprising about 7.5×10⁹ bp of DNA would have at least 2×10¹¹ fragments/μg DNA. Of these, about 1.7×10⁹ bp representing a random sampling of the genome would be in the size range clonable by pCR™2.1-TOPO®. Consequently, each of the single isolates could be the result of random cleavage unless biologically frangible sites associated with replication stress (e.g. non-B sequence potential) are more frequent.
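The quoted weight fractions can be approximated numerically. The sketch below does not reproduce the expression from (38,39); it assumes the continuous random-scission weight distribution w(L) = f²·L·exp(−fL), which neglects the finite genome length Λ (reasonable when Λ is much larger than 1/f). The function name is hypothetical.

```python
# Sketch: weight fraction of fragments between two lengths under random breakage.
# Assumption: continuous random-scission model, genome length >> 1/f.
import math


def weight_fraction(f: float, l1: float, l2: float) -> float:
    """Integral of f^2 * L * exp(-f*L) from l1 to l2."""

    def cumulative(length: float) -> float:
        x = f * length
        return 1.0 - (1.0 + x) * math.exp(-x)

    return cumulative(l2) - cumulative(l1)


f = 1 / 8800  # measured cleavage frequency (breaks per bp)
print(round(weight_fraction(f, 0, 1000), 4))  # clonable range, 0 to 1,000 bp
print(round(weight_fraction(f, 0, 656), 4))   # 0 to 656 bp
```

Under this assumption the two values come out near 0.006 and 0.0026, in line with the weight fractions used in the text for the 0 to 1,000 bp and 0 to 656 bp ranges.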
Figure 5. G-Banding karyotype of the PC3 Cell line. The G-banding karyotype of the high passage established prostate cancer cell line, PC3, was abnormal. All mitotic cells analyzed by conventional G-banding were near triploid and clonally abnormal.
On the other hand, sequences obtained more than once cannot be the result of random cleavage. This is most clearly seen for the four sequences obtained from untreated DNA that appear to have been produced by random shear from the RACK7 gene. In this case, four independently isolated overlapping fragments originated from the same 656 bp region of RACK7. The weight fraction of the PC3 genome that falls between 0 bp and 656 bp for randomly cleaved DNA is expected to be 0.0026. Since the cloned sequences all fall within a 656 bp region of RACK7, if we assume that there are only 6 copies of the region present in the PC3 hyperdiploid genome based on Giemsa (Figure 5) and SKY spectral analysis (Figure 6), then the random probability of cloning fragments ranging in size from 0 bp to 656 bp from that 656 bp region of the RACK7 gene sequence with 100% cloning efficiency is given by:
Figure 6. PC3 SKY spectral karyotype. The analysis of the complex karyotype of the PC3 cell line was corroborated with SKY spectral karyotyping. The complete analysis of the hyperdiploid stemline showed that it contained recurrent structural rearrangements involving chromosomes 1, 2, 3, 4, 5, 6, 8, 10, 11, 12, 15, 16, 17, 18, and 19. Numerical gains of chromosomes X, 1, 2, 3, 4, 7, 11, 14, 16, 17, 18, 19, 20, and 21 were present. Of note, the only chromosome homologues that did not have structural rearrangements were chromosomes 7, 9, 13, 14, 20, 21, 22, and X. The Y chromosome was apparently lost in this cell line. The mean chromosome number was 62, with the following ISCN (International System for Human Cytogenetic Nomenclature) description, based on both conventional cytogenetic analysis and SKY spectral karyotyping: 62,<2n>, X,+X,-Y,+der(1)t(8;12;1;10), der(2)t(8;2)x2, +der(2)t(17;22;2), der(3)t(3;6;3;6;3)x2, +der(3;10)(p10;p10)x2, der(4;12)(q10;q10), der(4;6)(q10;p10), +der(4)t(4;10)(p10;q10)x2, +i(5)(p10),+7, +7, -8,+11,+der(11)t(5;10;11), der(12)t(12;8)(q24;q13)x2, +14,der(15)t(8;15)(q22;p11.2), der(15)t(5,15)(q13;p11.2), +16,der(16)t(10;1;16), +der(17)t(3;17;15;17),+18,+i(18)(p10), +der(19)t(5;19;5),+20,+20,+21,+21[5].
$$p = \frac{6 \times 656\ \text{bp}}{7.504 \times 10^{9}\ \text{bp}} \times \frac{0.0026}{0.006} = 2.2 \times 10^{-7}$$
The probability of cloning it m times in n trials is given by the binomial distribution:
$$P(m;n) = \frac{n!}{m!\,(n-m)!}\; p^{m}\,(1-p)^{n-m}$$
For 4 clones in 20 trials, P=1.2×10⁻²³.
The observed cloning frequency of 4/20=0.25 clearly rules this out.
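The arithmetic behind this estimate can be retraced in a few lines. The sketch assumes that the probability of m clones in n trials is the binomial probability, which is consistent with the quoted values; the function names are hypothetical and the inputs are the numbers given in the text.

```python
# Sketch: probability of repeatedly cloning one 656 bp region by random breakage.
# Assumption: independent trials, so the m-of-n probability is binomial.
from math import comb


def site_probability(copies: int, region_bp: float, genome_bp: float,
                     fw_region: float, fw_clonable: float) -> float:
    """Chance that a single clonable fragment derives from the given region."""
    return copies * region_bp / genome_bp * fw_region / fw_clonable


def binomial_probability(m: int, n: int, p: float) -> float:
    """Probability of exactly m successes in n independent trials."""
    return comb(n, m) * p**m * (1 - p) ** (n - m)


p_site = site_probability(6, 656, 7.504e9, 0.0026, 0.006)
p_four_of_twenty = binomial_probability(4, 20, p_site)

print(f"{p_site:.1e}")            # per-clone probability, on the order of 2e-07
print(f"{p_four_of_twenty:.1e}")  # on the order of 1e-23
print(4 / 20)                     # observed frequency for comparison
```

Small differences in the last digit relative to the quoted figures depend on whether the rounded or unrounded per-clone probability is carried forward; the order of magnitude is unchanged.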
Given that the probability that random breakage during DNA isolation would produce multiple clones from the same site is vanishingly small (see Materials and Methods), it is likely that this region represents a site of recurring DNA damage that renders it particularly frangible, and thus subject to repeated isolation by the cloning scheme used here. Further, six of the seven clones had a common breakpoint at the 3’ end and a randomly placed breakpoint at the 5’ end (Figure 2), suggesting that a recurring biologically induced gap occurs at the 3’ site and that the clonable duplex fragments were produced by hydrodynamic shear during DNA isolation or by bisulfite-induced breakage at the gap in native DNA prior to DNA isolation.
Regions that persist as non-B structures in native DNA have been shown to be accessible to chemical modification (6,40,41). When we used published methodology (29) for the chemical deamination of cytosine residues by sodium bisulfite under non-denaturing conditions, we found that deamination occurred only in the region upstream of the cloned region (Turquoise in Figure 2) and not in the cloned region itself (Yellow in Figure 2) or the region downstream of the cloned region (Green in Figure 2) (data not shown). The sequence in this intronic region of RACK7 was not rearranged in the prostate cancer cell line; however, the bisulfite-accessible region contained deletions of different lengths (Figure 3). Variation among individual isolates of the region suggests that the deletion process may be ongoing and cumulative. Further, deletions did not block bisulfite access, since deletions were observed even in sequences containing heavily deaminated regions after treatment with bisulfite under non-denaturing conditions (Figure 3A). We also noted TCC→TTC or TAC mutations characteristic of the AID and APOBEC family of single-strand specific DNA cytosine deaminases downstream from the deletions in untreated DNA (Figure 3C).
Studies of dog-1 induced deletions in C. elegans (42) found that deletions are generally sharply confined to the 3’ end of the G-rich strand conforming to the general G-quadruplex folding rule (2) (G₃₊N₁₋₇G₃₊N₁₋₇G₃₊N₁₋₇G₃₊). The data in Figure 3 display the C-rich strand from this region of the human RACK7 gene, since the extreme GC-skew in the region precludes extensive bisulfite-mediated C→T deamination on the G-rich strand. Deletions tended to be more commonly confined to the 5’ end of the (TCCC)9 repeat of the C-rich strand (Figure 3). On the complementary G-rich strand, the deletions are confined to what would be the 3’ end of the complementary (GGGA)9 repeat that conforms to the general G-quadruplex folding rule (2). Thus, the positioning of the deletions is similar to that observed in dog-1 deficient C. elegans.
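The folding rule quoted above (four runs of three or more guanines separated by loops of one to seven nucleotides) translates directly into a regular expression. The sketch below is one illustrative way to scan both strands of a sequence for the rule; the function names are hypothetical and this is not the search tool used to annotate the clones.

```python
# Sketch: scan a sequence for the G-quadruplex folding rule
# G(3+) N(1-7) G(3+) N(1-7) G(3+) N(1-7) G(3+).
import re

G4_RULE = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_g4_motifs(seq: str) -> list[tuple[int, int]]:
    """Return (start, end) spans of non-overlapping matches on the given strand."""
    return [match.span() for match in G4_RULE.finditer(seq.upper())]


g_rich = "GGGA" * 9  # (GGGA)9 model sequence from the G-rich strand
c_rich = "TCCC" * 9  # (TCCC)9 repeat from the C-rich strand

print(find_g4_motifs(g_rich))                      # motif found on the G-rich strand
print(find_g4_motifs(c_rich))                      # nothing on the C-rich strand itself
print(find_g4_motifs(reverse_complement(c_rich)))  # found on its complement
```

Because `finditer` reports non-overlapping matches, a long repeat such as (GGGA)9 is returned as a small number of spans rather than every possible register of the motif.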
Our cloning strategy was based on the possibility that a variety of non-B structures could uncouple lagging strand replication at a leading strand impediment (Figure 1). Although non-B DNA structures can produce this effect, hybrid structures like the R-Loop are also candidates. DRIP methods report a weak R-loop signal in this region in epithelial cells, but not in fibroblast or leukemia cell lines (43). RDIP-Seq reports a weak R-loop signal in both epithelial and fibroblast cell lines (44), and, surprisingly, DRIPc reports a weak R-Loop signal only on the minus strand in a pluripotent cell line (45). Even so, an R-Loop in RACK7 would have to violate the general rule that the nascent RNA strand is the G-rich strand in the region of GC-skew (10,46), because transcription of RACK7 produces a nascent C-rich strand RNA. Nevertheless, we tested the region for bisulfite sensitivity after pretreating the DNA with RNAse H. The intense bisulfite-induced deamination seen on the C-rich strand under non-denaturing conditions (Figure 3A) is only moderately reduced by pretreatment of the DNA with RNAse H (Figure 3B), suggesting that the open structure in this region is not completely abolished by treatment with RNAse H, as is generally seen with R-Loops. The most intense deamination occurs within the (CCTG)8-CC-(TCCC)9 region of the repetitive element (Figure 3A and 3B) with or without RNAse H pretreatment. Assuming that each sequence is an independent isolate of a representative pool, analysis of the sequence sets produced by bisulfite-mediated deamination with or without RNAse H pretreatment shows that the difference between the two sets is not statistically significant at the p<0.05 level in either one-sided or two-sided t-tests.
In order to study the nature of the structures that can form in the region of bisulfite accessibility, we performed biophysical analysis on model oligodeoxynucleotides corresponding to each strand from this region using circular dichroism (CD) and UV thermal melting/annealing analysis (Figure 7). The CD spectrum of the model oligodeoxynucleotide from the G-rich strand shows a strong positive signal at 263 nm, indicative of a parallel G-quadruplex structure (Figure 7B). Consistent with this observation, the UV melting/annealing experiments (Figure 7A) revealed a structure so stable under physiological-like conditions that it would not melt under the conditions of the experiment. Consequently, we had to reduce the potassium cation concentration from 100 mM, similar to physiological conditions, to 25 mM. Only then were we able to observe a melting transition (Tm=81˚C). This suggests that the G-quadruplex is highly stable in conditions that mimic physiological pH and cation concentration. The data also clearly support the formation of a G-quadruplex associated with deletions at its 3’ end under physiological conditions. The stability of the structure indicated by its high Tm is consistent with its formation even within sequences that have undergone short deletions near its 3’ end, as evidenced by the bisulfite accessibility of the region in those sequences carrying deletions depicted in Figure 3.
Figure 7. UV and circular dichroism analysis of model oligodeoxynucleotides. 36-mer (GGGA)9 corresponding to the region with G-quadruplex forming potential in the G-rich strand of RACK7 (top panels) and model 37-mer (TCCC)9T corresponding to the i-motif region in the complementary C-rich strand (bottom panels). (A) UV melting/annealing experiments with 2.5 μM (GGGA)9 in 10 mM sodium cacodylate with 25 mM KCl at pH 7.0. (B) The CD spectrum of 10 μM (GGGA)9 in 10 mM sodium cacodylate with 100 mM KCl at pH 7.0. (C). UV melting/annealing experiments with 2.5 μM (TCCC)9T in 10 mM sodium cacodylate with 100 mM KCl at pH 6.5. (D) CD spectra of 10 μM (TCCC)9T in 10 mM sodium cacodylate with 100 mM KCl at pH 5.0, 5.5, 6.0, 6.5, 6.75, 7.0, 7.5 and 8.0. Inset is a plot of ellipticity at 288 nm versus pH.
We also studied a model oligodeoxynucleotide from the C-rich strand for i-motif structures most commonly detected in C-rich strand sequences. Such structures involve C:C+
Discussion
The model used in the cloning design (Figure 1) is based on evidence demonstrating that CMG uncoupling permits lagging strand synthesis to continue past a replication impediment for as much as 6 kb beyond the leading strand halt site (27,28). The model is supported by the cloning results reported herein. Those results show that a sequence conforming to the G-quadruplex folding rule (2) within a region of GC-skew was present within 5 kb of the cloned sequence in 33% of the sequences cloned from the normal cell line and 62% of the sequences from the tumor cell line. Several studies (13-15) suggest that DNA Quadruplex formation can provide an impediment to DNA replication that can lead to DNA damage in cells compromised in one or more of the numerous systems that aid replication and repair at these structure prone sequences. The two-fold increase in G-quadruplex linked clones in the tumor cell line compared to the normal cell line could be associated with the absence of functional p53 in the tumor cell line (48). Functional p53 is important in binding at least one quadruplex resolving helicase (BLM) (19) and perhaps another (WRN) (18) at sites of DNA damage and repair (49,50), where they appear to cooperate with FANCJ to resolve G-quadruplex replication impediments (51). However, both BLM and WRN have also been shown to resolve RNA:DNA hybrids in vitro (21,22) so the p53 lesion in the PC3 cell line does not exclude the possibility that a fraction of the sites we have cloned reside near replication impediments caused by R-Loops.
In addition to quadruplex-forming structures, other non-B structures that are not characterized by GC-skew are capable of causing DNA damage and influencing repair (52). Consistent with this possibility, our data showed that 60% of clones were within 5 kb of a sequence capable of non-B structure formation in the normal cell line, while 74% of the clones from the tumor cell line were within 5 kb of a non-B structure forming sequence. Of particular interest are the representatives of the multiply cloned sequences from the tumor cell line. Only the repair-compromised tumor cell line yielded multiple clones from the same region, suggesting that these sites represent recurring replication impediments in the tumor cell line. Consistent with this possibility, seven overlapping clones from the same region of RACK7 were obtained from the tumor cell line. Furthermore, six of those seven clones carried an identical breakpoint at one end. Since this breakpoint (gray arrows in Figure 2) is about 1 kb 3’ to the site of the open structures detected with bisulfite treatment of native DNA (red region in Figure 2), it is also consistent with our cloning model, in which the common breakpoint would represent a site at or very near the uncoupled CMG halt site, with the leading strand replication impediment at the site of the non-B structure.
The involvement of quadruplex structures at the putative replication impediment in RACK7 is suggested by the data in Figures 2 and 3 and by the biophysical studies on the representative oligodeoxynucleotides from the region (Figure 7). First, nearly all sequences recovered from this region contain a deletion in the region of bisulfite accessibility in native DNA, and neither the deletions nor the RNase H treatment fully block the formation of the open structure detected by bisulfite treatment of native DNA. Second, those sequences recovered from bisulfite-treated DNA (with or without RNase H pretreatment) for which bisulfite-mediated deamination was detected retain a nearly full copy of the (GGGA)9/(TCCC)9 duplex capable of G-quadruplex and i-motif quadruplex formation.
In many cases of R-Loop formation, the shortening of the duplex region caused by the RNA:DNA hybrid’s A-form structure is thought to bring regions of the looped-out G-rich strand into juxtaposition where G-quadruplex formation can occur (53). Given the stability of the G-quadruplex, it may persist after an R-Loop or transcription process has been resolved. In the case of the structure at RACK7 under study here, if the RNase H results are interpreted to mean that a non-canonical R-Loop forms between the nascent C-rich strand RNA transcript and the G-rich strand DNA, generating a C-rich loop, then a quadruplex in the form of the i-motif structure, which is stable at neutral pH and physiological temperature at the (TCCC)9 sequence in the loop, would again be involved. RNase H treatment could remove the hybridized RNA while permitting the i-motif to persist in sequences with deletions that do not remove the (TCCC)9 region. Alternatively, the i-motif structure in the loop might slow RNase H action. Here again the data are consistent with our cloning model, where the putative i-motif and an R-Loop provide the generic replication impediment depicted in Figure 1.
Additional support for the single-stranded nature of the C-rich strand comes from the detection of TCC→TTC or TAC transitions downstream from the deletions. Transition mutations of this type are characteristic of the APOBEC family (54), and it is reasonable to suspect that the AID and APOBEC family of deaminases may be attacking single-stranded DNA in the cells that contain these extensively remodeled sequences. Moreover, these enzymes are known to prefer low pH (55) and single-stranded DNA. The observation is also consistent with the preference of APOBEC for single-stranded DNA at replication forks and DSBs (56), and the preference of the AID DNA deaminase for single-stranded DNA formed near G-quadruplex model oligodeoxynucleotides (57). Clearly, deletion of the quadruplex-forming sequences would preclude their formation, but it is also important to note that dC→dU mutagenesis would have a similar effect. For example, the i-motif is destabilized by the introduction of a damaged base (37).
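A screen for C-to-T changes in a TC context of the kind described above can be sketched as follows. The sketch assumes that the reference and clone sequences have already been aligned without gaps to the same length and that both are given for the C-rich strand; the function name and the use of a preceding T as the only context criterion are illustrative simplifications, not the procedure used to call the mutations reported here.

```python
def tc_context_c_to_t(reference, clone):
    """Return 0-based positions where a reference C that follows a T reads as T in the clone.

    Both sequences are assumed to be pre-aligned, gap-free and of equal length.
    """
    if len(reference) != len(clone):
        raise ValueError("sequences must be pre-aligned to equal length")
    reference, clone = reference.upper(), clone.upper()
    positions = []
    for i in range(1, len(reference)):
        in_tc_context = reference[i - 1] == "T" and reference[i] == "C"
        if in_tc_context and clone[i] == "T":
            positions.append(i)
    return positions
```

Applied to a hypothetical reference ATCCGTCCA and clone ATTCGTCCA, the function reports position 2, the first C of the first TCC. A C-to-T change at a C that does not follow a T is not reported.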
Finally, in nearly all aspects, the data from the human RACK7 region mirror the data obtained from dog-1 deficient C. elegans (42,58). Their data rule out XPF-1 mediated excision repair or MUS-81 linked homologous recombination repair in favor of a bypass mechanism (28,59) or a replication fork merging mechanism that would generate a persistent gap on the leading strand opposite the putative G-quadruplex (42,58). Subsequent replication of the region would generate double strand breaks that they showed required DNA polymerase theta (25) end joining repair to create the observed deletions (26). Although the enzymatic properties of DOG-1 do not appear to have been studied in detail, human FANCJ is known to unwind RNA:DNA hybrids (24). Moreover, R-Loops have been observed to impair replication in C. elegans (60). Hence a very similar mechanism in which dog-1 mutants fail to resolve R-Loop replication impediments is not ruled out by the data on deletions in regions conforming to the G-quadruplex folding rule in C. elegans. Models for the genesis of deletions at RACK7 also include bypass as well as template switching mechanisms that can generate the deletions associated with replication impediments at sites of non-B structure formation (Figure 8).
Figure 8. DNA synthesis at non-B DNA replication impediments. (A) Bypass synthesis or merging with a distal replication fork at a persistent G-quadruplex structure generates a persistent gapped structure that on further replication generates double strand breaks repaired by end-joining that result in deletions (26). (B) Bypass at a persistent R-Loop involving an i-motif on the single strand produces two types of persistent gaps if G-quadruplex or i-motif structures form on the looped out strand. Further replication generates two types of deletion. (C) Out of register template switching restarts the replication fork and also generates a deletion after further replication.
Comparison with G4-seq. G4-seq (61,62) provides an effective way of globally mapping G-quadruplex sequence potential by using high-throughput next generation sequencing to detect sites that can adopt G-quadruplex structures when single-stranded DNA is exposed to K+ and G-quadruplex stabilizing ligands during sequencing in vitro. On the other hand, G4-seq does not offer a means for determining which of the sequences identified in vitro actually play a role in vivo. In this regard it is important to compare the number of sequences identified by G4-seq with the immunohistochemistry results demonstrating the presence of G-quadruplex in the nucleus of mammalian cells (63). In those experiments the highest number of fluorescent foci in the nucleus was seen during S-phase, consistent with replication-dependent formation of G-quadruplex (63). Even so, only about 35 foci/nucleus were detected at any one time in S-phase. G4-seq returns more than 716,000 distinct genomic sites capable of G-quadruplex formation (61). Consequently, even if the fluorescent foci represent replication factories (64) containing several hundred growing replication forks, fewer than 2% of the sequences identified with G4-seq would be present as G-quadruplex at any one time during replication, and most of these would be resolved properly by G-quadruplex suppression systems.
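The arithmetic behind the 2% bound can be written out explicitly. The sketch below takes the two figures quoted above (about 35 foci per nucleus and 716,000 G4-seq sites) and assumes, for illustration only, that every fork in a focus is engaged with exactly one G-quadruplex site; the number of forks per focus is a free parameter because "several hundred" is not specified more precisely.

```python
G4_SEQ_SITES = 716_000    # sites reported by G4-seq (61)
FOCI_PER_NUCLEUS = 35     # approximate S-phase foci per nucleus (63)


def fraction_engaged(forks_per_focus):
    """Upper-bound fraction of G4-seq sites engaged at one time.

    Assumes one G-quadruplex site per fork (an illustrative assumption).
    """
    return FOCI_PER_NUCLEUS * forks_per_focus / G4_SEQ_SITES


def max_forks_below(limit=0.02):
    """Largest whole number of forks per focus that keeps the fraction under `limit`."""
    n = int(limit * G4_SEQ_SITES / FOCI_PER_NUCLEUS)
    while n > 0 and fraction_engaged(n) >= limit:
        n -= 1
    return n
```

With 300 forks per focus the bound is 10,500 of 716,000 sites, or about 1.5%. Under these assumptions the fraction stays below 2% for up to 409 forks per focus, which is compatible with the "several hundred" forks considered in the text.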
The approach we describe herein offers a much-needed method for supplementing G4-seq data by directly detecting those quadruplex-forming sequences that have blocked DNA polymerase progression in vivo in a given cell type. Moreover, while G4-seq is focused on G-quadruplex, and our results support the idea that G-quadruplex formation is a major contributor to replication stress, our method also has the potential to identify other non-B structure-forming sequences that induce helicase uncoupling linked to replication stress.
Conclusion
All of the data presented above are consistent with the cloning model depicted in Figure 1. Moreover, the elevated frequency of cloned sequences adjacent to sites of G-quadruplex-forming potential in the repair-compromised tumor cell line supports the idea that endogenous G-quadruplex and i-motif formation is an important source of genetic instability in this cell line. Further, the detailed analysis of the site within RACK7 from the tumor cell line strongly suggests that deletions and mutations are linked to recurrent quadruplex DNA formation at that site. This suggests that the cloning procedure we describe permits the in vivo identification of sequences that often produce DNA polymerase-helicase uncoupling during replication, where the intercession of damage-prone repair processes is required. As such, the method provides a valuable approach to identifying those sequences with G-quadruplex potential that actually produce replication impediments in vivo.
Conflicts of Interest
None to be declared.
Authors’ Contributions
C. A., J. C., V. B., J. M., M. M., E. W. and M. A. conducted experiments; F. P. designed experiments; Z. A. E. W. designed and conducted experiments; S. S. S. designed and conducted experiments and wrote the paper.
Acknowledgements
This work was supported by the Biotechnology and Biological Sciences Research Council (BB/L02229X/1) to Z.A.E.W., by grant 5R01-CA102521 to S.S.S. from the U.S. National Cancer Institute of the National Institutes of Health, and by the Ensign Foundation. E.F.W. was supported by a Wellcome Trust grant (204515/Z/16/Z). Research reported in this publication also included work performed in the Integrative Genomics and Bioinformatics Core and the Cytogenetics Core supported by the National Cancer Institute of the National Institutes of Health under award number P30CA033572. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
References
- 1. Ginno PA, Lott PL, Christensen HC, Korf I, Chedin F. R-loop formation is a distinctive characteristic of unmethylated human CpG island promoters. Mol Cell. 2012;45(6):814–825. doi: 10.1016/j.molcel.2012.01.017. PMID: 22387027.
- 2. Huppert JL, Balasubramanian S. Prevalence of quadruplexes in the human genome. Nucleic Acids Res. 2005;33(9):2908–2916. doi: 10.1093/nar/gki609. PMID: 15914667.
- 3. Huppert JL, Balasubramanian S. G-quadruplexes in promoters throughout the human genome. Nucleic Acids Res. 2007;35(2):406–413. doi: 10.1093/nar/gkl1057. PMID: 17169996.
- 4. Wang G, Vasquez KM. Effects of replication and transcription on DNA structure-related genetic instability. Genes (Basel). 2017;8(1). doi: 10.3390/genes8010017. PMID: 28067787.
- 5. Day HA, Pavlou P, Waller ZA. I-motif DNA: Structure, stability and targeting with ligands. Bioorg Med Chem. 2014;22(16):4407–4418. doi: 10.1016/j.bmc.2014.05.047. PMID: 24957878.
- 6. Siddiqui-Jain A, Grand CL, Bearss DJ, Hurley LH. Direct evidence for a G-quadruplex in a promoter region and its targeting with a small molecule to repress c-myc transcription. Proc Natl Acad Sci USA. 2002;99(18):11593–11598. doi: 10.1073/pnas.182256799. PMID: 12195017.
- 7. Del Mundo IMA, Zewail-Foote M, Kerwin SM, Vasquez KM. Alternative DNA structure formation in the mutagenic human c-myc promoter. Nucleic Acids Res. 2017;45(8):4929–4943. doi: 10.1093/nar/gkx100. PMID: 28334873.
- 8. Wright EP, Huppert JL, Waller ZAE. Identification of multiple genomic DNA sequences which form i-motif structures at neutral pH. Nucleic Acids Res. 2017;45(6):2951–2959. doi: 10.1093/nar/gkx090. PMID: 28180276.
- 9. Sollier J, Stork CT, Garcia-Rubio ML, Paulsen RD, Aguilera A, Cimprich KA. Transcription-coupled nucleotide excision repair factors promote R-loop-induced genome instability. Mol Cell. 2014;56(6):777–785. doi: 10.1016/j.molcel.2014.10.020. PMID: 25435140.
- 10. Sollier J, Cimprich KA. Breaking bad: R-loops and genome integrity. Trends Cell Biol. 2015;25(9):514–522. doi: 10.1016/j.tcb.2015.05.003. PMID: 26045257.
- 11. Clark J, Smith SS. Secondary structure at a hot spot for DNA methylation in DNA from human breast cancers. Cancer Genomics Proteomics. 2008;5(5):241–251. PMID: 19129555.
- 12. Guilbaud G, Murat P, Recolin B, Campbell BC, Maiter A, Sale JE, Balasubramanian S. Local epigenetic reprogramming induced by G-quadruplex ligands. Nat Chem. 2017;9(11):1110–1117. doi: 10.1038/nchem.2828. PMID: 29064488.
- 13. Lopes J, Piazza A, Bermejo R, Kriegsman B, Colosio A, Teulade-Fichou MP, Foiani M, Nicolas A. G-quadruplex-induced instability during leading-strand replication. EMBO J. 2011;30(19):4033–4046. doi: 10.1038/emboj.2011.316. PMID: 21873979.
- 14. Kruisselbrink E, Guryev V, Brouwer K, Pontier DB, Cuppen E, Tijsterman M. Mutagenic capacity of endogenous G4 DNA underlies genome instability in FANCJ-defective C. elegans. Curr Biol. 2008;18(12):900–905. doi: 10.1016/j.cub.2008.05.013. PMID: 18538569.
- 15. Paeschke K, Capra JA, Zakian VA. DNA replication through G-quadruplex motifs is promoted by the Saccharomyces cerevisiae Pif1 DNA helicase. Cell. 2011;145(5):678–691. doi: 10.1016/j.cell.2011.04.015. PMID: 21620135.
- 16. Smith SS. Evolutionary expansion of structurally complex DNA sequences. Cancer Genomics Proteomics. 2010;7(4):207–215. PMID: 20656986.
- 17. Boyer AS, Grgurevic S, Cazaux C, Hoffmann JS. The human specialized DNA polymerases and non-B DNA: Vital relationships to preserve genome integrity. J Mol Biol. 2013;425(23):4767–4781. doi: 10.1016/j.jmb.2013.09.022. PMID: 24095858.
- 18. Fry M, Loeb LA. Human Werner syndrome DNA helicase unwinds tetrahelical structures of the fragile X syndrome repeat sequence d(CGG)n. J Biol Chem. 1999;274(18):12797–12802. doi: 10.1074/jbc.274.18.12797. PMID: 10212265.
- 19. Huber MD, Lee DC, Maizels N. G4 DNA unwinding by BLM and Sgs1p: Substrate specificity and substrate-specific inhibition. Nucleic Acids Res. 2002;30(18):3954–3961. doi: 10.1093/nar/gkf530. PMID: 12235379.
- 20. Wu Y, Shin-ya K, Brosh RM Jr. FANCJ helicase defective in Fanconia anemia and breast cancer unwinds G-quadruplex DNA to defend genomic stability. Mol Cell Biol. 2008;28(12):4116–4128. doi: 10.1128/MCB.02210-07. PMID: 218426915.
- 21. Chakraborty P, Grosse F. WRN helicase unwinds Okazaki fragment-like hybrids in a reaction stimulated by the human DHX9 helicase. Nucleic Acids Res. 2010;38(14):4722–4730. doi: 10.1093/nar/gkq240. PMID: 20385589.
- 22. Grierson PM, Lillard K, Behbehani GK, Combs KA, Bhattacharyya S, Acharya S, Groden J. BLM helicase facilitates RNA polymerase I-mediated ribosomal RNA transcription. Hum Mol Genet. 2012;21(5):1172–1183. doi: 10.1093/hmg/ddr545. PMID: 22106380.
- 23. Chib S, Byrd AK, Raney KD. Yeast helicase Pif1 unwinds RNA:DNA hybrids with higher processivity than DNA:DNA duplexes. J Biol Chem. 2016;291(11):5889–5901. doi: 10.1074/jbc.M115.688648. PMID: 26733194.
- 24. Cantor S, Drapkin R, Zhang F, Lin Y, Han J, Pamidi S, Livingston DM. The BRCA1-associated protein BACH1 is a DNA helicase targeted by clinically relevant inactivating mutations. Proc Natl Acad Sci USA. 2004;101(8):2357–2362. doi: 10.1073/pnas.0308717101. PMID: 14983014.
- 25. He P, Yang W. Template and primer requirements for DNA pol theta-mediated end joining. Proc Natl Acad Sci USA. 2018;115(30):7747–7752. doi: 10.1073/pnas.1807329115. PMID: 29987024.
- 26. Koole W, van Schendel R, Karambelas AE, van Heteren JT, Okihara KL, Tijsterman M. A polymerase theta-dependent repair pathway suppresses extensive genomic instability at endogenous G4 DNA sites. Nat Commun. 2014;5:3216. doi: 10.1038/ncomms4216. PMID: 24496117.
- 27. Taylor MRG, Yeeles JTP. Dynamics of replication fork progression following helicase-polymerase uncoupling in eukaryotes. J Mol Biol. 2019;431:2040–2049. doi: 10.1016/j.jmb.2019.03.011. PMID: 30894292.
- 28. Yeeles JTP, Poli J, Marians KJ, Pasero P. Rescuing stalled or damaged replication forks. Cold Spring Harb Perspect Biol. 2013;5:a012815. doi: 10.1101/cshperspect.a012815. PMID: 23637285.
- 29. Raghavan SC, Tsai A, Hsieh CL, Lieber MR. Analysis of non-B DNA structure at chromosomal sites in the mammalian genome. Methods Enzymol. 2006;409:301–316. doi: 10.1016/S0076-6879(05)09017-8. PMID: 16793408.
- 30. Shevchuk T, Kretzner L, Munson K, Axume J, Clark J, Dyachenko OV, Caudill M, Buryanov Y, Smith SS. Transgene-induced CCWGG methylation does not alter CG methylation patterning in human kidney cells. Nucleic Acids Res. 2005;33(19):6124–6136. doi: 10.1093/nar/gki920. PMID: 16246913.
- 31. Tuxhorn JA, Ayala GE, Smith MJ, Smith VC, Dang TD, Rowley DR. Reactive stroma in human prostate cancer: Induction of myofibroblast phenotype and extracellular matrix remodeling. Clin Cancer Res. 2002;8(9):2912–2923. PMID: 1231536.
- 32. Munson K, Clark J, Lamparska-Kupsik K, Smith SS. Recovery of bisulfite-converted genomic sequences in the methylation-sensitive qPCR. Nucleic Acids Res. 2007;35(9):2893–2903. doi: 10.1093/nar/gkm055. PMID: 17439964.
- 33. Yu K, Roy D, Huang FT, Lieber MR. Detection and structural analysis of R-loops. Methods Enzymol. 2006;409:316–329. doi: 10.1016/S0076-6879(05)09018-X. PMID: 16793409.
- 34. Katoh K, Misawa K, Kuma K, Miyata T. MAFFT: A novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002;30(14):3059–3066. doi: 10.1093/nar/gkf436. PMID: 12136088.
- 35. Notredame C, Higgins DG, Heringa J. T-Coffee: A novel method for fast and accurate multiple sequence alignment. J Mol Biol. 2000;302(1):205–217. doi: 10.1006/jmbi.2000.4042. PMID: 10964570.
- 36. Brown NP, Leroy C, Sander C. MView: A web-compatible database search or multiple alignment viewer. Bioinformatics. 1998;14(4):380–381. doi: 10.1093/bioinformatics/14.4.380. PMID: 9632837.
- 37. Wright EP, Lamparska K, Smith SS, Waller ZAE. Substitution of cytosine with guanylurea decreases the stability of i-motif DNA. Biochemistry. 2017;56(36):4879–4883. doi: 10.1021/acs.biochem.7b00628. PMID: 28853563.
- 38. Botchan M, McKenna G, Sharp PA. Cleavage of mouse DNA by a restriction enzyme as a clue to the arrangement of genes. Cold Spring Harb Symp Quant Biol. 1974;38:383–395. doi: 10.1101/sqb.1974.038.01.041. PMID: 4364785.
- 39. Hamer DH, Thomas CA Jr. The cleavage of Drosophila melanogaster DNA by restriction endonucleases. Chromosoma. 1975;49:243–267. doi: 10.1007/BF00361069.
- 40. Hansel-Hertsch R, Spiegel J, Marsico G, Tannahill D, Balasubramanian S. Genome-wide mapping of endogenous G-quadruplex DNA structures by chromatin immunoprecipitation and high-throughput sequencing. Nat Protoc. 2018;13(3):551–564. doi: 10.1038/nprot.2017.150. PMID: 29470465.
- 41. Raghavan SC, Swanson PC, Wu X, Hsieh CL, Lieber MR. A non-B-DNA structure at the bcl-2 major breakpoint region is cleaved by the RAG complex. Nature. 2004;428(6978):88–93. doi: 10.1038/nature02355. PMID: 14999286.
- 42. Lemmens B, van Schendel R, Tijsterman M. Mutagenic consequences of a single G-quadruplex demonstrate mitotic inheritance of DNA replication fork barriers. Nat Commun. 2015;6:8909. doi: 10.1038/ncomms9909. PMID: 26563448.
- 43. Sanz LA, Hartono SR, Lim YW, Steyaert S, Rajpurkar A, Ginno PA, Xu X, Chedin F. Prevalent, dynamic, and conserved R-loop structures associate with specific epigenomic signatures in mammals. Mol Cell. 2016;63(1):167–178. doi: 10.1016/j.molcel.2016.05.032. PMID: 27373332.
- 44. Nadel J, Athanasiadou R, Lemetre C, Wijetunga NA, Ó Broin P, Sato H, Zhang Z, Jeddeloh J, Montagna C, Golden A, Seoighe C, Greally JM. RNA:DNA hybrids in the human genome have distinctive nucleotide characteristics, chromatin composition, and transcriptional relationships. Epigenetics Chromatin. 2015;8:46. doi: 10.1186/s13072-015-0040-6. PMID: 26579211.
- 45. Jenjaroenpun P, Wongsurawat T, Sutheeworapong S, Kuznetsov VA. R-loopDB: A database for R-loop forming sequences (RLFS) and R-loops. Nucleic Acids Res. 2017;45(D1):D119–D127. doi: 10.1093/nar/gkw1054. PMID: 27899586.
- 46. Chedin F. Nascent connections: R-loops and chromatin patterning. Trends Genet. 2016;32(12):828–838. doi: 10.1016/j.tig.2016.10.002. PMID: 27793359.
- 47. Cui Y, Kong D, Ghimire C, Xu C, Mao H. Mutually exclusive formation of G-quadruplex and i-motif is a general phenomenon governed by steric hindrance in duplex DNA. Biochemistry. 2016;55(15):2291–2299. doi: 10.1021/acs.biochem.6b00016. PMID: 27027664.
- 48. Carroll AG, Voeller HJ, Sugars L, Gelmann EP. p53 oncogene mutations in three human prostate cancer cell lines. Prostate. 1993;23(2):123–134. doi: 10.1002/pros.2990230206. PMID: 8104329.
- 49. Wang XW, Tseng A, Ellis NA, Spillare EA, Linke SP, Robles AI, Seker H, Yang Q, Hu P, Beresten S, Bemmels NA, Garfield S, Harris CC. Functional interaction of p53 and BLM DNA helicase in apoptosis. J Biol Chem. 2001;276(35):32948–32955. doi: 10.1074/jbc.M103298200. PMID: 11399766.
- 50. Yang Q, Zhang R, Wang XW, Spillare EA, Linke SP, Subramanian D, Griffith JD, Li JL, Hickson ID, Shen JC, Loeb LA, Mazur SJ, Appella E, Brosh RM Jr., Karmakar P, Bohr VA, Harris CC. The processing of Holliday junctions by BLM and WRN helicases is regulated by p53. J Biol Chem. 2002;277(35):31980–31987. doi: 10.1074/jbc.M204111200. PMID: 12080066.
- 51. Sarkies P, Murat P, Phillips LG, Patel KJ, Balasubramanian S, Sale JE. FANCJ coordinates two pathways that maintain epigenetic stability at G-quadruplex DNA. Nucleic Acids Res. 2012;40(4):1485–1498. doi: 10.1093/nar/gkr868. PMID: 22021381.
- 52. Wang G, Vasquez KM. Impact of alternative DNA structures on DNA damage, DNA repair, and genetic instability. DNA Repair (Amst). 2014;19:143–151. doi: 10.1016/j.dnarep.2014.03.017. PMID: 24767258.
- 53. Duquette ML, Handa P, Vincent JA, Taylor AF, Maizels N. Intracellular transcription of G-rich DNAs induces formation of G-loops, novel structures containing G4 DNA. Genes Dev. 2004;18(13):1618–1629. doi: 10.1101/gad.1200804. PMID: 15231739.
- 54. Hoopes JI, Cortez LM, Mertz TM, Malc EP, Mieczkowski PA, Roberts SA. APOBEC3A and APOBEC3B preferentially deaminate the lagging strand template during DNA replication. Cell Rep. 2016;14(6):1273–1282. doi: 10.1016/j.celrep.2016.01.021. PMID: 26832400.
- 55. Ito F, Fu Y, Kao SA, Yang H, Chen XS. Family-wide comparative analysis of cytidine and methylcytidine deamination by eleven human APOBEC proteins. J Mol Biol. 2017;429(12):1787–1799. doi: 10.1016/j.jmb.2017.04.021. PMID: 28479091.
- 56. Roberts SA, Sterling J, Thompson C, Harris S, Mav D, Shah R, Klimczak LJ, Kryukov GV, Malc E, Mieczkowski PA, Resnick MA, Gordenin DA. Clustered mutations in yeast and in human cancers can arise from damaged long single-strand DNA regions. Mol Cell. 2012;46(4):424–435. doi: 10.1016/j.molcel.2012.03.030. PMID: 22607975.
- 57. Qiao Q, Wang L, Meng FL, Hwang JK, Alt FW, Wu H. AID recognizes structured DNA for class switch recombination. Mol Cell. 2017;67(3):361–373 e364. doi: 10.1016/j.molcel.2017.06.034. PMID: 28757211.
- 58. van Kregten M, Tijsterman M. The repair of G-quadruplex-induced DNA damage. Exp Cell Res. 2014;329(1):178–183. doi: 10.1016/j.yexcr.2014.08.038. PMID: 25193076.
- 59. Langston LD, O’Donnell M. DNA replication: Keep moving and don’t mind the gap. Mol Cell. 2006;23(2):155–160. doi: 10.1016/j.molcel.2006.05.034. PMID: 16857582.
- 60. Castellano-Pozo M, Garcia-Muse T, Aguilera A. R-loops cause replication impairment and genome instability during meiosis. EMBO Rep. 2012;13(10):923–929. doi: 10.1038/embor.2012.119. PMID: 22878416.
- 61. Chambers VS, Marsico G, Boutell JM, Di Antonio M, Smith GP, Balasubramanian S. High-throughput sequencing of DNA G-quadruplex structures in the human genome. Nat Biotechnol. 2015;33(8):877–881. doi: 10.1038/nbt.3295. PMID: 26192317.
- 62. Marsico G, Chambers VS, Sahakyan AB, McCauley P, Boutell JM, Antonio MD, Balasubramanian S. Whole genome experimental maps of DNA G-quadruplexes in multiple species. Nucleic Acids Res. 2019;47(8):3862–3874. doi: 10.1093/nar/gkz179. PMID: 308926626.
- 63. Biffi GT, Tannahill D, McCafferty J, Balasubramanian S. Quantitative visualization of DNA G-quadruplex structures in human cells. Nat Chem. 2013;5:182–186. doi: 10.1038/nchem.1548. PMID: 23422559.
- 64. Masai H, Matsumoto S, You Z, Yoshizawa-Sugata N, Oda M. Eukaryotic chromosome DNA replication: Where, when, and how. Annu Rev Biochem. 2010;79:89–130. doi: 10.1146/annurev.biochem.052308.103205. PMID: 20373915.











