2020 Jun 4;78(5):975–985.e7. doi: 10.1016/j.molcel.2020.03.027

Genome-wide Nucleotide-Resolution Mapping of DNA Replication Patterns, Single-Strand Breaks, and Lesions by GLOE-Seq

Annie M Sriramachandran 1, Giuseppe Petrosino 1, María Méndez-Lago 1, Axel J Schäfer 1, Liliana S Batista-Nascimento 1, Nicola Zilio 1, Helle D Ulrich 1,2
PMCID: PMC7276987  PMID: 32320643

Summary

DNA single-strand breaks (SSBs) are among the most common lesions in the genome, arising spontaneously and as intermediates of many DNA transactions. Nevertheless, in contrast to double-strand breaks (DSBs), their distribution in the genome has hardly been addressed in a meaningful way. We now present a technique based on genome-wide ligation of 3′-OH ends followed by sequencing (GLOE-Seq) and an associated computational pipeline designed for capturing SSBs but versatile enough to be applied to any lesion convertible into a free 3′-OH terminus. We demonstrate its applicability to mapping of Okazaki fragments without prior size selection and provide insight into the relative contributions of DNA ligase 1 and ligase 3 to Okazaki fragment maturation in human cells. In addition, our analysis reveals biases and asymmetries in the distribution of spontaneous SSBs in yeast and human chromatin, distinct from the patterns of DSBs.

Keywords: DNA replication, DNA repair, DNA damage, DNA single-strand breaks, Okazaki fragments, genome-wide DNA lesion mapping, next-generation sequencing

Highlights

  • GLOE-Seq detects 3′-OH ends with nucleotide resolution in purified genomic DNA

  • GLOE-Seq maps single-strand breaks, lesions, and replication and repair intermediates

  • GLOE-Seq reveals insight into the use of ligases 1 and 3 in human cells

  • GLOE-Seq detects asymmetries in spontaneous nicks in yeast and human chromatin


We present a method for genome-wide, nucleotide-resolution mapping of DNA single-strand breaks in purified genomic DNA based on capture of 3′-OH ends followed by sequencing (GLOE-Seq). We validate the method and demonstrate its applicability to mapping of human and yeast Okazaki fragments, spontaneous single-strand breaks, and various other DNA lesions.

Introduction

Our understanding of the DNA damage response and the protective measures cells employ to resist genomic instability and malignant transformation relies to a large extent on analytical methods to detect and quantify DNA lesions in chromatin. Next-generation sequencing (NGS) tools for mapping DNA damage have transformed genome stability research by providing insight into the genome-wide distribution and dynamics of lesions at unprecedented resolution. A growing number of NGS protocols is already available for analysis of base lesions such as abasic sites, oxidative or methylation damage, and incorporated ribonucleotides (Clausen et al., 2015, Ding et al., 2015, Hu et al., 2017, Liu et al., 2019, Mao et al., 2017, Poetsch et al., 2018, Reijns et al., 2015, Wu et al., 2018) and for DNA double-strand breaks (DSBs) (Baranello et al., 2014, Canela et al., 2016, Crosetto et al., 2013, Hoffman et al., 2015, Hu et al., 2016, Lensing et al., 2016, Tsai et al., 2015, Vitelli et al., 2017, Yan et al., 2017). However, even with these established procedures, important information about the 3′ termini of DSBs remains untapped because of the need for end polishing during library preparation. In contrast to DSBs, single-strand breaks (SSBs), which are among the most common lesions and emerge in the genome spontaneously or as important intermediates of DNA replication and repair (Abbotts and Wilson, 2017, Caldecott, 2014), have, until recently, eluded high-resolution mapping by NGS approaches.

We developed a method based on capturing SSBs via genome-wide ligation of 3′-hydroxy (OH) ends followed by sequencing (GLOE-Seq). When applied directly to genomic DNA, this tool is capable of detecting nicks or gaps as well as the 3′ ends of DSBs. Alternatively, pre-digestion of the isolated DNA with a suitable lesion-specific endonuclease allows mapping of a variety of lesions, such as ultraviolet (UV) irradiation-induced pyrimidine dimers, abasic sites, or incorporated ribonucleotides. After assessing the accuracy and sensitivity of GLOE-Seq with in vitro-digested DNA, we benchmarked its performance by comparison with an established method for mapping of base lesions and demonstrated its applicability to physiological DNA damage and repair intermediates introduced in vivo by UV irradiation or an alkylating agent and by a site-specific endonuclease in budding yeast. We then explored the unique feature of GLOE-Seq, its ability to map pre-existing SSBs, by analyzing replication patterns as well as spontaneous breaks and nicks in budding yeast and human cells. We show that GLOE-Seq can accurately map Okazaki fragments without prior size selection, and we detect surprising biases in the distribution of spontaneous strand breaks, non-random and distinct from the pattern observed with DSB-selective methods. Our analysis provides insight into the relative contributions of DNA ligase 1 and ligase 3 to human Okazaki fragment maturation and validates GLOE-Seq as a versatile method for genome-wide mapping of a range of DNA lesions that promises to shed light onto the still poorly understood characteristics of SSBs in the genome.

Design

Most protocols for mapping DSBs rely on direct ligation of sequencing adaptors to in-vitro-blunted ends (Canela et al., 2016, Crosetto et al., 2013, Lensing et al., 2016, Yan et al., 2017). By comparison, it is less straightforward to capture SSBs with nucleotide resolution. Methods involving labeling via nick translation (Baranello et al., 2014) or polymerase tailing for adaptor ligation (Leduc et al., 2011) may obscure the original position of the terminus, reducing resolution. Direct ligation of short single-stranded DNA (ssDNA) fragments can be accomplished by RNA ligase and has been used for sequencing highly degraded DNA; for example, upon isolation from ancient samples (Gansauge and Meyer, 2013). Adaptor ligation was further improved by means of a splinter oligonucleotide harboring a stretch of random nucleotides that allows use of T4 DNA ligase (Gansauge et al., 2017). We explored whether an analogous approach could be applied to heat-denatured, intact DNA for genome-wide mapping of nicks and breaks with nucleotide resolution. In the optimized protocol (Figures 1A and S1), fragmentation of genomic DNA is delayed until after ligation of a biotinylated adaptor. To this end, all steps up to ligation, including cell lysis, are carried out in agarose plugs when applying GLOE-Seq to mammalian DNA. In this manner, SSBs are captured with minimal background. By applying an appropriate enzymatic treatment to the purified genomic DNA before thermal denaturation, nicks can also be generated at relevant positions to mark various types of base damage. For example, treatment with apurinic endonuclease (APE1) would introduce SSBs adjacent to abasic sites, whereas RNase H treatment would mark the positions of ribonucleotides in the genome. Thus, GLOE-Seq should be applicable to analysis of not only pre-existing SSBs but also any base lesion or modification for which a selective endonuclease is available. 
By developing a computational pipeline that allows mapping of 3′ termini or base lesions, we assembled a ready-to-use platform for downstream data analysis.

Figure 1.

Validation of GLOE-Seq with Budding Yeast Genomic DNA

(A) GLOE-Seq workflow. Green circle, ligatable 3′-OH terminus; red circle, biotin.

(B) 3′ ends of a DSB at a BsrDI site in genome browser view (FWD, forward or Watson strand; REV, reverse or Crick strand).

(C) Histogram plot showing the distribution of read counts for the asymmetric termini generated by BsrDI.

(D) Strand-specific detection of SSBs generated by Nb.BsrDI treatment.

(E) Genome-wide detection of predicted Nb.BsrDI sites (left). Most undetected sites are absent or poorly covered in a randomly fragmented sample (center). Many of the unexpected breaks reside in the immediate neighborhood of predicted sites (right).

(F) Closely spaced SSBs are poorly detected. Nb.BsrDI signals are plotted against the calculated distance to the (upstream) neighboring site on the same strand (red, mean; pink, SE).

(G) Sensitivity of GLOE-Seq. Nb.BsrDI-treated DNA was diluted with untreated DNA at the indicated ratios. The undiluted sample corresponds to (E).

Results

Validation of the GLOE-Seq Protocol

The resolution, specificity, and sensitivity of GLOE-Seq were addressed in proof-of-principle experiments using purified S. cerevisiae genomic DNA. GLOE-Seq libraries were sequenced at a depth of ~3 million reads in two replicates. To facilitate data analysis, we developed an easy-to-use, modular, and versatile computational pipeline called GLOE-Pipe. It detects, annotates, and visualizes strand breaks by assigning each uniquely mapping read to the corresponding original 3′ terminus. Direct inspection of reads from samples digested in vitro with a restriction endonuclease, BsrDI, revealed precise assignment of 3′ termini to the expected sequence (Figure 1B). Because BsrDI cleaves asymmetrically adjacent to its recognition sequence, digestion of genomic DNA yields two populations of termini with distinct complexities. Nevertheless, both populations exhibited an almost identical read distribution, confirming unbiased detection of termini (Figure 1C). Digestion with the analogous nicking enzyme Nb.BsrDI resulted in the expected strand-specific signals (Figure 1D), corresponding to more than 60% of total reads. Automatic peak-calling detected more than 90% of the predicted 6,271 sites in the yeast genome (Figure 1E, left). Among the undetected sites, more than 60% were either absent in our strain or resided in regions that did not sequence efficiently when randomly sheared DNA was used to prepare an equivalent library (Figure 1E, center). Moreover, Nb.BsrDI sites situated less than 100 nt downstream (3′) of a neighboring nick were detected with poor efficiency, presumably because of size selection during library preparation (Figure 1F). Peak calling also resulted in a small number of unexpected signals (Figure 1E, left). More than 50% of these were detected within 5 nt of a predicted Nb.BsrDI site (Figure 1E, right), indicating that they likely resulted from imprecise nicking or erosion of the termini rather than sequencing or computational artifacts.
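The comparison of called peaks with predicted sites described above can be sketched in a few lines of Python. This is an illustrative sketch and not part of GLOE-Pipe: the function name `classify_peaks`, the input layout (integer coordinates for one chromosome and one strand), and the toy coordinates are assumptions; only the 5-nt window is taken from the text.

```python
from bisect import bisect_left


def classify_peaks(predicted, called, tolerance=5):
    """Compare called break positions with predicted nick sites.

    predicted, called: iterables of integer coordinates on the same
    chromosome and strand (hypothetical input layout).
    Returns (detected_fraction, unexpected, near_miss):
      detected_fraction: share of predicted sites called at the exact position
      unexpected: called positions that are not predicted sites
      near_miss: unexpected positions within `tolerance` nt of a predicted site
    """
    predicted_sorted = sorted(set(predicted))
    predicted_set = set(predicted_sorted)
    called_set = set(called)

    detected = [p for p in predicted_sorted if p in called_set]
    unexpected = sorted(c for c in called_set if c not in predicted_set)

    def distance_to_nearest(pos):
        # Only the neighbours on either side of the insertion point can be nearest.
        i = bisect_left(predicted_sorted, pos)
        candidates = predicted_sorted[max(i - 1, 0):i + 1]
        return min(abs(pos - p) for p in candidates)

    if predicted_sorted:
        near_miss = [c for c in unexpected if distance_to_nearest(c) <= tolerance]
        fraction = len(detected) / len(predicted_sorted)
    else:
        near_miss, fraction = [], 0.0
    return fraction, unexpected, near_miss


# Toy example with made-up coordinates
fraction, unexpected, near_miss = classify_peaks(
    predicted=[100, 500, 900], called=[100, 502, 900, 2000]
)
```

In this toy input, two of the three predicted sites are called exactly, and one of the two unexpected calls lies within 5 nt of a predicted site, which is the kind of signal the text attributes to imprecise nicking or erosion of the termini.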

The sensitivity of the method was assessed by diluting Nb.BsrDI-treated DNA with untreated DNA at defined ratios (Figures S2A–S2C). All samples had been digested with NotI to generate a set of 80 defined peaks for standardization. As shown in Figure S2D, the number of detected Nb.BsrDI signals compared well with the numbers expected based on the dilution. At a sequencing depth of ~3 million reads, 1,000-fold dilution still resulted in a reproducible signal sufficient to provide good coverage of Nb.BsrDI sites (Figure 1G) and excellent correlation between replicates (r² > 0.86; Figure S2A). A higher sequencing depth could potentially provide additional sensitivity. These results confirm that GLOE-Seq is capable of selectively detecting even relatively rare events against the background of spontaneous SSBs in the genome.
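One simple way to compare diluted samples, assuming that the NotI peaks act as a constant internal standard, is to divide the reads at Nb.BsrDI sites by the reads at NotI sites and relate the result to the value expected from the dilution factor. The sketch below uses hypothetical function names and made-up read counts; the normalization actually applied in GLOE-Pipe may differ.

```python
def normalized_signal(site_reads, standard_reads):
    """Reads at the sites of interest per read at the internal standard sites."""
    if standard_reads <= 0:
        raise ValueError("standard_reads must be positive")
    return site_reads / standard_reads


def observed_vs_expected(undiluted_signal, diluted_signal, dilution_factor):
    """Ratio of the observed normalized signal to the value expected
    if the signal scaled perfectly with the dilution (1.0 = as expected)."""
    expected = undiluted_signal / dilution_factor
    return diluted_signal / expected


# Made-up counts for illustration only (not data from the study)
undiluted = normalized_signal(site_reads=6000, standard_reads=300)
diluted = normalized_signal(site_reads=9, standard_reads=450)
ratio = observed_vs_expected(undiluted, diluted, dilution_factor=1000)
```

A ratio close to 1 would indicate that the detected signal tracks the dilution, which is the behavior reported for Figure S2D.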

Comparison of GLOE-Seq with EndoSeq

GLOE-Seq resembles a method for detection of ribonucleotides incorporated into the genome called embedded ribonucleotide sequencing, or emRiboSeq (Ding et al., 2015, Reijns et al., 2015). A generalized version of this protocol, termed endonuclease sequencing (EndoSeq), has been applied to mapping of in vitro endonuclease-generated termini (Ding et al., 2015, Reijns et al., 2015), making it directly comparable with GLOE-Seq (Figure S3A). However, although both methods make use of a splinter oligonucleotide for capturing 3′-OH ends, GLOE-Seq critically relies on ligation of the biotinylated adaptor prior to any fragmentation, whereas in EndoSeq, fragmentation and ligation of the distal adaptor precede endonuclease treatment and denaturation (Ding et al., 2015, Reijns et al., 2015). Comparison of the two protocols on the same NGS platform revealed a higher percentage of detected sites and of reads mapped to predicted Nb.BsrDI sites by GLOE-Seq (Figure 2). To exclude sample preparation details in our hands as a source of this difference, we also compared Nb.BtsI sites mapped by GLOE-Seq with a published EndoSeq dataset (Reijns et al., 2015). Again, GLOE-Seq proved to be superior in terms of coverage and signal-to-background ratio (Figures S3B–S3D).

Figure 2.

Comparison of GLOE-Seq with EndoSeq

(A) A scatterplot shows the normalized numbers of reads at all Nb.BsrDI sites for both methods, grouped by peak calling.

(B) GLOE-Seq and EndoSeq detect a similar percentage of Nb.BsrDI sites, whereas pre-existing breaks are poorly detected by EndoSeq.

(C) Comparison of the percentage of total reads mapped to predicted Nb.BsrDI sites.

In contrast to EndoSeq, which was not designed to capture pre-existing nicks, the GLOE-Seq procedure minimizes fragmentation of genomic DNA up until ligation of the splinter adaptor. In this manner, GLOE-Seq preserves the original pattern of SSBs, whereas in the EndoSeq protocol, pre-existing nicks would likely be lost if fragmentation occurs preferentially at these structures. To compare the performance of the two methods in this regard, we applied EndoSeq to DNA that had been digested with a nicking enzyme prior to fragmentation. As expected, break signals were almost undetectable, indicating that fragmentation indeed occurred predominantly at the pre-existing SSBs, effectively preventing use of EndoSeq for mapping them (Figures 2B and 2C). In summary, these experiments validate GLOE-Seq as an efficient and sensitive method uniquely suited for genome-wide mapping of SSBs with nucleotide resolution.

Genome-wide Distribution of Breaks in Budding Yeast

For an assessment of GLOE-Seq in a physiological setting, we compared the break pattern in intact genomic yeast DNA with a sample that was heavily fragmented by sonication, representing a close-to-random distribution of breaks. In addition to excluding repetitive elements, reads were assigned with high stringency by disallowing any mismatches. Two salient features emerged from this analysis: a striking enrichment of break signals adjacent to centromere sequences (Figure 3A) and a number of well-defined peaks close to some chromosome ends, some of them displaying a high degree of strand bias (Figures 3B and S4). Although the centromere-associated breaks could result from topological stress or topoisomerase activity related to chromosome segregation, the subtelomeric patterns were unexpected and deserve further investigation.

Figure 3.

GLOE-Seq Analysis of DNA Lesions and Repair Intermediates in Budding Yeast

(A) Breaks are enriched around yeast centromeres. GLOE-Seq signals of untreated and randomly fragmented DNA were averaged across all 16 centromere regions and plotted.

(B) Breaks are enriched at yeast chromosome ends. Strand-specific GLOE-Seq signals of untreated and randomly fragmented DNA are shown at a representative subtelomeric region. See Figure S4 for an image of all chromosome ends.

(C) GLOE-Seq detects the 3′ ends of a DSB, generated in vivo by galactose (GAL)-mediated induction of the HO endonuclease in yeast. Both panels show normalized numbers of reads around the HO cleavage site in a genome browser view. Left panel: linear scale at high magnification; right panel: logarithmic (log2) scale at lower magnification.

(D) GLOE-Seq detects UV irradiation-induced pyrimidine dimers in yeast. Exponentially growing yeast cultures were exposed to the indicated doses of UV radiation, and lesions were converted to strand breaks by pre-treatment of isolated genomic DNA with T4 endonuclease V and APE1 where indicated. Plots show relative frequencies of dinucleotide sequences adjacent to the detected strand breaks.

(E) GLOE-Seq detects alkylation-induced base damage in yeast. G1-arrested WT and apn1Δ apn2Δ cells were exposed to 0.02% MMS for 30 min and released into S phase in the absence of MMS. Genomic DNA was isolated from samples collected at the indicated time points, and base lesions were converted to strand breaks by pre-treatment with AAG and APE1. Plots show relative nucleotide frequencies over time during the recovery period.

(F) GLOE-Seq detects BER intermediates in yeast. Strand breaks were detected in the same samples of genomic DNA as in (E) by GLOE-Seq without AAG/APE1 pre-treatment, and relative nucleotide frequencies were plotted as in (E).

To generate site-specific break signals in live cells, we used a strain carrying a galactose-inducible allele of the homothallic switching (HO) endonuclease (Lee et al., 1998). Induction for 1 h gave rise to prominent signals at the expected sequence on both strands (Figure 3C). Moreover, a population of 3′ termini was clearly detectable, revealing loss of a few nucleotides from each terminus. Thus, in contrast to DSB-selective methods, GLOE-Seq is capable of visualizing 3′ overhangs of DSBs with high precision.

Genome-wide Mapping of Base Lesions and Repair Intermediates in Budding Yeast

The potential to map base lesions by GLOE-Seq was explored by UV irradiation of live yeast and treatment of isolated genomic DNA with T4 endonuclease V, followed by APE1 endonuclease, to convert UV lesions to 3′-OH termini before adaptor ligation. As shown in Figures 3D and S5A, this treatment revealed a dose-dependent increase in the relative frequencies of break signals adjacent to pyrimidine dimers, with T-T representing the most frequent lesion, followed by T-C, C-T, and C-C.
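The dinucleotide frequencies plotted for this analysis can be illustrated with a short sketch. It assumes that the relevant dinucleotide is the pair of bases immediately 3′ of the captured terminus on the broken strand, and that breaks are given as 0-based positions of the nucleotide carrying the 3′-OH; both conventions, like the function names and the toy sequence, are assumptions made for illustration and not a description of GLOE-Pipe.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def dinucleotide_frequencies(reference, breaks):
    """Relative frequencies of the dinucleotide immediately 3' of each break.

    reference: forward-strand sequence (upper-case A/C/G/T).
    breaks: iterable of (position, strand); position is the 0-based index of
    the nucleotide carrying the 3'-OH, strand is "+" or "-".
    """
    counts = Counter()
    for pos, strand in breaks:
        if strand == "+":
            dinuc = reference[pos + 1:pos + 3]
        else:
            # On the reverse strand, 3' of the terminus means lower coordinates.
            if pos < 2:
                continue
            dinuc = reverse_complement(reference[pos - 2:pos])
        # Skip breaks too close to the sequence end or next to ambiguous bases.
        if len(dinuc) == 2 and set(dinuc) <= set("ACGT"):
            counts[dinuc] += 1
    total = sum(counts.values())
    return {d: n / total for d, n in counts.items()} if total else {}


# Toy example: made-up sequence and break positions
freqs = dinucleotide_frequencies(
    "GATTACCGGA", [(1, "+"), (4, "+"), (9, "-"), (1, "+")]
)
```

With real data, a dose-dependent rise in the TT, TC, CT, and CC fractions relative to the other dinucleotides would correspond to the enrichment described above.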

In addition, we used exposure to the alkylating agent methyl methanesulfonate (MMS) to determine whether GLOE-Seq can detect base lesions and the resulting repair intermediates in a single experiment. MMS predominantly generates N7-methylguanine (m7dG) in addition to ~10% N3-methyladenine (Beranek, 1990), which are removed by base excision repair (BER) and nucleotide excision repair (Xiao and Chow, 1998, Plosky et al., 2002). In vitro treatment of genomic DNA from MMS-exposed cells with human alkyl adenine DNA glycosylase (AAG) and APE1 should create nicks at alkylated purines, marking the damaged bases themselves, whereas omission of this pre-treatment should reveal endogenous nicks representing BER intermediates. Consequently, GLOE-Seq analysis of AAG/APE1-treated genomic DNA showed strong MMS-induced enrichment of SSBs adjacent to G at the expense of C and T, likely representing m7dG (Figure 3E). This enrichment decreased in samples taken during a recovery period, suggesting in vivo repair activity. Deletion of the two major apurinic endonuclease genes, APN1 and APN2, strongly interfered with recovery (Figure 3E) and activated damage signaling (Figure S5B), indicating that BER was indeed responsible for removal of the damaged bases. GLOE-Seq samples without AAG/APE1 pre-treatment showed a similar skew of the nucleotide distribution toward G (Figure 3F). Importantly, this MMS-induced imbalance was almost absent in the apn1Δ apn2Δ mutant, consistent with failure of this strain to initiate BER. These results show that GLOE-Seq can detect biologically relevant base lesions as well as DNA repair intermediates with nucleotide precision.

Analysis of DNA Replication Patterns in Budding Yeast

The termini of Okazaki fragments in replicating cells are an important physiological source of nicks. Usually, these are quickly ligated, but depletion of the replicative DNA ligase Cdc9 has allowed their mapping in budding yeast (Smith and Whitehouse, 2012). To accomplish this, fragments of the expected size of around 200 nt had to be isolated for sequencing. We wanted to find out whether GLOE-Seq would enable us to map Okazaki fragments in an unbiased manner without prior size selection. Genomic DNA was isolated from cdc9AID cells harboring a degron-tagged CDC9 allele (Kubota et al., 2013). GLOE-Seq analysis revealed the expected lagging-strand bias of break signals around replication origins (Figure 4A) and phasing of the reads correlating with nucleosome positioning around transcription start sites (Figure 4B), comparable with the results obtained by sequencing size-selected fragments (Smith and Whitehouse, 2012). Plotting of the replication fork directionality (RFD) index, representing the relative strand bias of the break signals, confirmed that GLOE-Seq is suitable for identifying the positions of replication initiation sites and termination zones in a genome-wide manner under conditions of ligase inhibition (Figure 4C). Intriguingly, RFD analysis of samples from ligase-proficient cells revealed a complementary but less prominent pattern with an excess of nicks in sequences corresponding to the leading rather than the lagging strand (Figure 4C). This pattern likely reflects processing of ribonucleotides naturally incorporated into newly synthesized DNA by replicative polymerases, considering that the leading-strand polymerase ε incorporates four times more ribonucleotides than the lagging-strand polymerase δ (Nick McElhinny et al., 2010).

Figure 4.

GLOE-Seq Analysis of DNA Replication Patterns in Budding Yeast

(A) GLOE-Seq detects enrichment of breaks on the predicted lagging strand around ARS (autonomously replicating sequence) consensus sequences (ACS) in ligase-depleted (cdc9AID) samples. §, data from Smith and Whitehouse (2012), shown for comparison.

(B) The distribution of 3′ ends correlates with nucleosome occupancy around transcription start sites (TSSs). Strand-specific GLOE-Seq signals corresponding to Okazaki fragments synthesized in the same (green) and the opposite (orange) direction as transcription (left to right) are aligned around TSSs together with nucleosome occupancy (black) and binding sites of the transcription factors Abf1, Reb1, and Rap1 (combined, gray) as in Smith and Whitehouse (2012).

(C) RFD (RFD = [REV – FWD]/[REV + FWD]) plots of yeast GLOE-Seq data reveal replication patterns by means of strand bias in the break distribution upon ligase depletion. A complementary pattern in wild-type (WT) samples indicates break enrichment on the predicted leading strand, while randomly fragmented DNA shows no bias. §, data from Smith and Whitehouse (2012), shown for comparison.
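The RFD index defined in (C) can be computed per genomic bin from strand-specific break counts. The following is a minimal sketch, not the GLOE-Pipe implementation: the function names, the `(position, strand)` input layout, and the 1-kb bin size are assumptions chosen for illustration.

```python
def rfd(fwd, rev):
    """RFD = (REV - FWD) / (REV + FWD); None where no breaks were counted."""
    total = fwd + rev
    if total == 0:
        return None
    return (rev - fwd) / total


def binned_rfd(breaks, bin_size=1000):
    """Compute RFD per bin.

    breaks: iterable of (position, strand), with strand "+" for FWD (Watson)
    and "-" for REV (Crick). Returns {bin_index: RFD} for bins with signal.
    """
    counts = {}
    for pos, strand in breaks:
        b = pos // bin_size
        fwd, rev = counts.get(b, (0, 0))
        if strand == "+":
            fwd += 1
        elif strand == "-":
            rev += 1
        else:
            raise ValueError(f"unknown strand: {strand!r}")
        counts[b] = (fwd, rev)
    return {b: rfd(f, r) for b, (f, r) in sorted(counts.items())}


# Toy example with made-up break positions
profile = binned_rfd(
    [(10, "+"), (20, "-"), (30, "-"), (40, "-"), (1500, "+"), (1600, "+")]
)
```

The index ranges from -1 (all breaks on FWD) to +1 (all breaks on REV), so a sign change along a chromosome marks a switch in the strand that carries most of the break signal.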

Genome-wide Distribution of SSBs Relative to Transcription Units in Human Cells

To expand the versatility of GLOE-Seq, we optimized the protocol for mammalian cells by performing all steps up to and including adaptor ligation on agarose-embedded nuclei (Figure S6A). This approach minimizes direct handling, and therefore potential breakage, of the very large genomic DNA of higher eukaryotes during extraction. Comparison of break distribution along coding regions using GLOE-Seq with published DSB-specific end sequencing (END-seq) data (Tubbs et al., 2018) from ligase-proficient HCT116 cells (Figure 5) showed that, unlike DSBs, reported to accumulate around the transcription start sites (TSSs) of transcribed genes, SSBs were underrepresented at this location, also in a transcription-dependent manner. Conversely, both SSBs and DSBs locally peaked around the transcription termination sites (TTSs) of transcribed genes. However, the overall enrichment along transcribed genes that is observable for DSBs was much less pronounced for GLOE-Seq signals. Considering that the latter arise from both SSBs and DSBs, the overall depletion of breaks around TSS illustrates the dominance of SSBs compared with DSBs even at sites where DSBs are enriched relative to other regions. At the same time, our data clearly show that the pattern of SSBs significantly differs from that of DSBs along a transcribed gene.

Figure 5.

Effects of Transcription on the Distribution of SSBs and DSBs in Human Cells

Deviations of break signals from the genome-wide average reveal differences between transcribed and non-transcribed genes. Top: GLOE-Seq signals (representing SSBs and DSBs). Bottom: END-seq signals from Tubbs et al. (2018) (representing only DSBs). Break levels for TSSs, transcription termination sites (TTSs), and 5′ and 3′ exon junctions represent averages of 1-bp-stepped 1-kbp-long sliding windows. Damage levels for 5′ and 3′ UTRs, exons, and introns represent averages of 1-kbp-long non-overlapping bins after concatenation of values (STAR Methods).
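The two averaging schemes named in this legend, 1-bp-stepped 1-kbp sliding windows and 1-kbp non-overlapping bins, can be sketched with a prefix sum. The function names and the per-base `signal` list are assumptions for illustration; the procedure actually used is described in the STAR Methods.

```python
from itertools import accumulate


def sliding_window_means(signal, window=1000, step=1):
    """Mean of `signal` over windows of length `window`, advanced by `step`.

    signal: per-base break signal as a list of numbers (hypothetical layout).
    Only complete windows are reported.
    """
    # prefix[i] holds the sum of signal[:i], so each window sum is one subtraction.
    prefix = [0] + list(accumulate(signal))
    return [
        (prefix[start + window] - prefix[start]) / window
        for start in range(0, len(signal) - window + 1, step)
    ]


def binned_means(signal, bin_size=1000):
    """Mean of `signal` over non-overlapping bins; a trailing partial bin is dropped."""
    return sliding_window_means(signal, window=bin_size, step=bin_size)


# Toy example with a short made-up signal and a 3-bp window
smoothed = sliding_window_means([1, 2, 3, 4, 5, 6], window=3, step=1)
binned = binned_means([1, 2, 3, 4, 5, 6], bin_size=3)
```

Sliding windows keep single-base positional resolution around point features such as TSSs, whereas non-overlapping bins give one value per interval, which suits features of variable length such as exons and introns.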

Analysis of DNA Replication Patterns in Human Cells

To further test the efficiency of GLOE-Seq in the context of the human genome, we analyzed the distribution of human Okazaki fragments. In mammalian cells, such an analysis has been achieved by Okazaki fragment sequencing (OK-Seq) (Petryk et al., 2016), which uses large amounts of input material (2–3 × 10⁸ cells) and multi-step enrichment of relevant DNA fragments that involves labeling and affinity purification of nascent DNA in combination with size fractionation. We followed a strategy of inactivating Okazaki fragment ligation analogous to the yeast system by small interfering RNA (siRNA)-mediated depletion of ligase 1 and deletion of the nuclear form of ligase 3 in HCT116 cells (Oh et al., 2014; Figures 6A and S6B). GLOE-Seq analysis revealed a pattern similar to published OK-Seq profiles, although with lower efficiency (Figure 6B). The difference may well result from activation of the DNA damage response in ligase-deficient cells, which would suppress late-origin firing and eventually lead to triggering of dormant origins (Alver et al., 2014, Karnani and Dutta, 2011). Indeed, correlation of replication timing with RFD plots indicated that ligase inactivation predominantly affected the efficiency of late-firing origins. Moreover, accumulation of cells in G2 phase (Figure 6C) and phosphorylation of the checkpoint kinase CHK1 (Figure 6A) confirmed activation of ataxia telangiectasia and Rad3-related protein (ATR)-dependent damage signaling. In contrast, inactivation of ligase 1 or ligase 3 separately caused at most marginal checkpoint activation and no significant disturbance of the cell cycle. We therefore wanted to find out whether Okazaki fragments would still be detectable under those milder conditions. As shown in Figure 6B, deletion of nuclear ligase 3 did not yield a discernible replication pattern in RFD plots, but depletion of ligase 1 resulted in a defined profile with higher amplitude peaks than upon complete ligase inactivation, resembling the published OK-Seq profiles. As observed in yeast, RFD plots from ligase-proficient HCT116 cells revealed a small but detectable signal bias toward the leading strand, suggesting that ribonucleotide incorporation significantly contributes to spontaneous SSBs in vertebrate cells as well (Figure 6D).

Figure 6.

GLOE-Seq Analysis of DNA Replication Patterns in Human Cells

(A) Western blot images showing CHK1 phosphorylated at Ser 345, total CHK1, ligase 1, and tubulin (loading control) in whole-cell extracts prepared from HCT116 WT and LIG3−/−:mL3 cells treated with an unspecific control (siCTRL) or a ligase 1-specific siRNA (siLIG1). Treatment with 25 J/m² UV irradiation served as a control for checkpoint activation.

(B) RFD plots of GLOE-Seq data from HCT116 cells reproduce replication patterns under conditions of DNA ligase inactivation (HCT116 WT and LIG3−/−:mL3 cells treated with siCTRL or siLIG1, as in A). Data represent averages of two independent experiments. Replication timing (S50, dashed lines) was modeled in HCT116 WT using genome-wide DNase I hypersensitivity data from the ENCODE project (Data S1). Arrowheads indicate early and strong replication initiation zones. §, OK-Seq-derived RFD data and S50 profiles (dashed lines) from Petryk et al. (2016), shown for comparison.

(C) Flow cytometry analysis of nuclei prepared for GLOE-Seq from the same set of cells as in (A).

(D) Opposite strand biases in the break patterns of HCT116 ligase-competent (WT) versus ligase-deficient (LIG3−/−:mL3 + siLIG1) cells, illustrated by RFD plots.

Importantly, these data were produced with 300–400 times fewer cells than the number required for OK-Seq (~700,000 versus 2–3 × 10⁸ cells) (Petryk et al., 2016) and without any size selection. Except for termination zones that show dispersed signals, progressive downsampling of a full dataset from ligase 1-inactivated cells (~310 million reads) revealed marginal loss of resolution in RFD profiles down to ~50 million reads (Figure S6C), showing that GLOE-Seq is a highly sensitive technique. In combination with ligase 1 depletion, it is a practicable and potentially less cumbersome alternative to OK-Seq.
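Downsampling of this kind can be approximated by drawing a random subset of reads at each target depth and recomputing the profile. The sketch below shows only the subsampling step; the function name, the fixed seed, and the in-memory list of reads are assumptions for illustration, and the procedure used for Figure S6C may differ.

```python
import random


def downsample(reads, target, seed=0):
    """Return `target` reads drawn at random without replacement.

    reads: sequence of read records, e.g. (position, strand) tuples
    (hypothetical layout). A fixed seed keeps the subset reproducible.
    """
    reads = list(reads)
    if target >= len(reads):
        return reads
    rng = random.Random(seed)
    return rng.sample(reads, target)


# Toy example: reduce 100 made-up reads to a series of smaller depths
all_reads = [(pos, "+" if pos % 2 else "-") for pos in range(100)]
subsets = {depth: downsample(all_reads, depth) for depth in (50, 25, 10)}
```

Each subset can then be passed through the same strand-bias calculation as the full dataset to see at which depth the profile starts to degrade.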

Discussion

Unique Features of GLOE-Seq

In this study, we present a versatile tool that complements other methods for mapping DNA damage and extends the range of structures accessible to analysis by NGS. The accompanying computational pipeline, GLOE-Pipe, allows straightforward analysis of the data associated with this method. Unlike common procedures for analysis of DSBs, such as breaks labeling, enrichment on streptavidin and next generation sequencing (BLESS) or breaks labeling in situ and sequencing (BLISS) (Crosetto et al., 2013, Yan et al., 2017), END-Seq (Canela et al., 2016), or DSBCapture (Lensing et al., 2016), GLOE-Seq allows the detection of 3′ ends when applied to DSBs. It is flexible enough for application to base lesions and incorporated ribonucleotides. In this mode, it resembles established methods such as EndoSeq (Ding et al., 2015, Reijns et al., 2015) or Click-Code-Seq (Wu et al., 2018) but is more general than tools designed for specific lesions, such as ribonucleotides (Clausen et al., 2015) or abasic sites (Liu et al., 2019, Poetsch et al., 2018). Its most important and distinguishing characteristic, however, is its applicability to pre-existing SSBs. This feature has not only enabled us to map Okazaki fragments without prior size selection but it has also given insight into the overall distribution of spontaneous SSBs in the budding yeast and the mammalian genome, revealing their non-random nature.

Earlier reports of potentially SSB-selective analyses (Baranello et al., 2014, Leduc et al., 2011) have furnished no validation regarding resolution, sensitivity, or possible sequence bias. Although a strategy called damaged DNA immunoprecipitation (dDIP), involving end capture by TUNEL (terminal deoxynucleotidyl transferase-mediated biotin-deoxyuridine triphosphate labeling), could, in principle, provide nucleotide resolution for mapping of nicks and DSBs (Leduc et al., 2011), this approach has not been developed to the sequencing stage and cannot distinguish between SSBs and DSBs. An alternative method called single-strand break sequencing (SSB-Seq) (Baranello et al., 2014), based on labeling of SSBs via nick translation followed by immunoprecipitation, has generated low-resolution data for distribution of breaks in etoposide-treated cells but has not been able to yield strand-specific information.

Very recently, a promising method named SSiNGLe (SSB mapping at Nucleotide Genome Level) has been reported (Cao et al., 2019). It resembles GLOE-Seq in its use of 3′-OH ends to map SSBs but employs poly-dA tailing by terminal transferase for capture. Because of this feature, the procedure is particularly suited for single-molecule sequencing without amplification on a Helicos platform, which uses oligo-dT for immobilization of the templates. At the same time, the repetitive signals generated from the tails complicate use of the more common Illumina system. Moreover, fragmentation is achieved by means of micrococcal nuclease digestion at an early stage, producing unligatable 3′-PO4 termini. Unlike SSiNGLe, GLOE-Seq can map such structures by pre-treatment with phosphatase. SSiNGLe has been used to analyze the distribution of SSBs in response to a panel of anti-cancer drugs to follow DNA breakage in the early stages of apoptosis, and it has revealed correlations of the “SSB breakome” with transcribed and regulatory regions, evolutionarily less conserved regions, and topoisomerase IIA cleavage clusters (Cao et al., 2019). GLOE-Seq, as a turnkey combination of molecular and computational pipelines, is an alternative approach we validated as particularly useful for straightforward and effective analysis of genome-wide replication patterns.

GLOE-Seq Provides Insight into Replication-Associated SSBs in Yeast and Human Cells

Despite their abundance in the genome, SSBs have, until recently, eluded systematic genome-wide analysis because of a lack of high-resolution methods for their analysis. The GLOE-Seq strategy of selectively capturing 3′-OH ends in thermally denatured but otherwise intact DNA fills this need. GLOE-Seq analysis of undamaged DNA from budding yeast and human cells has already provided us a glimpse into the genome-wide distribution of spontaneous SSBs as ubiquitous but understudied structures. Their non-random patterns highlight the importance of investigating the underlying mechanisms responsible for their formation and processing.

The most striking feature emerging from our analysis of unperturbed yeast and human cells is a clear bias of spontaneous SSBs toward the leading strand following the activity profile of polymerase ε (Figures 4C and 6D). Considering the propensity of this enzyme to incorporate ribonucleotides, the observed RFD profiles suggest that repair intermediates at ribonucleotides are a major source of spontaneous SSBs in yeast and human cells. Comparison of the approximate Okazaki fragment length (~200 nt) with the frequency of ribonucleotide incorporation by polymerase ε (~1 in 1,250 nt) (Nick McElhinny et al., 2010) implies that SSBs should emerge on the lagging strand with a more than 6-fold higher frequency than on the leading strand. In light of the leading strand bias in RFD plots from unperturbed cells, it follows that the nicks derived from processing of ribonucleotides must be significantly more persistent in the genome than unligated Okazaki fragments.
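As a rough check of the arithmetic behind this estimate, the following sketch assumes one nick per Okazaki fragment junction on the lagging strand and one nick per incorporated ribonucleotide on the leading strand; this one-to-one correspondence is a simplification introduced here for illustration, and the symbols f are not used elsewhere in the text.

```latex
\frac{f_{\text{lagging}}}{f_{\text{leading}}}
  \approx \frac{1/200\ \text{nt}^{-1}}{1/1250\ \text{nt}^{-1}}
  = \frac{1250}{200}
  = 6.25
```

The ratio of 6.25 agrees with the "more than 6-fold" figure quoted above.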

Further insight into Okazaki fragment maturation in human cells comes from our GLOE-Seq analysis under conditions of DNA ligase inactivation. The strong lagging-strand bias in RFD plots upon ligase 1 depletion (Figure 6B) clearly confirms this enzyme as the principal replicative DNA ligase. It also shows that ligase 3 acts in an important back-up function in the absence of ligase 1, maintaining cell proliferation and preventing major activation of the damage response despite significant accumulation of unligated Okazaki fragments. An alternative Okazaki fragment ligation pathway mediated by ligase 3 has been reported in chicken DT40 cells (Arakawa et al., 2012). From our results, we conclude that, in human cells, this pathway proceeds with significantly slower kinetics than the standard ligase 1-mediated reaction. However, the absence of a clear leading-strand bias in the Lig3−/−:mL3 cell line suggests a non-negligible contribution of ligase 3 to Okazaki fragment maturation even when ligase 1 is functional.

Potential Applications of GLOE-Seq

For the future, we foresee numerous applications of GLOE-Seq to probe the effects of medically relevant factors or treatments associated with SSBs, such as PARP1, topoisomerases, proteins involved in homologous recombination or protein-DNA crosslink repair, as well as pertinent inhibitors of such factors; e.g., olaparib, camptothecin, or mirin (Abbotts and Wilson, 2017, Caldecott, 2014). The notion that etoposide-mediated inhibition of topoisomerase II predominantly results in SSBs rather than DSBs (Muslimović et al., 2009) illustrates the necessity to differentiate between these two important lesions and their spatial distribution when assessing the actions of commonly used therapeutic agents. Such differentiation is made possible by combining GLOE-Seq with END-seq or BLISS analysis within the same experiment. In the context of replication, GLOE-Seq should be applicable to mapping not only Okazaki fragments but also the distribution of postreplicative daughter-strand gaps arising from replication of damaged templates (Wong et al., 2020). Pre-treatment with various enzymes should give insight into additional lesions in the genome. For example, signatures of covalent topoisomerase adducts could be traced by tyrosyl-DNA phosphodiesterases (Pommier et al., 2014) and 3′-PO4 termini revealed by phosphatase treatment. Absolute quantification of break signals can be accomplished by pre-digestion with rare-cutting restriction endonucleases that would generate a defined set of breaks as internal standards (Figures S2C and S2D). Last but not least, the nucleotide precision of GLOE-Seq may be employed for evaluating the selectivity of emerging techniques in genome engineering not involving DSBs, such as Cas9 nickase-dependent base-editing systems (Eid et al., 2018).

Limitations

One of the limitations inherent in any method for mapping SSBs appears to be an inevitable background of spontaneous nicks in genomic DNA. In our validation experiments, this has obscured sequence-specific signals present at a frequency lower than 0.1% (Figure 1G). Although a higher sequencing depth would likely improve the signal-to-noise ratio, detection of rare events by GLOE-Seq seems problematic. At the same time, however, RFD analysis has revealed that the distribution of spontaneous breaks detected in DNA from unperturbed yeast or human cells is, in fact, not random but follows the activity profile of the replicative polymerase ε (Figures 4C and 6D). Thus, the “background” signal present in our samples does not actually represent noise derived from spontaneous hydrolysis or shearing during preparation but, rather, a physiological feature of native DNA isolated from a natural source that is detectable by our protocol. Taking this pattern into account by careful comparisons of experimental conditions will be important for correct interpretation of SSB signals.

Because they represent a specialized form of 3′ ends in the genome, we used GLOE-Seq for mapping the 3′ termini of Okazaki fragments. In budding yeast, previously published methods have accomplished this by directly isolating Okazaki fragments by means of size fractionation followed by NGS (Smith and Whitehouse, 2012). Our data accurately reproduce the published patterns without employing size selection. In mammalian cells, sequencing of Okazaki fragments by OK-Seq has required not only size selection but also enrichment of replicating DNA via 5-Ethynyl-2´-deoxyuridine (EdU) pulse labeling (Petryk et al., 2016). In our procedure, we avoided these elaborate steps and were able to generate replication profiles from a much lower number of cells by abolishing replicative ligase activity, analogous to the strategy applied in yeast (Figure 6C). Upon inactivation of both replicative ligases, this approach led to underrepresentation of late-firing origins accompanied by activation of the DNA damage response. Depletion of ligase 1 only by means of siRNA largely resolved this problem, resulting in highly defined replication profiles. Thus, although the OK-Seq protocol might still be preferable in cases where ligase cannot be inhibited or an absolutely undisturbed pattern of late-replicating regions is required, GLOE-Seq in combination with ligase 1 depletion is an attractive and practicable alternative to the potentially more cumbersome OK-Seq method. In conclusion, the strength of GLOE-Seq lies in analysis of structures not covered by other NGS methods; i.e., physiological DNA SSBs, which are some of the most abundant lesions and key metabolic intermediates in the genome.

STAR★Methods

Key Resources Table

REAGENT or RESOURCE SOURCE IDENTIFIER
Antibodies

Mouse monoclonal anti-yeast Rad53 Susan Gasser; Hauer et al., 2017 N/A
Rat monoclonal anti-tubulin, clone YL1/2 Sigma-Aldrich Cat# 92092402-1VL, RRID: CVCL_J781
Rabbit polyclonal anti-DNA Ligase 1 Elabscience Cat# E-AB-31210
Mouse monoclonal anti-Chk1, clone 2G1D5 Cell Signaling Technology Cat# 2360, RRID: AB_2080320
Rabbit polyclonal anti-phospho-Chk1 (Ser345) Cell Signaling Technology Cat# 2341, RRID: AB_330023
Goat anti-mouse immunoglobulins, HRP Dako Cat# P0447, RRID: AB_2617137
Goat anti-rabbit immunoglobulins, HRP Dako Cat# P0448, RRID: AB_2617138
Goat anti-rat immunoglobulins, HRP Dako Cat# P0450, RRID: AB_2630354
Goat anti-rat IgG Secondary Antibody, IRDye® 800CW LI-COR Cat# 926-32219, RRID: AB_1850025

Chemicals, Peptides, and Recombinant Proteins

Zymolyase 20T AMS Biotechnology Cat# 120491-1
T4 DNA ligase, 20,000,000 U/mL New England Biolabs Cat# B0202
Q5® High-Fidelity DNA polymerase New England Biolabs Cat# M0491
RNase A from bovine pancreas Sigma-Aldrich Cat# 10109169001
Proteinase K Roche Cat# 3115801001
Phenylmethanesulfonyl fluoride Sigma-Aldrich Cat# P7626
β-Agarase I, 1000 U/mL New England Biolabs Cat# M0392
Alpha-Factor peptide (WHWLQLKPGQPMY), > 95% ProteoGenix SAS Cat# GM-PT001
BsrDI New England Biolabs Cat# R0574
Nb.BsrDI New England Biolabs Cat# R0648
NotI New England Biolabs Cat# R0189
Nb.BtsI New England Biolabs Cat# R0707
Antarctic Phosphatase New England Biolabs Cat# M0289
T4 Endonuclease V (T4 PDG) New England Biolabs Cat# M0308
APE1 New England Biolabs Cat# M0282
hAAG New England Biolabs Cat# M0313
Lipofectamine RNAiMAX Transfection Reagent Thermo Fisher Scientific Cat# 13778150
Benzonase Sigma-Aldrich Cat# E1014-25KU
PhosSTOP Phosphatase Inhibitor Sigma-Aldrich Cat# 04906837001
Opti-MEM® Reduced Serum Medium Thermo Fisher Scientific Cat# 11058021

Critical Commercial Assays

AMPure XP beads Beckman Coulter Cat# A63881
NEBNext® Ultra II DNA Library Prep Kit for Illumina® New England Biolabs Cat# E7645
Phusion Flash high-fidelity PCR master mix Thermo Fisher Scientific Cat# F-548
Dynabeads MyOne Streptavidin C1 Life Technologies Cat# 65001
High Sensitivity D1000 ScreenTape Agilent Technologies Cat# 5067-5584
High Sensitivity D1000 ScreenTape reagents Agilent Technologies Cat# 5067-5585
RNA ScreenTape Agilent Technologies Cat# 5067-5576
RNA ScreenTape sample buffer Agilent Technologies Cat# 5067-5577
RNA ScreenTape ladder Agilent Technologies Cat# 5067-5578
Qubit dsDNA HS Assay Kit Invitrogen Cat# Q32854
Agilent Bioanalyzer High Sensitivity DNA Kit Agilent Technologies Cat# 5067-4646
NextSeq 550 System High-Output Kit Illumina Cat# 20024906
NextSeq 550 System Mid-Output Kit Illumina Cat# 20024904
NuSieve GTG Agarose Lonza Cat# 859081
Yeast Genomic DNA Extraction Kit QIAGEN Cat# 10243
Bio-Rad Protein Assay Dye Reagent Bio-Rad Cat# 500-0006
NuPAGE LDS sample buffer Thermo Fisher Scientific Cat# NP0007
NuPAGE 4-12% Bis-Tris Protein Gels Thermo Fisher Scientific Cat# NP0322

Deposited Data

Raw and analyzed data This paper GEO: GSE134225
Human reference genome UCSC GRCh37/hg19 UCSC Genome Browser ftp://hgdownload.soe.ucsc.edu/goldenPath/hg19/chromosomes/
Yeast reference genome UCSC sacCer3 UCSC Genome Browser ftp://hgdownload.soe.ucsc.edu/goldenPath/sacCer3/chromosomes/
3′ end maps of Nb.BtsI-digested DNA, WT and pol1-L868M, Fastq (Figure 2A, reanalyzed with GLOE-Pipe) Reijns et al., 2015 GEO: GSM1573437, GSM1573438
Motifs for Abf1, Reb1, Rap1 in yeast, GFF (Figure 4B) MacIsaac et al., 2006 https://downloads.yeastgenome.org/published_datasets/MacIsaac_2006_PMID_16522208/MacIsaac_high_confidence_with_sequence.gff
Nucleosome positions in yeast, BED (Figure 4B) Whitehouse et al., 2007 https://static-content.springer.com/esm/art%3A10.1038%2Fnature06391/MediaObjects/41586_2007_BFnature06391_MOESM319_ESM.xls
Okazaki fragment maps in Cdc9-depleted yeast, Fastq, rep 1 & rep2 (Figure 4C, reanalyzed with GLOE-Pipe) Smith and Whitehouse, 2012 GEO: GSM835650, GSM835651
END-seq data in HCT116, Fastq (Figure 5, reanalyzed with GLOE-Pipe) Tubbs et al., 2018 GEO: GSM3227952
DNase I hypersensitivity data in HCT116, BigWig, rep1 & rep2 (Figure 6B) ENCODE Project Consortium, 2012 GEO: GSM736493, GSM736600
OK-Seq and S50 data in HeLa and GM06990, rep1 & rep2 (BedGraph, received from Chunlong Chen, chunlong.chen@curie.fr) (Figure 6B) Petryk et al., 2016 GEO SRA: SRP065949 (SRX1427549, SRX1427548, SRX1424659, SRX1424656)

Experimental Models: Cell Lines

HCT116 Cancer Research UK London Research Institute Cell Services N/A
HCT116 LIG3−/−:mL3 Oh et al., 2014 N/A

Experimental Models: Organisms/Strains

S. cerevisiae: strain W303 ATCC ATCC: 201238
S. cerevisiae: strain JKM179 Lee et al., 1998 N/A
S. cerevisiae: strain DF5 ATCC ATCC: 200912
S. cerevisiae: strain DF5 apn1::KanMX apn2::KanMX This paper N/A
S. cerevisiae: strain cdc9AID Kubota et al., 2013 N/A

Oligonucleotides

Primer #3898: CTACACGACGCTCTTCCGATCTNNNNNN-NH2 (phosphorothioate bond, IDT code ; NH2: 3′-amino modification, IDT code /3AmMO/) Integrated DNA Technologies N/A
Primer #3899: PO4-AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGTTTT-Bio (PO4: 5′-phosphorylation, IDT code /5Phos/; T-Bio: 3′-biotin-dT, IDT code /3BiodT/) Integrated DNA Technologies N/A
Primer #3790: CGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT Integrated DNA Technologies N/A
Primer #3791: GACTGGAGTTCAGACGTGTGCTCTTCCGATCT Integrated DNA Technologies N/A
Primer #3792: GATCGGAAGAGCACACGTCTGAACTCCAGTC Integrated DNA Technologies N/A
Primer P5: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT Integrated DNA Technologies N/A
Primer P7: CAAGCAGAAGACGGCATACGAGAT(X)6GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT Integrated DNA Technologies N/A
siRNA targeting sequence: Silencer Select Negative Control No. 1 Thermo Fisher Scientific Cat# 4390843
siRNA targeting sequence: Ligase 1: Silencer Select s8174 Thermo Fisher Scientific Cat# 4390824

Software and Algorithms

bcl2fastq, version 2.19 Illumina https://support.illumina.com/sequencing/sequencing_software/bcl2fastq-conversion-software.html
Bpipe, version 0.9.9.3 Sadedin et al., 2012 http://docs.bpipe.org/
NGSpipe2go Institute of Molecular Biology gGmbH https://github.com/imbforge/NGSpipe2go
FastQC, version 0.11.5 Andrews, 2019 https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
Trimmomatic, version 0.36 Bolger et al., 2014 http://www.usadellab.org/cms/?page=trimmomatic
Bowtie 2, version 2.3.4 Langmead and Salzberg, 2012 http://bowtie-bio.sourceforge.net/bowtie2/index.shtml
Samtools, version 1.5 Li et al., 2009 http://samtools.sourceforge.net/
BEDTools, version 2.25.0 Quinlan and Hall, 2010 https://bedtools.readthedocs.io/en/latest/
bedGraphToBigWig, version 365 Kent et al., 2010 https://github.com/ENCODE-DCC/kentUtils
MACS2 callpeak, version 2.1.1 Zhang et al., 2008 https://github.com/taoliu/MACS
ChIPseeker package, version 1.14.1 Yu et al., 2015 https://bioconductor.org/packages/release/bioc/html/ChIPseeker.html
deepTools, version 3.1.0 Ramírez et al., 2014 https://deeptools.readthedocs.io/en/develop/
Replicon Gindin et al., 2014 https://github.com/RepliconBioinfo/Replicon [downloaded 20.12.2019]
FlowJo, version 10.6.1 FlowJo, LLC https://www.flowjo.com/
GLOE-Pipe This paper https://github.com/helle-ulrich-lab/ngs-gloepipe

Other

Detailed protocol for the preparation of GLOE-Seq libraries This paper See Supplemental Information
3D model of a custom-made mold for agarose plugs This paper See Data S1
3D model of a tool for extrusion of agarose plugs from custom-made mold This paper See Data S1

Resource Availability

Lead Contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by Helle D. Ulrich (h.ulrich@imb-mainz.de).

Materials Availability

This study did not generate new unique reagents.

Data and Code Availability

The datasets generated and analyzed during this study are available in the Gene Expression Omnibus repository, https://www.ncbi.nlm.nih.gov/geo, under accession number GEO: GSE134225. Published datasets used for this analysis and their accessibility are summarized in the Key Resources Table. GLOE-Pipe is publicly accessible, fully documented and regularly maintained by the developers at https://github.com/helle-ulrich-lab/ngs-gloepipe.

Experimental Model and Subject Details

Saccharomyces cerevisiae Strains

  • W303: MATa, ade2-1 ura3-1 his3-11,15 trp1-1 leu2-3,112 can1-100 (Thomas and Rothstein, 1989)

  • JKM179: MATa, Δho Δhml::ADE1 Δhmr::ADE1 ade1-110 leu2-3,112 lys5 trp1::hisG ura3-52 ade3::GAL10:HO (Lee et al., 1998)

  • DF5: MATa, his3-Δ200 leu2-3,2-112 lys2-801 trp1-1(am) ura3-52 (Finley et al., 1987)

  • DF5 apn1Δ apn2Δ: MATa, his3-Δ200 leu2-3,2-112 lys2-801 trp1-1(am) ura3-52 apn1::KanMX apn2::KanMX (this study)

  • cdc9AID: MATa, bar1Δ::hisG ade2-1 can1-100 his3-11,15 leu2-3,112 trp1-1 ura3-1 RAD5+ URA3::pRS306-PGAL1-10-OsTIR1 cdc9-AID::kanMX (Kubota et al., 2013)

Yeast cultures were grown in YP medium containing 2% (w/v) glucose at 30°C unless otherwise noted.

Cell Lines

  • HCT116 (Cancer Research UK – London Research Institute’s Cell Production Services)

  • HCT116 LIG3−/−:mL3 (Oh et al., 2014; received from Eric Hendrickson)

Cells were grown as monolayers in a humidified incubator at 37°C and 5% (v/v) CO2 in DMEM supplemented with 10% (v/v) fetal bovine serum, penicillin (100 U/mL), streptomycin (100 μg/mL) and glutamine (300 μg/mL).

Method Details

GLOE-Seq Library Preparation (S. cerevisiae)

Note that a detailed step-by-step protocol of the GLOE-Seq method is provided as Supplemental Information. Genomic DNA for validating the method, detecting chromosomal features, and mapping pyrimidine dimers and MMS damage was prepared using a spheroplast-based genomic DNA extraction kit (QIAGEN) according to the instruction manual. Genomic DNA for mapping Okazaki fragments and for capturing HO-induced DSBs was prepared by gentle lysis of spheroplasts, precipitation of the DNA, RNase A treatment and purification via AMPure beads (Beckman Coulter) as described in the step-by-step protocol (steps y1-28). Following additional treatment of the genomic DNA as specified for each experiment below, libraries were prepared from 2.5 μg of DNA by the core GLOE-Seq procedure (steps 29-50 of the step-by-step protocol). Briefly, this involved thermal denaturation, ligation of the proximal biotinylated adaptor, fragmentation to an average size of 200 nt, capture on streptavidin beads, second strand synthesis, end polishing and ligation of the distal adaptor. The resulting library was amplified using P5 and P7 primers of the Illumina system.

Sequencing

GLOE-Seq libraries were sequenced on an Illumina NextSeq 500 sequencer with High or Mid Output flow cells, depending on the number of libraries loaded at a time. All libraries were sequenced in single-end mode, with read lengths of 75, 84, 92, 150 or 160 bases (depending on the sequencing kit and the level of multiplexing per flow cell) plus 7 bases for single-indexed libraries or 8+8 bases for dual-indexed libraries. Sequencing depth varied with the type of library; for yeast GLOE-Seq libraries we aimed for 3 million reads per sample, while for mammalian libraries a single library was loaded per NextSeq 500 High Output flow cell. However, downsampling the reads to 50 million proved to yield sufficient depth for Okazaki fragment analysis via GLOE-Seq. Upon completion of the run, raw sequencing reads of pooled libraries were demultiplexed based on their index sequences by bcl2fastq (version 2.19, Illumina). The resulting Fastq files were used as input files for GLOE-Pipe.

Sequencing Data Analysis

To automate and standardize the processing and analysis of GLOE-Seq data, we developed GLOE-Pipe, a bioinformatics toolkit for the identification of strand breaks at nucleotide resolution in genomic DNA. GLOE-Pipe uses Bpipe (Sadedin et al., 2012) (version 0.9.9.3) as workflow manager and domain-specific language (DSL). It represents a branch of NGSpipe2go, containing a set of modules that together process raw sequencing data and generate output files for detecting, annotating and visualizing breaks.

The inputs for GLOE-Pipe are Fastq files containing raw NGS reads and one additional file describing the samples, groups and comparisons to be performed. The pipeline assesses the quality of raw reads using FastQC (Andrews, 2019) (version 0.11.5). Subsequently, raw reads are filtered and trimmed based on quality and adaptor inclusion using Trimmomatic (Bolger et al., 2014) (version 0.36; parameters: ILLUMINACLIP:TruSeq3-SE.fa:2:30:10, SLIDINGWINDOW:4:15, MINLEN:36). Trimmed and filtered reads are mapped to the appropriate genome (sacCer3 or hg19 for yeast or human samples, respectively) using Bowtie2 (Langmead and Salzberg, 2012) (version 2.3.4) with default parameters, and reads falling in repetitive regions are filtered out using Samtools (Li et al., 2009) (version 1.5; parameter: -q 30).
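The trimming, mapping and filtering steps described above can be sketched as a list of command lines. This is a minimal illustration rather than the GLOE-Pipe implementation: the function name preprocessing_commands, the file-naming scheme, and the use of a `trimmomatic` wrapper executable (as opposed to calling the jar through Java) are assumptions made here; only the tool parameters are taken from the text.

```python
from typing import List


def preprocessing_commands(fastq: str, index: str, prefix: str) -> List[List[str]]:
    """Build argument lists for trimming, mapping and MAPQ filtering.

    fastq  : path to the raw single-end Fastq file
    index  : Bowtie 2 index basename (e.g. for sacCer3 or hg19)
    prefix : hypothetical prefix used here to name intermediate files
    """
    trimmed = f"{prefix}.trimmed.fastq.gz"
    sam = f"{prefix}.sam"
    bam = f"{prefix}.q30.bam"
    return [
        # Trimmomatic in single-end mode with the parameters quoted in the text.
        ["trimmomatic", "SE", fastq, trimmed,
         "ILLUMINACLIP:TruSeq3-SE.fa:2:30:10",
         "SLIDINGWINDOW:4:15",
         "MINLEN:36"],
        # Bowtie 2 with default alignment parameters.
        ["bowtie2", "-x", index, "-U", trimmed, "-S", sam],
        # Keep only alignments with mapping quality of at least 30.
        ["samtools", "view", "-b", "-q", "30", "-o", bam, sam],
    ]


if __name__ == "__main__":
    for cmd in preprocessing_commands("sample.fastq.gz", "sacCer3", "sample"):
        print(" ".join(cmd))
```

Each argument list could be passed to subprocess.run; returning the commands instead of executing them keeps the sketch independent of any particular installation.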

BAM files are converted into BED files using bamToBed [BEDTools (Quinlan and Hall, 2010), version 2.25.0] and additional custom code. Given that the sequences produced by GLOE-Seq are the reverse-complement of the original captured DNA fragment, GLOE-Pipe assigns the 5′ end of a read to the strand opposite to that on which it mapped. Two output modes are available. In the direct mode, each read represents one unit of signal that is positioned exactly at the 3′-terminal nucleotide of the original captured fragment. This mode would normally be used to visualize strand breaks. The indirect mode is essentially identical, but it positions the signal one nucleotide immediately downstream of the 3′ end of the original captured fragment (equivalently, one nucleotide upstream of the 5′ end of the read), which corresponds to the nucleotide immediately 3′ of the break. The indirect mode would therefore normally be used to visualize the positions of modified or damaged bases (under the premise that the enzymatic treatment used to detect this modification or lesion generates a nick 5′ to the affected nucleotide).
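The strand flip and the one-nucleotide offset between the two modes can be illustrated with a short sketch. It assumes BED-style coordinates (0-based, end-exclusive) for the mapped read; the function name break_signal and its arguments are hypothetical and do not reproduce the custom code used in GLOE-Pipe.

```python
from typing import Tuple


def break_signal(start: int, end: int, read_strand: str,
                 mode: str = "direct") -> Tuple[int, str]:
    """Return (0-based position, strand) of the signal for one mapped read.

    The read is the reverse complement of the captured fragment, so the
    signal is reported on the strand opposite to the read.
    """
    if read_strand not in ("+", "-"):
        raise ValueError("read_strand must be '+' or '-'")
    if mode not in ("direct", "indirect"):
        raise ValueError("mode must be 'direct' or 'indirect'")

    if read_strand == "+":
        # The read's 5' end is at `start`; the captured fragment lies on the
        # minus strand and its 3'-terminal nucleotide occupies the same base.
        # Downstream on the minus strand means a lower coordinate.
        pos, strand, downstream = start, "-", -1
    else:
        # The read's 5' end is at `end - 1`; the captured fragment lies on
        # the plus strand, where downstream means a higher coordinate.
        pos, strand, downstream = end - 1, "+", 1

    if mode == "indirect":
        # Shift to the nucleotide immediately 3' of the break.
        pos += downstream
    return pos, strand
```

For a read mapped to the plus strand at 100-175, this sketch reports the direct signal at position 100 and the indirect signal at position 99, both on the minus strand.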

BED files are converted into BigWig format and split into plus (FWD) and minus (REV) strands to be visualized in a genome browser using sortBed, genomeCoverageBed [BEDTools (Quinlan and Hall, 2010), version 2.25.0] and bedGraphToBigWig (Kent et al., 2010) (version 365).

Validation of the GLOE-Seq Method

Purified genomic DNA from yeast strain W303 was treated with 1 U/μg DNA of the relevant restriction or nicking enzymes, BsrDI, Nb.BsrDI or NotI (New England Biolabs) for 90 min at 65°C (BsrDI and Nb.BsrDI) or 37°C (NotI), dephosphorylated with 2 U/μg Antarctic Phosphatase (New England Biolabs) for 30 min at 37°C and purified using AMPure beads (Beckman Coulter). The purified DNA was quantified (Qubit, Life Technologies), and 2.5 μg were used for GLOE-Seq library preparation (steps 29-50) and sequencing. Raw GLOE-Seq data from these experiments were processed with GLOE-Pipe in the indirect mode for downstream analysis, but visualized in the direct mode (Figure 1B). Significant peaks were assigned by comparing the BED file for each strand to an undigested sample, when possible, using MACS2 callpeak (Zhang et al., 2008) (version 2.1.1; parameters: --extsize 1, --nomodel, --shift 0, --keep-dup). Custom code based on ChIPseeker package (Yu et al., 2015) (version 1.14.1) was used to check the overlap between the detected and expected breaks.

Comparison of GLOE-Seq with EndoSeq

In order to benchmark GLOE-Seq against a published procedure, the GLOE-Seq protocol (steps 29-50) was applied to 2.5 μg of Nb.BtsI-digested yeast genomic DNA. These data were analyzed in parallel with the corresponding published EndoSeq datasets (Reijns et al., 2015). In order to make the two datasets directly comparable, the EndoSeq datasets were downsampled to 5 million reads and the reads were trimmed to 75 nt. For direct comparison of the two protocols, Nb.BsrDI sites were mapped in genomic yeast DNA by either GLOE-Seq (steps 29-50) as described above or EndoSeq according to the published protocol (Ding et al., 2015). Both samples were sequenced on the same platform (Illumina NextSeq). In order to test the suitability of EndoSeq for capturing pre-existing nicks, genomic DNA was first treated with Nb.BsrDI and subsequently subjected to the EndoSeq protocol. Raw data from these experiments were processed with GLOE-Pipe in the indirect mode. Significant breaks for each protocol were called with MACS2 callpeak (Zhang et al., 2008) (version 2.1.1) without specifying the --control flag. Expected breaks were grouped into three categories (unique, common, neither) using BEDTools (Quinlan and Hall, 2010) intersect (version 2.25.0). The location of each break was combined with its normalized counts calculated by GLOE-Pipe and custom code was used to generate scatterplots showing the comparison between the two protocols.
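The grouping of expected breaks into the three categories can be sketched as follows. The analysis described above used BEDTools intersect; this illustration substitutes exact-coordinate set membership, which is a simplification, and the function name classify_sites and the (chromosome, position, strand) tuple representation are assumptions made here.

```python
from typing import Dict, Iterable, Set, Tuple

Site = Tuple[str, int, str]  # (chromosome, 0-based position, strand)


def classify_sites(expected: Iterable[Site],
                   called_a: Set[Site],
                   called_b: Set[Site]) -> Dict[Site, str]:
    """Label each expected site as 'common', 'unique' or 'neither'.

    common  : called by both protocols
    unique  : called by exactly one protocol
    neither : called by neither protocol
    """
    labels: Dict[Site, str] = {}
    for site in expected:
        in_a = site in called_a
        in_b = site in called_b
        if in_a and in_b:
            labels[site] = "common"
        elif in_a or in_b:
            labels[site] = "unique"
        else:
            labels[site] = "neither"
    return labels
```

In practice, a tolerance of a few nucleotides around each expected site might be preferable to exact matching, depending on how peaks are reported.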

Analysis of Spontaneous SSBs in Yeast

Genomic DNA prepared from exponential W303 cultures was subjected to GLOE-Seq (steps 29-50). As a control sample representing a random distribution of single-stranded breaks, an aliquot of the genomic DNA was first subjected to fragmentation by sonication to an average size of 300 bp before applying the GLOE-Seq protocol (steps 29-50). Raw GLOE-Seq data from these experiments were processed with GLOE-Pipe in the indirect mode. Reads were assigned with high stringency by disallowing any mismatches. For relevant samples, tracks based on the replication fork directionality (RFD) ratio [RFD = (REV – FWD)/(REV + FWD)] were generated using bigwigCompare [deepTools (Ramírez et al., 2014), version 3.1.0].
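The RFD ratio defined above can be computed per bin as in the following sketch. The tracks in this study were generated with bigwigCompare; the function below only illustrates the formula, and the choice to return None for bins without any signal is an assumption made here.

```python
from typing import List, Optional, Sequence


def rfd(fwd: Sequence[float], rev: Sequence[float]) -> List[Optional[float]]:
    """Compute RFD = (REV - FWD) / (REV + FWD) for matching bins.

    fwd, rev : signal per bin on the plus (FWD) and minus (REV) strands
    Bins with no signal on either strand yield None.
    """
    if len(fwd) != len(rev):
        raise ValueError("fwd and rev must have the same number of bins")
    ratios: List[Optional[float]] = []
    for f, r in zip(fwd, rev):
        total = f + r
        ratios.append((r - f) / total if total > 0 else None)
    return ratios
```

Values range from -1 (all signal on the FWD strand) to +1 (all signal on the REV strand).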

Analysis of an HO-Induced DSB in Yeast

JKM179 cells (Lee et al., 1998) harboring a galactose-inducible HO endonuclease construct were cultured in YP medium containing 2% (w/v) raffinose, and DSB formation at the mating-type locus was induced by growth in YP medium containing 2% (w/v) galactose for 1 h, as described previously (Lee et al., 1998). Control samples were prepared from cells grown in raffinose medium. Genomic DNA was prepared (steps y1-28), and GLOE-Seq was performed according to the step-by-step protocol (steps 29-50). Raw GLOE-Seq data from these experiments were processed with GLOE-Pipe in the indirect mode for downstream analysis, and in the direct mode for visualization (Figure 3C).

Mapping of Pyrimidine Dimers in Yeast

Exponentially growing DF5 cells were exposed to UV irradiation at 20 and 120 J/m2 (254 nm, Stratalinker, Stratagene) or left unirradiated. Total genomic DNA was prepared, and 2.5 μg of purified DNA was either treated with 5 U/μg T4 Endonuclease V (New England Biolabs, 30 min at 37°C), followed by 2 U/μg APE1 (New England Biolabs, 30 min at 37°C), or left untreated, before application of the GLOE-Seq protocol (steps 29-50). Raw GLOE-Seq data from these experiments were processed with GLOE-Pipe in the indirect mode. Custom code was used to extract dinucleotides corresponding to the reported damaged bases, in the downstream direction, and the percentage of each dinucleotide sequence was calculated.
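The dinucleotide tabulation can be sketched as below. This is not the custom code used in the study: the function name dinucleotide_percentages is hypothetical, the sequence is assumed to be given 5′ to 3′ on the strand carrying the signal (so minus-strand signals would first require reverse-complementing), and dinucleotides containing N are skipped by choice.

```python
from collections import Counter
from typing import Dict, Iterable


def dinucleotide_percentages(sequence: str,
                             positions: Iterable[int]) -> Dict[str, float]:
    """Percentage of each dinucleotide starting at the reported positions.

    sequence  : reference sequence of the strand carrying the signal
    positions : 0-based indices of the reported damaged bases; each
                dinucleotide is that base plus the next downstream base
    """
    counts: Counter = Counter()
    for pos in positions:
        dinuc = sequence[pos:pos + 2].upper()
        # Skip positions at the sequence end and ambiguous bases.
        if len(dinuc) == 2 and "N" not in dinuc:
            counts[dinuc] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {dinuc: 100.0 * n / total for dinuc, n in counts.items()}
```

For UV-induced lesions, an enrichment of pyrimidine-pyrimidine dinucleotides such as TT would be the expected outcome of such a tabulation.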

Mapping of MMS Damage and Repair in Yeast

Yeast strains DF5 and an isogenic apn1Δ apn2Δ mutant were arrested in G1 phase with 5 μg/mL alpha-factor for 90 min, followed by a 30 min incubation with 0.02% MMS and release into S phase in the absence of MMS. For analysis of checkpoint activation, aliquots corresponding to 1 OD600 were collected. Cell pellets were resuspended in 1.85 M NaOH, 7.4% β-mercaptoethanol and incubated on ice for 15 min. After addition of 75 μL of 55% (w/v) trichloroacetic acid, further incubation on ice for 10 min and centrifugation at 13,800 × g for 10 min at 4°C, pellets were resuspended in 40 μL of 100 mM dithiothreitol in 1 × LDS sample buffer (Invitrogen) and incubated at 65°C for 20 min. Protein extracts were loaded onto 4%–12% Bis-Tris gradient gels (Invitrogen) and analyzed by western blotting using anti-Rad53 (Hauer et al., 2017) and anti-tubulin (YL1/2, Sigma-Aldrich) primary antibodies as well as anti-mouse IgG secondary antibody (Dako) and fluorophore-coupled secondary anti-rat antibody (IRDye® 800CW, LI-COR), respectively. Cell cycle phase was monitored by harvesting 1 mL samples, washing cells once with water and fixing in 1 mL of 70% (v/v) ethanol. Fixed cells were washed twice with 1 mL of 50 mM sodium citrate, pH 7.0, treated with 80 μg/mL RNase A at 50°C for 1 h, followed by 80 μg/mL Proteinase K at 50°C for 1 h. Samples were stained with 32 μg/mL propidium iodide, sonicated and analyzed using flow cytometry (FACSVerse, BD Biosciences). Data were analyzed with FlowJo v10 software (FlowJo, LLC).

In order to detect strand breaks as repair intermediates, genomic DNA was prepared from cells harvested at the indicated time points after release (steps y1-28), followed by application of the GLOE-Seq protocol (steps 29-50). In order to map base lesions, the extracted genomic DNA was pre-treated with 3.75 U/μg hAAG (New England Biolabs, 90 min at 37°C), followed by 2.5 U/μg APE1 (New England Biolabs, 90 min at 37°C), purification via AMPure beads and application of the GLOE-Seq protocol (steps 29-50). In both cases, raw data were processed by GLOE-Pipe in the indirect mode.

Mapping of Okazaki Fragments in Yeast

Ligation of Okazaki fragments was inhibited by auxin-induced degradation of degron-tagged Cdc9 in a cdc9AID strain as previously described (Kubota et al., 2013). Briefly, cells were arrested in G1 phase with 5 μg/mL alpha-factor for 90 min, followed by a 60 min incubation with 1 mM auxin and release into S phase for 60 min. Genomic DNA was prepared (steps y1-28), followed by dephosphorylation with Antarctic Phosphatase (2 U/μg DNA) for 30 min at 37°C, purification via AMPure beads and application of the GLOE-Seq protocol (steps 29-50). Raw GLOE-Seq data from these experiments were processed with GLOE-Pipe in the indirect mode. The distribution of breaks on each strand was compared with nucleosome locations (Whitehouse et al., 2007) and transcription factor binding sites (MacIsaac et al., 2006) (Abf1, Reb1, Rap1 occupancy) around TSSs using computeMatrix and plotProfile (deepTools; Ramírez et al., 2014; version 3.1.0). Tracks based on the RFD ratio were generated as described above.

Ligase 1 Depletion in Human Cells

Treatment of HCT116 and HCT116 LIG3−/−:mL3 cells with siRNA was carried out 24 h after seeding 200,000 WT cells in 6-cm plates (21.3 cm2) with 5.5 mL medium and 1.23 million LIG3−/−:mL3 cells in 10-cm plates (59 cm2) with 15.5 mL medium, to account for the slower growth rate of the latter and the consequent lower cell yield at the end of the procedure. For the 10-cm plates, 185 pmol of control (Silencer Select control no. 1, Thermo Fisher Scientific) or Ligase 1 (Silencer Select s8174, Thermo Fisher Scientific) siRNAs were mixed with 31 μL Lipofectamine RNAiMAX (Thermo Fisher Scientific) in a final volume of 3.1 mL OptiMEM (Thermo Fisher Scientific) and added to the cells. For the 6-cm plates, these amounts were scaled down according to surface area. Two days post siRNA transfection, cells were passaged into 10-cm plates and grown for two more days before being harvested and immediately processed for GLOE-Seq. In parallel, total protein extracts were prepared for western blot analysis by incubation of cells in 25 mM HEPES pH 7.5, 100 mM NaCl, 2 mM MgCl2, 1% Triton X-100 and 0.5 U/μL Benzonase (Sigma-Aldrich) at 4°C for 1-2 h. Samples of 50 μg (determined by Bio-Rad’s Protein Assay) were resolved on a NuPAGE 4%–12% Bis-Tris gel (Thermo Fisher Scientific) and transferred to a PVDF membrane. The membrane was blocked in 5% (w/v) non-fat milk in PBS with 0.1% (v/v) Tween-20 and incubated with antibodies against Ligase 1 (Elabscience), CHK1 (Cell Signaling Technology), phospho-CHK1 (Cell Signaling Technology) and tubulin (YL1/2, Sigma-Aldrich), which were detected by enhanced chemiluminescence. Nuclei left over from the preparation of genomic DNA (see below) were analyzed by flow cytometry (FACSVerse, BD Biosciences) in PBS with 80 μg/mL propidium iodide and the resulting data visualized in FlowJo 10. For UV irradiation, HCT116 cells were grown to a confluency of ~50%, washed twice in PBS, exposed to 25 J/m2 UV (254 nm, Stratalinker 2400, Stratagene) and harvested after 4 h. Whole-cell extracts were prepared as described above with the addition of phosphatase inhibitors (PhosSTOP, Sigma-Aldrich) to the lysis buffer.
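The scaling of transfection reagents from 10-cm to 6-cm plates by surface area can be illustrated with a short calculation. The function name scale_by_area is hypothetical, and any rounding actually applied at the bench is not stated in the text, so the values below are indicative only.

```python
def scale_by_area(amount_10cm: float,
                  area_target_cm2: float = 21.3,
                  area_10cm_cm2: float = 59.0) -> float:
    """Scale an amount used for a 10-cm plate to a plate of another area.

    The default areas are those quoted in the text for 6-cm (21.3 cm2)
    and 10-cm (59 cm2) plates.
    """
    if area_10cm_cm2 <= 0:
        raise ValueError("reference area must be positive")
    return amount_10cm * area_target_cm2 / area_10cm_cm2


if __name__ == "__main__":
    print(f"siRNA:         {scale_by_area(185):.1f} pmol")
    print(f"Lipofectamine: {scale_by_area(31):.1f} uL")
    print(f"OptiMEM:       {scale_by_area(3.1):.2f} mL")
```

With these areas, the amounts for a 6-cm plate come to roughly 67 pmol siRNA, 11 μL Lipofectamine RNAiMAX and 1.1 mL OptiMEM.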

Preparation of Mammalian Genomic DNA

A detailed step-by-step protocol for the GLOE-Seq procedure, including the preparation of genomic DNA, is provided as Supplemental Information. Mammalian genomic DNA was isolated from agarose-embedded nuclei prepared from cultured cells as described in the step-by-step protocol (steps m1-21). Briefly, cells were harvested by trypsinization and nuclei were prepared by incubation with 0.1% (v/v) Triton X-100. Nuclei were washed, treated with RNase A and embedded in low-melting point agarose. After solidification, the agarose plugs were treated with Proteinase K solution containing 1% (w/v) Sarkosyl to induce lysis. Proteinase K was then inactivated by PMSF treatment, followed by extensive washing of the plugs. 3D models of the plug mold and an extrusion tool are available as Data S1.

Application of GLOE-Seq to Human Genomic DNA

All steps of the GLOE-Seq protocol up to the ligation of the proximal adaptor (steps m22-28, see Supplemental Information) were carried out in agarose plugs. Briefly, the agarose-embedded genomic DNA was denatured by heating to 95°C, followed by quenching on ice. Ligation was performed overnight at 16°C, followed by digestion of the agarose with β-agarase. The DNA was then fragmented by sonication in a Bioruptor Pico, purified with AMPure beads and processed using the core GLOE-Seq workflow (steps 29-50). Raw GLOE-Seq data from these experiments were processed with GLOE-Pipe in the indirect mode. RFD ratios were calculated within 1-kbp-stepped 10-kbp sliding bins.
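The sliding-bin calculation described above can be illustrated with a minimal sketch. It is not taken from GLOE-Pipe. It assumes that the RFD ratio in a bin is the difference between reverse- and forward-strand counts divided by their sum, as is common for Okazaki fragment data; the sign convention, the function names and the per-base count lists are assumptions made for illustration only.

```python
def prefix_sums(values):
    """Return cumulative sums so that any window total is one subtraction."""
    sums = [0]
    for v in values:
        sums.append(sums[-1] + v)
    return sums


def sliding_rfd(fwd, rev, window=10_000, step=1_000):
    """Compute an RFD-like ratio in sliding bins.

    fwd, rev: per-base read-end counts on the forward and reverse strand
              (hypothetical input format).
    Returns a list of (bin_start, ratio); ratio is None for empty bins.
    Assumed definition: (rev - fwd) / (rev + fwd).
    """
    if len(fwd) != len(rev):
        raise ValueError("strand tracks must have the same length")
    pf, pr = prefix_sums(fwd), prefix_sums(rev)
    result = []
    for start in range(0, len(fwd) - window + 1, step):
        end = start + window
        f = pf[end] - pf[start]
        r = pr[end] - pr[start]
        total = f + r
        result.append((start, (r - f) / total if total else None))
    return result
```

With the default arguments the function uses 10-kbp bins advanced in 1-kbp steps, matching the bin geometry stated in the text; bins without any reads are reported as None rather than as zero so that they are not mistaken for balanced strand usage.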

Distribution of SSBs over Gene Features

RNA transcript levels (FPKM, Fragments Per Kilobase of transcript per Million mapped reads, as determined by RNA-Seq) for HCT116 cells were obtained from the ENCODE project (ENCODE Project Consortium, 2012) and supplemented with relevant additional information from UCSC’s annotation database. For genes with more than one reported isoform, the single largest FPKM value among all isoforms was chosen for downstream analysis. Genes were grouped into two categories: those without (FPKM = 0) and those with detectable transcription (FPKM > 0). For TSSs, TTSs, 5′ and 3′ exon junctions, the signal was extracted 5,000 bp upstream/downstream of the relevant reference point and processed as 1-bp-stepped moving averages of 1-kbp windows. For 5′ and 3′ UTRs, exons and introns, SSB data spanning the exact location of each individual feature were extracted, concatenated and divided into 1-kbp-long non-overlapping bins, which were used to compute averages. The resulting values were converted into percentage deviation from the “background” genome-wide level of signal, which was defined as the mean signal in 1-kbp non-overlapping bins along the entire length of the genome.
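As a rough illustration of the steps in this paragraph, the sketch below groups genes by their largest isoform FPKM value, averages a concatenated signal in non-overlapping bins and expresses a value as a percentage deviation from background. All function names and input structures are hypothetical, and dropping a trailing partial bin is an assumption; the authors' actual code is not reproduced here.

```python
def split_by_expression(isoform_fpkm):
    """Group genes by the largest FPKM among their isoforms.

    isoform_fpkm: dict mapping gene name -> list of isoform FPKM values.
    Returns (genes with FPKM = 0, genes with FPKM > 0), each sorted.
    """
    silent, expressed = [], []
    for gene, values in isoform_fpkm.items():
        top = max(values)  # single largest FPKM among all isoforms
        (expressed if top > 0 else silent).append(gene)
    return sorted(silent), sorted(expressed)


def bin_means(signal, bin_size=1000):
    """Mean signal in consecutive non-overlapping bins.

    A trailing partial bin is dropped (an assumption of this sketch).
    """
    n_bins = len(signal) // bin_size
    return [
        sum(signal[i * bin_size:(i + 1) * bin_size]) / bin_size
        for i in range(n_bins)
    ]


def percent_deviation(value, background):
    """Percentage deviation of a value from the background level."""
    if background == 0:
        raise ValueError("background signal must be non-zero")
    return 100.0 * (value - background) / background
```

For example, a feature bin with twice the genome-wide mean signal yields a deviation of +100%, and a bin with half the background yields -50%.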

Multiple reads mapping to the same coordinate were assumed to derive from one unique break.
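The collapsing rule stated above can be sketched as follows. This is an illustration only: the representation of a read end as a (chromosome, position, strand) tuple and the decision to treat strands separately are assumptions, not details given in the text.

```python
def unique_breaks(read_ends):
    """Collapse reads that map to the same coordinate into one break.

    read_ends: iterable of (chromosome, position, strand) tuples
               (hypothetical input format).
    Returns the sorted list of distinct break coordinates.
    """
    return sorted(set(read_ends))


reads = [
    ("chr1", 100, "+"),
    ("chr1", 100, "+"),  # same coordinate and strand: counted once
    ("chr1", 100, "-"),  # opposite strand: kept as a separate break here
    ("chr2", 5, "+"),
]
breaks = unique_breaks(reads)
```

Under this rule, read depth at a position does not contribute to the signal; each covered coordinate counts as a single break.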

Modeling of Replication Timing

DNase I hypersensitivity data for HCT116 cells were obtained from the ENCODE project (ENCODE Project Consortium, 2012). Custom code was used to model the values for the six Repli-Seq fraction profile compartments (G1/G1b, S1, S2, S3, S4, G2) (Hansen et al., 2010), in 1-kbp bins, using Replicon (Gindin et al., 2014). S50 ratios, defined as the fraction of S phase (0 < S50 < 1) at which 50% of the DNA is replicated in a specific bin, were generated by linear regression of these data.

Quantification and Statistical Analysis

Software and statistical details are described in the relevant sub-sections of the Method Details where applicable.

Additional Resources

Detailed Protocol

Methods S1. Related to STAR Methods. Step-by-step protocol for the preparation of GLOE-Seq libraries.

Acknowledgments

Support by the IMB Genomics and Flow Cytometry Core Facilities and use of IMB’s NextSeq500 (INST 247/870-1 FUGG) is gratefully acknowledged. We thank Anke Ries from IMB’s Electronics Workshop for making the plug mold 3D model; Anne Donaldson, Susan Gasser, Jim Haber, and Eric Hendrickson for yeast strains, cell lines, and reagents; Chun-Long Chen and Olivier Hyrien for sharing datasets; and Nicola Crosetto for helpful advice in the early stages of the project. Funding for this work was provided by the European Research Council (ERC; AdG 323179) and the Deutsche Forschungsgemeinschaft (DFG; German Research Foundation; project number 393547839 – SFB 1361, sub-project 07).

Author Contributions

Conceptualization, A.M.S., N.Z., and H.D.U.; Methodology, A.M.S., N.Z., G.P., M.M.-L., and L.S.B.-N.; Investigation, A.M.S., A.J.S., and N.Z.; Software, Formal Analysis, and Data Curation, G.P. and N.Z.; Writing – Original Draft, N.Z. and H.D.U.; Writing – Review & Editing, N.Z., A.M.S., G.P., and H.D.U.; Project Administration and Funding Acquisition, H.D.U.

Declaration of Interests

The authors declare no competing interests.

Published: April 21, 2020

Footnotes

Supplemental Information can be found online at https://doi.org/10.1016/j.molcel.2020.03.027.

Contributor Information

Nicola Zilio, Email: n.zilio@imb-mainz.de.

Helle D. Ulrich, Email: h.ulrich@imb-mainz.de.

Supplemental Information

Document S1. Figures S1–S6 and Methods S1
mmc1.pdf (8.4MB, pdf)
Data S1. 3D Models of the Plug Mold and Extrusion Tool (.stl), Related to STAR Methods
mmc2.zip (299.1KB, zip)
Document S2. Article plus Supplemental Information
mmc3.pdf (12.4MB, pdf)

References

  1. Abbotts R., Wilson D.M., 3rd. Coordination of DNA single strand break repair. Free Radic. Biol. Med. 2017;107:228–244. doi: 10.1016/j.freeradbiomed.2016.11.039.
  2. Alver R.C., Chadha G.S., Blow J.J. The contribution of dormant origins to genome stability: from cell biology to human genetics. DNA Repair (Amst.) 2014;19:182–189. doi: 10.1016/j.dnarep.2014.03.012.
  3. Andrews S. FastQC: A quality control tool for high throughput sequence data. 2019. https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
  4. Arakawa H., Bednar T., Wang M., Paul K., Mladenov E., Bencsik-Theilen A.A., Iliakis G. Functional redundancy between DNA ligases I and III in DNA replication in vertebrate cells. Nucleic Acids Res. 2012;40:2599–2610. doi: 10.1093/nar/gkr1024.
  5. Baranello L., Kouzine F., Wojtowicz D., Cui K., Przytycka T.M., Zhao K., Levens D. DNA break mapping reveals topoisomerase II activity genome-wide. Int. J. Mol. Sci. 2014;15:13111–13122. doi: 10.3390/ijms150713111.
  6. Beranek D.T. Distribution of methyl and ethyl adducts following alkylation with monofunctional alkylating agents. Mutat. Res. 1990;231:11–30. doi: 10.1016/0027-5107(90)90173-2.
  7. Bolger A.M., Lohse M., Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–2120. doi: 10.1093/bioinformatics/btu170.
  8. Caldecott K.W. DNA single-strand break repair. Exp. Cell Res. 2014;329:2–8. doi: 10.1016/j.yexcr.2014.08.027.
  9. Canela A., Sridharan S., Sciascia N., Tubbs A., Meltzer P., Sleckman B.P., Nussenzweig A. DNA breaks and end resection measured genome-wide by End Sequencing. Mol. Cell. 2016;63:898–911. doi: 10.1016/j.molcel.2016.06.034.
  10. Cao H., Salazar-García L., Gao F., Wahlestedt T., Wu C.L., Han X., Cai Y., Xu D., Wang F., Tang L. Novel approach reveals genomic landscapes of single-strand DNA breaks with nucleotide resolution in human cells. Nat. Commun. 2019;10:5799. doi: 10.1038/s41467-019-13602-7.
  11. Clausen A.R., Lujan S.A., Burkholder A.B., Orebaugh C.D., Williams J.S., Clausen M.F., Malc E.P., Mieczkowski P.A., Fargo D.C., Smith D.J., Kunkel T.A. Tracking replication enzymology in vivo by genome-wide mapping of ribonucleotide incorporation. Nat. Struct. Mol. Biol. 2015;22:185–191. doi: 10.1038/nsmb.2957.
  12. Crosetto N., Mitra A., Silva M.J., Bienko M., Dojer N., Wang Q., Karaca E., Chiarle R., Skrzypczak M., Ginalski K. Nucleotide-resolution DNA double-strand break mapping by next-generation sequencing. Nat. Methods. 2013;10:361–365. doi: 10.1038/nmeth.2408.
  13. Ding J., Taylor M.S., Jackson A.P., Reijns M.A.M. Genome-wide mapping of embedded ribonucleotides and other noncanonical nucleotides using emRiboSeq and EndoSeq. Nat. Protoc. 2015;10:1433–1444. doi: 10.1038/nprot.2015.099.
  14. Eid A., Alshareef S., Mahfouz M.M. CRISPR base editors: genome editing without double-stranded breaks. Biochem. J. 2018;475:1955–1964. doi: 10.1042/BCJ20170793.
  15. ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74. doi: 10.1038/nature11247.
  16. Finley D., Ozkaynak E., Varshavsky A. The yeast polyubiquitin gene is essential for resistance to high temperatures, starvation, and other stresses. Cell. 1987;48:1035–1046. doi: 10.1016/0092-8674(87)90711-2.
  17. Gansauge M.-T., Meyer M. Single-stranded DNA library preparation for the sequencing of ancient or damaged DNA. Nat. Protoc. 2013;8:737–748. doi: 10.1038/nprot.2013.038.
  18. Gansauge M.-T., Gerber T., Glocke I., Korlevic P., Lippik L., Nagel S., Riehl L.M., Schmidt A., Meyer M. Single-stranded DNA library preparation from highly degraded DNA using T4 DNA ligase. Nucleic Acids Res. 2017;45:e79. doi: 10.1093/nar/gkx033.
  19. Gindin Y., Meltzer P.S., Bilke S. Replicon: a software to accurately predict DNA replication timing in metazoan cells. Front. Genet. 2014;5:378. doi: 10.3389/fgene.2014.00378.
  20. Hansen R.S., Thomas S., Sandstrom R., Canfield T.K., Thurman R.E., Weaver M., Dorschner M.O., Gartler S.M., Stamatoyannopoulos J.A. Sequencing newly replicated DNA reveals widespread plasticity in human replication timing. Proc. Natl. Acad. Sci. USA. 2010;107:139–144. doi: 10.1073/pnas.0912402107.
  21. Hauer M.H., Seeber A., Singh V., Thierry R., Sack R., Amitai A., Kryzhanovska M., Eglinger J., Holcman D., Owen-Hughes T., Gasser S.M. Histone degradation in response to DNA damage enhances chromatin dynamics and recombination rates. Nat. Struct. Mol. Biol. 2017;24:99–107. doi: 10.1038/nsmb.3347.
  22. Hoffman E.A., McCulley A., Haarer B., Arnak R., Feng W. Break-seq reveals hydroxyurea-induced chromosome fragility as a result of unscheduled conflict between DNA replication and transcription. Genome Res. 2015;25:402–412. doi: 10.1101/gr.180497.114.
  23. Hu J., Meyers R.M., Dong J., Panchakshari R.A., Alt F.W., Frock R.L. Detecting DNA double-stranded breaks in mammalian genomes by linear amplification-mediated high-throughput genome-wide translocation sequencing. Nat. Protoc. 2016;11:853–871. doi: 10.1038/nprot.2016.043.
  24. Hu J., Selby C.P., Adar S., Adebali O., Sancar A. Molecular mechanisms and genomic maps of DNA excision repair in Escherichia coli and humans. J. Biol. Chem. 2017;292:15588–15597. doi: 10.1074/jbc.R117.807453.
  25. Karnani N., Dutta A. The effect of the intra-S-phase checkpoint on origins of replication in human cells. Genes Dev. 2011;25:621–633. doi: 10.1101/gad.2029711.
  26. Kent W.J., Zweig A.S., Barber G., Hinrichs A.S., Karolchik D. BigWig and BigBed: enabling browsing of large distributed datasets. Bioinformatics. 2010;26:2204–2207. doi: 10.1093/bioinformatics/btq351.
  27. Kubota T., Nishimura K., Kanemaki M.T., Donaldson A.D. The Elg1 replication factor C-like complex functions in PCNA unloading during DNA replication. Mol. Cell. 2013;50:273–280. doi: 10.1016/j.molcel.2013.02.012.
  28. Langmead B., Salzberg S.L. Fast gapped-read alignment with Bowtie 2. Nat. Methods. 2012;9:357–359. doi: 10.1038/nmeth.1923.
  29. Leduc F., Faucher D., Bikond Nkoma G., Grégoire M.-C., Arguin M., Wellinger R.J., Boissonneault G. Genome-wide mapping of DNA strand breaks. PLoS ONE. 2011;6:e17353. doi: 10.1371/journal.pone.0017353.
  30. Lee S.E., Moore J.K., Holmes A., Umezu K., Kolodner R.D., Haber J.E. Saccharomyces Ku70, mre11/rad50 and RPA proteins regulate adaptation to G2/M arrest after DNA damage. Cell. 1998;94:399–409. doi: 10.1016/s0092-8674(00)81482-8.
  31. Lensing S.V., Marsico G., Hänsel-Hertsch R., Lam E.Y., Tannahill D., Balasubramanian S. DSBCapture: in situ capture and sequencing of DNA breaks. Nat. Methods. 2016;13:855–857. doi: 10.1038/nmeth.3960.
  32. Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R., 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–2079. doi: 10.1093/bioinformatics/btp352.
  33. Liu Z.J., Martínez Cuesta S., van Delft P., Balasubramanian S. Sequencing abasic sites in DNA at single-nucleotide resolution. Nat. Chem. 2019;11:629–637. doi: 10.1038/s41557-019-0279-9.
  34. MacIsaac K.D., Wang T., Gordon D.B., Gifford D.K., Stormo G.D., Fraenkel E. An improved map of conserved regulatory sites for Saccharomyces cerevisiae. BMC Bioinformatics. 2006;7:113. doi: 10.1186/1471-2105-7-113.
  35. Mao P., Brown A.J., Malc E.P., Mieczkowski P.A., Smerdon M.J., Roberts S.A., Wyrick J.J. Genome-wide maps of alkylation damage, repair, and mutagenesis in yeast reveal mechanisms of mutational heterogeneity. Genome Res. 2017;27:1674–1684. doi: 10.1101/gr.225771.117.
  36. Muslimović A., Nyström S., Gao Y., Hammarsten O. Numerical analysis of etoposide induced DNA breaks. PLoS ONE. 2009;4:e5859. doi: 10.1371/journal.pone.0005859.
  37. Nick McElhinny S.A., Watts B.E., Kumar D., Watt D.L., Lundström E.-B., Burgers P.M.J., Johansson E., Chabes A., Kunkel T.A. Abundant ribonucleotide incorporation into DNA by yeast replicative polymerases. Proc. Natl. Acad. Sci. USA. 2010;107:4949–4954. doi: 10.1073/pnas.0914857107.
  38. Oh S., Harvey A., Zimbric J., Wang Y., Nguyen T., Jackson P.J., Hendrickson E.A. DNA ligase III and DNA ligase IV carry out genetically distinct forms of end joining in human somatic cells. DNA Repair (Amst.) 2014;21:97–110. doi: 10.1016/j.dnarep.2014.04.015.
  39. Petryk N., Kahli M., d’Aubenton-Carafa Y., Jaszczyszyn Y., Shen Y., Silvain M., Thermes C., Chen C.-L., Hyrien O. Replication landscape of the human genome. Nat. Commun. 2016;7:10208. doi: 10.1038/ncomms10208.
  40. Plosky B., Samson L., Engelward B.P., Gold B., Schlaen B., Millas T., Magnotti M., Schor J., Scicchitano D.A. Base excision repair and nucleotide excision repair contribute to the removal of N-methylpurines from active genes. DNA Repair (Amst.) 2002;1:683–696. doi: 10.1016/s1568-7864(02)00075-7.
  41. Poetsch A.R., Boulton S.J., Luscombe N.M. Genomic landscape of oxidative DNA damage and repair reveals regioselective protection from mutagenesis. Genome Biol. 2018;19:215. doi: 10.1186/s13059-018-1582-2.
  42. Pommier Y., Huang S.Y., Gao R., Das B.B., Murai J., Marchand C. Tyrosyl-DNA-phosphodiesterases (TDP1 and TDP2). DNA Repair (Amst.) 2014;19:114–129. doi: 10.1016/j.dnarep.2014.03.020.
  43. Quinlan A.R., Hall I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–842. doi: 10.1093/bioinformatics/btq033.
  44. Ramírez F., Dündar F., Diehl S., Grüning B.A., Manke T. deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res. 2014;42:W187-91. doi: 10.1093/nar/gku365.
  45. Reijns M.A.M., Kemp H., Ding J., de Procé S.M., Jackson A.P., Taylor M.S. Lagging-strand replication shapes the mutational landscape of the genome. Nature. 2015;518:502–506. doi: 10.1038/nature14183.
  46. Sadedin S.P., Pope B., Oshlack A. Bpipe: a tool for running and managing bioinformatics pipelines. Bioinformatics. 2012;28:1525–1526. doi: 10.1093/bioinformatics/bts167.
  47. Smith D.J., Whitehouse I. Intrinsic coupling of lagging-strand synthesis to chromatin assembly. Nature. 2012;483:434–438. doi: 10.1038/nature10895.
  48. Thomas B.J., Rothstein R. Elevated recombination rates in transcriptionally active DNA. Cell. 1989;56:619–630. doi: 10.1016/0092-8674(89)90584-9.
  49. Tsai S.Q., Zheng Z., Nguyen N.T., Liebers M., Topkar V.V., Thapar V., Wyvekens N., Khayter C., Iafrate A.J., Le L.P. GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases. Nat. Biotechnol. 2015;33:187–197. doi: 10.1038/nbt.3117.
  50. Tubbs A., Sridharan S., van Wietmarschen N., Maman Y., Callen E., Stanlie A., Wu W., Wu X., Day A., Wong N. Dual roles of poly(dA:dT) tracts in replication initiation and fork collapse. Cell. 2018;174:1127–1142.e19. doi: 10.1016/j.cell.2018.07.011.
  51. Vitelli V., Galbiati A., Iannelli F., Pessina F., Sharma S., d’Adda di Fagagna F. Recent advancements in DNA damage-transcription crosstalk and high-resolution mapping of DNA breaks. Annu. Rev. Genomics Hum. Genet. 2017;18:87–113. doi: 10.1146/annurev-genom-091416-035314.
  52. Whitehouse I., Rando O.J., Delrow J., Tsukiyama T. Chromatin remodelling at promoters suppresses antisense transcription. Nature. 2007;450:1031–1035. doi: 10.1038/nature06391.
  53. Wong R.P., García-Rodríguez N., Zilio N., Hanulová M., Ulrich H.D. Processing of DNA polymerase-blocking lesions during genome replication is spatially and temporally segregated from replication forks. Mol. Cell. 2020;77:3–16.e4. doi: 10.1016/j.molcel.2019.09.015.
  54. Wu J., McKeague M., Sturla S.J. Nucleotide-resolution genome-wide mapping of oxidative DNA damage by Click-Code-Seq. J. Am. Chem. Soc. 2018;140:9783–9787. doi: 10.1021/jacs.8b03715.
  55. Xiao W., Chow B.L. Synergism between yeast nucleotide and base excision repair pathways in the protection against DNA methylation damage. Curr. Genet. 1998;33:92–99. doi: 10.1007/s002940050313.
  56. Yan W.X., Mirzazadeh R., Garnerone S., Scott D., Schneider M.W., Kallas T., Custodio J., Wernersson E., Li Y., Gao L. BLISS is a versatile and quantitative method for genome-wide profiling of DNA double-strand breaks. Nat. Commun. 2017;8:15058. doi: 10.1038/ncomms15058.
  57. Yu G., Wang L.-G., He Q.-Y. ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization. Bioinformatics. 2015;31:2382–2383. doi: 10.1093/bioinformatics/btv145.
  58. Zhang Y., Liu T., Meyer C.A., Eeckhoute J., Johnson D.S., Bernstein B.E., Nusbaum C., Myers R.M., Brown M., Li W., Liu X.S. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008;9:R137. doi: 10.1186/gb-2008-9-9-r137.

Associated Data

Data Availability Statement

The datasets generated and analyzed during this study are available in the Gene Expression Omnibus repository, https://www.ncbi.nlm.nih.gov/geo, under accession number GEO: GSE134225. Published datasets used for this analysis and their accessibility are summarized in the Key Resources Table. GLOE-Pipe is publicly accessible, fully documented and regularly maintained by the developers at https://github.com/helle-ulrich-lab/ngs-gloepipe.
