Author manuscript; available in PMC: 2024 Jul 18.
Published in final edited form as: Chem Commun (Camb). 2023 Jul 18;59(58):8997–9000. doi: 10.1039/d3cc02699j

Sequencing for oxidative DNA damage at single-nucleotide resolution with click-code-seq v2.0

Songjun Xiao a, Aaron M Fleming a, Cynthia J Burrows a,*
PMCID: PMC10909242  NIHMSID: NIHMS1969256  PMID: 37401666

Abstract

Oxidative damage to DNA nucleotides has many cellular outcomes, and understanding them could be aided by the development of sequencing methods. Herein, the previously reported click-code-seq method for sequencing a single damage type is redeveloped to enable the sequencing of many damage types through simple changes to the protocol (i.e., click-code-seq v2.0).


Oxidatively derived damage to DNA is caused by reactive oxygen species (ROS) formed endogenously during metabolism and the inflammatory response or exogenously when cells are exposed to ionizing radiation.1 The cellular milieu is a reducing context with bicarbonate buffer that can influence the chemistry of the ROS and the products formed from DNA oxidation.2 The 2′-deoxyguanosine (dG) nucleotide in DNA is the most electron-rich and is a dominant site of oxidation to yield the 8-oxo-7,8-dihydro-2′-deoxyguanosine (dOG) analog intracellularly.3,4 Oxidation reactions can also occur on other bases, such as thymine to yield thymine glycol (dTg).5 A long-standing question has been, where are these modifications formed in the genome? To answer this question for each oxidatively derived damage type, methodologies must be developed that can sequence the DNA beyond reading the locations of the dA, dC, dG, and dT nucleotides.6 The biological payoff from developing and employing such techniques is a better understanding of the role of site-specific oxidative DNA damage in causing the mutations found in a variety of diseases.7 DNA oxidation at dG nucleotides may also function as an epigenetic-like regulator of the cellular response to stress, and knowing the sites of genomic dOG is essential for advancing this hypothesis.8

Many methods for sequencing genomic oxidation sites have been reported. These methods fall into two general categories. The first are those that enrich fragmented DNA for strands containing the oxidation site, after which sequencing reports the location of the oxidation site at the resolution of the average fragment length.5,9–12 The second are those that introduce signatures at the site of modification, which enables single-nucleotide resolution of the modification sites.13–17

A sub-category of the nucleotide-resolution methods is those that hijack the natural base excision repair (BER) pathway, using the specificity of a DNA glycosylase to remove the damaged nucleotide and then installing a reporter group to be found during sequencing (Fig. 1A).13,14,17 Two reports that are background for the present report will be discussed. In 2015, the Burrows laboratory first demonstrated this approach by replacing the damaged nucleotide with a PCR-amplifiable unnatural base pair (UBP = dNaM:d5SICS,18 Fig. 1B).13 Sequencing for the dOG heterocycle was achieved by targeting the site with either the Fpg or the OGG1 glycosylase, the latter being more selective than the former,19 to replace the dOG with the UBP, which was revealed by a sharp stop in a Sanger sequencing chromatogram. The universality of this approach was demonstrated by the selectivity of the NEIL1 glycosylase for the hyperoxidized dG base spiroiminodihydantoin (dSp)20 and by use of UDG21 to label uracil (dU) resulting from dC deamination in DNA. At the time, high-throughput sequencing technology was unable to locate an unnatural nucleotide in the DNA sequence; however, this could now change based on a recent report.22 An improvement to this approach was reported in 2018 by the Sturla laboratory, who showed that a BER-generated gap could be filled with a reporter sequence. Their method, “click-code-seq,” used Fpg for the removal of dOG, which was then replaced with a custom-synthesized 3′-O-propargyl-dGTP (Fig. 1C).14 The installed alkyne group was then subjected to the click reaction (CuAAC) to append a 5′-azide-containing DNA strand of known sequence used to locate the dOG after sequencing. The Brown laboratory previously developed the 1,2,3-triazole linkage formed in this reaction and found it was compatible with polymerase bypass.23 The click-code-seq protocol was employed on yeast genomic DNA;14 however, the demand for the custom synthesis of a 3′-O-propargyl-dNTP imposes a barrier for application to damage at other DNA nucleotides.

Figure 1.

Prior BER-based methods (A, B, and C) for nucleotide-resolution mapping of oxidatively damaged nucleotides that inspired the present method (D). See Figure S1 (ESI) for the complete structures of the 1,2,3-triazole linkages formed in panels C and D.

In the present report, we were inspired by click-code-seq developed in the Sturla laboratory that allowed sequencing for dOG sites via high-throughput sequencing;14 our goal was to adapt the methodology to function universally for targeting other types of DNA damage, similar to the approach using the UBP to mark the damaged sites (Fig. 1).13 To achieve this goal, the click partners were reversed: the gap in the DNA formed by the glycosylase/endonuclease step was filled by a polymerase with a commercially available 3′-azido-2′,3′-dideoxynucleotide (3′-N3-ddNTP; Fig. 1D). The code DNA strand carried a 5′-alkynyl group, introduced by standard phosphoramidite synthesis of the monomer followed by solid-phase synthesis of the oligomer of known sequence. We have termed the updated methodology click-code-seq v2.0. The universal nature of the update results from all four 3′-N3-ddNTPs being commercially available, so the code sequence needs to be synthesized only once and can then be applied with whichever damage-specific glycosylase and 3′-N3-ddNTP are selected. What follows is an outline of the methodology, a description of the synthetic protocol, optimizations conducted at each step, and application of the method to sequence a dOG at a known site in a ~6300-nt long plasmid DNA. Finally, we outline future DNA damage sites that could be sequenced by the method by simply changing the glycosylase and ordering a new 3′-N3-ddNTP from a commercial vendor.

The method reported was first developed and optimized on DNA oligomers made by standard solid-phase synthesis with dOG at a known location (Fig. S2, ESI). The sequence selected for study flanks codon 12 of the KRAS gene, in which there is an established G→T transversion mutation found in many cancers that is also diagnostic of the mutation caused by dOG during replication.7 The dOG-containing DNA strand was 5′-32P-labelled and then mixed with the complementary strand to form duplex DNA (Figs. 2A and 2B lane 1). The chemical reactions were monitored by denaturing PAGE separation and phosphorimager autoradiography visualization (Fig. 2B).

Figure 2.

(A) Overview of the method and (B) monitoring the reaction progress of click-code-seq v2.0 by PAGE and storage phosphor autoradiography.

The first step of the protocol is where selectivity in the DNA damage recognition occurs because BER glycosylases have evolved to target specific types of DNA lesions.19 In the development of the method, the duplex DNA was treated with the bifunctional glycosylase Fpg, which hydrolyses the damaged base from the sugar and then catalyzes β,δ-elimination of the phosphates to yield a 5′ fragment with a 3′ phosphate (Fig. 2B lane 2).19 The 3′ phosphate must be removed prior to installation of the commercially available 3′-N3-ddGTP. In the original version of click-code-seq, this reaction was conducted with the endonuclease APE1;14 however, we found the reaction is much more efficient using endonuclease IV (Endo IV; Figs. 2B lane 3 and S3, ESI), as we previously reported.13 The original click-code-seq protocol used Therminator DNA polymerase for installation of the 3′-O-propargyl-dGTP; however, in the present approach, we used the DNA polymerase Klenow fragment 3′→5′ exonuclease minus (Kf exo−), which is known to tolerate modified dNTPs24 and was found to install 3′-N3-ddGTP in nearly quantitative yield (Fig. 2B lane 4). The DNA product in this scenario has a 3′ azide on the DNA strand of interest, which is the reverse of the original click-code-seq, where the alkyne sits at the same position and strand.

Production of the 5′-alkynyl-code DNA sequence first required synthesis of a 5′-alkynyl-dT phosphoramidite following a well-established protocol from the Brown laboratory (Fig. S4, ESI).25 The same laboratory also found that the 1,2,3-triazole linkage formed after the click reaction is biocompatible and readily bypassed by a DNA polymerase.25 Synthesis of phosphoramidites, especially on a dT nucleotide that does not require base protection, is more tractable than the reactions needed to furnish dNTPs, particularly those with the G heterocycle.26 The phosphoramidite formed was used in the solid-phase synthesis of the code DNA strand, where it was installed as the final 5′ nucleotide (Fig. S5, ESI). The code sequence was selected to minimize any off-target binding to the human genome (Fig. S2, ESI). In the final step of the protocol, the click reaction was allowed to occur between the 3′-azide group installed where dOG previously resided and the 5′-alkynyl-code DNA sequence. The click reaction was optimized and found to reach yields >95% with the Cu:TBTA catalyst in a solution containing 55% DMSO at 22 °C for 24 h (Figs. 2B lane 5 and S6, ESI).
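As a rough illustration of how a yield such as the >95% figure could be estimated from a gel lane, the sketch below computes the product fraction from two band intensities. The quantification scheme is an assumption made here for illustration, not a description of the authors' analysis; the function name `click_yield` and the intensity values are hypothetical.

```python
def click_yield(product_band: float, unreacted_band: float) -> float:
    """Fraction of the labelled strand converted to click product.

    Assumption: both bands are in the same lane and carry the same
    5'-32P label, so intensity is proportional to the amount of strand.
    """
    total = product_band + unreacted_band
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return product_band / total


# Hypothetical phosphorimager intensities (arbitrary units), not data
# from the paper.
print(f"click yield: {click_yield(96.0, 4.0):.0%}")
```

A ratio within one lane is used so that lane-to-lane loading differences do not affect the estimate.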

Demonstration of the protocol on a large DNA strand of biological origin was then conducted. Using a literature protocol,27 a ~6300-nt long plasmid extracted from E. coli culture was subjected to a chemo-enzymatic synthesis method for installation of dOG at a site of our choosing (Figs. 3A and S7, ESI). The VEGF promoter potential G-quadruplex forming sequence was selected for dOG introduction because this sequence, when oxidized in cellulo, induces mRNA synthesis via the epigenetic-like role of this modified nucleotide.27 Employing the method as described on the dOG-containing plasmid, followed by direct PCR amplification of the click product using Phusion High-Fidelity DNA Polymerase, revealed the oxidation site upon Sanger sequencing (Fig. 3B). In the sequencing chromatogram, the plasmid sequence is observed with the VEGF sequence, which is interrupted at the dOG site and continues with the code sequence introduced by the method. The PCR amplicon is a high-fidelity copy of the click conjugate, without any insertion or deletion of nucleotides. This validates the utility of the method on large DNA of biological origin where the site of oxidation is known. Studies in our laboratory have routinely applied a gap-ligation method for verification of site-selectively installed modifications in plasmids;28 a drawback of this approach arises when the modification resides in a dG run (e.g., a G-quadruplex forming sequence), for which the data revealed that the run was modified but not at which site. Click-code-seq has the advantage of sequencing modified plasmids to pinpoint the site of modification in poly-dG sequences (Fig. 3B). A detailed protocol for the method is provided in the Electronic Supplementary Information.
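The way a read reports the oxidation site can be sketched in a few lines: the target sequence runs up to the lesion position and then switches to the known code sequence, so the base immediately before the code marks the site. The example below is a minimal illustration that assumes an error-free read given in the orientation of the modified strand; the function name and both sequences are hypothetical and are not the VEGF or code sequences used in the study.

```python
from typing import Optional


def locate_lesion(read: str, code: str) -> Optional[int]:
    """Return the 0-based index in `read` of the base at the lesion site.

    Assumption: the read is in the orientation of the modified strand, so
    the target sequence is followed directly by the code sequence and the
    last target base before the code is the nucleotide installed at the
    lesion. Returns None if the code is absent or starts the read.
    """
    read, code = read.upper(), code.upper()
    junction = read.find(code)  # exact match only; no mismatch tolerance
    if junction <= 0:
        return None
    return junction - 1


# Hypothetical sequences for illustration only.
target_prefix = "AACCGGGGCGGG"  # ends at the lesion position inside a dG run
code_sequence = "TCATGCTAGTCA"  # stands in for the known code sequence
read = target_prefix + code_sequence

site = locate_lesion(read, code_sequence)
print(site, read[site])
```

Exact string matching keeps the sketch short; reads with sequencing errors would need an alignment step that tolerates mismatches.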

Figure 3.

(A) DNA of a biological origin was chemo-enzymatically modified to contain a site-specific dOG, (B) which was used for validation of the methodology on a large substrate of known composition.

Herein, a modified version of click-code-seq (i.e., click-code-seq v2.0) is reported that retains the single-nt resolution of the original procedure and has the additional benefit of being applicable to more than one type of DNA damage. The requirements to target other modifications are met by the suite of BER glycosylases that target specific DNA lesions.19 Using commercial 3′-N3-ddNTPs allows the gap formed by the glycosylase-endonuclease pair to be filled with the appropriate click reaction partner. Synthesis of one 5′-alkynyl-code DNA sequence allows it to be paired with any of the 3′-N3-ddNTPs, enabling sequencing for the target modification of interest. Similar to click-code-seq, the updated version of the method could map lesions that are ~20 nucleotides or more apart on the same strand, so that enough of the target genome is read to align the read to the reference sequence. Our previous report that labelled three different types of DNA damage with a UBP demonstrates the feasibility of extending this approach to other lesions (Fig. 1B).13 Furthermore, the original report by the Sturla laboratory demonstrated that the method can be deployed on the genomic scale.14 Lastly, a recent report showed efficient polymerase bypass of a 1,2,3-triazole formed between a 3′-N3-ddG and a 5′-hexynyl-terminated DNA strand;29 this is relevant because both click reaction partners would then be commercially available, allowing future applications of this approach that avoid custom chemical synthesis altogether.
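The spacing constraint can be expressed as a simple check on lesion positions along one strand. The sketch below flags neighbouring sites that lie closer than a minimum gap; the 20-nt default follows the approximate figure given in the text, while the function name and the example positions are hypothetical.

```python
from typing import List, Tuple


def too_close(positions: List[int], min_gap: int = 20) -> List[Tuple[int, int]]:
    """Return neighbouring lesion pairs on one strand closer than min_gap.

    `positions` are lesion coordinates on the same strand; min_gap=20
    mirrors the approximate spacing stated in the text.
    """
    ordered = sorted(positions)
    return [
        (left, right)
        for left, right in zip(ordered, ordered[1:])
        if right - left < min_gap
    ]


# Hypothetical coordinates for illustration only.
print(too_close([150, 100, 112]))
```

Pairs returned by the check are the ones for which the read between the two sites may be too short to align to the reference.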

As a final note, the glycosylase introduces the selectivity in the DNA damage to be sequenced (Fig. 1A). While Fpg is commonly used for targeting dOG base paired with dC, this glycosylase also has one of the broadest substrate scopes, which includes the dG modification Fapy-dG paired with dC and the hyperoxidized dG lesions dSp and 5-guanidinohydantoin (dGh) in any base pair context.30 Greater selectivity for dOG paired with dC is achieved by using OGG1 as the glycosylase, but the Fapy-dG lesion paired with dC remains a substrate in either case.30,31 Other DNA damage products that can be sequenced by the click-code-seq v2.0 method are provided in Table 1. These examples include dOG paired with dA using MutY,32 dTg using NTHL1 or NEIL1,5,33,34 thymine dimers (d(T<>T)) using T4 Endo V,35 the hydantoins using NEIL1 or NEIL3,20,36 and dU using UDG.21

Table 1.

Glycosylases, the substrates they target, and the 3′-N3-ddNTP that can be used for sequencing the target substrates.

BER glycosylase | Substrate | 3′-N3-ddNTP
Fpg | dOG:C, dFapy-G, dSp, and dGh | 3′-N3-ddGTP
NEIL1–3 | dSp and dGh or dTg | 3′-N3-ddGTP or 3′-N3-ddTTP
UDG | dU | 3′-N3-ddCTP
MutY | dOG:A | 3′-N3-ddTTP
T4 Endo V | d(T<>T) | 3′-N3-ddTTP
Endo IV | AP | 3′-N3-ddNTP
Endo V | dI | 3′-N3-ddATP
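Because changing the target lesion only requires changing the enzyme and the 3′-N3-ddNTP, Table 1 can be treated as a lookup. The sketch below transcribes the table into a small Python structure and returns the enzyme and nucleotide options for a given lesion. The names `TABLE_1` and `reagents_for` are hypothetical, and splitting the NEIL1–3 row into a hydantoin entry and a dTg entry is an interpretation of the table made for this example.

```python
from typing import List, Tuple

# (enzyme, substrates, 3'-N3-ddNTP), transcribed from Table 1.
# The NEIL1-3 row is split in two on the assumption that the hydantoins
# pair with ddGTP and dTg pairs with ddTTP.
TABLE_1 = [
    ("Fpg", ("dOG:C", "dFapy-G", "dSp", "dGh"), "3'-N3-ddGTP"),
    ("NEIL1-3", ("dSp", "dGh"), "3'-N3-ddGTP"),
    ("NEIL1-3", ("dTg",), "3'-N3-ddTTP"),
    ("UDG", ("dU",), "3'-N3-ddCTP"),
    ("MutY", ("dOG:A",), "3'-N3-ddTTP"),
    ("T4 Endo V", ("d(T<>T)",), "3'-N3-ddTTP"),
    ("Endo IV", ("AP",), "3'-N3-ddNTP"),
    ("Endo V", ("dI",), "3'-N3-ddATP"),
]


def reagents_for(lesion: str) -> List[Tuple[str, str]]:
    """Return (enzyme, 3'-N3-ddNTP) options listed in Table 1 for a lesion."""
    return [
        (enzyme, ddntp)
        for enzyme, substrates, ddntp in TABLE_1
        if lesion in substrates
    ]


print(reagents_for("dSp"))
print(reagents_for("dU"))
```

The code sequence does not appear in the lookup because, as described in the text, the same 5′-alkynyl-code strand is reused for every entry.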

Supplementary Material

Acknowledgments

We acknowledge financial support from the National Institute of General Medical Sciences grant no. R35 GM145237 and the University of Utah core facilities for synthesizing the DNA used in the studies and conducting the Sanger sequencing.

Footnotes

Electronic Supplementary Information (ESI) available: complete method details, data for optimization of each step, phosphoramidite synthesis protocol and characterization, and modified oligonucleotide synthesis and characterization. See DOI: 10.1039/x0xx00000x

Conflicts of interest

There are no conflicts to declare.

Notes and references
