iScience. 2018 Nov 2;9:423–432. doi: 10.1016/j.isci.2018.10.024

Highly Selective 5-Formyluracil Labeling and Genome-wide Mapping Using (2-Benzimidazolyl)Acetonitrile Probe

Yafen Wang 1,4, Chaoxing Liu 1,4, Fan Wu 1,4, Xiong Zhang 1, Sheng Liu 2, Zonggui Chen 3, Weiwu Zeng 1, Wei Yang 1, Xiaolian Zhang 2, Yu Zhou 3, Xiaocheng Weng 1, Zhiguo Wu 3, Xiang Zhou 1,5
PMCID: PMC6249349  PMID: 30466066

Summary

Chemical modifications to nucleobases greatly influence various cellular processes by making gene regulation more complex, and thus have a profound impact on heredity, growth, and disease. Here, we provide the first genome-wide map of 5-formyluracil (5fU) in living tissues and evaluate the potential roles for 5fU in genomics. We show that an azido derivative of (2-benzimidazolyl)acetonitrile has high selectivity for enriching 5fU-containing genomic DNA. The results demonstrate the feasibility of using this method to determine the genome-wide distribution of 5fU. Intriguingly, most 5fU sites were found in intergenic regions and introns. Also, the distribution of 5fU in human thyroid carcinoma tissues is positively correlated with binding sites of POLR2A protein, which indicates that 5fU may be distributed around POLR2A-binding sites.

Subject Areas: Chemistry, Genetics, Molecular Genetics

Highlights

  • The derivative of (2-benzimidazolyl)acetonitrile (azi-BIAN) can selectively label 5fU

  • Azi-BIAN can selectively label and pull down 5fU in the genome for NGS

  • The first genome-wide map of 5fU in mammalian genomic DNA

  • 5fU is highly enriched at intergenic regions and introns


Introduction

Chemical modifications to nucleobases play important roles in mediating fundamental biological processes (Booth et al., 2015, Shu et al., 2018, Suzuki and Bird, 2008, Wu and Zhang, 2017) and are regarded as hallmarks of many diseases (Chen et al., 2017, Jackson and Bartek, 2009, Johnson et al., 2017). Therefore, a detailed analysis of natural nucleobase modifications is essential for a complete understanding of genetic and epigenetic regulation (Hong et al., 2018, Iwan et al., 2017, Liu et al., 2016, Shen et al., 2014). However, a global mapping of the modified nucleobases in the genome is often missing because of the low abundance of these modifications and lack of sensitive, selective, and genome-wide detection methods (Wyrick and Roberts, 2015).

5-Formyluracil (5fU), which is present in many cells and tissues (Hong and Wang, 2007, Pfaffeneder et al., 2014), can be generated by exposure to UV light (Decarroz et al., 1986), ionizing radiation (Hong and Wang, 2007, Kasai et al., 1990), Fenton-type reagents, reactive oxygen species attack (Hong et al., 2006), or enzyme oxidation (Pais et al., 2015). It has been reported that 5fU modification can introduce gene mispairing (Yoshida et al., 1997), alter DNA structures (Kawasaki et al., 2017b), modulate protein-DNA interactions (Kittaka et al., 2001, Rogstad et al., 2004), and induce perturbations of DNA function (Rogstad et al., 2004). Recently, 5-formylcytosine (5fC), the modified cytosine counterpart of 5fU, has been identified as a vital epigenetic modification involved in gene regulation (Kitsera et al., 2017, Song et al., 2013, Wang et al., 2018b), cell differentiation, and development (Wagner et al., 2015, Zhu et al., 2017). 5fU may be an oxidation product of 5-hydroxymethyluracil (5hmU) in vivo, and 5hmU has been identified as not only an oxidized nucleobase but also an essential epigenetic mediator that influences transcription factors, changes the physical properties of local DNA duplex in the genome, and helps in binding of chromatin remodeling proteins (Kawasaki et al., 2017a, Kawasaki et al., 2018, Modrzejewska et al., 2016, Pfaffeneder et al., 2014). Whether 5fU also acts as an epigenetic mediator like 5fC and 5hmU remains an open question. Although many reagents such as aminothiophenol (Hirose et al., 2010), phenylenediamine (Hardisty et al., 2015, Liu et al., 2017b, Wang et al., 2018a), hydrazine (Kawasaki et al., 2017a, Liu et al., 2017a), and indole (Samanta et al., 2015) derivatives have been utilized to selectively label 5fU, genome-wide profiling of 5fU remains a challenge owing to its low abundance in the genome. The development of an efficient, rapid, sensitive, environment-friendly, and catalyst-free method to analyze 5fU is highly desired.

Herein, we present a novel method termed fU-Seq for determining the genome-wide distribution of 5fU in mouse hippocampus and human thyroid carcinoma tissue via an azido-modified reagent that selectively labels 5fU (Figures 1A and 1B). The challenge of selective capturing and profiling 5fU is therefore addressed by employing copper-free click chemistry between azido-modified 5fU-containing genomic DNA and commercial dibenzocyclooctyne (DBCO)-modified biotin (Click Chemistry Tools). After enrichment using streptavidin-coated magnetic beads, dithiothreitol can be used to cleave the biotin linker (Song et al., 2011) to release the pulled down 5fU-containing genomic DNA for further next-generation sequencing (NGS).

Figure 1

fU-Seq, a Method that Utilizes Selective Tagging of 5fU with an Azido-Modified Reagent

(A) The chemical structures of 5fU, the azido-modified reagent, DBCO derivatives, and their products.

(B) Schematic illustration of fU-Seq. Genomic DNA is fragmented, sequentially labeled with azi-BIAN, conjugated to biotin for pull-down enrichment, and released following enrichment using DTT to define 5fU sites throughout the genome. NGS, next-generation sequencing.

Results

Evaluating the Reactivity of azi-BIAN with ODN-fU

First, we screened for chemicals that efficiently tag 5fU with high yield and selectivity under warm conditions, compared with other aldehyde modifications present in DNA. We successfully identified several chemicals that reacted efficiently with 5fU (Liu et al., 2017a, Liu et al., 2017b, Liu et al., 2018). The inherent chemical properties of (2-benzimidazolyl)acetonitrile (BIAN) give it high selectivity for 5fU in both nucleosides and oligonucleotides. More importantly, BIAN did not react with abasic sites or 5fC. We thus designed an azido derivative of BIAN (azi-BIAN) for enriching 5fU-containing genomic DNA (Schemes S1 and S2).

Next, we incubated a 15-mer oligodeoxyribonucleotide containing one 5fU site (ODN-5fU) with azi-BIAN in NaOAc buffer (pH 5.0) at 37°C for 6 hr. Complete conversion to the new product ODN-azi-biaU was recorded by reversed-phase (RP)-high-performance liquid chromatography (HPLC) (monitored at 260 nm) (Figure 2C). The integrity of labeled DNAs was confirmed by MALDI-TOF mass spectrometry (MS) (Figures 2D, S2A, and S2B). In addition, enzymatically digested mononucleosides were analyzed by HPLC-MS to confirm that the reaction of 5fU yielded the adduct of 5-formyl-2′-deoxyuridine and azi-BIAN (azi-biaU) (Figure S3). In control experiments, ODN-T, ODN-5hmU, ODN-5hmC, ODN-5fC, and ODN-AP (in which the 5fU site was replaced with a T, 5hmU, 5hmC, 5fC, and abasic site, respectively) were also reacted with azi-BIAN under the same conditions. The high selectivity of azi-BIAN for 5fU was verified by RP-HPLC (monitored at 260 nm) (Figure S1) and denaturing polyacrylamide gel electrophoresis analysis (Figures 2A and 2B), indicating that the other aldehyde-containing sites, 5fC and the abasic site, were not labeled by azi-BIAN. To determine whether oxidative damage occurs during sample workup, ODN-5fC was incubated with azi-BIAN and the mixture was then subjected to DNA MALDI-TOF MS analysis. No mass peak corresponding to ODN-azi-biaU appeared (Figure S4B). Meanwhile, a model DNA (80 bp double-stranded [ds] ODN-fC) was extracted with the DNeasy Blood & Tissue Kit to simulate the process of genomic DNA extraction. The extracted DNA was then digested for liquid chromatography (LC)-MS analysis. No 5fU peak was found (Figure S5). These results indicated that oxidative damage did not occur during sample workup.

Figure 2

azi-BIAN Selectively Labels and Enriches 5fU

(A) PAGE analysis of ODN-5fU after incubation with azi-BIAN (lane 3) (dashed line) after being stained with nucleic acid stains (fluorescence mode, λex: 532 nm) when compared with other control DNAs such as ODN-T (lane 1), ODN-5hmU (lane 2), ODN-5hmC (lane 4), ODN-5fC (lane 5), and ODN-AP (lane 6) under the same conditions.

(B) PAGE analysis of ODN-5fU after incubation with azi-BIAN and reaction with DBCO-biotin (lane 3) (dashed line) after being stained with nucleic acid stains (fluorescence mode, λex: 532 nm) when compared with other control DNAs such as ODN-T (lane 1), ODN-5hmU (lane 2), ODN-5hmC (lane 4), ODN-5fC (lane 5), and ODN-AP (lane 6) under the same conditions.

(C) RP-HPLC trace (260 nm) of ODN-5fU (black line); ODN-azi-biaU (blue line), which was generated by reaction with azi-BIAN; and ODN-azi-biaU, which was further labeled with DBCO-S-S-PEG3-biotin (ODN-biotin-U, red line).

(D) MALDI-TOF-spectra of ODN-5fU, ODN-5fU after incubation with azi-BIAN, and ODN-5fU after incubation with azi-BIAN and reaction with DBCO-biotin.

(E) Enrichment tests of a single pool of spike-in amplicons containing 5fU, 5fC, or only canonical nucleobases, using fU-Seq. Values shown are fold enrichment over canonical nucleobases. Data are represented as mean ± SD of biological triplicate.

Enriching 5fU-Containing DNA Fragments

Beyond selective labeling, the biotin installed on labeled 5fU can be used to enrich DNA fragments bearing 5fU. Because most biological samples bearing 5fU are in ds form, it was also vital to determine whether azi-BIAN selectively labels 5fU in dsDNAs. Therefore, we used a series of 80 bp dsDNAs (containing two 5fC or 5fU sites per strand or only canonical nucleosides) as a test of specificity under conditions described previously (Hardisty et al., 2015) to evaluate the enrichment efficiency. These ODNs were reacted with azi-BIAN and then biotinylated. fU-DNA was enriched over C-DNA by ∼157-fold with azi-BIAN, whereas fC-DNA was enriched over C-DNA by 1.1-fold, based on qPCR quantitation. These results confirmed that our pull-down method fU-Seq has specificity for enriching 5fU-containing DNA (Figure 2E). Because NaBH4 can reduce 5fU to 5hmU and the hydroxylamine derivative EtONH2 can react with the formyl group of 5fU, we used 80 bp dsDNA containing 5fU that had been reduced by NaBH4 or blocked by EtONH2 as controls to validate the effective enrichment of model DNA containing 5fU (Figures S6 and S7). Taken together, these experiments demonstrate that covalent chemical labeling coupled with biotin-based affinity purification ensures accurate and comprehensive capture of 5fU-containing DNA fragments.
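The fold-enrichment values above come from qPCR quantitation, but the exact calculation is not given in this section. The following is only a minimal sketch, assuming the common recovery-ratio approach (pull-down relative to input, then modified spike-in relative to canonical spike-in) and a perfect amplification efficiency of 2. The function names and all Ct values are hypothetical and are not taken from the paper.

```python
def recovery(ct_input: float, ct_pulldown: float, efficiency: float = 2.0) -> float:
    """Fraction of input template recovered in the pull-down.

    Assumes input and pull-down were measured at the same dilution and that
    the amplicon amplifies with the given per-cycle efficiency (assumption).
    """
    return efficiency ** (ct_input - ct_pulldown)


def fold_enrichment(target_cts, control_cts, efficiency: float = 2.0) -> float:
    """Recovery of a modified spike-in divided by recovery of the canonical one.

    Each argument is a (ct_input, ct_pulldown) pair.
    """
    return recovery(*target_cts, efficiency) / recovery(*control_cts, efficiency)


# Hypothetical Ct values (input, pull-down), for illustration only.
fu_dna = (20.0, 22.0)   # 5fU-containing spike-in
c_dna = (20.0, 29.0)    # canonical spike-in
print(fold_enrichment(fu_dna, c_dna))
```

A real analysis would average technical and biological replicates and use a measured efficiency rather than the idealized value of 2.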

Genome-wide Mapping of 5fU in Mouse Hippocampal Tissues

Recently, LC-tandem MS (MS/MS) quantification results indicated that 5fU levels are slightly higher in mouse hippocampus (2 × 10⁻⁶ per nucleoside) than in other tissues (Pfaffeneder et al., 2014). Thus, we performed selective labeling of 5fU in genomic DNA isolated from mouse hippocampus. Genomic DNA from mouse hippocampus was sonicated into small fragments (∼250–450 base pairs), treated with azi-BIAN to yield azido-modified 5fU-containing genomic DNA, and labeled with DBCO-biotin to install biotin (Figure 1B). Because each step is bio-orthogonal and efficient, this protocol ensures selective labeling of most 5fU sites present in genomic DNA. The presence of the introduced biotin group was confirmed by avidin-horseradish peroxidase tagging and enhanced chemiluminescence visualization to obtain a dot in the dot blot assays (Figure S8), and the HPLC-MS analyses of enzymatically digested mononucleosides from labeled genomic DNA proved that the reaction of 5fU yielded the target product (Figure S9).

Pull-down genomic DNA and an input control obtained from the same genomic DNA sample were subjected to high-throughput sequencing. We first removed adapter sequences from the sequencing reads with cutadapt (Martin, 2011) and retained only reads of acceptable sequencing quality, as assessed with FastQC (version 0.11.5, Babraham Bioinformatics), to obtain clean data. Following these steps, Bowtie2 (version 1.2.1.1) (Langmead et al., 2009) was used to map the remaining reads to the reference genome of Mus musculus (GRCm38.p5.genome, downloaded from GENCODE) in single-end alignment mode.

To determine the pull-down efficiency, we identified the peaks with read enrichment in the pull-down sample relative to the input control using HOMER (v4.9) software (Heinz et al., 2010). Using the findPeaks command with default parameters, 42,954 peaks were found across the genome, of which 39,829 peaks remained after filtering with the following criteria: fold change of pull-down versus control > 4 and p value of pull-down versus control < 10⁻⁵.
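The two filtering criteria stated above can be sketched as a simple predicate over a peak table. This is an illustration only: the dictionary keys `fold_vs_control` and `p_vs_control` are hypothetical field names, the example peaks are invented, and the sketch does not reproduce how the peak file was actually parsed.

```python
from typing import Dict, Iterable, List

FOLD_MIN = 4.0   # fold change of pull-down versus control must exceed this
P_MAX = 1e-5     # p value of pull-down versus control must be below this


def filter_peaks(peaks: Iterable[Dict]) -> List[Dict]:
    """Keep peaks that pass both the fold-change and the p-value threshold."""
    return [
        p for p in peaks
        if p["fold_vs_control"] > FOLD_MIN and p["p_vs_control"] < P_MAX
    ]


# Invented example peaks, for illustration only.
example = [
    {"id": "peak1", "fold_vs_control": 10.7, "p_vs_control": 1e-12},
    {"id": "peak2", "fold_vs_control": 3.9, "p_vs_control": 1e-9},   # fold too low
    {"id": "peak3", "fold_vs_control": 8.0, "p_vs_control": 1e-4},   # p too high
]
print([p["id"] for p in filter_peaks(example)])
```

Both comparisons are strict, matching the ">" and "<" used in the text.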

A chromosome-level analysis of 5fU-enriched peaks indicated that the 5fU sites occur in a near-uniform distribution, although their share was relatively higher in chromosomes 1 (8.28%), 2 (7.26%), and 5 (6.82%) (Figure S10). After accounting for differences in chromosome size, a markedly higher density in chrM (peak number/genome size = 1.23 × 10⁻⁴) and a much lower density in chrY (peak number/genome size = 3.27 × 10⁻⁸) were found (Figure 3A). The fold change versus control for most peaks (75%) was found to fall between 6.46 and 24.47 (Figure 3B). We further examined the distribution of 5fU sites within different genomic element groups and found that 62.12% of the sites occurred in intergenic regions, 36.02% in introns, and 1.87% in other regions, including promoters, transcriptional termination sites (TTSs), and exons (Figure 3C). Enriched peaks were inspected in the Integrative Genomics Viewer (IGV) (Robinson et al., 2011, Thorvaldsdóttir et al., 2013) using the input control and pull-down data as shown in Figure 4A. We also obtained heatmaps of both the input and pull-down data (Figure 4B) with the script annotatePeaks.pl, from which the pull-down efficiency could be calculated. This result was further confirmed by qPCR (Figure S11). These results indicated that the selective enrichment of 5fU in the genome by the fU-Seq strategy is effective.
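The two chromosome-level quantities discussed above (share of all peaks per chromosome, and peak number divided by chromosome size) can be computed as in the following sketch. The chromosome names, sizes, and peak assignments are toy values chosen for illustration, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def peak_share(peak_chroms: Iterable[str]) -> Dict[str, float]:
    """Percentage of all peaks that fall on each chromosome."""
    counts = Counter(peak_chroms)
    total = sum(counts.values())
    return {chrom: 100.0 * n / total for chrom, n in counts.items()}


def peak_density(peak_chroms: Iterable[str], chrom_sizes: Dict[str, int]) -> Dict[str, float]:
    """Peak number divided by chromosome size (peaks per base pair)."""
    counts = Counter(peak_chroms)
    return {chrom: counts.get(chrom, 0) / size for chrom, size in chrom_sizes.items()}


# Toy input: a large chromosome with many peaks and a small one with few.
chrom_sizes = {"chrA": 1_000_000, "chrB": 10_000}
peak_chroms = ["chrA"] * 8 + ["chrB"] * 2

print(peak_share(peak_chroms))
print(peak_density(peak_chroms, chrom_sizes))
```

In this toy case the small chromosome holds the minority of peaks yet has the higher density, which mirrors why a size-normalized view can differ from the raw percentages.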

Figure 3

The 5fU Peak Distribution in Mouse Hippocampal Tissues

(A) 5fU peak distribution normalized to chromosome size at the whole-genome level.

(B) Box chart of 5fU distribution based on fold change versus control. Individual marker symbols in the plot indicate the maximum value (309.98), the minimum value (4.08), the 99% value (97.87), the 1% value (4.82), and the mean (15.96). The box spans the 12.5%–87.5% range, from 6.46 to 24.47; the red line within the box is the median, equal to 10.71; the upper split line is the 95% point, equal to 44.87; and the lower split line is the 5% point, equal to 5.61.

(C) Percentage of 5fU peak lengths overlapping with genomic features.

Figure 4

fU-Seq Reveals 5fU Maps in the Whole Genome of Mouse Hippocampal Tissues

(A) Visual representation of the enrichment peak coverage of fU-Seq (below) and the input control (above).

(B) Heatmap representations of 5fU-normalized read densities (reads/million/base) across the genome. 5fU-containing read signals across the genome ranked by Reads Per Kilobase per Million mapped reads (RPKM) in default chromosome sort order. Heatmap scales correspond to normalized read densities.

(C) Distribution patterns of 5fU with respect to H3K27me3 and H3K27ac modification sites in the cerebellum.

To explore the potential genetic significance of 5fU with respect to histone modifications, we used annotatePeaks.pl (-size: 4,000; -hist: 10) to compare 5fU sites with several major existing histone modification peak datasets from brain tissues of adult M. musculus, including H3K27ac, H3K27me3, H3K4me1, and H3K4me3 (downloaded from the ENCODE database). Interestingly, the appearance of 5fU sites in the genome negatively correlated with H3K27ac modification peaks but positively correlated with H3K27me3 modification peaks (Figures 4C and S12). H3K27me3 is known for repressing transcription. These two histone modifications have been reported to be physiologically antagonistic. When H3K27 is trimethylated, it is tightly associated with inactive gene promoters, whereas acetylation of H3K27 is associated with active transcription (Barski et al., 2007, Ferrari et al., 2014). Thus, it is reasonable to speculate that 5fU sites might also play an inhibitory role in gene transcription. Further work is needed to test this speculation.
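The comparison above relies on a positional histogram: a 4,000 bp window (±2,000 bp) around each reference site, divided into 10 bp bins. The sketch below illustrates that idea in plain Python for a single chromosome. It is not HOMER's implementation, and which dataset served as the reference is not stated in this section; here the histone-mark peak centers are assumed to be the reference and the 5fU peak centers the counted positions. All coordinates are invented.

```python
from bisect import bisect_left
from typing import List, Sequence


def offset_histogram(
    ref_centers: Sequence[int],
    query_centers: Sequence[int],
    size: int = 4000,
    bin_width: int = 10,
) -> List[int]:
    """Count query positions by their offset from each reference position.

    Offsets in the half-open window [-size/2, +size/2) are binned into
    size/bin_width bins; bin 0 starts at offset -size/2.
    Positions are assumed to lie on the same chromosome.
    """
    half = size // 2
    query = sorted(query_centers)
    hist = [0] * (size // bin_width)
    for ref in ref_centers:
        lo = bisect_left(query, ref - half)
        hi = bisect_left(query, ref + half)
        for q in query[lo:hi]:
            hist[(q - ref + half) // bin_width] += 1
    return hist


# Invented coordinates, for illustration only.
histone_centers = [10_000, 50_000]
fu_centers = [9_000, 10_005, 49_990, 80_000]
hist = offset_histogram(histone_centers, fu_centers)
print(sum(hist), len(hist))
```

A profile that peaks near the central bins would suggest co-localization, whereas a dip there would suggest depletion; a real analysis would also normalize by the number of reference sites.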

Genome-wide Mapping of 5fU in Human Thyroid Carcinoma Tissues

With advances in whole-genome sequencing techniques, carcinomas have been found to be associated with modified nucleobases. Recently, researchers found that 5fU levels are about ten 5-formyldeoxyuridine per 10⁶ nucleotides in human thyroid carcinoma tissues (Jiang et al., 2017). Exploring the distribution of 5fU in cancer tissues might therefore be vital for understanding the relationship between diseases and 5fU. Encouraged by the results of genome-wide mapping of 5fU in mouse hippocampal tissues, we further analyzed the distribution of 5fU in human thyroid carcinoma tissues. Similarly, the genomic DNA was fragmented to 250–450 bp, labeled with azi-BIAN, and biotinylated for enrichment. The pull-down samples were used for library construction and NGS.

To verify that fU-Seq is reproducible, two biological replicates were prepared. Input 1 (I1) and input 2 (I2) represented the input groups, and pull-down 1 (P1) and pull-down 2 (P2) represented the pull-down groups. The NGS reads were aligned to the reference human genome (GRCh38.p7, downloaded from GENCODE). After overlapping the two biological replicates, about 950 peaks were identified (Figures S13 and S14).
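Overlapping peak sets from two replicates amounts to an interval-intersection test. The sketch below shows one simple version, assuming that a peak from replicate 1 is retained when it shares at least 1 bp with any replicate 2 peak on the same chromosome. The overlap criterion actually used is not stated in this section, and the coordinates are invented.

```python
from typing import Dict, List, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end), half-open coordinates


def overlapping_peaks(rep1: List[Peak], rep2: List[Peak]) -> List[Peak]:
    """Return replicate 1 peaks that overlap at least one replicate 2 peak."""
    by_chrom: Dict[str, List[Tuple[int, int]]] = {}
    for chrom, start, end in rep2:
        by_chrom.setdefault(chrom, []).append((start, end))

    shared = []
    for chrom, start, end in rep1:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < e and s < end for s, e in by_chrom.get(chrom, [])):
            shared.append((chrom, start, end))
    return shared


# Invented peaks, for illustration only.
p1 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
p2 = [("chr1", 150, 250), ("chr1", 600, 700), ("chr3", 100, 200)]
print(overlapping_peaks(p1, p2))
```

The pairwise scan is quadratic per chromosome, which is fine for a sketch; genome-scale peak sets are usually handled with sorted sweeps or interval trees.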

We next analyzed the distribution pattern of 5fU peaks along each chromosome (Figure 5A). The results show that the distribution of 5fU in human thyroid carcinoma tissues is uniform in most chromosomes. The fold change of pull-down versus control for most peaks (50%) was found to fall between 6.32 and 15.30 (Figure 5B). To locate 5fU sites within the genome, we examined the distribution of 5fU sites within different gene fragment groups and found that 54.17% of the sites occurred in intergenic segments and 43.02% in introns (Figures 5C and S15). We then examined whether 5fU is enriched at specific types of different genomic elements. We found that 5fU is highly enriched at low_complexity and simple_repeat but is depleted at transposable elements including long terminal repeats (LTRs), short interspersed nuclear elements, and long interspersed nuclear elements (Figure S16).
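The statement that half of the peaks fall between two fold-change values corresponds to the interquartile range shown in a box chart. A minimal sketch of such a summary using the Python standard library follows. The interpolation method used by the authors' plotting software is unknown, so the "inclusive" method here is an assumption, and the fold-change values are invented.

```python
import statistics
from typing import Dict, Sequence


def box_summary(fold_changes: Sequence[float]) -> Dict[str, float]:
    """Minimum, quartiles, maximum, and mean of a set of fold-change values."""
    q1, median, q3 = statistics.quantiles(fold_changes, n=4, method="inclusive")
    return {
        "min": min(fold_changes),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(fold_changes),
        "mean": statistics.fmean(fold_changes),
    }


# Invented fold-change values, for illustration only.
folds = [4.5, 6.3, 8.0, 10.2, 15.3, 40.0]
summary = box_summary(folds)
print(summary["q1"], summary["q3"])
```

Because fold-change distributions of this kind are right-skewed, the mean typically sits above the median, which is why both are worth reporting.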

Figure 5

The 5fU Peaks Distribution in Human Thyroid Carcinoma Tissues

(A) The average distribution of merged 5fU peaks (the peaks in pull-down 1 were merged with the peaks in pull-down 2) calculated from two biological replicates, normalized to chromosome size at the whole-genome level, using the two sets of input data (I1 and I2) as controls, respectively.

(B) Box chart showing the distribution of four sets of 5fU peaks calculated from two pull-down data (P1, P2) and two input data (I1, I2) based on fold change versus control. ● is the maximum value, ■ is the minimum value, ◆ is the mean, the box spans the interquartile (25%–75%) range, the line within is the median, the upper split line is the 90% point, and the lower split line is the 10% point. I1, input 1; I2, input 2; P1, pull-down 1; P2, pull-down 2.

(C) Percentage of 5fU peak lengths overlapping with genomic features.

Enriched peaks were visualized in the IGV using the input control and pull-down data as shown in Figure 6A. To determine the pull-down efficiency, the heatmap results were obtained (Figures 6B and S17). We also validated the 5fU-specific enrichments observed in the peaks with qPCR (Figure S18). All these results showed that the selective enrichment of 5fU in the genome by the fU-Seq strategy is effective. We further investigated whether 5fU in human thyroid carcinoma tissues is associated with histone modifications or protein-binding sites of genetic significance. Using annotatePeaks.pl (-size: 4,000; -hist: 2), we compared 5fU sites with several major existing histone modification peak and protein-binding site datasets from adult thyroid gland tissues, including H3K27ac, H3K27me3, H3K4me1, H3K4me3, and POLR2A (all downloaded from the ENCODE database). Surprisingly, the appearance of 5fU sites in the genome negatively correlated with H3K4me1 modification peaks, whereas it positively correlated with binding sites of POLR2A protein (Figure 6C). H3K4me1 is enriched at active enhancers and acts as a marker for many cell-type-specific enhancer sites (Creyghton et al., 2010, Rada-Iglesias, 2018, Shen et al., 2016). Hence, it is tempting to speculate that 5fU sites in human thyroid carcinoma tissues are indicative of suppressed enhancer overactivation. In addition, POLR2A has been reported to encode the largest and catalytic subunit of the RNA polymerase II complex (Bradner, 2015, Liu et al., 2015). Because 5fU sites may act as a potential repressor, they may be distributed around POLR2A-binding sites. It is reasonable to speculate that the existence of 5fU in human thyroid carcinoma may impede protein binding to DNA and thereby inhibit the transcription of some specific genes. Further work is needed to study the causal relationship between 5fU and cancers.

Figure 6

fU-Seq Reveals 5fU Maps in the Whole Genome of Human Thyroid Carcinoma Tissues

(A) Visual representation of the enrichment peak coverage of fU-Seq and the input control. I1, input 1; I2, input 2; P1, pull-down 1; P2, pull-down 2.

(B) Heatmap representations of 5fU-normalized read densities (reads/million/base) across the genome. 5fU-containing read signals across the genome ranked by RPKM in default chromosome sort order. Heatmap scales correspond to normalized read densities.

(C) The correlation between 5fU sites with H3K4me1 modification and POLR2A in human thyroid carcinoma tissues.

Discussion

The current study developed the first selective and efficient approach (fU-Seq) to label and capture 5fU from mouse and human genomic DNA and investigated the relationship between 5fU and histone modifications. We have demonstrated the feasibility of using this method to determine the genome-wide distribution of 5fU. Intriguingly, most 5fU sites were found in intergenic regions and introns. In addition, the analysis of histone modifications and 5fU sites suggested that 5fU might play an inhibitory role in gene transcription. Also, the distribution of 5fU in human thyroid carcinoma tissues is positively correlated with binding sites of POLR2A protein, which indicates that 5fU may be distributed around POLR2A-binding sites. Moreover, Zhou et al. and Tretyakova et al. recently reported that the aldehyde present in 5fC can conjugate with histones to yield DNA-protein cross-links (Ji et al., 2017, Li et al., 2017). The aldehyde present in 5fU is more reactive than that in 5fC (Habgood et al., 2011, Hardisty et al., 2015) and might likewise conjugate with histones to yield DNA-protein cross-links and influence transcriptional regulation and chromatin remodeling. Although the method we propose cannot achieve single-base-resolution analysis of 5fU, we believe that this goal can be realized in the future, in light of the recent single-base-resolution detection method for 5hmU reported by Balasubramanian's group (Kawasaki et al., 2018). Further study and application of this method will advance our understanding of the role played by 5fU modification in genomics.

Limitations of the Study

Although this method can profile the distribution of 5fU in genomic DNA, it does not have single-base resolution. In addition, nonspecific enrichment of DNA during the pull-down process is unavoidable. We are currently working toward single-base-resolution analysis of 5fU, following the recent report by Balasubramanian's group (Kawasaki et al., 2018). Combined with single-base-resolution analysis, nonspecific enrichment of DNA could be ruled out.

External Data

H3K27ac (ENCSR000CCE, ENCSR000CDC, ENCSR000CDD), H3K27me3 (ENCSR000CFM), H3K4me1 (ENCSR000CCF, ENCSR000CAL, ENCSR000CAI, ENCFF710BOL), H3K4me3 (ENCSR000CAK, ENCSR000CAJ, ENCSR000CDS), and POLR2A (ENCFF079QGD) chromatin immunoprecipitation sequencing datasets were obtained from the ENCODE Project database.

Methods

All methods can be found in the accompanying Transparent Methods supplemental file.

Acknowledgments

We thank the National Natural Science Foundation of China (21432008, 91753201, and 21721005 to X. Zhou) and the China Postdoctoral Innovative Talent Support Program (No. BX20180228 to Y.W.). The numerical calculations in this article were performed on the supercomputing system in the Supercomputing Center of Wuhan University. We also thank Dr. Haifang Li (Analysis Center, Tsinghua University), who provided DNA MALDI-TOF test instructions.

Author Contributions

X. Zhou, Y.W., and C.L. conceived the original idea and designed the experiments with the help of Xiong Zhang and Z.W.; Y.W., C.L., F.W., and Xiong Zhang performed the experiments; F.W., Z.C., W.Z., Y.Z., X.W., and Z.W. performed bioinformatics analysis; C.L., Xiong Zhang, and W.Y. synthesized the chemicals; S.L. and Xiaolian Zhang helped with the tissues; X. Zhou, Y.W., C.L., and F.W. wrote the paper.

Declaration of Interests

Four of the authors are applying for a Chinese patent. The authors declare no other competing interests.

Published: November 30, 2018

Footnotes

Supplemental Information includes Transparent Methods, 18 figures, 2 schemes, and 5 tables and can be found with this article online at https://doi.org/10.1016/j.isci.2018.10.024.

Data and Software Availability

Sequencing data have been deposited into the Gene Expression Omnibus (GEO). The accession number is GSE115918.

Supplemental Information

Document S1. Transparent Methods, Figures S1–S18, Schemes S1 and S2, and Tables S1–S5

References

  1. Barski A., Cuddapah S., Cui K., Roh T.-Y., Schones D.E., Wang Z., Wei G., Chepelev I., Zhao K. High-resolution profiling of histone methylations in the human genome. Cell. 2007;129:823–837. doi: 10.1016/j.cell.2007.05.009. [DOI] [PubMed] [Google Scholar]
  2. Booth M.J., Raiber E.-A., Balasubramanian S. Chemical methods for decoding cytosine modifications in DNA. Chem. Rev. 2015;115:2240–2254. doi: 10.1021/cr5002904. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Bradner J.E. An essential passenger with p53. Nature. 2015;520:626. doi: 10.1038/nature14390. [DOI] [PubMed] [Google Scholar]
  4. Chen Y., Hong T., Wang S., Mo J., Tian T., Zhou X. Epigenetic modification of nucleic acids: from basic studies to medical applications. Chem. Soc. Rev. 2017;46:2844–2872. doi: 10.1039/c6cs00599c. [DOI] [PubMed] [Google Scholar]
  5. Creyghton M.P., Cheng A.W., Welstead G.G., Kooistra T., Carey B.W., Steine E.J., Hanna J., Lodato M.A., Frampton G.M., Sharp P.A. Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc. Natl. Acad. Sci. U S A. 2010;107:21931. doi: 10.1073/pnas.1016071107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Decarroz C., Wagner J.R., Van Lier J.E., Krishna C.M., Riesz P., Cadet J. Sensitized photo-oxidation of thymidine by 2-methyl-1,4-naphthoquinone. Characterization of the stable photoproducts. Int. J. Radiat. Biol. Relat. Stud. Phys. Chem. Med. 1986;50:491–505. doi: 10.1080/09553008614550901. [DOI] [PubMed] [Google Scholar]
  7. Ferrari Karin J., Scelfo A., Jammula S., Cuomo A., Barozzi I., Stützer A., Fischle W., Bonaldi T., Pasini D. Polycomb-dependent H3K27me1 and H3K27me2 regulate active transcription and enhancer fidelity. Mol. Cell. 2014;53:49–62. doi: 10.1016/j.molcel.2013.10.030. [DOI] [PubMed] [Google Scholar]
  8. Habgood M., Price S.L., Portalone G., Irrera S. Testing a variety of electronic-structure-based methods for the relative energies of 5-formyluracil crystals. J. Chem. Theory Comput. 2011;7:2685–2688. doi: 10.1021/ct200354t. [DOI] [PubMed] [Google Scholar]
  9. Hardisty R.E., Kawasaki F., Sahakyan A.B., Balasubramanian S. Selective chemical labeling of natural T modifications in DNA. J. Am. Chem. Soc. 2015;137:9270–9272. doi: 10.1021/jacs.5b03730. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Heinz S., Benner C., Spann N., Bertolino E., Lin Y.C., Laslo P., Cheng J.X., Murre C., Singh H., Glass C.K. Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities. Mol. Cell. 2010;38:576–589. doi: 10.1016/j.molcel.2010.05.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Hirose W., Sato K., Matsuda A. Selective detection of 5-formyl-2′-deoxyuridine, an oxidative lesion of thymidine, in DNA by a fluorogenic reagent. Angew. Chem. Int. Ed. 2010;122:8570–8572. doi: 10.1002/anie.201004087. [DOI] [PubMed] [Google Scholar]
  12. Hong H., Cao H., Wang Y., Wang Y. Identification and quantification of a guanine−thymine intrastrand cross-link lesion induced by Cu(II)/H2O2/ascorbate. Chem. Res. Toxicol. 2006;19:614–621. doi: 10.1021/tx060025x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Hong H., Wang Y. Derivatization with girard reagent T combined with LC−MS/MS for the sensitive detection of 5-Formyl-2‘-deoxyuridine in cellular DNA. Anal. Chem. 2007;79:322–326. doi: 10.1021/ac061465w. [DOI] [PMC free article] [PubMed] [Google Scholar]
  14. Hong T., Yuan Y., Chen Z., Xi K., Wang T., Xie Y., He Z., Su H., Zhou Y., Tan Z.-J. Precise antibody-independent m6A identification via 4SedTTP-involved and FTO-assisted strategy at single-nucleotide resolution. J. Am. Chem. Soc. 2018;140:5886–5889. doi: 10.1021/jacs.7b13633.
  15. Iwan K., Rahimoff R., Kirchner A., Spada F., Schröder A.S., Kosmatchev O., Ferizaj S., Steinbacher J., Parsa E., Müller M. 5-Formylcytosine to cytosine conversion by C–C bond cleavage in vivo. Nat. Chem. Biol. 2017;14:72–78. doi: 10.1038/nchembio.2531.
  16. Jackson S.P., Bartek J. The DNA-damage response in human biology and disease. Nature. 2009;461:1071–1078. doi: 10.1038/nature08467.
  17. Ji S., Shao H., Han Q., Seiler C.L., Tretyakova N.Y. Reversible DNA–protein cross-linking at epigenetic DNA marks. Angew. Chem. Int. Ed. 2017;56:14130–14134. doi: 10.1002/anie.201708286.
  18. Jiang H.-P., Liu T., Guo N., Yu L., Yuan B.-F., Feng Y.-Q. Determination of formylated DNA and RNA by chemical labeling combined with mass spectrometry analysis. Anal. Chim. Acta. 2017;981:1–10. doi: 10.1016/j.aca.2017.06.009.
  19. Johnson R.P., Fleming A.M., Perera R.T., Burrows C.J., White H.S. Dynamics of a DNA mismatch site held in confinement discriminate epigenetic modifications of cytosine. J. Am. Chem. Soc. 2017;139:2750–2756. doi: 10.1021/jacs.6b12284.
  20. Kasai H., Iida A., Yamaizumi Z., Nishimura S., Tanooka H. 5-Formyldeoxyuridine: a new type of DNA damage induced by ionizing radiation and its mutagenicity to Salmonella strain TA102. Mutat. Res. 1990;243:249–253. doi: 10.1016/0165-7992(90)90139-b.
  21. Kawasaki F., Beraldi D., Hardisty R.E., McInroy G.R., van Delft P., Balasubramanian S. Genome-wide mapping of 5-hydroxymethyluracil in the eukaryote parasite Leishmania. Genome Biol. 2017;18:23. doi: 10.1186/s13059-017-1150-1.
  22. Kawasaki F., Martínez Cuesta S., Beraldi D., Mahtey A., Hardisty R.E., Carrington M., Balasubramanian S. Sequencing 5-hydroxymethyluracil at single-base resolution. Angew. Chem. Int. Ed. 2018;130:9842–9844. doi: 10.1002/anie.201804046.
  23. Kawasaki F., Murat P., Li Z., Santner T., Balasubramanian S. Synthesis and biophysical analysis of modified thymine-containing DNA oligonucleotides. Chem. Commun. (Camb) 2017;53:1389–1392. doi: 10.1039/c6cc08670e.
  24. Kitsera N., Allgayer J., Parsa E., Geier N., Rossa M., Carell T., Khobta A. Functional impacts of 5-hydroxymethylcytosine, 5-formylcytosine, and 5-carboxycytosine at a single hemi-modified CpG dinucleotide in a gene promoter. Nucleic Acids Res. 2017;45:11033–11042. doi: 10.1093/nar/gkx718.
  25. Kittaka A., Takayama H., Kurihara M., Horii C., Tanaka H., Miyasaka T., Inoue J.-I. DNA sequence recognition by NFκB p50 homodimer: strict and obscure recognition sites in the binding sequence. Nucleosides Nucleotides Nucleic Acids. 2001;20:669–672. doi: 10.1081/NCN-100002347.
  26. Langmead B., Trapnell C., Pop M., Salzberg S.L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10:R25. doi: 10.1186/gb-2009-10-3-r25.
  27. Li F., Zhang Y., Bai J., Greenberg M.M., Xi Z., Zhou C. 5-Formylcytosine yields DNA–protein cross-links in nucleosome core particles. J. Am. Chem. Soc. 2017;139:10617–10620. doi: 10.1021/jacs.7b05495.
  28. Liu C., Chen Y., Wang Y., Wu F., Zhang X., Yang W., Wang J., Chen Y., He Z., Zou G. A highly efficient fluorescence-based switch-on detection method of 5-formyluracil in DNA. Nano Res. 2017;10:2449–2458.
  29. Liu C., Wang Y., Zhang X., Wu F., Yang W., Zou G., Yao Q., Wang J., Chen Y., Wang S. Enrichment and fluorogenic labelling of 5-formyluracil in DNA. Chem. Sci. 2017;8:4505–4510. doi: 10.1039/c7sc00637c.
  30. Liu C., Zou G., Peng S., Wang Y., Yang W., Wu F., Jiang Z., Zhang X., Zhou X. 5-Formyluracil as a multifunctional building block in biosensor designs. Angew. Chem. Int. Ed. 2018;130:9837–9841. doi: 10.1002/anie.201804007.
  31. Liu M.Y., DeNizio J.E., Schutsky E.K., Kohli R.M. The expanding scope and impact of epigenetic cytosine modifications. Curr. Opin. Chem. Biol. 2016;33:67–73. doi: 10.1016/j.cbpa.2016.05.029.
  32. Liu Y., Zhang X., Han C., Wan G., Huang X., Ivan C., Jiang D., Rodriguez-Aguayo C., Lopez-Berestein G., Rao P.H. TP53 loss creates therapeutic vulnerability in colorectal cancer. Nature. 2015;520:697–701. doi: 10.1038/nature14418.
  33. Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 2011;17:10–12. Next Generation Sequencing Data Analysis.
  34. Modrzejewska M., Gawronski M., Skonieczna M., Zarakowska E., Starczak M., Foksinski M., Rzeszowska-Wolny J., Gackowski D., Olinski R. Vitamin C enhances substantially formation of 5-hydroxymethyluracil in cellular DNA. Free Radic. Biol. Med. 2016;101:378–383. doi: 10.1016/j.freeradbiomed.2016.10.535.
  35. Pais J.E., Dai N., Tamanaha E., Vaisvila R., Fomenkov A.I., Bitinaite J., Sun Z., Guan S., Corrêa I.R., Noren C.J. Biochemical characterization of a Naegleria TET-like oxygenase and its application in single molecule sequencing of 5-methylcytosine. Proc. Natl. Acad. Sci. U S A. 2015;112:4316–4321. doi: 10.1073/pnas.1417939112.
  36. Pfaffeneder T., Spada F., Wagner M., Brandmayr C., Laube S.K., Eisen D., Truss M., Steinbacher J., Hackner B., Kotljarova O. Tet oxidizes thymine to 5-hydroxymethyluracil in mouse embryonic stem cell DNA. Nat. Chem. Biol. 2014;10:574–581. doi: 10.1038/nchembio.1532.
  37. Rada-Iglesias A. Is H3K4me1 at enhancers correlative or causative? Nat. Genet. 2018;50:4–5. doi: 10.1038/s41588-017-0018-3.
  38. Robinson J.T., Thorvaldsdóttir H., Winckler W., Guttman M., Lander E.S., Getz G., Mesirov J.P. Integrative genomics viewer. Nat. Biotechnol. 2011;29:24–26. doi: 10.1038/nbt.1754.
  39. Rogstad D.K., Heo J., Vaidehi N., Goddard W.A., Burdzy A., Sowers L.C. 5-Formyluracil-induced perturbations of DNA function. Biochemistry. 2004;43:5688–5697. doi: 10.1021/bi030247j.
  40. Samanta B., Seikowski J., Höbartner C. Fluorogenic labeling of 5-formylpyrimidine nucleotides in DNA and RNA. Angew. Chem. Int. Ed. 2015;55:1912–1916. doi: 10.1002/anie.201508893.
  41. Shen H., Xu W., Guo R., Rong B., Gu L., Wang Z., He C., Zheng L., Hu X., Hu Z. Suppression of enhancer overactivation by a RACK7-histone demethylase complex. Cell. 2016;165:331–342. doi: 10.1016/j.cell.2016.02.064.
  42. Shen L., Song C.-X., He C., Zhang Y. Mechanism and function of oxidative reversal of DNA and RNA methylation. Annu. Rev. Biochem. 2014;83:585–614. doi: 10.1146/annurev-biochem-060713-035513.
  43. Shu X., Liu M., Lu Z., Zhu C., Meng H., Huang S., Zhang X., Yi C. Genome-wide mapping reveals that deoxyuridine is enriched in the human centromeric DNA. Nat. Chem. Biol. 2018;14:680–687. doi: 10.1038/s41589-018-0065-9.
  44. Song C.-X., Clark T.A., Lu X.-Y., Kislyuk A., Dai Q., Turner S.W., He C., Korlach J. Sensitive and specific single-molecule sequencing of 5-hydroxymethylcytosine. Nat. Methods. 2011;9:75–77. doi: 10.1038/nmeth.1779.
  45. Song C.-X., Szulwach K.E., Dai Q., Fu Y., Mao S.-Q., Lin L., Street C., Li Y., Poidevin M., Wu H. Genome-wide profiling of 5-formylcytosine reveals its roles in epigenetic priming. Cell. 2013;153:678–691. doi: 10.1016/j.cell.2013.04.001.
  46. Suzuki M.M., Bird A. DNA methylation landscapes: provocative insights from epigenomics. Nat. Rev. Genet. 2008;9:465–476. doi: 10.1038/nrg2341.
  47. Thorvaldsdóttir H., Robinson J.T., Mesirov J.P. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief. Bioinform. 2013;14:178–192. doi: 10.1093/bib/bbs017.
  48. Wagner M., Steinbacher J., Kraus T.F.J., Michalakis S., Hackner B., Pfaffeneder T., Perera A., Müller M., Giese A., Kretzschmar H.A. Age-dependent levels of 5-methyl-, 5-hydroxymethyl-, and 5-formylcytosine in human and mouse brain tissues. Angew. Chem. Int. Ed. 2015;54:12511–12514. doi: 10.1002/anie.201502722.
  49. Wang Y., Liu C., Yang W., Zou G., Zhang X., Wu F., Yu S., Luo X., Zhou X. Naphthalimide derivatives as multifunctional molecules for detecting 5-formylpyrimidine by both PAGE analysis and dot-blot assays. Chem. Commun. (Camb) 2018;54:1497–1500. doi: 10.1039/c7cc08715b.
  50. Wang Y., Liu C., Zhang X., Yang W., Wu F., Zou G., Weng X., Zhou X. Gene specific-loci quantitative and single-base resolution analysis of 5-formylcytosine by compound-mediated polymerase chain reaction. Chem. Sci. 2018;9:3723–3728. doi: 10.1039/c8sc00493e.
  51. Wu X., Zhang Y. TET-mediated active DNA demethylation: mechanism, function and beyond. Nat. Rev. Genet. 2017;18:517–534. doi: 10.1038/nrg.2017.33.
  52. Wyrick J.J., Roberts S.A. Genomic approaches to DNA repair and mutagenesis. DNA Repair (Amst) 2015;36:146–155. doi: 10.1016/j.dnarep.2015.09.018.
  53. Yoshida M., Makino K., Morita H., Terato H., Ohyama Y., Ide H. Substrate and mispairing properties of 5-formyl-2′-deoxyuridine 5′-triphosphate assessed by in vitro DNA polymerase reactions. Nucleic Acids Res. 1997;25:1570–1577. doi: 10.1093/nar/25.8.1570.
  54. Zhu C., Gao Y., Guo H., Xia B., Song J., Wu X., Zeng H., Kee K., Tang F., Yi C. Single-cell 5-formylcytosine landscapes of mammalian early embryos and ESCs at single-base resolution. Cell Stem Cell. 2017;20:720–731.e5. doi: 10.1016/j.stem.2017.02.013.

Associated Data

Supplementary Materials

Document S1. Transparent Methods, Figures S1–S18, Schemes S1 and S2, and Tables S1–S5

Articles from iScience are provided here courtesy of Elsevier