Author manuscript; available in PMC: 2020 Jan 10.
Published in final edited form as: Cell. 2018 Dec 13;176(1-2):144–153.e13. doi: 10.1016/j.cell.2018.10.055

HMCES maintains genome integrity by shielding abasic sites in single strand DNA

Kareem N Mohni 1, Sarah R Wessel 1, Runxiang Zhao 1, Andrea C Wojciechowski 1, Jessica W Luzwick 1, Hillary Layden 1, Brandt F Eichman 1,2, Petria S Thompson 1, Kavi P M Mehta 1, David Cortez 1,*
PMCID: PMC6329640  NIHMSID: NIHMS1511529  PMID: 30554877

Summary

Abasic sites are one of the most common DNA lesions. All of the known abasic site repair mechanisms operate only when the damage is in double-stranded DNA. Here we report the discovery of HMCES as a sensor of abasic sites in single-stranded DNA. HMCES acts at replication forks, binds PCNA and single-stranded DNA, and generates a DNA-protein crosslink to shield abasic sites from error-prone processing. This unusual HMCES DNA-protein crosslink intermediate is resolved by proteasome-mediated degradation. Acting as a suicide enzyme, HMCES prevents translesion DNA synthesis and the action of endonucleases that would otherwise generate mutations and double-strand breaks. HMCES is evolutionarily conserved in all domains of life and its biochemical properties are shared with its E. coli ortholog. Thus, HMCES is an ancient DNA lesion recognition protein that preserves genome integrity by promoting error free repair of abasic sites in single-stranded DNA.

Keywords: HMCES, SRAP, DNA replication, DNA repair, replication stress, translesion DNA synthesis, DNA-protein crosslink, 5-hydroxymethylcytosine, REV1, PCNA

In brief

An enzyme conserved from bacteria to eukaryotes can covalently bind to abasic sites in single-stranded DNA to shield them and prevent genomic instability

Introduction

Apurinic/apyrimidinic sites, also known as abasic or AP sites, are one of the most common DNA lesions with a frequency of ~18,000 per day in human cells (Friedberg et al., 2006). They are generated by spontaneous base loss and after base damage from both endogenous and exogenous sources (Dianov et al., 2003; Friedberg et al., 2006). For example, ionizing radiation (IR) and ultraviolet (UV) radiation generate hydroxyl radicals that cause oxidative base damage (Lindahl, 1993). Specialized glycosylases remove these oxidized bases, generating an AP site (Krokan and Bjoras, 2013). Alkylating agents such as methyl methanesulfonate (MMS) also cause AP sites since base alkylation weakens the N-glycosyl bond (Friedberg et al., 2006; Lindahl, 1993). The majority of AP sites are removed from double-stranded DNA (dsDNA) by base excision repair (BER) with nucleotide excision repair (NER) acting as a backup pathway (Friedberg et al., 2006; Krokan and Bjoras, 2013). After removal of the lesion, the undamaged strand is used as a template for repair synthesis.

AP sites also form in single-stranded DNA (ssDNA). For example, cytosine deamination to uracil occurs 2–3 orders of magnitude faster in ssDNA than in dsDNA and can also be catalyzed by APOBEC enzymes (Kavli et al., 2007). Removal of the uracil by Uracil DNA Glycosylase (UDG) generates an AP site, and translesion synthesis (TLS) across the AP site causes mutations, producing one of the mutational signatures in cancer that result from APOBEC activity (Nik-Zainal et al., 2012; Sale, 2013). Alternatively, if ssDNA AP sites were acted on by AP endonucleases, the result would be a dsDNA break (DSB). There is no known mechanism for repairing AP sites in the context of ssDNA.

5-hydroxymethylcytosine (5hmC) binding, ES cell-specific (HMCES) was originally identified as a possible reader of 5hmC in embryonic stem cell extracts using a double-stranded DNA molecule containing 5hmC as bait (Spruijt et al., 2013). A HMCES protein is encoded in almost all organisms in every domain of life and several viruses (Aravind et al., 2013). HMCES orthologs are genetically linked to bacterial SOS response genes (the bacterial DNA damage response) (Aravind et al., 2013) and expression of the E. coli ortholog, yedK, closely parallels that of DinB, a DNA damage-induced polymerase (Hu et al., 2009). These proteins contain a single SOS response associated peptidase (SRAP) domain (Aravind et al., 2013) and are reported to act as a nuclease on 5hmC containing DNA (Kweon et al., 2017).

Here we report the discovery of HMCES as a sensor and shield of AP sites in ssDNA at replication forks. HMCES interacts with PCNA and travels with replication forks where ssDNA AP sites are likely most common and detrimental. HMCES covalently crosslinks to ssDNA AP sites generating a DNA-protein crosslink (DPC) intermediate and effectively shields the lesion from endonucleases and TLS polymerases. Thus, HMCES-deficient cells exhibit delayed AP site repair, accumulate DNA damage, are hypersensitive to genotoxins that generate AP sites, and have increased genetic instability.

Results

HMCES localizes to replication forks and interacts with PCNA

We previously identified HMCES as a protein modestly enriched in purifications of nascent DNA (Dungrawala et al., 2015; Sirbu et al., 2011). To validate that HMCES is at replication forks, we performed quantitative iPOND (isolation of proteins on nascent DNA) SILAC-mass spectrometry in HEK293T, HeLa, HCT116 and RPE-hTERT cells and found that HMCES is enriched at forks to similar levels as the MCM2–7 helicase complex (Figure 1A). Furthermore, HMCES knockout (HMCESΔ) cells are hypersensitive to inhibition of the replication checkpoint kinase ATR (Figure 1B and C). Gene products required to survive ATR inhibition often function in DNA replication or repair suggesting a function for HMCES in these processes (Mohni et al., 2014; Mohni et al., 2015).

Figure 1. HMCES is a ssDNA binding, replication fork associated protein.


(A) iPOND-SILAC-mass spectrometry derived nascent DNA/bulk chromatin abundance ratios for selected proteins or complexes are depicted. Mean+/−SD, n=3 or 4 for each cell line. (B) Immunoblot of wild type and HMCESΔ U2OS cells. (C) Clonogenic survival assay of cells treated with ATR inhibitor VX-970 for 24 hours. Mean+/−SD, n=3, ANOVA with Dunnett posttest. (D) Schematic diagram of HMCES and E. coli yedK. See also Figure S1. (E) Surface charge and catalytic pocket of the SRAP domain from human HMCES (PDB: 5KO9). (F-I) Electrophoretic mobility shift analysis of the indicated DNA ligands incubated with (F) GST-HMCES, (G) yedK-HIS, (H) wild type and mutant HMCES proteins after removal of the GST tag, or (I) wild type and mutant yedK-HIS proteins. See also Figure S2, Table S3, and Table S4.

Human HMCES contains a single domain called SRAP, which is 24.2% identical and 38.5% similar to the SRAP domain of E. coli yedK (Figures 1D and S1). SRAP proteins contain an invariant cysteine at position two, as well as a conserved histidine and glutamic acid that sit within a putative catalytic pocket (Figures 1E and S1). SRAP domains have a positively charged surface adjacent to this pocket (Figure 1E). We measured DNA binding using electrophoretic mobility shift assays with purified recombinant HMCES and yedK. Both proteins bound DNA with a strong preference for ssDNA over dsDNA ligands (Figures 1F,G and S2). DNA binding is dependent on conserved arginines on the positively charged surface (Figures 1H, I and S2). At high protein concentrations we can observe more than one DNA-protein complex, which we interpret as multiple protein binding events on ligands with sufficient ssDNA to accommodate more than one HMCES molecule.

HMCES is expressed at higher levels in S-phase than quiescent cells (Figure 2A) and is recruited to chromatin after DNA damage specifically during S-phase (Figure 2B). Furthermore, the replisome protein PCNA is consistently identified by mass spectrometry in HMCES immunopurifications (Figure 2C), and we observed an interaction between HMCES and PCNA in cells using proximity ligation assays (Figure 2D). The C-terminal regions of vertebrate HMCES proteins contain a putative PCNA interacting peptide (PIP) box similar to the PIP boxes in TLS polymerases (Figure 2E and F) (Mailand et al., 2013). Truncating the C-terminus or mutation of these residues abrogated the HMCES-PCNA interaction (Figure 2F and G). Furthermore, far western assays demonstrate that the interaction between HMCES and PCNA is direct and dependent on the PIP box but not the putative catalytic cysteine (C2) or DNA binding surface residues (R98) (Figure 2H). We conclude that HMCES is a replication fork protein that interacts directly with PCNA.

Figure 2. HMCES localizes to chromatin in S-phase cells exposed to DNA damage and interacts with PCNA.


(A-B) RPE-hTERT cells that were contact inhibited (G0) then released for 20 hours to synchronize in S-phase (S) were compared to asynchronously growing cells. Immunoblots of (A) total cell lysates or (B) chromatin and soluble fractions. Cells were treated with 100 J/m2 UV and allowed to recover for 3h where indicated. (C) Flag-HMCES interacting proteins identified by mass spectrometry. Number of peptides identified for each protein in three experiments is indicated. Control cells do not express any tagged protein. (D) Proximity ligation assay with PCNA and HMCES antibodies. Scale bar is 10 μm. (E) Sequence alignment of the PIP box in HMCES compared to TLS polymerases. (F) Schematic diagram of HMCES indicating the location of the PIP box and the truncation mutants tested in G and H. (G) Flag-HMCES immunoprecipitates immunoblotted with Flag or PCNA antibodies. (H) Purified HMCES proteins were separated by SDS-PAGE and transferred to nitrocellulose. Membranes were incubated with purified PCNA and anti-PCNA antibodies.

HMCES is a replication stress response protein

HMCES was identified as a possible reader of 5-hydroxymethylcytosine (5hmC) in other studies (Kweon et al., 2017; Spruijt et al., 2013). However, we did not observe any changes in total amounts of 5hmC or 5-methylcytosine (5mC) in DNA from HMCESΔ or HMCES-overexpressing cells (Figure S3A and B). HMCES also does not have a strong preference for binding ssDNA or dsDNA ligands containing 5hmC or 5mC in CpG contexts compared to unmodified DNA (Figure S3C and D). Since 5mC regulates transcription (Jaenisch and Bird, 2003), we performed RNA sequencing, but found that the expression of only 22 genes was significantly altered in HMCESΔ cells (Table S1). Notably, these include upregulation of the p53 responsive genes CDKN1A, TRIM22, and RRM2B (Figure S3E). We confirmed that HMCESΔ cells have elevated p53 and p21 protein even with no added genotoxic stress (Figure S3F). Increased DNA damage or replication stress would explain both the hypersensitivity to ATR inhibition and elevated p53 response in HMCESΔ cells. Consistent with this hypothesis, HMCESΔ cells have an increase in 53BP1 nuclear bodies in G1 phase cells (Figure S3G), a hallmark of DNA damage that persists from the previous S phase (Harrigan et al., 2011; Lukas et al., 2011). HMCESΔ cells did not exhibit marked changes in cell cycle distribution but had slower doubling times than the wild type cells (Figure S3H and I). Thus, HMCES is not a major regulator of gene expression or epigenetic marks in these cells, but HMCES inactivation causes elevated levels of DNA damage.

To examine a potential function in DNA repair, we tested how HMCESΔ cells respond to DNA damaging agents. HMCES-deficient cells are hypersensitive to IR, MMS, and shortwave UV radiation (Figures 3A and S4A). The IR hypersensitivity could largely be rescued by reexpression of wild type protein, but not the catalytic (C2A), DNA binding (R98E), or PIP (WL/AA) mutants (Figure 3B and C). Cells with the R98E mutation engineered into the endogenous HMCES alleles are also hypersensitive to IR (Figure 3D and S4B). Defects in double-strand break (DSB) repair often confer hypersensitivity to IR, poly (ADP-ribose) polymerase (PARP) inhibitors, crosslinking agents, and topoisomerase inhibitors. However, HMCESΔ cells are not sensitive to PARP inhibition, cisplatin, or camptothecin (Figures 3E and S4C). HMCESΔ cells also had no apparent defects in homologous recombination or nonhomologous end joining repair of DSBs (Figure S4D-G). Therefore, we conclude that HMCES is not a DSB repair protein.

Figure 3. HMCESΔ cells are hypersensitive to radiation and MMS.


(A) Clonogenic survival assay of wild type and HMCESΔ U2OS cells or siRNA-transfected cells treated with IR, UV, or MMS (siNT=non-targeting siRNA). (B) Immunoblot and (C) clonogenic survival analyses of HMCESΔ cells complemented with wild type and mutant HMCES. (D) IR sensitivity of wild type, HMCESΔ or the R98E mutation engineered into the HMCES genomic locus. (E) PARP inhibitor (BMN673) sensitivity of wild-type and HMCESΔ cells. All graphs show mean+/−SD, n=3, ANOVA with Dunnett post-test. See also Figures S3, S4, and Table S1.
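The survival values plotted in clonogenic assays such as these are conventionally computed by correcting colony counts for plating efficiency. The following is a minimal sketch of that conventional calculation, not a description of the exact procedure used here; the function names and all counts are hypothetical placeholders.

```python
def plating_efficiency(colonies, cells_seeded):
    """Fraction of seeded, untreated cells that grow into colonies."""
    return colonies / cells_seeded

def surviving_fraction(colonies, cells_seeded, pe_untreated):
    """Colonies after treatment, corrected for untreated plating efficiency."""
    return colonies / (cells_seeded * pe_untreated)

# Placeholder counts (assumed for illustration only, not data from this study).
pe = plating_efficiency(colonies=120, cells_seeded=200)
sf = surviving_fraction(colonies=90, cells_seeded=400, pe_untreated=pe)
print(f"percent survival = {100 * sf:.1f}")
```

Normalizing each cell line to its own untreated control in this way keeps differences in baseline growth, such as the slower doubling time of HMCESΔ cells, from being read as drug sensitivity.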

HMCES crosslinks to AP sites and promotes AP site resolution

IR, UV, and MMS are often used as tools to study three distinct DNA repair mechanisms: DSB repair, NER, and BER, respectively. In addition to double-strand breaks and methylated bases, IR and MMS generate abasic sites through oxidative DNA damage and loss of the methylated base, respectively (Friedberg et al., 2006). While short wavelength UV (UVC) is typically used to induce cyclobutane pyrimidine dimers and 6–4 photoproducts, it can also generate AP sites (Gorner, 1994; Kuluncsics et al., 1999; Mitchell et al., 1991). We irradiated cells with a UV light box equipped with UVC bulbs and confirmed a significant increase in AP sites (Figure S5A). Thus, we hypothesized that AP sites could be the common DNA lesion that underlies the hypersensitivity of HMCESΔ cells to DNA damaging agents. To directly test if HMCES is required to resolve AP sites, we monitored global levels of AP sites in HMCESΔ cells using the aldehyde reactive probe (ARP) conjugated to biotin. The ARP reacts with aldehydes present in purified genomic DNA (Kubo et al., 1992) and can be detected with streptavidin-HRP. AP site levels are elevated in HMCESΔ compared to the wild type cells even in the absence of added DNA damage (Figure 4A), and HMCESΔ cells also exhibit a delay in AP site resolution after exposure to DNA damage (Figures 4B and S5B). The level of AP sites induced in HMCESΔ cells is less than that induced by inactivating AP endonuclease to block BER, likely because inhibition of BER also results in an increase of AP sites in dsDNA, whereas HMCES may have specificity for ssDNA (Figure S5C).

Figure 4. HMCES crosslinks to AP sites and promotes AP site repair.


(A and B) Measurements of AP sites in genomic DNA using aldehyde reactive probe. All values are normalized to the untreated U2OS cell control. (A) Untreated U2OS versus HMCESΔ cells. Mean+/−SD, n=8, one sample t-test, mean is different than 1. (B) Cells were treated with 10mM MMS (1 hour) and then allowed to recover for 1, 2 or 4 hours. Mean+/−SEM, n=5, ANOVA. (C) Denaturing gel analysis of 1nM unmodified (dT), THF stabilized abasic site, or a natural abasic site (dU + UDG) ssDNA incubated with human SRAP domain or yedK (3, 10nM). (D) Native and denaturing gel analysis of 1nM dU containing ssDNA or dsDNA treated with UDG as indicated. (E) Denaturing gel analysis of 1nM abasic site ssDNA (dU + UDG) incubated with wild type or mutant human SRAP (10, 1nM) or yedK (30, 10nM). Where indicated, proteinase K was used to digest the SRAP-DPC. (F) Measurement of abasic site DPC forming ability of GST-SRAP, *SRAP generated by cleavage of the GST leaving 4 N-terminal amino acids, and SRAP-HIS tagged at the C-terminus. Mean+/−SD, n=3. (G) RADAR DPC assay: Mock or UV (100 J/m2) irradiated cells were allowed to recover for 3 hours in the presence or absence of MG132. Samples were lysed in detergents under denaturing conditions and DNA was ethanol precipitated. Only covalently attached proteins copurify with the DNA. 10μg of purified genomic DNA was digested with nuclease, applied to a nitrocellulose membrane, and immunoblotted with HMCES or RPA32 antibodies. (H) Quantification of RADAR assay on HMCESΔ cells complemented with wild type or mutant HMCES. Mean+/−SD, n=3, ANOVA with Dunnett posttest. See also Figures S2, S5, S6, and Table S3 and Table S4.
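The normalization and one-sample t-test named for panel A can be sketched as follows. The replicate values are invented placeholders rather than data from this study, and the function names are hypothetical; the statistic is the standard one-sample t against a reference mean of 1.

```python
import math
import statistics

def normalize_to_control(signal, control_signal):
    """Express an ARP signal as a fold change over the untreated control."""
    return signal / control_signal

def one_sample_t(values, reference=1.0):
    """Return (t statistic, degrees of freedom) for H0: mean == reference."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    t = (mean - reference) / (sd / math.sqrt(n))
    return t, n - 1

# Placeholder fold-change values for eight hypothetical replicates.
fold_changes = [1.2, 1.4, 1.1, 1.3, 1.5, 1.2, 1.3, 1.4]
t_stat, dof = one_sample_t(fold_changes)
print(f"mean={statistics.mean(fold_changes):.2f}, t={t_stat:.2f}, df={dof}")
```

Because every value is expressed relative to the control, the control itself is fixed at 1 and has no variance, which is why a one-sample test against 1 is used rather than a two-sample comparison. Converting the statistic to a p-value requires the t distribution, for example via `scipy.stats.ttest_1samp` if SciPy is available.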

The SRAP domains of HMCES and yedK both bind to ssDNA containing an AP site with apparent sub-nanomolar affinity (Figures S5D-F). Strikingly, the protein-DNA complexes were not disrupted by boiling and denaturing gel electrophoresis, suggesting that HMCES and yedK covalently crosslink to AP site-containing ssDNA (Figure 4C). We did not observe appreciable crosslinking to AP sites in dsDNA (Figure 4D). The crosslinking activity is dependent on the cysteine in the catalytic pocket and the DNA binding surface (Figure 4E). We did not observe any endonuclease or lyase activity since protease treatment of the HMCES-DNA crosslinks restores the DNA gel migration to that of the full-length ssDNA (Figure 4E). Since the essential conserved cysteine is almost universally amino acid two, we hypothesized that additional N-terminal residues might interfere with DPC formation. Indeed, addition of an N-terminal GST fusion protein or even a four amino acid N-terminal extension that is left after cleavage of the GST significantly inhibited DPC formation (Figure 4F).

AP sites exist in an equilibrium between a ring-closed, furanose, and ring-open, aldehyde, form. AP site crosslinking to HMCES and yedK requires opening of the sugar ring since neither protein crosslinked to ssDNA containing a tetrahydrofuran (THF)-stabilized abasic site mimic, which cannot form the ring-open aldehyde (Figure 4C). We conclude that HMCES reacts with the aldehyde in the ring-open form. Several other DNA lesions also contain aldehydes such as 5-formyl-dC (5fC) and FaPy-G, and the ARP probe used to detect AP sites may detect other reactive aldehydes. When tested in ssDNA, we were able to observe a small amount of HMCES crosslinking to 5fC, but no detectable crosslinking to FaPy-G containing substrates (Figure S5G). The small amount of reactivity to 5fC could potentially explain the previously published link between HMCES and epigenetic DNA modifications; however, HMCES has a preference for AP sites.

To test if HMCES also forms a DNA protein crosslink (DPC) in cells, we utilized the RADAR (rapid approach to DNA adduct recovery) assay (Kiianitsa and Maizels, 2013). Cells are lysed with detergents in denaturing conditions and DNA is collected by ethanol precipitation. Therefore, any proteins present in the DNA pellet are covalently attached to the DNA and can be assayed by immunoblot. HMCES forms DPCs in cells exposed to DNA damage that generates AP sites (Figures 4G and S6A-C). In contrast, the major ssDNA binding protein RPA is not detected by this method. DPC formation in cells is dependent on the HMCES catalytic cysteine and DNA binding surface (Figure 4H). The amount of HMCES-DPCs increases when the proteasome is inhibited with MG132 (Figures 4G and S6A). We conclude that HMCES is a sensor of AP sites and covalently modifies the ssDNA AP site to generate a DPC.

The HMCES-DPC is resolved through ubiquitin-dependent proteolysis

In addition to the genotoxic agents already tested we examined HMCES responses to oxidative damage since it is another well-known inducer of AP lesions. Potassium bromate induces large amounts of 8-oxoguanine, which generates AP sites through spontaneous or glycosylase-catalyzed depurination. As predicted, HMCESΔ cells are highly sensitive to potassium bromate (Figure 5A). Potassium bromate also induced the HMCES-DPC, which peaked three hours after treatment and was largely resolved by 6 hours. In contrast, proteasome inhibition reduced resolution of the HMCES-DPC (Figure 5B and C). Total levels of HMCES were also depleted following DNA damage with potassium bromate or UV radiation (Figure 5D) suggesting that HMCES is being degraded by the proteasome after covalently modifying the AP site.

Figure 5. The HMCES-DPC is resolved by ubiquitin-dependent proteolysis.


(A) Clonogenic survival assay of cells treated with KBrO3, mean+/−SD, N=3. Error bars in some cases are smaller than the symbols. (B) RADAR DPC assay: Cells were treated with 30mM KBrO3 for 30 minutes and allowed to recover for the indicated times in the presence or absence of MG132 (+). 20μg of purified genomic DNA was immunoblotted for HMCES. DNA was stained with methylene blue. (C) Quantification of B. Mean+/−SEM, n=3, two-tailed t-test. (D) Immunoblot analysis of total HMCES from cells treated with 30mM KBrO3 or 100 J/m2 UV. (E) Cells treated with 100 J/m2 UV were allowed to recover for 3 hours in the presence or absence of MG132, processed using the RADAR assay method. 100μg of purified genomic DNA was digested with nuclease, separated by SDS-PAGE, and immunoblotted for HMCES. (F) Cells expressing HMCES and HIS-ubiquitin were treated with 100 J/m2 UV and allowed to recover for 3 hours in the presence of MG132. HIS-tagged proteins purified under denaturing conditions were immunoblotted for HMCES. (G) Immunoblot analysis of total HMCES protein from cells treated with 100 J/m2 UV and incubated with or without MG132.

To test if the HMCES-DPC is ubiquitylated, we damaged cells, isolated the DPC using the RADAR method in the presence or absence of proteasome inhibitors, and examined the HMCES-DPC after SDS-PAGE separation by immunoblotting. UV treatment induces a laddering of high molecular weight HMCES species that are more abundant when the proteasome is inhibited and absent in HMCESΔ cells (Figure 5E). This HMCES laddering corresponds to ubiquitylation since HIS-ubiquitin purifications using denaturing conditions contained HMCES (Figure 5F). Furthermore, the decrease in total HMCES protein after UV treatment is prevented by proteasome inhibition (Figure 5G). Thus, the HMCES-DPC is an intermediate that is resolved through ubiquitin-mediated proteolysis indicating that HMCES is a suicide enzyme.

HMCES shields AP sites to promote error free repair

We next tested whether HMCES crosslinks to AP sites on DNA ligands that mimic the junction that would be exposed when a polymerase stalls and disengages from the template containing the AP site. Indeed, HMCES binds and crosslinks to an AP site placed on the template ssDNA at the 3’ junction, but it is unable to significantly crosslink to an AP site in the context of dsDNA (Figure 6A). The higher molecular weight HMCES complex observed in native gels at high HMCES concentrations with this 42-nucleotide length ssDNA likely represents one HMCES molecule crosslinked to the AP site and another that is non-covalently bound, since only one species is observed in denaturing conditions. The crosslinking is also detectable as a HMCES mobility shift in SDS-PAGE gels (Figure 6B). The crosslinking is prevented when the AP site ssDNA is treated with AP endonuclease (APE1) prior to addition of HMCES, and the HMCES-DPC blocks APE1 cleavage of the ssDNA (Figure 6B). Thus, the HMCES-DPC prevents ssDNA cleavage that would generate a DSB. Consistent with this idea, DSBs accumulate in HMCESΔ cells as measured using neutral COMET assays (Figure 6C). However, even though we observe more DSBs, we do not observe synthetic lethality when the homologous recombination proteins BRCA1 or BRCA2 are inactivated in HMCES-deficient cells (Figure S6D).

Figure 6. HMCES functions at replication forks to promote error-free repair of AP sites.


(A) Native and denaturing gel analysis of the indicated DNA ligands (1nM) treated with UDG to create an AP site after incubation with human HMCES SRAP domain (0, 1, or 10nM). (B) AP site containing ssDNA was incubated with SRAP domain and APE1 in the indicated order. Samples were resolved by SDS-PAGE and stained with Coomassie or resolved on an 8M urea-TBE gel prior to autoradiography. (C) DSBs were measured by neutral comet assay in wild type or HMCESΔ cells. Box and whiskers plot showing the median value, 25th to 75th percentile, and smallest and largest values; two-tailed t-test. (D) iPOND-SILAC-mass spectrometry was used to characterize the replication fork proteome of HMCESΔ compared to wild type HEK293T cells. Mean ratios, n=2. Two outlier data points were excluded from the graph but are included in Table S2. (E) Wild type or HMCESΔ cells were transfected with siRNA to REV3. The number of surviving colonies was compared to the non-targeting siRNA control in each cell line. Mean+/−SD, n=3, two-tailed t-test. (F) The mutation frequency of the indicated cell lines was assessed 48 hours after transfection of a mock or UV irradiated plasmid (pSP189), mean+/−SD, n=2, >50,000 colonies per sample, ANOVA with Dunnett post-test. (G) Cells were transfected with the indicated siRNAs. Immunoblot of chromatin and soluble fractions 3 hours after treating S-phase cells with 100 J/m2 UV. (H) Model of HMCES function. See also Figures S4, S6, and Tables S2–4.

Since HMCES is present at replication forks, we next asked if there are HMCES-dependent changes to the replication fork proteome using quantitative iPOND-mass spectrometry comparing HMCESΔ to wild-type cells. There were no differences in core replisome components such as Polδ or PCNA, but we did observe an increase in the abundance of translesion bypass (TLS) polymerases REV1 and REV3 (the catalytic subunit of POLζ) at replication forks in HMCESΔ cells (Figure 6D and Table S2). Therefore, we hypothesized that increased TLS polymerization could compensate for the loss of HMCES. Consistent with this prediction, REV3 and HMCES deficiencies are synthetically lethal (Figure 6E). Furthermore, HMCES-deficient cells have elevated mutation frequencies that can be rescued by wild type HMCES but not the catalytic or DNA binding mutants (Figure 6F). While we do observe an increase in AP sites on the reporter plasmid used to measure mutagenesis after UV, we cannot rule out that the mutagenesis is due to another lesion (Figure S6E). Consistent with the iPOND data, the increased mutation frequency in HMCES-deficient cells is dependent on REV3 (Figure 6F). Furthermore, increasing TLS activity by inactivating the USP1 deubiquitinating enzyme and causing increased PCNA ubiquitylation (Hendel et al., 2011) inhibits HMCES recruitment to damaged DNA (Figure 6G). In contrast, knocking down the PCNA ubiquitin ligase RAD18 had no effect. These data are consistent with a model in which post-replicative repair TLS and HMCES act in separate pathways. We conclude that HMCES is needed to maintain genome stability by sensing and shielding AP sites from error-prone damage tolerance mechanisms during DNA replication.

Discussion

We report the discovery of a previously unrecognized mechanism to process AP sites that form in ssDNA. HMCES recognizes AP sites in ssDNA at replication forks and chemically modifies the lesion to generate a DPC. DPC formation shields the AP sites from mutagenic TLS polymerases and endonucleases that otherwise could generate mutations or DSBs (Figure 6H). Thus, HMCES promotes genome stability during DNA replication by regulating DNA repair pathway choice. Strikingly, almost all organisms encode a single SRAP domain protein suggesting an essential function for organism fitness. SRAP gene linkage to bacterial SOS response genes supports our discovery that HMCES is a genome maintenance protein (Aravind et al., 2013; Hu et al., 2009).

HMCES was previously identified as a possible regulator of 5hmC (Spruijt et al., 2013). Removal of 5hmC from DNA may happen via further oxidation and glycosylase action (He et al., 2011). Thus, HMCES could be involved in a step of this epigenetic modification by recognizing the oxidized 5fC or the AP site formed after base removal. However, the evolution of SRAP domain proteins predates the machinery for cytosine methylation and oxidation. Furthermore, many organisms do not utilize methylcytosine for epigenetic control but do encode SRAP proteins (Bewick et al., 2017). In contrast, AP site repair is a universal need for all organisms.

Generation of a DPC seems counterintuitive as a genome maintenance mechanism since DPCs present a repair challenge (Stingele et al., 2017). However, by preventing the action of TLS polymerases and AP endonucleases, the HMCES-DPC provides a programmed mechanism to shield the ssDNA AP site from activities that could generate genome instability. We cannot rule out the possibility that the HMCES-DPC is a pathological intermediate of an alternative repair mechanism. However, the requirement for the conserved HMCES catalytic pocket cysteine residue for DNA damage resistance and mutation avoidance, the ability of both human and E. coli SRAP proteins to readily crosslink to AP site ssDNA, and the abundance of HMCES-DPCs in cells exposed to DNA damage argue that it is a repair intermediate.

Additionally, we note that the catalytic cysteine is almost universally at amino acid two in the SRAP proteins of all organisms, and the first methionine is likely removed by the action of aminopeptidases or the putative intrinsic peptidase activity of the SRAP domain (Aravind et al., 2013). Thus, the N-terminus is within the catalytic pocket. This arrangement is poised to promote covalent linkages via a nucleophilic attack on the ring-opened AP site deoxyribose. In fact, N-terminal cysteines facilitate native chemical ligation of peptides and can readily form stable thiazolidines (Carrico, 2008; Dawson et al., 1994). Furthermore, if the N-terminus is modified with additional amino acids HMCES no longer efficiently forms the DPC. Thus, the SRAP domain is poised to generate a stable AP site DPC.

An intact DNA strand would be expected to be required to template error-free resolution of the HMCES-DPC (Stingele et al., 2017). One source of an intact DNA strand could be post-replication repair involving template switching. However, SHPRH, an E3 ubiquitin ligase that poly-ubiquitylates PCNA and promotes post-replication repair (Lin et al., 2011), is enriched at forks in HMCESΔ cells (Figure 6D and Table S2), and HMCES and SHPRH inactivation are synthetically lethal (Figure S6F). Thus, while this result does not rule out the possibility that SHPRH could act downstream of HMCES-DPC formation in a template switching pathway, it does indicate that at least SHPRH-dependent template switching can act independently of HMCES. Alternatively, fork reversal could be involved to provide an intact template for error-free repair (Bhat and Cortez, 2018), a process implicated in replication-coupled repair of interstrand crosslinks (Amunugama et al., 2018). In any case, resolution of the HMCES-DPC via ubiquitylation and proteasome-dependent degradation is distinct from other reported mechanisms of DPC repair that rely on the SPRTN protease (Duxin et al., 2014; Stingele et al., 2014).

In conclusion, HMCES is an evolutionarily ancient genome maintenance protein that acts as the initiating step of a previously unrecognized replication-coupled repair mechanism for abasic sites in single-stranded DNA. Future studies to understand how the HMCES-DPC intermediate is resolved to yield error-free repair during DNA replication will provide new insights into DPC repair and ssDNA AP site processing.

STAR Methods

Contact for Reagent and Resource Sharing

Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, David Cortez (david.cortez@vanderbilt.edu).

Experimental Model and Subject Details

Cell lines

U2OS, HEK293T, and HeLa cells were cultured in DMEM with 7.5% fetal bovine serum (FBS). RPE-hTERT cells were cultured in DMEM/F12, 7.5% FBS, and 7.5% sodium bicarbonate. HCT-116 cells were cultured in McCoy’s 5A with 7.5% FBS. All cell lines were purchased from ATCC, tested for mycoplasma, and authenticated by short tandem repeat profiling. All cells were cultured at 37°C and 5% CO2. U2OS, HEK293T, HeLa, and RPE-hTERT are female. HCT-116 is male.

Method Details

CRISPR/Cas9 Editing

U2OS HMCESΔ and HEK293T HMCESΔ cells were generated with CRISPR/Cas9. Cells were transfected with pSpCas9(BB)-2A-Puro 2 (Addgene plasmid number 48139 (Ran et al., 2013)) containing guide RNAs that target sequences flanking the intron-exon junction of the second exon of HMCES (5’-TTGCGCCTACCAGGATCGGC and 5’-ACTTTAGACGGTGGTCACGG) and selected with 2 μg/mL puromycin for two days prior to plating for individual clones. Clones were screened by immunoblotting for loss of HMCES expression with antibodies raised against the middle and C-terminus of the protein. The HMCESΔ cell lines were also validated by sequencing the edited alleles. U2OS and HEK293T cells contain as many as four HMCES alleles and HMCESΔ clones contain a mixture of alleles with a deletion of the intron-exon 2 junction that results in frame-shift mutations, as well as alleles with insertion/deletions in exon 2 only. Finally, we confirmed loss of expression of exon 2 by RNA sequencing. Complementation of HMCESΔ cells with untagged cDNA expression vectors was completed by lentiviral infection and selection for the linked puromycin resistance cassette.

U2OS HMCES R98E cells were generated using CRISPR/Cas9. Cells were transfected with pSpCas9(BB)-2A-Puro 4 containing guide RNA 5’-TACGGTATCACTACGACAGT and a 278 base pair correcting template DNA in pBluescript. The correcting fragment contains the R98E mutation and introduces wobble base mutations to remove the PAM and a PstI site. Genomic DNA was screened by loss of the PstI site from a PCR product spanning exon 3, which contains R98. Clones were further screened by cloning and sequencing of the alleles and by determining HMCES expression levels by immunoblot. All clones used lacked any remaining wild-type HMCES alleles, contained a mixture of R98E edited and frame-shift alleles, and expressed HMCES near wild type levels.

Cell Transfections

Plasmid transfections were performed with polyethylenimine. Lipofectamine PLUS was used for plasmid transfection in the DR-GFP and EJ5-GFP assays. siRNA transfections were performed with Dharmafect 1 (Dharmacon) for U2OS cells and RNAiMax (Thermo Fisher) for HEK293T cells.

Plasmids

The HMCES cDNA was obtained from the ThermoScientific Open Biosystems Human ORFeome collection (Catalog number OHS5893–202494257). The plasmid codes for an aspartic acid at position 60, which differs from the glutamic acid in the reference sequence. Point mutants were generated by site-directed mutagenesis and truncation mutants were generated by PCR. For cellular complementation assays, wild type and mutant cDNAs were expressed without tags. To identify interacting proteins, cDNAs were expressed with a FLAG tag. YedK was cloned by PCR from E. coli strain BW25113 genomic DNA. All plasmids were confirmed by sequencing.

Viability assays

Cells were treated with ATRi (VX-970), cisplatin, or camptothecin for 24 hours or MMS for 1 hour. PARPi (BMN673) was maintained in the growth media for the duration of the experiment. KBrO3 treatment was for 30 minutes. Ionizing radiation was administered with a 137Cs irradiator. 254nm ultraviolet radiation was administered using a light box. Cells were plated for single colonies and scored after two weeks of growth. All viability measurements are presented as a percentage of the untreated control.

Flow cytometry

Cells were labeled with 10 μM BrdU for 30 minutes, fixed with 70% ethanol, denatured with 2N HCl for 30 minutes, stained with anti-BrdU antibody, and then treated with propidium iodide and RNaseA. Cells were analyzed on a BD Biosciences FACS Calibur.

Double-strand break repair assays

DR-GFP U2OS cells were provided by Dr. Maria Jasin, Memorial Sloan Kettering, and used as described (Xia et al., 2006). Plasmid based assays for homologous recombination (DR-GFP) and total non-homologous end joining (EJ5-GFP) were provided by Dr. Jeremy Stark, City of Hope, and used as described (Bennardo et al., 2008; Stark et al., 2004). The DR-GFP reporter consists of two defective GFP genes, the first of which contains an I-SceI endonuclease site. Cellular expression of I-SceI leads to a DSB, which can be repaired by HR using the downstream wild-type GFP sequence. The EJ5-GFP reporter has a single GFP gene interrupted by a selectable marker that is flanked by I-SceI sites.

iPOND-SILAC Mass Spectrometry

iPOND was performed as described (Dungrawala et al., 2015). Cells were labeled with EdU for 10 minutes. For pulse-chase experiments with thymidine (Sigma), EdU-labeled cells were washed once with temperature- and pH-equilibrated medium containing 10 μM thymidine to remove the EdU, then chased into 10 μM thymidine for 1 hour. After labeling, cells were cross-linked in 1% formaldehyde/PBS for 10 min at room temperature, quenched using 1.25 M glycine, and washed three times in PBS. Collected cell pellets were frozen at −80°C, then resuspended in 0.25% Triton-X/PBS to permeabilize. Pellets were washed once with 0.5% BSA/PBS and once with PBS prior to the click reaction.

Light and heavy labeled cells were mixed 1:1. The click reaction was completed in 1 hr. and the cells were lysed by sonication. Capture of DNA-protein complexes utilized streptavidin-coupled C1 magnabeads for 1 hr. Beads were washed with lysis buffer (1% SDS in 50 mM Tris [pH 8.0]), low salt buffer (1% Triton X-100, 20 mM Tris [pH 8.0], 2 mM EDTA, 150 mM NaCl), high salt buffer (1% Triton X-100, 20 mM Tris [pH 8.0], 2 mM EDTA, 500 mM NaCl), lithium chloride wash buffer (100 mM Tris [pH 8.0], 500 mM LiCl, 1% Igepal), and twice in lysis buffer. Captured proteins were eluted and cross-links were reversed in SDS sample buffer by incubating for 30 min at 95°C.

iPOND samples were separated by SDS-PAGE. Gel regions above and below the streptavidin band were excised and treated with 45 mM DTT for 30 min, and available cysteine residues were carbamidomethylated with 100 mM iodoacetamide for 45 min. After destaining the gel pieces with 50% acetonitrile (MeCN) in 25 mM ammonium bicarbonate, proteins were digested with trypsin (Promega) in 25 mM ammonium bicarbonate at 37°C. Peptides were extracted by gel dehydration (60% MeCN, 0.1% trifluoroacetic acid [TFA]), vacuum dried, and reconstituted in 0.1% formic acid.

MudPIT mass spectrometry analysis was performed with an eight-step salt gradient. Peptides were introduced via nano-electrospray into a Q Exactive mass spectrometer (Thermo Scientific) operating in the data-dependent mode acquiring higher energy collisional dissociation tandem MS (HCD MS/MS) scans (R = 17,500) after each MS1 scan (R = 70,000) on the 20 most abundant ions using an MS1 ion target of 1 × 10^6 ions and an MS2 target of 1 × 10^5 ions. The maximum ion time for MS/MS scans was set to 100 ms, the HCD-normalized collision energy was set to 28, dynamic exclusion was set to 30 s, and peptide match and isotope exclusion were enabled.

MS/MS spectra were searched against a human subset database created from the UniprotKB protein database (http://www.uniprot.org). Precursor mass tolerance was set to 20 ppm for the first search, and for the main search, a 10-ppm precursor mass tolerance was used. The maximum precursor charge state was set to 7. Variable modifications included carbamidomethylation of cysteines (+57.0214) and oxidation of methionines (+15.9949). Enzyme specificity was set to Trypsin/P, and a maximum of two missed cleavages was allowed. The target-decoy false discovery rate (FDR) for peptide and protein identification was set to 1% for peptides and 2% for proteins. A multiplicity of 2 was used, and Arg10 and Lys8 heavy labels were selected. For SILAC protein ratios, a minimum of two unique peptides and a minimum ratio count of 1 were required, and the requantify option was enabled. Protein groups identified as reverse hits were removed from the datasets.
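As a rough illustration of how precursor mass tolerances such as the 20 ppm and 10 ppm values above are applied, the sketch below computes a parts-per-million mass error and checks it against a tolerance. The function names and the example m/z values are hypothetical and are not taken from the search software used in this study.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million (ppm)."""
    if theoretical_mz <= 0:
        raise ValueError("theoretical_mz must be positive")
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tolerance_ppm: float) -> bool:
    """True when the absolute ppm error does not exceed the tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tolerance_ppm


if __name__ == "__main__":
    # Hypothetical precursor with a theoretical m/z of 1000.000.
    print(within_tolerance(1000.005, 1000.0, tolerance_ppm=10))  # about 5 ppm off
    print(within_tolerance(1000.030, 1000.0, tolerance_ppm=20))  # about 30 ppm off
```

A relative (ppm) tolerance scales with the precursor mass, which is why search tolerances are usually stated this way rather than as a fixed m/z window.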

siRNA and antibodies

siRNAs were obtained from Dharmacon. HMCES: D-020333–01 5’-GCGAACAUCCUGUCACUUA, J-020333–19 5’-CGUAAUGGAGAAACGGUCA, and J-020333–20 5’-ACCAACUGUCGUAGUGAUA. Smartpools were purchased from Dharmacon for REV3L, Rad18, USP1, SHPRH, BRCA1, and BRCA2. The following antibodies were used: rabbit anti-HMCES (Sigma, HPA044968); mouse anti-GAPDH (Millipore, MAB374); mouse anti-RPA32 (Abcam, ab2175); rabbit anti-Cyclin A (Santa Cruz, sc-751); mouse anti-PCNA (Santa Cruz, sc-56); mouse anti-Histone H3 (Abcam, ab10799); mouse anti-FLAG M2 (Sigma, F3165); 5-methylcytosine (Abcam, ab10805); 5-hydroxymethylcytosine (Abcam, ab106918); ssDNA antibody (Millipore, MAB3034); mouse anti-p21 (Cell Signaling, 2946); mouse anti-p53 (Santa Cruz, sc-126); mouse anti-53BP1 (Millipore, MAB3802).

Co-immunoprecipitation and mass spectrometry

HEK293T cells were transfected with FLAG-HMCES and nuclear extracts prepared in the presence of Pierce universal nuclease as previously described (Dignam et al., 1983). HMCES was immunoprecipitated using EZ-view Red FLAG M2 affinity gel. HMCES and interacting proteins were eluted by addition of FLAG peptide, TCA precipitated, and analyzed by two-dimensional liquid chromatography tandem mass spectrometry.

For immunoblot analysis of co-precipitating proteins, cells were lysed in 3-[(3-cholamidopropyl)-dimethylammonio]-1-propanesulfonate (CHAPS) buffer (0.75% CHAPS, 50mM Tris pH 7.5, 150mM NaCl, 1mM DTT, 1mM NaF, 1mM sodium orthovanadate, 1μg/mL aprotinin, and 1μg/mL leupeptin) supplemented with 1250 units/mL of Pierce universal nuclease. Immunoprecipitation was performed as above. FLAG peptide eluted proteins were analyzed by SDS-PAGE and immunoblotting for FLAG and PCNA.

Ubiquitylation analysis

HEK293T cells were transfected with expression vectors for HMCES and HIS-ubiquitin (pMT107 (Treier et al., 1994)). Cells were lysed in 6M guanidine-HCl, 0.5M NaCl, 100mM sodium phosphate pH 8, and 20mM imidazole and sonicated. Clarified lysates were incubated with Ni-NTA agarose for 3 hours. Beads were washed several times with lysis buffer followed by wash buffer (25mM Tris pH 6.8 and 20mM imidazole). Proteins were eluted with wash buffer supplemented with 250mM imidazole. Eluted proteins were analyzed by SDS-PAGE and immunoblotting for HMCES.

Protein Purification

GST-HMCES fusion proteins were induced with 1mM IPTG overnight at 16°C in Arctic Express bacterial cells. Cell pellets were resuspended in NET buffer (25mM Tris pH 8, 50mM NaCl, 0.1mM EDTA, 5% glycerol, and 1mM DTT), sonicated, and then Triton X-100 was added to a final concentration of 1%. Samples were incubated on ice for 30 minutes and soluble lysates were combined with glutathione sepharose beads. Bound proteins were washed 3 times in NET buffer with 1% Triton X-100, eluted with 15mM glutathione, and dialyzed into 20mM HEPES pH 7.9, 100mM KCl, 0.1mM EDTA, and 1mM DTT. Alternatively, bound proteins were eluted by cleavage of the GST with PreScission protease in 50mM Tris pH 7, 150mM NaCl, 1mM EDTA, and 1mM DTT at 4°C overnight.

Codon optimized C-terminally tagged human SRAP domain (1–270)-HIS, yedK-HIS, and N-terminal HIS-GST-SRAP (1–270) fusion proteins were induced with 1mM IPTG for 6 hours at 30°C in BL21 bacterial cells. Cells were resuspended in HIS binding buffer (50mM Tris pH 8, 100mM NaCl, 10mM imidazole, 0.1mM EDTA, 5% glycerol, 1mM PMSF, 1μg/mL aprotinin, and 1μg/mL leupeptin), treated with 1mg/mL lysozyme for 30 minutes on ice, and sonicated. Soluble lysates were combined with Ni-NTA agarose for 3 hours and bound proteins were washed three times with HIS wash buffer (50mM Tris pH 8, 300mM NaCl, 20mM imidazole, 0.1mM EDTA, 5% glycerol, 1mM PMSF, 1μg/mL aprotinin, and 1μg/mL leupeptin), and eluted with HIS elution buffer (50mM Tris pH 8, 300mM NaCl, 300mM imidazole, 0.1mM EDTA, 5% glycerol, 1mM PMSF, 1μg/mL aprotinin, and 1μg/mL leupeptin). Samples were then separated using an S200 size exclusion column in 50mM Tris pH 8, 150mM NaCl, 5% glycerol, and 10mM DTT. Fractions containing SRAP or yedK were pooled. Where indicated, tags were cleaved overnight at 4°C with PreScission protease and the free GST and the PreScission protease (which has a GST tag) were removed with glutathione sepharose beads.

HIS-PCNA, provided by John Pascal, University of Montreal, was induced with 1mM IPTG for 4 hours at 37°C in BL21 bacterial cells. Cells were lysed with 50mM Tris pH 7.5, 500mM NaCl, 10% glycerol, 0.02% NP-40, 10mM imidazole, 1mM DTT, 1mM PMSF, 1μg/mL aprotinin, and 1μg/mL leupeptin, treated with 1mg/mL lysozyme for 30 minutes on ice, and sonicated. Soluble lysates were combined with Ni-NTA agarose for 3 hours, washed five times with lysis buffer, and eluted with lysis buffer supplemented with 300mM imidazole prior to dialysis into 50mM HEPES pH 7.9, 100mM NaCl, 10% glycerol, 0.1mM EDTA, and 1mM DTT.

DNA Binding assays

32P-labeled DNA substrates (1nM) were incubated with the indicated concentrations of protein in binding buffer (10mM Tris pH 7.9, 50mM NaCl, 10mM MgCl2, 2mM DTT, 0.1mg/mL bovine serum albumin) for 1 hour at 37°C. Ficoll was added to a final concentration of 2.5% and samples were resolved on a 10% polyacrylamide gel in 1X TBE (100mM Tris, 90mM boric acid, 2mM EDTA) at 40V for 180 minutes at 4°C. For denaturing electrophoresis, formamide was added to a final concentration of 33%, samples were boiled for 5 minutes, and resolved on 10% polyacrylamide gels containing 8M Urea in 1X TBE (100mM Tris, 90mM boric acid, 2mM EDTA) at 40V for 180 minutes at room temperature. Experiments shown in Figure S5G used 6-FAM labeled DNA at a concentration of 25nM. DNA was visualized on a Typhoon.

Alternatively, 32P-labeled DNA substrates (10nM) were incubated with the indicated concentrations of protein in binding buffer (20mM HEPES pH 7.4, 100mM KCl, 5mM MgCl2, 1% glycerol, 0.01% NP-40, 1mM DTT, 50mM EDTA, 0.25mg/mL bovine serum albumin) for 30 minutes at room temperature. Ficoll was added to a final concentration of 2.5% and samples were resolved on a 6% polyacrylamide gel in 1X TBE (100mM Tris, 90mM boric acid, 2mM EDTA) at 40V for 180 minutes at 4°C. This alternative protocol was used for experiments in Figure 1 E and G.

DNA oligonucleotide sequences and annealed ligands are listed in Supplemental Tables 3 and 4, respectively. The presence of 5fC was confirmed by thymine DNA glycosylase (TDG) activity. TDG was provided by Dr. Alexander Drohat, University of Maryland School of Medicine. The presence of FaPy-G was confirmed by FPG glycosylase and lyase activity. FPG was purchased from New England Biolabs.

RADAR

The RADAR (rapid approach to DNA adduct recovery) assay was performed as described with the following modifications (Kiianitsa and Maizels, 2013; Quinones et al., 2015). Cells were lysed in RADAR buffer (4M Guanidine thiocyanate, 1% Sarkosyl, 2% Triton X-100, 1% 1,4-dithioerythritol, 100mM Sodium Acetate pH 5, 20mM Tris pH 8, 20mM EDTA pH 8, adjusted to pH 6.5 with 4N HCl) and genomic DNA was ethanol precipitated by the addition of ½ volume 100% ethanol and incubation overnight at −20°C. The DNA pellet was washed twice with 70% ethanol and resuspended in 8mM NaOH at 37°C for 1 hour. Samples were centrifuged to remove insoluble material and then the DNA concentration was determined by spectrophotometry. DNA was digested with Pierce universal nuclease in 1X TBS with 2mM MgCl2 at 37°C for 30 minutes. Samples were then boiled for 5 minutes and applied to nitrocellulose membrane with a slot blot apparatus. The membrane was blocked for 1 hour with 5% non-fat dry milk in TBST and immunoblotted for HMCES. In Figures 5E and S6C, samples were separated by SDS-PAGE and immunoblotted.

AP site detection

Genomic DNA was purified as described in the RADAR method and resuspended in TE (Tris-EDTA). 3μg of DNA was incubated with 2mM biotinylated aldehyde reactive probe (ARP) (Kubo et al., 1992) in 100μL of TE buffer for 1 hour at 37°C. DNA was then ethanol precipitated overnight, washed twice with 70% ethanol, and resuspended in TE. 1μg was diluted in 200μL 6X SSC and applied to a nylon membrane with a slot blot apparatus. The membrane was blocked with 5% bovine serum albumin in TBST (Tris buffered saline with 0.1% Tween20) and biotin was detected with streptavidin-HRP. Where indicated, cells were treated with 10μM APE1 inhibitor III for 24 hours.

Proximity ligation assay

Proximity ligation assays were performed with the Duolink PLA mouse/rabbit kit (Sigma) according to the manufacturer's instructions. Primary antibodies used were mouse anti-PCNA and rabbit anti-HMCES.

Far western blot

Purified HMCES (2.5 μg) or PCNA (0.1 μg) were separated by SDS-PAGE and transferred to a nitrocellulose membrane. The membrane was stained with Ponceau S, imaged, and destained. It was then blocked with 5% non-fat dry milk in TBST for 1 hour and incubated with HIS-PCNA (1 μg/mL) in TBST with 2% non-fat dry milk, 10% glycerol, 1mM DTT, and 0.5mM EDTA overnight. The membrane was washed and immunoblotted for PCNA.

Neutral Comet Assay

The Trevigen CometAssay ESII system was utilized to detect DNA double-strand breaks. Tail moments were scored using the open source Fiji and OpenComet software (Gyori et al., 2014; Schindelin et al., 2012). Data is presented with box and whisker plots where the box depicts 25–75%, whiskers are smallest and largest values, and the median value is indicated.
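The box and whisker convention described above (box at the 25th and 75th percentiles, whiskers at the smallest and largest values, median marked) can be sketched as a five-number summary. This is a minimal illustration and not the plotting software's implementation: the function name and the tail moment values are hypothetical, and the percentile interpolation used by a given plotting package may differ from the "inclusive" method chosen here.

```python
import statistics


def box_whisker_summary(tail_moments):
    """Return the five values drawn in a min/max box and whisker plot."""
    values = sorted(tail_moments)
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    # Quartiles by linear interpolation over the observed data range.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "whisker_low": values[0],    # smallest value
        "box_low": q1,               # 25th percentile
        "median": median,
        "box_high": q3,              # 75th percentile
        "whisker_high": values[-1],  # largest value
    }


if __name__ == "__main__":
    # Hypothetical tail moments for one sample.
    print(box_whisker_summary([1.0, 2.0, 3.0, 4.0, 5.0]))
```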

UV-induced mutagenesis

The pSP189 vector and MBM7070 E. coli strains for the SupF assay have been described previously and were provided by Dr. Karlene Cimprich, Stanford University (Kraemer and Seidman, 1989; Lin et al., 2011; Parris and Seidman, 1992). Briefly, the plasmid was irradiated with 1000J/m2 UV and transfected into wild type or HMCESΔ cells using PEI. Replicated plasmid was recovered after 48 hours using the Sigma miniprep kit, DpnI digested, ethanol precipitated, transformed into MBM7070 cells, and plated on LB plates containing X-gal, IPTG, and ampicillin. The mutation frequency was calculated as the number of white colonies divided by the total number of colonies. Approximately 50,000–60,000 colonies were counted per sample to calculate the mutation frequency. Where indicated, cells were transfected with siRNA 48 hours prior to plasmid transfection.
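The mutation frequency calculation described above reduces to a single ratio. The following minimal sketch shows that arithmetic; the function name and the colony counts are hypothetical and chosen only to illustrate the calculation.

```python
def mutation_frequency(white_colonies: int, total_colonies: int) -> float:
    """White (supF-mutant) colonies divided by all colonies counted."""
    if total_colonies <= 0:
        raise ValueError("total_colonies must be positive")
    if not 0 <= white_colonies <= total_colonies:
        raise ValueError("white_colonies must be between 0 and total_colonies")
    return white_colonies / total_colonies


if __name__ == "__main__":
    # Hypothetical counts for one sample; not data from this study.
    freq = mutation_frequency(white_colonies=55, total_colonies=55_000)
    print(f"mutation frequency = {freq:.2e}")
```

Because mutant colonies are rare, counting tens of thousands of colonies per sample keeps the number of white colonies large enough for the ratio to be stable.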

RNA sequencing

RNA was collected from asynchronous U2OS or HMCESΔ clones 1 and 2 at passage 41 with the Aurum total RNA mini kit (Biorad). The library was prepared with the Illumina Tru-seq total RNA sample prep kit, sequencing was performed on the Illumina HiSeq 3000 (PE75) and 30 million reads were generated for each sample. Reads were trimmed to remove adaptor sequences using Flexbar (Dodt et al., 2012) and aligned to hg38 using HiSat2 (Kim et al., 2015).

FeatureCounts was used to count the number of mapped reads to each gene (Liao et al., 2014). Differential gene expression analysis was performed with edgeR (McCarthy et al., 2012; Robinson et al., 2010).
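To make the quantities reported from this type of analysis more concrete (log2 fold change and counts per million), the sketch below shows a deliberately simplified version of both calculations. It is not the edgeR procedure, which applies library-size normalization and statistical modeling that are omitted here; the function names, the pseudocount, and the read counts are all assumptions for illustration.

```python
import math


def counts_per_million(counts):
    """Scale raw per-gene read counts by the library total, per million reads."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("library has no mapped reads")
    return {gene: count * 1e6 / total for gene, count in counts.items()}


def log2_fold_change(cpm_knockout, cpm_wild_type, pseudocount=1.0):
    """log2(knockout / wild type); negative values mean lower expression in the knockout.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    return math.log2((cpm_knockout + pseudocount) / (cpm_wild_type + pseudocount))


if __name__ == "__main__":
    # Hypothetical read counts for two genes in two libraries.
    wild_type = counts_per_million({"geneA": 250, "geneB": 750})
    knockout = counts_per_million({"geneA": 100, "geneB": 900})
    for gene in wild_type:
        print(gene, round(log2_fold_change(knockout[gene], wild_type[gene]), 3))
```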

Quantification and Statistical Analysis

Statistical analyses were completed using Prism. An ANOVA test was used when comparing more than two groups followed by a Dunnett multiple comparisons post-test. A two-tailed t-test was used to compare two samples with normally distributed data. No statistical methods or criteria were used to estimate sample size or to include/exclude samples. Multiple siRNAs and CRISPR/Cas9-derived clonal cell lines were analyzed to confirm results were not caused by off-target effects or clonal variations. Unless otherwise stated all experiments were performed twice and representative experiments are shown.

Data and Software Availability

All data is present in the paper or supplemental tables. RNA sequencing reads are available at GEO, GSE121515.

Supplementary Material

Figure S1. Sequence alignment of SRAP proteins from eukaryotes, prokaryotes, and archaebacteria. Related to Figure 1. Sequence alignment was performed with Clustal Omega and visualized with Jalview.

Table S4. DNA ligands used in this study. Related to Figures 1, 4, and 6.

Figure S2. Purified proteins used in this study. Related to Figures 1, 4, and 6. (A) Coomassie stained gels of full-length N-terminal GST-HMCES wild type and mutant proteins purified from Arctic Express bacterial cells. The GST-tag was removed by protease cleavage in all but lane 1. The higher molecular weight species corresponds to the chaperones overexpressed in Arctic Express cells. DNA binding mutants R98E and R212E indicate that the DNA binding activity is not due to the co-purifying proteins. (B) Coomassie stained gels of C-terminal HIS tagged human SRAP domain and yedK purifications from BL21 bacterial cells. (C) Coomassie stained gels of N-terminal GST-SRAP before and after GST cleavage. Purified from BL21 bacterial cells.

Figure S3. HMCES does not regulate gene expression or change 5-hydroxymethylcytosine levels in U2OS cells. Related to Figure 3. (A) Genomic DNA was purified from wild type, HMCESΔ, or U2OS cells over-expressing HMCES that had been passaged for at least 30 days, transferred to a nylon membrane, denatured, and blotted with antibodies to 5hmC and 5mC. (B) Antibody specificities were verified using DNA modified with 5hmC or 5mC. (C and D) Electrophoretic mobility shift analysis of HMCES with the indicated single- and double-strand DNA ligands. Modified cytosines are in a CpG context. (C) 10nM ssDNA. (D) 1nM dsDNA. (E) RNA sequencing data. Genes highlighted in blue are significantly changed, n=2 U2OS samples and HMCESΔ clones. (F) Immunoblot analysis of wild type and HMCES knockout cell lines showing elevated p53 and p21 levels. (G) Immunofluorescence staining with anti-53BP1 antibodies in G1 cells identified by lack of PCNA foci or cyclin A (111 G1 cells scored for each sample, Kruskal-Wallis test). Scale bar is 10 μm. (H) Cells were counted every 24 hours and the doubling time is indicated in parentheses. A representative experiment is shown, n=3. (I) Cells were labeled with BrdU for 30 minutes and analyzed for BrdU and DNA content (propidium iodide, PI) by flow cytometry.

Figure S4. HMCES is not needed to repair DNA double-strand breaks. Related to Figures 3 and 6. (A) Immunoblot analysis to verify siRNA knockdown efficiencies in U2OS cells. (B) Immunoblot analysis of HMCES expression levels in U2OS cell lines with the R98E mutation edited into the endogenous HMCES alleles. (C) Clonogenic survival assay of wild type and HMCESΔ U2OS clones treated with cisplatin or camptothecin (CPT) for 24 hours. (D) Immunoblot analysis of wild type and HMCESΔ 293T cells. (E-G) Homologous recombination and non-homologous end joining were measured using the DR-GFP and EJ5-GFP reporters. (E) U2OS DR-GFP cell line transfected with siRNA targeting BRCA1 or HMCES. (F) Wild type and HMCESΔ 293T cells transfected with DR-GFP plasmid, paired two-tailed t-test. ns=not significant. (G) Wild type and HMCESΔ U2OS cells transfected with EJ5-GFP plasmid. All graphs show mean+/−SD, n=3, ANOVA with a Dunnett post-test unless otherwise indicated.

Figure S5. HMCES-deficient cells accumulate AP sites. Related to Figure 4. (A) U2OS cells were irradiated with 100J/m2 UV and genomic DNA was collected immediately or the cells were allowed to recover for 3 hours. DNA was combined with the aldehyde reactive probe (ARP) to measure AP sites. See Figure 4 legend and STAR methods. (B) Repair of AP sites after MMS treatment as described in Figure 4 using 0.5mM MMS. An individual experiment at this dose of MMS is shown. (C) U2OS cells were treated with 10μM APE1 inhibitor III for 24 hours. Genomic DNA was purified and AP sites were detected using the ARP. (D) Native gel analysis of human SRAP domain (3, 10nM) incubated with 1nM unmodified (dT), THF stabilized abasic site, or a natural abasic site (dU + UDG) ssDNA. (E) Native gel analysis of human SRAP incubated with 1nM abasic site ssDNA (dU + UDG). (F) Quantification of DNA binding from experiments in Figures S3C and S5E. (G) Denaturing gel analysis of human SRAP crosslinking to ssDNA containing an AP site (dU + UDG), 5-formyldC (5fC), or FaPy-G. DNA was FAM-labeled, at a final concentration of 25nM, and imaged with a Typhoon.

Figure S6. HMCES forms DPCs in cells and is synthetic lethal with SHPRH. Related to Figures 4 and 6. (A-C) RADAR DPC assay. Cells were treated with 0, 20, or 100 J/m2 UV or 5mM MMS (1 hour) and allowed to recover for 3 hours in the presence or absence of 10 μM MG132. (A and B) 10μg of genomic DNA was digested with nuclease, applied to a nitrocellulose membrane, and immunoblotted with HMCES antibodies. (C) 25μg of genomic DNA was digested with nuclease, then boiled in SDS sample buffer and separated by SDS-PAGE prior to immunoblotting with either HMCES or RPA antibodies. (D) Wild type or HMCESΔ cells were transfected with siRNA to BRCA1 or BRCA2. The number of surviving colonies was compared to the non-targeting siRNA control in each cell line (mean+/−SD, n=3, two-tailed t-test). (E) The mock and UV damaged plasmid used in Figure 6F was reacted with the ARP to determine the number of abasic sites induced by UV damage (mean+/−SD, n=3). (F) Wild type or HMCESΔ cells were transfected with siRNA to SHPRH. The number of surviving colonies was compared to the non-targeting siRNA control in each cell line (mean+/−SD, n=3, two-tailed t-test).

Table S1. RNA Sequencing of HMCESΔ cells. Related to Figure 3. RNA sequencing of two HMCESΔ clones was compared to two samples of wild type U2OS cells. The log2 fold change and log counts per million are reported. Negative log fold changes indicate a reduction in HMCESΔ cells. Table in Excel file. Raw data is available in GEO, accession GSE121515.

Table S2. iPOND analysis of HMCESΔ cells. Related to Figure 6. Log2 ratios of protein fold changes between wild type and HMCESΔ 293T cells. Positive values indicate an increase in HMCESΔ cells. Table in Excel file.

Table S3. Sequence of oligonucleotides used in this study. Related to Figures 1, 4, and 6.

Highlights.

  • HMCES senses abasic sites in ssDNA and forms a covalent DNA-protein crosslink

  • HMCES shields the abasic site from TLS polymerases and endonucleases

  • HMCES is a suicide enzyme and the DPC is ubiquitylated and degraded

  • HMCES is conserved in all domains of life and loss results in genetic instability

Acknowledgements:

We thank Drs. Karlene Cimprich, Maria Jasin, Jeremy Stark, John Pascal, and Alexander Drohat for providing reagents. We thank Dr. Carmelo Rizzo for the suggestion of looking at AP sites and Dr. Lisa Poole for help with RNA sequencing data analysis. We also thank Drs. Paul Modrich, James Berger, James Dewar, and Ian Macara for critical reading of the manuscript.

Funding: This research was supported by grants to D.C. from the NIH (R01GM116616) and the Breast Cancer Research Foundation. K.N.M. was supported by Susan G. Komen fellowship PDF14302198 and 5T32CA009582. P.S.T. was supported by P01CA092584 and F30CA228242. S.R.W. was supported by F32GM126646. K.P.M.M. was supported by 5T32CA009582. Additional support came from the Vanderbilt-Ingram Cancer Center.

Footnotes

Declaration of interests: Authors declare no competing interests.

References

  1. Amunugama R, Willcox S, Wu RA, Abdullah UB, El-Sagheer AH, Brown T, McHugh PJ, Griffith JD, and Walter JC (2018). Replication Fork Reversal during DNA Interstrand Crosslink Repair Requires CMG Unloading. Cell reports 23, 3419–3428.
  2. Aravind L, Anand S, and Iyer LM (2013). Novel autoproteolytic and DNA-damage sensing components in the bacterial SOS response and oxidized methylcytosine-induced eukaryotic DNA demethylation systems. Biology direct 8, 20.
  3. Bennardo N, Cheng A, Huang N, and Stark JM (2008). Alternative-NHEJ is a mechanistically distinct pathway of mammalian chromosome break repair. PLoS genetics 4, e1000110.
  4. Bewick AJ, Vogel KJ, Moore AJ, and Schmitz RJ (2017). Evolution of DNA Methylation across Insects. Molecular biology and evolution 34, 654–665.
  5. Bhat KP, and Cortez D (2018). RPA and RAD51: fork reversal, fork protection, and genome stability. Nat Struct Mol Biol 25, 446–453.
  6. Carrico IS (2008). Chemoselective modification of proteins: hitting the target. Chemical Society reviews 37, 1423–1431.
  7. Dawson PE, Muir TW, Clark-Lewis I, and Kent SB (1994). Synthesis of proteins by native chemical ligation. Science 266, 776–779.
  8. Dianov GL, Sleeth KM, Dianova II, and Allinson SL (2003). Repair of abasic sites in DNA. Mutat Res 531, 157–163.
  9. Dignam JD, Lebovitz RM, and Roeder RG (1983). Accurate transcription initiation by RNA polymerase II in a soluble extract from isolated mammalian nuclei. Nucleic Acids Res 11, 1475–1489.
  10. Dodt M, Roehr JT, Ahmed R, and Dieterich C (2012). FLEXBAR-Flexible Barcode and Adapter Processing for Next-Generation Sequencing Platforms. Biology 1, 895–905.
  11. Dungrawala H, Rose KL, Bhat KP, Mohni KN, Glick GG, Couch FB, and Cortez D (2015). The Replication Checkpoint Prevents Two Types of Fork Collapse without Regulating Replisome Stability. Mol Cell 59, 998–1010.
  12. Duxin JP, Dewar JM, Yardimci H, and Walter JC (2014). Repair of a DNA-protein crosslink by replication-coupled proteolysis. Cell 159, 346–357.
  13. Friedberg EC, Walker GC, Siede W, Wood RD, Schultz RA, and Ellenberger T (2006). DNA repair and mutagenesis, 2nd edn (Washington, D.C.: ASM Press).
  14. Gorner H (1994). Photochemistry of DNA and related biomolecules: quantum yields and consequences of photoionization. Journal of photochemistry and photobiology B, Biology 26, 117–139.
  15. Gyori BM, Venkatachalam G, Thiagarajan PS, Hsu D, and Clement MV (2014). OpenComet: an automated tool for comet assay image analysis. Redox biology 2, 457–465.
  16. Harrigan JA, Belotserkovskaya R, Coates J, Dimitrova DS, Polo SE, Bradshaw CR, Fraser P, and Jackson SP (2011). Replication stress induces 53BP1-containing OPT domains in G1 cells. J Cell Biol 193, 97–108.
  17. He YF, Li BZ, Li Z, Liu P, Wang Y, Tang Q, Ding J, Jia Y, Chen Z, Li L, et al. (2011). Tet-mediated formation of 5-carboxylcytosine and its excision by TDG in mammalian DNA. Science 333, 1303–1307.
  18. Hendel A, Krijger PH, Diamant N, Goren Z, Langerak P, Kim J, Reissner T, Lee KY, Geacintov NE, Carell T, et al. (2011). PCNA ubiquitination is important, but not essential for translesion DNA synthesis in mammalian cells. PLoS genetics 7, e1002262.
  19. Hu P, Janga SC, Babu M, Diaz-Mejia JJ, Butland G, Yang W, Pogoutse O, Guo X, Phanse S, Wong P, et al. (2009). Global functional atlas of Escherichia coli encompassing previously uncharacterized proteins. PLoS Biol 7, e96.
  20. Jaenisch R, and Bird A (2003). Epigenetic regulation of gene expression: how the genome integrates intrinsic and environmental signals. Nat Genet 33 Suppl, 245–254.
  21. Kavli B, Otterlei M, Slupphaug G, and Krokan HE (2007). Uracil in DNA--general mutagen, but normal intermediate in acquired immunity. DNA Repair (Amst) 6, 505–516.
  22. Kiianitsa K, and Maizels N (2013). A rapid and sensitive assay for DNA-protein covalent complexes in living cells. Nucleic Acids Res 41, e104.
  23. Kim D, Langmead B, and Salzberg SL (2015). HISAT: a fast spliced aligner with low memory requirements. Nat Methods 12, 357–360.
  24. Kraemer KH, and Seidman MM (1989). Use of supF, an Escherichia coli tyrosine suppressor tRNA gene, as a mutagenic target in shuttle-vector plasmids. Mutat Res 220, 61–72.
  25. Krokan HE, and Bjoras M (2013). Base excision repair. Cold Spring Harb Perspect Biol 5, a012583.
  26. Kubo K, Ide H, Wallace SS, and Kow YW (1992). A novel, sensitive, and specific assay for abasic sites, the most commonly produced DNA lesion. Biochemistry 31, 3703–3708.
  27. Kuluncsics Z, Perdiz D, Brulay E, Muel B, and Sage E (1999). Wavelength dependence of ultraviolet-induced DNA damage distribution: involvement of direct or indirect mechanisms and possible artefacts. Journal of photochemistry and photobiology B, Biology 49, 71–80.
  28. Kweon SM, Zhu B, Chen Y, Aravind L, Xu SY, and Feldman DE (2017). Erasure of Tet-Oxidized 5-Methylcytosine by a SRAP Nuclease. Cell reports 21, 482–494.
  29. Liao Y, Smyth GK, and Shi W (2014). featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923–930.
  30. Lin JR, Zeman MK, Chen JY, Yee MC, and Cimprich KA (2011). SHPRH and HLTF act in a damage-specific manner to coordinate different forms of postreplication repair and prevent mutagenesis. Mol Cell 42, 237–249.
  31. Lindahl T (1993). Instability and decay of the primary structure of DNA. Nature 362, 709–715. [DOI] [PubMed] [Google Scholar]
  32. Lukas C, Savic V, Bekker-Jensen S, Doil C, Neumann B, Pedersen RS, Grofte M, Chan KL, Hickson ID, Bartek J, et al. (2011). 53BP1 nuclear bodies form around DNA lesions generated by mitotic transmission of chromosomes under replication stress. Nat Cell Biol 13, 243–253. [DOI] [PubMed] [Google Scholar]
  33. Mailand N, Gibbs-Seymour I, and Bekker-Jensen S (2013). Regulation of PCNA-protein interactions for genome stability. Nat Rev Mol Cell Biol 14, 269–282. [DOI] [PubMed] [Google Scholar]
  34. McCarthy DJ, Chen Y, and Smyth GK (2012). Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Res 40, 4288–4297. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Mitchell DL, Jen J, and Cleaver JE (1991). Relative induction of cyclobutane dimers and cytosine photohydrates in DNA irradiated in vitro and in vivo with ultraviolet-C and ultravioletB light. Photochemistry and photobiology 54, 741–746. [DOI] [PubMed] [Google Scholar]
  36. Mohni KN, Kavanaugh GM, and Cortez D (2014). ATR pathway inhibition is synthetically lethal in cancer cells with ERCC1 deficiency. Cancer Res 74, 2835–2845. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Mohni KN, Thompson PS, Luzwick JW, Glick GG, Pendleton CS, Lehmann BD, Pietenpol JA, and Cortez D (2015). A Synthetic Lethal Screen Identifies DNA Repair Pathways that Sensitize Cancer Cells to Combined ATR Inhibition and Cisplatin Treatments. PLoS One 10, e0125482. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Nik-Zainal S, Alexandrov LB, Wedge DC, Van Loo P, Greenman CD, Raine K, Jones D, Hinton J, Marshall J, Stebbings LA, et al. (2012). Mutational processes molding the genomes of 21 breast cancers. Cell 149, 979–993. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Parris CN, and Seidman MM (1992). A signature element distinguishes sibling and independent mutations in a shuttle vector plasmid. Gene 117, 1–5. [DOI] [PubMed] [Google Scholar]
  40. Quinones JL, Thapar U, Yu K, Fang Q, Sobol RW, and Demple B (2015). Enzyme mechanism-based, oxidative DNA-protein cross-links formed with DNA polymerase beta in vivo. Proc Natl Acad Sci U S A 112, 8602–8607. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Ran FA, Hsu PD, Wright J, Agarwala V, Scott DA, and Zhang F (2013). Genome engineering using the CRISPR-Cas9 system. Nature protocols 8, 2281–2308. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Robinson MD, McCarthy DJ, and Smyth GK (2010). edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Sale JE (2013). Translesion DNA synthesis and mutagenesis in eukaryotes. Cold Spring Harb Perspect Biol 5, a012708. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Schindelin J, Arganda-Carreras I, Frise E, Kaynig V, Longair M, Pietzsch T, Preibisch S, Rueden C, Saalfeld S, Schmid B, et al. (2012). Fiji: an open-source platform for biological-image analysis. Nat Methods 9, 676–682. [DOI] [PMC free article] [PubMed] [Google Scholar]
  45. Sirbu BM, Couch FB, Feigerle JT, Bhaskara S, Hiebert SW, and Cortez D (2011). Analysis of protein dynamics at active, stalled, and collapsed replication forks. Genes Dev 25, 1320–1327. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Spruijt CG, Gnerlich F, Smits AH, Pfaffeneder T, Jansen PW, Bauer C, Munzel M, Wagner M, Muller M, Khan F, et al. (2013). Dynamic readers for 5-(hydroxy)methylcytosine and its oxidized derivatives. Cell 152, 1146–1159. [DOI] [PubMed] [Google Scholar]
  47. Stark JM, Pierce AJ, Oh J, Pastink A, and Jasin M (2004). Genetic steps of mammalian homologous repair with distinct mutagenic consequences. Mol Cell Biol 24, 9305–9316. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Stingele J, Bellelli R, and Boulton SJ (2017). Mechanisms of DNA-protein crosslink repair. Nat Rev Mol Cell Biol 18, 563–573. [DOI] [PubMed] [Google Scholar]
  49. Stingele J, Schwarz MS, Bloemeke N, Wolf PG, and Jentsch S (2014). A DNAdependent protease involved in DNA-protein crosslink repair. Cell 158, 327–338. [DOI] [PubMed] [Google Scholar]
  50. Treier M, Staszewski LM, and Bohmann D (1994). Ubiquitin-dependent c-Jun degradation in vivo is mediated by the delta domain. Cell 78, 787–798. [DOI] [PubMed] [Google Scholar]
  51. Xia B, Sheng Q, Nakanishi K, Ohashi A, Wu J, Christ N, Liu X, Jasin M, Couch FJ, and Livingston DM (2006). Control of BRCA2 cellular and clinical functions by a nuclear partner, PALB2. Mol Cell 22, 719–729. [DOI] [PubMed] [Google Scholar]

Associated Data

Supplementary Materials

Figure S1. Sequence alignment of SRAP proteins from eukaryotes, prokaryotes, and archaebacteria. Related to Figure 1. Sequence alignment was performed with Clustal Omega and visualized with Jalview.

Table S4. DNA ligands used in this study. Related to Figures 1, 4, and 6.

Figure S2. Purified proteins used in this study. Related to Figures 1, 4, and 6. (A) Coomassie-stained gels of full-length N-terminal GST-HMCES wild type and mutant proteins purified from Arctic Express bacterial cells. The GST tag was removed by protease cleavage in all but lane 1. The higher molecular weight species corresponds to the chaperones overexpressed in Arctic Express cells. The DNA binding mutants R98E and R212E indicate that the DNA binding activity is not due to the co-purifying proteins. (B) Coomassie-stained gels of C-terminal His-tagged human SRAP domain and yedK purifications from BL21 bacterial cells. (C) Coomassie-stained gels of N-terminal GST-SRAP before and after GST cleavage, purified from BL21 bacterial cells.

Figure S3. HMCES does not regulate gene expression or change 5-hydroxymethylcytosine levels in U2OS cells. Related to Figure 3. (A) Genomic DNA was purified from wild type, HMCESΔ, or U2OS cells over-expressing HMCES that had been passaged for at least 30 days, transferred to a nylon membrane, denatured, and blotted with antibodies to 5hmC and 5mC. (B) Antibody specificities were verified using DNA modified with 5hmC or 5mC. (C and D) Electrophoretic mobility shift analysis of HMCES with the indicated single- and double-strand DNA ligands. Modified cytosines are in a CpG context. (C) 10 nM ssDNA. (D) 1 nM dsDNA. (E) RNA sequencing data. Genes highlighted in blue are significantly changed, n=2 each for wild type U2OS samples and HMCESΔ clones. (F) Immunoblot analysis of wild type and HMCES knockout cell lines showing elevated p53 and p21 levels. (G) Immunofluorescence staining with anti-53BP1 antibodies in G1 cells identified by lack of PCNA foci or cyclin A (111 G1 cells scored for each sample, Kruskal-Wallis test). Scale bar is 10 μm. (H) Cells were counted every 24 hours and the doubling time is indicated in parentheses. A representative experiment is shown, n=3. (I) Cells were labeled with BrdU for 30 minutes and analyzed for BrdU and DNA content (propidium iodide, PI) by flow cytometry.

Figure S4. HMCES is not needed to repair DNA double-strand breaks. Related to Figures 3 and 6. (A) Immunoblot analysis to verify siRNA knockdown efficiencies in U2OS cells. (B) Immunoblot analysis of HMCES expression levels in U2OS cell lines with the R98E mutation edited into the endogenous HMCES alleles. (C) Clonogenic survival assay of wild type and HMCESΔ U2OS clones treated with cisplatin or camptothecin (CPT) for 24 hours. (D) Immunoblot analysis of wild type and HMCESΔ 293T cells. (E-G) Homologous recombination and non-homologous end joining were measured using the DR-GFP and EJ5-GFP reporters. (E) U2OS DR-GFP cell line transfected with siRNA targeting BRCA1 or HMCES. (F) Wild type and HMCESΔ 293T cells transfected with DR-GFP plasmid, paired two-tailed t-test. ns=not significant. (G) Wild type and HMCESΔ U2OS cells transfected with EJ5-GFP plasmid. All graphs show mean ± SD, n=3, ANOVA with a Dunnett post-test unless otherwise indicated.

Figure S5. HMCES-deficient cells accumulate AP sites. Related to Figure 4. (A) U2OS cells were irradiated with 100 J/m² UV and genomic DNA was collected immediately or the cells were allowed to recover for 3 hours. DNA was combined with the aldehyde reactive probe (ARP) to measure AP sites. See Figure 4 legend and STAR Methods. (B) Repair of AP sites after MMS treatment as described in Figure 4 using 0.5 mM MMS. An individual experiment at this dose of MMS is shown. (C) U2OS cells were treated with 10 μM APE1 inhibitor III for 24 hours. Genomic DNA was purified and AP sites were detected using the ARP. (D) Native gel analysis of human SRAP domain (3, 10 nM) incubated with 1 nM unmodified (dT), THF-stabilized abasic site, or a natural abasic site (dU + UDG) ssDNA. (E) Native gel analysis of human SRAP incubated with 1 nM abasic site ssDNA (dU + UDG). (F) Quantification of DNA binding from experiments in Figures S3C and S5E. (G) Denaturing gel analysis of human SRAP crosslinking to ssDNA containing an AP site (dU + UDG), 5-formyldC (5fC), or FaPy-G. DNA was FAM-labeled, at a final concentration of 25 nM, and imaged with a Typhoon.

Figure S6. HMCES forms DPCs in cells and is synthetic lethal with SHPRH. Related to Figures 4 and 6. (A-C) RADAR DPC assay. Cells were treated with 0, 20, or 100 J/m² UV or 5 mM MMS (1 hour) and allowed to recover for 3 hours in the presence or absence of 10 μM MG132. (A and B) 10 μg of genomic DNA was digested with nuclease, applied to a nitrocellulose membrane, and immunoblotted with HMCES antibodies. (C) 25 μg of genomic DNA was digested with nuclease, then boiled in SDS sample buffer and separated by SDS-PAGE prior to immunoblotting with either HMCES or RPA antibodies. (D) Wild type or HMCESΔ cells were transfected with siRNA to BRCA1 or BRCA2. The number of surviving colonies was compared to the non-targeting siRNA control in each cell line (mean ± SD, n=3, two-tailed t-test). (E) The mock- and UV-damaged plasmid used in Figure 6F was reacted with the ARP to determine the number of abasic sites induced by UV damage (mean ± SD, n=3). (F) Wild type or HMCESΔ cells were transfected with siRNA to SHPRH. The number of surviving colonies was compared to the non-targeting siRNA control in each cell line (mean ± SD, n=3, two-tailed t-test).

Table S1. RNA Sequencing of HMCESΔ cells. Related to Figure 3. RNA sequencing of two HMCESΔ clones was compared to two samples of wild type U2OS cells. The log2 fold change and log counts per million are reported. Negative log fold changes indicate a reduction in HMCESΔ cells. Table in Excel file. Raw data is available in GEO, accession GSE121515.
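
The sign convention used in Table S1 can be illustrated with a minimal sketch. It is a simplified illustration only and is not the authors' analysis pipeline: the function names, the pseudocount, and the example values are all hypothetical, and the normalization is plain counts per million rather than anything package-specific.

```python
import math
from statistics import mean


def counts_per_million(count, library_size):
    """Scale a raw read count to counts per million reads in the library."""
    return count * 1_000_000 / library_size


def log2_fold_change(knockout_cpm, wild_type_cpm, pseudocount=1.0):
    """Log2 ratio of mean knockout CPM to mean wild type CPM.

    The pseudocount is an assumption of this sketch; it avoids log2(0).
    """
    ko = mean(knockout_cpm) + pseudocount
    wt = mean(wild_type_cpm) + pseudocount
    return math.log2(ko / wt)


# Hypothetical gene: two knockout clones and two wild type samples (made-up CPM values).
example = log2_fold_change([50.0, 50.0], [100.0, 100.0], pseudocount=0.0)
print(example)  # negative value: the gene is reduced in the knockout cells
```

With the knockout in the numerator, a gene whose expression is halved gives a negative log2 fold change, matching the convention stated for the table.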

Table S2. iPOND analysis of HMCESΔ cells. Related to Figure 6. Log2 ratios of protein fold changes between wild type and HMCESΔ 293T cells. Positive values indicate an increase in HMCESΔ cells. Table in Excel file.

Table S3. Sequence of oligonucleotides used in this study. Related to Figures 1, 4, and 6.

Data Availability Statement

All data is present in the paper or supplemental tables. RNA sequencing reads are available at GEO, GSE121515.
