Cell Genomics. 2023 May 2;3(6):100315. doi: 10.1016/j.xgen.2023.100315

Cell-type specificity of the human mutation landscape with respect to DNA replication dynamics

Madison Caballero 1, Dominik Boos 2, Amnon Koren 1,3
PMCID: PMC10300547  PMID: 37388911

Summary

The patterns of genomic mutations are associated with various genomic features, most notably late replication timing, yet it remains contested which mutation types and signatures relate to DNA replication dynamics and to what extent. Here, we perform high-resolution comparisons of mutational landscapes between lymphoblastoid cell lines, chronic lymphocytic leukemia tumors, and three colon adenocarcinoma cell lines, including two with mismatch repair deficiency. Using cell-type-matched replication timing profiles, we demonstrate that mutation rates exhibit heterogeneous replication timing associations among cell types. This cell-type heterogeneity extends to the underlying mutational pathways, as mutational signatures show inconsistent replication timing bias between cell types. Moreover, replicative strand asymmetries exhibit similar cell-type specificity, albeit with different relationships to replication timing than mutation rates. Overall, we reveal an underappreciated complexity and cell-type specificity of mutational pathways and their relationship to replication timing.

Keywords: mutations, replication timing, mutational signatures

Highlights

  • Analysis of somatic mutations and matched replication timing in five cell types

  • Cell-type variability in mutation bias toward late replicating regions

  • Cell-type heterogeneity in mutational pathways and replicative strand asymmetry


An analysis of the mutational landscape in five cell types reveals diverse relationships between mutation rate, the prevalence of mutational pathways, and replicative strand asymmetry, all in relation to DNA replication timing.

Introduction

Mutations arise through a compendium of known and unknown mechanisms. These include the improper repair of DNA damage produced by endogenous or exogenous agents, enzymatic alterations of DNA, and mismatches introduced during DNA replication. Knowing how, where, and when mutations occur is central to understanding evolution, aging, and disease. In this respect, it is well established that mutations are distributed non-randomly at the nucleotide, regional, and global genomic levels. At the nucleotide level, many mutational pathways are biased toward specific nucleotide substitutions and surrounding sequence contexts.1 For example, the spontaneous deamination of 5-methylcytosine to thymine happens almost exclusively at CpG sites.2 On a regional and global scale, variations in mutation rates and substitution types are associated with various genetic and epigenetic factors including nucleotide content,3,4 chromatin state,5,6,7,8 three-dimensional genome organization,9 transcription factor binding,10,11 and—most notably—DNA replication timing.12,13,14,15,16,17,18,19,20,21

DNA replication timing is the cell-type-specific spatiotemporal pattern of genome replication along S-phase. In eukaryotic cells, DNA replication begins at multiple replication origins that fire throughout S-phase and mediate bidirectional replication until the entire genome is duplicated. Late replicating regions of the genome are broadly enriched for single-nucleotide variants and mutations.12,13,15,16,17,22,23 The mechanisms by which mutations accumulate in later replicating regions of the genome remain incompletely understood, although evidence suggests that mismatch repair (MMR) attenuates toward the end of S-phase and contributes to these biases.17,24 On the other hand, many classes of mutations and their underlying mutational pathways are not biased with respect to replication timing,13,16 suggesting complex contributions by different DNA damage and repair pathways.

A powerful method to glean the types and abundances of mutational pathways that shape mutational landscapes has been the analysis of local trinucleotide mutation signatures. Large-scale pan-cancer analyses revealed an extensive diversity of mutation signatures between and within cancer types.1,25,26 Some mutational processes are shared such as those manifesting as single base substitution (SBS) signatures 1, 5, and 40, while others are more specific to subsets of cell or cancer types such as MMR deficiency. Previous studies showed that different mutational processes—and their resulting mutational signatures—have differential relationships to replication timing.11,13,16,27,28 For example, SBS1, SBS8, SBS9, and SBS17 were shown to be enriched in late-replicating regions of the genome, while SBS5, SBS21, SBS40, and SBS44 showed either bias to early replication or no bias at all. Another property of mutations that we and others have previously described is DNA replicative strand asymmetry, in which certain mutation types tend to occur more often on either the leading or the lagging strands of replication.16,29,30 Replicative strand asymmetry is characteristic of several mutational signatures (notably SBS2, SBS3, SBS13, and SBS17), while others are not coupled to asymmetry, e.g., signature SBS8 is more often observed in late replicating regions but does not show significant replicative strand asymmetry.27

Previous studies that established how mutational processes relate to DNA replication have assumed that any given process relates to replication timing and strand bias in a constant way. However, it is becoming increasingly clear that mutational processes may be heterogeneous not only in their quantity across cell/cancer types but also in their relation to replication dynamics across cell types.1,27,28 This complexity has led to conflicting conclusions among different studies. For example, signature SBS1, which is caused by spontaneous deamination of 5-methylcytosine to thymine, has been reported by different studies to be biased toward early replication, late replication, or neither.11,16,28 Similarly inconsistent conclusions have been proposed for SBS5, SBS40, and others.11,16,28 These conflicting results could be reconciled if additional, orthogonal factors that vary between cell types affect the relationship of mutational processes to DNA replication timing.

Here, we utilized several complementary cell types to perform high-resolution comparisons of mutation rate, mutational pathways, and replicative strand asymmetry, all with respect to cell-type-specific replication timing. We find that the relationship between mutation rate and replication timing differed by cell type, with the excess of mutations in the latest over the earliest replicating regions ranging from 1.63-fold in HCT116 to 4.58-fold in CLL. We further characterized cell-type variability in mutational pathways and their replicative strand asymmetries as a function of replication timing, finding that the same mutational pathway exhibits varying degrees of late replication bias in different cell types. Our results underscore how a given mutational pathway can exhibit heterogeneous relationships to replication timing in different cell types.

Results

A catalog of somatic mutations in five cell types/lines

We called somatic mutations in five cell types/lines for which matched replication timing data were either available or were generated here. These cell types included B lymphoblastoid cell lines (LCLs), B cell chronic lymphocytic leukemia (CLL), and three colon cancer cell lines to contrast with the B cell-related data (Table S1).

We analyzed 885,655 autosomal single-nucleotide variant (SNV) mutations in LCLs, called through family-based mutation identification.31 LCL mutations are mostly somatic in origin, are estimated to have a >90% mutation-calling accuracy, and <1% are predicted to be functional. Autosomal mutation counts ranged from 66 to 8,737 per individual (median 408; 0.169 mutations/Mb) (Figure 1A). LCL mutations were compared with LCL DNA replication timing generated from the same samples.31 To complement the analysis of LCLs, we incorporated 377,605 autosomal single-nucleotide mutations from patients with CLL called from tumor-normal comparison.31 Mutation counts ranged from 221 to 5,629 per patient (median 2,368; 0.98 mutations/Mb) (Figure 1A). Due to the primary tumor source and abundant copy-number alterations of CLL,32,33 we used LCL replication timing to compare with CLL mutations, given that similar cell types have conserved replication timing.34,35

Figure 1.

Mutation rate association with DNA replication timing varies in a cell-type-specific manner

(A) Mutation sources and autosomal counts.

(B) Autosomal mutation counts in 20 replication timing bins of uniform genome content.

(C) Mutation rate correlations to cell-type-specific replication timing of HCT116 and LCLs. Mutation rate was calculated as the mean number of mutations across all samples of the same cell type in 1 Mb sliding windows with a 0.5 Mb step. Mutation rates were normalized to an autosomal mean of zero and a standard deviation of one to control for the different mutation rates in the two cell types.

(D) Mutation rates correlate most strongly with replication timing profiles of the same cells/cell type. Correlation values are Pearson’s correlation coefficients.

As a further axis of comparison, we incorporated mutational accumulation experiments in three colon adenocarcinoma cell lines. Two cell lines, HCT116 and LS180, possess microsatellite instability (MSI) resulting from loss of functional MMR. The third, HT115, was microsatellite stable (MSS) with intact MMR. To accumulate mutations, cell lines were sequentially passaged, and single-cell daughter clones were then isolated, expanded, sequenced, and compared with the original parental clone (Figure 1A). Mutations from LS180 and HT115 were sourced from Petljak et al.26 The cell lines were passaged for 44 and 45 days, respectively, and five daughter subclones were isolated from each line. LS180 yielded 14,974 autosomal mutations (range: 749–5,310; median: 2,601) and HT115 yielded 28,944 (range: 5,296–6,511; median: 5,572). HCT116 was passaged by us 100 times (approximately 1 year), and six daughter subclones were isolated. HCT116 yielded 150,470 autosomal mutations (range: 15,385–39,469; median: 21,846; 9.74 mutations/Mb). Replication timing profiles for LS180 and HT115 were produced by sorting and sequencing G1- and S-phase cells.12,22 An HCT116 mean reference replication timing profile was generated from the whole-genome sequencing of the six daughter subclones (this was achievable since HCT116 is near diploid) and further validated by comparison to a profile generated by G1/S sequencing (see STAR Methods). The replication timing profiles of individual HCT116 clones were highly correlated with each other (Pearson r between 0.91 and 0.97; median = 0.96).

High-resolution comparison of mutation rates with DNA replication timing

Given our comprehensive catalog of cell line mutations and the high-resolution analysis they enable, we first sought to refine the relationship of mutation rate to replication timing. We divided the autosomal replication timing profiles into 20 bins of equal genomic proportions organized from the earliest replicating fraction to the latest and counted the number of mutations of each respective cell type within the replication timing range of each bin. While all cell types showed continuous increases in mutation rate with later replication, these relationships differed considerably among cell types (Figure 1B). Both B cell-derived cell types, LCL and CLL, showed exponential-like increases in mutation rate from the earliest to latest replicating bins. Interestingly, LCL only showed an increase in mutation rate in the second half of S-phase, whereas CLL showed a continuous increase (Figure 1B). CLL demonstrated a more dramatic overall increase in mutation rate, with 4.58-fold more mutations between the latest and earliest replicating bins (from 2.55% of mutations to 11.67%) than LCL (1.90-fold; Figure 1B). We also observed strong increases in mutation rate in HT115 and LS180, with 3.10-fold (range of 2.8- to 3.3-fold for individual clones) and 3.18-fold (range: 2.6- to 3.3-fold) more mutations in the latest replicating bins than the earliest, respectively (Figure 1B). In contrast, HCT116 showed a diminished relationship, with an only 1.63-fold (3.90%–6.35%; 1.4- to 1.9-fold for individual clones) increase in mutation rate (Figure 1B). The contrast between the cell types, demonstrated most profoundly when comparing CLL and HCT116, establishes a wide disparity in how mutation rates relate to DNA replication timing.
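The binning and fold-change calculation described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function names are hypothetical, and it assumes that replication timing is available as one value per equally sized genomic window, with larger values meaning earlier replication.

```python
from bisect import bisect_right

def bin_edges(window_rt, n_bins=20):
    """Cut points that split the genome into n_bins of equal genomic content.

    window_rt: replication timing of every genomic window.
    Assumption: larger values mean earlier replication, so values are
    negated to order bins from earliest (bin 0) to latest (last bin).
    """
    lateness = sorted(-v for v in window_rt)
    n = len(lateness)
    return [lateness[(i * n) // n_bins] for i in range(1, n_bins)]

def count_by_bin(mutation_rt, edges):
    """Count mutations per bin from the replication timing at each mutation."""
    counts = [0] * (len(edges) + 1)
    for v in mutation_rt:
        counts[bisect_right(edges, -v)] += 1
    return counts

def late_to_early_fold(counts):
    """Ratio of mutations in the latest bin to mutations in the earliest bin."""
    return counts[-1] / counts[0]

# Toy genome of 100 windows split into 4 bins (the study used 20).
windows = [i / 10 for i in range(-50, 50)]
edges = bin_edges(windows, n_bins=4)
counts = count_by_bin([4.0, 3.0, -4.0, -4.5, -3.0, 0.5], edges)
print(counts, late_to_early_fold(counts))
```

The same ratio applied to the reported shares of mutations reproduces the quoted values: 11.67 / 2.55 rounds to 4.58 for CLL, and 6.35 / 3.90 rounds to 1.63 for HCT116.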

The relationship between replication timing and mutation rates was also apparent visually: plotting mutation rates as continuous profiles along chromosomes revealed a cell-type-specific correspondence with replication timing (Figures 1C and S1A). Indeed, the mutation rate in each cell type was most strongly correlated with its matching replication timing profile (Figure 1D). Overall, our comprehensive dataset comparing mutation rates with matching replication profiles establishes their global correlation but also the heterogeneity among cell types.
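A continuous mutation rate profile of the kind shown in Figure 1C, and its correlation with a replication timing profile, could be computed along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, positions are 0-based coordinates on a single chromosome, and the normalization uses the population standard deviation, which the text does not specify.

```python
from math import sqrt

def window_counts(positions, chrom_length, window=1_000_000, step=500_000):
    """Count mutations in 1 Mb sliding windows with a 0.5 Mb step."""
    counts = []
    start = 0
    while start + window <= chrom_length:
        counts.append(sum(1 for p in positions if start <= p < start + window))
        start += step
    return counts

def mean_rate(per_sample_positions, chrom_length):
    """Mean mutation count per window across all samples of one cell type."""
    per_sample = [window_counts(p, chrom_length) for p in per_sample_positions]
    return [sum(col) / len(col) for col in zip(*per_sample)]

def zscore(values):
    """Normalize to a mean of zero and a standard deviation of one."""
    mean = sum(values) / len(values)
    sd = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd for v in values]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    zx, zy = zscore(x), zscore(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)
```

Comparing each cell type's normalized mutation rate profile with every available replication timing profile in this way yields a correlation matrix of the kind summarized in Figure 1D.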

A heterogeneous relationship between replication timing and mutational signatures

To further probe the heterogeneity by which the mutational landscape relates to replication timing, we deciphered the underlying mutational pathways in each cell type and investigated how the rate of each of them varies across the genome in relation to cell-type-specific replication timing programs. Specifically, we asked if the disparity in mutation rates between early and late replicating regions could be attributed to specific mutational pathways.

We first determined which COSMIC v.3.2 SBS mutational signatures were active in each cell type and in what proportions. Mutations in both LCL and CLL were best explained by the combination of SBS1, SBS5, SBS9, and SBS40.31 The combination of SBS1, SBS5, and SBS40 comprises clock-like signatures, highly ubiquitous signatures of unknown etiology that increase in abundance with age.1,36 The proposed etiology of SBS9 is error-prone polymerase η repair, which is linked to on- and off-target somatic hypermutation, a pathway prominent in, and nearly exclusive to, B cells.1,37,38,39 Both on-target mutagenesis at immunoglobulin genes and off-target mutagenesis begin with DNA damage from enzymatic, replication, or other genotoxic stress, which is then repaired by DNA polymerase η.39 We found that SBS9 was present globally in both CLL and LCL, but the proportion of mutations was higher in LCL (30% ± 0.12% of all autosomal mutations) than in CLL (14.8% ± 0.15%) (Figures 2A, 2B, and S2A). Mutations in the MSI cell lines HCT116 and LS180 could be explained by combinations of the seven MMR-deficiency (MMRd) signatures: SBS6, SBS14, SBS15, SBS20, SBS21, SBS26, and SBS44.1 Along with the common clock-like SBS1, SBS5, and SBS40, we found MMRd signatures SBS21 and SBS44 best explained autosomal mutations in both cell lines (cosine similarity of 0.97 in HCT116 and 0.98 in LS180). The MMRd signatures comprised a similar proportion of autosomal mutations in these two cell lines (49.5% ± 0.30% and 47.7% ± 0.95%, respectively; variation among individual clones was within ±4.4%) (Figures 2C, 2D, and S2A). HT115 is known to have functional mutations in the exonuclease domain of POLE (DNA polymerase ε). The study from which we sourced the HT115 data showed that all daughter subclones had additional mutations in the MMR genes PMS2, MSH6, and MSH3.26 One daughter subclone also had a heterozygous POLD1 (DNA polymerase δ subunit) mutation, although its signature accounted for a negligible proportion of genomic mutations26 and was therefore not further considered in our analysis. SBS10a and SBS10b (POLE mutations), SBS14 (concurrent MMRd and POLE mutations), SBS21 (MMRd), and the common clock-like SBS1, SBS5, and SBS40 best explained HT115 autosomal mutations (cosine similarity: 0.95). The signatures resulting from POLE mutations and MMRd comprised a total of 53.1% ± 0.63% of autosomal mutations (between 50.3% and 56.2% for individual clones; Figures 2E and S2A).
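The cosine similarity values quoted above measure how closely a mutation spectrum reconstructed from the fitted signatures matches the observed spectrum. A minimal sketch of that comparison is given below; the function names and the three-channel toy signatures are hypothetical, whereas real SBS signatures have 96 trinucleotide channels and the exposures would come from a signature-fitting tool.

```python
from math import sqrt

def reconstruct(signatures, exposures):
    """Sum signature profiles weighted by the number of mutations assigned to each.

    signatures: dict of name -> list of channel probabilities
    exposures:  dict of name -> number of mutations attributed to that signature
    """
    n_channels = len(next(iter(signatures.values())))
    profile = [0.0] * n_channels
    for name, weight in exposures.items():
        for i, p in enumerate(signatures[name]):
            profile[i] += weight * p
    return profile

def cosine_similarity(a, b):
    """Cosine of the angle between two mutation spectra (1.0 = identical shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

# Toy example with three channels instead of 96.
toy_signatures = {"sigA": [1.0, 0.0, 0.0], "sigB": [0.0, 0.5, 0.5]}
fitted = reconstruct(toy_signatures, {"sigA": 10, "sigB": 20})
print(cosine_similarity(fitted, [9, 11, 10]))
```

Because cosine similarity depends only on the shape of the spectrum and not on the total mutation count, it can be compared across cell lines with very different mutation loads.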

Figure 2.

Mutational signatures’ association with DNA replication timing varies in a cell-type-specific manner

(A–E) Proportion of individual mutational signatures contributing to the total pool of autosomal mutations in each cell type.

(F–J) Abundance of mutational signatures in 20 replication timing bins of equal genomic content.

Having established the main mutational signatures contributing to mutations in each cell type/line, we analyzed their relation to replication timing by fitting signatures to mutations in 20 autosomal DNA replication timing bins. We combined the contributions of SBS1, SBS5, and SBS40 into a unified clock-like mutational category, SBS21 and SBS44 into an MMRd category for HCT116 and LS180, and SBS10a, SBS10b, SBS14, and SBS21 into an MMRd+POLE category for HT115.

Several mutational signatures showed distinct relationships to replication timing. In LCL and CLL, SBS9 contribution increased 16.88- and 5.13-fold, respectively, between the earliest and the latest replication timing fractions (Figures 2F and 2G). In HCT116 and LS180, MMRd contribution increased only modestly, by 1.60- and 1.09-fold, respectively (Figures 2H and 2I). Compared with SBS9 and clock-like mutations, MMRd mutations were more uniformly distributed across the genome. This is consistent with previous findings that mutations in MSI cancers are less enriched at late replicating parts of the genome.17,40 In HT115, MMRd+POLE mutations were enriched in late replicating regions in a pattern similar to that of clock-like mutations, with 2.24-fold more mutations in the latest than in the earliest fraction (Figure 2J). The stronger replication timing dependence of the combined MMRd+POLE category compared with MMRd alone suggests that POLE-derived mutations may be specifically enriched in late replicating areas of the genome.

The clock-like category, which explained a substantial proportion of autosomal mutations in all cell types, showed different relationships to replication timing in each cell type. The strongest association was observed in LS180, with 3.42-fold more autosomal mutations in the latest versus earliest replication timing fraction, followed by HT115 (3.12-fold), CLL (3.01-fold), and HCT116 (1.90-fold) (Figures 2F–2J). In contrast, clock-like mutations showed no apparent relationship to replication timing in LCLs. When considering individual signatures, mutations contributed by SBS1—which represents spontaneous deamination of 5-methylcytosine to thymine1—were enriched in late replicating regions in CLL but not in other cell types (Figure S2B). SBS5 and SBS40 were similarly variable among cell types, although their mutational spectra similarity1 precluded associating each of them separately with replication timing. Taken together, the relationship between mutation rates and DNA replication timing varies by mutational pathway and in different ways across cell types.

Heterogeneity of mutational replicative strand asymmetry

Another property of mutations and mutational signatures that varies along the genome is their tendency to occur on the leading or lagging replicative strands. Extending from the results above, we systematically evaluated the relationships between replicative strand and mutational rates, stratified by mutational signatures and replication timing.

We used the slope of replication timing profiles in each cell type/line to assign replicative strand to mutations (Figure 3A): a negative slope on a replication timing profile indicates that the positive genome strand replicates as the leading strand, while a positive slope implies that the positive strand replicates as the lagging strand.30 Because of uncertainties surrounding the locations of replication origins and termini, we regarded 100 Kb on either side of a replication direction change as undefined strandedness. While the strand of origin of any particular mutation cannot be determined without additional information, the replicative asymmetry of mutations can be evaluated by parsing mutations based on the genomic strand and therefore the replicative strand of the substituted pyrimidine base16,30,41,42 (Figure 3A; see STAR Methods). This established approach can determine replicative strand bias based on the ratio of pyrimidine base substitutions. Accordingly, a positive log2-ratio asymmetry value indicates greater leading strand bias of a given mutation type, while negative values indicate greater lagging strand bias.
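The strand assignment and asymmetry calculation described above can be illustrated with a short sketch. It follows the conventions stated in the text (negative slope: the positive genome strand is the leading strand; asymmetry is the log2 ratio of leading to lagging counts), but the function names, the input representation, and the handling of zero slope are assumptions made for illustration.

```python
from math import log2

PYRIMIDINES = {"C", "T"}

def replicative_strand(ref_base, slope, dist_to_direction_change, flank=100_000):
    """Replicative strand of the substituted pyrimidine, or None if undefined.

    ref_base: reference base on the positive genome strand at the mutated site
    slope: slope of the replication timing profile at the site
    dist_to_direction_change: distance (bp) to the nearest change in slope sign
    """
    if slope == 0 or dist_to_direction_change < flank:
        return None  # within 100 Kb of a direction change: strandedness undefined
    plus_is_leading = slope < 0
    # A purine reference base means the pyrimidine lies on the negative strand.
    pyrimidine_on_plus = ref_base in PYRIMIDINES
    return "leading" if plus_is_leading == pyrimidine_on_plus else "lagging"

def asymmetry(strands):
    """log2(leading / lagging); positive values indicate leading strand bias."""
    leading = strands.count("leading")
    lagging = strands.count("lagging")
    return log2(leading / lagging)

calls = [
    replicative_strand("C", -1.0, 500_000),  # pyrimidine on plus, plus is leading
    replicative_strand("G", -1.0, 500_000),  # pyrimidine on minus, minus is lagging
    replicative_strand("A", 2.0, 500_000),   # pyrimidine on minus, minus is leading
    replicative_strand("C", -1.0, 50_000),   # too close to a direction change
]
print(calls)
```

Mutations returned as None are excluded before the ratio is computed, which is one reason fewer mutations are available for asymmetry analyses than for mutation rate analyses.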

Figure 3.

Mutational replicative strand asymmetry association with DNA replication timing varies in a cell-type-specific manner

(A) Partitioning mutations by replicative strand. Top: negative slope on a replication timing profile indicates that the positive genome strand replicates as the leading strand, and vice versa for a positive slope. Bottom: mutations are partitioned to the leading or the lagging strand based on the genome strand and replicative strand of the substituted pyrimidine base.

(B) Genome-wide autosomal replicative strand asymmetry for LCL mutational categories.

(C) Replicative strand asymmetry for LCL mutational categories in five replication timing bins of uniform genome content.

(D–K) As in (B) and (C), the replicative strand asymmetry for the mutational pathways in CLL (D and E), HT115 (F and G), HCT116 (H and I), and LS180 (J and K). For all panels, error bars represent the standard error of replicative asymmetry. For (F), (H), and (J), variation among individual clones of each cell line was as follows: HT115 clock: 0.19 to 0.71; MMRd+POLE: 0.42 to 0.70. HCT116 clock: −0.34 to −0.18; MMRd: −0.24 to 0.03. LS180 clock: −0.51 to −0.26; MMRd: −0.25 to −0.11.

We validated strand assignment using four mutational signatures with known replicative strand asymmetries. POLE exonuclease domain mutations result in elevated C>A and C>T mutations on the leading replicative strand,30,42,43 as indeed we observed for the POLE mutation signatures SBS10a (primarily C>A) and SBS10b (primarily C>T), which were significantly enriched on the leading strand in HT115 (asymmetry values of 0.79 ± 0.07 and 0.73 ± 0.11, respectively; Figure S3A). In MMRd, C>T mutations are known to be more abundant on the leading strand,16,44 consistent with our observation for SBS44 (MMRd signature characterized by C>T mutations), which was enriched on the leading strand (asymmetry value of 0.49 ± 0.03 in HCT116 and 0.57 ± 0.13 in LS180; Figure S3A). Similarly, T>C substitutions associated with MMRd are more abundant on the lagging strand,30 and we found SBS21, an MMRd signature characterized almost exclusively by T>C mutations, to be enriched on the lagging strand (−1.87 ± 0.07 in HCT116, −1.25 ± 0.17 in LS180, and −0.45 ± 0.12 in HT115; Figure S3A).

Having demonstrated the effective assignment of replicative strand asymmetry of mutations, we characterized genome-wide replicative strand asymmetry for mutational pathways in the five cell types/lines. Clock-like mutations showed leading strand asymmetry in HT115, lagging strand asymmetry in HCT116 and LS180, and no strand asymmetry in LCL and CLL (Figures 3B, 3D, 3F, 3H, and 3J). These were surprising results, especially since a previous study that used mutations pooled from many cancer types reported that the clock-like signatures SBS1 and SBS5 do not show any strand asymmetry.16 MMRd showed minor lagging strand asymmetry in HCT116 and LS180, which can be explained by the combined abundances and opposing replicative strand asymmetries of SBS21 and SBS44 (Figures 3H, 3J, and S3A). On the other hand, the MMRd+POLE mutational pathway in HT115 showed substantial leading strand asymmetry, which could be attributed to the replicative strand asymmetries of POLE mutations overpowering those of MMRd (Figures 3F and S3A). Finally, SBS9 showed lagging strand asymmetry in LCL and CLL (Figures 3B, 3D, and S3A), consistent with a previous study.16

We next evaluated the replicative asymmetry of mutational pathways with respect to replication timing. Due to the lower number of mutations assigned to a given strand, we analyzed five instead of 20 genomic bins. Replicative strand asymmetry of clock-like mutations did not change between the replication timing fractions in any cell type except HCT116, where greater lagging strand asymmetry was evident in the middle replicating fractions (Figures 3C, 3E, 3G, 3I, and 3K). Thus, as with mutations in general, the relationship of the clock-like category to replication timing was variable across cell types/lines. Lagging strand asymmetry for MMRd mutations in HCT116 and LS180 also did not change between replication fractions (Figures 3I and 3K). However, the individual MMRd signatures SBS21 and SBS44 showed their strongest lagging and leading strand asymmetry values, respectively, in the middle replicating fractions (Figure S3B). A similar trend was observed for SBS9 and MMRd+POLE (Figures 3C, 3E, and 3G). This mid-S-phase pattern of greater asymmetry was also found in the individual signatures SBS10a, SBS10b, and SBS14 (Figure S3B). By removing 500 Kb regions flanking slope directionality changes, we ruled out that these mid-S enrichment patterns were due to uncertainty in calling replication origin and terminus locations, and hence replication direction, in their vicinity (Figure S3C).

Taken together, mutational signatures and pathways showed variable replicative strand asymmetry patterns with respect to replication timing. Importantly, these cell-type-specific asymmetry patterns were distinct from the mutation rate patterns described above. More generally, our analyses reaffirm and extend previous findings that the relationship between mutational pathways and replication timing is heterogeneous across cell types and provide a foundation for more detailed investigations to follow.

Discussion

In this work, we sought to describe how mutation rate, mutational pathways, and mutational replicative strand asymmetry vary across cell types, in particular with respect to their association with DNA replication timing. Within the five cell types/lines analyzed in this study, mutations were ascribed to pathways that were either specific to a DNA damage or repair defect or part of the ubiquitous clock-like background mutagenesis. Surprisingly, we found that mutational pathways showed greater bias to late replication or greater replicative strand asymmetry in certain cell types than others. This disparity was most apparent for SBS9, which showed both greater late replication bias and mid S-phase lagging strand asymmetry in LCLs than in CLL. Cell type differences were also evident for MMRd, showing greater late replication bias in HCT116 than in LS180. The ubiquitous clock-like mutational pathway was distributed more uniformly on chromosomes, with less prominent asymmetry, though these properties varied considerably across the five cell types. Taken together, these findings show that the replication timing bias and replicative asymmetry of different mutational pathways are not invariable characteristics but rather specific to individual cell types. In a concomitant analysis focusing on interindividual variation in LCL and CLL tumors, we further identified the overall mutation load of a sample, the rate of mutation clustering along chromosomes, and X-inactivation status as additional factors in the complex associations between the mutational landscape, DNA replication dynamics, and cell-type heterogeneity.31 Overall, we reveal multidimensional heterogeneity of the human mutation landscape related to DNA replication dynamics and suspect that this heterogeneity extends further beyond the factors dissected here. It will be vital for future studies to take into consideration these and other dimensions of variation when interpreting mutational patterns. Furthermore, this complexity underscores the need to further study the molecular mechanisms giving rise to different types of mutations and to understand the sources of their genomic spatiotemporal biases.

Limitations of the study

This study investigated mutational landscape heterogeneity in five specific model cell types. Further research incorporating additional mutational pathways across a wider range of cell types will allow an even broader characterization of mutational landscape and epigenetic heterogeneity. One hurdle to such larger-scale comparative studies is incorporating matched replication timing profiles. For example, though HCT116, LS180, and HT115 are all colon cancer cell lines, their replication timing profiles were distinctive and reflected in their mutational landscapes. Though replication timing profiles of various cell types are already available or readily generated,45,46,47,48 replication timing is understudied in primary tissues, across tumor types, and in individual tumors. Finally, our analysis focused on DNA replication dynamics, which is a central factor shaping the mutational landscape but is nonetheless one among many other genetic and epigenetic factors that interface with mutational processes. Thus, we anticipate that additional factors, such as nucleosome occupancy, transcription factor binding, and histone modifications, may be subject to similar cell-type heterogeneity and overall contribute to a multidimensional complexity of mutational landscape determinants.8,11

STAR★Methods

Key resources table

REAGENT or RESOURCE | SOURCE | IDENTIFIER

Deposited data
Filtered LCL and CLL mutations | Caballero and Koren.31 | N/A
Filtered HCT116, LS180, and HT115 mutations | This manuscript | Data S1
LS180 and HT115 WGS | Petljak et al.26 | https://doi.org/10.1016/j.cell.2019.02.012
HCT116 WGS | This manuscript | SRA bioproject: PRJNA875498
Replication timing profiles | This manuscript | Data S2
LS180 and HCT116 S/G1 sequencing | This manuscript | SRA bioproject: PRJNA875498
HT116 S/G1 sequencing | Massey et al.46 | https://doi.org/10.3390%2Fgenes10040269
COSMIC v3.2 SBS signatures | Alexandrov et al.1 | https://cancer.sanger.ac.uk/signatures/

Software and algorithms
Picard Tools (v1.138) | N/A | http://broadinstitute.github.io/picard/
BWA-mem (v0.7.17) | Li,49 | http://arxiv.org/abs/1303.3997
GATK (v4.1.4.0) | McKenna et al.50 | https://gatk.broadinstitute.org/hc/en-us
vcf-liftover | N/A | https://github.com/hmgu-itg/VCF-liftover
SigProfilerMatrixGenerator (v1.2) | Bergstrom et al.51 | https://github.com/AlexandrovLab/SigProfilerMatrixGenerator
GATK (v4.1.4.0) mutect2 | Benjamin et al.52 | https://gatk.broadinstitute.org/hc/en-us
TIGER | Koren et al.53 | https://github.com/TheKorenLab/TIGER
MutationalPatterns (v3.8.0) | Manders et al.54 | https://bioconductor.org/packages/release/bioc/html/MutationalPatterns.html

Resource availability

Lead contact

Further information and requests for resources, additional code, and data should be directed to and will be fulfilled by the lead contact, Amnon Koren (koren@cornell.edu).

Materials availability

This study did not generate new unique reagents.

Experimental model and subject details

The HCT116 line was a gift from the tissue culture lab at the Francis Crick Institute. Cells were grown in Dulbecco’s Modified Eagle Medium (DMEM), 10% fetal calf serum, penicillin, and streptomycin. Culture was maintained at 37°C with 5% CO2. Passage was performed approximately twice per week for one year.

Method details

Genomic data sources and mutation calling

HCT116, HT115, and LS180 mutation data

HCT116 BAM files were generated by aligning reads to hg38 with BWA-mem49 (v0.7.17) followed by recalibration with GATK (v4.1.4.0)50 commands ‘BaseRecalibrator’ and ‘IndelRealigner’. We acquired the hg19-aligned BAM files from the passage of HT115 and LS180 from Petljak et al. (2019)26 and recalibrated and indel-realigned them as with HCT116.

Mutations were identified with GATK (v4.1.4.0) mutect252 per the somatic short variant discovery best-practices pipeline. The parental clone was considered the normal sample, and daughter clones were considered tumor samples. For filtering, read orientation bias artifacts were predicted with the command ‘LearnReadOrientationModel’ and used in filtering with ‘FilterMutectCalls’. The Mutect2 cross-sample contamination step was not implemented since the samples were cell lines. We identified candidate mutations as heterozygous calls that passed the mutect2 filtering and were unique to a daughter subclone. We required that at daughter candidate mutation sites, the parental genotype be homozygous for the reference allele and contain no mutant allele reads. We removed mutations where the parental clone had no read depth, as this prevented confident mutation calling. Finally, we only retained candidate mutations with an MQ of >40 and an alternate (mutant) allele frequency of >0.2 and <0.8 in the daughter.
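The filtering criteria above can be summarized as a single predicate applied to each call. The following is a minimal sketch rather than the pipeline used in this study: the `CandidateCall` record and its field names are hypothetical, and the MQ criterion is assumed to retain calls with mapping quality above the threshold.

```python
from dataclasses import dataclass


@dataclass
class CandidateCall:
    """One call in one daughter subclone (hypothetical record layout)."""
    passed_filters: bool      # passed the Mutect2 filtering step
    daughter_is_het: bool     # heterozygous in the daughter
    unique_to_daughter: bool  # not shared with any other subclone
    parent_is_hom_ref: bool   # parental genotype homozygous reference
    parent_depth: int         # total reads in the parental clone
    parent_alt_reads: int     # mutant-allele reads in the parental clone
    mq: float                 # mapping quality at the site
    daughter_alt_af: float    # mutant allele frequency in the daughter


def keep_call(call, min_mq=40.0, af_min=0.2, af_max=0.8):
    """Return True if the call satisfies every criterion described in the text."""
    if not (call.passed_filters and call.daughter_is_het and call.unique_to_daughter):
        return False
    # The parent must be covered, homozygous reference, and free of mutant reads.
    if call.parent_depth == 0 or not call.parent_is_hom_ref or call.parent_alt_reads > 0:
        return False
    # Assumption: calls are retained only when MQ exceeds the threshold.
    if call.mq <= min_mq:
        return False
    return af_min < call.daughter_alt_af < af_max
```

Expressing the criteria as one function keeps the thresholds (MQ and allele frequency bounds) in a single place where they can be varied.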

For LS180 and HT115, we lifted mutations to hg38 coordinates using vcf-liftover (https://github.com/hmgu-itg/VCF-liftover; only liftovers within the same chromosome were allowed). We then removed mutations in HCT116 (as with LCLs and CLL) at coordinates without an hg19 equivalent to compensate for the reduction of genotypes following liftover.

We removed mutations in all cell lines around the HLA locus and gaps >25Kb in the respective cell type replication timing profile. The final mutation dataset contained 150,470 autosomal mutations in the six HCT116 subclones, 28,944 autosomal mutations in the five HT115 subclones, and 14,974 autosomal mutations in the five LS180 subclones. Mutation trinucleotide context and interpolated replication timing values were assigned using the methods described above for LCLs and CLL.

Replication timing profiles

HCT116

We generated a median autosomal replication timing profile for HCT116 from the six daughter subclones and the parental line using TIGER.53 HCT116 is nearly diploid, with several large copy number alterations present in some or all samples. We removed these copy number alterations by filtering out 2.5Kb windows in individual samples whose copy number deviated by more than 0.6 from the chromosomal median copy number (as calculated in the individual sample). Each sample was then filtered via the TIGER command ‘TIGER_segment_filt’ (using the MATLAB function ‘segment’, R2: 0.04, standard deviation threshold: 2.5). After filtering, we took the median of the GC-corrected data in each 2.5Kb window across all samples. Altogether, 280Mb were removed in filtering (11.1% of the autosomal genome). Notably, four copy number alterations >10Mb were removed from all samples.
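The copy number filter and the cross-sample median can be illustrated as follows. This is a minimal sketch and not the TIGER implementation: the function names are hypothetical, masked windows are represented as `None`, and a window is assumed to be removed when it deviates by more than 0.6 from the chromosomal median.

```python
from statistics import median


def filter_cn_windows(copy_number, tolerance=0.6):
    """Mask windows of one chromosome in one sample that deviate from the
    chromosomal median copy number by more than `tolerance`."""
    chrom_median = median(v for v in copy_number if v is not None)
    return [
        v if v is not None and abs(v - chrom_median) <= tolerance else None
        for v in copy_number
    ]


def median_profile(samples):
    """Per-window median across samples, ignoring masked windows.
    `samples` is a list of equal-length per-window value lists."""
    profile = []
    for window_values in zip(*samples):
        kept = [v for v in window_values if v is not None]
        profile.append(median(kept) if kept else None)
    return profile
```

For example, `filter_cn_windows([2.0, 2.1, 3.0, 1.9, 1.2])` masks the third and fifth windows, which lie 1.0 and 0.8 away from the median of 2.0.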

HT115 and LS180

HT115 and LS180 replication timing profiles were generated from S/G1 sequencing as described in Massey et al., 2019.46 DNA from each cell cycle fraction was sequenced using an Illumina NextSeq 500 and aligned to hg19. The S/G1 DNA replication timing profile for HT115 was previously described.22 The S/G1 replication timing coordinates were lifted-over to hg38.

We compared the final TIGER-generated HCT116 replication timing profile to one generated by S/G1 alongside HT115 and LS180. The two profiles were highly correlated (Pearson’s r = 0.91; Figure S1B). We chose to use the TIGER-generated profile for HCT116 to match the source of the mutation calls.

Mutation counts and signature fitting

We fit the previously described biologically relevant COSMIC v3.2 SBS signatures1 to all autosomal mutations in the five cell types using the MutationalPatterns54 (v3.8.0) command ‘fit_to_signatures’. Following current best practices,38 individual COSMIC signatures were corrected by adjusting the 96 trinucleotide frequencies by the relative abundance of trinucleotide frequencies between the filtered and unfiltered autosomal genome.
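One way to read the signature correction is as a per-channel rescaling followed by renormalization. The sketch below is an illustration under stated assumptions, not the code used in this study: channels are assumed to be labeled in the COSMIC style (e.g., `A[C>T]G`), the scaling ratio is assumed to be filtered over unfiltered trinucleotide frequency, and renormalizing the signature to sum to one is an added assumption.

```python
def correct_signature(signature, filtered_freq, unfiltered_freq):
    """Rescale one signature by trinucleotide abundance.

    signature:       {channel: probability}, channel labels like "A[C>T]G"
    filtered_freq:   {trinucleotide: frequency in the filtered genome}
    unfiltered_freq: {trinucleotide: frequency in the unfiltered genome}
    """
    scaled = {}
    for channel, prob in signature.items():
        # "A[C>T]G" -> reference context "ACG"
        context = channel[0] + channel[2] + channel[6]
        scaled[channel] = prob * filtered_freq[context] / unfiltered_freq[context]
    total = sum(scaled.values())
    # Assumption: the corrected signature is renormalized to sum to one.
    return {channel: value / total for channel, value in scaled.items()}
```

With a two-channel toy signature of 0.5 each, halving the abundance of one context shifts the corrected probabilities to 1/3 and 2/3.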

To assess the relationship of mutations or signature abundance to replication timing, we divided the autosomal replication timing profiles of each cell type into 20 bins ordered by replication timing. Each bin contained an equal 5% of the genome. In later analyses where mutations were reduced (e.g., stratification by replicative strand), we used five bins (each with an equal 20%) to retain sufficient mutations per bin. The number of bins was chosen to optimize visualization for the different analyses. When fitting signatures to mutations, we again corrected for trinucleotide abundances within each replication timing bin. For this, the 96 trinucleotide frequencies were corrected by the relative abundance of trinucleotide frequencies between the filtered and unfiltered autosomal genome within the replication timing range of each bin.
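The binning step can be sketched as a rank-based assignment of fixed-size windows, so that each bin covers an equal share of the genome. This is a minimal illustration with hypothetical function names; it assumes equally sized windows and makes no assumption about whether low or high values denote early replication.

```python
def assign_bins(rt_values, n_bins=20):
    """Assign each window a bin index (0 .. n_bins-1) by replication timing rank,
    so that every bin holds an equal number of windows (up to rounding)."""
    order = sorted(range(len(rt_values)), key=lambda i: rt_values[i])
    bins = [0] * len(rt_values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(rt_values)
    return bins


def mutations_per_bin(bins, mutation_windows, n_bins=20):
    """Count mutations per bin; `mutation_windows` holds the window index
    of each mutation."""
    counts = [0] * n_bins
    for window in mutation_windows:
        counts[bins[window]] += 1
    return counts
```

Calling `assign_bins(rt, n_bins=5)` on the same profile gives the quintiles used for the strand-stratified analyses.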

Replicative strand asymmetry

The local slope of replication timing provides replicative strand information for the positive strand of the genome. We assigned 2.5Kb smoothed data windows of positive slope (based on the immediate flanking windows) as lagging replicative strand on the positive genome strand and leading replicative strand on the negative genome strand. Reciprocally, windows of negative slope were assigned as leading replicative strand on the positive strand and lagging replicative strand on the negative strand. At locations of a slope change, flanking windows within 100Kb were assigned undefined replicative strandedness for both the positive and negative genome strands. Undefined replicative strandedness comprised 600.15Mb (approximately 25%) of the LCL replication timing profile, 599.49Mb in CLL, 740.15Mb in HCT116, 1113.77Mb in LS180, and 1000.07Mb in HT115. Mutations were partitioned into leading or lagging groups based on (1) whether the pyrimidine base of the substitution was on the positive or negative genome strand and (2) the replicative strand of the positive and negative genome strands at that coordinate. We did not include mutations in regions of undefined replicative strand in asymmetry analysis.
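The strand assignment described above can be sketched as follows. This is an illustration under simplifying assumptions, not the code used in this study: the slope is taken as the difference between the two immediate flanking windows, windows with zero slope and the two end windows are treated as undefined, and the function names are hypothetical.

```python
def replicative_strand_plus(rt, window_kb=2.5, exclusion_kb=100.0):
    """Label each window for the positive genome strand as "lagging",
    "leading", or None (undefined)."""
    n = len(rt)
    sign = [0] * n
    for i in range(1, n - 1):
        slope = rt[i + 1] - rt[i - 1]          # slope from flanking windows
        sign[i] = (slope > 0) - (slope < 0)    # +1, 0, or -1
    radius = int(exclusion_kb / window_kb)     # windows masked on each side
    masked = [s == 0 for s in sign]
    for i in range(1, n - 2):
        if sign[i] != sign[i + 1]:             # slope change between i and i+1
            for j in range(max(0, i - radius + 1), min(n, i + radius + 1)):
                masked[j] = True
    names = {1: "lagging", -1: "leading"}      # positive slope: lagging on + strand
    return [None if masked[i] else names[sign[i]] for i in range(n)]


def mutation_strand(plus_label, pyrimidine_on_plus):
    """Replicative strand of a mutation given the strand of its pyrimidine base."""
    if plus_label is None:
        return None
    if pyrimidine_on_plus:
        return plus_label
    return "leading" if plus_label == "lagging" else "lagging"
```

The negative genome strand always carries the reciprocal label, which is why `mutation_strand` only needs the positive-strand label and the strand of the pyrimidine.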

We fit the biologically relevant mutational signatures separately to replicative strand-partitioned autosomal mutations. As performed above, individual COSMIC signatures were corrected by adjusting the 96 trinucleotide frequencies by the relative abundance of trinucleotide frequencies between the filtered leading or lagging replicative strand and the unfiltered autosomal genome. Regions of undefined strandedness were not included in the correction. To assess the relationship of mutational replicative strand asymmetry to replication timing, we divided the autosomal replication timing profile (excluding regions of undefined strandedness) into five bins ordered by replication timing value. Each bin contained an equal quintile (20%) of the genome. We fit the biologically relevant mutational signatures separately to the replicative strand-partitioned mutations in each quintile. Again, we performed signature correction using only regions of defined strandedness within the range of each replication timing quintile.

To increase confidence in the strand asymmetry results, we repeated the analysis of strand asymmetry in LCL, CLL, and HCT116 while removing 500Kb (instead of 100Kb) around regions of slope change. The rationale for this validation was that origin and termination sites in replication timing profiles may be regionally imprecise or variable across samples, leading to false mutation strand assignment even after removing 100Kb around regions of slope change. HT115 and LS180 were not included in this reanalysis due to an insufficient number of mutations.

Quantification and statistical analysis

All statistical analyses were performed using R v4.0.5. Statistical tests and sample numbers are directly stated in the figure legends, figures, or corresponding results.

Signature fitting

We used cosine similarity to assess the confidence of signature fit. This metric compares the original trinucleotide frequencies of mutations to the frequencies reconstructed from the predicted signature contributions. A value of one indicates an identical reconstruction. We calculated cosine similarity with the MutationalPatterns command ‘cos_sim’. We additionally performed 1000 bootstrap resamplings when fitting signatures using the MutationalPatterns command ‘fit_to_signatures_bootstrapped’. We used the standard deviation of the 1000 bootstrap samples as the standard error for signature contribution. Standard errors for combined signatures (e.g., MMRd, which is the combination of SBS21 and SBS44 in HCT116/LS180) were calculated as for the standard error of the difference of means (the square root of the sum of variances).
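The three quantities in this section can be written out directly. The sketch below is a plain-Python illustration and not the MutationalPatterns implementation; the function names are hypothetical, and the use of the sample (n - 1) standard deviation for the bootstrap error is an assumption.

```python
import math
from statistics import stdev


def cosine_similarity(observed, reconstructed):
    """Cosine similarity between observed and reconstructed channel frequencies."""
    dot = sum(o * r for o, r in zip(observed, reconstructed))
    norm_o = math.sqrt(sum(o * o for o in observed))
    norm_r = math.sqrt(sum(r * r for r in reconstructed))
    return dot / (norm_o * norm_r)


def bootstrap_se(contributions):
    """Standard error of a signature contribution, taken as the standard
    deviation across bootstrap replicates (sample SD assumed)."""
    return stdev(contributions)


def combined_se(*standard_errors):
    """Standard error of a combined signature: square root of the sum of variances."""
    return math.sqrt(sum(se * se for se in standard_errors))
```

For a combined signature such as MMRd, `combined_se(se_sbs21, se_sbs44)` would take the bootstrap standard errors of the two component signatures as inputs.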

Replicative strand asymmetry

Before determining asymmetry values, we calculated replicative strand ratios for a given mutational signature using the formula:

$$r_{\mathrm{SBS10a}} = \frac{d_{\mathrm{SBS10a}}}{g_{\mathrm{SBS10a}}}$$

where d and g represent the number of autosomal mutations on the respective leading and lagging strand regarding the genomic strand of the substituted pyrimidine base.

As described above, we calculated standard error for a signature as the standard deviation of 1000 bootstrap samples. Standard error was calculated separately for mutations partitioned to the leading and lagging replicative strand. To get standard error for a replicative strand ratio, we propagated standard errors from the leading and lagging strands using the formula:

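$$\frac{\sigma_{r_{\mathrm{SBS10a}}}}{r_{\mathrm{SBS10a}}} = \sqrt{\left(\frac{\sigma_{d_{\mathrm{SBS10a}}}}{d_{\mathrm{SBS10a}}}\right)^{2} + \left(\frac{\sigma_{g_{\mathrm{SBS10a}}}}{g_{\mathrm{SBS10a}}}\right)^{2}}$$

$$\sigma_{r_{\mathrm{SBS10a}}} = r_{\mathrm{SBS10a}} \sqrt{\left(\frac{\sigma_{d_{\mathrm{SBS10a}}}}{d_{\mathrm{SBS10a}}}\right)^{2} + \left(\frac{\sigma_{g_{\mathrm{SBS10a}}}}{g_{\mathrm{SBS10a}}}\right)^{2}}$$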

We then calculated replicative strand asymmetry values using the formula:

$$a_{\mathrm{SBS10a}} = \log_{2}\left(r_{\mathrm{SBS10a}}\right)$$

To calculate standard error for asymmetry values, we subtracted the error from the replicative strand ratio before log2 transformation. Thus, we determined the error for asymmetry as:

$$\sigma_{a_{\mathrm{SBS10a}}} = a_{\mathrm{SBS10a}} - \log_{2}\left(r_{\mathrm{SBS10a}} - \sigma_{r_{\mathrm{SBS10a}}}\right)$$
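The chain of calculations in this section (strand ratio, propagated error, log2 asymmetry, and asymmetry error) can be followed in a short worked sketch. The function name is hypothetical, and the asymmetry error follows the description in the text, in which the ratio error is subtracted from the ratio before log2 transformation; the guard against a non-positive argument is an added assumption.

```python
import math


def strand_asymmetry(d, g, se_d, se_g):
    """Return (ratio, ratio SE, asymmetry, asymmetry SE) for one signature.

    d, g:       mutations attributed to the leading and lagging strand
    se_d, se_g: bootstrap standard errors of d and g
    """
    r = d / g
    # Relative errors of a quotient add in quadrature.
    se_r = r * math.sqrt((se_d / d) ** 2 + (se_g / g) ** 2)
    a = math.log2(r)
    if r - se_r <= 0:
        raise ValueError("ratio error is too large for the log2 transformation")
    # Error subtracted from the ratio before the log2 transformation.
    se_a = a - math.log2(r - se_r)
    return r, se_r, a, se_a
```

For instance, with 200 leading and 100 lagging mutations and a 10% standard error on each, the ratio is 2 with an error of about 0.28, and the asymmetry is 1 with an error of about 0.22.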

Acknowledgments

We thank Verena Höfer for technical assistance in generating data for HCT116. This work was funded by the National Institutes of Health (award DP2-GM123495 to A.K.), the National Science Foundation (award MCB-1921341 to A.K.), and the United States-Israel Binational Science Foundation (award 202108 to A.K. and I. Simon).

Author contributions

M.C. analyzed data and A.K. provided supervision. D.B. provided data for HCT116. M.C. and A.K. wrote the paper.

Declaration of interests

The authors declare no competing interests.

Published: May 2, 2023

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.xgen.2023.100315.

Supplemental information

Document S1. Table S1 and Figures S1–S3
mmc1.pdf (613.3KB, pdf)
Data S1. Mutations for HCT116, LS180, and HT115, related to STAR Methods
mmc2.zip (3.8MB, zip)
Data S2. Consensus replication timing profiles, related to STAR Methods
mmc3.zip (18.9MB, zip)
Document S2. Article plus supplemental information
mmc4.pdf (4.7MB, pdf)

Data and code availability

Mutation calls and consensus replication timing profiles in hg38 coordinates for HCT116, LS180, and HT115 are available in Data S1 and S2, respectively. Whole genome sequencing for HCT116 used to generate replication timing profiles with TIGER and S/G1 sequencing for HCT116 and LS180 are available as SRA bioproject PRJNA875498. Code used for analyses, replication timing generation, and mutation calling is available on Mendeley (Mendeley Data: https://doi.org/10.17632/2hwhv32gs2.4).

References

  • 1. Alexandrov L.B., Kim J., Haradhvala N.J., Huang M.N., Tian Ng A.W., Wu Y., Boot A., Covington K.R., Gordenin D.A., Bergstrom E.N., et al. The repertoire of mutational signatures in human cancer. Nature. 2020;578:94–101. doi: 10.1038/s41586-020-1943-3.
  • 2. Nik-Zainal S., Alexandrov L.B., Wedge D.C., Van Loo P., Greenman C.D., Raine K., Jones D., Hinton J., Marshall J., Stebbings L.A., et al. Mutational processes molding the genomes of 21 breast cancers. Cell. 2012;149:979–993. doi: 10.1016/j.cell.2012.04.024.
  • 3. Aggarwala V., Voight B.F. An expanded sequence context model broadly explains variability in polymorphism levels across the human genome. Nat. Genet. 2016;48:349–355. doi: 10.1038/ng.3511.
  • 4. Zhang W., Bouffard G.G., Wallace S.S., Bond J.P., NISC Comparative Sequencing Program. Estimation of DNA sequence context-dependent mutation rates using primate genomic sequences. J. Mol. Evol. 2007;65:207–214. doi: 10.1007/s00239-007-9000-5.
  • 5. Makova K.D., Hardison R.C. The effects of chromatin organization on variation in mutation rates in the genome. Nat. Rev. Genet. 2015;16:213–223. doi: 10.1038/nrg3890.
  • 6. Polak P., Karlić R., Koren A., Thurman R., Sandstrom R., Lawrence M., Reynolds A., Rynes E., Vlahoviček K., Stamatoyannopoulos J.A., et al. Cell-of-origin chromatin organization shapes the mutational landscape of cancer. Nature. 2015;518:360–364. doi: 10.1038/nature14221.
  • 7. Schuster-Böckler B., Lehner B. Chromatin organization is a major influence on regional mutation rates in human cancer cells. Nature. 2012;488:504–507. doi: 10.1038/nature11273.
  • 8. Gonzalez-Perez A., Sabarinathan R., Lopez-Bigas N. Local determinants of the mutational landscape of the human genome. Cell. 2019;177:101–114. doi: 10.1016/j.cell.2019.02.051.
  • 9. Akdemir K.C., Le V.T., Kim J.M., Killcoyne S., King D.A., Lin Y.-P., Tian Y., Inoue A., Amin S.B., Robinson F.S., et al. Somatic mutation distributions in cancer genomes vary with three-dimensional chromatin structure. Nat. Genet. 2020;52:1178–1188. doi: 10.1038/s41588-020-0708-0.
  • 10. Reijns M.A.M., Kemp H., Ding J., de Procé S.M., Jackson A.P., Taylor M.S. Lagging-strand replication shapes the mutational landscape of the genome. Nature. 2015;518:502–506. doi: 10.1038/nature14183.
  • 11. Otlu B., Díaz-Gay M., Vermes I., Bergstrom E.N., Barnes M., Alexandrov L.B. Topography of mutational signatures in human cancer. Preprint at bioRxiv. 2022. doi: 10.1101/2022.05.29.493921.
  • 12. Koren A., Polak P., Nemesh J., Michaelson J.J., Sebat J., Sunyaev S.R., McCarroll S.A. Differential relationship of DNA replication timing to different forms of human mutation and variation. Am. J. Hum. Genet. 2012;91:1033–1040. doi: 10.1016/j.ajhg.2012.10.018.
  • 13. Agarwal I., Przeworski M. Signatures of replication timing, recombination, and sex in the spectrum of rare variants on the human X chromosome and autosomes. Proc. Natl. Acad. Sci. USA. 2019;116:17916–17924. doi: 10.1073/pnas.1900714116.
  • 14. Francioli L.C., Polak P.P., Koren A., Menelaou A., Chun S., Renkens I., Genome of the Netherlands Consortium, van Duijn C.M., Swertz M., Wijmenga C., et al. Genome-wide patterns and properties of de novo mutations in humans. Nat. Genet. 2015;47:822–826. doi: 10.1038/ng.3292.
  • 15. Chen C., Qi H., Shen Y., Pickrell J., Przeworski M. Contrasting determinants of mutation rates in germline and soma. Genetics. 2017;207:255–267. doi: 10.1534/genetics.117.1114.
  • 16. Tomkova M., Tomek J., Kriaucionis S., Schuster-Böckler B. Mutational signature distribution varies with DNA replication timing and strand asymmetry. Genome Biol. 2018;19:129. doi: 10.1186/s13059-018-1509-y.
  • 17. Supek F., Lehner B. Differential DNA mismatch repair underlies mutation rate variation across the human genome. Nature. 2015;521:81–84. doi: 10.1038/nature14173.
  • 18. Yehuda Y., Blumenfeld B., Mayorek N., Makedonski K., Vardi O., Cohen-Daniel L., Mansour Y., Baror-Sebban S., Masika H., Farago M., et al. Germline DNA replication timing shapes mammalian genome composition. Nucleic Acids Res. 2018;46:8299–8310. doi: 10.1093/nar/gky610.
  • 19. Smith T.C.A., Arndt P.F., Eyre-Walker A. Large scale variation in the rate of germ-line de novo mutation, base composition, divergence and diversity in humans. PLoS Genet. 2018;14:e1007254. doi: 10.1371/journal.pgen.1007254.
  • 20. Chen C.-L., Rappailles A., Duquenne L., Huvet M., Guilbaud G., Farinelli L., Audit B., d’Aubenton-Carafa Y., Arneodo A., Hyrien O., et al. Impact of replication timing on non-CpG and CpG substitution rates in mammalian genomes. Genome Res. 2010;20:447–457. doi: 10.1101/gr.098947.109.
  • 21. Cui P., Ding F., Lin Q., Zhang L., Li A., Zhang Z., Hu S., Yu J. Distinct contributions of replication and transcription to mutation rate variation of human genomes. Dev. Reprod. Biol. 2012;10:4–10. doi: 10.1016/S1672-0229(11)60028-4.
  • 22. Brody Y., Kimmerling R.J., Maruvka Y.E., Benjamin D., Elacqua J.J., Haradhvala N.J., Kim J., Mouw K.W., Frangaj K., Koren A., et al. Quantification of somatic mutation flow across individual cell division events by lineage sequencing. Genome Res. 2018;28:1901–1918. doi: 10.1101/gr.238543.118.
  • 23. Woo Y.H., Li W.-H. DNA replication timing and selection shape the landscape of nucleotide variation in cancer genomes. Nat. Commun. 2012;3:1004. doi: 10.1038/ncomms1982.
  • 24. Sanders M.A., Vöhringer H., Forster V.J., Moore L., Campbell B.B., Hooks Y., Edwards M., Bianchi V., Coorens T.H.H., Butler T.M., et al. Life without mismatch repair. Preprint at bioRxiv. 2021. doi: 10.1101/2021.04.14.437578.
  • 25. Degasperi A., Zou X., Amarante T.D., Martinez-Martinez A., Koh G.C.C., Dias J.M.L., Heskin L., Chmelova L., Rinaldi G., Wang V.Y.W., et al. Substitution mutational signatures in whole-genome–sequenced cancers in the UK population. Science. 2022;376:abl9283. doi: 10.1126/science.abl9283.
  • 26. Petljak M., Alexandrov L.B., Brammeld J.S., Price S., Wedge D.C., Grossmann S., Dawson K.J., Ju Y.S., Iorio F., Tubio J.M.C., et al. Characterizing mutational signatures in human cancer cell lines reveals episodic APOBEC mutagenesis. Cell. 2019;176:1282–1294.e20. doi: 10.1016/j.cell.2019.02.012.
  • 27. Singh V.K., Rastogi A., Hu X., Wang Y., De S. Mutational signature SBS8 predominantly arises due to late replication errors in cancer. Commun. Biol. 2020;3:421. doi: 10.1038/s42003-020-01119-5.
  • 28. Yaacov A., Vardi O., Blumenfeld B., Greenberg A., Massey D.J., Koren A., Adar S., Simon I., Rosenberg S. Cancer mutational processes vary in their association with replication timing and chromatin accessibility. Cancer Res. 2021;81:6106–6116. doi: 10.1158/0008-5472.CAN-21-2039.
  • 29. Vöhringer H., Hoeck A.V., Cuppen E., Gerstung M. Learning mutational signatures and their multidimensional genomic properties with TensorSignatures. Nat. Commun. 2021;12:3628. doi: 10.1038/s41467-021-23551-9.
  • 30. Haradhvala N.J., Polak P., Stojanov P., Covington K.R., Shinbrot E., Hess J.M., Rheinbay E., Kim J., Maruvka Y.E., Braunstein L.Z., et al. Mutational strand asymmetries in cancer genomes reveal mechanisms of DNA damage and repair. Cell. 2016;164:538–549. doi: 10.1016/j.cell.2015.12.050.
  • 31. Caballero M., Koren A. The landscape of somatic mutations in lymphoblastoid cell lines. Cell Genomics. 2023;3. doi: 10.1016/j.xgen.2023.100305.
  • 32. Mosquera Orgueira A., Antelo Rodríguez B., Díaz Arias J.Á., González Pérez M.S., Bello López J.L. New recurrent structural aberrations in the genome of chronic lymphocytic leukemia based on exome-sequencing data. Front. Genet. 2019;10:854. doi: 10.3389/fgene.2019.00854.
  • 33. Edelmann J., Holzmann K., Tausch E., Saunderson E.A., Jebaraj B.M.C., Steinbrecher D., Dolnik A., Blätte T.J., Landau D.A., Saub J., et al. Genomic alterations in high-risk chronic lymphocytic leukemia frequently affect cell cycle key regulators and NOTCH1-regulated transcription. Haematologica. 2020;105:1379–1390. doi: 10.3324/haematol.2019.217307.
  • 34. Rivera-Mulia J.C., Buckley Q., Sasaki T., Zimmerman J., Didier R.A., Nazor K., Loring J.F., Lian Z., Weissman S., Robins A.J., et al. Dynamic changes in replication timing and gene expression during lineage specification of human pluripotent stem cells. Genome Res. 2015;25:1091–1103. doi: 10.1101/gr.187989.114.
  • 35. Yaffe E., Farkash-Amar S., Polten A., Yakhini Z., Tanay A., Simon I. Comparative analysis of DNA replication timing reveals conserved large-scale chromosomal architecture. PLoS Genet. 2010;6:e1001011. doi: 10.1371/journal.pgen.1001011.
  • 36. Alexandrov L.B., Jones P.H., Wedge D.C., Sale J.E., Campbell P.J., Nik-Zainal S., Stratton M.R. Clock-like mutational processes in human somatic cells. Nat. Genet. 2015;47:1402–1407. doi: 10.1038/ng.3441.
  • 37. Zhang L., Dong X., Lee M., Maslov A.Y., Wang T., Vijg J. Single-cell whole-genome sequencing reveals the functional landscape of somatic mutations in B lymphocytes across the human lifespan. Proc. Natl. Acad. Sci. USA. 2019;116:9014–9019. doi: 10.1073/pnas.1902510116.
  • 38. Maura F., Degasperi A., Nadeu F., Leongamornlert D., Davies H., Moore L., Royo R., Ziccheddu B., Puente X.S., Avet-Loiseau H., et al. A practical guide for mutational signature analysis in hematological malignancies. Nat. Commun. 2019;10:2969. doi: 10.1038/s41467-019-11037-8.
  • 39. Machado H.E., Mitchell E., Øbro N.F., Kübler K., Davies M., Leongamornlert D., Cull A., Maura F., Sanders M.A., Cagan A.T.J., et al. Diverse mutational landscapes in human lymphocytes. Nature. 2022:1–9. doi: 10.1038/s41586-022-05072-7.
  • 40. Drost J., van Boxtel R., Blokzijl F., Mizutani T., Sasaki N., Sasselli V., de Ligt J., Behjati S., Grolleman J.E., van Wezel T., et al. Use of CRISPR-modified human stem cell organoids to study the origin of mutational signatures in cancer. Science. 2017;358:234–238. doi: 10.1126/science.aao3130.
  • 41. Zou X., Koh G.C.C., Nanda A.S., Degasperi A., Urgo K., Roumeliotis T.I., Agu C.A., Badja C., Momen S., Young J., et al. A systematic CRISPR screen defines mutational mechanisms underpinning signatures caused by replication errors and endogenous DNA damage. Nat. Cancer. 2021;2:643–657. doi: 10.1038/s43018-021-00200-0.
  • 42. Robinson P.S., Coorens T.H.H., Palles C., Mitchell E., Abascal F., Olafsson S., Lee B.C.H., Lawson A.R.J., Lee-Six H., Moore L., et al. Increased somatic mutation burdens in normal human cells due to defective DNA polymerases. Nat. Genet. 2021;53:1434–1442. doi: 10.1038/s41588-021-00930-y.
  • 43. Shinbrot E., Henninger E.E., Weinhold N., Covington K.R., Göksenin A.Y., Schultz N., Chao H., Doddapaneni H., Muzny D.M., Gibbs R.A., et al. Exonuclease mutations in DNA polymerase epsilon reveal replication strand specific mutation patterns and human origins of replication. Genome Res. 2014;24:1740–1750. doi: 10.1101/gr.174789.114.
  • 44. Andrianova M.A., Bazykin G.A., Nikolaev S.I., Seplyarskiy V.B. Human mismatch repair system balances mutation rates between strands by removing more mismatches from the lagging strand. Genome Res. 2017;27:1336–1343. doi: 10.1101/gr.219915.116.
  • 45. Rhind N., Gilbert D.M. DNA replication timing. Cold Spring Harb. Perspect. Biol. 2013;5:a010132. doi: 10.1101/cshperspect.a010132.
  • 46. Massey D.J., Kim D., Brooks K.E., Smolka M.B., Koren A. Next-generation sequencing enables spatiotemporal resolution of human centromere replication timing. Genes. 2019;10:E269. doi: 10.3390/genes10040269.
  • 47. Rivera-Mulia J.C., Sasaki T., Trevilla-Garcia C., Nakamichi N., Knapp D.J.H.F., Hammond C.A., Chang B.H., Tyner J.W., Devidas M., Zimmerman J., et al. Replication timing alterations in leukemia affect clinically relevant chromosome domains. Blood Adv. 2019;3:3201–3213. doi: 10.1182/bloodadvances.2019000641.
  • 48. Luo Y., Hitz B.C., Gabdank I., Hilton J.A., Kagda M.S., Lam B., Myers Z., Sud P., Jou J., Lin K., et al. New developments on the Encyclopedia of DNA Elements (ENCODE) data portal. Nucleic Acids Res. 2020;48:D882–D889. doi: 10.1093/nar/gkz1062.
  • 49. Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at arXiv. 2013. doi: 10.48550/arXiv.1303.3997.
  • 50. McKenna A., Hanna M., Banks E., Sivachenko A., Cibulskis K., Kernytsky A., Garimella K., Altshuler D., Gabriel S., Daly M., et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20:1297–1303. doi: 10.1101/gr.107524.110.
  • 51. Bergstrom E.N., Huang M.N., Mahto U., Barnes M., Stratton M.R., Rozen S.G., Alexandrov L.B. SigProfilerMatrixGenerator: a tool for visualizing and exploring patterns of small mutational events. BMC Genom. 2019;20:685. doi: 10.1186/s12864-019-6041-2.
  • 52. Benjamin D., Sato T., Cibulskis K., Getz G., Stewart C., Lichtenstein L. Calling somatic SNVs and indels with Mutect2. Preprint at bioRxiv. 2019. doi: 10.1101/861054.
  • 53. Koren A., Massey D.J., Bracci A.N. TIGER: inferring DNA replication timing from whole-genome sequence data. Bioinformatics. 2021;37:4001–4005. doi: 10.1093/bioinformatics/btab166.
  • 54. Manders F., Brandsma A.M., de Kanter J., Verheul M., Oka R., van Roosmalen M.J., van der Roest B., van Hoeck A., Cuppen E., van Boxtel R. MutationalPatterns: the one stop shop for the analysis of mutational processes. BMC Genom. 2022;23:134. doi: 10.1186/s12864-022-08357-3.


Articles from Cell Genomics are provided here courtesy of Elsevier
