Author manuscript; available in PMC: 2020 Jul 29.
Published in final edited form as: Nature. 2020 Jan 29;578(7794):266–272. doi: 10.1038/s41586-020-1961-1

Extended Data Figure 5. Comparison of mutational signatures extracted using two algorithms.

(A) Trinucleotide contexts for the signatures extracted by the hierarchical Dirichlet process (HDP; left) and by MutationalPatterns non-negative matrix factorisation (right). The six substitution types are shown in the panels across the top of each signature. Within each panel, the trinucleotide context is shown as four sets of four bars, grouped by whether an A, C, G or T, respectively, is 5′ to the mutated base, and within each group of four by whether an A, C, G or T is 3′ to the mutated base. Signatures with high cosine similarity between the two algorithms are aligned horizontally. MutationalPatterns’ Signature C has no match among the signatures extracted by the HDP algorithm, but closely resembles MutationalPatterns’ Signature A (SBS-5 in the HDP extraction); it therefore likely represents over-splitting of the signatures.
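The bar ordering described in (A) can be sketched as an enumeration of the 96 channels. This is an illustrative sketch only: the six substitution classes listed here are the conventional pyrimidine-referenced ones, which is an assumption (the legend does not name them), and the label format and function name are hypothetical.

```python
BASES = "ACGT"

# Assumption: the conventional pyrimidine-referenced substitution classes.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def trinucleotide_channels():
    """Return the 96 channel labels in the order the bars are drawn.

    Order: substitution type (panel), then 5' base (group of four bars),
    then 3' base (bar within the group).
    """
    return [
        f"{five_prime}[{sub}]{three_prime}"
        for sub in SUBSTITUTIONS
        for five_prime in BASES
        for three_prime in BASES
    ]


channels = trinucleotide_channels()
```

Each panel then holds 16 consecutive labels, and each group of four bars shares a 5′ base.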

(B) The heatmap shows the cosine similarities of signatures extracted by MutationalPatterns with those extracted by the hierarchical Dirichlet process (HDP). Only cosine similarity scores >0.75 are coloured.
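As a rough illustration of the comparison in (B), the cosine similarity between two signatures is the dot product of their channel vectors divided by the product of their norms. The sketch below assumes each signature is a row of non-negative channel weights; the function names and the toy three-channel vectors are hypothetical and are not data from the study.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Scale every signature to unit length, then take all pairwise dot products.
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


def mask_below(similarity, threshold=0.75):
    """Keep only scores above the threshold; the rest become NaN (uncoloured)."""
    return np.where(similarity > threshold, similarity, np.nan)


# Toy example with made-up three-channel "signatures".
set_one = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
set_two = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
similarity = cosine_similarity_matrix(set_one, set_two)
coloured = mask_below(similarity)
```

Masking with NaN rather than zero keeps low-similarity cells visually distinct from genuine zeros when the matrix is passed to a heatmap routine.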

(C) Scatterplots showing the fraction of mutations in each colony (n = 632) assigned to each signature by the hierarchical Dirichlet process (HDP; x axis) versus the MutationalPatterns algorithm (y axis). Correlation values quoted are Pearson’s correlation coefficients, R².
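The per-signature agreement in (C) can be sketched as a Pearson correlation between the two sets of per-colony fractions. This is a minimal sketch, not the authors' code: the function name and the example fractions are hypothetical, and it returns both the coefficient and its square.

```python
import numpy as np


def pearson_r_and_r_squared(x, y):
    """Return Pearson's correlation coefficient and its square for x versus y."""
    r = np.corrcoef(x, y)[0, 1]
    return r, r ** 2


# Made-up fractions for four colonies, one signature, two algorithms.
fractions_first = np.array([0.10, 0.20, 0.30, 0.40])
fractions_second = 0.9 * fractions_first + 0.02
r, r_squared = pearson_r_and_r_squared(fractions_first, fractions_second)
```

Because the second vector here is an exact linear function of the first, the coefficient is 1 up to floating-point error; real per-colony fractions would scatter around the fitted line.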

(D) Transcriptional strand bias of A>G mutations in the N[A]T context before and after transcription start sites. Note the absence of transcriptional strand bias in intergenic regions, but evidence for both transcription-coupled damage and repair after the transcription start site, to a similar degree in never smokers and in ex-/current smokers.