2021 Jan 25;10:e61082. doi: 10.7554/eLife.61082

Figure 3. Comparisons of prediction accuracies (AUCs) of supervised, partially supervised, and unsupervised methodologies.

(a) Supervised age SuperSigs vs unsupervised Signature 1 over 30 tumor types; (b) SuperSigs vs unsupervised signatures for all annotated etiological factors other than age found in Alexandrov et al., 2013a, in tumor types for which the unsupervised signature was present (for the full list, see Supplementary file 1); (c) partially supervised vs unsupervised NMF signatures for all annotated etiological factors other than age (see Materials and methods). Each combination of tumor type and risk factor (e.g. lung adenocarcinoma and smoking) yields a signature and is represented by one point, whose x-coordinate is the prediction accuracy of the unsupervised approach and whose y-coordinate is that of the supervised (a–b) or partially supervised (c) approach. Apparent AUCs are reported. The great majority (c) or essentially all (a–b) of the points lie on or above the diagonal (the line of equal accuracy), indicating the greater accuracy of the supervised and partially supervised approaches.
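The comparison in this figure can be illustrated with a minimal sketch. It assumes that each method assigns every sample a numeric score, that the AUC is the usual probability that an exposed sample outscores an unexposed one, and that "apparent" means the AUC is evaluated on the same samples used to fit the signature. The function names and the numbers are hypothetical and are not taken from the paper.

```python
def auc(scores_exposed, scores_unexposed):
    """Probability that a random exposed sample scores higher than a random
    unexposed one; ties count as one half (the Mann-Whitney form of the AUC)."""
    wins = 0.0
    for p in scores_exposed:
        for n in scores_unexposed:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_exposed) * len(scores_unexposed))


def fraction_on_or_above_diagonal(points):
    """Share of (unsupervised AUC, supervised AUC) points with y >= x."""
    return sum(1 for x, y in points if y >= x) / len(points)


# Hypothetical scores for one tumor type and one risk factor.
supervised_auc = auc([0.9, 0.4], [0.5, 0.1])
unsupervised_auc = auc([0.6, 0.3], [0.5, 0.4])

# One point per (tumor type, risk factor) combination; values are made up.
points = [(unsupervised_auc, supervised_auc), (0.70, 0.70), (0.90, 0.85)]
print(supervised_auc, unsupervised_auc, fraction_on_or_above_diagonal(points))
```

A point on or above the diagonal means the supervised (or partially supervised) score separates the two groups at least as well as the unsupervised one for that combination.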

Figure 3—figure supplement 1. Unsupervised, random, and supervised methods’ comparisons.

Comparisons of the prediction accuracies (in terms of AUC) are reported for all signatures of age, environmental, and inherited factors, for the unsupervised, the randomly generated single-peak, and the supervised methodologies. (a) Random Single Peak (Single Peak) vs Alexandrov’s Signature 1 for age; (b) Random Single Peak (Single Peak) vs Alexandrov’s Unsupervised for smoking; (c) Random Single Peak (Single Peak) vs Non-negative Least Squares (NNLS) SuperSigs for age; (d) Random Single Peak (Single Peak) vs Non-negative Least Squares (NNLS) SuperSigs for smoking; (e) Alexandrov’s Unsupervised vs Best NMF for the indicated exposures; (f) Non-negative Least Squares (NNLS) SuperSigs vs standard SuperSigs, that is, the ones using logistic regression (LR) (see Materials and methods for details). All comparisons are based on apparent AUC except for (f).
Figure 3—figure supplement 2. The tissue dependence of the mutational signatures.

Figure 3—figure supplement 3. The tissue dependence of the mutational signatures.

Figure 3—figure supplement 4. The tissue dependence of the mutational signatures.

Multidimensional scaling (MDS) plot of the indicated etiological factors’ mutational landscapes in the corresponding tissues. Not discounted for age. See the Materials and methods section for details.
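As a rough illustration of how an MDS plot is obtained from pairwise distances, the sketch below implements classical (Torgerson) MDS with NumPy. This section does not say which MDS variant the authors used, so the choice of classical MDS, the function name `classical_mds`, and the example distance matrix are all assumptions.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Embed n items in n_components dimensions from an n x n distance matrix
    using classical (Torgerson) MDS. Assumed variant, for illustration only."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Double-center the squared distances to get a Gram-like matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the largest eigenvalues; clip negatives, which can occur when the
    # distances are not exactly Euclidean (e.g. correlation-based distances).
    order = np.argsort(eigvals)[::-1][:n_components]
    top = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top)


# Hypothetical distances between three factor/tissue mutational landscapes.
example = np.array([
    [0.0, 0.2, 0.9],
    [0.2, 0.0, 0.8],
    [0.9, 0.8, 0.0],
])
coords = classical_mds(example)
print(coords.shape)
```

Each row of `coords` would be one point in the plot; landscapes with small pairwise distance land close together.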
Figure 3—figure supplement 5. The tissue dependence of the mutational signatures.

Heatmap of the correlation-based distance between any two etiological factors’ mutational landscapes in the corresponding tissues. Distance not discounted for age. The distance between any two mutational landscapes is 1 minus the Pearson correlation between them. See the Materials and methods section for details.
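The distance used in these heatmaps can be written out directly. The sketch below assumes that each mutational landscape is represented as a numeric vector over the same ordered set of mutation features; the function name and the example vectors are hypothetical.

```python
import math


def pearson_distance(u, v):
    """Return 1 minus the Pearson correlation between two equal-length vectors.
    The result ranges from 0 (perfectly correlated) to 2 (perfectly anti-correlated)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return 1.0 - cov / math.sqrt(var_u * var_v)


# Hypothetical landscapes over four mutation features.
landscape_a = [0.10, 0.40, 0.30, 0.20]
landscape_b = [0.12, 0.38, 0.33, 0.17]
print(pearson_distance(landscape_a, landscape_b))
```

Because the Pearson correlation is invariant to scaling, two landscapes with the same shape but different overall mutation counts have distance 0 under this definition.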
Figure 3—figure supplement 6. The tissue dependence of the mutational signatures.

Multidimensional scaling (MDS) plot of the indicated etiological factors’ mutational landscapes in the corresponding tissues. Not discounted for age. See the Materials and methods section for details.
Figure 3—figure supplement 7. The tissue dependence of the mutational signatures.

Heatmap of the correlation-based distance between any two etiological factors’ mutational landscapes in the corresponding tissues. Distance not discounted for age. The distance between any two mutational landscapes is 1 minus the Pearson correlation between them. See the Materials and methods section for details.
Figure 3—figure supplement 8. The tissue dependence of the mutational signatures.

Multidimensional scaling (MDS) plot of the indicated etiological factors’ mutational landscapes in the corresponding tissues. Not discounted for age. See the Materials and methods section for details.
Figure 3—figure supplement 9. The tissue dependence of the mutational signatures.

Heatmap of the correlation-based distance between any two etiological factors’ mutational landscapes in the corresponding tissues. Distance not discounted for age. The distance between any two mutational landscapes is 1 minus the Pearson correlation between them. See the Materials and methods section for details.
Figure 3—figure supplement 10. The tissue dependence of the mutational signatures.

Multidimensional scaling (MDS) plot of the indicated etiological factors’ mutational landscapes in the corresponding tissues. Not discounted for age. See the Materials and methods section for details.
Figure 3—figure supplement 11. The tissue dependence of the mutational signatures.

Heatmap of the correlation-based distance between any two etiological factors’ mutational landscapes in the corresponding tissues. Distance not discounted for age. The distance between any two mutational landscapes is 1 minus the Pearson correlation between them. See the Materials and methods section for details.
Figure 3—figure supplement 12. The tissue dependence of the mutational signatures.

Multidimensional scaling (MDS) plot of the indicated etiological factors’ mutational landscapes in the corresponding tissues. Not discounted for age. See the Materials and methods section for details.
Figure 3—figure supplement 13. The tissue dependence of the mutational signatures.

Heatmap of the correlation-based distance between any two etiological factors’ mutational landscapes in the corresponding tissues. Distance not discounted for age. The distance between any two mutational landscapes is 1 minus the Pearson correlation between them. See the Materials and methods section for details.
Figure 3—figure supplement 14. The tissue dependence of the mutational signatures.

Multidimensional scaling (MDS) plot of the indicated etiological factors’ mutational landscapes in the corresponding tissues. Not discounted for age. See the Materials and methods section for details.
Figure 3—figure supplement 15. The tissue dependence of the mutational signatures.

Heatmap of the correlation-based distance between any two etiological factors’ mutational landscapes in the corresponding tissues. Distance not discounted for age. The distance between any two mutational landscapes is 1 minus the Pearson correlation between them. See the Materials and methods section for details.
Figure 3—figure supplement 16. The tissue dependence of the mutational signatures.

Multidimensional scaling (MDS) plot of the indicated etiological factors’ mutational landscapes in the corresponding tissues. Not discounted for age. See the Materials and methods section for details.
Figure 3—figure supplement 17. The tissue dependence of the mutational signatures.

Heatmap of the correlation-based distance between any two etiological factors’ mutational landscapes in the corresponding tissues. Distance not discounted for age. The distance between any two mutational landscapes is 1 minus the Pearson correlation between them. See the Materials and methods section for details.
Figure 3—figure supplement 18. The tissue dependence of the mutational signatures.

Multidimensional scaling (MDS) plot of the indicated etiological factors’ mutational landscapes in the corresponding tissues. Not discounted for age. See the Materials and methods section for details.
Figure 3—figure supplement 19. The tissue dependence of the mutational signatures.

Heatmap of the correlation-based distance between any two etiological factors’ mutational landscapes in the corresponding tissues. Distance discounted for age. The distance between any two mutational landscapes is 1 minus the Pearson correlation between them. See the Materials and methods section for details.
Figure 3—figure supplement 20. The tissue dependence of the mutational signatures.

Heatmap of the correlation-based distance between any two etiological factors’ mutational landscapes in the corresponding tissues. Distance discounted for age. The distance between any two mutational landscapes is 1 minus the Pearson correlation between them. See the Materials and methods section for details.
Figure 3—figure supplement 21. The tissue dependence of the mutational signatures.

Multidimensional scaling (MDS) plot of the indicated etiological factors’ mutational landscapes in the corresponding tissues. Discounted for age. See the Materials and methods section for details.
Figure 3—figure supplement 22. The tissue dependence of the mutational signatures.

Heatmap of the correlation-based distance between any two etiological factors’ mutational landscapes in the corresponding tissues. Distance discounted for age. The distance between any two mutational landscapes is 1 minus the Pearson correlation between them. See the Materials and methods section for details.
Figure 3—figure supplement 23. The tissue dependence of the mutational signatures.

Multidimensional scaling (MDS) plot of the indicated etiological factors’ mutational landscapes in the corresponding tissues. Discounted for age. See the Materials and methods section for details.
Figure 3—figure supplement 24. The tissue dependence of the mutational signatures.

Heatmap of the correlation-based distance between any two etiological factors’ mutational landscapes in the corresponding tissues. Distance discounted for age. The distance between any two mutational landscapes is 1 minus the Pearson correlation between them. See the Materials and methods section for details.
Figure 3—figure supplement 25. The tissue dependence of the mutational signatures.

Multidimensional scaling (MDS) plot of the indicated etiological factors’ mutational landscapes in the corresponding tissues. Discounted for age. See the Materials and methods section for details.
Figure 3—figure supplement 26. The tissue dependence of the mutational signatures.

Heatmap of the correlation-based distance between any two etiological factors’ mutational landscapes in the corresponding tissues. Distance discounted for age. The distance between any two mutational landscapes is 1 minus the Pearson correlation between them. See the Materials and methods section for details.
Figure 3—figure supplement 27. The tissue dependence of the mutational signatures.

Multidimensional scaling (MDS) plot of the indicated etiological factors’ mutational landscapes in the corresponding tissues. Discounted for age. See the Materials and methods section for details.
Figure 3—figure supplement 28. The tissue dependence of the mutational signatures.

Heatmap of the correlation-based distance between any two etiological factors’ mutational landscapes in the corresponding tissues. Distance discounted for age. The distance between any two mutational landscapes is 1 minus the Pearson correlation between them. See the Materials and methods section for details.
Figure 3—figure supplement 29. The tissue dependence of the mutational signatures.

Multidimensional scaling (MDS) plot of the indicated etiological factors’ mutational landscapes in the corresponding tissues. Discounted for age. See the Materials and methods section for details.
Figure 3—figure supplement 30. The tissue dependence of the mutational signatures.

Heatmap of the correlation-based distance between any two etiological factors’ mutational landscapes in the corresponding tissues. Distance discounted for age. The distance between any two mutational landscapes is 1 minus the Pearson correlation between them. See the Materials and methods section for details.
Figure 3—figure supplement 31. The tissue dependence of the mutational signatures.

Multidimensional scaling (MDS) plot of the indicated etiological factors’ mutational landscapes in the corresponding tissues. Discounted for age. See the Materials and methods section for details.
Figure 3—figure supplement 32. The tissue dependence of the mutational signatures.

Heatmap of the correlation-based distance between any two etiological factors’ mutational landscapes in the corresponding tissues. Distance discounted for age. The distance between any two mutational landscapes is 1 minus the Pearson correlation between them. See the Materials and methods section for details.
Figure 3—figure supplement 33. The tissue dependence of the mutational signatures.

Multidimensional scaling (MDS) plot of the indicated etiological factors’ mutational landscapes in the corresponding tissues. Discounted for age. See the Materials and methods section for details.
Figure 3—figure supplement 34. The tissue dependence of the mutational signatures.

Heatmap of the correlation-based distance between any two etiological factors’ mutational landscapes in the corresponding tissues. Distance discounted for age. The distance between any two mutational landscapes is 1 minus the Pearson correlation between them. See the Materials and methods section for details.
Figure 3—figure supplement 35. The tissue dependence of the mutational signatures.

Multidimensional scaling (MDS) plot of the indicated etiological factors’ mutational landscapes in the corresponding tissues. Discounted for age. See the Materials and methods section for details.
Figure 3—figure supplement 36. Partially supervised versus unsupervised methods.

Performance comparison in terms of AUC for the partially supervised method and the unsupervised one.
Figure 3—figure supplement 37. Model misspecification and the dimensionality issue with the unsupervised method.

All selected features of the supervised and unsupervised POL-ε signatures in UCEC-TCGA are listed and their frequencies compared (IUPAC notation: B = not A, D = not C, H = not G, V = not T, W = A or T, S = C or G, M = A or C, K = G or T, R = A or G, Y = C or T). Separate plots are provided for the different numbers of patterns (i.e. the rank) that unsupervised NMF was required to find: (a–e) correspond to rank = 2, 3, 4, 5, and 6, respectively. The larger the rank, the greater the difference between the unsupervised signature and the correct supervised one. See the Materials and methods section for details.
Figure 3—figure supplement 38. Speed benchmark.

Runtimes of the full SuperSigs methodology for all TCGA datasets analyzed (each point is one whole exome or whole genome dataset).
Figure 3—figure supplement 39. Age tertiles.

Age ranges of the two groups considered in each TCGA dataset analyzed for an age signature: young (lowest tertile) and old (highest tertile).
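A minimal sketch of how the two age groups could be formed is shown below. It splits the patients by rank into thirds and keeps the lowest and highest third. The exact cut points and the handling of tied ages are not given in this section, so the rank-based rule, the function name, and the ages are assumptions.

```python
def age_tertile_groups(ages):
    """Return (young, old): the lowest and highest thirds of the sorted ages.
    Rank-based split; patients in the middle third are left out."""
    ordered = sorted(ages)
    k = len(ordered) // 3
    if k == 0:
        raise ValueError("need at least three samples to form tertiles")
    return ordered[:k], ordered[-k:]


# Hypothetical patient ages for one dataset.
ages = [58, 30, 71, 45, 66, 41, 79, 52, 60]
young, old = age_tertile_groups(ages)

# Age range of each group, as plotted per dataset in this supplement.
print((min(young), max(young)), (min(old), max(old)))
```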