Sequence context (SNVs) |
Identity of mutated base (A/T or C/G). Trinucleotide and penta-nucleotide contexts centered at the mutated base, and 1 bp and 2 bp left and right flanks of the mutated base. |
Sequence context is a major covariate of mutation probability. Although previous studies typically considered trinucleotide contexts, mutation rates could be affected by wider sequence contexts (ref. 25). |
Computed from mutation data |
Sequence context (indels) |
Presence of poly-A/T or poly-C/G sequences longer than 5 bp at the indel site. |
Long mononucleotide repeats could lead to artifacts in indel calling. |
Computed from mutation data |
TF-binding profiles |
ChIP-Seq peak profiles of 132 TFs and 1 meta profile including peaks of all TFs from ENCODE cell lines. |
TF-binding sites have elevated mutation rates in certain cancers due to impaired nucleotide excision repair. |
Zerbino et al. 26 |
Replication timing |
Mean replication timing profile of 13 ENCODE cell lines. |
Replication timing is inversely correlated with mutation probability. |
Hansen et al. 27 |
APOBEC editing sites |
Predicted APOBEC editing sites. |
Elevated mutation rates at APOBEC editing sites could lead to the formation of passenger hotspots. |
Buisson et al. 28, Table S2 |
Local mutation rate |
Mutation rate of 100 kb nonoverlapping genomic bins. |
To correct for additional unexplained regional variation in mutation rates. |
Computed from mutation data |
Individual mutation count |
Mutation burden of individual tumors. |
To account for intertumor heterogeneity. |
Computed from mutation data |
Tissue-specific epigenetic profile |
Chromatin accessibility and modification profiles from matched tissue/cell type. |
Epigenetic profiles from the cell of origin better predict the mutational landscape of tumors (ref. 13). |
Supplied by the user |
COSMIC mutation signatures |
Proportion of mutations contributed by a specific mutation signature for each tumor. |
To further correct for specific mutational processes in the tumor cohort. |
Supplied by the user |
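
The features marked "Computed from mutation data" could be derived along the following lines. This is a minimal sketch, not the published implementation: the function names, the 0-based coordinates, the reading of "at the indel site" as "the mononucleotide run that contains the site", and the definition of local rate as mutations per bp per tumor are all assumptions made for illustration.

```python
from collections import Counter


def snv_context_features(seq, pos):
    """Sequence-context features for an SNV.

    seq: uppercase reference sequence (assumed input format).
    pos: 0-based index of the mutated base (assumed convention).
    """
    if pos < 2 or pos + 2 >= len(seq):
        raise ValueError("need at least 2 bp of flank on each side")
    base = seq[pos]
    return {
        "base_class": "A/T" if base in "AT" else "C/G",
        "trinucleotide": seq[pos - 1:pos + 2],    # centered on the mutated base
        "pentanucleotide": seq[pos - 2:pos + 3],  # centered on the mutated base
        "left_1bp": seq[pos - 1:pos],
        "right_1bp": seq[pos + 1:pos + 2],
        "left_2bp": seq[pos - 2:pos],
        "right_2bp": seq[pos + 1:pos + 3],
    }


def homopolymer_at_site(seq, pos, min_len=6):
    """Classify the mononucleotide run containing pos.

    Returns "poly-A/T" or "poly-C/G" if the run is longer than 5 bp
    (min_len=6), otherwise None. Treating "at the indel site" as
    "the run containing pos" is an assumption.
    """
    base = seq[pos]
    if base not in "ACGT":
        return None
    start = pos
    while start > 0 and seq[start - 1] == base:
        start -= 1
    end = pos
    while end + 1 < len(seq) and seq[end + 1] == base:
        end += 1
    if end - start + 1 < min_len:
        return None
    return "poly-A/T" if base in "AT" else "poly-C/G"


def local_mutation_rate(mutations, n_tumors, bin_size=100_000):
    """Mutation rate per nonoverlapping genomic bin.

    mutations: iterable of (chrom, pos) pairs, pos 0-based.
    The rate is taken as mutations per bp per tumor (an assumption);
    shorter bins at chromosome ends are not corrected for here.
    """
    counts = Counter((chrom, pos // bin_size) for chrom, pos in mutations)
    return {b: n / (bin_size * n_tumors) for b, n in counts.items()}


def individual_mutation_counts(tumor_ids):
    """Mutation burden per tumor, given one tumor ID per mutation."""
    return Counter(tumor_ids)
```

For example, `snv_context_features("GGACTTTTTTTCAG", 3)` reports the trinucleotide `ACT` and the pentanucleotide `GACTT` for the mutated `C`, and `homopolymer_at_site("GGACTTTTTTTCAG", 6)` flags the 7 bp T run as `poly-A/T`.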