Table 1. Multi-step filter selects cancer drivers in genomics datasets with high mutational burden.
Step | Logic | Program | Input | Output | Reference files |
---|---|---|---|---|---|
i | Cohort selection | TCGA Portal, CGHub, GeneTorrent | tcga_patient_list.txt | wes.bam | |
ii | Mutation call | MuTect 1.1.4 [28] | wes.bam | coverage.wig, call_stats.txt | HG19, COSMIC_54.vcf, dbsnp_132.vcf |
iii | Recurrence, evolutionary conservation | MutSig 2.0 [29] | patient.maf, coverage.wig | covariates.txt | HG19/.maf |
iv | Correction for background mutation rate | InVEx 1.0.1 [9] | coverage.wig, covariates.txt | significant_mutation_burden.txt, qq.png | HG19/.maf, PPH2, nucleotide_classes_HG19.txt, COSMIC, genePeptideFile_HG19 |
v | Mutation signature, UV-induced damage | Text editor | patient.maf | sorted_transitions.maf | nucleotide_classes_HG19.txt, uv_transitions.txt [10] |
vi | Structure-activity relationship | SWISS-MODEL 8.05, TMpred 25.0 | structure.pdb | model.pdb, tm_model.txt | |
vii | Pathway enrichment, mutual exclusivity | GSEA 2.1.0 [12], MEMo 1.1 [11] | tcga_patient_list.txt, scna.txt, amp_del_gene.txt, covariates.txt, coverage.wig | modules.txt | |
viii | Recurrence within Pan-Cancer TCGA | TCGA | covariate_target.txt, patient.maf | covariate.maf | |
Somatic mutations of driver genes are called after i) cohort selection, ii) mapping to the human genome and patient-specific somatic references, iii) assessment of recurrence and evolutionary conservation, and iv) correction for the background mutation rate, based on the frequency of mutations in introns versus exons. This first set of filters, i)-iv), is necessary and sufficient to identify statistically significantly enriched somatic mutations of driver genes in any dataset with high mutational burden. In a genome-wide sequencing experiment that aims to find cancer drivers, an additional set of filters, v)-viii), is advantageous. The relevance of mutations is assessed by v) nucleotide signature, vi) structure-activity relationship, vii) pathway enrichment and mutual exclusivity with known cancer drivers, and viii) recurrence in other cancer tissues.
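Step v) is listed as a manual sort in a text editor; the same idea can be sketched in a few lines of Python. The sketch below is illustrative only and is not the procedure used to produce sorted_transitions.maf. It assumes a tab-delimited MAF with the standard columns Variant_Type, Reference_Allele and Tumor_Seq_Allele2, and it approximates the UV signature as C>T substitutions (G>A on the opposite strand) without checking the dipyrimidine sequence context. The function names and the placeholder rows (GENE_A to GENE_D) are hypothetical, and the reference files nucleotide_classes_HG19.txt and uv_transitions.txt are not used here.

```python
import csv
import io

# Assumption: UV-type damage is approximated as C>T transitions,
# which appear as G>A when reported on the opposite strand.
UV_TRANSITIONS = {("C", "T"), ("G", "A")}


def is_uv_transition(ref, alt):
    """Return True if the substitution ref>alt is a C>T or G>A transition."""
    return (ref.upper(), alt.upper()) in UV_TRANSITIONS


def sort_uv_transitions(maf_text):
    """Keep SNP rows of a MAF, list UV-type transitions first,
    and return the rows together with the UV-type fraction."""
    # Skip comment lines such as "#version 2.4" before parsing the header.
    lines = (line for line in io.StringIO(maf_text) if not line.startswith("#"))
    reader = csv.DictReader(lines, delimiter="\t")

    snps = [row for row in reader if row["Variant_Type"] == "SNP"]
    for row in snps:
        row["UV_Transition"] = is_uv_transition(
            row["Reference_Allele"], row["Tumor_Seq_Allele2"]
        )

    # Stable sort: UV-type rows first, original order preserved within groups.
    snps.sort(key=lambda row: not row["UV_Transition"])
    fraction = sum(row["UV_Transition"] for row in snps) / len(snps) if snps else 0.0
    return snps, fraction


# Placeholder input; gene names and alleles are made up for illustration.
EXAMPLE_MAF = "\n".join([
    "#version 2.4",
    "Hugo_Symbol\tVariant_Type\tReference_Allele\tTumor_Seq_Allele2",
    "GENE_A\tSNP\tA\tG",
    "GENE_B\tSNP\tC\tT",
    "GENE_C\tDEL\tC\t-",
    "GENE_D\tSNP\tG\tA",
])

rows, fraction = sort_uv_transitions(EXAMPLE_MAF)
for row in rows:
    print(row["Hugo_Symbol"], row["Reference_Allele"], row["Tumor_Seq_Allele2"],
          row["UV_Transition"])
print(f"UV-type fraction of SNPs: {fraction:.2f}")
```

A high fraction of UV-type transitions in a gene or cohort would be consistent with UV-induced damage. A stricter version would also read the flanking bases from the reference genome to confirm the dipyrimidine context.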