Table 1.
GDD-ENS performance and comparison with WGS/WES classifiers.
| Model | Data set | Types | Accuracy | Macro-prec. | % In-dist | % High conf. |
|---|---|---|---|---|---|---|
| DeepTumour (13) | WGS | 24 | 91% | 91% | 73 | — |
| CUPLR (11) | WGS | 33 | 89% | 78% | 92 | — |
| Salvadores-SVM (10) | WGS | 18 | 91% | 86% | 74 | — |
| MuAt (14) | WGS | 24 | 89% | 87% | 73 | — |
| Soh-SVM (15) | WES | 28 | 77% | 78% | 84 | — |
| CPEM (16) | WES | 31 | 84% | 83% | 85 | — |
| MuAt (14) | WES | 20 | 64% | 66% | 74 | — |
| GDD-RF (22) | MSK-IMPACT | 22 | 74% | 71% | 85 | — |
| GDD-ENS | MSK-IMPACT | 38 | 79% | 64% | 97 | — |
| CUPLR (11) High conf. | WGS | 33 | 96% | 82% | 92 | 82 |
| GDD-RF (22) High conf. | MSK-IMPACT | 22 | 91% | 87% | 85 | 62 |
| GDD-ENS High conf. | MSK-IMPACT | 38 | 93% | 88% | 97 | 72 |
NOTE: WGS- and WES-based methods generally perform better than panel-based approaches, because they generate more data from which additional informative features, such as regional mutation density, can be derived. However, high-confidence GDD-ENS predictions perform similarly to or better than most models, on a larger set of cancer types that covers a greater percentage of the solid tumor data set. Bold indicates the best-performing model for each metric.
Abbreviations: Macro-prec., macro-precision; % In-dist, in-distribution proportion, the percentage of our solid tumor discovery cohort predictable by the classifier's specific training labels; % High conf., proportion of outputs above the high-confidence threshold.
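
As an illustration of how the Accuracy, Macro-prec., and % High conf. columns relate to raw predictions, the following is a minimal sketch and not the authors' evaluation code. The function name `summarize`, its arguments, and the threshold value in the usage line are hypothetical. It also assumes that macro-precision is the unweighted mean of per-type precision, with a type that is never predicted contributing a precision of 0.

```python
from collections import defaultdict


def summarize(y_true, y_pred, confidence, threshold):
    """Return (accuracy, macro_precision, high_conf_fraction).

    y_true, y_pred: equal-length sequences of cancer-type labels.
    confidence: per-sample confidence scores for the predicted label.
    threshold: hypothetical high-confidence cutoff (an assumption here).
    """
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n

    predicted = defaultdict(int)  # times each type was predicted
    correct = defaultdict(int)    # times each prediction was right
    for t, p in zip(y_true, y_pred):
        predicted[p] += 1
        if t == p:
            correct[p] += 1

    # Unweighted mean of per-type precision over every type seen in either
    # the true or predicted labels; never-predicted types count as 0.
    labels = set(y_true) | set(y_pred)
    macro_precision = sum(
        correct[c] / predicted[c] if predicted[c] else 0.0 for c in labels
    ) / len(labels)

    # Fraction of predictions at or above the confidence cutoff.
    high_conf_fraction = sum(c >= threshold for c in confidence) / n
    return accuracy, macro_precision, high_conf_fraction


# Toy usage with made-up labels and scores.
acc, mprec, hc = summarize(
    ["A", "A", "B", "C"], ["A", "B", "B", "C"], [0.9, 0.4, 0.8, 0.95], 0.5
)
```

Because macro-precision weights every cancer type equally, a classifier covering many rare types can show a lower macro-precision than accuracy, as in the GDD-ENS row of the table.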