Table 5.
Dataset | Q2 | P[D] | S[D] | P[N] | S[N] | C | AUC | PM |
---|---|---|---|---|---|---|---|---|
SIFT | 0.61 | 0.62 | 0.66 | 0.60 | 0.56 | 0.22 | 0.64 | 95 |
CHASM | 0.80 | 0.85 | 0.73 | 0.76 | 0.87 | 0.60 | 0.88 | 100 |
SPF-All | 0.88 | 0.88 | 0.87 | 0.87 | 0.88 | 0.75 | 0.94 | 100 |
SPF-Cancer | 0.90 | 0.91 | 0.90 | 0.90 | 0.91 | 0.81 | 0.96 | 100 |
Overall accuracy (Q2), positive predictive value (P), sensitivity (S), correlation coefficient (C), and area under the ROC curve (AUC) are defined in the Methods section. D (Disease) and N (Neutral) denote driver and passenger cancer variants, respectively. The latter were generated by CHASM. PM is the percentage of predicted variants for the Synthetic dataset. The confidence interval for Q2, C, and AUC calculated on the two subsets is ≤0.01.
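As an illustration of how the threshold-based columns of the table relate to one another, the following minimal sketch computes Q2, P, S, and C for both classes from the four cells of a binary confusion matrix. It assumes the standard definitions of these measures and assumes that C is the Matthews correlation coefficient; the authoritative definitions are those given in the Methods section. The function name `binary_metrics` and the counts in the usage lines are hypothetical and are not taken from the study.

```python
from math import sqrt


def _ratio(numerator, denominator):
    """Return numerator/denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fn, tn, fp):
    """Compute per-class and overall measures from confusion-matrix counts.

    tp: Disease (D) variants predicted as Disease
    fn: Disease (D) variants predicted as Neutral
    tn: Neutral (N) variants predicted as Neutral
    fp: Neutral (N) variants predicted as Disease

    Assumption: C is the Matthews correlation coefficient.
    """
    total = tp + fn + tn + fp
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "Q2": _ratio(tp + tn, total),        # overall accuracy
        "P[D]": _ratio(tp, tp + fp),         # positive predictive value, Disease
        "S[D]": _ratio(tp, tp + fn),         # sensitivity, Disease
        "P[N]": _ratio(tn, tn + fn),         # positive predictive value, Neutral
        "S[N]": _ratio(tn, tn + fp),         # sensitivity, Neutral
        "C": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical counts for a balanced set of 100 Disease and 100 Neutral variants.
example = binary_metrics(tp=90, fn=10, tn=91, fp=9)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

AUC is not included because it requires the continuous prediction scores rather than thresholded counts, and PM would additionally require the number of variants for which a method returns no prediction.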