Table 3.
Dataset | METABRIC discovery | | METABRIC validation | | MILLER (test) | |
---|---|---|---|---|---|---|
 | Low risk | High risk | Low risk | High risk | Low risk | High risk |
Predicted low risk | **157 (94%)** | 11 (7%) | **107 (89%)** | 13 (11%) | **68 (93%)** | 5 (7%) |
Predicted medium risk | 278 (68%) | 134 (33%) | 236 (70%) | 99 (30%) | 55 (68%) | 26 (32%) |
Predicted high risk | 21 (35%) | 39 (65%) | 33 (42%) | 46 (58%) | 29 (62%) | 18 (38%) |
The confusion matrices show the performance of our decision tree on three datasets. Percentages in parentheses give each count as a share of all predictions in the same predicted-risk group and dataset. From a clinical standpoint, it is important to achieve a high precision (positive predictive value) for low risk cases (shown in bold), so that a less aggressive treatment regimen can be confidently recommended for a subset of patients. The probability of surviving more than 10 years is above 89% for the predicted low risk cases in all three datasets (Additional file 1: Figure S11).
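
As an illustration of how the percentages in parentheses relate to the counts, the following minimal sketch recomputes the row-wise shares and the low risk precision from the numbers in Table 3. The dictionary layout and the function names `row_fractions` and `low_risk_precision` are assumptions made for this example and are not part of the original analysis.

```python
# Counts copied from Table 3, stored as (actual low risk, actual high risk)
# for each predicted-risk group. The data layout is an illustrative assumption.
counts = {
    "METABRIC discovery": {"low": (157, 11), "medium": (278, 134), "high": (21, 39)},
    "METABRIC validation": {"low": (107, 13), "medium": (236, 99), "high": (33, 46)},
    "MILLER (test)": {"low": (68, 5), "medium": (55, 26), "high": (29, 18)},
}


def row_fractions(actual_low, actual_high):
    """Return each count as a fraction of all predictions in its row."""
    total = actual_low + actual_high
    return actual_low / total, actual_high / total


def low_risk_precision(dataset):
    """Precision (positive predictive value) of the 'predicted low risk' group:
    the fraction of predicted low risk cases that are actually low risk."""
    actual_low, actual_high = counts[dataset]["low"]
    return row_fractions(actual_low, actual_high)[0]


if __name__ == "__main__":
    for name in counts:
        print(f"{name}: low risk precision = {low_risk_precision(name):.1%}")
```

The recomputed values should agree with the percentages printed in the table to within about one percentage point; any small difference would come from how the published figures were rounded.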