Table 3.
Metric | TTB | TIPS | UAE |
---|---|---|---|
AUROC | 0.913 | 0.788 | 0.879 |
Maximum F1 score | 0.532 | 0.376 | 0.700 |
Precision (at maximum F1 score) | 0.426 | 0.279 | 0.563 |
Recall (at maximum F1 score) | 0.709 | 0.576 | 0.915 |
Sensitivity | 90.0% | 90.0% | 90.0% |
Specificity (at sensitivity of 90%) | 82.4% | 45.3% | 68.0% |
Threshold (at sensitivity of 90%) | 0.209 | 0.103 | 0.195 |
Note–Performance metrics for each random forest model, evaluated on the testing set. AUROC summarizes each model’s overall discrimination across all classification thresholds. The maximum F1 score is useful for evaluating each model on imbalanced data (ie, when there are far more patients without the outcome of interest than with it). The F1 score is defined as the harmonic mean of precision (positive predictive value) and recall (sensitivity). The precision and recall values corresponding to the maximum F1 score are also provided. Threshold refers to the classifier output value at which sensitivity was fixed at 90%; the specificity at that threshold is reported.
AUROC = area under the receiver operating characteristic curve; TIPS = transjugular intrahepatic portosystemic shunt; TTB = transthoracic biopsy; UAE = uterine artery embolization.
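
The two calculations described in the note can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the toy scores and labels, and the convention that a case is called positive when its score is at or above the threshold are all assumptions made for the example.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision (PPV) and recall (sensitivity)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def threshold_at_sensitivity(scores, labels, target=0.90):
    """Find the highest cutoff whose sensitivity is at least `target`.

    Assumption: a case is predicted positive when score >= cutoff.
    Returns (cutoff, sensitivity, specificity).
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")

    # Walk candidate cutoffs from strictest to most lenient.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sensitivity = tp / positives
        if sensitivity >= target:
            return cutoff, sensitivity, tn / negatives
    raise ValueError("target sensitivity must be at most 1.0")


# Precision and recall taken from the TTB and TIPS columns of the table.
print(round(f1_score(0.426, 0.709), 3))  # 0.532
print(round(f1_score(0.279, 0.576), 3))  # 0.376

# Hypothetical classifier outputs and outcomes (1 = outcome present).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]
cutoff, sens, spec = threshold_at_sensitivity(toy_scores, toy_labels)
```

Applied to the rounded TTB and TIPS values, the harmonic mean reproduces the reported maximum F1 scores. For UAE the same calculation on the rounded precision and recall gives about 0.697, slightly below the reported 0.700. The threshold search only illustrates the idea on toy data; the thresholds in the table come from the models' actual outputs on the testing set.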