Table 3.
Metric | Definition and details |
---|---|
Recall | Fraction of true positive (TP) instances among all actually positive instances, i.e., TP plus false negative (FN) instances (synonym for "sensitivity"). Recall = TP / (TP + FN) |
Precision | Fraction of TP instances among all instances predicted to be positive by an algorithm, i.e., TP plus false positive (FP) instances (synonym for "positive predictive value"). Precision = TP / (TP + FP) |
Accuracy | Fraction of TP and true negative (TN) instances among all instances. Accuracy = (TP + TN) / (TP + TN + FP + FN) |
F1-score | Harmonic mean of precision and recall. Ranges from 0 to 1, with 1 indicating perfect precision and recall. Important measure, because both high precision and high recall are needed for a high F1-score (illustrated in the first code sketch after this table). F1 = 2 × (Precision × Recall) / (Precision + Recall) |
False-positive findings | Negative instances falsely predicted to be positive by an algorithm. The number of false-positive findings is very important in ML, because too many of them render an algorithm useless. Investigating the reasons for false-positive findings may help to develop strategies to avoid them, but requires domain knowledge in the field of application. |
ROC curve | Receiver operating characteristic curve. Graph illustrating the discriminative ability of a classifier. Sensitivity (Y-axis) plotted against the false-positive rate (X-axis) for different classification thresholds. The area under the curve (AUC) measures the 2D area underneath the ROC curve and provides an aggregate measure of performance. |
Intersection-Over-Union (IoU) | Important measure for assessing the performance of segmentation algorithms. Overlap between two regions of interest, typically a ground-truth segmentation and a predicted segmentation, e.g., of the left ventricle. Ranges from 0 to 1, with 1 indicating perfect overlap. IoU = \|A ∩ B\| / \|A ∪ B\|, where A and B denote the ground-truth and predicted regions. |
Dice similarity coefficient (DSC) | Another important measure for assessing segmentation algorithms. Ranges from 0 to 1, with 1 indicating perfect overlap. DSC = 2 \|A ∩ B\| / (\|A\| + \|B\|), with A and B as above (see the second code sketch after this table). |
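As a concrete illustration of the classification metrics in Table 3, the following Python sketch computes precision, recall, accuracy, and the F1-score from the four confusion-matrix counts. The function name `binary_classification_metrics` and the zero-division convention (returning 0.0 when a denominator is empty) are illustrative assumptions, not part of the source.

```python
import numpy as np

def binary_classification_metrics(y_true, y_pred):
    """Precision, recall, accuracy, and F1-score from binary labels and predictions.

    Note: returning 0.0 for an empty denominator is an assumed convention.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)

    tp = int(np.sum(y_pred & y_true))    # true positives
    fp = int(np.sum(y_pred & ~y_true))   # false positives
    fn = int(np.sum(~y_pred & y_true))   # false negatives
    tn = int(np.sum(~y_pred & ~y_true))  # true negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0   # TP / (TP + FP)
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # TP / (TP + FN)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)            # harmonic mean

    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}

# Example: 3 TP, 1 FP, 1 FN, 1 TN
print(binary_classification_metrics([1, 1, 1, 0, 1, 0],
                                     [1, 1, 1, 1, 0, 0]))
# precision 0.75, recall 0.75, accuracy ~0.67, f1 0.75
```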
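For the segmentation metrics (IoU and DSC), a similar sketch computes both from two binary masks. The function name `overlap_metrics` is illustrative, and treating two empty masks as perfect overlap (returning 1.0) is an assumed convention.

```python
import numpy as np

def overlap_metrics(mask_true, mask_pred):
    """IoU and Dice similarity coefficient between two binary segmentation masks.

    Note: returning 1.0 when both masks are empty is an assumed convention.
    """
    mask_true = np.asarray(mask_true, dtype=bool)
    mask_pred = np.asarray(mask_pred, dtype=bool)

    intersection = int(np.sum(mask_true & mask_pred))        # |A ∩ B|
    union = int(np.sum(mask_true | mask_pred))               # |A ∪ B|
    size_sum = int(mask_true.sum()) + int(mask_pred.sum())   # |A| + |B|

    iou = intersection / union if union else 1.0
    dsc = 2 * intersection / size_sum if size_sum else 1.0

    return {"iou": iou, "dsc": dsc}

# Example: 4x4 masks, e.g., a ground-truth and a predicted left-ventricle region
truth = np.zeros((4, 4), dtype=bool); truth[1:3, 1:3] = True  # 4 pixels
pred = np.zeros((4, 4), dtype=bool); pred[1:3, 1:4] = True    # 6 pixels
print(overlap_metrics(truth, pred))
# intersection 4, union 6 -> iou ~0.67, dsc 0.8
```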