2020 Nov 19;31(6):3909–3922. doi: 10.1007/s00330-020-07417-0

Table 3.

Performance metrics frequently used in ML

Metric Definition and details
Precision

Fraction of true positive (TP) instances among all instances predicted to be positive by an algorithm, i.e., TP plus false positive (FP) instances (synonym for “positive predictive value”)

$\text{Precision} = \frac{TP}{TP + FP}$

Recall

Fraction of TP instances among all instances that are actually positive, i.e., TP plus the false negative (FN) instances missed by the algorithm (synonym for “sensitivity”)

$\text{Recall} = \frac{TP}{TP + FN}$
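
As a minimal sketch, the two ratios can be computed directly from confusion-matrix counts, using the standard definitions precision = TP / (TP + FP) (positive predictive value) and recall = TP / (TP + FN) (sensitivity). The function names, the example counts, and the choice to return 0.0 for an empty denominator are assumptions made for illustration.

```python
def precision(tp: int, fp: int) -> float:
    """TP / (TP + FP): share of predicted positives that are truly positive."""
    predicted_positive = tp + fp
    # Assumed convention: return 0.0 when nothing was predicted positive.
    return tp / predicted_positive if predicted_positive else 0.0


def recall(tp: int, fn: int) -> float:
    """TP / (TP + FN): share of actual positives that were detected."""
    actual_positive = tp + fn
    # Assumed convention: return 0.0 when there are no actual positives.
    return tp / actual_positive if actual_positive else 0.0


# Hypothetical counts, for illustration only.
tp, fp, fn = 80, 20, 40
print(precision(tp, fp))  # 80 / 100 = 0.8
print(recall(tp, fn))     # 80 / 120, about 0.667
```

With these counts the algorithm is right in 80% of its positive calls but finds only two thirds of the positive instances, which is why the two metrics are usually reported together.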

Accuracy

Fraction of correctly predicted instances, i.e., true positive (TP) and true negative (TN) instances, among all instances.

$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$
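
The following sketch computes accuracy from the four confusion-matrix counts. The function name and the counts are hypothetical; the example is chosen to show that accuracy alone can look favorable on an imbalanced dataset even when no positive instance is detected.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """(TP + TN) / (TP + TN + FP + FN): share of all instances predicted correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("accuracy is undefined without any instances")
    return (tp + tn) / total


# Hypothetical imbalanced dataset: 990 negatives and 10 positives,
# with a classifier that labels every instance as negative.
print(accuracy(tp=0, tn=990, fp=0, fn=10))  # 990 / 1000 = 0.99
```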

F1-score

Harmonic mean of precision and recall. Ranges from 0 to 1, with 1 indicating perfect precision and recall. An important measure, because a high F1-score requires both high precision and high recall.

$F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$
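
A short sketch of the harmonic mean follows. The function name, the input values, and the convention of returning 0.0 when both inputs are zero are assumptions for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    denominator = precision + recall
    # Assumed convention: F1 is 0.0 when precision and recall are both 0.
    if denominator == 0:
        return 0.0
    return 2 * precision * recall / denominator


# Hypothetical values, for illustration only.
print(f1_score(0.9, 0.1))  # about 0.18, far below the arithmetic mean of 0.5
print(f1_score(0.5, 0.5))  # 0.5
```

The first call illustrates why the harmonic mean is used: one low value pulls the score down even if the other value is high.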

False-positive findings

Negative instances falsely predicted to be positive by an algorithm. The number of false-positive findings is very important in ML, because too many of them render an algorithm useless. Investigating the reasons for false-positive findings may help to develop strategies to avoid them, but requires domain knowledge in the field of application.
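
As an illustrative sketch, the four confusion-matrix counts can be tallied from binary labels, and the indices of the false-positive instances can be collected so that they can be reviewed individually. The label encoding (1 = positive, 0 = negative), the function name, and the example data are assumptions.

```python
from typing import List, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) for binary labels encoded as 1 (positive) and 0 (negative)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1  # negative instance falsely predicted to be positive
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1  # with 0/1 inputs this is truth == 1 and pred == 0
    return tp, fp, tn, fn


def false_positive_indices(y_true: Sequence[int], y_pred: Sequence[int]) -> List[int]:
    """Positions of the false-positive findings, e.g. for manual review."""
    return [i for i, (t, p) in enumerate(zip(y_true, y_pred)) if t == 0 and p == 1]


# Hypothetical labels and predictions, for illustration only.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(confusion_counts(y_true, y_pred))        # (2, 2, 3, 1)
print(false_positive_indices(y_true, y_pred))  # [1, 5]
```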
ROC curve

Receiver operating characteristic curve. Graph illustrating the discriminative ability of a classifier: sensitivity (y-axis) is plotted against the false-positive rate (x-axis) for different classification thresholds. The area under the curve (AUC) is the two-dimensional area underneath the ROC curve and provides an aggregate measure of performance across all thresholds.
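
The construction of the curve can be sketched in a few lines: each distinct score is used once as a classification threshold, the resulting (false-positive rate, sensitivity) pair is recorded, and the AUC is approximated with the trapezoidal rule. This is a simple illustration under stated assumptions (binary labels encoded as 1 and 0, higher scores meaning "more likely positive", hypothetical function names and data), not an optimized implementation.

```python
from typing import List, Sequence, Tuple


def roc_points(y_true: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (false-positive rate, sensitivity) pairs, one per threshold."""
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative instance")
    # Start with a threshold above every score: nothing is predicted positive.
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the piecewise-linear curve through the points (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels and classifier scores, for illustration only.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
points = roc_points(y_true, scores)
print(points)       # [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
print(auc(points))  # 0.75
```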
Intersection-Over-Union (IoU)

Important measure for assessing the performance of segmentation algorithms. Quantifies the overlap between two regions of interest, typically a ground truth segmentation and a predicted segmentation, e.g., of the left ventricle. Ranges from 0 to 1, with 1 indicating perfect overlap.

$\text{IoU} = \frac{\text{Area of overlap}}{\text{Area of union}}$
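
For binary segmentation masks, the two areas reduce to pixel counts. The sketch below assumes that both masks are equally sized 2D lists of 0 and 1; the function name, the toy masks, and the convention that two empty masks count as perfect agreement are assumptions for illustration.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]


def iou(mask_a: Mask, mask_b: Mask) -> float:
    """Intersection-over-Union of two binary masks of identical shape."""
    overlap = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            overlap += 1 if (a and b) else 0  # pixel belongs to both regions
            union += 1 if (a or b) else 0     # pixel belongs to at least one region
    # Assumed convention: two empty masks are treated as perfect agreement.
    return overlap / union if union else 1.0


# Hypothetical 3 x 4 masks, for illustration only.
ground_truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
prediction = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
print(iou(ground_truth, prediction))  # 2 / 6, about 0.333
```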

Dice similarity coefficient (DSC)

Another important measure in assessing segmentation algorithms. Ranges from 0 to 1, with 1 indicating perfect overlap.

$\text{DSC} = \frac{2 \times \text{Area of overlap}}{\text{Total area of objects}}$
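
A corresponding sketch for the DSC follows, with the total area of objects taken as the sum of the areas of the two regions. As above, the masks are assumed to be equally sized 2D lists of 0 and 1, and the function name, toy masks, and empty-mask convention are illustrative assumptions.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]


def dice(mask_a: Mask, mask_b: Mask) -> float:
    """Dice similarity coefficient of two binary masks of identical shape."""
    overlap = area_a = area_b = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            area_a += 1 if a else 0
            area_b += 1 if b else 0
            overlap += 1 if (a and b) else 0
    total_area = area_a + area_b
    # Assumed convention: two empty masks are treated as perfect agreement.
    return 2 * overlap / total_area if total_area else 1.0


# Hypothetical 3 x 4 masks, for illustration only.
ground_truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
prediction = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
print(dice(ground_truth, prediction))  # 2 * 2 / (4 + 4) = 0.5
```

For the same pair of masks, the DSC is never lower than the IoU; the two are related by DSC = 2 IoU / (1 + IoU), so here an IoU of 1/3 corresponds to a DSC of 0.5.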