2020 Oct 13;4(4):041503. doi: 10.1063/5.0011697

TABLE III.

Benchmark measures explained. Each entry lists the measure's abbreviation and full name, followed by its definition; a short numerical sketch of several measures follows the table.

Acc (accuracy): The proportion of correct predictions among the total number of predictions; (TP + TN)/(TP + TN + FP + FN).
Sp (specificity): The proportion of correctly classified negative cases among the total number of negative cases; TN/(TN + FP).
Se (sensitivity): The proportion of correctly classified positive cases among the total number of positive cases; TP/(TP + FN).
TP (true positive): An outcome where the model correctly predicts the positive class.
TN (true negative): An outcome where the model correctly predicts the negative class.
FP (false positive): An outcome where the model predicts the positive class for a case that is actually negative.
FN (false negative): An outcome where the model predicts the negative class for a case that is actually positive.
Error (error rate): The frequency with which errors occur, defined as “the ratio of the total number of data units in error to the total number of data units transmitted.”
Risk (risk score): Designed to represent the underlying probability of an adverse event, denoted Y = 1, given a vector X of p explanatory variables containing measurements of the relevant risk factors.
Time (time complexity): The computational complexity that describes the amount of time it takes to run an algorithm.
JD (Jaccard coefficient): Also known as the intersection over union or the Jaccard similarity coefficient; a statistic used for gauging the similarity and diversity of sample sets. It measures similarity between finite sample sets and is defined as the size of the intersection divided by the size of the union of the sample sets: J(A, B) = |A ∩ B|/|A ∪ B|.
DSI (Dice similarity index): A statistic used to gauge the similarity of two samples; twice the size of the intersection divided by the sum of the sizes of the two sets, DSI = 2|A ∩ B|/(|A| + |B|).
PSNR (peak signal-to-noise ratio): The ratio between the maximum possible power of a signal and the power of corrupting noise that affects the fidelity of its representation.
MSE (mean square error): Measures the average of the squares of the errors.
RMSE (root mean square error): Measures the standard deviation of the residuals; the square root of the MSE.
FRDD (fault rate dust detection): Calculated as FRDD = (TP + FN)/(TP + TN + FP + FN).
PCC (Pearson correlation coefficient): A measure of the linear correlation between two variables X and Y.
HDD (Hausdorff distance): Measures how far two subsets of a metric space are from each other.
AUC (area under the curve): A graphical plot illustrating the sensitivity as a function of “1 − specificity” in a binary classifier with a varying discrimination threshold. The area under the curve corresponds to the probability that a binary classifier will rank a randomly chosen positive instance higher than a randomly chosen negative one; range 0 to 1.
MAD (mean absolute distance): The average absolute distance between the points of two surfaces.
CMD (center of mass distance): The distance between the centers of mass of two sets of surface points.
MSD (mean distance of the surface points): The average distance between the points of two surfaces.
mSD (min. distance of the surface points): The minimum distance between the points of two surfaces.
pFDR (positive false discovery rate): Can be written as pFDR = E[V/R | R > 0], where V is the number of false positives (Type I error) and R is the number of rejected null hypotheses. The term “positive” describes the fact that we have conditioned on at least one positive finding having occurred.
MAE (mean absolute error): An average of the absolute errors |e_i| = |y_i − x_i|, where y_i is the prediction and x_i the true value.
RMSLE (root mean square logarithmic error): Measures the ratio between actual and predicted values; computed as the square root of the mean of the squared logarithmic errors.
Rec (recall): Quantifies the number of positive class predictions made out of all positive examples in the dataset. It is calculated as the number of TP divided by the total number of TP and FN.
Pr (precision): Quantifies the number of positive class predictions that actually belong to the positive class. It is calculated as the ratio of correctly predicted positive examples to the total number of examples predicted positive.
PPV (positive predictive value): The probability that subjects with a positive screening test truly have the disease.
NPV (negative predictive value): The probability that subjects with a negative screening test truly do not have the disease.
FAR (false alarm rate): The number of false alarms divided by the total number of warnings or alarms in a given study or situation.
ROC (receiver operating characteristic): Created by plotting the true positive rate against the false positive rate at various threshold settings.
IGV (intergroup variance): Variation caused by differences between individual groups.
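
To make several of the table's formulas concrete, the sketch below computes the confusion-matrix measures (Acc, Se, Sp, Pr, NPV, error rate), the error measures (MAE, MSE, RMSE, RMSLE), and the set-overlap measures (JD, DSI) from small hypothetical inputs. All counts and values are illustrative placeholders, not data from the paper, and only the Python standard library is used.

```python
import math

# Hypothetical confusion-matrix counts (not from the paper).
TP, TN, FP, FN = 40, 45, 5, 10
total = TP + TN + FP + FN

accuracy    = (TP + TN) / total   # Acc: correct predictions / all predictions
sensitivity = TP / (TP + FN)      # Se; identical to recall (Rec)
specificity = TN / (TN + FP)      # Sp: true negative rate
precision   = TP / (TP + FP)      # Pr; identical to PPV
npv         = TN / (TN + FN)      # NPV: correct negatives / predicted negatives
error_rate  = (FP + FN) / total   # classification error rate, the complement of Acc

# Error measures on hypothetical continuous predictions.
y_true = [3.0, 5.0, 2.5, 7.0]     # assumed ground-truth values
y_pred = [2.5, 5.0, 4.0, 8.0]     # assumed model outputs
n = len(y_true)

mae  = sum(abs(p - t) for p, t in zip(y_pred, y_true)) / n    # MAE: mean |prediction - truth|
mse  = sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / n  # MSE: mean squared error
rmse = math.sqrt(mse)                                         # RMSE: sqrt of MSE
rmsle = math.sqrt(                                            # RMSLE, using log(1 + x)
    sum((math.log1p(p) - math.log1p(t)) ** 2 for p, t in zip(y_pred, y_true)) / n
)

# Set-overlap measures on two hypothetical label sets.
a = {1, 2, 3, 4}
b = {3, 4, 5}
jaccard = len(a & b) / len(a | b)              # JD: |A ∩ B| / |A ∪ B|
dice    = 2 * len(a & b) / (len(a) + len(b))   # DSI: 2|A ∩ B| / (|A| + |B|)

print(f"Acc={accuracy:.3f} Se={sensitivity:.3f} Sp={specificity:.3f} "
      f"Pr={precision:.3f} NPV={npv:.3f} Error={error_rate:.3f}")
print(f"MAE={mae:.3f} MSE={mse:.3f} RMSE={rmse:.3f} RMSLE={rmsle:.3f}")
print(f"JD={jaccard:.3f} DSI={dice:.3f}")
```

With these placeholder counts the script prints Acc = 0.850 and Se = 0.800, matching (TP + TN)/(TP + TN + FP + FN) = 85/100 and TP/(TP + FN) = 40/50; recall and PPV are not computed separately because they coincide with sensitivity and precision, respectively.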