2020 Jul 14;14:28. doi: 10.3389/fninf.2020.00028

Table 3.

Summary of all performance metrics.

Metric	Model [5-vote] (std)	Model [3-vote] (std)	Neurologists [3-vote] (std)
AUC	0.92 (0.01)	0.92 (0.01)	N/A
AP	0.85 (0.01)	0.82 (0.02)	N/A
F1	0.81 (0.01)	0.77 (0.02)	0.71 (0.08)
Accuracy	0.89 (0.01)	0.87 (0.02)	0.82 (0.05)
Precision	0.84 (0.01)	0.81 (0.03)	0.76 (0.17)
Recall	0.78 (0.02)	0.74 (0.05)	0.75 (0.17)
Cohen's kappa	0.73 (0.02)	0.68 (0.03)	0.60 (0.11)

The values shown in parentheses are the standard deviations of the average performance. As noted earlier, the performance of each neurologist is averaged across four 3-vote ground-truth sets. For the 5-vote labels, we subsampled the test set 5 times (at 80%) to obtain the standard deviation. For all metrics, the abnormal class was used as the positive label. AUC, Area Under the receiver operating characteristic Curve; AP, Average Precision.
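The threshold-based metrics in the table and the subsampling estimate of the standard deviation can be illustrated with a minimal sketch. It is not the authors' code: the function names `binary_metrics` and `subsample_std`, the toy labels, the random seed, sampling without replacement, and the use of the sample (n-1) standard deviation are all assumptions made for illustration. The label 1 stands for the abnormal class, which is treated as the positive label.

```python
import random
import statistics


def binary_metrics(y_true, y_pred, positive=1):
    """Precision, recall, F1, accuracy and Cohen's kappa for binary labels.

    `positive` marks the class treated as positive (here: abnormal = 1).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    n = tp + fp + fn + tn

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / n

    # Agreement expected by chance, from the marginal label frequencies.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0

    return {"precision": precision, "recall": recall, "f1": f1,
            "accuracy": accuracy, "kappa": kappa}


def subsample_std(y_true, y_pred, metric="f1", n_rounds=5, fraction=0.8, seed=0):
    """Mean and standard deviation of a metric over random subsamples.

    Assumption: each round draws `fraction` of the test set without
    replacement; the source only states "5 times (at 80%)".
    """
    rng = random.Random(seed)
    indices = list(range(len(y_true)))
    k = int(round(fraction * len(indices)))
    scores = []
    for _ in range(n_rounds):
        chosen = rng.sample(indices, k)
        sub_true = [y_true[i] for i in chosen]
        sub_pred = [y_pred[i] for i in chosen]
        scores.append(binary_metrics(sub_true, sub_pred)[metric])
    # Sample standard deviation (n - 1 denominator); an assumption.
    return statistics.mean(scores), statistics.stdev(scores)


# Hypothetical toy labels: 1 = abnormal (positive), 0 = normal.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
toy_metrics = binary_metrics(toy_true, toy_pred)
toy_mean, toy_std = subsample_std(toy_true, toy_pred, metric="f1")
```

On the toy labels there are 3 true positives, 1 false positive, 1 false negative and 5 true negatives, so precision, recall and F1 are all 0.75 and accuracy is 0.80. AUC and AP are not covered here because they require the model's continuous scores rather than thresholded predictions.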