2019 Jan 17;8:e38173. doi: 10.7554/eLife.38173

Table 3. Cross-validated results of each labeler, where each labeler’s performance is compared against the annotations of the rest of the labelers using a majority vote.

Results are given in the form F1 score (precision, recall), and entries marked X correspond to datasets not manually annotated by the specific labeler. The results indicate decreased performance compared to the consensus annotations.

| Name | L1 | L2 | L3 | L4 | Mean |
|---|---|---|---|---|---|
| N.01.01 | 0.75 (0.73, 0.77) | 0.70 (0.58, 0.88) | 0.86 (0.81, 0.90) | 0.84 (0.92, 0.77) | 0.79 (0.76, 0.83) |
| N.03.00.t | X | 0.75 (0.69, 0.82) | 0.79 (0.67, 0.97) | 0.85 (0.76, 0.97) | 0.8 (0.71, 0.92) |
| N.00.00 | X | 0.87 (0.84, 0.90) | 0.82 (0.75, 0.91) | 0.72 (0.71, 0.97) | 0.83 (0.76, 0.93) |
| YST | 0.7 (0.93, 0.56) | 0.79 (0.7, 0.9) | 0.81 (0.76, 0.86) | 0.77 (0.75, 0.78) | 0.77 (0.78, 0.78) |
| N.04.00.t | X | 0.79 (0.76, 0.83) | 0.72 (0.60, 0.89) | 0.68 (0.53, 0.96) | 0.73 (0.63, 0.89) |
| N.02.00 | 0.84 (0.97, 0.75) | 0.88 (0.89, 0.87) | 0.86 (0.79, 0.94) | 0.81 (0.7, 0.95) | 0.85 (0.83, 0.88) |
| J123 | X | 0.9 (0.86, 0.93) | 0.89 (0.84, 0.93) | 0.77 (0.63, 0.96) | 0.87 (0.88, 0.88) |
| J115 | 0.85 (0.98, 0.76) | 0.87 (0.80, 0.97) | 0.88 (0.80, 0.97) | 0.87 (0.93, 0.82) | 0.85 (0.78, 0.94) |
| K53 | 0.86 (0.98, 0.77) | 0.9 (0.85, 0.96) | 0.88 (0.8, 0.96) | 0.89 (0.9, 0.88) | 0.88 (0.88, 0.89) |
| mean ± std | 0.8±0.06 (0.92±0.09, 0.72±0.08) | 0.83±0.07 (0.77±0.1, 0.9±0.05) | 0.83±0.05 (0.76±0.07, 0.92±0.04) | 0.81±0.06 (0.76±0.13, 0.9±0.08) | 0.82±0.06 (0.77±0.12, 0889±0.06) |
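
The scoring scheme described in the caption can be illustrated with a minimal sketch. It rests on assumptions not stated in the source: each labeler's annotation is reduced to a set of component IDs that have already been matched across labelers, and "majority" means a strict majority of the remaining labelers. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Dict, Set, Tuple


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total > 0 else 0.0


def precision_recall_f1(selected: Set[int], reference: Set[int]) -> Tuple[float, float, float]:
    """Score one labeler's selection against a reference set of component IDs."""
    true_pos = len(selected & reference)
    precision = true_pos / len(selected) if selected else 0.0
    recall = true_pos / len(reference) if reference else 0.0
    return precision, recall, f1_score(precision, recall)


def majority_vote(annotations: Dict[str, Set[int]]) -> Set[int]:
    """Components chosen by a strict majority of the given labelers (assumed rule)."""
    counts: Dict[int, int] = {}
    for selected in annotations.values():
        for comp in selected:
            counts[comp] = counts.get(comp, 0) + 1
    return {comp for comp, n in counts.items() if 2 * n > len(annotations)}


def leave_one_out_scores(annotations: Dict[str, Set[int]]) -> Dict[str, Tuple[float, float, float]]:
    """Compare each labeler with the majority vote of all the other labelers."""
    scores = {}
    for name, selected in annotations.items():
        rest = {k: v for k, v in annotations.items() if k != name}
        scores[name] = precision_recall_f1(selected, majority_vote(rest))
    return scores


# Hypothetical toy annotations: labeler name -> set of matched component IDs.
toy = {
    "L1": {1, 2, 3},
    "L2": {1, 2, 3, 4},
    "L3": {2, 3, 4, 5},
    "L4": {1, 2, 4},
}

for name, (p, r, f1) in leave_one_out_scores(toy).items():
    # Same layout as the table: F1 (precision, recall)
    print(f"{name}: {f1:.2f} ({p:.2f}, {r:.2f})")
```

As a consistency check on the table's format, `f1_score(0.73, 0.77)` rounds to 0.75, which matches the L1 entry for N.01.01.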