Table 3. Cross-validated results for each labeler. Each labeler’s annotations are scored against the majority vote of the remaining labelers’ annotations.
Name | L1 | L2 | L3 | L4 | Mean |
---|---|---|---|---|---|
N.01.01 | 0.75 (0.73, 0.77) | 0.70 (0.58, 0.88) | 0.86 (0.81, 0.90) | 0.84 (0.92, 0.77) | 0.79 (0.76, 0.83) |
N.03.00.t | X | 0.75 (0.69, 0.82) | 0.79 (0.67, 0.97) | 0.85 (0.76, 0.97) | 0.8 (0.71, 0.92) |
N.00.00 | X | 0.87 (0.84, 0.90) | 0.82 (0.75, 0.91) | 0.72 (0.71, 0.97) | 0.83 (0.76, 0.93) |
YST | 0.7 (0.93, 0.56) | 0.79 (0.7, 0.9) | 0.81 (0.76, 0.86) | 0.77 (0.75, 0.78) | 0.77 (0.78, 0.78) |
N.04.00.t | X | 0.79 (0.76, 0.83) | 0.72 (0.60, 0.89) | 0.68 (0.53, 0.96) | 0.73 (0.63, 0.89) |
N.02.00 | 0.84 (0.97, 0.75) | 0.88 (0.89, 0.87) | 0.86 (0.79, 0.94) | 0.81 (0.7, 0.95) | 0.85 (0.83, 0.88) |
J123 | X | 0.9 (0.86, 0.93) | 0.89 (0.84, 0.93) | 0.77 (0.63, 0.96) | 0.87 (0.88, 0.88) |
J115 | 0.85 (0.98, 0.76) | 0.87 (0.80, 0.97) | 0.88 (0.80, 0.97) | 0.87 (0.93, 0.82) | 0.85 (0.78, 0.94) |
K53 | 0.86 (0.98, 0.77) | 0.9 (0.85, 0.96) | 0.88 (0.8, 0.96) | 0.89 (0.9, 0.88) | 0.88 (0.88, 0.89) |
Mean ± std | 0.8±0.06 (0.92±0.09, 0.72±0.08) | 0.83±0.07 (0.77±0.1, 0.9±0.05) | 0.83±0.05 (0.76±0.07, 0.92±0.04) | 0.81±0.06 (0.76±0.13, 0.9±0.08) | 0.82±0.06 (0.77±0.12, 0.89±0.06) |
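
The leave-one-out comparison described in the caption can be sketched as follows. This is a minimal illustration and not the authors' code. In most cells the leading value is consistent with the harmonic mean of the pair in parentheses, so the sketch assumes each cell reports F1 followed by (precision, recall); that reading, the order of the pair, the binary label encoding, the tie handling, and all function and variable names are assumptions. The toy annotations are invented for the example.

```python
from collections import Counter


def majority_vote(votes):
    """Return the most common label, or None if the top two labels tie."""
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def f1_precision_recall(pred, ref):
    """Score binary predictions against a binary reference (1 = positive)."""
    tp = sum(p == 1 and r == 1 for p, r in zip(pred, ref))
    fp = sum(p == 1 and r == 0 for p, r in zip(pred, ref))
    fn = sum(p == 0 and r == 1 for p, r in zip(pred, ref))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return f1, precision, recall


def leave_one_out_scores(annotations):
    """Score each labeler against the majority vote of all other labelers.

    annotations maps a labeler name to a list of binary labels; all lists
    are assumed to cover the same items in the same order.
    """
    scores = {}
    for name, labels in annotations.items():
        others = [v for k, v in annotations.items() if k != name]
        pred, ref = [], []
        for i, label in enumerate(labels):
            vote = majority_vote([o[i] for o in others])
            if vote is None:
                continue  # assumption: items with a tied vote are skipped
            pred.append(label)
            ref.append(vote)
        scores[name] = f1_precision_recall(pred, ref)
    return scores


# Hypothetical annotations for one recording (not data from the paper).
toy = {
    "L1": [1, 1, 0, 0, 1, 0],
    "L2": [1, 0, 0, 0, 1, 0],
    "L3": [1, 1, 0, 1, 1, 0],
    "L4": [1, 1, 0, 0, 1, 1],
}
scores = leave_one_out_scores(toy)
for name, (f1, precision, recall) in scores.items():
    print(f"{name}: {f1:.2f} ({precision:.2f}, {recall:.2f})")
```

With four labelers the reference for each one is a vote among three, so ties cannot occur for binary labels; the tie branch only matters when a labeler is missing (the X entries) and an even number of annotators remains.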