Table 3.
| Rater experience | Inter-rater agreement | Cronbach's alpha | Mean recall (SD) (%) | Mean precision (SD) (%) | F1 | Windowed gold standard (GS) used to calculate precision and recall |
|---|---|---|---|---|---|---|
| Expert | 0.86 | 0.80 | 79 (2) | 91 (1) | 0.846 | Average of each expert compared to the Expert GS (with the compared expert removed) |
| Non-expert | 0.73 | 0.65 | 68 (15) | 77 (7) | 0.722 | Average of each non-expert compared to the Expert GS |
| Combined raters | 0.74 | 0.68 | 76 (13) | 83 (9) | 0.793 | Average of each rater (non-expert and expert) compared to the Expert GS (with the compared expert removed from the GS) |
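As a sanity check (not part of the source), the F1 values in Table 3 are consistent with the harmonic mean of each row's mean precision and mean recall; the sketch below verifies this using the table's values:

```python
# Verify that each reported F1 equals the harmonic mean of the row's
# mean precision and mean recall (values taken directly from Table 3).
rows = {
    "Expert":          (0.91, 0.79, 0.846),
    "Non-expert":      (0.77, 0.68, 0.722),
    "Combined raters": (0.83, 0.76, 0.793),
}

for name, (precision, recall, f1_reported) in rows.items():
    f1 = 2 * precision * recall / (precision + recall)
    # Reported F1 should match to the precision shown in the table.
    assert abs(f1 - f1_reported) < 0.001, name
    print(f"{name}: F1 = {f1:.3f}")
```

Note that F1 here is computed from the mean precision and recall across raters, not averaged per rater, which matches the rounded values in the table.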