Table 2.
Interrater reliability and kappa scores for the internal (n = 980) and multicenter (n = 996) validation sets
Property (n) | Interrater reliability (%)ᵃ, internal validation set (n = 980) | Interrater reliability (%)ᵃ, multicenter validation set (n = 996) | Kappa score, internal validation set (n = 980) | Kappa score, multicenter validation set (n = 996) |
---|---|---|---|---|
Laterality | 98.0 (NESK = 245, NFJP = 233) | 98.1 (NESK = 288, NFJP = 293) | 0.94 | 0.95 |
Temporality | 97.7 (NESK = 163, NFJP = 157) | 98.1 (NESK = 96, NFJP = 85) | 0.91 | 0.88 |
Uncertainty | 96.1 (NESK = 135, NFJP = 107) | 97.5 (NESK = 98, NFJP = 87) | 0.82 | 0.85 |
Removal of uncertainty | 98.3 (NESK = 11, NFJP = 14) | 99.3 (NESK = 8, NFJP = 11) | 0.36 | 0.63 |
Rows are sorted in descending order of kappa score.
ᵃ NESK is the total number of records with the corresponding property according to annotator ESK, and NFJP is the total number of records with the corresponding property according to annotator FJP.
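
The gap between the high interrater reliability and the low kappa score for "Removal of uncertainty" can be illustrated with a small calculation. The sketch below assumes that each property is scored as a binary label (present or absent) per record and that the kappa score is Cohen's kappa for two annotators. The table does not state either point, so both are assumptions, and the function name `cohen_kappa_binary` is hypothetical. It takes only the figures reported in the table: the set size, the two annotators' counts, and the observed agreement.

```python
def cohen_kappa_binary(n, n_a, n_b, p_o):
    """Cohen's kappa for two raters and a binary label (assumed setup).

    n   -- total number of records in the validation set
    n_a -- records marked positive by the first annotator (NESK)
    n_b -- records marked positive by the second annotator (NFJP)
    p_o -- observed agreement as a fraction (interrater reliability / 100)
    """
    # Agreement expected by chance: both mark "present" or both mark "absent".
    p_e = (n_a / n) * (n_b / n) + (1 - n_a / n) * (1 - n_b / n)
    # Kappa rescales observed agreement by the room left above chance.
    return (p_o - p_e) / (1 - p_e)


# Internal validation set (n = 980), figures taken from the table.
laterality = cohen_kappa_binary(980, 245, 233, 0.980)
removal = cohen_kappa_binary(980, 11, 14, 0.983)

print(f"Laterality: {laterality:.2f}")
print(f"Removal of uncertainty: {removal:.2f}")
```

Under these assumptions the sketch gives values close to, but not identical to, the reported kappa scores. The differences may come from rounding of the agreement percentages or from a different kappa setup in the original analysis. The pattern is the same either way: when a property is rare (11 and 14 records out of 980), chance agreement is already about 97%, so a small number of disagreements pulls kappa down sharply even though raw agreement stays above 98%.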