2017 Jun 7;78(5):413–418. doi: 10.1055/s-0037-1603649

Table 3. Interrater reliability by training level.

| Scale | Raters | Sellar invasion (a): reliability (95% CI) | Sellar invasion (a): percent agreement | Suprasellar extension (b): reliability (95% CI) | Suprasellar extension (b): percent agreement |
|---|---|---|---|---|---|
| Full scale | Faculty | 0.67 (0.48 to 0.80) | 9/50 (18%) | 0.80 (0.68 to 0.88) | 19/50 (38%) |
| Full scale | Resident | 0.68 (0.49 to 0.80) | 22/50 (44%) | 0.78 (0.64 to 0.87) | 14/50 (28%) |
| Intermediate scores | Faculty | 0.14 (−0.29 to 0.52) | 2/23 (9%) | 0.27 (−0.15 to 0.61) | 13/24 (54%) |
| Intermediate scores | Resident | 0.13 (−0.30 to 0.52) | 9/23 (39%) | 0.49 (0.11 to 0.75) | 9/24 (38%) |
| Scale ends | Faculty | 0.73 (0.49 to 0.87) | 7/27 (26%) | 0.85 (0.70 to 0.93) | 6/26 (23%) |
| Scale ends | Resident | 0.86 (0.71 to 0.95) | 13/27 (48%) | 0.86 (0.70 to 0.95) | 3/26 (12%) |
| Dichotomous scale | Faculty | 0.58 (0.36 to 0.74) | 36/50 (72%) | 0.51 (0.27 to 0.69) | 43/50 (86%) |
| Dichotomous scale | Resident | 0.62 (0.41 to 0.77) | 36/50 (72%) | 0.15 (−0.22 to 0.48) | 45/50 (90%) |

Abbreviation: CI, confidence interval.

(a) Full scale: Grades 0–IV. Dichotomous scale: Grades 0–III versus Grade IV.

(b) Full scale: Types 0–D. Dichotomous scale: Types 0–C versus Type D.
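
The percent agreement entries in Table 3 are ratios of agreeing cases to rated cases, so they can be rechecked from the counts alone. The Python sketch below is illustrative and not part of the original article. The names ROWS, percent, and mismatches are hypothetical, and rounding to the nearest whole percent (halves rounded up) is an assumption about how the published values were rounded. The reliability coefficients are not recomputed, since they depend on the underlying ratings, which the table does not give.

```python
import math

# (scale, raters, measure, agreeing cases, rated cases, reported percent)
# Counts and percentages are copied from the "Percent agreement" columns of Table 3.
ROWS = [
    ("Full scale", "Faculty", "Sellar", 9, 50, 18),
    ("Full scale", "Faculty", "Suprasellar", 19, 50, 38),
    ("Full scale", "Resident", "Sellar", 22, 50, 44),
    ("Full scale", "Resident", "Suprasellar", 14, 50, 28),
    ("Intermediate scores", "Faculty", "Sellar", 2, 23, 9),
    ("Intermediate scores", "Faculty", "Suprasellar", 13, 24, 54),
    ("Intermediate scores", "Resident", "Sellar", 9, 23, 39),
    ("Intermediate scores", "Resident", "Suprasellar", 9, 24, 38),
    ("Scale ends", "Faculty", "Sellar", 7, 27, 26),
    ("Scale ends", "Faculty", "Suprasellar", 6, 26, 23),
    ("Scale ends", "Resident", "Sellar", 13, 27, 48),
    ("Scale ends", "Resident", "Suprasellar", 3, 26, 12),
    ("Dichotomous scale", "Faculty", "Sellar", 36, 50, 72),
    ("Dichotomous scale", "Faculty", "Suprasellar", 43, 50, 86),
    ("Dichotomous scale", "Resident", "Sellar", 36, 50, 72),
    ("Dichotomous scale", "Resident", "Suprasellar", 45, 50, 90),
]


def percent(agree: int, total: int) -> int:
    """Whole-number percentage, rounding halves up (assumed convention)."""
    return math.floor(100 * agree / total + 0.5)


def mismatches(rows=ROWS):
    """Return the rows whose recomputed percentage differs from the reported one."""
    return [row for row in rows if percent(row[3], row[4]) != row[5]]


if __name__ == "__main__":
    bad = mismatches()
    if bad:
        for row in bad:
            print("mismatch:", row, "recomputed", percent(row[3], row[4]))
    else:
        print("all reported percentages match the counts")
```

The same counts also show that the intermediate-score and scale-end subsets partition the 50 cases (23 + 27 for sellar invasion, 24 + 26 for suprasellar extension).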