BMC Med Res Methodol. 2014 Apr 1;14:45. doi: 10.1186/1471-2288-14-45

Table 1.

Inter-rater reliability on the Newcastle-Ottawa Scale (NOS) assessments, by item

| Item | Agreement, κ (95% CI) | Interpretation [8] | 0 point difference^c | ±1 point difference^c | ≥ ±2 points difference^c |
|---|---|---|---|---|---|
| Representativeness of the exposed cohort | 0.03 (−0.10, 0.15) | Slight | 43 (66.2%) | 22 (33.8%) | 0 (0%) |
| Selection of the non-exposed cohort | 0.00 (0.00, 0.00) | Slight | 53 (81.5%) | 12 (18.5%) | 0 (0%) |
| Ascertainment of exposure | −0.02 (−0.08, 0.04) | Poor | 12 (18.5%) | 53 (81.5%) | 0 (0%) |
| Demonstration that outcome of interest was not present at start of study | 0.09 (−0.16, 0.35) | Slight | 47 (72.3%) | 18 (27.7%) | 0 (0%) |
| Comparability | 0.00^a (−0.11, 0.12) | Slight | 38 (58.5%) | 18 (27.7%) | 9 (13.8%) |
| Assessment of outcome | −0.04 (−0.09, 0.00) | Poor | 59 (90.8%) | 6 (9.2%) | 0 (0%) |
| Was follow-up long enough for outcomes to occur | −0.06 (−0.22, 0.10) | Poor | 31 (47.7%) | 34 (52.3%) | 0 (0%) |
| Adequacy of follow-up of cohorts | 0.15 (−0.19, 0.48) | Slight | 57 (87.7%) | 8 (12.3%) | 0 (0%) |
| Total NOS score | −0.004^a (−0.11, 0.11) | Poor | 15 (23.1%) | 20 (30.8%) | 30 (46.1%) |
| Total categorized NOS score | 0.14^b (−0.02, 0.29) | Slight | 44 (67.7%) | 21 (32.3%) | 0 (0%) |

Abbreviation: 95% CI = 95% confidence interval.

^a Linear weighted kappa was used for both Comparability and Total NOS score; other kappas were not weighted (i.e., Cohen's kappa was applied).
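The paper does not state which software was used; as a minimal sketch of the statistics named in footnote a, the snippet below computes an unweighted (Cohen's) kappa and a linear weighted kappa with scikit-learn's `cohen_kappa_score`. The rating vectors are illustrative placeholders, not the study data.

```python
# Sketch of the agreement statistics in footnote a (illustrative data only).
from sklearn.metrics import cohen_kappa_score

# Hypothetical binary NOS item scores (0/1) from reviewers and authors for 10 studies
reviewer_item = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
author_item   = [1, 1, 1, 0, 0, 1, 1, 1, 1, 0]

# Unweighted Cohen's kappa, as used for the individual NOS items
kappa_unweighted = cohen_kappa_score(reviewer_item, author_item)

# Hypothetical total NOS scores (0-9) for the same 10 studies
reviewer_total = [7, 8, 6, 9, 5, 7, 8, 6, 7, 9]
author_total   = [8, 7, 6, 7, 6, 9, 8, 5, 7, 8]

# Linear weighted kappa, as used for Comparability and the Total NOS score:
# the disagreement penalty grows linearly with the size of the score difference
kappa_linear = cohen_kappa_score(reviewer_total, author_total, weights="linear")

print(f"Cohen's kappa: {kappa_unweighted:.2f}")
print(f"Linear weighted kappa: {kappa_linear:.2f}")
```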

^b Quadratic weighted kappa was used, assuming the differences between the very high risk, high risk, and low risk categories were comparably unequal.
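For the categorized NOS score in footnote b, a quadratic weighted kappa penalizes larger category disagreements more heavily. The sketch below assumes the three risk categories are coded as ordered integers; both the coding and the category assignments are illustrative assumptions, not the study data.

```python
# Quadratic weighted kappa sketch for the categorized NOS score
# (assumed ordinal coding: 0 = very high risk, 1 = high risk, 2 = low risk).
from sklearn.metrics import cohen_kappa_score

reviewer_cat = [2, 1, 2, 0, 1, 2, 2, 1, 0, 2]
author_cat   = [2, 2, 2, 1, 1, 2, 1, 1, 0, 2]

# Quadratic weights: the penalty grows with the square of the category gap
kappa_quadratic = cohen_kappa_score(reviewer_cat, author_cat, weights="quadratic")
print(f"Quadratic weighted kappa: {kappa_quadratic:.2f}")
```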

^c Number of studies with a difference of 0, ±1, or ±2 or more points between reviewer and author assessments, reported separately for each item.