BMJ Qual Saf. 2012 May 31;21(8):649–656. doi: 10.1136/bmjqs-2011-000429

Table 4.

Reliability of assessment of insightful practice (AIP) questions 1–5

| Raters | AIP questions 1–3 (engagement, insight and action), 1–7 scale: internal consistency (G) | AIP questions 1–3 (engagement, insight and action), 1–7 scale: inter-rater reliability (G) | AIP question 4 (global assessment), 1–7 scale: inter-rater reliability (G/ICC)* (95% CI) | AIP question 5 (binary yes/no recommendation on revalidation): inter-rater reliability (G/ICC)* (95% CI) |
|---|---|---|---|---|
| 1 | **0.94** | 0.71 | 0.66 | 0.54 |
| 2 | **0.96** | **0.83** | 0.79 (0.68 to 0.88) | 0.70 (0.54 to 0.83) |
| 3 | **0.96** | **0.88** | **0.85** (0.78 to 0.91) | 0.78 (0.69 to 0.86) |
| 4 | **0.97** | **0.91** | **0.89** (0.84 to 0.93) | **0.83** (0.75 to 0.89) |
| 5 | **0.97** | **0.92** | **0.91** (0.87 to 0.94) | **0.86** (0.80 to 0.91) |
| 6 | **0.97** | **0.94** | **0.92** (0.89 to 0.95) | **0.88** (0.83 to 0.92) |

Reliabilities greater than 0.8, as required for high-stakes assessment, are given in bold.9

* Intraclass correlation coefficients (ICCs) are equivalent to G coefficients in a one-facet (rater) design.
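The footnote above equates ICCs with G coefficients for a one-facet (rater) design. As a minimal sketch, assuming a fully crossed persons × raters score matrix with complete data (the paper's exact model and software are not reproduced here, and the function name is illustrative only), such a coefficient can be estimated from ANOVA variance components:

```python
import numpy as np

def one_facet_g(scores, n_raters_projected=None):
    """Relative G coefficient (ICC) for a fully crossed one-facet (rater) design.

    scores: persons x raters array of ratings, with no missing data.
    n_raters_projected: number of raters to project reliability for
        (defaults to the number of raters actually in the matrix).
    """
    scores = np.asarray(scores, dtype=float)
    n_p, n_r = scores.shape
    grand = scores.mean()
    person_means = scores.mean(axis=1)
    rater_means = scores.mean(axis=0)

    # ANOVA mean squares for a two-way crossed design without replication
    ms_p = n_r * np.sum((person_means - grand) ** 2) / (n_p - 1)
    ms_pr = np.sum(
        (scores - person_means[:, None] - rater_means[None, :] + grand) ** 2
    ) / ((n_p - 1) * (n_r - 1))

    # Estimated variance components (person variance truncated at zero)
    var_pr = ms_pr                          # person x rater interaction + error
    var_p = max((ms_p - ms_pr) / n_r, 0.0)  # person (true-score) variance

    k = n_raters_projected or n_r
    # Relative G coefficient for the mean of k raters
    return var_p / (var_p + var_pr / k)
```

Calling the function with n_raters_projected = 2, 3, … projects reliability to larger rater panels, which is the kind of step-up with rater numbers seen down the rows of the table.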

Inter-rater reliability is the extent to which one rater's assessments (or, when based on multiple raters, the average of their assessments) predict another rater's assessments.

95% CIs for reliabilities (ICCs) were calculated using Fisher's Zr transformation; its standard error depends on the number of raters (k), with (k−1) in the denominator, so a CI cannot be calculated when there is only one rater.9
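A minimal sketch of that interval calculation, assuming the classical Fisher variance-stabilising transformation for an intraclass correlation with k raters and n subjects (the paper's exact formula and inputs are not reproduced here, so the standard-error expression below is an assumption):

```python
import math

def icc_fisher_ci(icc, n_subjects, k_raters, z_crit=1.96):
    """Approximate 95% CI for an ICC via Fisher's Z transformation.

    Undefined for a single rater, because (k - 1) sits in the denominator
    of the standard error, mirroring the footnote above.
    """
    if k_raters < 2:
        raise ValueError("CI needs at least two raters: (k - 1) is in the denominator")

    # Transform: z = 0.5 * ln[(1 + (k - 1) * ICC) / (1 - ICC)]
    z = 0.5 * math.log((1 + (k_raters - 1) * icc) / (1 - icc))
    # Approximate standard error of z for an intraclass correlation
    se = math.sqrt(k_raters / (2 * (k_raters - 1) * (n_subjects - 2)))

    def back_transform(zv):
        # Invert the transformation back to the ICC scale
        e = math.exp(2 * zv)
        return (e - 1) / (e + k_raters - 1)

    return back_transform(z - z_crit * se), back_transform(z + z_crit * se)
```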