2024 Feb 15;13:e54704. doi: 10.2196/54704

Table 5.

Interrater reliability per METRICS item.

| METRICS item [a] | Mean score [b] (SD) | Score range | Quality | Cohen κ | Asymptotic standard error | Approximate T | P value |
|---|---|---|---|---|---|---|---|
| Model | 3.72 (0.58) | 2.5-5.0 | Very good | 0.820 | 0.090 | 6.044 | <.001 |
| Timing | 2.90 (1.93) | 1.0-5.0 | Good | 0.853 | 0.076 | 6.565 | <.001 |
| Count | 3.04 (1.32) | 1.0-5.0 | Good | 0.962 | 0.037 | 10.675 | <.001 |
| Specificity of prompts and language | 3.44 (1.25) | 1.0-5.0 | Very good | 0.765 | 0.086 | 8.083 | <.001 |
| Evaluation | 3.31 (1.16) | 1.0-5.0 | Good | 0.885 | 0.063 | 9.668 | <.001 |
| Individual factors | 2.50 (1.42) | 1.0-5.0 | Satisfactory | 0.865 | 0.087 | 6.860 | <.001 |
| Transparency | 3.24 (1.01) | 1.0-5.0 | Good | 0.558 | 0.112 | 5.375 | <.001 |
| Range | 3.24 (1.07) | 2.0-5.0 | Good | 0.836 | 0.076 | 8.102 | <.001 |
| Randomization | 1.31 (0.87) | 1.0-4.0 | Suboptimal | 0.728 | 0.135 | 5.987 | <.001 |
| Overall | 3.01 (0.58) | 1.5-4.1 | Good | 0.381 | 0.086 | 10.093 | <.001 |

[a] METRICS: Model, Evaluation, Timing, Range/Randomization, Individual factors, Count, and Specificity of prompts and language.

[b] The mean scores represent the results of evaluating the included studies averaged for the 2 rater scores.
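For readers who want to reproduce this kind of per-item agreement figure, the following is a minimal sketch of an unweighted Cohen κ computed from two raters' scores, together with the per-study mean of the two scores. It assumes the reported κ is the unweighted form, which the table does not state. The names `cohen_kappa`, `rater_a`, and `rater_b` and all score values are hypothetical illustrations, not data from the study.

```python
from collections import Counter


def cohen_kappa(rater1, rater2):
    """Unweighted Cohen kappa for two equal-length lists of categorical scores."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("Both raters must score the same non-empty set of studies.")
    n = len(rater1)
    # Observed agreement: share of studies given the same score by both raters.
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement: product of the raters' marginal proportions, per category.
    counts1, counts2 = Counter(rater1), Counter(rater2)
    categories = counts1.keys() | counts2.keys()
    expected = sum(counts1[c] * counts2[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined,
        # so this sketch returns 1.0 by convention (an assumption).
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical scores (1-5) given by two raters to eight studies on one item.
rater_a = [5, 4, 3, 3, 1, 2, 5, 4]
rater_b = [5, 4, 3, 2, 1, 2, 5, 3]

kappa = cohen_kappa(rater_a, rater_b)

# Per-study score averaged over the two raters, as described in footnote b.
mean_scores = [(a + b) / 2 for a, b in zip(rater_a, rater_b)]
item_mean = sum(mean_scores) / len(mean_scores)
```

The standard error and approximate T columns come from a statistics package and are not reproduced by this sketch.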