TABLE 1.
Definitions and limitations of reliability and agreement parameters
| | Intraclass correlation coefficient (ICC) | Cohen's kappa | Positive agreement | Negative agreement | Prevalence index |
|---|---|---|---|---|---|
| General definition | Reliability measure indicating the degree of correlation of a continuous metric across different raters.13 | Agreement measure between two raters, taking into account the probability of agreement occurring by chance.14 | Degree of agreement between two raters for the positive category, given the distribution of responses.15 | Degree of agreement between two raters for the negative category, given the distribution of responses.15 | Relative prevalence of the positive compared to the negative category.16 |
| Description with regard to FOG assessment | Correlation between raters on their overall FOG score (e.g., % time frozen) across the assessed participants. | Exact overlap between the raters' annotated FOG events, taking into account the probability of guessing. | Probability measure for the overlapping FOG events (= black areas). | Probability measure for the overlapping no-FOG periods (= white areas). | Overall prevalence of FOG events. |
| Interpretation | See Cohen's kappa | <0.00 poor; 0.00–0.20 slight; 0.21–0.40 fair; 0.41–0.60 moderate; 0.61–0.80 substantial; 0.81–1.00 almost perfect17 | See Cohen's kappa | See Cohen's kappa | −1 = no FOG; +1 = continuously FOG |
| Limitations | Does not reflect the exact overlap of the FOG events.13 Many different formulas exist to calculate the ICC, each with a slightly different purpose.18 | Kappa tends to underestimate the agreement when events are rare.14,19 When both raters annotate no FOG episodes for a certain participant, Cohen's kappa becomes undefined.14 | All three metrics (positive agreement, negative agreement, and prevalence index) are needed together to evaluate the agreement between raters. | See positive agreement | See positive agreement |
| Advice | Limit the use of the ICC to studies where the exact overlap between episodes is not important (e.g., evaluation of FOG treatments) and always report the formula used. | Consider reporting positive agreement, negative agreement, and prevalence index instead of Cohen's kappa.15,16 | Report all three metrics together. | See positive agreement | See positive agreement |
Note: Formulas of the different agreement parameters are given in the supplementary material.
Abbreviations: FOG, freezing of gait; ICC, intraclass correlation coefficient.
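The exact formulas are given in the supplementary material. As a rough illustration only (not the article's supplementary code), the sketch below computes Cohen's kappa, positive agreement, negative agreement, and the prevalence index from two raters' frame-by-frame binary FOG annotations, using the standard 2 × 2 contingency-table definitions; the function and variable names are illustrative, not taken from the cited references.

```python
def agreement_metrics(rater1, rater2):
    """Cohen's kappa, positive agreement, negative agreement, and prevalence
    index for two frame-by-frame binary FOG annotations (1 = FOG, 0 = no FOG).
    Standard 2x2-table definitions; illustrative sketch only."""
    if len(rater1) != len(rater2):
        raise ValueError("Both raters must annotate the same number of frames")
    n = len(rater1)
    a = sum(1 for x, y in zip(rater1, rater2) if x == 1 and y == 1)  # both FOG
    b = sum(1 for x, y in zip(rater1, rater2) if x == 1 and y == 0)  # rater 1 only
    c = sum(1 for x, y in zip(rater1, rater2) if x == 0 and y == 1)  # rater 2 only
    d = sum(1 for x, y in zip(rater1, rater2) if x == 0 and y == 0)  # both no FOG

    p_obs = (a + d) / n                                      # observed agreement
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2   # agreement expected by chance
    # Kappa is undefined (0/0) when both raters annotate no FOG at all,
    # cf. the limitation listed in the table above.
    kappa = (p_obs - p_exp) / (1 - p_exp) if p_exp < 1 else float("nan")

    pos_agree = 2 * a / (2 * a + b + c) if (2 * a + b + c) else float("nan")
    neg_agree = 2 * d / (2 * d + b + c) if (2 * d + b + c) else float("nan")
    prevalence_index = (a - d) / n   # -1 = no FOG at all, +1 = continuous FOG

    return {"kappa": kappa, "positive_agreement": pos_agree,
            "negative_agreement": neg_agree, "prevalence_index": prevalence_index}

# Example: two raters annotating the same 10 video frames
rater1 = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]
rater2 = [0, 0, 1, 1, 0, 0, 0, 0, 0, 0]
print(agreement_metrics(rater1, rater2))
```

In this toy example the raters disagree on a single frame; because FOG frames are relatively rare, negative agreement comes out higher than positive agreement, illustrating why the three metrics should be reported together.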