2022 Nov 19;20:152. doi: 10.1186/s12955-022-02062-1

Table 5.

Evidence of inter-rater agreement

	Inter-rater agreement (PwD and proxy)
EQ-5D dimension	Evidence	Source
Mobility	Mobility was the only dimension to produce an acceptable level of agreement (kappa = 0.53 formal, 0.44 informal proxy)	Ankri et al. [19]
	The mobility dimension produced the best agreement of all dimensions (kappa coefficients indicated moderate agreement)	Orgeta et al. [37]
	Proxies rated more problems in the mobility dimension (than PwD had self-rated)	Vogel et al. [32]
Self-care	PwD self-rated self-care significantly more optimistically than proxies (1.0 ± 0.2 vs 1.1 ± 0.4, P = 0.031)	Bonfiglio et al. [41]
Self-care	Self-care was the only EQ-5D dimension to show a significant correlation between self and proxy report (r = 0.51, p < 0.01)	Vogel et al. [32]
Usual activities	Agreement with PwD (across proxy type) was lowest for the usual activities dimension	Orgeta et al. [37]
	The difference between the kappa-coefficients in the subgroups of mild vs moderate PwD was statistically significant (p > 0.05)	Kunz et al. [29]
	PwD self-rated usual activities significantly more optimistically than proxies (1.1 ± 0.4 vs 1.6 ± 0.8, P = 0.000)	Bonfiglio et al. [41]
	Proxies rated more problems in the usual activities dimension (than PwD had self-rated)	Vogel et al. [32]
Pain/discomfort	PwD self-rated pain/discomfort significantly more optimistically than proxies (1.5 ± 0.7 vs 1.7 ± 0.9, P = 0.015)	Bonfiglio et al. [41]
	Proxies rated more problems in the pain/discomfort dimension (than PwD had self-rated)	Vogel et al. [32]
Anxiety/depression	PwD self-rated anxiety/depression significantly more optimistically than proxies (1.1 ± 0.5 vs 1.3 ± 0.5, P = 0.008)	Bonfiglio et al. [41]
Anxiety/depression	Anxiety/depression was the only dimension in which PwD self-rated more problems than proxies did	Vogel et al. [32]
EQ-5D index score	Intraclass correlation coefficients for EQ-5D total scores on PwD and proxy responses reflected average concordance (informal: ICC = 0.41, p < 0.001; formal: ICC = 0.42, p < 0.001)	Ankri et al. [19]
	Proxy EQ-5D ratings were significantly worse, with a mean difference of 0.1 in total score	Kunz et al. [29]
	Relationships between EQ-5D scores and clinical variables (CSDD, NPI, ADCS-ADL) were stronger for proxy assessments	Bhattacharya et al. [20]
	Proxy EQ-5D index scores were significantly lower than self-report (0.8 ± 0.1 vs 0.9 ± 0.1, P = 0.000)	Bonfiglio et al. [41]
	MMSE and NPI scores were significantly associated with EQ-5D proxy (p = 0.00), but not EQ-5D self-report (p = 0.63)	Farina et al. [35]
	EQ-5D index scores were significantly different based on the rater: 0.67 (± 0.33) for self-report and 0.45 (± 0.36) for proxy, p < 0.001	Heßmann et al. [28]
	Self-completed EQ-5D was poor at reflecting clinically important differences and changes in clinical measures, vs EQ-5D proxy which did capture these changes	Martin et al. [36]
	EQ-5D proxy index scores were significantly lower than self-scores	Orgeta et al. [37]
	Self-rated EQ-5D scores were significantly higher than proxy EQ-5D scores (patient mean EQ-5D score 0.71, 95% CI 0.64–0.77; proxy mean EQ-5D score 0.30, 95% CI 0.22–0.38), mean difference 0.40 (95% CI 0.32–0.48, p = 0.001)	Sheehan et al. [38]
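
Several rows above report agreement between PwD and proxy ratings as a kappa coefficient. As an illustration only, the following minimal sketch shows how an unweighted Cohen's kappa could be computed for one EQ-5D dimension. The function name `cohen_kappa` and the ten paired ratings are hypothetical and are not taken from any of the cited studies, which may have used weighted kappa or other software.

```python
from collections import Counter


def cohen_kappa(self_ratings, proxy_ratings):
    """Unweighted Cohen's kappa for two raters scoring the same subjects."""
    if len(self_ratings) != len(proxy_ratings) or not self_ratings:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(self_ratings)

    # Observed agreement: share of pairs where both raters chose the same level.
    observed = sum(s == p for s, p in zip(self_ratings, proxy_ratings)) / n

    # Chance agreement: product of the two raters' marginal proportions per level.
    self_counts = Counter(self_ratings)
    proxy_counts = Counter(proxy_ratings)
    levels = set(self_counts) | set(proxy_counts)
    expected = sum(self_counts[lv] * proxy_counts[lv] for lv in levels) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical EQ-5D-3L levels (1 = no, 2 = some, 3 = extreme problems)
# for ten PwD/proxy pairs on a single dimension. Illustrative data only.
pwd_self = [1, 1, 2, 1, 2, 1, 1, 3, 2, 1]
proxy = [1, 2, 2, 1, 2, 2, 1, 3, 3, 1]

kappa = cohen_kappa(pwd_self, proxy)
print(f"kappa = {kappa:.2f}")  # prints "kappa = 0.52"
```

With these made-up ratings, observed agreement is 0.70 and chance agreement is 0.38, giving kappa = 0.32 / 0.62 ≈ 0.52, a value conventionally read as moderate agreement.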