Table 3.
Distribution analysis and inter-rater agreement.
**Human response distribution**

| Response | P1 | P2 | P3 | P4 | Avg. | % |
|---|---|---|---|---|---|---|
| −1 | 2,065 | 995 | 645 | 1,185 | 1,223 | 34.0% |
| 0 | 149 | 1,120 | 1,895 | 1,270 | 1,109 | 30.8% |
| 1 | 1,386 | 1,485 | 1,060 | 1,145 | 1,269 | 35.3% |
| Total | 3,600 | 3,600 | 3,600 | 3,600 | 3,600 | 100% |
**Participant agreement analysis (number of matching responses per rater pair)**

| | P1 | P2 | P3 | P4 | Avg. | % |
|---|---|---|---|---|---|---|
| P1 | 0 | 1,726 | 1,308 | 1,650 | 1,561 | 43% |
| P2 | 1,726 | 0 | 1,944 | 1,758 | 1,809 | 50% |
| P3 | 1,308 | 1,944 | 0 | 1,741 | 1,664 | 46% |
| P4 | 1,650 | 1,758 | 1,741 | 0 | 1,716 | 48% |
| Total | | | | | 6,751 | |
| Avg. matches per participant | | | | | 1,688 | 47% |
**Fleiss' kappa**

| Kappa | Std. error | CI lower | CI upper | Agreement | Z | p-value |
|---|---|---|---|---|---|---|
| 0.202 | 0.0048153 | 0.19955 | 0.20446 | "Fair" | 41.951 | 0 |
The top part shows the distribution of human judgments across the three possible questionnaire responses: “less” (−1), “neutral” (0), and “more” (1). The bottom part shows pairwise agreement among the four raters; Fleiss' kappa analysis indicates that inter-rater agreement is better than chance (p < 0.05). The task was difficult and the responses noisy; therefore, only the most reliable questions were used for comparison with the CEREBRA model.
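For reference, agreement statistics of this kind can be recomputed with standard tools. The sketch below is illustrative only (not the authors' code): it assumes the raw ratings are available as a 3,600 × 4 matrix of responses in {−1, 0, 1}; the variable names and the random placeholder data are hypothetical.

```python
# Minimal sketch: pairwise percent agreement and Fleiss' kappa
# for a (n_items x n_raters) matrix of categorical ratings.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

rng = np.random.default_rng(0)
ratings = rng.choice([-1, 0, 1], size=(3600, 4))  # placeholder data, 4 raters

# Pairwise agreement (analogous to the bottom block of Table 3).
n_items, n_raters = ratings.shape
for i in range(n_raters):
    for j in range(i + 1, n_raters):
        matches = int(np.sum(ratings[:, i] == ratings[:, j]))
        print(f"P{i+1}-P{j+1}: {matches} matches ({matches / n_items:.0%})")

# Fleiss' kappa over all raters (analogous to the last row of Table 3).
counts, _ = aggregate_raters(ratings)      # items x categories count table
kappa = fleiss_kappa(counts, method="fleiss")
print(f"Fleiss' kappa = {kappa:.3f}")
```

With real (non-random) ratings, the kappa value, its standard error, and the resulting Z statistic and p-value would correspond to the figures reported in the table.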