Table 3.
Distribution analysis and inter-rater agreement.
**Human response distribution**

| Response | P1 | P2 | P3 | P4 | Avg. | % |
|---|---|---|---|---|---|---|
| −1 | 2,065 | 995 | 645 | 1,185 | 1,223 | 34.0% |
| 0 | 149 | 1,120 | 1,895 | 1,270 | 1,109 | 30.8% |
| 1 | 1,386 | 1,485 | 1,060 | 1,145 | 1,269 | 35.3% |
| Total | 3,600 | 3,600 | 3,600 | 3,600 | 3,600 | 100% |
**Participant agreement analysis (number of matching responses per rater pair)**

| | P1 | P2 | P3 | P4 | Avg. | % |
|---|---|---|---|---|---|---|
| P1 | 0 | 1,726 | 1,308 | 1,650 | 1,561 | 43% |
| P2 | 1,726 | 0 | 1,944 | 1,758 | 1,809 | 50% |
| P3 | 1,308 | 1,944 | 0 | 1,741 | 1,664 | 46% |
| P4 | 1,650 | 1,758 | 1,741 | 0 | 1,716 | 48% |
| Total | | | | | 6,751 | |
| Avg. matches per participant | | | | | 1,688 | 47% |
**Fleiss' kappa**

| Kappa | Std. error | CI lower | CI upper | Agreement | Z | p-value |
|---|---|---|---|---|---|---|
| 0.202 | 0.0048153 | 0.19955 | 0.20446 | "Fair" | 41.951 | 0 |
The top part shows the distribution of human judgments across the three possible questionnaire responses: “less” (−1), “neutral” (0), and “more” (1). The bottom part shows pairwise agreement among the four raters; Fleiss' kappa analysis indicates that inter-rater agreement is better than chance (p < 0.05). The task was difficult and the responses noisy; therefore, only the most reliable questions were used for comparison with the CEREBRA model.
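For reference, agreement statistics of this kind can be recomputed with standard tools. The sketch below is illustrative only (not the authors' code): it assumes the raw ratings are available as a 3,600 × 4 matrix of responses in {−1, 0, 1}; the variable names and the random placeholder data are hypothetical.

```python
# Minimal sketch: pairwise percent agreement and Fleiss' kappa
# for a (n_items x n_raters) matrix of categorical ratings.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

rng = np.random.default_rng(0)
ratings = rng.choice([-1, 0, 1], size=(3600, 4))  # placeholder data, 4 raters

# Pairwise agreement (analogous to the bottom block of Table 3).
n_items, n_raters = ratings.shape
for i in range(n_raters):
    for j in range(i + 1, n_raters):
        matches = int(np.sum(ratings[:, i] == ratings[:, j]))
        print(f"P{i+1}-P{j+1}: {matches} matches ({matches / n_items:.0%})")

# Fleiss' kappa over all raters (analogous to the last row of Table 3).
counts, _ = aggregate_raters(ratings)      # items x categories count table
kappa = fleiss_kappa(counts, method="fleiss")
print(f"Fleiss' kappa = {kappa:.3f}")
```

With real (non-random) ratings, the kappa value, its standard error, and the resulting Z statistic and p-value would correspond to the figures reported in the table.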