Table 1.
Kolmogorov-Smirnov tests comparing each rater's judgments between pairs of the 5 trial blocks (10 possible pairings per rater).
p-value of test of agreement across judgments within rater, by pairing of trial blocks
Rater | 1–2 | 1–3 | 1–4 | 1–5 | 2–3 | 2–4 | 2–5 | 3–4 | 3–5 | 4–5 |
1 | 0.349 | 0.006 | 0.013 | 0.009 | 0.108 | 0.108 | 0.108 | 0.779 | 0.689 | 0.507 |
2 | 0.779 | 0.283 | 0.424 | 0.083 | 0.859 | 0.991 | 0.180 | 0.968 | 0.018 | 0.140 |
3 | 0.108 | <0.001 | 0.002 | 0.034 | 0.062 | 0.349 | 0.689 | 0.083 | 0.002 | 0.507 |
4 | 0.924 | 0.424 | 0.227 | 0.018 | 0.779 | 0.507 | 0.062 | 0.998 | 0.507 | 0.507 |
5 | 0.283 | 0.140 | 0.001 | 0.083 | 0.083 | <0.001 | <0.001 | 0.507 | 0.108 | 0.689 |
6 | 0.227 | 0.140 | 0.062 | 0.108 | 0.991 | 0.859 | 0.689 | 0.689 | 0.968 | 0.349 |
7 | 0.013 | 0.006 | 0.507 | 0.596 | 0.779 | 0.349 | 0.083 | 0.108 | 0.013 | 0.689 |
8 | 0.025 | 0.227 | 0.083 | 0.083 | 0.968 | 0.046 | 0.034 | 0.349 | 0.283 | 0.779 |
9 | 0.507 | 0.424 | 0.004 | 0.004 | 0.006 | <0.001 | <0.001 | 0.424 | 0.424 | 0.924 |
10 | 0.596 | 0.068 | 0.689 | 0.013 | 0.09 | 0.968 | 0.018 | 0.284 | 0.848 | 0.227 |
11 | 0.859 | 0.006 | <0.001 | <0.001 | 0.034 | <0.001 | <0.001 | 0.006 | 0.001 | 0.859 |
12 | 0.859 | 0.227 | 0.924 | 0.083 | 0.596 | 0.998 | 0.227 | 0.349 | 0.779 | 0.227 |
13 | 0.018 | 0.006 | 0.025 | 0.227 | 0.001 | 0.227 | 0.025 | 0.046 | 0.001 | 0.034 |
14 | 0.779 | 0.283 | <0.001 | <0.001 | 0.507 | 0.009 | 0.001 | 0.083 | 0.034 | 0.424 |
15 | 0.001 | 0.108 | 0.002 | 0.083 | 0.507 | 0.140 | 0.283 | 0.424 | 0.596 | 0.596 |
16 | 0.596 | 0.424 | 0.507 | 0.859 | 0.859 | 0.968 | 0.859 | 0.779 | 0.779 | 0.968 |
17 | 0.924 | 0.689 | 0.003 | 0.034 | 0.424 | 0.002 | 0.034 | 0.006 | 0.002 | 0.046 |
18 | 0.349 | 0.001 | <0.001 | <0.001 | 0.108 | <0.001 | 0.034 | 0.034 | 0.507 | 0.227 |
One hundred nineteen of the 180 within-rater comparisons between the 5 trial blocks (66%) were not significantly different (p > 0.05). In these cases, raters judged imitativeness consistently across rating blocks. The remaining 61 comparisons (34%; shaded in gray), however, showed significant differences across ratings [χ²(9) = 143.32, p < 0.001], suggesting that raters changed their decision patterns for ratings of imitativeness across the five trial blocks.
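The tallies in this note can be rechecked directly from the table. The following is a minimal sketch, not the authors' analysis code: the p-values are transcribed from Table 1, entries reported as "<0.001" are treated as significant, and the names `PAIRINGS`, `TABLE`, `ALPHA`, and `is_significant` are illustrative assumptions.

```python
from itertools import combinations

# The 10 block pairings in the table's column order: (1, 2), (1, 3), ..., (4, 5).
PAIRINGS = list(combinations(range(1, 6), 2))

# p-values transcribed from Table 1, one row per rater (raters 1 to 18).
TABLE = """
0.349 0.006 0.013 0.009 0.108 0.108 0.108 0.779 0.689 0.507
0.779 0.283 0.424 0.083 0.859 0.991 0.180 0.968 0.018 0.14
0.108 <0.001 0.002 0.034 0.062 0.349 0.689 0.083 0.002 0.507
0.924 0.424 0.227 0.018 0.779 0.507 0.062 0.998 0.507 0.507
0.283 0.140 0.001 0.083 0.083 <0.001 <0.001 0.507 0.108 0.689
0.227 0.140 0.062 0.108 0.991 0.859 0.689 0.689 0.968 0.349
0.013 0.006 0.507 0.596 0.779 0.349 0.083 0.108 0.013 0.689
0.025 0.227 0.083 0.083 0.968 0.046 0.034 0.349 0.283 0.779
0.507 0.424 0.004 0.004 0.006 <0.001 <0.001 0.424 0.424 0.924
0.596 0.068 0.689 0.013 0.09 0.968 0.018 0.284 0.848 0.227
0.859 0.006 <0.001 <0.001 0.034 <0.001 <0.001 0.006 0.001 0.859
0.859 0.227 0.924 0.083 0.596 0.998 0.227 0.349 0.779 0.227
0.018 0.006 0.025 0.227 0.001 0.227 0.025 0.046 0.001 0.034
0.779 0.283 <0.001 <0.001 0.507 0.009 0.001 0.083 0.034 0.424
0.001 0.108 0.002 0.083 0.507 0.14 0.283 0.424 0.596 0.596
0.596 0.424 0.507 0.859 0.859 0.968 0.859 0.779 0.779 0.968
0.924 0.689 0.003 0.034 0.424 0.002 0.034 0.006 0.002 0.046
0.349 0.001 <0.001 <0.001 0.108 <0.001 0.034 0.034 0.507 0.227
"""

ALPHA = 0.05  # significance threshold used in the table note


def is_significant(cell, alpha=ALPHA):
    """Return True if a table cell reports p <= alpha.

    Cells such as "<0.001" are judged by their upper bound.
    """
    bound = float(cell[1:]) if cell.startswith("<") else float(cell)
    return bound <= alpha


rows = [line.split() for line in TABLE.strip().splitlines()]
# Every rater must have one p-value per block pairing.
assert all(len(row) == len(PAIRINGS) for row in rows)

total = sum(len(row) for row in rows)
n_sig = sum(is_significant(cell) for row in rows for cell in row)
n_ns = total - n_sig

print(f"not significant: {n_ns}/{total} ({100 * n_ns / total:.1f}%)")
print(f"significant:     {n_sig}/{total} ({100 * n_sig / total:.1f}%)")

# Significant comparisons per block pairing, summed over raters.
for j, (a, b) in enumerate(PAIRINGS):
    count = sum(is_significant(row[j]) for row in rows)
    print(f"blocks {a}-{b}: {count} of {len(rows)} raters")
```

The per-pairing counts printed at the end are only a descriptive summary of where the significant cells fall; they are not a reconstruction of the reported χ² test.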