Extended Data Table 2 |.
Dataset 1 | Dataset 2 | Statistic | Corrected P-value |
---|---|---|---|
1024-word-General MCD | 50-phrase-AAC chance MCD | 1.10e+03 | 1.68e–48 |
50-phrase-AAC WER | 1024-word-General WER | 3.94e+03 | 3.09e–33 |
50-phrase-AAC CER | 1024-word-General CER | 3.96e+03 | 4.56e–33 |
1024-word-General MCD | 1024-word-General chance MCD | 7.70e+01 | 7.30e–33 |
1024-word-General CER | 1024-word-General chance CER | 1.96e+02 | 4.01e–32 |
1024-word-General WER | 1024-word-General chance WER | 2.55e+01 | 1.10e–29 |
50-phrase-AAC MCD | 1024-word-General MCD | 4.53e+03 | 6.59e–28 |
50-phrase-AAC WER | 50-phrase-AAC chance WER | 0.00e+00 | 1.35e–25 |
50-phrase-AAC MCD | 50-phrase-AAC chance MCD | 1.00e+00 | 2.58e–25 |
50-phrase-AAC CER | 50-phrase-AAC chance CER | 3.00e+00 | 2.58e–25 |
529-phrase-AAC MCD | 529-phrase-AAC chance MCD | 2.70e+01 | 3.56e–25 |
529-phrase-AAC CER | 529-phrase-AAC chance CER | 4.10e+01 | 6.13e–25 |
529-phrase-AAC WER | 529-phrase-AAC chance WER | 6.20e+01 | 7.52e–24 |
529-phrase-AAC WER | 1024-word-General WER | 8.34e+03 | 3.12e–12 |
529-phrase-AAC CER | 1024-word-General CER | 9.06e+03 | 6.45e–10 |
50-phrase-AAC MCD | 529-phrase-AAC MCD | 7.12e+03 | 1.56e–07 |
50-phrase-AAC CER | 529-phrase-AAC CER | 7.79e+03 | 2.43e–07 |
50-phrase-AAC WER | 529-phrase-AAC WER | 7.83e+03 | 2.43e–07 |
529-phrase-AAC MCD | 1024-word-General MCD | 1.04e+04 | 7.08e–07 |
Across-dataset comparisons use two-sided Mann-Whitney U-tests, and within-dataset comparisons use two-sided Wilcoxon signed-rank tests. All tests use 19-way Holm-Bonferroni correction. We use n = 15 pseudo-blocks for the AAC sentence sets and n = 20 pseudo-blocks for the 1024-word-General sentence set.
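For readers unfamiliar with the correction named above, the following is a minimal sketch of the standard Holm-Bonferroni step-down adjustment. It is not the authors' code: the function name `holm_bonferroni` and the example p-values are hypothetical, and the raw p-values are assumed to come from the Mann-Whitney or Wilcoxon tests described in the note.

```python
def holm_bonferroni(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order.

    Illustrative sketch only; `p_values` holds raw (uncorrected) p-values.
    """
    m = len(p_values)
    # Indices of the raw p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is scaled by m, the next by m - 1, and so on;
        # the result is capped at 1.
        scaled = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease in sorted order.
        running_max = max(running_max, scaled)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from four comparisons (not taken from the table).
raw = [0.01, 0.04, 0.03, 0.005]
print(holm_bonferroni(raw))
```

With 19 comparisons, as in this table, the smallest raw p-value would be multiplied by 19, the next by 18, and so on. The monotonicity step can produce identical adjusted values for neighbouring comparisons, which would be consistent with the repeated entries in the table, although the source does not state the exact implementation used.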