Table 2. Results from likelihood and predictability analyses.
L-0, L-1, L-2, and L-Int are the average per-phrase negative log-likelihoods under the zero-order, first-order, second-order, and interpolated Markov models, respectively. P-0, P-1, P-2, and P-Int are the prediction accuracies of the same models. For each individual under each of the two evaluation paradigms, bolded values highlight the most likely model (lowest negative log-likelihood) and the model that best predicts upcoming phrases (highest accuracy).
Individual | Method | Sample size* | L-0 | L-1 | L-2 | L-Int | P-0 (%) | P-1 (%) | P-2 (%) | P-Int (%)
---|---|---|---|---|---|---|---|---|---|---
AGBk | Train-test | 5170 | 3.67 | 1.27 | 1.25 | 1.22 | 7.9 | 58.6 | 61.1 | 61.6
AGBk | LOOCV | 10032 | 3.69 | 1.23 | 1.18 | 1.16 | 5.9 | 58.7 | 63.2 | 62.4
AGO | Train-test | 2414 | 3.68 | 1.34 | 1.28 | 1.24 | 12.7 | 57.2 | 63.4 | 62.7
AGO | LOOCV | 4288 | 3.63 | 1.30 | 1.23 | 1.19 | 13.4 | 56.8 | 63.8 | 64.1
AOBu | Train-test | 3883 | 3.68 | 1.62 | 1.65 | 1.59 | 7.7 | 46.9 | 45.4 | 47.5
AOBu | LOOCV | 6634 | 3.66 | 1.56 | 1.58 | 1.52 | 6.1 | 47.9 | 49.5 | 50.1
AYO | Train-test | 721 | 3.34 | 1.44 | 1.45 | 1.42 | 17.1 | 56.9 | 55.9 | 58.1
AYO | LOOCV | 1187 | 3.50 | 1.53 | 1.61 | 1.51 | 12.9 | 56.1 | 55.3 | 57.7
BuRA | Train-test | 2698 | 3.53 | 1.89 | 2.06 | 1.88 | 6.8 | 40.4 | 41.0 | 42.6
BuRA | LOOCV | 4460 | 3.59 | 1.81 | 1.88 | 1.78 | 4.6 | 41.8 | 42.7 | 43.8
Gate | Train-test | 915 | 3.85 | 1.60 | 1.76 | 1.59 | 5.1 | 58.6 | 60.3 | 61.4
Gate | LOOCV | 1396 | 3.54 | 1.49 | 1.58 | 1.46 | 4.7 | 58.1 | 59.7 | 60.4
GRA | Train-test | 1986 | 3.82 | 1.47 | 1.52 | 1.44 | 13.0 | 53.2 | 56.0 | 56.8
GRA | LOOCV | 3562 | 3.80 | 1.38 | 1.37 | 1.35 | 10.3 | 56.5 | 60.1 | 59.3
Meadow | Train-test | 781 | 3.90 | 1.69 | 1.72 | 1.66 | 3.1 | 52.7 | 51.8 | 50.3
Meadow | LOOCV | 1290 | 3.66 | 1.37 | 1.42 | 1.35 | 7.4 | 62.4 | 60.0 | 62.6
ORA | Train-test | 1252 | 3.56 | 1.29 | 1.21 | 1.19 | 12.3 | 59.7 | 64.2 | 64.1
ORA | LOOCV | 2056 | 3.65 | 1.19 | 1.11 | 1.09 | 10.2 | 63.4 | 68.9 | 69.3
RYA | Train-test | 1578 | 3.64 | 1.43 | 1.55 | 1.43 | 7.0 | 59.5 | 61.9 | 62.2
RYA | LOOCV | 2820 | 3.64 | 1.29 | 1.35 | 1.27 | 7.4 | 64.5 | 63.7 | 65.6
Gully | Train-test | 1195 | 3.71 | 1.58 | 1.68 | 1.55 | 6.3 | 54.4 | 55.1 | 57.0
Gully | LOOCV | 1742 | 3.72 | 1.54 | 1.59 | 1.50 | 6.1 | 54.4 | 56.0 | 57.8
WABk | Train-test | 4714 | 3.85 | 1.48 | 1.53 | 1.46 | 8.2 | 52.4 | 54.6 | 54.7
WABk | LOOCV | 8302 | 3.61 | 1.35 | 1.40 | 1.33 | 13.1 | 54.1 | 56.2 | 55.4
YAW | Train-test | 1710 | 3.39 | 1.34 | 1.43 | 1.34 | 14.2 | 62.6 | 61.9 | 62.6
YAW | LOOCV | 2955 | 3.48 | 1.37 | 1.44 | 1.36 | 11.8 | 60.1 | 60.4 | 61.1
YBuA | Train-test | 1558 | 3.54 | 1.49 | 1.50 | 1.40 | 6.0 | 57.2 | 61.9 | 62.8
YBuA | LOOCV | 2760 | 3.59 | 1.53 | 1.51 | 1.43 | 5.1 | 57.2 | 61.9 | 62.2
*Sample size for the train-test paradigm was the number of phrases in an individual’s training set, while for the LOOCV paradigm, the sample size was the number of phrases in an individual’s total recording corpus minus the average number of phrases in each recording.
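
To make the reported quantities concrete, the following is a minimal Python sketch of how an average per-phrase negative log-likelihood, a prediction accuracy, and the LOOCV sample size described in the footnote could be computed for a first-order Markov model. It is an illustration only and not the authors' code. The function names (`fit_first_order`, `evaluate`, `loocv_sample_size`), the add-`alpha` smoothing, the use of the natural logarithm, and the choice to predict the single most probable next phrase are all assumptions that the table does not specify.

```python
import math
from collections import Counter, defaultdict


def fit_first_order(sequences, alphabet, alpha=1.0):
    """Estimate P(next phrase | previous phrase) from training sequences.

    Add-alpha smoothing is an assumption made here so that unseen
    transitions keep a nonzero probability.
    """
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    model = {}
    for prev in alphabet:
        total = sum(counts[prev].values()) + alpha * len(alphabet)
        model[prev] = {s: (counts[prev][s] + alpha) / total for s in alphabet}
    return model


def evaluate(model, sequences):
    """Return (average per-phrase negative log-likelihood, accuracy in %).

    Natural log is assumed. A prediction counts as correct when the most
    probable next phrase under the model equals the observed one.
    """
    nll, correct, n = 0.0, 0, 0
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            dist = model[prev]
            nll -= math.log(dist[nxt])
            correct += max(dist, key=dist.get) == nxt
            n += 1
    return nll / n, 100.0 * correct / n


def loocv_sample_size(recordings):
    """Total phrases minus the average number of phrases per recording."""
    total = sum(len(r) for r in recordings)
    return total - total / len(recordings)
```

With hypothetical toy data, `model = fit_first_order([list("ABABAB"), list("ABAB")], alphabet="AB")` followed by `evaluate(model, [list("ABA")])` gives the two metrics for one held-out sequence. Lower negative log-likelihood and higher accuracy indicate a better model, which is how the L and P columns of the table are read.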