Skip to main content
. 2022 Aug 30;13:5024. doi: 10.1038/s41467-022-32012-w

Fig. 5. Models applied to data from phonology textbooks.

Fig. 5

a Measuring % lexicon solved, which is the percentage of stems that match gold ground-truth annotations. Problems marked with an asterisk are allophony problems and are typically easier. For allophony problems, we count % solved as 0% when no rule explaining an alternation is found and 100% otherwise. For allophony problems, full/CEGIS models are equivalent, because we batch the full problem at once (Supplementary Methods 3). b Convergence rate of models evaluated on the 54 non-allophony problems. All models are run with a 24-h timeout on 40 cores. Only our full model can best tap this parallelism (Supplementary Methods 3.3). Our models typically converge within a half-day. SyPhon36 solves fewer problems but, of those it does solve, it takes minutes rather than hours. Curves show means over problems. Error bars show the standard error of the mean. c Rule accuracy was assessed by manually grading 15 random problems. Both precision and recall correlate with lexicon accuracy, and all three metrics are higher for easier problems requiring fewer phonological rules (red, easier; blue, harder). Requiring an exact match with a ground-truth stem occasionally allows solving some rules despite not matching any stems, as in the outlier problem marked with **. Pearson’s r confidence intervals (CI) were calculated with two-tailed test. Points were randomly jittered ±0.05 for visibility. Source data are provided as a Source data file.