Table 4.
Language (corpus) | Utterances | Words (total) | Words (analyzed) | Morphemes (total) | Morphemes (analyzed) |
---|---|---|---|---|---|
Chintang (Stoll et al., unpublished) | 396,412 | 987,120 | 473,918 | 1,594,829 | 814,076 |
Inuktitut (Allen, unpublished) | 46,680 | 73,255 | 23,164 | 37,781 | 8,673 |
Japanese (Miyata, 2012)a | 271,868 | 821,106 | 514,344 | 666,748 | 376,934 |
Russian (Stoll & Meyer, unpublished) | 828,041 | 2,033,755 | 1,316,234 | NA | NA |
Sesotho (Demuth, 2015) | 69,530 | 237,112 | 83,514 | 329,347 | 112,630 |
Turkish (Küntay et al., unpublished) | 400,836 | 1,136,332 | 938,955 | 300,907 | 272,459 |
Yucatec (Pfeiler, unpublished) | 91,825 | 257,496 | 89,219 | 198,761 | 84,928 |