2023 Jan 24;7:1–30. doi: 10.1162/opmi_a_00070

Table 1. Summary of corpora measures across languages for Study 1.

| Language | Child’s Age Range (years;months) | No. Corpora | No. Tokens | No. Types | Word Frequency Range, raw (per million) | α | β | Pearson’s r |
|---|---|---|---|---|---|---|---|---|
| British English | 0;2–3;6 | 12 | 6311249 | 27476 | 1–303005 (0.16–48010.31) | 1.57 | 19.48 | 0.97 |
| North-American English | 0;5–3;6 | 34 | 4876774 | 24573 | 1–230355 (0.21–47235.12) | 1.52 | 18.34 | 0.965 |
| German | 0;5–3;6 | 7 | 2168002 | 37018 | 1–71238 (0.46–32858.83) | 1.42 | 12.69 | 0.998 |
| French | 0;11–3;6 | 8 | 1540284 | 19327 | 1–59852 (0.65–38857.77) | 1.53 | 17.98 | 0.99 |
| Dutch | 1;5–3;6 | 5 | 1036586 | 18717 | 1–41646 (0.96–40176.12) | 1.48 | 12.63 | 0.998 |
| Japanese | 0;6–3;6 | 6 | 941006 | 25648 | 1–45886 (1.06–48762.71) | 1.30 | 7.12 | 0.993 |
| Polish* | 0;10–6;11 | 8 | 794183.7 | 44425 | 1–32172.19 (1.26–40509.76) | 1.16 | 4.04 | 0.997 |
| Spanish | 0;11–3;6 | 12 | 353104 | 10057 | 1–13437 (2.83–38053.94) | 1.39 | 9.96 | 0.994 |
| Swedish | 1;0–3;6 | 2 | 341280 | 9466 | 1–16924 (2.93–49589.78) | 1.40 | 7.58 | 0.998 |
| Portuguese | 1;5–3;6 | 2 | 309296 | 7562 | 1–21416 (3.23–69241.12) | 1.37 | 5.44 | 0.981 |
| Hebrew | 0;8–3;6 | 6 | 300766 | 13801 | 1–16048 (3.32–53357.09) | 1.19 | 3.69 | 0.996 |
| Norwegian | 1;1–2;9 | 2 | 183658 | 8306 | 1–9135 (5.44–49739.19) | 1.32 | 6.92 | 0.996 |
| Estonian | 0;9–3;6 | 5 | 167666 | 10057 | 1–10344 (5.96–61694.08) | 1.21 | 6.02 | 0.956 |
| Danish | 0;11–3;5 | 1 | 155826 | 4102 | 1–8421 (6.42–54041.05) | 1.50 | 7.71 | 0.997 |
| Mandarin | 1;8–2;3 | 2 | 150852 | 7095 | 1–6305 (6.63–41795.93) | 1.42 | 11.94 | 0.989 |
| Catalan | 1;1–4;2 | 4 | 132410 | 6416 | 1–8051 (7.55–60803.56) | 1.30 | 6.27 | 0.985 |
| Summary | | | | | | Mean = 1.38; SD = 0.13 | Mean = 9.86; SD = 5.13 | Mean = 0.988; SD = 0.013 |
* For Polish, the data are taken from a list summarizing words and their relative frequencies, hence the one-digit precision.
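
The per-million values and the Summary row of Table 1 can be reproduced from the other columns. The following is a minimal sketch, not the authors' code. It assumes that the per-million figures are the raw counts divided by the token total and scaled by one million, and that the reported SD is the sample standard deviation. The names `per_million` and `ALPHAS` are hypothetical and introduced here for illustration only.

```python
from statistics import mean, stdev


def per_million(count: float, total_tokens: float) -> float:
    """Normalize a raw word count to occurrences per million tokens."""
    return count / total_tokens * 1_000_000


# α values copied from Table 1, in row order (British English ... Catalan).
ALPHAS = [1.57, 1.52, 1.42, 1.53, 1.48, 1.30, 1.16, 1.39,
          1.40, 1.37, 1.19, 1.32, 1.21, 1.50, 1.42, 1.30]

# British English row: raw frequency range 1-303005 over 6311249 tokens.
british_tokens = 6311249
lowest = per_million(1, british_tokens)
highest = per_million(303005, british_tokens)
print(round(lowest, 2), round(highest, 2))

# Summary row for α: mean and sample standard deviation (n - 1 denominator).
alpha_mean = mean(ALPHAS)
alpha_sd = stdev(ALPHAS)
print(round(alpha_mean, 2), round(alpha_sd, 2))
```

Under these assumptions the output agrees with the table: the British English range comes out at about 0.16 to 48010.31 per million, and the α column gives a mean of 1.38 and an SD of 0.13. The population standard deviation (`statistics.pstdev`) would round to 0.12 instead, which suggests that the reported SD uses the sample formula.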