PLoS One. 2015 Jul 9;10(7):e0129031. doi: 10.1371/journal.pone.0129031

Table 2. Power-law fitting results for words and lemmas, denoted respectively by subindices w and l.

$V$ is the number of types (vocabulary size), $n_m$ is the maximum frequency of the distribution, $N_a$ is the number of types in the power-law tail, i.e., with $n \ge a$, where $a$ is the minimum value for which the power-law fit holds, and $\gamma$ and $\sigma$ are the power-law exponent and its standard deviation, respectively. $\sigma_d$ is the standard deviation of $\gamma_l - \gamma_w$ assuming independence, $\sigma_d = \sqrt{\sigma_w^2 + \sigma_l^2}$; twice this value, $2\sigma_d$, is also given. The last column provides $\ell_1$, the number of lemmas associated with only one word form. Note that the lemma exponent is very close to the one found in Ref. [29] for the tail of a double power-law fitting, except for Moby-Dick and Ulysses.

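| Title | $V_w$ | $n_{mw}$ | $N_{aw}$ | $a_w$ | $\gamma_w \pm \sigma_w$ | $V_l$ | $n_{ml}$ | $N_{al}$ | $a_l$ | $\gamma_l \pm \sigma_l$ | $2\sigma_d$ | $\ell_1$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Clarissa | 20492 | 38632 | 1514 | 51 | 1.83 ± 0.02 | 9041 | 41679 | 838 | 101 | 1.83 ± 0.03 | 0.07 | 5750 |
| Moby-Dick | 18516 | 14438 | 2658 | 8 | 1.97 ± 0.02 | 9141 | 14438 | 1548 | 13 | 1.90 ± 0.02 | 0.06 | 6157 |
| Ulysses | 29450 | 14934 | 4377 | 6 | 1.95 ± 0.01 | 12469 | 14934 | 1024 | 26 | 1.97 ± 0.03 | 0.07 | 8670 |
| Don Quijote | 21180 | 20704 | 939 | 40 | 1.93 ± 0.03 | 7432 | 31521 | 936 | 32 | 1.83 ± 0.03 | 0.08 | 3812 |
| La Regenta | 21871 | 19596 | 1196 | 26 | 2.01 ± 0.03 | 9900 | 32300 | 993 | 32 | 2.00 ± 0.03 | 0.08 | 5308 |
| Artamène | 25161 | 88490 | 936 | 200 | 1.86 ± 0.03 | 5008 | 119016 | 641 | 200 | 1.79 ± 0.03 | 0.08 | 2178 |
| Bragelonne | 25775 | 26848 | 3173 | 16 | 1.84 ± 0.02 | 10744 | 45577 | 1382 | 40 | 1.84 ± 0.02 | 0.06 | 5391 |
| Seitsemän | 22035 | 4247 | 22035 | 1 | 2.13 ± 0.01 | 7658 | 4247 | 474 | 26 | 2.13 ± 0.05 | 0.10 | 4246 |
| Kevät ja | 25071 | 5042 | 8660 | 2 | 2.05 ± 0.01 | 8898 | 6886 | 699 | 20 | 1.96 ± 0.04 | 0.07 | 5060 |
| Vanhempieni | 35931 | 5254 | 6523 | 3 | 2.09 ± 0.01 | 13510 | 7526 | 571 | 32 | 2.05 ± 0.04 | 0.09 | 7837 |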
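
The caption's definition of the combined standard deviation can be checked with a short calculation. The following is a minimal sketch in Python, not the authors' code: the function names `sigma_d` and `exponents_compatible` are hypothetical, and the rule "the word and lemma exponents agree if their difference is at most twice the combined standard deviation" is an assumed reading of why that column is tabulated. The input values are copied from four rows of the table. Because the tabulated exponents and standard deviations are already rounded to two decimals, recomputed values may differ from the printed column in the last digit.

```python
import math

# (title, gamma_w, sigma_w, gamma_l, sigma_l), copied from Table 2.
ROWS = [
    ("Clarissa", 1.83, 0.02, 1.83, 0.03),
    ("Moby-Dick", 1.97, 0.02, 1.90, 0.02),
    ("Don Quijote", 1.93, 0.03, 1.83, 0.03),
    ("Seitsemän", 2.13, 0.01, 2.13, 0.05),
]


def sigma_d(sigma_w: float, sigma_l: float) -> float:
    """Standard deviation of gamma_l - gamma_w, assuming independent errors."""
    return math.hypot(sigma_w, sigma_l)  # sqrt(sigma_w**2 + sigma_l**2)


def exponents_compatible(gamma_w: float, sigma_w: float,
                         gamma_l: float, sigma_l: float) -> bool:
    """Assumed rule: exponents agree if |gamma_l - gamma_w| <= 2 * sigma_d."""
    return abs(gamma_l - gamma_w) <= 2 * sigma_d(sigma_w, sigma_l)


for title, gw, sw, gl, sl in ROWS:
    two_sd = 2 * sigma_d(sw, sl)
    verdict = "compatible" if exponents_compatible(gw, sw, gl, sl) else "different"
    print(f"{title}: |diff| = {abs(gl - gw):.2f}, 2*sigma_d = {two_sd:.2f} -> {verdict}")
```

Under this assumed rule, a row counts as showing a real difference between the word and lemma exponents only when the gap exceeds the last-but-one column of the table.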