2015 Apr 13;112(17):5348–5353. doi: 10.1073/pnas.1420946112

Fig. 4.

Empirical rank distribution of word frequencies in The Origin of Species (black), showing two power-law regimes. For the most frequent words, the distribution is approximately a power law with exponent γ ≈ 0.9. The corresponding distribution for the Φ(λ) process with λ = 0.9 (red) suggests a slight deviation from perfect nesting: in sentence formation, the sample space is strictly reducing for about 90% of consecutive word pairs. Simulation: N = 5,000 (words) and M = 10,000 restarts (sentences).
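
The simulation named in the caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes that at each step the process jumps, with probability λ, to a uniformly chosen state strictly below the current one (sample space reducing) and otherwise to any of the N states, that a run ends on reaching state 1, and that each restart begins at a uniformly chosen state. The function names `simulate_noisy_ssr` and `loglog_slope` are hypothetical. Under these assumptions the visit counts should fall off with the state index roughly as a power law with exponent close to λ.

```python
import math
import random


def simulate_noisy_ssr(n_states, n_restarts, lam, seed=0):
    """Count visits to each state for a noisy sample-space-reducing process.

    Assumptions (not taken from the source): uniform restarts, a run stops
    when state 1 is reached, and noise jumps are uniform over all states.
    """
    rng = random.Random(seed)
    visits = [0] * (n_states + 1)  # index 0 is unused; states are 1..n_states
    for _ in range(n_restarts):
        state = rng.randint(1, n_states)  # assumed: uniform restart
        visits[state] += 1
        while state > 1:
            if rng.random() < lam:
                # Reducing step: only states strictly below the current one.
                state = rng.randint(1, state - 1)
            else:
                # Noise step: any state, ignoring the nesting constraint.
                state = rng.randint(1, n_states)
            visits[state] += 1
    return visits


def loglog_slope(visits):
    """Least-squares slope of log(count) against log(state index)."""
    pts = [(math.log(i), math.log(c))
           for i, c in enumerate(visits) if i >= 1 and c > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return sxy / sxx
```

Calling `simulate_noisy_ssr(5000, 10000, 0.9)` mirrors the caption's parameters, and `loglog_slope` applied to the result gives a rough estimate of the negative exponent. Because the visit counts decrease with the state index on average, the state index serves here as a stand-in for rank.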