PLoS One. 2010 Dec 2;5(12):e14139. doi: 10.1371/journal.pone.0014139

Figure 4. Zipf's law and Heaps' law in four example systems.

(a) Words in Dante Alighieri's “La Divina Commedia” in Italian [44]: the frequency of each word as a function of its rank, and the number of distinct words. (b) Keywords of articles published in the Proceedings of the National Academy of Sciences of the United States of America (PNAS) [30]: the frequency of each keyword as a function of its rank, and the number of distinct keywords. (c) Confirmed cases of the novel influenza A (H1N1) virus [45]: the number of confirmed cases in each country as a function of the country's rank, and the number of infected countries as a function of the total number of confirmed cases worldwide. (d) PNAS articles cited at least once from 1915 to 2009: the number of citations of each article as a function of its rank, and the number of distinct articles as a function of the total number of citations to PNAS. In (c), the data set is small, so the effective number has only two digits. The fits in (c1) and (c2) cover only the region marked in blue. In (d1), deviations from a power law are observed in the head and tail, so the fit covers only the blue region. The Zipf (power-law) exponents and the Heaps' exponents are obtained by maximum likelihood estimation [3], [43] and the least-squares method, respectively. Statistics of these data sets can be found in Table 1 (data sets (a), (b), (c) and (d) are numbers 9, 10, 34 and 35 in Table 1), with detailed descriptions in Materials and Methods.
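The two estimation steps named in the caption can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the continuous-approximation maximum likelihood estimator for a power-law exponent and an ordinary least-squares line in log-log coordinates, which may differ in detail from the procedures of [3], [43]. The function names, the symbol `beta` for the exponent of the frequency distribution, and the conversion `1 / (beta - 1)` to a rank (Zipf) exponent are illustrative assumptions, not taken from the caption.

```python
import numpy as np


def powerlaw_mle_exponent(values, x_min=1.0):
    """Estimate beta in p(x) ~ x^(-beta) for x >= x_min.

    Assumption: continuous-approximation MLE,
    beta = 1 + n / sum(ln(x_i / x_min)).
    Undefined if every retained value equals x_min.
    """
    x = np.asarray(values, dtype=float)
    x = x[x >= x_min]  # keep only the fitted range
    return 1.0 + x.size / np.sum(np.log(x / x_min))


def zipf_exponent_from_beta(beta):
    """Convert a frequency-distribution exponent to a rank exponent.

    Assumption: the commonly used relation alpha = 1 / (beta - 1).
    """
    return 1.0 / (beta - 1.0)


def heaps_exponent(tokens):
    """Least-squares slope of log(distinct items) against log(total items)."""
    seen = set()
    distinct = []
    for tok in tokens:
        seen.add(tok)
        distinct.append(len(seen))  # distinct items after each new token
    t = np.arange(1, len(distinct) + 1)
    slope, _intercept = np.polyfit(np.log(t), np.log(distinct), 1)
    return slope
```

Restricting a fit to a sub-range, as the caption describes for (c1), (c2) and (d1), would correspond here to raising `x_min` or slicing the arrays before fitting. How the blue regions were actually chosen is not stated in the caption.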