PLoS One. 2010 Dec 2;5(12):e14139. doi: 10.1371/journal.pone.0014139

Table 1. Empirical statistics and analysis results of real data sets.

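No. | Total elements | Distinct elements | Zipf's exponent (maximum likelihood) | Heaps' exponent, asymptotic (Eq. 7) | Heaps' exponent, numerical (Eq. 6) | Heaps' exponent, empirical (least squares)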
1 206779 18217 1.323 0.756 0.725 0.738
2 20516 5671 0.969 1 0.858 0.859
3 109854 13906 1.063 0.941 0.845 0.817
4 449205 20220 1.464 0.683 0.667 0.679
5 68458 9191 1.095 0.913 0.823 0.810
6 81037 13254 1.025 0.976 0.859 0.832
7 63742 16622 1.057 0.946 0.840 0.852
8 138985 15550 1.188 0.842 0.787 0.765
9 101940 12667 1.117 0.895 0.818 0.799
10 504610 116800 0.893 1 0.936 0.863
11 53214 34194 0.540 1 0.983 0.946
12 310853 69185 0.939 1 0.913 0.871
13 30852 17562 0.595 1 0.972 0.939
14 2761 2328 0.397 1 0.964 0.978
15 58300 22599 0.786 1 0.941 0.914
16 20660 8155 0.790 1 0.921 0.890
17 226090 69251 0.692 1 0.977 0.894
18 176291 62567 0.572 1 0.989 0.920
19 44735 19933 0.685 1 0.961 0.915
20 1924 1323 0.463 1 0.946 0.939
21 5093 2985 0.593 1 0.941 0.920
22 3490 2442 0.500 1 0.952 0.950
23 1403 787 0.524 1 0.926 0.931
24 7469 4142 0.654 1 0.936 0.925
25 7710 3857 0.658 1 0.935 0.930
26 3232 2658 0.416 1 0.964 0.976
27 13165 7743 0.612 1 0.959 0.936
28 3749 2353 0.568 1 0.943 0.940
29 30092 11002 0.815 1 0.924 0.891
30 21894 8666 0.776 1 0.930 0.900
31 7627 3841 0.685 1 0.933 0.930
32 4185 2242 0.675 1 0.921 0.929
33 23822 10753 0.648 1 0.959 0.917
34 8829 40 3.0 0.33 0.34 0.35
35 237982 56961 0.462 1 0.993 0.929

The columns, from left to right after the data set number, are: the total number of elements; the total number of distinct elements; the Zipf's exponent obtained by maximum likelihood estimation [3], [43]; the asymptotic solution of the Heaps' exponent given by Eq. 7; the numerical value of the Heaps' exponent obtained from Eq. 6, as shown in Fig. 3; and the empirical Heaps' exponent obtained by the least squares method. Values for the 34th data set are reported to only two significant digits because this data set is very small. In every data set except the 4th, the numerical result based on Eq. 6 is closer to the empirical exponent than the asymptotic solution of Eq. 7. Detailed descriptions of the data sets can be found in Materials and Methods.
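The comparison described in the note can be checked directly from the table. The following is a minimal sketch, not the authors' code. It assumes that Eq. 7 has the piecewise form "Heaps' exponent = 1/(Zipf's exponent) when the Zipf's exponent exceeds 1, and 1 otherwise", which reproduces the asymptotic column for the rows used here. The function names `asymptotic_heaps_exponent` and `closer_estimate` are hypothetical, and only a handful of rows are copied from Table 1 for illustration.

```python
# Selected rows copied from Table 1:
# (data set no., Zipf's exponent, asymptotic, numerical, empirical Heaps' exponent)
ROWS = [
    (1, 1.323, 0.756, 0.725, 0.738),
    (2, 0.969, 1.0, 0.858, 0.859),
    (4, 1.464, 0.683, 0.667, 0.679),
    (8, 1.188, 0.842, 0.787, 0.765),
    (11, 0.540, 1.0, 0.983, 0.946),
]


def asymptotic_heaps_exponent(zipf_exponent):
    """Assumed form of Eq. 7: 1/alpha for alpha > 1, otherwise 1."""
    return 1.0 / zipf_exponent if zipf_exponent > 1.0 else 1.0


def closer_estimate(asymptotic, numerical, empirical):
    """Return which estimate has the smaller absolute error against the empirical fit."""
    err_asymptotic = abs(asymptotic - empirical)
    err_numerical = abs(numerical - empirical)
    return "numerical" if err_numerical < err_asymptotic else "asymptotic"


for no, alpha, asym, num, emp in ROWS:
    # The assumed formula should match the tabulated asymptotic value to 3 decimals.
    assert abs(asymptotic_heaps_exponent(alpha) - asym) < 5e-4
    print(no, closer_estimate(asym, num, emp))
```

Among these rows, only data set 4 has an asymptotic value closer to the empirical exponent than the numerical one, in line with the exception stated in the note.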