Table 3.
The dictionary and parse sizes for several files from the Pizza and Chili repetitive corpus, with three settings of the parameters w and p
| File | Size | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | | Dict. | Parse | % | Dict. | Parse | % | Dict. | Parse | % |
| cere | 440 | 61 | 77 | 31 | 43 | 159 | 46 | 89 | 17 | 24 |
| cere_no_Ns | 409 | 33 | 77 | 27 | 43 | 33 | 18 | 60 | 17 | 19 |
| dna.001.1 | 100 | 8 | 20 | 27 | 13 | 9 | 21 | 21 | 4 | 25 |
| einstein.en.txt | 446 | 2 | 87 | 20 | 3 | 39 | 9 | 4 | 17 | 5 |
| influenza | 148 | 16 | 28 | 30 | 32 | 12 | 29 | 49 | 6 | 37 |
| kernel | 247 | 14 | 52 | 26 | 14 | 20 | 13 | 15 | 10 | 10 |
| world_leaders | 45 | 5 | 5 | 21 | 8 | 2 | 21 | 11 | 1 | 26 |
| world_leaders_no_dots | 23 | 4 | 5 | 34 | 6 | 2 | 31 | 7 | 1 | 33 |
All sizes are reported in megabytes; each percentage is the sum of the dictionary and parse sizes for that setting, divided by the size of the uncompressed file
For each file, the sizes are in italics for the settings with the best overall compression
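
As a quick check of how the percentage columns are derived, the following minimal sketch recomputes them for the cere row from the rounded megabyte values in the table. The function name `combined_percentage` and the variable names are hypothetical, introduced only for this illustration; the rounding to whole percentage points is an assumption based on how the table values appear.

```python
def combined_percentage(dict_mb, parse_mb, file_mb):
    """Return (dictionary + parse) size as a whole-number percentage of the file size."""
    # Assumption: the table rounds to the nearest whole percentage point.
    return round(100 * (dict_mb + parse_mb) / file_mb)

# The "cere" row: uncompressed size in MB, then (dictionary, parse) sizes in MB
# for the three parameter settings, in the order they appear in the table.
cere_size = 440
cere_settings = [(61, 77), (43, 159), (89, 17)]

cere_percentages = [combined_percentage(d, p, cere_size) for d, p in cere_settings]
print(cere_percentages)  # [31, 46, 24]
```

Because the table lists sizes rounded to whole megabytes, recomputing from those values can differ from the printed percentage by a point for small files; for world_leaders, (5 + 5) / 45 comes to about 22%, whereas the table reports 21%.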