Table 1.
Protein ID | Atom count | Original file size (KB) | Rounded coordinates text size (KB) | Rounded coordinates binary size (KB) | gzip size (KB) | gzip CR | PIC size (KB) | PIC CR | RMSD | Compression time (min:sec) | Images used | Image space used (%) | Decompression time (min:sec) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2ja9 | 1458 | 163.3 | 24.1 | 6.6 | 6.3 | 3.834 | 10.0 | 2.412 | 0.031 | 0:0.1 | 1 | [0.9] | 0:0.4 |
2jan | 12591 | 1101.2 | 206.1 | 56.7 | 54.1 | 3.813 | 61.2 | 3.368 | 0.047 | 0:1.3 | 1 | [2.7] | 0:2.1 |
2jbp | 27367 | 2397.4 | 447.8 | 133.4 | 130.2 | 3.439 | 108.8 | 4.117 | 0.043 | 0:4.1 | 1 | [11.1] | 0:5.0 |
2ja8 | 32000 | 2831.2 | 507.6 | 144.0 | 139.6 | 3.637 | 138.0 | 3.678 | 0.043 | 0:5.6 | 1 | [6.5] | 0:7.0 |
2ign | 41758 | 3579.2 | 666.7 | 187.9 | 180.8 | 3.688 | 147.3 | 4.526 | 0.069 | 0:9.0 | 1 | [9.5] | 0:11.4 |
2jd8 | 50351 | 4457.6 | 828.1 | 226.6 | 219.7 | 3.769 | 196.8 | 4.207 | 0.056 | 0:12.8 | 1 | [7.7] | 0:15.9 |
2ja7 | 63924 | 5605.5 | 1077.0 | 287.7 | 278.6 | 3.866 | 258.8 | 4.161 | 0.055 | 0:19.6 | 1 | [10.2] | 0:24.7 |
2fug | 73916 | 6386.9 | 1180.7 | 360.3 | 347.5 | 3.398 | 283.3 | 4.168 | 0.060 | 0:26.2 | 1 | [10.7] | 0:33.3 |
2b9v | 80710 | 6818.4 | 1279.8 | 393.5 | 379.4 | 3.373 | 289.0 | 4.428 | 0.073 | 0:32.2 | 1 | [10.3] | 0:39.5 |
2j28 | 95358 | 8152.3 | 1526.2 | 429.1 | 412.2 | 3.702 | 346.6 | 4.403 | 0.055 | 0:47.0 | 1 | [13.7] | 1:0.2 |
6hif | 118753 | 12726.2 | 2105.2 | 534.4 | 516.2 | 4.078 | 372.2 | 5.656 | 0.062 | 1:30.5 | 2 | [34.0, 0.1] | 1:48.6 |
3j7q | 140540 | 16027.2 | 2529.7 | 737.8 | 707.6 | 3.575 | 475.6 | 5.318 | 0.058 | 2:28.2 | 1 | [20.3] | 2:44.4 |
3j9m | 158384 | 17995.2 | 2845.4 | 772.1 | 765.8 | 3.716 | 525.7 | 5.413 | 0.069 | 3:28.8 | 1 | [21.7] | 3:55.9 |
6gaw | 178372 | 20825.4 | 3179.9 | 869.6 | 862.1 | 3.688 | 587.6 | 5.411 | 0.071 | 4:58.1 | 1 | [23.5] | 5:39.1 |
5t2a | 200172 | 22787.6 | 3253.9 | 900.8 | 872.4 | 3.73 | 651.7 | 4.993 | 0.068 | 7:8.2 | 2 | [31.1, 1.7] | 8:59.1 |
4ug0 | 218776 | 24906.9 | 3841.4 | 1066.5 | 1056.7 | 3.635 | 707.3 | 5.431 | 0.069 | 8:34.2 | 2 | [33.8, 1.7] | 9:25.5 |
4v60 | 241956 | 24377.8 | 4207.8 | 1179.5 | 1167.2 | 3.605 | 730.2 | 5.762 | 0.120 | 9:50.8 | 2 | [45.6, 2.1] | 13:48.9 |
4wro | 260090 | 35661.1 | 4363.1 | 1267.9 | 1246.2 | 3.501 | 848.8 | 5.14 | 0.086 | 13:54.0 | 1 | [29.6] | 16:6.9 |
6fxc | 281510 | 31329.0 | 5067.1 | 1477.9 | 1424.2 | 3.558 | 917.7 | 5.522 | 0.100 | 15:52.9 | 2 | [34.6, 1.0] | 17:11.6 |
4wq1 | 299951 | 40130.9 | 5042.1 | 1462.3 | 1438.0 | 3.506 | 968.8 | 5.204 | 0.087 | 19:59.6 | 2 | [34.7, 0.2] | 22:39.0 |
Results of the PIC compression algorithm. Rounded Coordinates Text Size and Binary Size are the sizes of the text and binary files, respectively, that contain only the Cartesian coordinates found in the original file, rounded to one decimal place. All sizes are in kilobytes (1 KB = 1000 bytes), not kibibytes. The binary file (which uses a variable-length encoding) is then gzipped. The gzip and PIC compression ratios (CR) are the ratios of the Rounded Coordinates Text Size to the size of the gzip file and to the size of the PNG image output(s) from the PIC compressor, respectively. Bolded values are the best of gzip and PIC. Compression and decompression times are for the PIC algorithm; our code is unoptimized because the focus is on compression ratios, and the times are included only for completeness. (De)compression with gzip takes negligible time for files of this size. RMSD values measure the lossiness of PIC compression. Image Space Used gives the proportion of the image space that was used to encode the protein coordinate data, or part thereof, in each image constructed by the PIC compressor; for large proteins, more than one image is needed to represent all the atoms.
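
As an illustration of how the CR columns are defined, the following minimal Python sketch recomputes both ratios from the sizes listed in the table and reports which method gives the higher ratio. The function names `compression_ratio` and `best_method` are hypothetical and are not taken from the PIC code. Because the tabulated sizes are rounded to one decimal place, ratios recomputed this way may differ slightly from the tabulated CR values.

```python
def compression_ratio(text_size_kb: float, compressed_size_kb: float) -> float:
    """Ratio of the rounded-coordinates text size to the compressed size."""
    if compressed_size_kb <= 0:
        raise ValueError("compressed size must be positive")
    return text_size_kb / compressed_size_kb


def best_method(text_size_kb: float, gzip_size_kb: float, pic_size_kb: float):
    """Return (name of the method with the higher CR, gzip CR, PIC CR)."""
    gzip_cr = compression_ratio(text_size_kb, gzip_size_kb)
    pic_cr = compression_ratio(text_size_kb, pic_size_kb)
    winner = "gzip" if gzip_cr >= pic_cr else "PIC"
    return winner, gzip_cr, pic_cr


# Rows copied from Table 1: (protein ID, text size, gzip size, PIC size), in KB.
rows = [
    ("2ja9", 24.1, 6.3, 10.0),
    ("2jbp", 447.8, 130.2, 108.8),
]

for protein_id, text_kb, gzip_kb, pic_kb in rows:
    winner, gzip_cr, pic_cr = best_method(text_kb, gzip_kb, pic_kb)
    print(f"{protein_id}: gzip CR={gzip_cr:.2f}, PIC CR={pic_cr:.2f}, best={winner}")
```

For these two rows, gzip gives the higher ratio for the smallest protein (2ja9), while PIC gives the higher ratio for 2jbp.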