Table 3.
Data set | #Bits | #Fingerprints (in millions) | FPB size (in MiB) | AREN size (in MiB) | FPID size (in MiB) | HASH size (in MiB) |
---|---|---|---|---|---|---|
chemfp benchmark | 166 | 1.00 | 54.0 | 22.9 | 15.9 | 15.3 |
chemfp benchmark | 881 | 1.00 | 134 | 107 | 11.6 | 15.3 |
chemfp benchmark | 1021 | 1.00 | 153 | 122 | 15.9 | 15.3 |
chemfp benchmark | 2048 | 1.00 | 275 | 244 | 15.9 | 15.3 |
ChEMBL 24 | 2048 | 1.82 | 501 | 444 | 29.9 | 27.8 |
PubChem | 881 | 96.9 | 13,000 | 10,300 | 1,130 | 1,480 |
The AREN chunk contains the fingerprints, the FPID chunk contains the record identifiers indexed by position, and the HASH chunk contains a hash table mapping each identifier to its index.
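The AREN column can be roughly cross-checked from the other columns. The sketch below is illustrative only and assumes something the table does not state: that each fingerprint occupies a whole number of bytes, rounded up to a multiple of 8. The function name `aren_payload_mib`, the `alignment` parameter, and the toy identifiers are hypothetical. Under that assumption the estimates agree with the tabulated AREN sizes to three significant figures. The last lines mimic, with plain Python containers, the relationship between the FPID chunk (position to identifier) and the HASH chunk (identifier to position); they do not reproduce the on-disk layout.

```python
def aren_payload_mib(num_bits, num_fingerprints, alignment=8):
    """Estimate the size of the fingerprint payload in MiB.

    Assumption (not stated in the table): each fingerprint is stored in
    a whole number of bytes, rounded up to a multiple of `alignment`.
    """
    num_bytes = (num_bits + 7) // 8                    # bits -> whole bytes
    padded = -(-num_bytes // alignment) * alignment    # round up to the alignment
    return padded * num_fingerprints / 2**20           # bytes -> MiB


# Compare the estimate with the AREN column for the 1.00 million record rows.
for num_bits, reported in [(166, 22.9), (881, 107), (1021, 122), (2048, 244)]:
    estimate = aren_payload_mib(num_bits, 1_000_000)
    print(f"{num_bits:5d} bits: estimated {estimate:6.1f} MiB, reported {reported}")

# Toy in-memory analogue of the FPID and HASH chunks (hypothetical identifiers).
ids = ["ID_A", "ID_B", "ID_C"]                           # FPID: position -> identifier
id_to_index = {name: i for i, name in enumerate(ids)}    # HASH: identifier -> position
```

With both structures available, a record can be reached either by position (`ids[1]`) or by identifier (`id_to_index["ID_B"]`), which is the role the caption gives to the FPID and HASH chunks.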