2019 Dec 5;11:76. doi: 10.1186/s13321-019-0398-8

Table 3.

Fingerprint data set sizes in FPB format and largest chunk sizes

| Data set | #Bits | #Fingerprints (millions) | FPB size (MiB) | AREN size (MiB) | FPID size (MiB) | HASH size (MiB) |
|---|---|---|---|---|---|---|
| chemfp benchmark | 166 | 1.00 | 54.0 | 22.9 | 15.9 | 15.3 |
| chemfp benchmark | 881 | 1.00 | 134 | 107 | 11.6 | 15.3 |
| chemfp benchmark | 1021 | 1.00 | 153 | 122 | 15.9 | 15.3 |
| chemfp benchmark | 2048 | 1.00 | 275 | 244 | 15.9 | 15.3 |
| ChEMBL 24 | 2048 | 1.82 | 501 | 444 | 29.9 | 27.8 |
| PubChem | 881 | 96.9 | 13,000 | 10,300 | 1130 | 1480 |

The AREN chunk contains the fingerprints, the FPID chunk contains the record identifiers indexed by position, and the HASH chunk contains a hash table mapping each identifier to its index.
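
The AREN sizes in the table can be roughly reproduced from the bit length and the fingerprint count. The sketch below is illustrative only: it assumes each fingerprint is stored in whole bytes padded up to a multiple of 8 bytes, and that "1.00 million" means exactly 1,000,000 records. Neither assumption is stated in the table; the 8-byte padding is simply a guess that agrees with the listed values. The function name `aren_size_mib` and its `alignment` parameter are hypothetical.

```python
def aren_size_mib(num_bits, num_fingerprints, alignment=8):
    """Estimate the fingerprint storage size in MiB.

    Assumes (not confirmed by the table) that each fingerprint is
    rounded up to whole bytes and then padded to `alignment` bytes.
    """
    num_bytes = (num_bits + 7) // 8                    # bits -> whole bytes
    padded = -(-num_bytes // alignment) * alignment    # round up to alignment
    return padded * num_fingerprints / 2**20           # bytes -> MiB


# Compare against the four chemfp benchmark rows (AREN column).
for bits, listed in [(166, 22.9), (881, 107), (1021, 122), (2048, 244)]:
    estimate = aren_size_mib(bits, 1_000_000)
    print(f"{bits:>5} bits: estimated {estimate:6.1f} MiB, table lists {listed}")
```

Under these assumptions a 166-bit fingerprint occupies 24 bytes rather than 21, which is why its AREN chunk is about 22.9 MiB instead of about 20 MiB.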