2019 Dec 4;20:265. doi: 10.1186/s13059-019-1875-0

Fig. 4.


a Relationship between maximum leading zero count (Max LZC) and set size for three randomly generated sets of 8-bit numbers. The Max LZC roughly estimates the log2 of the set size, though with high variance; here, two of the three estimates are off by a factor of 2. b Schematic of the HyperLogLog (HLL) sketch. Each input item is hashed, and the hash value is partitioned into a prefix p and a suffix q. The prefix p indexes into the array of HLL registers. Each register holds the maximum leading zero count among all suffixes q that mapped to it. Register-level estimates are then combined to obtain an overall cardinality estimate. c Estimating the cardinalities of sets A and B, and d estimating the cardinality of their union. To estimate the intersection cardinality with the inclusion-exclusion principle, the estimated set and union cardinalities are combined. e Direct estimation of the intersection cardinality with Ertl’s JMLE.
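Panels b to d can be illustrated with a minimal Python sketch. It is not the implementation behind the figure: the names `P`, `hash64`, `sketch`, `estimate`, `union` and `intersection_size` are hypothetical, BLAKE2b stands in for whatever hash function is actually used, the registers store the leading zero count plus one (the usual HLL convention, so that 0 marks an empty register), and the estimator is the original HyperLogLog harmonic-mean formula with a small-range correction rather than Ertl's estimators. The direct JMLE estimate of panel e is not shown.

```python
import hashlib
import math

P = 12               # assumed number of prefix bits; gives 2**P registers
M = 1 << P           # number of registers
Q = 64 - P           # number of suffix bits in a 64-bit hash


def hash64(item: str) -> int:
    """Hash an item to a 64-bit integer (BLAKE2b is a stand-in choice)."""
    digest = hashlib.blake2b(item.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def sketch(items) -> list:
    """Build an HLL register array from an iterable of strings (panel b)."""
    regs = [0] * M
    for item in items:
        h = hash64(item)
        prefix = h >> Q                  # p: selects the register
        suffix = h & ((1 << Q) - 1)      # q: its leading zeros are counted
        rank = Q - suffix.bit_length() + 1   # leading zero count of q, plus one
        regs[prefix] = max(regs[prefix], rank)
    return regs


def estimate(regs) -> float:
    """Combine register-level estimates into one cardinality (panel c)."""
    m = len(regs)
    alpha = 0.7213 / (1 + 1.079 / m)     # bias constant, valid for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in regs)
    empty = regs.count(0)
    if raw <= 2.5 * m and empty:
        return m * math.log(m / empty)   # linear counting for small sets
    return raw


def union(regs_a, regs_b) -> list:
    """Sketch of the union: element-wise register maximum (panel d)."""
    return [max(a, b) for a, b in zip(regs_a, regs_b)]


def intersection_size(regs_a, regs_b) -> float:
    """Inclusion-exclusion: |A| + |B| - |A union B|."""
    return estimate(regs_a) + estimate(regs_b) - estimate(union(regs_a, regs_b))


if __name__ == "__main__":
    a = sketch(str(i) for i in range(20000))
    b = sketch(str(i) for i in range(10000, 30000))
    print(estimate(a), estimate(b), estimate(union(a, b)), intersection_size(a, b))
```

Because the inclusion-exclusion estimate subtracts three noisy quantities, its relative error grows as the intersection shrinks relative to the union, which is one reason to prefer a direct estimator such as the one in panel e.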