Table 2. Statistics of different data structures on a large, simulated dataset with 167,578 unique UMIs.
Data Structure | Property | Value |
---|---|---|
Subsequences | Number of bins | 1,471,978 |
Subsequences | Avg bin size | 1.1 |
Subsequences | Max bin size | 5 |
Trie | Nodes | 511,634 |
n-grams | Number of bins | 6,250 |
n-grams | Avg bin size | 53.6 |
n-grams | Max bin size | 139 |
BK-tree | Avg depth | 8.6 |
BK-tree | Max depth | 13 |
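
As a rough illustration of how bin statistics like those reported for the n-grams index could be gathered, the sketch below bins a handful of toy UMIs by non-overlapping segments and reports the number of bins, the average bin size, and the maximum bin size. The segmentation scheme, the function names `ngram_bins` and `bin_stats`, and the toy UMIs are assumptions made for illustration; they are not taken from the implementation behind the table.

```python
from collections import defaultdict


def ngram_bins(umis, num_segments):
    """Bin UMIs by (segment index, segment string).

    Assumption: each UMI is cut into `num_segments` non-overlapping
    segments, and the last segment absorbs any remainder.
    """
    bins = defaultdict(set)
    for umi in umis:
        seg_len = len(umi) // num_segments
        for i in range(num_segments):
            start = i * seg_len
            end = len(umi) if i == num_segments - 1 else start + seg_len
            bins[(i, umi[start:end])].add(umi)
    return bins


def bin_stats(bins):
    """Return the bin count, average bin size, and maximum bin size."""
    sizes = [len(members) for members in bins.values()]
    if not sizes:
        return {"num_bins": 0, "avg_bin_size": 0.0, "max_bin_size": 0}
    return {
        "num_bins": len(sizes),
        "avg_bin_size": sum(sizes) / len(sizes),
        "max_bin_size": max(sizes),
    }


# Toy input (hypothetical), not the simulated dataset from the table.
umis = ["AAAA", "AAAT", "TTAA", "TTTT"]
stats = bin_stats(ngram_bins(umis, num_segments=2))
print(stats)
```

Because every UMI is placed in one bin per segment, the sum of all bin sizes equals the number of unique UMIs times the number of segments, which is a quick consistency check on statistics of this kind.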