Figure 5.
Relationship between dataset coverage and the similarity threshold. The graph shows the dataset coverage (y-axis), defined as the number of chemicals in the dataset for which analogues exist at a given Jaccard similarity threshold (x-axis). Coverage decreases as the similarity threshold increases.
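The coverage curve described in this caption can be illustrated with a minimal sketch. It rests on assumptions the caption does not state: each chemical is represented as a set of structural features, analogues are drawn from the same dataset, and a chemical counts as covered when at least one other chemical reaches the threshold. The names `jaccard`, `coverage`, and `toy` are hypothetical, and the toy feature sets are not data from the study.

```python
def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| between two feature sets."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


def coverage(fingerprints, threshold):
    """Count chemicals with at least one analogue at similarity >= threshold.

    Assumption: analogues come from the same list, excluding the chemical itself.
    """
    covered = 0
    for i, fp in enumerate(fingerprints):
        if any(
            jaccard(fp, other) >= threshold
            for j, other in enumerate(fingerprints)
            if j != i
        ):
            covered += 1
    return covered


# Toy feature sets (illustrative only, not from the study).
toy = [{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 6, 7}, {8, 9}]

# Coverage at increasing thresholds; the count cannot increase with the threshold.
curve = {t: coverage(toy, t) for t in (0.2, 0.5, 0.8)}
print(curve)
```

Because any analogue that satisfies a strict threshold also satisfies every looser one, coverage computed this way can never rise as the threshold increases, which matches the trend the figure reports.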