Author manuscript; available in PMC: 2008 Aug 29.
Published in final edited form as: J Chem Inf Model. 2007 Feb 28;47(2):302–317. doi: 10.1021/ci600358f

Table 2.

Measured search times for queries against the entire ChemDB database (about 5M compounds) on a 2.4 GHz AMD Opteron processor with 2 GB of memory. Searches are carried out using the Tanimoto similarity measure with a threshold (t = 0.9), a top-ten cutoff (K = 10), or both. Search times for single-molecule queries are expressed in seconds and are averaged over each dataset. The datasets correspond to the six Stahl and Rarey (ref 14) datasets, a random set of 1,000 queries extracted from the set of actual ChemDB queries, and a random set of 100 queries taken from ChemDB. The fraction of the database that needs to be searched is given by 1 - f.

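| Dataset | Size | Time in s (t=0.9) | 1-f (t=0.9) | Time in s (K=10) | 1-f (K=10) | Time in s (t=0.9, K=10) | 1-f (t=0.9, K=10) |
|---|---|---|---|---|---|---|---|
| Cox2 | 128 | 0.79 | 0.17 | 3.53 | 0.76 | 0.78 | 0.17 |
| Estrogen | 55 | 0.60 | 0.12 | 2.03 | 0.43 | 0.52 | 0.11 |
| Gelatinase A | 43 | 0.77 | 0.16 | 3.31 | 0.71 | 0.77 | 0.16 |
| Neuraminidase | 17 | 0.70 | 0.14 | 2.74 | 0.59 | 0.66 | 0.14 |
| p38 MAP kinase | 25 | 0.90 | 0.18 | 3.30 | 0.71 | 0.87 | 0.18 |
| Thrombin | 67 | 0.91 | 0.19 | 3.27 | 0.70 | 0.88 | 0.19 |
| ChemDB Queries | 1,000 | 0.27 | 0.06 | 1.12 | 0.24 | 0.26 | 0.06 |
| Random ChemDB | 100 | 0.64 | 0.14 | 1.23 | 0.27 | 0.58 | 0.12 |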
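
To illustrate how a Tanimoto threshold search can examine only a fraction of a database, the following is a minimal sketch and not the implementation benchmarked in Table 2. It assumes fingerprints are stored as Python integers used as bit sets, and it prunes candidates with the standard bit-count bound T(A, B) <= min(|A|, |B|) / max(|A|, |B|). The function names `tanimoto` and `threshold_search` and the toy fingerprints are hypothetical. The returned fraction plays the role of the "fraction of the database that needs to be searched" in the caption; the top-K variant is not shown.

```python
from typing import List, Tuple


def popcount(x: int) -> int:
    """Number of set bits in a fingerprint stored as a Python int."""
    return bin(x).count("1")


def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity |A & B| / |A | B| of two bit-set fingerprints."""
    union = popcount(a | b)
    if union == 0:
        return 1.0  # assumed convention for two empty fingerprints
    return popcount(a & b) / union


def threshold_search(
    query: int, database: List[int], t: float
) -> Tuple[List[Tuple[int, float]], float]:
    """Return (hits, fraction_searched) for similarity threshold t.

    A candidate is skipped without computing its similarity when the
    bit-count bound min(|A|, |B|) / max(|A|, |B|) is already below t.
    """
    q_bits = popcount(query)
    hits: List[Tuple[int, float]] = []
    searched = 0
    for idx, fp in enumerate(database):
        bits = popcount(fp)
        lo, hi = min(q_bits, bits), max(q_bits, bits)
        if hi > 0 and lo < t * hi:
            continue  # upper bound on similarity is below t: prune
        searched += 1
        s = tanimoto(query, fp)
        if s >= t:
            hits.append((idx, s))
    fraction = searched / len(database) if database else 0.0
    return hits, fraction


if __name__ == "__main__":
    # Toy fingerprints (hypothetical data, for illustration only).
    db = [0b1111, 0b0111, 0b1, 0b11111111, 0b1110, 0b1110000]
    hits, fraction = threshold_search(0b1111, db, t=0.7)
    print(hits, fraction)
```

Because the bound depends only on bit counts, a database sorted or binned by bit count could skip the pruned candidates without visiting them at all; the linear loop above only counts how many candidates survive the bound.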