Author manuscript; available in PMC: 2008 Aug 29.
Published in final edited form as: J Chem Inf Model. 2007 Feb 28;47(2):302–317. doi: 10.1021/ci600358f
Algorithm 1 Top K Search
Require: database of fingerprints B binned by bit count Bs
Ensure: hits contains top K hits which satisfy Similarity(A,B) > T
1: hits ← MinHeap()
2: bounds ← List()
3: for all B in database do //iterate over bins
4:     tuple ← Tuple(Bound(A,B),B)
5:     ListAppend(bounds, tuple)
6: end for
7: Quicksort(bounds) //NOTE: the length of bounds is constant
8: for all bound, B in bounds do //iterate in order of decreasing bound
9:     if bound < T then
10:         break //threshold stopping condition
11:     end if
12:     if K = HeapSize(hits) and bound < MinSimilarity(hits) then
13:         break //top-K stopping condition
14:     end if
15:     for all B in database[B] do
16:         S=Similarity(A,B)
17:         tuple ← Tuple(S,B)
18:         if S ≤ T then
19:             continue //ignore this B and continue to next
20:         else if Length(hits)< K then
21:             HeapPush(hits, tuple)
22:         else if S > MinSimilarity(hits) then
23:             HeapPopMin(hits)
24:             HeapPush(hits,tuple)
25:         end if
26:     end for
27: end for
28: return hits
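
The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. The listing does not define Similarity or Bound, so the sketch assumes Tanimoto similarity and the bit-count bound min(A,B)/max(A,B). Fingerprints are stored as Python integers, and all function names (`popcount`, `tanimoto`, `bit_count_bound`, `bin_by_bit_count`, `top_k_search`) are hypothetical.

```python
import heapq
from collections import defaultdict


def popcount(fp: int) -> int:
    """Number of 1-bits in a fingerprint stored as a Python int."""
    return bin(fp).count("1")


def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity (assumed here; the listing only says Similarity)."""
    union = popcount(a | b)
    return popcount(a & b) / union if union else 0.0


def bit_count_bound(a_count: int, b_count: int) -> float:
    """Assumed upper bound on Tanimoto similarity given only the bit counts."""
    larger = max(a_count, b_count)
    return min(a_count, b_count) / larger if larger else 0.0


def bin_by_bit_count(fingerprints):
    """Group fingerprints into bins keyed by their bit count."""
    bins = defaultdict(list)
    for fp in fingerprints:
        bins[popcount(fp)].append(fp)
    return bins


def top_k_search(query, bins, k, threshold):
    """Return up to k (similarity, fingerprint) pairs with similarity > threshold."""
    if k <= 0:
        return []
    a_count = popcount(query)
    hits = []  # min-heap: hits[0] is the weakest hit kept so far
    # One bound per bin, visited in order of decreasing bound.
    bounds = sorted(
        ((bit_count_bound(a_count, count), count) for count in bins),
        reverse=True,
    )
    for bound, count in bounds:
        if bound < threshold:
            break  # threshold stopping condition
        if len(hits) == k and bound < hits[0][0]:
            break  # top-K stopping condition
        for fp in bins[count]:
            s = tanimoto(query, fp)
            if s <= threshold:
                continue  # ignore this fingerprint
            if len(hits) < k:
                heapq.heappush(hits, (s, fp))
            elif s > hits[0][0]:
                heapq.heapreplace(hits, (s, fp))  # pop the minimum, push the new hit
    return sorted(hits, reverse=True)


database = [0b1111, 0b1110, 0b0111, 0b1000, 0b11110000, 0b1011]
bins = bin_by_bit_count(database)
result = top_k_search(0b1111, bins, k=2, threshold=0.5)
```

Because the bins are visited in order of decreasing bound, the search can stop as soon as the next bin's bound falls below the threshold or below the weakest of the K hits already found. No fingerprint in that bin or any later one can then qualify.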