Author manuscript; available in PMC: 2008 Nov 1.
Published in final edited form as: J Chem Inf Model. 2007 Oct 30;47(6):2098–2109. doi: 10.1021/ci700200n

Figure 9.

ROC retrieval curves based on Tanimoto similarity measures computed from lossy and lossless compressed fingerprints. Curves are obtained for molecules from six biologically relevant datasets using leave-one-out cross-validation. Each ROC curve is constructed by aggregating the ROC curves calculated by using each molecule in a group to search for the rest of that group against a background provided by a random subset of 50,000 molecules from ChemDB. Lossless compression leads to better retrieval performance, corresponding, for instance, to an average increase of 18% in the area under the ROC curve (AUC). Lossy fingerprints are derived by modulo compression to 512 bits.
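The retrieval protocol in this caption can be illustrated with a minimal Python sketch. It is not the authors' code, and it makes several assumptions: fingerprints are represented as sets of "on" bit indices, modulo compression is taken to mean mapping each index `i` to `i % 512`, and the per-query ROC curves are summarized by simply averaging their AUC values (the caption does not specify how the curves are aggregated). The function names `fold_modulo`, `tanimoto`, `auc`, and `leave_one_out_auc` are hypothetical.

```python
from typing import Iterable, List, Optional, Set, Tuple


def fold_modulo(on_bits: Iterable[int], n_bits: int = 512) -> Set[int]:
    """Lossy modulo compression (assumed form): bit i is mapped to i % n_bits.

    Distinct bits that collide after folding become indistinguishable,
    which is where information is lost.
    """
    return {i % n_bits for i in on_bits}


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity |A & B| / |A | B| between two sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def auc(scored: List[Tuple[float, bool]]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (active, background) pairs in which the active
    is scored higher, with ties counted as one half.
    """
    pos = [s for s, is_active in scored if is_active]
    neg = [s for s, is_active in scored if not is_active]
    if not pos or not neg:
        raise ValueError("need at least one active and one background molecule")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def leave_one_out_auc(
    group: List[Set[int]],
    background: List[Set[int]],
    n_bits: Optional[int] = None,
) -> float:
    """Use each group member as a query against the rest of the group plus
    the background, then average the per-query AUCs (an assumed aggregation).

    If n_bits is given, all fingerprints are modulo-folded first (lossy);
    otherwise the full fingerprints are compared.
    """
    if n_bits is not None:
        group = [fold_modulo(fp, n_bits) for fp in group]
        background = [fold_modulo(fp, n_bits) for fp in background]
    per_query = []
    for q, query in enumerate(group):
        scored = [(tanimoto(query, fp), True) for j, fp in enumerate(group) if j != q]
        scored += [(tanimoto(query, fp), False) for fp in background]
        per_query.append(auc(scored))
    return sum(per_query) / len(per_query)
```

Calling `leave_one_out_auc(group, background)` and `leave_one_out_auc(group, background, n_bits=512)` on the same data gives a direct comparison between unfolded and modulo-folded fingerprints. The pairwise AUC is quadratic in the number of molecules, so a rank-based computation would be preferable for a background of 50,000 molecules.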