Author manuscript; available in PMC: 2017 Sep 27.
Published in final edited form as: Commun ACM. 2016 Aug;59(8):72–80. doi: 10.1145/2957324

Metric-entropy ratio (ratio of database entries to clusters) and fractal dimension at typical search radii for four datasets.

The metric-entropy ratio estimates the speedup of coarse search over naïve search; as long as the fractal dimension is low, coarse search should dominate total search time. NCBI’s non-redundant ‘NR’ protein and ‘NT’ nucleotide sequence databases are from June 2015. The Protein Data Bank (PDB) is from July 2015. PubChem is from October 2013.

| Dataset | Metric-entropy ratio | Fractal dimension |
|---|---|---|
| Nucleotide sequences (NCBI NT) | 7:1 | 1.5 |
| Protein sequences (NCBI NR) | 5:1 | 1.6 |
| Protein structure (PDB) | 10:1 | 2.5 |
| Chemical structure (PubChem) | 11:1 | 0.2 |
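
As a rough illustration of how the two columns might be read, the Python sketch below turns each row into two back-of-the-envelope numbers. It rests on assumptions that are not stated in the table itself: that a ratio of r:1 means roughly r database entries per cluster, so coarse search compares a query against about n/r cluster representatives instead of n entries, and that the number of hits grows locally like the search radius raised to the fractal dimension. The function names and the database size of one million entries are hypothetical and chosen only for illustration.

```python
# Values transcribed from the table: (entries per cluster, fractal dimension).
DATASETS = {
    "NCBI NT": (7, 1.5),
    "NCBI NR": (5, 1.6),
    "PDB": (10, 2.5),
    "PubChem": (11, 0.2),
}


def coarse_comparisons(n_entries: int, ratio: float) -> float:
    """Estimated comparisons in coarse search.

    Assumption: `ratio` is the number of entries per cluster, so only
    n_entries / ratio cluster representatives are compared to the query.
    """
    return n_entries / ratio


def radius_doubling_growth(fractal_dim: float) -> float:
    """Factor by which the hit count grows when the search radius doubles.

    Assumption: local scaling of the form count ~ radius ** fractal_dim.
    """
    return 2.0 ** fractal_dim


def summarize(n_entries: int = 1_000_000) -> dict:
    """Map each dataset to (coarse comparisons, growth on radius doubling).

    `n_entries` is a hypothetical database size, not a figure from the paper.
    """
    return {
        name: (coarse_comparisons(n_entries, ratio), radius_doubling_growth(dim))
        for name, (ratio, dim) in DATASETS.items()
    }


if __name__ == "__main__":
    for name, (comparisons, growth) in summarize().items():
        print(f"{name}: ~{comparisons:,.0f} coarse comparisons, "
              f"x{growth:.2f} hits per radius doubling")
```

Under these assumptions, a higher ratio means fewer coarse comparisons, and a lower fractal dimension means the candidate set handed to fine search grows more slowly as the search radius widens.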