2021 Jul 28;61(9):4156–4172. doi: 10.1021/acs.jcim.0c00993

Table 2. Similarity Coefficients^a

| name | measurement | range | reference(s) |
|---|---|---|---|
| Braun-Blanquet | x/max(y,z) | 0 to 1 | (53,54) |
| Cosine | [formula graphic: ci0c00993_m002.jpg] | 0 to 1 | (55,56) |
| Dice | [formula graphic: ci0c00993_m003.jpg] | 0 to 1 | (57,58) |
| Dot-product | x | 0 to ∞ | N/A |
| Euclidean | [formula graphic: ci0c00993_m004.jpg] | 0 to 1 | N/A |
| Kulczynski | [inline formula graphic] | 0 to 1 | (59) |
| McConnaughey | [formula graphic: ci0c00993_m006.jpg] | –1 to 1 | (60) |
| Russel/Rao | x/w | 0 to 1 | (61) |
| Simpson | x/min(y,z) | 0 to 1 | (62) |
| Sokal/Sneath | [formula graphic: ci0c00993_m007.jpg] | 0 to 1 | (63) |
| Tanimoto | [formula graphic: ci0c00993_m008.jpg] | 0 to 1 | (64,65) |
| Tullos | XYZ | 0 to 1 | (66) |
| Tversky | [formula graphic: ci0c00993_m009.jpg] | 0 to 1 | (67) |
^a A total of 13 similarity coefficients, several of which were collected by Raymond and Willett (52), were compiled to measure the structural similarity of two compounds described by a given molecular fingerprint encoding. Here, x is the number of bits set in both fingerprints, y is the number of bits set in the first fingerprint, z is the number of bits set in the second fingerprint, and w is the total number of bits in the bit string. For the Tullos similarity coefficient, the three factors X, Y, and Z are defined by [inline graphic], [inline graphic], and [inline graphic], respectively. For the asymmetric evaluation of the Tversky similarity coefficient, α = 0.9. Here, we assume that the parameters of the Tversky coefficient in its original formulation, [inline graphic], follow p + q = 1. The Dice and Tanimoto similarity coefficients are two symmetric instances of the Tversky coefficient, with p = q = 0.5 and p = q = 1, respectively.
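The bit counts defined in the footnote can be illustrated with a minimal Python sketch. It implements only the coefficients whose formulas appear as plain text in the table (Braun-Blanquet, Simpson, Russel/Rao, Dot-product). The Tversky, Dice, and Tanimoto forms are assumptions: they are the commonly cited textbook expressions, used here because the table's formula images were not available, and they may differ in notation from the paper's own. The function names, the integer fingerprint representation, and the mapping p = α, q = 1 − α for the asymmetric case are likewise hypothetical choices made for illustration. The sketch checks numerically that the assumed Tversky form reduces to Dice at p = q = 0.5 and to Tanimoto at p = q = 1, as the footnote states.

```python
import math


def bit_counts(fp_a: int, fp_b: int, n_bits: int) -> tuple[int, int, int, int]:
    """Return (x, y, z, w) for two fingerprints stored as Python ints."""
    x = bin(fp_a & fp_b).count("1")  # bits set in both fingerprints
    y = bin(fp_a).count("1")         # bits set in the first fingerprint
    z = bin(fp_b).count("1")         # bits set in the second fingerprint
    return x, y, z, n_bits           # w: total number of bits in the bit string


# Formulas that appear as plain text in Table 2.
# All ratio functions assume y > 0 and z > 0 (no empty fingerprints).
def braun_blanquet(x, y, z):
    return x / max(y, z)


def simpson(x, y, z):
    return x / min(y, z)


def russel_rao(x, w):
    return x / w


def dot_product(x):
    return x


# ASSUMPTION: commonly cited form of the Tversky coefficient,
# not copied from the table (its formula image was not available).
def tversky(x, y, z, p, q):
    return x / (p * (y - x) + q * (z - x) + x)


# ASSUMPTION: commonly cited forms of Dice and Tanimoto.
def dice(x, y, z):
    return 2 * x / (y + z)


def tanimoto(x, y, z):
    return x / (y + z - x)


if __name__ == "__main__":
    # Two hypothetical 8-bit fingerprints.
    x, y, z, w = bit_counts(0b10110110, 0b10011100, n_bits=8)

    # Symmetric special cases named in the footnote.
    assert math.isclose(tversky(x, y, z, 0.5, 0.5), dice(x, y, z))
    assert math.isclose(tversky(x, y, z, 1.0, 1.0), tanimoto(x, y, z))

    # ASSUMPTION: asymmetric case with p = alpha and q = 1 - alpha.
    alpha = 0.9
    print("Tversky (alpha = 0.9):", tversky(x, y, z, alpha, 1 - alpha))
    print("Braun-Blanquet:", braun_blanquet(x, y, z))
    print("Simpson:", simpson(x, y, z))
    print("Russel/Rao:", russel_rao(x, w))
```

Working from the four counts rather than from the raw bit strings keeps each coefficient a one-line expression and makes the relationship between the Tversky parameters and the Dice and Tanimoto coefficients easy to verify.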