Table 2.
Overview of similarity measures used in research articles.
| Similarity measures | Description | Refs. |
|---|---|---|
| logD | logD ratio (I, II) = $\log D_{\mathrm{I}} / \log D_{\mathrm{II}}$ | [37] |
| Chemical nature | Compounds were clustered based on their chemical nature, i.e., acids, bases and neutrals. | [54] |
| k | k-ratio similarity (I, II) = $k_{\mathrm{I}} / k_{\mathrm{II}}$ | [55] |
| Dual filtering | Two similarity searches are combined to improve model performance, e.g., structural similarity searching with retention time similarity searching, or partition coefficient (logP) searching with SDI-based searching. | [56] |
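| Tanimoto similarity | Tanimoto similarity (I, II) = $\lvert \mathrm{FP}_{\mathrm{I}} \cap \mathrm{FP}_{\mathrm{II}} \rvert / \lvert \mathrm{FP}_{\mathrm{I}} \cup \mathrm{FP}_{\mathrm{II}} \rvert$ | [57] |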
| SDI | $\log \alpha = \log (k / k_{\mathrm{EB}}) = \eta' H - \sigma' S + \beta' A + \alpha' B + \kappa' C$ | [58] |
I, II: different compounds; FP: fingerprint; logD: distribution coefficient (logarithmic scale); logP: partition coefficient (logarithmic scale); α: chromatographic selectivity; k: retention factor; kEB: retention factor of ethylbenzene; SDI: second dominant interaction; H, S, A, B, C: five column coefficients; η′, σ′, β′, α′, κ′: five complementary solute coefficients.
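
As an illustration of how the formula-based measures in Table 2 can be evaluated, the following minimal Python sketch computes a property ratio (applicable to logD or k), a Tanimoto similarity, and logα from the five-term expression in the SDI row. The function names are hypothetical, and representing each fingerprint as a set of on-bit indices is an assumption made here for simplicity; the cited studies may use other fingerprint encodings or software.

```python
from typing import AbstractSet, Sequence


def ratio_similarity(value_i: float, value_ii: float) -> float:
    """Ratio of a property (e.g., logD or k) of compound I to that of compound II."""
    if value_ii == 0:
        raise ValueError("the property of compound II must be non-zero")
    return value_i / value_ii


def tanimoto_similarity(fp_i: AbstractSet[int], fp_ii: AbstractSet[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = fp_i | fp_ii
    if not union:
        # Assumed convention: two empty fingerprints have similarity 0.
        return 0.0
    return len(fp_i & fp_ii) / len(union)


def log_alpha(solute: Sequence[float], column: Sequence[float]) -> float:
    """log(alpha) = eta'*H - sigma'*S + beta'*A + alpha'*B + kappa'*C.

    solute: (eta', sigma', beta', alpha', kappa'), the solute coefficients
    column: (H, S, A, B, C), the column coefficients
    """
    eta, sigma, beta, alpha, kappa = solute
    H, S, A, B, C = column
    return eta * H - sigma * S + beta * A + alpha * B + kappa * C


if __name__ == "__main__":
    # Illustrative values only, not taken from any of the cited studies.
    print(ratio_similarity(2.0, 4.0))
    print(tanimoto_similarity({1, 2, 3}, {2, 3, 4}))
    print(log_alpha((1, 1, 1, 1, 1), (1.0, 0.5, 0.25, 0.125, 0.0625)))
```

A ratio close to 1, or a Tanimoto similarity close to 1, indicates that compounds I and II are similar under the respective measure; any cut-off applied to these values is a modelling choice that this sketch does not fix.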