2020 Jul 20;11(9):1075–1087. doi: 10.1039/d0md00110d

Table 2. The frequency of train-external activity discontinuities in a selection of cell lines. Discontinuities, also termed activity cliffs, are pairs of compounds with similar structures but opposite activity labels. Here, chemical similarity was defined either by 166-bit MACCS representations having a Tanimoto similarity (Tc) of 0.8 or higher, or by 1024-bit ECFP (radius 2) representations having Tc ≥ 0.5. Some of the ECFP-based train-external discontinuities also have Tc ≥ 0.8; these are counted separately. Values are mean counts ± standard deviations obtained from repeated train-external splits and subsampling, and serve as estimates of the unknown global population of discontinuities.

| Cell line | MACCS-based (Tc ≥ 0.8) | ECFP-based (Tc ≥ 0.8) | ECFP-based (Tc ≥ 0.5) |
|-----------|------------------------|-----------------------|-----------------------|
| SK-MEL-2  | 3502.4 ± 112.15        | 43.4 ± 4.84           | 1244.8 ± 18.15        |
| A549/ATCC | 3847.8 ± 200.82        | 52.4 ± 4.96           | 1322.4 ± 55.00        |
| MDA-N     | 4623.0 ± 79.22         | 63.2 ± 6.05           | 1201.0 ± 24.45        |
| HCT-15    | 4620.6 ± 349.2         | 61.8 ± 12.06          | 1619.2 ± 83.33        |
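
The counting procedure described in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that fingerprints are already available as sets of on-bit indices (in practice they would be generated by a cheminformatics toolkit), that a discontinuity is counted once per train-external compound pair, and that the reported spread is a sample standard deviation. The names `tanimoto`, `count_discontinuities`, and `summarize` are hypothetical helpers introduced only for this illustration.

```python
from statistics import mean, stdev


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def count_discontinuities(train, external, threshold):
    """Count train-external pairs with Tc >= threshold and opposite labels.

    Each compound is a (fingerprint_bits, activity_label) tuple.
    Assumption: every qualifying pair counts as one discontinuity.
    """
    return sum(
        1
        for fp_t, y_t in train
        for fp_e, y_e in external
        if y_t != y_e and tanimoto(fp_t, fp_e) >= threshold
    )


def summarize(counts):
    """Mean and sample standard deviation of counts from repeated splits."""
    return mean(counts), stdev(counts)


# Toy data (hypothetical bit sets, not real MACCS or ECFP fingerprints).
train = [(frozenset({1, 2, 3, 4, 5}), 1), (frozenset({10, 11, 12}), 0)]
external = [
    (frozenset({1, 2, 3, 4, 5, 6}), 0),  # Tc = 5/6 with the first training compound
    (frozenset({10, 11, 12}), 0),        # identical to a training compound, same label
    (frozenset({1, 2, 3}), 0),           # Tc = 3/5 with the first training compound
]

strict = count_discontinuities(train, external, threshold=0.8)
loose = count_discontinuities(train, external, threshold=0.5)
```

On this toy data the stricter threshold keeps only the pair with Tc = 5/6, while the looser threshold also admits the pair with Tc = 3/5, which mirrors why the Tc ≥ 0.8 column in the table is much smaller than the Tc ≥ 0.5 column for the same fingerprint.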