2020 Aug 28;11:4334. doi: 10.1038/s41467-020-18171-8

Fig. 3. Representative structure similarity (RSS) and CCS prediction accuracy.

a The simulation workflow for investigating structural similarity and the prediction accuracy of CCS values; b comparison of CCS prediction accuracy before and after excluding lipid and lipid-like molecules from the training set; the abbreviations “ARE” and “MRE” denote average relative error and median relative error, respectively; p-values determined by two-sided Wilcoxon rank-sum test; c, d comparison of CCS prediction accuracy before and after excluding one super class from the training set; p-values determined by two-sided Wilcoxon rank-sum test; e correlation between RSS scores and the relative errors of CCS prediction; the error bands represent the 95% confidence interval; p-value determined by linear regression; f correlation between RSS scores and prediction errors in the validation sets; the inset bar plot displays MREs for different RSS groups; the error bands represent the 95% confidence interval; p-value determined by linear regression; g two example compounds illustrating RSS and CCS prediction accuracy; the abbreviation “TC” denotes Tanimoto coefficient; h distribution of RSS scores in seven common compound databases. Source data are provided as a Source Data file.
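
For readers unfamiliar with the quantities named in this caption, the following is a minimal sketch of how ARE, MRE, and a Tanimoto coefficient are conventionally computed. It is not the authors' code. The function names, the CCS values, and the fingerprint bit sets are hypothetical, and the caption does not state how the RSS score is derived from TC, so that step is not shown.

```python
from statistics import mean, median


def relative_errors(predicted, measured):
    """Absolute relative error (in percent) for each compound."""
    return [abs(p - m) / m * 100.0 for p, m in zip(predicted, measured)]


def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        # Two empty fingerprints: define the similarity as 0 to avoid 0/0.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical measured and predicted CCS values (illustrative only).
measured = [150.0, 200.0, 250.0, 300.0]
predicted = [153.0, 198.0, 255.0, 330.0]

errors = relative_errors(predicted, measured)
are = mean(errors)    # average relative error (ARE)
mre = median(errors)  # median relative error (MRE)

# Hypothetical fingerprints: indices of the bits set in each molecule.
tc = tanimoto({1, 2, 3}, {2, 3, 4})

print(f"ARE = {are:.2f}%, MRE = {mre:.2f}%, TC = {tc:.2f}")
```

The median is less sensitive than the mean to a single poorly predicted compound, which is one reason a caption may report both ARE and MRE.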