Classification results of three machine learning models (random forest, support vector machine, and neural network) on an independent validation cohort. (a) Overview of the samples composing the validation cohort (n = 361), including the study name, number of samples, material, array, data type, and processing method. (b–d) Accuracy and predictable cases at thresholds from 0.5 to 0.95 for the random forest (b), support vector machine (c), and neural network (d) classifiers. (e–g) Confusion matrices without filters (upper panel) and with filters (lower panel) for the three classifiers: random forest (e), support vector machine (f), and neural network (g). (h–j) Probability score of the correct class for each classifier (random forest (h), support vector machine (i), and neural network (j)), shown for three variables: tissue of origin (iCCA, normal bile, and PAAD; left), material type (FFPE versus frozen; middle), and study set (right).
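
As an illustration of the threshold analysis in panels (b–d), the following minimal sketch shows one way accuracy and predictable cases could be computed per threshold. It assumes that a case counts as "predictable" when the highest class probability reaches the threshold; this definition, the function name `accuracy_at_thresholds`, and the toy probabilities are illustrative assumptions and are not taken from the study.

```python
def accuracy_at_thresholds(probs, labels, thresholds):
    """For each threshold, count the samples whose top class probability
    reaches the threshold and compute accuracy on those samples only.

    Assumption: "predictable" means top probability >= threshold.
    """
    results = []
    for t in thresholds:
        kept = 0
        correct = 0
        for row, truth in zip(probs, labels):
            # Index of the class with the highest predicted probability
            top = max(range(len(row)), key=row.__getitem__)
            if row[top] >= t:
                kept += 1
                correct += int(top == truth)
        # Accuracy is undefined when no sample passes the threshold
        accuracy = correct / kept if kept else None
        results.append((t, kept, accuracy))
    return results


# Hypothetical toy data: 4 samples, 3 classes (not from the validation cohort)
probs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.10, 0.80, 0.10],
    [0.55, 0.30, 0.15],
]
labels = [0, 1, 1, 2]

results = accuracy_at_thresholds(probs, labels, [0.5, 0.75, 0.95])
for threshold, kept, accuracy in results:
    print(threshold, kept, accuracy)
```

Raising the threshold typically trades coverage for accuracy: fewer cases receive a prediction, but those that do tend to be classified more reliably.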