External validation of QSAR models for MIEs based on ChEMBL data. For each MIE-predicting QSAR model, the average numbers of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) are reported. Model predictivity was evaluated using sensitivity (SEN), specificity (SPE), balanced accuracy (BA), the Matthews correlation coefficient (MCC), and the area under the ROC curve (AUC). Each reported value is the average over 100 different training-test splits.
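
For reference, the count-based metrics in this caption follow from the confusion matrix by their standard definitions. The snippet below is a minimal sketch of those definitions, not the authors' evaluation code; the function name `confusion_metrics` and the example counts are hypothetical. AUC is omitted because it requires the ranked prediction scores rather than the four counts.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Standard count-based metrics for one training-test split.

    Hypothetical helper for illustration; accepts ints or floats.
    """
    # Sensitivity: fraction of actual positives predicted positive.
    sen = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of actual negatives predicted negative.
    spe = tn / (tn + fp) if (tn + fp) else float("nan")
    # Balanced accuracy: mean of sensitivity and specificity.
    ba = (sen + spe) / 2
    # Matthews correlation coefficient; set to 0 by convention
    # when any marginal total is zero (an assumption of this sketch).
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"SEN": sen, "SPE": spe, "BA": ba, "MCC": mcc}


# Made-up counts, not taken from the table.
example = confusion_metrics(tp=40, fp=10, tn=45, fn=5)
print({name: round(value, 3) for name, value in example.items()})
```

Because the reported values are averages over splits, the mean of the per-split metrics will generally differ slightly from the metrics recomputed from the averaged TP, FP, TN, and FN counts, since the metrics are nonlinear in the counts.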