2024 Mar 4;12(4):e03584-23. doi: 10.1128/spectrum.03584-23

TABLE 2.

Pairwise comparison between the generalization performance of models trained using only the mutation data^a

| Data set | Model A | Model B | Mean A | Mean B | P(A > B) | P(A == B) | P(A < B) |
|---|---|---|---|---|---|---|---|
| Clade 20C | LANDMark | Extra Trees | 0.965 ± 0.035 | 0.968 ± 0.037 | 0.001 | 0.996 | 0.003 |
| Clade 20C | LANDMark | Logistic Regression | 0.965 ± 0.035 | 0.972 ± 0.035 | 0.001 | 0.990 | 0.009 |
| Clade 20C | LANDMark | K-Nearest Neighbors | 0.965 ± 0.035 | 0.875 ± 0.051 | 0.948 | 0.052 | 0 |
| Clade 20C | LANDMark | Linear SVC | 0.965 ± 0.035 | 0.973 ± 0.030 | 0 | 0.999 | 0.001 |
| Clade 20C | Extra Trees | Logistic Regression | 0.968 ± 0.037 | 0.972 ± 0.035 | 0.004 | 0.985 | 0.011 |
| Clade 20C | Extra Trees | K-Nearest Neighbors | 0.968 ± 0.037 | 0.875 ± 0.051 | 0.969 | 0.031 | 0 |
| Clade 20C | Extra Trees | Linear SVC | 0.968 ± 0.037 | 0.973 ± 0.030 | 0.003 | 0.985 | 0.012 |
| Clade 20C | Logistic Regression | K-Nearest Neighbors | 0.972 ± 0.035 | 0.875 ± 0.051 | 0.952 | 0.048 | 0 |
| Clade 20C | Logistic Regression | Linear SVC | 0.972 ± 0.035 | 0.973 ± 0.030 | 0 | 0.999 | 0.001 |
| Clade 20C | K-Nearest Neighbors | Linear SVC | 0.875 ± 0.051 | 0.973 ± 0.030 | 0 | 0.023 | 0.977 |
| Clade 21J | LANDMark | Extra Trees | 0.892 ± 0.044 | 0.896 ± 0.042 | 0 | 1 | 0 |
| Clade 21J | LANDMark | Logistic Regression | 0.892 ± 0.044 | 0.886 ± 0.045 | 0.002 | 0.998 | 0 |
| Clade 21J | LANDMark | K-Nearest Neighbors | 0.892 ± 0.044 | 0.764 ± 0.049 | 1 | 0 | 0 |
| Clade 21J | LANDMark | Linear SVC | 0.892 ± 0.044 | 0.882 ± 0.042 | 0.008 | 0.991 | 0 |
| Clade 21J | Extra Trees | Logistic Regression | 0.896 ± 0.042 | 0.886 ± 0.045 | 0.009 | 0.990 | 0 |
| Clade 21J | Extra Trees | K-Nearest Neighbors | 0.896 ± 0.042 | 0.764 ± 0.049 | 1 | 0 | 0 |
| Clade 21J | Extra Trees | Linear SVC | 0.896 ± 0.042 | 0.882 ± 0.042 | 0.023 | 0.976 | 0 |
| Clade 21J | Logistic Regression | K-Nearest Neighbors | 0.886 ± 0.045 | 0.764 ± 0.049 | 0.999 | 0.001 | 0 |
| Clade 21J | Logistic Regression | Linear SVC | 0.886 ± 0.045 | 0.882 ± 0.042 | 0.004 | 0.994 | 0.001 |
| Clade 21J | K-Nearest Neighbors | Linear SVC | 0.764 ± 0.049 | 0.882 ± 0.042 | 0 | 0.003 | 0.997 |
| All Clades | LANDMark | Extra Trees | 0.982 ± 0.007 | 0.982 ± 0.006 | 0 | 1 | 0 |
| All Clades | LANDMark | Logistic Regression | 0.982 ± 0.007 | 0.984 ± 0.005 | 0 | 1 | 0 |
| All Clades | LANDMark | K-Nearest Neighbors | 0.982 ± 0.007 | 0.950 ± 0.018 | 0.035 | 0.965 | 0 |
| All Clades | LANDMark | Linear SVC | 0.982 ± 0.007 | 0.985 ± 0.018 | 0.0 | 1 | 0.0 |
| All Clades | Extra Trees | Logistic Regression | 0.982 ± 0.006 | 0.984 ± 0.005 | 0.0 | 1 | 0.0 |
| All Clades | Extra Trees | K-Nearest Neighbors | 0.982 ± 0.006 | 0.950 ± 0.018 | 0.027 | 0.973 | 0 |
| All Clades | Extra Trees | Linear SVC | 0.982 ± 0.006 | 0.985 ± 0.018 | 0 | 1 | 0 |
| All Clades | Logistic Regression | K-Nearest Neighbors | 0.984 ± 0.005 | 0.950 ± 0.018 | 0.043 | 0.957 | 0 |
| All Clades | Logistic Regression | Linear SVC | 0.984 ± 0.005 | 0.985 ± 0.018 | 0 | 1 | 0 |
| All Clades | K-Nearest Neighbors | Linear SVC | 0.950 ± 0.018 | 0.985 ± 0.018 | 0 | 0.947 | 0.053 |
^a A Bayesian t-test was used to determine the probability that the balanced accuracy score of model A exceeds, is equivalent to, or is lower than that of model B. Feature selection using Triglav was not performed to generate these results.
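
The footnote names only "a Bayesian t-test", so the exact procedure behind the three probability columns is not stated here. As a rough illustration of how such probabilities can be obtained, the sketch below assumes a Bayesian correlated t-test with a region of practical equivalence (ROPE): the posterior of the mean score difference is treated as a Student t distribution, and the mass above, inside, and below the ROPE gives P(A > B), P(A == B), and P(A < B). The function name, the `rope` width, the `rho` correlation, and the example scores are all hypothetical and are not taken from the paper. The posterior is sampled by Monte Carlo to stay within the Python standard library, so the results are approximate.

```python
import math
import random
import statistics


def bayesian_t_test(scores_a, scores_b, rope=0.01, rho=0.0,
                    n_draws=20_000, seed=0):
    """Approximate P(A > B), P(A == B), P(A < B) from paired scores.

    Assumed model (not confirmed by the source): the posterior of the
    mean difference is Student t with n - 1 degrees of freedom, centred
    on the observed mean difference, with a variance inflated by `rho`
    to account for overlapping training sets across resamples.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean = statistics.fmean(diffs)
    var = statistics.variance(diffs)  # sample variance, n - 1 denominator
    df = n - 1
    scale = math.sqrt((1.0 / n + rho / (1.0 - rho)) * var)

    rng = random.Random(seed)
    a_better = equivalent = b_better = 0
    for _ in range(n_draws):
        # Student t draw: standard normal over sqrt(chi-square / df).
        z = rng.gauss(0.0, 1.0)
        chi2 = rng.gammavariate(df / 2.0, 2.0)
        mu = mean + scale * z / math.sqrt(chi2 / df)
        if mu > rope:
            a_better += 1
        elif mu < -rope:
            b_better += 1
        else:
            equivalent += 1
    return a_better / n_draws, equivalent / n_draws, b_better / n_draws


# Synthetic balanced-accuracy scores, for illustration only.
model_a = [0.96, 0.97, 0.95, 0.98, 0.96]
model_b = [0.87, 0.88, 0.86, 0.90, 0.87]
p_gt, p_eq, p_lt = bayesian_t_test(model_a, model_b, rope=0.01, rho=0.2)
print(f"P(A > B)={p_gt:.3f}  P(A == B)={p_eq:.3f}  P(A < B)={p_lt:.3f}")
```

Under this reading, a row with a large P(A == B) means that most of the posterior mass for the score difference falls inside the ROPE, that is, the two models are practically equivalent at the chosen tolerance.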