TABLE 3.
Performance comparison of KneeXNet with state-of-the-art methods on the test set and an independent dataset.
| Method | Test set: Abnormality | Test set: ACL tear | Test set: Meniscal tear | Independent dataset: Abnormality | Independent dataset: ACL tear | Independent dataset: Meniscal tear |
|---|---|---|---|---|---|---|
| SVM | 0.872 ± 0.013 | 0.841 ± 0.016 | 0.836 ± 0.015 | 0.865 ± 0.014 | 0.833 ± 0.017 | 0.828 ± 0.016 |
| RF | 0.885 ± 0.011 | 0.857 ± 0.014 | 0.849 ± 0.013 | 0.878 ± 0.012 | 0.849 ± 0.015 | 0.841 ± 0.014 |
| GBM | 0.901 ± 0.010 | 0.873 ± 0.012 | 0.865 ± 0.011 | 0.894 ± 0.011 | 0.865 ± 0.013 | 0.857 ± 0.012 |
| 2D CNN | 0.923 ± 0.008 | 0.896 ± 0.010 | 0.889 ± 0.009 | 0.916 ± 0.009 | 0.888 ± 0.011 | 0.881 ± 0.010 |
| 3D CNN | 0.937 ± 0.007 | 0.912 ± 0.009 | 0.905 ± 0.008 | 0.930 ± 0.008 | 0.904 ± 0.010 | 0.897 ± 0.009 |
| Transformer | 0.948 ± 0.006 | 0.925 ± 0.007 | 0.919 ± 0.007 | 0.941 ± 0.007 | 0.917 ± 0.008 | 0.911 ± 0.008 |
| SENet | 0.956 ± 0.005 | 0.934 ± 0.006 | 0.928 ± 0.006 | 0.949 ± 0.006 | 0.926 ± 0.007 | 0.920 ± 0.007 |
| KneeXNet | **0.985 ± 0.003** a,b | **0.972 ± 0.004** a,b | **0.968 ± 0.004** a,b | **0.978 ± 0.004** a,b | **0.964 ± 0.005** a,b | **0.960 ± 0.005** a,b |
| Additional evaluation metrics for KneeXNet | | | | | | |
| Accuracy | 0.968 ± 0.004 | 0.951 ± 0.005 | 0.946 ± 0.006 | 0.960 ± 0.005 | 0.942 ± 0.006 | 0.937 ± 0.007 |
| Precision | 0.972 ± 0.005 | 0.958 ± 0.006 | 0.953 ± 0.006 | 0.964 ± 0.006 | 0.949 ± 0.007 | 0.944 ± 0.007 |
| Recall | 0.979 ± 0.004 | 0.965 ± 0.005 | 0.961 ± 0.005 | 0.971 ± 0.005 | 0.956 ± 0.006 | 0.952 ± 0.006 |
| F1 score | 0.975 ± 0.004 | 0.961 ± 0.005 | 0.957 ± 0.005 | 0.967 ± 0.005 | 0.952 ± 0.006 | 0.948 ± 0.006 |
| Specificity | 0.933 ± 0.008 | 0.918 ± 0.009 | 0.914 ± 0.010 | 0.920 ± 0.009 | 0.905 ± 0.010 | 0.901 ± 0.011 |
| P-values for KneeXNet vs. best competing method (SENet) | | | | | | |
Statistical significance test results:
a. Significantly better than all traditional ML methods (SVM, RF, GBM) with
b. Significantly better than all deep learning methods (2D CNN, 3D CNN, Transformer, SENet) with
The bold values represent the best-performing results for each respective metric or category.
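
The additional KneeXNet metrics in Table 3 can be cross-checked against each other. The following is a minimal sketch, assuming the F1 score is the standard harmonic mean of precision and recall (the table does not state its definition). The names `reported`, `f1_from`, and `check_f1` are illustrative and do not come from the source. Because the table reports values aggregated over runs, the F1 of the mean precision and recall need not equal the mean F1 exactly, so agreement to three decimals is only a sanity check.

```python
# Reported KneeXNet values from Table 3, per column:
# (precision, recall, F1 score). Uncertainty terms are omitted.
reported = {
    "test/abnormality": (0.972, 0.979, 0.975),
    "test/acl_tear": (0.958, 0.965, 0.961),
    "test/meniscal_tear": (0.953, 0.961, 0.957),
    "independent/abnormality": (0.964, 0.971, 0.967),
    "independent/acl_tear": (0.949, 0.956, 0.952),
    "independent/meniscal_tear": (0.944, 0.952, 0.948),
}


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    return 2 * precision * recall / (precision + recall)


def check_f1(rows, tol=1e-3):
    """Map each column to (recomputed F1, reported F1, within tolerance)."""
    result = {}
    for name, (precision, recall, f1_reported) in rows.items():
        f1_calc = f1_from(precision, recall)
        result[name] = (round(f1_calc, 3), f1_reported,
                        abs(f1_calc - f1_reported) < tol)
    return result


if __name__ == "__main__":
    for name, (calc, rep, ok) in check_f1(reported).items():
        print(f"{name}: recomputed={calc:.3f} reported={rep:.3f} consistent={ok}")
```

Under this assumption, every recomputed F1 score agrees with the reported value to three decimal places for all six columns.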