Table 2.
Performance of the lung graph–based machine learning models in the identification of f-ILD on the testing set
| Evaluation level | Method | AUC | Accuracy | Sensitivity | Specificity | PPV | NPV |
|---|---|---|---|---|---|---|---|
| Scan-level | Split 1 | 0.983 | 0.918 | 0.9 | 0.939 | 0.947 | 0.886 |
| | Split 2 | 0.996 | 0.984 | 0.973 | 1 | 1 | 0.963 |
| | Split 3 | 1 | 1 | 1 | 1 | 1 | 1 |
| | Split 4 | 0.965 | 0.908 | 0.897 | 0.923 | 0.946 | 0.857 |
| | Split 5 | 0.913 | 0.841 | 0.941 | 0.743 | 0.78 | 0.929 |
| | Mean | 0.971 ± 0.032 | 0.930 ± 0.057 | 0.942 ± 0.040 | 0.921 ± 0.094 | 0.935 ± 0.081 | 0.927 ± 0.051 |
| Patient-level | Split 1 | 0.969 | 0.881 | 0.84 | 0.941 | 0.955 | 0.8 |
| | Split 2 | 0.99 | 0.976 | 0.958 | 1 | 1 | 0.944 |
| | Split 3 | 1 | 1 | 1 | 1 | 1 | 1 |
| | Split 4 | 0.949 | 0.854 | 0.833 | 0.882 | 0.909 | 0.789 |
| | Split 5 | 0.958 | 0.878 | 0.917 | 0.824 | 0.88 | 0.875 |
| | Mean | 0.973 ± 0.019 | 0.918 ± 0.059 | 0.910 ± 0.065 | 0.929 ± 0.068 | 0.949 ± 0.048 | 0.882 ± 0.081 |
The Mean rows report mean values ± standard deviations over the five random splits. All evaluation results of the proposed method except AUC were calculated using the standard classification decision threshold of 0.5.
AUC, area under the curve; PPV, positive predictive value; NPV, negative predictive value
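
As an illustration of how the threshold-dependent metrics and the Mean rows could be computed, the following is a minimal sketch rather than the authors' code. The function name `threshold_metrics`, the toy labels and probabilities, and the choice of `>=` at the threshold are assumptions. The per-split AUC values are taken from the scan-level rows above; the reported scan-level AUC of 0.971 ± 0.032 appears consistent with the population standard deviation (`ddof=0`), which is NumPy's default.

```python
import numpy as np

def threshold_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity, specificity, PPV, and NPV at a fixed threshold.

    Assumption: a probability equal to the threshold counts as positive.
    Raises ZeroDivisionError if a denominator is empty (e.g. no predicted positives).
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_prob, dtype=float) >= threshold
    tp = int(np.sum(y_pred & y_true))    # true positives
    tn = int(np.sum(~y_pred & ~y_true))  # true negatives
    fp = int(np.sum(y_pred & ~y_true))   # false positives
    fn = int(np.sum(~y_pred & y_true))   # false negatives
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

def mean_pm_std(values):
    """Format per-split values as 'mean ± std' using the population std (ddof=0)."""
    values = np.asarray(values, dtype=float)
    return f"{values.mean():.3f} ± {values.std(ddof=0):.3f}"

# Toy data (hypothetical, not from the study): 3 positives, 3 negatives.
toy_metrics = threshold_metrics(
    y_true=[1, 1, 1, 0, 0, 0],
    y_prob=[0.9, 0.6, 0.4, 0.7, 0.2, 0.1],
)

# Scan-level AUC per split, copied from the table (Splits 1 to 5).
scan_auc = [0.983, 0.996, 1.0, 0.965, 0.913]
scan_auc_summary = mean_pm_std(scan_auc)
```

With the sample standard deviation (`ddof=1`) the scan-level AUC spread would round to 0.035 instead, so the choice of `ddof` matters when comparing against the Mean rows.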