2024 Jan 16;37(1):268–279. doi: 10.1007/s10278-023-00909-7

Table 3.

Performance of the lung graph–based machine learning model and radiologists in the identification of f-ILD on the independent validation set

| Method | Evaluation level | AUC | Accuracy | Sensitivity | Specificity | PPV | NPV |
|---|---|---|---|---|---|---|---|
| Split 1 | Scan-level | 0.998 (0.992, 1.000) | 0.957 (0.915, 0.989) | 0.929 (0.837, 1.000) | 0.981 (0.939, 1.000) | 0.975 (0.917, 1.000) | 0.944 (0.878, 1.000) |
| Split 2 | Scan-level | 0.997 (0.989, 1.000) | 0.957 (0.915, 0.989) | 0.905 (0.810, 0.979) | 1.000 (1.000, 1.000) | 1.000 (1.000, 1.000) | 0.929 (0.855, 0.984) |
| Split 3 | Scan-level | 0.998 (0.992, 1.000) | 0.968 (0.926, 1.000) | 0.952 (0.880, 1.000) | 0.981 (0.932, 1.000) | 0.976 (0.914, 1.000) | 0.962 (0.902, 1.000) |
| Split 4 | Scan-level | 0.997 (0.991, 1.000) | 0.957 (0.915, 0.989) | 0.905 (0.814, 0.978) | 1.000 (1.000, 1.000) | 1.000 (1.000, 1.000) | 0.929 (0.862, 0.983) |
| Split 5 | Scan-level | 0.995 (0.984, 1.000) | 0.968 (0.926, 1.000) | 0.929 (0.844, 1.000) | 1.000 (1.000, 1.000) | 1.000 (1.000, 1.000) | 0.945 (0.879, 1.000) |
| Average | Scan-level | 0.999 (0.994, 1.000) | 0.968 (0.926, 1.000) | 0.929 (0.844, 1.000) | 1.000 (1.000, 1.000) | 1.000 (1.000, 1.000) | 0.945 (0.873, 1.000) |
| Radiologist A | Scan-level | 0.933 (0.879, 0.979) | 0.936 (0.883, 0.979) | 0.905 (0.810, 0.970) | 0.962 (0.902, 1.000) | 0.950 (0.868, 1.000) | 0.926 (0.849, 0.983) |
| Radiologist B | Scan-level | 0.842 (0.769, 0.909) | 0.830 (0.755, 0.904) | 0.952 (0.882, 1.000) | 0.731 (0.607, 0.854) | 0.741 (0.621, 0.857) | 0.950 (0.871, 1.000) |
| Radiologist C | Scan-level | 0.904 (0.846, 0.953) | 0.894 (0.830, 0.947) | 1.000 (1.000, 1.000) | 0.808 (0.692, 0.906) | 0.808 (0.690, 0.906) | 1.000 (1.000, 1.000) |
| Split 1 | Patient-level | 1.000 (1.000, 1.000) | 0.986 (0.959, 1.000) | 0.971 (0.905, 1.000) | 1.000 (1.000, 1.000) | 1.000 (1.000, 1.000) | 0.974 (0.915, 1.000) |
| Split 2 | Patient-level | 0.997 (0.988, 1.000) | 0.973 (0.932, 1.000) | 0.943 (0.861, 1.000) | 1.000 (1.000, 1.000) | 1.000 (1.000, 1.000) | 0.950 (0.881, 1.000) |
| Split 3 | Patient-level | 0.999 (0.995, 1.000) | 0.986 (0.959, 1.000) | 0.971 (0.903, 1.000) | 1.000 (1.000, 1.000) | 1.000 (1.000, 1.000) | 0.974 (0.913, 1.000) |
| Split 4 | Patient-level | 0.998 (0.994, 1.000) | 0.959 (0.918, 1.000) | 0.914 (0.821, 1.000) | 1.000 (1.000, 1.000) | 1.000 (1.000, 1.000) | 0.927 (0.838, 1.000) |
| Split 5 | Patient-level | 0.998 (0.992, 1.000) | 0.973 (0.932, 1.000) | 0.943 (0.857, 1.000) | 1.000 (1.000, 1.000) | 1.000 (1.000, 1.000) | 0.950 (0.870, 1.000) |
| Average | Patient-level | 1.000 (1.000, 1.000) | 0.986 (0.959, 1.000) | 0.971 (0.912, 1.000) | 1.000 (1.000, 1.000) | 1.000 (1.000, 1.000) | 0.974 (0.919, 1.000) |
| Radiologist A | Patient-level | 0.917 (0.855, 0.973) | 0.918 (0.849, 0.973) | 0.886 (0.774, 0.974) | 0.947 (0.872, 1.000) | 0.939 (0.853, 1.000) | 0.900 (0.795, 0.977) |
| Radiologist B | Patient-level | 0.828 (0.742, 0.903) | 0.822 (0.726, 0.904) | 0.971 (0.912, 1.000) | 0.684 (0.525, 0.825) | 0.739 (0.608, 0.860) | 0.963 (0.880, 1.000) |
| Radiologist C | Patient-level | 0.908 (0.844, 0.969) | 0.904 (0.836, 0.973) | 1.000 (1.000, 1.000) | 0.816 (0.688, 0.938) | 0.833 (0.705, 0.944) | 1.000 (1.000, 1.000) |

Statistics in parentheses are 95% confidence intervals (CIs). Evaluation results (except AUC) for the proposed method were calculated using the standard classification decision threshold of 0.5
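The threshold-based columns of the table (accuracy, sensitivity, specificity, PPV, NPV) all derive from the confusion matrix obtained by binarizing the model's scores at 0.5. A minimal sketch of that computation is below; it is not the authors' code, and the function name and dict keys are illustrative.

```python
# Hedged sketch: threshold-based metrics from labels and scores, as described
# in the table footnote (decision threshold 0.5). Assumes positive label = 1
# (f-ILD) and that both classes occur in y_true at the chosen threshold.
def binary_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, specificity, PPV and NPV at a fixed threshold."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true-positive rate on f-ILD cases
        "specificity": tn / (tn + fp),  # true-negative rate on non-f-ILD cases
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }
```

The same function applies at both evaluation levels; only the unit of analysis changes (one score per scan vs. one aggregated score per patient).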

Average, average of the five groups of models; PPV, positive predictive value; NPV, negative predictive value
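The 95% CIs in the table can be produced by bootstrap resampling of the validation cases; the paper does not state its exact CI method here, so the percentile-bootstrap sketch below is an assumption, with illustrative names throughout.

```python
# Hedged sketch: percentile-bootstrap 95% CI for any metric of (labels, scores).
# `metric` is a callable such as an accuracy or AUC function; resamples that
# contain only one class are skipped, since most metrics are undefined there.
import random

def bootstrap_ci(y_true, y_score, metric, n_boot=2000, alpha=0.05, seed=0):
    rng = random.Random(seed)           # fixed seed for reproducibility
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # resample with replacement
        yt = [y_true[i] for i in idx]
        ys = [y_score[i] for i in idx]
        if len(set(yt)) < 2:            # degenerate one-class resample
            continue
        stats.append(metric(yt, ys))
    stats.sort()
    lo = stats[int((alpha / 2) * len(stats))]
    hi = stats[min(int((1 - alpha / 2) * len(stats)), len(stats) - 1)]
    return lo, hi
```

Intervals that come out as (1.000, 1.000) in the table correspond to metrics that equal 1.0 on every resample, which is expected when the model makes no errors of that type on the validation set.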