Table 2.
Construction and performance validation of the machine learning models
| Methods | AUC (95% CI) | DeLong test (P value) | Accuracy (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) | PPV (95% CI) | NPV (95% CI) |
|---|---|---|---|---|---|---|---|
| In training set | | | | | | | |
| GBDT | 0.994 (0.988–1.000) | Ref | 0.994 (0.988–1.000) | 0.996 (0.989–1.000) | 0.992 (0.982–1.000) | 0.992 (0.982–1.000) | 0.996 (0.989–1.000) |
| LR | 0.890 (0.860–0.920) | < 0.001 | 0.834 (0.802–0.866) | 0.810 (0.762–0.857) | 0.858 (0.815–0.900) | 0.852 (0.808–0.896) | 0.817 (0.771–0.863) |
| AdaBoost | 0.918 (0.894–0.941) | < 0.001 | 0.918 (0.894–0.941) | 0.962 (0.939–0.985) | 0.873 (0.833–0.914) | 0.885 (0.848–0.922) | 0.958 (0.932–0.983) |
| SVM | 0.912 (0.888–0.936) | < 0.001 | 0.912 (0.888–0.936) | 0.924 (0.892–0.956) | 0.900 (0.864–0.936) | 0.903 (0.868–0.939) | 0.921 (0.888–0.954) |
| KNN | 0.908 (0.883–0.933) | < 0.001 | 0.908 (0.883–0.933) | 0.916 (0.883–0.950) | 0.900 (0.864–0.936) | 0.903 (0.867–0.938) | 0.914 (0.880–0.948) |
| MLP | 0.948 (0.929–0.967) | < 0.001 | 0.948 (0.929–0.967) | 0.958 (0.934–0.982) | 0.938 (0.909–0.968) | 0.940 (0.912–0.969) | 0.957 (0.932–0.982) |
| In testing set | | | | | | | |
| GBDT | 0.985 (0.966–1.000) | Ref | 0.969 (0.940–0.999) | 1.000 (1.000–1.000) | 0.940 (0.884–0.997) | 0.941 (0.885–0.997) | 1.000 (1.000–1.000) |
| LR | 0.896 (0.841–0.951) | < 0.001 | 0.763 (0.691–0.836) | 0.828 (0.736–0.921) | 0.701 (0.592–0.811) | 0.726 (0.624–0.828) | 0.810 (0.709–0.911) |
| AdaBoost | 0.940 (0.900–0.980) | 0.099 | 0.939 (0.898–0.980) | 0.984 (0.954–1.000) | 0.896 (0.822–0.969) | 0.900 (0.830–0.970) | 0.984 (0.952–1.000) |
| SVM | 0.924 (0.879–0.970) | 0.031 | 0.924 (0.878–0.969) | 0.953 (0.901–1.000) | 0.896 (0.822–0.969) | 0.897 (0.825–0.969) | 0.952 (0.900–1.000) |
| KNN | 0.924 (0.878–0.970) | 0.030 | 0.924 (0.878–0.969) | 0.938 (0.878–0.997) | 0.910 (0.842–0.979) | 0.909 (0.840–0.978) | 0.938 (0.880–0.997) |
| MLP | 0.916 (0.868–0.964) | 0.017 | 0.916 (0.869–0.964) | 0.922 (0.856–0.988) | 0.910 (0.842–0.979) | 0.908 (0.837–0.978) | 0.924 (0.860–0.988) |
Notes: SVM: support vector machine; KNN: K-nearest neighbor; MLP: multi-layer perceptron; LR: logistic regression; GBDT: gradient boosting decision tree; AdaBoost: adaptive boosting; PPV: positive predictive value; NPV: negative predictive value; AUC: area under the curve; CI: confidence interval; Ref: reference
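
For readers who want to see how the threshold-based columns of the table above relate to a confusion matrix, the following is a minimal sketch, not the authors' code. It assumes a normal-approximation (Wald) 95% interval truncated to [0, 1]; the source does not state which interval method was used. The function names `wald_ci` and `binary_metrics` and the counts (TP = 64, FN = 0, FP = 4, TN = 63) are hypothetical and are not reported in the source. The AUC interval and the DeLong comparison are not covered here.

```python
import math


def wald_ci(successes, total, z=1.96):
    """Proportion with a normal-approximation (Wald) interval, clipped to [0, 1].

    Assumption: the Wald interval is used here only for illustration.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def binary_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "accuracy": wald_ci(tp + tn, tp + fn + fp + tn),
        "sensitivity": wald_ci(tp, tp + fn),
        "specificity": wald_ci(tn, tn + fp),
        "ppv": wald_ci(tp, tp + fp),
        "npv": wald_ci(tn, tn + fn),
    }


# Hypothetical counts, chosen for illustration only (not taken from the source).
metrics = binary_metrics(tp=64, fn=0, fp=4, tn=63)
for name, (estimate, lower, upper) in metrics.items():
    print(f"{name}: {estimate:.3f} ({lower:.3f}-{upper:.3f})")
```

With these illustrative counts the printed values agree, after rounding to three decimals, with the GBDT row of the testing set. That agreement is consistent with a Wald interval but does not confirm the authors' method. A proportion of exactly 1 gives a zero-width Wald interval, which is one reason Wilson or exact intervals are often preferred for small samples.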