2024 Apr 29;16(4):2482–2498. doi: 10.21037/jtd-24-416

Figure 3.

Analysis of the machine learning models. (A,B) ROC curves and AUCs for the training and validation sets. The patients with COPD were split into training and validation sets at a ratio of 7:3, and this was repeated 10 times. (C) DCA plot for the validation set, in which all patients were assumed to be frail (black dashed line) or all patients were assumed to be nonfrail (red dashed line). The remaining solid lines represent the different models. (D) The dashed diagonal line represents the reference line, and the solid lines represent the fitted models. The smaller the number in parentheses, the closer the fitted line is to the reference line and the more accurate the model’s predicted values. (E,F) PR curves and APs for the training set and test set. The horizontal axis indicates recall, and the vertical axis indicates precision. When one model’s PR curve is completely enclosed by another’s, the latter performs better; higher APs indicate better model performance. The colors in the image correspond to the individual models, with their respective mean values and 95% confidence intervals displayed. ROC, receiver operating characteristic; XGBoost, extreme gradient boosting; LightGBM, light gradient-boosting machine; AP, average precision; CI, confidence interval; SVM, support vector machine; KNN, k-nearest neighbors; MLP, multilayer perceptron; AUC, area under the curve; COPD, chronic obstructive pulmonary disease; DCA, decision curve analysis; PR, precision-recall.
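To make the DCA comparison in panel (C) concrete, the following is a minimal sketch of how net benefit is commonly computed for a model and for the two reference strategies (treating all patients as frail, or none). It uses the standard decision curve formula, net benefit = TP/n - (FP/n) × pt/(1 - pt), where pt is the threshold probability. The function names and the toy labels and probabilities are hypothetical and are not taken from the study; this is not the authors' code.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of classifying as frail every patient with risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    # True positives: frail patients flagged as frail.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    # False positives: nonfrail patients flagged as frail.
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # Odds of the threshold weight the harm of a false positive.
    weight = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: assume every patient is frail."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)


def net_benefit_treat_none():
    """Reference strategy: assume no patient is frail (always zero)."""
    return 0.0


# Hypothetical toy data (1 = frail, 0 = nonfrail); not from the study.
y_true = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.8, 0.3, 0.6, 0.35, 0.05]

if __name__ == "__main__":
    for pt in (0.1, 0.3, 0.5):
        print(
            f"pt={pt:.1f}",
            f"model={net_benefit(y_true, y_prob, pt):.3f}",
            f"all={net_benefit_treat_all(y_true, pt):.3f}",
            f"none={net_benefit_treat_none():.3f}",
        )
```

On a decision curve, a model is useful at a given threshold when its net benefit lies above both reference lines at that threshold.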