2022 Sep 19;17(9):e0274691. doi: 10.1371/journal.pone.0274691

Table 3. Summary of the performances for the 5- and 10-year follow-up achieved by Model 1, Model 2, and Model 3 over the ten independent test sets.

Follow-up	Model	AUC (%)	Accuracy (%)	Sensitivity (%)	Specificity (%)
5 years	Model 1	65.8 [63.1–68.9]	65.1 [63.4–70.6]	49.2 [42.8–57.1]	70.7 [67.6–74.3]
	Model 2	70.5 [67.7–73.4]	71.1 [67.3–74.5]	50.1 [42.9–63.7]	77.3 [69.3–81.2]
	Model 3	34.3 [29.1–40.6]	35.9 [32.7–40.4]	35.5 [35.3–57.1]	31.9 [28.2–35.1]
	Ensemble Model	77.1 [69.3–78.6]	75.7 [70.3–77.5]	64.0 [55.6–66.6]	75.5 [73.9–84.0]
10 years	Model 1	67.9 [59.8–70.3]	63.2 [58.5–67.9]	62.2 [56.0–66.7]	62.1 [58.8–71.9]
	Model 2	70.7 [59.6–76.7]	66.0 [62.3–69.8]	51.3 [47.6–57.9]	75.8 [64.7–80.6]
	Model 3	34.8 [31.7–43.6]	41.5 [37.7–47.2]	50.0 [42.1–52.4]	38.5 [32.3–46.4]
	Ensemble Model	76.3 [62.8–76.8]	71.3 [68.0–74.9]	66.0 [50.0–71.8]	81.9 [61.3–87.5]

Performance is reported as the median value, in percent, of AUC, accuracy, sensitivity, and specificity over the ten independent test sets. The first and third quartiles, also in percent, are given in brackets. For each metric, the best performance is highlighted in bold.
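
As an illustration of how an entry of the form "median [Q1–Q3]" can be derived from ten per-test-set scores, the following minimal Python sketch may help. The function name `summarize` and the input values are hypothetical and are not taken from the paper. The quartile interpolation method (`"inclusive"`) is also an assumption, since the source does not state which convention was used.

```python
from statistics import median, quantiles


def summarize(values):
    """Format a list of percentage scores as 'median [Q1–Q3]'."""
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
    # The "inclusive" method is an assumption; other conventions
    # give slightly different quartiles on small samples.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return f"{median(values):.1f} [{q1:.1f}–{q3:.1f}]"


# Hypothetical AUC values (%) for ten test sets; NOT data from the paper.
auc_per_test_set = [60.0, 62.0, 64.0, 66.0, 68.0, 70.0, 72.0, 74.0, 76.0, 78.0]

print(summarize(auc_per_test_set))  # prints: 69.0 [64.5–73.5]
```

With only ten values per metric, the choice of quartile convention can shift the bracketed bounds noticeably, so the same convention should be used when comparing against the table.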