2022 Apr 6;11(4):6. doi: 10.1167/tvst.11.4.6

Table 3.

Performances of the Models on the Held-Out Test and Using Cross-Validation

Models	F1 Scores	AUROC	ACC	SP	SN	PPV	NPV
Clinical
Train	78.2	79.3	76.0	88.5	71.7	80.0	72.9
Test	79.7	80.6	80.2	89.7	72.3	80.0	81.8
OCT-based DL
Train	67.1 ± 28.9	77.3 ± 10.3	69.8 ± 3.6	76.9 ± 25.2	65.4 ± 15.6	57.6 ± 14.4	81.4 ± 9.0
Test	61.5 ± 23.7	72.8 ± 14.6	63.9 ± 13.2	70.8 ± 30.2	60.2 ± 17.9	60.2 ± 15.4	76.9 ± 15.4
Hybrid
Train	78.0 ± 1.7	84.1 ± 1.6	76.9 ± 4.2	79.0 ± 16.8	76.4 ± 26.8	74.2 ± 9.4	78.8 ± 7.6
Test	80.4 ± 7.7	81.9 ± 5.2	78.7 ± 2.9	91.3 ± 15.9	67.8 ± 26.9	77.4 ± 4.3	80.8 ± 6.7
Clinical cross-validation
Train	77.0 ± 2.1	82.4 ± 2.7	76.5 ± 5.3	84.7 ± 9.9	64.2 ± 19.9	72.6 ± 9.5	83.2 ± 6.9
Test	81.0 ± 7.1	81.5 ± 11.2	80.3 ± 10.8	97.2 ± 5.0	55.4 ± 23.2	70.4 ± 11.0	86.7 ± 5.9
OCT-based DL cross-validation
Train	74.0 ± 3.7	75.3 ± 6.9	73.5 ± 7.3	85.2 ± 8.5	53.8 ± 21.5	66.8 ± 9.0	79.5 ± 7.3
Test	76.3 ± 6.8	74.8 ± 11.1	74.9 ± 10.3	87.3 ± 11.2	57.2 ± 25.8	70.3 ± 13.5	81.3 ± 14.6
Hybrid cross-validation
Train	76.8 ± 2.6	82.2 ± 3.2	76.4 ± 5.6	80.6 ± 9.8	70.1 ± 20.0	75.7 ± 10.5	80.0 ± 6.0
Test	80.1 ± 7.6	81.7 ± 10.6	79.3 ± 10.7	92.4 ± 9.3	60.2 ± 23.6	72.3 ± 12.6	90.6 ± 10.3

DL, deep learning; AUROC, area under the receiver operating characteristic curve; ACC, accuracy; SP, specificity; SN, sensitivity; PPV, positive predictive value; NPV, negative predictive value.
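For readers who want to see how the threshold-based columns relate to one another, the following is a minimal sketch that derives ACC, SP, SN, PPV, and NPV from the four cells of a binary confusion matrix, using their standard definitions. The class and function names (`ConfusionCounts`, `binary_metrics`) and the example counts are hypothetical and are not taken from the study. The F1 value computed here is the positive-class F1; the table does not state which F1 averaging convention was used, so the reported F1 scores may follow a different one. AUROC is omitted because it requires the continuous model scores, not only the thresholded counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(c: ConfusionCounts) -> dict:
    """Compute the threshold-based metrics, expressed as percentages."""
    total = c.tp + c.fp + c.tn + c.fn
    metrics = {
        "ACC": _ratio(c.tp + c.tn, total),       # accuracy
        "SP": _ratio(c.tn, c.tn + c.fp),         # specificity
        "SN": _ratio(c.tp, c.tp + c.fn),         # sensitivity (recall)
        "PPV": _ratio(c.tp, c.tp + c.fp),        # positive predictive value
        "NPV": _ratio(c.tn, c.tn + c.fn),        # negative predictive value
        # Positive-class F1 (assumption: the study's averaging may differ).
        "F1": _ratio(2 * c.tp, 2 * c.tp + c.fp + c.fn),
    }
    return {name: 100.0 * value for name, value in metrics.items()}


# Illustrative counts only; these are not data from the study.
example = binary_metrics(ConfusionCounts(tp=40, fp=10, tn=45, fn=5))
for name, value in example.items():
    print(f"{name}: {value:.1f}")
```

Because SP and SN are computed within the negative and positive classes respectively, while PPV and NPV depend on class prevalence, a model can show high SP alongside a much lower SN, as several rows of the table do.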

Best means are highlighted.