2022 Oct 14;36(3):296–307. doi: 10.4103/sjopt.sjopt_219_21

Table 5.

Summary of results from the 12 included studies

Algorithm Performance

Study | Sens % | Spec % | Area under the ROC curve (AUROC)

Detecting Disease
  Hu et al. 2019 | 96 | 98 | 0.9922
  Zhang et al. 2018 | 94.1 | 99.3 | 0.998
  Chen et al. 2020
    American Trained Algorithm | NR | NR | 0.99
    Nepal Trained Algorithm | NR | NR | 0.96
    Combined (American & Nepal) Trained Algorithm | NR | NR | 0.99

Study | Sens % | Spec % | AUROC | Sens grading % | Spec grading % | AUROC grading

Detecting Disease & Stage
  Huang et al. 2020 | 96.14±0.87 | 95.95±0.48 | 0.96 | 91.82±2.03 (stage 1) | 94.5±0.71 (stage 1) | 0.93 (stage 1)
  Wang et al. 2018
    Id-Net | 96.64 | 99.33 | 0.995 | n/a | n/a | n/a
    Gr-Net | n/a | n/a | n/a | 88.46 (minor vs. severe) | 92.31 (minor vs. severe) | 0.951 (minor vs. severe)

Study | Sens % | Spec % | AUROC | Sens Pre-Plus % | Spec Pre-Plus % | AUROC Pre-Plus

Detecting Plus Disease
  Brown et al. 2018 | 93 | 94 | 0.98 | 100 | 94 | NR
  Mao et al. 2020 | 95.1 | 97.8 | 0.99 | 92.4 | 97.4 | NR
  Ramachandran et al. 2021 | 99 | 98 | 0.9947 | n/a | n/a | n/a
  Tan et al. 2019 | 96.6 | 98 | 0.993 | n/a | n/a | n/a
  Yildiz et al. 2020 | NR | NR | 0.94 | NR | NR | 0.88
Detecting Plus & Severity
  Wang et al. 2021 | 91.8 | 97 | 0.983 | 98.2 (stage) | 98.5 (stage) | 0.998 (stage)
  Tong et al. 2020 | 71.3 | 90.7 | NR | 77.8 (“normal”, “mild”, “semi-urgent”, “urgent”) | 93.2 (“normal”, “mild”, “semi-urgent”, “urgent”) | NR

Human Performance and External Validation

Study | Human Sens % | Human Spec % | Human Sens grading % | Human Spec grading % | Human AUROC | External Sens % | External Spec % | External AUROC

Detecting Disease
  Hu et al. 2019 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
  Zhang et al. 2018 | 93.5 | 99.5 | n/a | n/a | n/a | n/a | n/a | n/a
  Chen et al. 2020
    American Trained Algorithm | n/a | n/a | n/a | n/a | n/a | 52 | 99 | 0.96
    Nepal Trained Algorithm | n/a | n/a | n/a | n/a | n/a | 44 | 69 | 0.62
    Combined (American & Nepal) Trained Algorithm | n/a | n/a | n/a | n/a | n/a | 98/82 (against American/Nepal set) | 96/99 (against American/Nepal set) | 0.99/0.98 (against American/Nepal set)

Study | Human Sens % | Human Spec % | Human Sens grading % | Human Spec grading % | Human AUROC | External Sens % | External Spec % | External AUROC | External Sens grading % | External Spec grading % | External AUROC grading

Detecting Disease & Stage
  Huang et al. 2020 | NR | NR | NR | NR | NR | n/a | n/a | n/a | n/a | n/a | n/a
  Wang et al. 2018
    Id-Net | NR | NR | NR | NR | NR | 84.91 | 96.9 | NR | n/a | n/a | n/a
    Gr-Net | NR | NR | NR | NR | NR | n/a | n/a | n/a | 93.33 (minor vs. severe) | 73.63 (minor vs. severe) | NR

Study | Human Sens % | Human Spec % | Human Sens grading % | Human Spec grading % | Human AUROC | External Sens % | External Spec % | External AUROC | External Sens Pre-Plus % | External Spec Pre-Plus % | External AUROC Pre-Plus

Detecting Plus Disease
  Brown et al. 2018 | n/a | n/a | n/a | n/a | n/a | 93 | 94 | NR | 100 | 94 | NR
  Mao et al. 2020 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
  Ramachandran et al. 2021 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a
  Tan et al. 2019 | n/a | n/a | n/a | n/a | n/a | 93.9 | 80.7 | NR | 81.4 | 80.7 | 0.977
  Yildiz et al. 2020 | n/a | n/a | n/a | n/a | n/a | NR | NR | 0.99 | NR | NR | 0.97
Detecting Plus & Severity
  Wang et al. 2021 | 100 (compared to J-PROP on same dataset: 100) | 99.8 (compared to J-PROP on same dataset: 98.4) | 91.7 (stage) (compared to J-PROP on same dataset: 97.9) | 99.1 (stage) (compared to J-PROP on same dataset: 97.4) | NR | n/a | n/a | n/a | n/a | n/a | n/a
  Tong et al. 2020 | 74.8 (expert 1), 65.9 (expert 2) (for grading “normal”, “mild”, “semi-urgent”, “urgent”) | 93.4 (expert 1), 92.3 (expert 2) (for grading “normal”, “mild”, “semi-urgent”, “urgent”) | n/a | n/a | NR | n/a | n/a | n/a | n/a | n/a | n/a
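
For readers who want to relate the Sens %, Spec %, and AUROC columns of Table 5 to raw predictions, the following is a minimal Python sketch under the standard definitions: sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), and AUROC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case (ties count as one half). The names `ConfusionCounts`, `sensitivity_pct`, `specificity_pct`, and `auroc` are hypothetical, and the counts and scores are illustrative values, not data from any included study. The included studies may have computed their metrics differently (for example per image, per eye, or per examination).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ConfusionCounts:
    """Hypothetical container for binary classification counts."""
    tp: int  # diseased cases flagged as diseased
    fp: int  # healthy cases flagged as diseased
    tn: int  # healthy cases flagged as healthy
    fn: int  # diseased cases flagged as healthy


def sensitivity_pct(c: ConfusionCounts) -> float:
    """Percentage of diseased cases that were detected."""
    positives = c.tp + c.fn
    if positives == 0:
        raise ValueError("no positive cases")
    return 100.0 * c.tp / positives


def specificity_pct(c: ConfusionCounts) -> float:
    """Percentage of healthy cases that were correctly cleared."""
    negatives = c.tn + c.fp
    if negatives == 0:
        raise ValueError("no negative cases")
    return 100.0 * c.tn / negatives


def auroc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos_scores) * len(neg_scores))


# Illustrative values only (not taken from any study in Table 5).
counts = ConfusionCounts(tp=45, fp=10, tn=90, fn=5)
print(sensitivity_pct(counts))  # 90.0
print(specificity_pct(counts))  # 90.0
print(auroc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))  # 8 of 9 pairs ranked correctly
```

Sensitivity and specificity depend on the chosen decision threshold, whereas AUROC summarizes ranking quality across all thresholds. This is why a row in the table can report an AUROC while listing NR for sensitivity and specificity.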