2020 Aug 1;7:40. doi: 10.1186/s40662-020-00206-2

Table 4.

Prior ROP AI study performance comparison

Reference	Patients (N)	Cases	Images	Labels	Model	Specificity	Sensitivity	Accuracy
Wang et al. [35]	1273	3722	20,795	normal/minor ROP/severe ROP	Id-Net; Gr-Net	99.32% (Id-Net); 92.31% (Gr-Net)	96.62% (Id-Net); 88.46% (Gr-Net)	–
Brown et al. [19]	898	1762	5511	normal/pre-plus disease/plus disease	U-net (Inception-v1)	94% (plus disease); 94% (pre-plus disease)	93% (plus disease); 100% (pre-plus disease)	–
Worrall et al. [34]	35	347	1459	normal/plus disease	GoogleNet; Bayesian CNNs	0.983 (per image); 0.947 (per exam)	0.825 (per image); 0.954 (per exam)	–
Campbell et al. [37]	–	–	77	normal/pre-plus disease/plus disease	i-ROP	–	–	95%
Hu et al. [38]	720	–	3017	normal/mild ROP/severe ROP	Inception-v2; VGG-16; ResNet-50	–	–	0.970 (normal and ROP); 0.840 (mild and severe)

ROP = retinopathy of prematurity; GoogleNet = google inception net; CNN = convolutional neural network; VGG = visual geometry group; ResNet = residual network
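
For readers comparing the Specificity, Sensitivity, and Accuracy columns, the sketch below shows how these three metrics are conventionally derived from binary confusion-matrix counts. It is an illustration only: the function name `binary_metrics` and the counts passed to it are hypothetical and are not taken from any study in Table 4. The cited studies may also have computed their figures per image, per exam, or per class in ways not shown here.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Return sensitivity, specificity and accuracy from confusion-matrix counts.

    tp/fn: diseased cases classified correctly/incorrectly.
    tn/fp: healthy cases classified correctly/incorrectly.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        # Fraction of diseased cases the model flags as diseased.
        "sensitivity": tp / (tp + fn),
        # Fraction of healthy cases the model flags as healthy.
        "specificity": tn / (tn + fp),
        # Fraction of all cases classified correctly.
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts chosen for illustration; not data from Table 4.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```

Because the table reports some values as percentages and others as fractions, multiplying the fractions returned here by 100 puts them on the same scale as the percentage entries.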