2020 Oct 11;27(12):1921–1934. doi: 10.1093/jamia/ocaa139

Table 8.

Summary of performance on MIMIC-III for all FIDDLE-based models, compared to MIMIC-Extract

Task		In-hospital mortality, 48 h (n = 1264)		ARF, 4 h (n = 2358)		ARF, 12 h (n = 2093)		Shock, 4 h (n = 2867)		Shock, 12 h (n = 2612)
Method		AUROC	AUPR	AUROC	AUPR	AUROC	AUPR	AUROC	AUPR	AUROC	AUPR
MIMIC-Extract	LR	0.859 (0.830–0.887)	0.445 (0.358–0.540)	0.777 (0.752–0.803)	0.604 (0.561–0.648)	0.723 (0.683–0.759)	0.250 (0.200–0.313)	0.796 (0.771–0.821)	0.505 (0.454–0.557)	0.748 (0.712–0.784)	0.242 (0.193–0.310)
	RF	0.852 (0.821–0.882)	0.448 (0.359–0.537)	0.821 (0.799–0.843)	0.660 (0.617–0.698)	0.747 (0.709–0.782)	0.289 (0.235–0.356)	0.824 (0.801–0.845)	0.541 (0.488–0.588)	0.778 (0.742–0.812)	0.307 (0.248–0.369)
	CNN	0.851 (0.820–0.879)	0.439 (0.353–0.529)	0.788 (0.763–0.814)	0.633 (0.591–0.672)	0.722 (0.684–0.758)	0.258 (0.207–0.320)	0.798 (0.773–0.824)	0.520 (0.471–0.572)	0.741 (0.704–0.778)	0.247 (0.198–0.317)
	LSTM	0.837 (0.803–0.867)	0.441 (0.358–0.523)	0.796 (0.770–0.822)	0.634 (0.590–0.675)	0.700 (0.661–0.736)	0.229 (0.184–0.286)	0.801 (0.778–0.825)	0.513 (0.463–0.562)	0.753 (0.717–0.791)	0.248 (0.199–0.313)
FIDDLE	LR	0.856 (0.821–0.888)	0.444 (0.357–0.545)	0.817 (0.792–0.839)	0.657 (0.614–0.696)	0.757 (0.720–0.789)	0.291 (0.236–0.354)	0.825 (0.803–0.846)	0.548 (0.501–0.595)	0.792 (0.758–0.824)	0.274 (0.227–0.338)
	RF	0.814 (0.780–0.847)	0.357 (0.279–0.448)	0.817 (0.795–0.839)	0.652 (0.608–0.690)	0.760 (0.726–0.793)	0.317 (0.255–0.382)	0.809 (0.786–0.833)	0.516 (0.467–0.566)	0.773 (0.740–0.806)	0.288 (0.231–0.355)
	CNN	0.886 (0.854–0.916)	0.531 (0.434–0.629)	0.827 (0.803–0.848)	0.666 (0.626–0.705)	0.768 (0.733–0.800)	0.294 (0.238–0.361)	0.831 (0.811–0.851)	0.541 (0.493–0.589)	0.791 (0.758–0.823)	0.295 (0.239–0.361)
	LSTM	0.868 (0.835–0.897)	0.510 (0.411–0.597)	0.827 (0.801–0.846)	0.664 (0.623–0.703)	0.771 (0.737–0.802)	0.326 (0.267–0.397)	0.824 (0.803–0.845)	0.541 (0.497–0.587)	0.792 (0.759–0.823)	0.314 (0.251–0.386)

Note: Results are reported as AUROC and AUPR, with 95% CIs in parentheses, on the respective held-out test set for each of the 5 prediction tasks. For each task (column), the bolded results indicate the best-performing model for either MIMIC-Extract or FIDDLE.
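
The table gives each metric with a 95% CI, but this section does not say how the intervals were obtained. One common approach is a percentile bootstrap over the held-out test set, sketched below. The function names (`auroc`, `average_precision`, `bootstrap_ci`), the default of 1000 resamples, and the use of average precision as the AUPR estimator are assumptions made for illustration, not details taken from the table.

```python
import random


def auroc(y_true, y_score):
    """AUROC via the rank-sum (Mann-Whitney) formulation; tied scores get average ranks."""
    n = len(y_true)
    n_pos = sum(y_true)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")
    pairs = sorted(zip(y_score, y_true))
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def average_precision(y_true, y_score):
    """Step-wise area under the precision-recall curve (average precision)."""
    n = len(y_true)
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive label")
    pairs = sorted(zip(y_score, y_true), key=lambda p: -p[0])
    tp = 0
    ap = 0.0
    i = 0
    while i < n:
        j = i
        tp_here = 0
        while j < n and pairs[j][0] == pairs[i][0]:
            tp_here += pairs[j][1]
            j += 1
        tp += tp_here
        # recall gained at this threshold times the precision at this threshold
        ap += (tp_here / n_pos) * (tp / j)
        i = j
    return ap


def bootstrap_ci(y_true, y_score, metric, n_boot=1000, alpha=0.05, seed=0):
    """Return (point estimate, lower, upper) using a percentile bootstrap."""
    point = metric(y_true, y_score)  # raises early if the labels are degenerate
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[k] for k in idx]
        if 0 < sum(yt) < n:  # skip resamples that contain only one class
            stats.append(metric(yt, [y_score[k] for k in idx]))
    stats.sort()
    lo = stats[int(round((alpha / 2) * (n_boot - 1)))]
    hi = stats[int(round((1 - alpha / 2) * (n_boot - 1)))]
    return point, lo, hi


if __name__ == "__main__":
    # Synthetic labels and scores, purely for demonstration.
    rng = random.Random(42)
    labels = [1 if rng.random() < 0.2 else 0 for _ in range(300)]
    scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]
    for name, fn in (("AUROC", auroc), ("AUPR", average_precision)):
        est, lo, hi = bootstrap_ci(labels, scores, fn, n_boot=200)
        print(f"{name}: {est:.3f} ({lo:.3f}-{hi:.3f})")
```

Intervals computed this way widen as the test set shrinks or as positives become rarer, which is one plausible reason the AUPR intervals in the table are wider than the AUROC intervals for the same task.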

ARF: acute respiratory failure; AUROC: area under the receiver operating characteristic curve; AUPR: area under the precision-recall curve; CI: confidence interval; CNN: convolutional neural network; FIDDLE: Flexible Data-Driven Pipeline; LR: logistic regression; LSTM: long short-term memory network; RF: random forest.