PLoS One. 2018 Apr 3;13(4):e0192726. doi: 10.1371/journal.pone.0192726

Table 3. Patient-level performance evaluation for predicting clinical heart failure or severe tissue pathology from H&E stained whole-slide images for validation folds of the training data set.

| Metric | Random Forest | Deep Learning | p-value |
|---|---|---|---|
| Image-level results | | | |
| Accuracy | 0.869 ± 0.05 | 0.954 ± 0.03 | 0.05 |
| Sensitivity | 0.866 ± 0.07 | 0.968 ± 0.03 | n.s. |
| Specificity | 0.872 ± 0.04 | 0.943 ± 0.05 | n.s. |
| Positive predictive value | 0.848 ± 0.05 | 0.935 ± 0.05 | 0.05 |
| AUC | 0.944 ± 0.04 | 0.977 ± 0.02 | 0.05 |
| Patient-level results | | | |
| Accuracy | 0.923 ± 0.03 | 0.962 ± 0.02 | n.s. |
| Sensitivity | 0.917 ± 0.07 | 0.979 ± 0.04 | n.s. |
| Specificity | 0.930 ± 0.06 | 0.947 ± 0.05 | n.s. |
| Positive predictive value | 0.919 ± 0.07 | 0.942 ± 0.06 | n.s. |
| AUC | 0.963 ± 0.05 | 0.960 ± 0.05 | n.s. |

Results are presented as the mean ± SD of three models. Each model was trained on ~770 images from ~70 patients and evaluated on its validation fold of ~35 patients. The patient-level diagnosis is the majority vote over all images from a single patient. Statistical significance was assessed with an unpaired two-sample t-test (N = 3 folds); n.s., not significant.
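
The two procedures named in the footnote, per-patient majority voting and an unpaired two-sample t-test across three folds, can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the patient IDs, the predictions, and the per-fold accuracies are all hypothetical. Two further points are assumptions because the source does not state them: ties in the vote go to the positive class, and the t-test is the equal-variance (Student) form.

```python
from collections import Counter, defaultdict
from math import sqrt
from statistics import mean, stdev


def patient_level_votes(image_predictions):
    """Collapse (patient_id, predicted_label) pairs into one label per patient.

    The label predicted for most of a patient's images wins. Tie handling is
    not described in the source; here a tie goes to the positive class (1),
    which is an assumption.
    """
    by_patient = defaultdict(list)
    for patient_id, label in image_predictions:
        by_patient[patient_id].append(label)

    diagnoses = {}
    for patient_id, labels in by_patient.items():
        counts = Counter(labels)
        diagnoses[patient_id] = 1 if counts[1] >= counts[0] else 0
    return diagnoses


def unpaired_t_statistic(a, b):
    """Student's unpaired two-sample t statistic (equal variances assumed).

    Returns the statistic for mean(b) - mean(a) and its degrees of freedom.
    """
    n_a, n_b = len(a), len(b)
    dof = n_a + n_b - 2
    pooled_var = ((n_a - 1) * stdev(a) ** 2 + (n_b - 1) * stdev(b) ** 2) / dof
    t = (mean(b) - mean(a)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, dof


# Hypothetical image-level predictions: (patient_id, predicted_label).
predictions = [
    ("p1", 1), ("p1", 1), ("p1", 0),
    ("p2", 0), ("p2", 0), ("p2", 1),
    ("p3", 1),
]
diagnoses = patient_level_votes(predictions)
print(diagnoses)

# Hypothetical per-fold accuracies for two classifiers (three folds each).
fold_acc_a = [0.80, 0.82, 0.84]
fold_acc_b = [0.90, 0.92, 0.94]
t_stat, dof = unpaired_t_statistic(fold_acc_a, fold_acc_b)

# With 4 degrees of freedom, the two-sided 5% critical value is about 2.776.
significant = abs(t_stat) > 2.776
print(round(t_stat, 3), dof, significant)
```

With only three folds per group the test has four degrees of freedom, so the difference in means must be large relative to the fold-to-fold spread before it clears the 5% threshold.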