2011 Oct 21;52(11):8316–8322. doi: 10.1167/iovs.10-7012

Table 7.

Testing Performance on Dataset B, Based on the Pathology Classifiers Trained on Dataset A

| Performance on Dataset B  | NM    | MH    | ME    | AMD   |
|---------------------------|-------|-------|-------|-------|
| AUC                       | 0.978 | 0.969 | 0.941 | 0.975 |
| Best balanced accuracy, % | 95.5  | 97.3  | 90.5  | 95.2  |

The ground truth for this experiment was defined by the consensus of the two experts for both datasets. The consensus includes 96.9%, 95.4%, 88.0%, and 90.5% of the 326 scans in dataset A, and 94.7%, 100%, 90.0%, and 84.7% of the 131 scans in dataset B, for NM, MH, ME, and AMD, respectively. The number of positive cases versus total cases is shown.
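For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how AUC and best balanced accuracy can be computed from classifier scores. It assumes the common definitions: AUC as the probability that a positive case scores higher than a negative one, and best balanced accuracy as the maximum over decision thresholds of the mean of sensitivity and specificity. The table does not state how the authors computed them, so this is an illustration, not their procedure. The function names, scores, and labels are hypothetical and are not data from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_balanced_accuracy(scores, labels):
    """Maximum over thresholds of (sensitivity + specificity) / 2."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = 0.0
    for t in sorted(set(scores)):
        # Predict positive when score >= t.
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        best = max(best, 0.5 * (tp / n_pos + tn / n_neg))
    return best


# Hypothetical classifier outputs for one pathology (1 = present, 0 = absent).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

print(f"AUC = {auc(scores, labels):.4f}")
print(f"Best balanced accuracy = {100 * best_balanced_accuracy(scores, labels):.1f}%")
```

Balanced accuracy weights sensitivity and specificity equally, so it is less affected than plain accuracy when positive and negative cases are unevenly represented.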