Skip to main content
. 2021 Oct 7;4:145. doi: 10.1038/s41746-021-00520-6

Fig. 5. Effectiveness metric.

Fig. 5

We consider a data set, D, to be more effective than another, E, if a model trained on D has a higher validation accuracy than a model trained on E, and both D and E cost the same to annotate (i.e., both have the same number of data points and annotations). To this end, we use the AUC of a model’s validation accuracy vs number of training samples as a measure of effectiveness, with a higher AUC indicating higher effectiveness and the ratio in AUC between two curves allowing a comparison between two models. In the example above (use-case: tumor-infiltrating lymphocytes), the data set generated with AI augmentation has an absolute validation accuracy improvement of 0.11, 0.11, and 0.05, over a data set generated without AI augmentation, for 50, 75, and 100 training samples, respectively. The AUC ratio of the two curves is 5.3%.