Author manuscript; available in PMC: 2020 Aug 11.
Published in final edited form as: Nat Med. 2019 Jul 15;25(8):1301–1309. doi: 10.1038/s41591-019-0508-1

Extended Data Fig. 8 | The publicly shared MSK breast cancer metastases dataset is representative of the full MSK breast cancer metastases test set.

We created an additional dataset matching the size of the CAMELYON16 challenge test set (130 slides) by subsampling the full MSK breast cancer metastases test set, ensuring that the models achieved similar performance on both datasets. Left, the model was trained on MSK data with our proposed method (MIL-RNN) and tested on: the full MSK breast data test set (n = 1,473; AUC = 0.968); the public MSK dataset (n = 130; AUC = 0.965); and the test set of the CAMELYON16 challenge (n = 129; AUC = 0.898). Right, the model was trained on CAMELYON16 data with supervised learning (ref. 18) and tested on: the test set of the CAMELYON16 challenge (n = 129; AUC = 0.932); the full MSK breast data test set (n = 1,473; AUC = 0.731); and the public MSK dataset (n = 130; AUC = 0.737). Error bars represent 95% confidence intervals for the true AUC calculated by bootstrapping each test set.
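As an illustration of how error bars of this kind can be obtained, the following minimal sketch computes a bootstrap confidence interval for the AUC of a single test set. It rests on assumptions that the legend does not state: a percentile bootstrap, slide-level resampling with replacement, and binary labels coded as 0 and 1. The function names `auc` and `bootstrap_auc_ci` and the parameters `n_boot`, `alpha` and `seed` are hypothetical, and this is not the authors' code.

```python
import random


def auc(labels, scores):
    """AUC as the Mann-Whitney statistic: the fraction of (positive, negative)
    pairs in which the positive slide scores higher; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative slide")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method).

    Slides are resampled with replacement; resamples that contain only one
    class are skipped because the AUC is undefined for them.
    """
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


# Tiny worked example: 3 of the 4 (positive, negative) pairs are ordered correctly.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Calling `bootstrap_auc_ci(labels, scores)` on the slide-level labels and predicted tumour probabilities of one test set returns the lower and upper bounds of the interval. The pairwise AUC computation is quadratic in the number of slides, which is acceptable for test sets of the sizes reported here but would be replaced by a rank-based implementation for much larger ones.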