

Author manuscript; available in PMC: 2018 Sep 14.

Published in final edited form as: Nature. 2017 Nov 8;551(7681):517–520. doi: 10.1038/nature24473

Extended Data Figure 5 | Predictions were performed using subsampled IEDB epitope sequences, with the subsampling rate varying between 0.1 and 0.9. For each rate, 10,000 iterations were performed to obtain a distribution of log-rank test scores. The violin plots represent data density at a given value on the vertical axis (n = 10,000). Solid black lines mark the log-rank test score of the prediction on the full set of epitope sequences, and thick gray lines mark the median scores on subsampled data. a–c, Subsampling of the original set of IEDB sequences, supported by positive T-cell assays, shows that the quality of predictions decreases with subsampling rate. Prediction quality is more robust in the Snyder et al. and Rizvi et al. datasets. d–f, An analogous subsampling procedure was repeated on IEDB sequences not supported by positive T-cell assays. For the Van Allen et al. and Snyder et al. datasets, model performance is substantially lowered.
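The subsampling procedure described in the caption can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the toy cohort, the peptide labels, the `toy_score` stand-in for the prediction model, and the helper names `logrank_chi2` and `subsample_scores` are all hypothetical, and the sketch assumes the subsampling rate is the fraction of epitope sequences retained. Only the overall shape (repeated subsampling per rate, one log-rank score per iteration, then a median) follows the caption; the iteration count is reduced from 10,000 to keep the example fast.

```python
import random
import statistics


def logrank_chi2(times, events, groups):
    """Two-group log-rank chi-square statistic (groups coded 0/1)."""
    obs_minus_exp = 0.0
    variance = 0.0
    # Walk over each distinct time at which at least one event occurred.
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [g for ti, g in zip(times, groups) if ti >= t]
        n = len(at_risk)
        n1 = sum(at_risk)  # number at risk in group 1
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e and g == 1)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def subsample_scores(epitopes, score_fn, rate, n_iter, seed=0):
    """Score n_iter random subsets, each holding a fraction `rate` of epitopes."""
    rng = random.Random(seed)
    k = max(1, round(rate * len(epitopes)))
    return [score_fn(rng.sample(epitopes, k)) for _ in range(n_iter)]


# Hypothetical toy cohort: (survival time, event observed, patient peptides).
cohort = [
    (5, 1, {"A", "B"}), (8, 1, {"A"}), (12, 1, {"C"}), (20, 0, {"C", "D", "E"}),
    (3, 1, set()), (15, 1, {"D", "E"}), (25, 0, {"B", "C", "D"}), (7, 1, {"F"}),
]
# Hypothetical epitope set standing in for the IEDB sequences.
epitopes = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "J"]


def toy_score(epitope_subset):
    """Stand-in model: split patients by epitope matches, then log-rank score."""
    subset = set(epitope_subset)
    hits = [len(peptides & subset) for _, _, peptides in cohort]
    cutoff = statistics.median(hits)
    groups = [1 if h > cutoff else 0 for h in hits]
    times = [t for t, _, _ in cohort]
    events = [e for _, e, _ in cohort]
    return logrank_chi2(times, events, groups)


# Score on the full epitope set (the solid black line in the figure).
full_score = toy_score(epitopes)

# Median score per subsampling rate (the thick gray lines in the figure).
medians = {}
for rate in [0.1, 0.3, 0.5, 0.7, 0.9]:
    scores = subsample_scores(epitopes, toy_score, rate, n_iter=200)
    medians[rate] = statistics.median(scores)
```

Comparing each entry of `medians` against `full_score` mirrors the comparison the violin plots make visually; plotting the full list of scores per rate would give the distributions themselves.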