Figure 1. Uniquorn workflow.
CCLs from a reference library are compared to a given query sample q based on their sets of small variants (variant profiles). Variants are weighted according to their prevalence within the library (e.g. CCLE), and frequent variants are then excluded. Subsequently, Uniquorn computes a confidence score quantifying the likelihood that each reference sample r is identical to q. Large differences in the number of variants between q and r affect the statistical test that assesses whether q and r are similar. Therefore, a regularization step calculates the minimum number of matching variants required to predict that q and r are related.
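
The weighting and exclusion steps of the workflow can be illustrated with a minimal sketch. It is not Uniquorn's implementation: the inverse-frequency weight, the `min_weight` cutoff, the toy library, and all function names are assumptions chosen for illustration, and a simple weighted overlap fraction stands in for the statistical test and confidence score. The regularization step is not covered.

```python
from collections import Counter


def variant_weights(library):
    """Assumed weighting: 1 / (number of library samples carrying the variant)."""
    counts = Counter(v for profile in library.values() for v in profile)
    return {v: 1.0 / n for v, n in counts.items()}


def exclude_frequent(weights, min_weight):
    """Drop variants whose weight falls below the (hypothetical) cutoff."""
    return {v: w for v, w in weights.items() if w >= min_weight}


def weighted_overlap(query, reference, weights):
    """Fraction of the reference's retained variant weight that is found in the query."""
    retained = {v for v in reference if v in weights}
    total = sum(weights[v] for v in retained)
    if total == 0:
        return 0.0
    return sum(weights[v] for v in retained if v in query) / total


# Toy reference library: sample name -> set of variant identifiers (made up).
library = {
    "A": {"v1", "v2", "v3"},
    "B": {"v1", "v4"},
    "C": {"v1", "v5", "v6"},
}
query = {"v1", "v2", "v3", "v7"}

# v1 occurs in all three samples (weight 1/3) and is excluded by the cutoff.
weights = exclude_frequent(variant_weights(library), min_weight=0.5)
scores = {name: weighted_overlap(query, profile, weights)
          for name, profile in library.items()}
```

In this toy example the shared variant `v1` carries no information about identity, so after its exclusion only sample A, which shares the rare variants `v2` and `v3` with the query, receives a non-zero score.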