A,B. (Top panel) Distribution of distances within (blue) and between (gray) cell types for all replication profiles, computed over consensus fingerprinting domains, in human (A) and mouse (B). (Bottom panel) Number of classification errors as a function of the distance ratio cutoff. The optimal cutoff (θ) is the value that minimizes classification errors; distances above θ are hypothesized to originate from different cell types. C,D. Human dataset classification results for the standard kNN method (Standard), for leave-one-out cross-validation (LOOCV), and with each cell type excluded from training (LCTO). For LOOCV, each experiment (e.g., BG01ES.R1) is classified using 20 regions selected with that experiment left out. For LCTO, an experiment is either labeled as the most similar cell type in the training set or, when its distance exceeds θ, correctly classified as “Unseen”. Experimental replicates are denoted with the suffixes ‘R1’, ‘R2’, etc., and are described in Table S1.
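
The cutoff selection described for the bottom panels can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy distance values, and the choice to test only observed distances as candidate cutoffs are assumptions made for illustration. An error is counted whenever a within-type distance lies above the cutoff or a between-type distance lies at or below it.

```python
def classification_errors(within, between, cutoff):
    """Count pairs misclassified at a given cutoff (assumed error definition)."""
    # Same-type pairs whose distance exceeds the cutoff are wrongly called different.
    false_different = sum(d > cutoff for d in within)
    # Different-type pairs at or below the cutoff are wrongly called the same.
    false_same = sum(d <= cutoff for d in between)
    return false_different + false_same


def optimal_cutoff(within, between):
    """Return the observed distance that minimizes classification errors."""
    # Assumption: only observed distances are tried as candidate cutoffs.
    candidates = sorted(set(within) | set(between))
    # min() keeps the first (smallest) candidate when several tie.
    return min(candidates, key=lambda t: classification_errors(within, between, t))


def classify(nearest_distance, nearest_label, cutoff):
    """Label a profile as its nearest training type, or "Unseen" above the cutoff."""
    return nearest_label if nearest_distance <= cutoff else "Unseen"


# Hypothetical distances, not taken from the paper.
within = [0.10, 0.15, 0.20, 0.45]
between = [0.40, 0.60, 0.70, 0.90]
theta = optimal_cutoff(within, between)
```

With these toy values, `classify(0.5, "BG01ES", theta)` returns `"Unseen"`, mirroring the LCTO case in which the true cell type is absent from the training set.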