Figure 1.
Results of PSRCC DD Identification on lymph_lung and large_upper data sets
x-axis: types of sample pairs, grouped by whether the two samples share a class and a patient. y-axis: PSRCC (Pairwise Spearman Rank Correlation Coefficient) value of each sample pair. Gray dots are not PSRCC DDs (data doppelgängers), whereas purple dots are PSRCC DDs. PSRCC DDs are sample pairs in “Same Class Different Patient” with a PSRCC value greater than the cut-off. The cut-off is the maximum PSRCC of any sample pair in “Different Class Different Patient.” The cut-off PSRCC is higher in large_upper (B) than in lymph_lung (A). In total, 1,034 PSRCC DDs were identified within lymph_lung (A), whereas 144 PSRCC DDs were identified within large_upper (B).
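The identification rule described in this caption can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`ranks`, `spearman`, `find_psrcc_dds`) and the input layout (a dictionary mapping a sample name to its class label, patient ID, and expression profile) are assumptions made for illustration. The sketch also assumes that at least one “Different Class Different Patient” pair exists and that no profile is constant.

```python
from itertools import combinations
from math import sqrt


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def find_psrcc_dds(samples):
    """Hypothetical helper; samples maps name -> (class_label, patient_id, profile)."""
    scored = []
    for a, b in combinations(samples, 2):
        (class_a, patient_a, profile_a) = samples[a]
        (class_b, patient_b, profile_b) = samples[b]
        same_class = class_a == class_b
        same_patient = patient_a == patient_b
        scored.append((a, b, same_class, same_patient, spearman(profile_a, profile_b)))

    # Cut-off: maximum PSRCC among "Different Class Different Patient" pairs.
    cutoff = max(r for _, _, sc, sp, r in scored if not sc and not sp)

    # PSRCC DDs: "Same Class Different Patient" pairs above the cut-off.
    dds = [(a, b, r) for a, b, sc, sp, r in scored if sc and not sp and r > cutoff]
    return cutoff, dds
```

Calling `find_psrcc_dds` on such a dictionary returns the cut-off together with the list of flagged pairs, which correspond to the purple dots in the figure.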