iScience. 2022 Jul 19;25(8):104788. doi: 10.1016/j.isci.2022.104788

Table 5.

Summary of identified PPCC DDs (Pairwise Pearson’s correlation coefficient data doppelgängers) before and after balancing

| DataSet | Number of PPCC DDs in Unbalanced | Number of PPCC DDs in Balanced | Batch Imbalance Ratio | Description of PPCC DDs |
|---|---|---|---|---|
| DMD | 54 (6.25%) | 47 (5.44%) | 1.5 | 2 additional PPCC DDs in the balanced case, 9 additional PPCC DDs in the unbalanced case |
| Leukemia | 6 (0.174%) | 9 (0.260%) | 1.5 | 3 additional PPCC DDs in the balanced case |
| ALL | 41 (2.96%) | 22 (1.59%) | 1.27 | 9 additional PPCC DDs in the unbalanced case |

The first column, “DataSet,” names the data set described in each row. The next two columns give the number of PPCC DDs in the unbalanced and balanced cases, respectively, with the proportion of PPCC DDs (the percentage of all sample pairs that are PPCC DDs) in brackets. The “Batch Imbalance Ratio” column denotes the extent of batch imbalance in each data set; it is calculated by dividing the size of the larger batch by the size of the smaller batch. The final column, “Description of PPCC DDs,” records notable observations of PPCC DDs in both cases.
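
The two derived quantities in the table, the batch imbalance ratio and the bracketed proportion of PPCC DDs, can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the batch sizes (30 and 20) and the total of 864 sample pairs are assumed values chosen only so that the outputs match the form of the DMD row. The table itself does not state batch sizes or the total number of sample pairs.

```python
def batch_imbalance_ratio(batch_sizes):
    """Size of the larger batch divided by the size of the smaller batch."""
    larger, smaller = max(batch_sizes), min(batch_sizes)
    if smaller <= 0:
        raise ValueError("batch sizes must be positive")
    return larger / smaller


def ppcc_dd_proportion(n_ppcc_dd_pairs, n_sample_pairs):
    """Percentage of all sample pairs that are PPCC DDs."""
    if n_sample_pairs <= 0:
        raise ValueError("number of sample pairs must be positive")
    return 100.0 * n_ppcc_dd_pairs / n_sample_pairs


# Assumed, illustrative inputs (not taken from the table):
ratio = batch_imbalance_ratio([30, 20])    # two hypothetical batches
proportion = ppcc_dd_proportion(54, 864)   # 864 is an assumed pair count

print(f"Batch imbalance ratio: {ratio:.2f}")
print(f"PPCC DDs: 54 ({proportion:.2f}%)")
```

A ratio of 1 would indicate perfectly balanced batches; larger values indicate that one batch dominates.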