Each of the five datasets began with 5,051,776 genomic windows of 600 bp. For each dataset, we first retained only windows that had nonzero counts in at least three DNA samples in the methylated condition
and three DNA samples in the unmethylated condition (i.e. 6 DNA samples total; ‘DNA filter’). We then reduced the dataset to windows that had nonzero counts in at least three RNA samples in either the methylated
or unmethylated condition (‘RNA filter’). Finally, we retained only windows that showed high repeatability across DNA samples, following
Lea et al., 2018 (‘DNA repeatability’). Numbers give the windows, in millions, that passed or failed the filter to which each arrow points. Note that one mSTARR RNA-seq sample in the baseline condition [sample ID
L31250] was removed from further analysis because it had an unusually high proportion of zero counts in the testable windows; we therefore also removed the corresponding paired DNA sample prior to analysis. See
Supplementary file 9 for the precise window numbers corresponding to the plots.
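
The two count-based filters described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the dictionary layout, and the toy counts are all hypothetical. It also assumes that the RNA filter requires at least three nonzero samples within a single condition (methylated or unmethylated). The DNA repeatability step is omitted because its criteria are defined in Lea et al., 2018 rather than here.

```python
# Hypothetical sketch of the DNA and RNA count filters; all names are illustrative.

def n_nonzero(counts):
    """Return the number of samples with a nonzero count for one window."""
    return sum(1 for c in counts if c > 0)

def passes_dna_filter(dna_meth, dna_unmeth, min_samples=3):
    """DNA filter: nonzero counts in >= min_samples DNA samples in BOTH conditions."""
    return (n_nonzero(dna_meth) >= min_samples
            and n_nonzero(dna_unmeth) >= min_samples)

def passes_rna_filter(rna_meth, rna_unmeth, min_samples=3):
    """RNA filter (assumed reading): >= min_samples nonzero RNA samples in EITHER condition."""
    return (n_nonzero(rna_meth) >= min_samples
            or n_nonzero(rna_unmeth) >= min_samples)

def filter_windows(windows):
    """Apply the DNA filter, then the RNA filter; return IDs of surviving windows."""
    after_dna = {
        wid: w for wid, w in windows.items()
        if passes_dna_filter(w["dna_meth"], w["dna_unmeth"])
    }
    return [
        wid for wid, w in after_dna.items()
        if passes_rna_filter(w["rna_meth"], w["rna_unmeth"])
    ]

# Toy per-window counts (made up), four samples per condition.
windows = {
    # Passes both filters: 3 and 4 nonzero DNA samples, 3 nonzero unmethylated RNA samples.
    "w1": {"dna_meth": [4, 2, 1, 0], "dna_unmeth": [3, 5, 2, 1],
           "rna_meth": [0, 0, 1, 0], "rna_unmeth": [2, 3, 1, 0]},
    # Fails the DNA filter: only 2 nonzero methylated DNA samples.
    "w2": {"dna_meth": [1, 1, 0, 0], "dna_unmeth": [3, 5, 2, 1],
           "rna_meth": [1, 1, 1, 1], "rna_unmeth": [1, 1, 1, 1]},
    # Passes the DNA filter but fails the RNA filter: 2 nonzero RNA samples per condition.
    "w3": {"dna_meth": [1, 1, 1, 1], "dna_unmeth": [1, 2, 3, 0],
           "rna_meth": [1, 1, 0, 0], "rna_unmeth": [0, 2, 2, 0]},
}

surviving = filter_windows(windows)
```

In this toy input only window `w1` survives both filters. The DNA filter uses a logical AND across conditions, whereas the RNA filter (under the assumption stated above) uses a logical OR.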