Fig. 3 |
Systematic benchmarking of cell-type-specific TF-RE binding potential and cis-regulatory potential. A, E. ROC curve and PR curve of binding potential for ETS1 in naïve CD4+ T cells. The ground truth for A to E is the ChIP-seq data of ETS1 in naïve CD4+ T cells. The colors in A to E represent the competing methods for predicting TF-RE regulation: orange represents LINGER, green represents the PCC between the expression of the TF and the chromatin accessibility of the RE, and blue represents the motif binding affinity of the TF to the RE. B, C. Violin plots of the AUC and AUPR ratio values of binding potential across diverse TFs and cell types. The ground truth is ChIP-seq data for 10 TFs from different cell types in blood. In this study, we use the following convention for symbols indicating statistical significance: ns: p > 0.05; *: p <= 0.05; **: p <= 0.01; ***: p <= 0.001; ****: p <= 0.0001. We hide the ns symbol when displaying significance levels. D. F1 score of binding potential. Each point represents one ground truth dataset. The p-values for D, H, and K are based on one-sided paired t-tests. F, G. AUC and AUPR ratio of cis-regulatory potential in naïve CD4+ T cells. The ground truth for F to H is promoter capture Hi-C data. RE-TG pairs are divided into six distance groups ranging from 0–5 kb to 100–200 kb. PCC is calculated between the expression of the TG and the chromatin accessibility of the RE. Distance denotes the decay function of the distance to the TSS. Random denotes the uniform distribution. H. F1 score of cis-regulatory potential in naïve CD4+ T cells for LINGER and SCENIC+. I, J. AUC and AUPR ratio of cis-regulatory potential. The ground truth is eQTL data from six immune cell types. K. F1 score of cis-regulatory potential in naïve B cells. The ground truth is eQTL data from naïve B cells. This figure corresponds to the PBMC data.
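The significance convention stated in the caption can be written as a small lookup. The following is a minimal sketch, not the authors' plotting code; the function name `significance_symbol` and the `hide_ns` parameter are hypothetical and introduced only for illustration.

```python
def significance_symbol(p_value: float, hide_ns: bool = True) -> str:
    """Map a p-value to the star notation described in the caption.

    Hypothetical helper: the name and the hide_ns flag are assumptions,
    only the thresholds come from the caption.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    # Check thresholds from most to least stringent so that the
    # strongest applicable symbol is returned.
    thresholds = [(0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*")]
    for cutoff, symbol in thresholds:
        if p_value <= cutoff:
            return symbol
    # p > 0.05: not significant; the caption hides this label by default.
    return "" if hide_ns else "ns"
```

With `hide_ns=True`, a non-significant comparison yields an empty label, which matches the caption's statement that the ns symbol is not displayed.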
