Author manuscript; available in PMC: 2021 Nov 29.
Published in final edited form as: Nat Genet. 2021 May 6;53(6):869–880. doi: 10.1038/s41588-021-00861-8

Figure 5. Nucleotide sequence and epigenetic determinants of HbF regulatory sequences.

(a) Genome browser screenshot of the β-like globin gene cluster showing the distribution of BPRSHbF in relation to various epigenetic signals. The tracks are color-coded as follows: red, 3D chromatin organization; dark red, chromatin accessibility (ATAC-seq); blue, key TF occupancy; brown, histone modifications.

(b) Box plots comparing epigenetic signal distribution in ABE-mutated adenines with high (>30) (n = 313) and low (<10) BPRSHbF (n = 9,268). The y-axis represents log-transformed normalized signals. P values were determined using a two-tailed unpaired Wilcoxon test. Boxes depict the interquartile range, the central line indicates the median, and whiskers indicate the minimum and maximum values.
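As an illustration of the test named in panel (b), the following is a minimal sketch of a two-tailed unpaired Wilcoxon rank-sum test in plain Python. It assumes a normal approximation with tie correction and no continuity correction; the caption does not state which variant or software the authors used. The function names `midranks` and `rank_sum_test` and the input values are hypothetical, not taken from the study.

```python
import math


def midranks(values):
    """Return 1-based ranks, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(high, low):
    """Two-tailed unpaired Wilcoxon rank-sum test.

    Returns (U statistic for `high`, two-tailed P value), using a normal
    approximation with tie correction (an assumption of this sketch).
    """
    n1, n2 = len(high), len(low)
    n = n1 + n2
    pooled = list(high) + list(low)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    # Tie correction: sum of (t^3 - t) over groups of tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u1, 1.0  # all values identical; no evidence of a difference
    z = (u1 - mean_u) / math.sqrt(var_u)
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2.0))
    return u1, p_two_tailed


# Toy signals (hypothetical): every "high" value exceeds every "low" value.
u, p = rank_sum_test([5.0, 6.0, 7.0, 8.0], [1.0, 2.0, 3.0, 4.0])
print(u, p)
```

With group sizes as unbalanced as those in panel (b), the normal approximation is typically adequate, but an exact or library implementation may differ slightly in the reported P value.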

(c) The significance of differences in specific functional genomic features between ABE-mutated adenines with high and low BPRSHbF, ranked by −log10(P value). P values were determined using a two-tailed Wilcoxon test.

(d) Comparison of CRE prediction models. The left panel summarizes the performance of different models for resolving high- and low-BPRSHbF adenines according to the distance between them. The y-axis represents the area under the receiver operating characteristic curve (AUROC). The x-axis represents the threshold of minimal distance. The two panels on the right show details of the prediction performance, represented by AUROC curves at different window sizes. RF refers to the random forest model that we developed. n = 400 iterations of cross-validation. The bar plot shows the mean and 95% confidence interval of the AUROC distribution.
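For readers unfamiliar with the metric in panel (d), the sketch below computes AUROC from its pairwise definition: the probability that a randomly chosen positive (here, a high-BPRSHbF adenine) receives a higher model score than a randomly chosen negative, with ties counted as one half. The function name `auroc` and the scores are hypothetical and only illustrate the metric; this is not the authors' evaluation pipeline.

```python
def auroc(pos_scores, neg_scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Toy model scores (hypothetical): 8 of the 9 pairs are ranked correctly.
high_scores = [0.9, 0.8, 0.4]
low_scores = [0.5, 0.3, 0.2]
print(auroc(high_scores, low_scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of samples, so a rank-based formulation would be preferable for large inputs.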