2015 Oct 10;43(18):8694–8712. doi: 10.1093/nar/gkv865

Figure 2.

Evaluation of different feature encodings and classification algorithms for enhancer-promoter interaction prediction. (A) Area under the precision-recall curve (AUPR) for all four cell lines and the three classification approaches tested: the Random Forests classifier, a regularized linear regression approach (LASSO) and a regularized logistic regression approach (LASSOGLM). A higher bar indicates better classification performance. (B) Top features selected by Random Forests and Group Lasso. For Random Forests, the feature importance is the out-of-bag error when the feature is among the top 20, and 0 otherwise; for Group Lasso, it is the absolute value of the regression coefficient. (C) AUPRs for different combinations of data sets. ALL Common: all 23 data sets; GLASSO: 13 data sets selected by Group Lasso; RF: 17 data sets selected by Random Forests feature ranking; RF_GLASSO_intersect: 12 data sets in the intersection of those selected by Group Lasso and Random Forests; H3k27ac+H3k4me2+Exp: 3 data sets comprising H3K27ac, H3K4me2 and RNA-seq-based gene expression levels.
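
For readers unfamiliar with the AUPR metric reported in panels (A) and (C), the following is a minimal sketch of one common way to estimate it: the step-wise "average precision" over a ranked list of predictions. The caption does not say how the authors computed AUPR, so this is an illustration under that assumption and not their procedure. The function name `average_precision` and the toy labels and scores are hypothetical.

```python
def average_precision(labels, scores):
    """Step-wise estimate of the area under the precision-recall curve.

    labels: sequence of 0/1 values (1 = true enhancer-promoter interaction).
    scores: classifier scores, higher meaning more likely to interact.
    Ties in score keep their input order (Python's sort is stable).
    """
    # Rank all candidate pairs from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(label for _, label in ranked)
    if n_pos == 0:
        raise ValueError("need at least one positive example")

    true_pos = 0
    area = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            # Precision at the rank where recall increases by 1 / n_pos.
            area += true_pos / rank
    return area / n_pos


# Toy data (hypothetical): four candidate pairs, two of them true interactions.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
toy_aupr = average_precision(toy_labels, toy_scores)
```

A classifier that ranks every true interaction above every non-interaction scores 1.0 under this estimate, which is the sense in which a higher bar in the figure indicates a better method.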