Results in the tables summarize 100 random resamplings of genes; in each resampling, 1,000 randomly selected genes were held out to test the models trained on the remaining genes. “NA” indicates that the model performed no better than random prediction. A) Gene expression is represented as a binary variable (whether or not a gene is expressed), and logistic regression is used for the modeling. The three best values are highlighted in bold. B) Gene expression is represented as a continuous variable (how much a gene is expressed), so linear regression is used instead. Numbers indicate Pearson’s correlation coefficients between predicted and observed expression levels. Island = within or outside a CpG island; Meth = average DNA methylation; AUC = area under the ROC curve; ACC = accuracy of prediction; SENS = sensitivity; SPEC = specificity; PPV = positive predictive value; NPV = negative predictive value. “All genes” includes active, inactive, and partially active genes.
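
For readers less familiar with the abbreviated metrics, the following minimal sketch shows how ACC, SENS, SPEC, PPV, and NPV are conventionally derived from a confusion matrix, and how a Pearson correlation between predicted and observed values can be computed. It is an illustration only, not the code used for the analysis: the function names `classification_metrics` and `pearson_r` are hypothetical, labels are assumed to be coded as 1 (expressed) and 0 (not expressed), and the example values are made up for demonstration.

```python
from math import sqrt


def classification_metrics(observed, predicted):
    """Compute ACC, SENS, SPEC, PPV and NPV from binary labels.

    Assumption: labels are coded 1 (expressed) and 0 (not expressed).
    """
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # true positives
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)  # true negatives
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # false positives
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # false negatives

    def ratio(numerator, denominator):
        # A metric is undefined when its denominator is zero.
        return numerator / denominator if denominator else None

    return {
        "ACC": ratio(tp + tn, len(pairs)),
        "SENS": ratio(tp, tp + fn),
        "SPEC": ratio(tn, tn + fp),
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
    }


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up example values, for illustration only.
observed = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(observed, predicted)
```

AUC is omitted from the sketch because it requires ranking the predicted probabilities rather than thresholded labels.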