2018 Oct 5;14(10):e1006451. doi: 10.1371/journal.pcbi.1006451

Table 3. Linear model results of log₂OR ~ GC content + TFBS frequency + GC content × TFBS frequency + ε.

| Dataset | Region (n, R²) | Variable | Estimate | Standard error | P-value | SSR |
|---|---|---|---|---|---|---|
| Human | Enhancer (n = 4321, R² = 0.127) | GC content | 0.56 | 0.024 | < 1×10⁻¹⁵ | 0.11 |
| | | TFBS frequency | 0.44 | 0.11 | 0.00012 | 0.003 |
| | | GC × TFBS | NS | | | |
| | Promoter (n = 1342, R² = 0.360) | GC content | 6.40 | 0.30 | < 1×10⁻¹⁵ | 0.21 |
| | | TFBS frequency | 22.65 | 2.74 | < 1×10⁻¹⁵ | 0.03 |
| | | GC × TFBS | -41.84 | 4.61 | < 1×10⁻¹⁵ | 0.04 |
| Mouse | Enhancer (n = 4423, R² = 0.0287) | GC content | 0.25 | 0.026 | < 1×10⁻¹⁵ | 0.020 |
| | | TFBS frequency | 0.87 | 0.097 | < 1×10⁻¹⁵ | 0.018 |
| | | GC × TFBS | NS | | | |
| | Promoter (n = 1615, R² = 0.372) | GC content | 5.23 | 0.23 | < 1×10⁻¹⁵ | 0.21 |
| | | TFBS frequency | 18.79 | 1.89 | < 1×10⁻¹⁵ | 0.038 |
| | | GC × TFBS | -37.82 | 3.09 | < 1×10⁻¹⁵ | 0.059 |

We used LASSO-selected species sequence determinants for these analyses. NS indicates that the interaction term was not statistically significant at P = 0.05; in such cases we fitted the additive model log₂OR ~ GC content + TFBS frequency + ε instead of the original model. The number of sequence determinants (n), the R² value of each model, and the Type III partial sum of squares in regression (SSR) for each variable are also provided.
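
The model-selection rule described in this note (fit the model with the GC × TFBS interaction, and fall back to the additive model when the interaction is not significant) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `fit_ols` and `fit_with_fallback` are hypothetical, the inputs are assumed to be plain lists of per-determinant GC content, TFBS frequency, and log2OR values, and the P-values use a large-sample normal approximation rather than the exact t distribution.

```python
import math

def solve_inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def fit_ols(rows, y):
    """Ordinary least squares; `rows` are design-matrix rows that include the intercept."""
    n, p = len(rows), len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    inv = solve_inverse(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    ybar = sum(y) / n
    tss = sum((yi - ybar) ** 2 for yi in y)
    sigma2 = rss / (n - p)  # residual variance estimate
    se = [math.sqrt(sigma2 * inv[i][i]) for i in range(p)]
    # Two-sided P-values from a normal approximation (an assumption of this
    # sketch; reasonable only when n is large relative to p).
    pvals = [math.erfc(abs(b / s) / math.sqrt(2)) for b, s in zip(beta, se)]
    return {"beta": beta, "se": se, "p": pvals, "r2": 1 - rss / tss}

def fit_with_fallback(gc, tfbs, log2or, alpha=0.05):
    """Fit log2OR ~ GC + TFBS + GC x TFBS; drop the interaction if it is not significant."""
    full = [[1.0, g, t, g * t] for g, t in zip(gc, tfbs)]
    fit = fit_ols(full, log2or)
    if fit["p"][3] < alpha:
        fit["terms"] = ["intercept", "GC", "TFBS", "GC x TFBS"]
        return fit
    additive = [[1.0, g, t] for g, t in zip(gc, tfbs)]
    fit = fit_ols(additive, log2or)
    fit["terms"] = ["intercept", "GC", "TFBS"]
    return fit
```

The returned dictionary holds the estimates, standard errors, P-values, and R² in the order given by `terms`, which corresponds to the Estimate, Standard error, and P-value columns of the table; the Type III SSR column is not computed in this sketch.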