Author manuscript; available in PMC: 2019 Mar 19.
Published in final edited form as: Biochemistry. 2018 Dec 21;58(11):1539–1551. doi: 10.1021/acs.biochem.7b01069

Figure 4. Predictive modeling of σ70 promoter strength.

A) We trained a log-linear model on 50% of the data; its predictions on the remaining 50% explain approximately 80% of the variance in expression within our dataset. B) We analyzed the model by ANOVA and found that approximately 73.7% of the variance in promoter expression can be explained by the −10 and −35 elements and their interaction. C) We also trained a simple neural network model, whose predictions captured an estimated 95.5% of the variance in promoter expression, indicating that such models are better able to capture complex interactions between sequence elements. D) We trained the same neural network models with 10-fold cross-validation and show that promoter expression can be predicted effectively even when the model is trained on as little as 5% of the data. In panels A, C, and D, R² is the coefficient of determination between predicted and actual expression values on the held-out datasets.
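The held-out evaluation described for panel A can be illustrated with a minimal sketch. This is not the authors' code or data: the promoter library here is synthetic, the element effect sizes and noise level are invented for illustration, and all names (`design_matrix`, `r_squared`, `N_LEVELS`, and so on) are hypothetical. It assumes that "log-linear" means an additive linear model fitted to log-transformed expression, with each sequence element encoded as a categorical variable.

```python
import numpy as np

N_LEVELS = 8  # assumed number of variants per element (illustrative only)


def design_matrix(minus35, minus10, n_levels=N_LEVELS):
    """Intercept plus one-hot columns for each element (first level dropped)."""
    eye = np.eye(n_levels)
    return np.hstack([
        np.ones((len(minus35), 1)),
        eye[minus35][:, 1:],  # -35 element indicators
        eye[minus10][:, 1:],  # -10 element indicators
    ])


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic library: each promoter is a (-35 variant, -10 variant) pair.
rng = np.random.default_rng(0)
n_promoters = 2000
minus35 = rng.integers(0, N_LEVELS, size=n_promoters)
minus10 = rng.integers(0, N_LEVELS, size=n_promoters)

# Invented additive effects on log expression, plus measurement noise.
effect35 = np.linspace(-1.5, 1.5, N_LEVELS)
effect10 = np.linspace(-2.0, 2.0, N_LEVELS)
log_expr = effect35[minus35] + effect10[minus10] + rng.normal(0, 0.3, n_promoters)

# 50/50 split: fit on one half, evaluate on the held-out half.
order = rng.permutation(n_promoters)
train, test = order[: n_promoters // 2], order[n_promoters // 2:]

X = design_matrix(minus35, minus10)
coef, *_ = np.linalg.lstsq(X[train], log_expr[train], rcond=None)
r2_heldout = r_squared(log_expr[test], X[test] @ coef)

print(f"held-out R^2 = {r2_heldout:.3f}")
```

Because R² is computed only on promoters the model never saw during fitting, it measures predictive performance rather than goodness of fit. The sketch omits interaction terms between elements; adding them would mean appending product columns to the design matrix.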