PLoS Genet. 2014 Jun 26;10(6):e1004407. doi: 10.1371/journal.pgen.1004407

Figure 4. Modeling intron features uncovers design principles and allows the prediction of gene expression in a synthetic system.


A) Assembly process of the sequence-based predictor of gene expression: in every iteration, the feature contributing the highest correlation to the reporter expression measurements was added. The first eight features and their descriptions are presented. B) Bar diagram of the predictor's cumulative correlation with expression levels of YiFP variants as a function of the number of added features. C) A predictor function based on 3, 13, or 38 features was able to explain 49%, 77%, and 90% of gene expression variation, respectively (for 13 features: p<2.2e-16; empirical p<5e-03). D) Cross-validation of the predictor assembly method using training and test sets, with 80% and 20% of introns, respectively, demonstrated a predictive power of 50% (for >15 features: 0.37<r<0.5; p<3.6e-02). E) A new predictor assembled using strains with introns inserted at several locations in the YFP maintains 80% of the model's predictive power (r = 0.38; p = 0.036), suggesting that although some of the regulatory splicing information is not located in intronic regions, our methodology is able to predict intron regulation under several exon contexts.
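The iterative assembly in panel A and the 80%/20% evaluation in panel D can be illustrated with a minimal sketch of greedy forward feature selection. The caption does not state the model form, so this sketch assumes an ordinary least-squares linear predictor and Pearson correlation as the selection score. The function names (`greedy_forward_selection`, `holdout_r`) and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np

def fit(X, y):
    """Least-squares linear fit with an intercept (assumed model form)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(X, coef):
    """Apply fitted coefficients (intercept first) to a feature matrix."""
    return np.column_stack([np.ones(X.shape[0]), X]) @ coef

def pearson_r(a, b):
    return float(np.corrcoef(a, b)[0, 1])

def greedy_forward_selection(X, y, n_features):
    """At each iteration, add the feature whose inclusion gives the highest
    correlation between fitted and measured expression."""
    selected, cumulative_r = [], []
    remaining = list(range(X.shape[1]))
    for _ in range(min(n_features, X.shape[1])):
        best_j, best_r = None, -np.inf
        for j in remaining:
            cols = selected + [j]
            r = pearson_r(predict(X[:, cols], fit(X[:, cols], y)), y)
            if r > best_r:
                best_j, best_r = j, r
        if best_j is None:  # no candidate produced a finite correlation
            break
        selected.append(best_j)
        remaining.remove(best_j)
        cumulative_r.append(best_r)
    return selected, cumulative_r

def holdout_r(X, y, n_features, test_fraction=0.2, seed=0):
    """Select and fit on a training split, then report the correlation
    on the held-out split (a single 80/20 split by default)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    n_test = int(round(test_fraction * len(y)))
    test, train = idx[:n_test], idx[n_test:]
    selected, _ = greedy_forward_selection(X[train], y[train], n_features)
    coef = fit(X[train][:, selected], y[train])
    return pearson_r(predict(X[test][:, selected], coef), y[test])

# Synthetic stand-in data: 200 hypothetical intron variants, 10 features,
# with expression driven by features 2 and 7 plus noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 10))
y_demo = 3.0 * X_demo[:, 2] + 1.5 * X_demo[:, 7] + rng.normal(scale=0.5, size=200)

order, r_curve = greedy_forward_selection(X_demo, y_demo, n_features=5)
print("selection order:", order)
print("cumulative r:", np.round(r_curve, 3))
print("held-out r:", round(holdout_r(X_demo, y_demo, n_features=3), 3))
```

The in-sample cumulative correlation (as in panel B) can only stay level or rise as features are added, which is why a held-out estimate like the one in panel D is the more informative measure of predictive power.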