Table 3.
sequence set | # bases | mean # of each triplet | G/C content (%) | # significant codon usage changes | codon usage towards At | codon usage away from At | codons over represented | codons under represented | significant changes per aa |
At mRNAs | 10,755,859 | 56,020 | 43.32 | n.a. | n.a. | n.a. | n.a. | n.a. | n.a. |
Pp ORFs | 7,638,122 | 39,782 | 49.94 | n.a. | n.a. | n.a. | n.a. | n.a. | n.a. |
retained genes | 77,998 | 406 | 50.30 | 7 | 1 | 6 | 2 | 5 | Phe under represented |
paralogs | 1,115,937 | 5,812 | 50.07 | 3 | 1 | 2 | 1 | 2 | none |
orthologs | 953,293 | 4,965 | 49.04 | 10 | 8 | 2 | 4 | 6 | Pro under represented |
sum | | | | | 10 | 10 | 7 | 13 | |
The predicted Physcomitrella ORFs were used as background to check for significant changes in percentage codon fraction usage in the orthologs, paralogs and retained genes (best BLAST hit not among plants). Where a codon deviated significantly from the total set (two times the average absolute deviation, AAD), the direction of the change relative to the Arabidopsis codon usage was checked. Significant deviations are shown enlarged. At = Arabidopsis thaliana; Pp = Physcomitrella patens.
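The significance test described in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions: that "codon fraction usage" means the percentage of each codon among the synonymous codons of its amino acid, that the AAD is the mean absolute deviation from the background taken over all 64 codons, and that a change counts as "towards At" when the subset value lies closer to the Arabidopsis value than the background value does. The names `codon_fractions` and `significant_changes` are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_fractions(seqs):
    """Percentage of each codon among the synonymous codons of its amino acid."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TO_AA:  # skips codons with ambiguous bases
                counts[codon] += 1
    aa_totals = Counter()
    for codon, n in counts.items():
        aa_totals[CODON_TO_AA[codon]] += n
    return {
        codon: 100.0 * counts[codon] / aa_totals[aa] if aa_totals[aa] else 0.0
        for codon, aa in CODON_TO_AA.items()
    }


def significant_changes(subset, background, reference):
    """Codons whose deviation from the background exceeds 2 x AAD.

    Assumption: AAD is the mean absolute deviation over all 64 codons.
    Each hit is labelled over/under relative to the background and
    towards/away relative to the reference (here: Arabidopsis) usage.
    """
    dev = {c: subset[c] - background[c] for c in CODON_TO_AA}
    aad = sum(abs(d) for d in dev.values()) / len(dev)
    result = {}
    for codon, d in dev.items():
        if aad > 0 and abs(d) > 2 * aad:
            closer = abs(subset[codon] - reference[codon]) < abs(
                background[codon] - reference[codon]
            )
            result[codon] = (
                "over" if d > 0 else "under",
                "towards" if closer else "away",
            )
    return result
```

With real data, `background` would be computed from all predicted Pp ORFs, `subset` from the orthologs, paralogs or retained genes, and `reference` from the At mRNAs. Counting the returned labels per sequence set would give figures of the kind tabulated above, subject to the stated assumptions.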