Abstract
The relationship between gene length and synonymous codon usage bias was investigated in Drosophila melanogaster, Escherichia coli and Saccharomyces cerevisiae. Simulation studies indicated that the correlations observed in the three organisms are unlikely to be due to sampling error or to any bias in the methods used to measure codon usage bias. The correlation was significantly positive in E. coli genes, whereas negative correlations were obtained for D. melanogaster and S. cerevisiae genes. When the analysis was restricted to ribosomal protein genes, whose expression levels are assumed to be similar, both E. coli and S. cerevisiae showed significantly positive correlations. For the two eukaryotes, the distribution of the effective number of codons in short genes (300-500 bp) differed from that in longer genes; this difference was not observed in E. coli. Both positive and negative correlations can be explained by translational selection. Longer genes, which are energetically costly, have higher codon usage bias to maximize translational efficiency. Selection may also act to reduce the size of highly expressed proteins, an effect that is particularly pronounced in eukaryotes. The different relationships between codon usage bias and gene length observed in prokaryotes and eukaryotes may be the consequence of these two types of selection.
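As an illustration of the quantity named above, the effective number of codons can be sketched in a few lines of Python. This is a minimal sketch of Wright's Nc statistic under the standard genetic code, not the procedure used in this study: the function name `effective_number_of_codons`, the choice to raise an error when a degeneracy class is absent, and the cap at the uniform-usage ceiling are all assumptions made for illustration. Nc ranges from 20 (one codon per amino acid, maximal bias) to 61 (uniform usage, no bias).

```python
from collections import Counter, defaultdict

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# Number of synonymous codons per amino acid (stop codons excluded).
DEGENERACY = Counter(aa for aa in CODON_TO_AA.values() if aa != "*")


def effective_number_of_codons(cds):
    """Return Wright's Nc for an in-frame coding sequence (DNA string)."""
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    counts = defaultdict(Counter)  # amino acid -> codon -> count
    for codon in codons:
        aa = CODON_TO_AA.get(codon)  # None for codons with ambiguous bases
        if aa is not None and aa != "*":
            counts[aa][codon] += 1

    # Codon homozygosity F per amino acid, grouped by degeneracy class.
    f_by_class = defaultdict(list)
    for aa, codon_counts in counts.items():
        k = DEGENERACY[aa]
        n = sum(codon_counts.values())
        if k == 1 or n < 2:
            continue  # Met/Trp carry no information; n < 2 leaves F undefined
        sum_p2 = sum((c / n) ** 2 for c in codon_counts.values())
        f_by_class[k].append((n * sum_p2 - 1) / (n - 1))

    nc = 2.0  # Met and Trp each contribute one codon
    for k, n_amino_acids in ((2, 9), (3, 1), (4, 5), (6, 3)):
        if not f_by_class[k]:
            # Assumption of this sketch: refuse rather than impute the class.
            raise ValueError(f"no usable {k}-fold amino acid in sequence")
        f_mean = sum(f_by_class[k]) / len(f_by_class[k])
        ceiling = n_amino_acids * k  # value under perfectly uniform usage
        # Simplification: cap each class at its uniform-usage ceiling.
        nc += min(n_amino_acids / f_mean, ceiling) if f_mean > 0 else ceiling
    return nc
```

Pairing each gene's length in base pairs with its Nc value would then give the two variables whose correlation is discussed above. Short sequences yield few codons per amino acid, so the homozygosity estimates become noisy, which is one reason sampling error has to be ruled out before interpreting such correlations.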