Table 1.
Indices for codon usage analysis
Index | Principle | Mathematical formula | Scores | Reference |
---|---|---|---|---|
P2 index | Quantifies the proportion of codons that conform to the intermediate strength of the codon–anticodon interaction energy | $P2 = \dfrac{WWC + SSU}{WWY + SSY}$, where W = A or U, S = G or C, Y = C or U, and each three-letter term is the number of codons whose three positions match that pattern | Under uniform codon usage, P2 equals 0.5. A value above 0.5 indicates strong codon bias, and a value below 0.5 indicates no codon bias | Gouy and Gautier (1982) |
Relative synonymous codon usage (RSCU) | Ratio of the observed frequency of a codon to the frequency expected under uniform usage of the synonymous codons | $RSCU_{ij} = \dfrac{g_{ij}}{\frac{1}{n_i}\sum_{k=1}^{n_i} g_{ik}}$, where $g_{ij}$ is the observed number of the jth codon for the ith amino acid, which has $n_i$ synonymous codons | An RSCU value of 1 indicates no codon usage bias for that amino acid: its synonymous codons are used equally or at random. Values above 1 indicate positive codon usage bias (the codon is used more often than expected), and values below 1 indicate negative codon usage bias | Sharp et al. (1986) |
Effective number of codons (ENC) | Measures how far the codon usage of a gene departs from equal usage of synonymous codons | $ENC = 2 + \dfrac{9}{\bar{F}_2} + \dfrac{1}{\bar{F}_3} + \dfrac{5}{\bar{F}_4} + \dfrac{3}{\bar{F}_6}$, where $\bar{F}_k$ (k = 2, 3, 4 and 6) is the average of the $F$ values for the k-fold degenerate amino acids. $F$ for each amino acid can be estimated as $F = \dfrac{m \sum_{i} (m_i/m)^2 - 1}{m - 1}$, where $m$ is the total number of codons for that amino acid and $m_i$ is the number of occurrences of the ith codon for that amino acid | ENC values range between 20 and 61: a value of 20 indicates an extremely biased gene that uses only one codon for each amino acid, while a value of 61 indicates an unbiased gene | Wright (1990) |
Chi-squared index | Measures the divergence of the observed codon counts from the counts expected under the null hypothesis of equal synonymous codon usage | $\chi^2 = \sum_{i} \sum_{j=1}^{f_i} \dfrac{(o_{ij} - e_i)^2}{e_i}$, where $o_{ij}$ is the number of occurrences of the jth codon for the ith amino acid, $e_i$ is the expected usage of each codon of the ith amino acid under equal synonymous codon usage, and $f_i$ is the degeneracy of the codons for the ith amino acid | The probability (P) is estimated from the upper tail of the chi-square distribution for the calculated χ² value. If P is less than 0.05, the observed codon usage differs significantly from equal synonymous usage and the null hypothesis is rejected | Shields et al. (1988) |
Frequency of optimal codons (Fop) | Ratio of the number of optimal codons in a gene to the total number of synonymous codons, with optimal codons defined from a specific reference gene set | $F_{op} = \dfrac{N_{opt}}{N_{tot}}$, where $N_{opt}$ = number of optimal codons and $N_{tot}$ = number of synonymous codons | Fop values range between 0 and 1.0: a value of 0 indicates that the gene contains no optimal codons, and a value of 1.0 indicates that the gene is composed entirely of optimal codons | Ikemura (1985) |
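Codon adaptation index (CAI) | Geometric mean of the relative adaptiveness of each codon in a gene, measured against the codon usage of a reference set of highly expressed genes and calculated from RSCU values | $\omega_i = \dfrac{f_{ij}}{\max(f_j)}$ and $CAI = \left(\prod_{k=1}^{L} \omega_k\right)^{1/L}$, where $\omega_i$ is the relative adaptiveness of codon i, $f_{ij}$ is the frequency of codon i encoding amino acid j in the reference set, $\max(f_j)$ is the frequency of the most used codon for amino acid j, $\omega_k$ is the relative adaptiveness of the kth codon of the gene, and $L$ is the length of the gene in codons | CAI values range between 0 and 1: values near 0 indicate weak adaptation to the reference set and a potentially low expression level of the gene, whereas a value of 1 indicates extreme codon bias and a potentially high expression level of the gene | Sharp and Li (1987) |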
Codon bias index (CBI) | Measures codon usage bias as the excess of optimal codons over random expectation, based on the codon usage of a specific reference set of genes | $CBI = \dfrac{N_{opt} - N_{ran}}{N_{tot} - N_{ran}}$, where $N_{opt}$ = number of optimal codons, $N_{ran}$ = number of optimal codons expected if codons were chosen at random, and $N_{tot}$ = number of synonymous codons | CBI values range between 0 and 1: a value of 0 indicates random codon usage, whereas a value of 1 indicates extreme codon bias | Bennetzen and Hall (1982) |
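
As an illustration of how two of the indices in Table 1 can be computed, the following is a minimal Python sketch of RSCU and ENC. It is not taken from any of the cited papers or from an existing package: the function names (`count_codons`, `rscu`, `enc`) are hypothetical, the standard genetic code is assumed, RNA input is converted to the DNA alphabet (U to T), and the ENC function omits the fallback for amino-acid classes that are absent from the gene, raising an error in that case instead.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (assumption), codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Amino acid -> list of its synonymous codons (stop codons excluded).
SYNONYMS = defaultdict(list)
for _codon, _aa in CODON_TO_AA.items():
    if _aa != "*":
        SYNONYMS[_aa].append(_codon)


def count_codons(seq: str) -> Counter:
    """Count sense codons in an in-frame coding sequence (DNA or RNA)."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return Counter(c for c in codons if CODON_TO_AA.get(c, "*") != "*")


def rscu(counts: Counter) -> dict:
    """RSCU = observed count / (amino-acid total / number of synonyms)."""
    values = {}
    for codons in SYNONYMS.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the gene: RSCU undefined
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


def enc(counts: Counter) -> float:
    """ENC = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6, with F averaged per class."""
    f_by_class = defaultdict(list)
    for codons in SYNONYMS.values():
        k = len(codons)
        m = sum(counts[c] for c in codons)
        if k == 1 or m < 2:
            continue  # Met/Trp carry no information; F needs m > 1
        homozygosity = sum((counts[c] / m) ** 2 for c in codons)
        f_by_class[k].append((m * homozygosity - 1) / (m - 1))
    total = 2.0  # the two non-degenerate amino acids (Met, Trp)
    for k, n_amino_acids in ((2, 9), (3, 1), (4, 5), (6, 3)):
        f_values = f_by_class.get(k)
        if not f_values:
            raise ValueError(f"no usable {k}-fold degenerate amino acid")
        mean_f = sum(f_values) / len(f_values)
        if mean_f <= 0:
            raise ValueError(f"mean F for the {k}-fold class is not positive")
        total += n_amino_acids / mean_f
    return min(61.0, total)  # small samples can overshoot the maximum of 61


# Usage: four leucine codons, three CUG and one UUA, out of six synonyms.
example = rscu(count_codons("CUGCUGCUGUUA"))
print(example["CTG"], example["TTA"])  # 4.5 1.5
```

In this toy input, CUG is used 4.5 times as often as expected under uniform usage, while the four leucine codons that never occur receive an RSCU of 0. For ENC, a gene that uses a single codon for every amino acid gives F = 1 in every class and therefore the minimum value of 20.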