2002 May 1;30(9):2076–2082. doi: 10.1093/nar/30.9.2076

Table 1. Base-composition statistics for RNAs and genomes.

                        No. of sequences used  (G+C)%      ρ(CG)        (G–C)% difference  (A–T)% difference
(A) Average RNA base-composition statistics
M.jannaschii            48                     63.1 (7.3)  0.75 (0.24)  8.1 (9.7)          –3.3 (12.9)
Plasmodium              59                     32.1 (7.2)  0.94 (0.56)  12.7 (6.3)         –1.6 (4.1)
C.elegans               59                     53.5 (8.2)  0.96 (0.23)  6.8 (10.1)         –9.6 (11.4)
H.sapiens               186                    48.7 (9.1)  0.60 (0.41)  7.5 (11.8)         –5.8 (13.0)
(B) Genomic base-composition statistics
M.jannaschii                                   31.4 (6.9)  0.34 (0.47)  1.4 (36.9)         –0.34 (18.8)
P.falciparum Chr. II                           20.0 (8.4)  0.75 (1.3)   0.73 (34.5)        –1.7 (24.0)
C.elegans Chr. I                               35.9 (8.8)  1.03 (0.68)  0.65 (25.0)        –0.61 (19.6)

This table compares the mean base-composition statistics of ncRNAs with those of the genomic background in M.jannaschii, P.falciparum and C.elegans. SDs are shown in parentheses. Statistics for RNAs of several Plasmodium species were averaged together because only a limited number of P.falciparum RNA sequences are available in the RNA databases.

(A) Average base-composition statistics among ncRNAs. The low (G+C)% value for Plasmodium (32.1%) contrasts strikingly with the high (>48%) values for the other genomes. The (G–C)% differences are positive and the (A–T)% differences negative, possibly as a result of G-U ‘wobble’ pairs in the ncRNAs.

(B) Base-composition statistics for three test genomes: M.jannaschii, P.falciparum chromosome II and C.elegans chromosome I (results for other C.elegans and P.falciparum chromosomes were similar; data not shown). The genomic mean (G+C)% differs from the RNA value for M.jannaschii and C.elegans, and the genomic mean ρ(CG) differs from the RNA value for M.jannaschii. Dinucleotide frequencies other than ρ(CG) did not show systematic differences between RNAs and the genomic background (data not shown). The genomic (G–C)% and (A–T)% differences are very close to zero (as expected from ‘Chargaff’s Second Law’), unlike the RNA values shown in (A). However, the SDs of the (G–C)% and (A–T)% differences are large relative to their means, implying that it would be difficult to use these differences to distinguish RNAs from the background.
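As an illustration of the quantities tabulated above, the following Python sketch computes the four statistics for a single sequence and expresses an RNA-versus-genome difference in units of the genomic SD. The table legend does not give the formulas, so the definitions used here are assumptions: ρ(CG) is taken as the conventional dinucleotide odds ratio f(CG)/(f(C)f(G)), and the (G–C)% and (A–T)% differences are taken as skews normalized by G+C and A+T, respectively. The function names `composition_stats` and `separation_in_sds` are hypothetical and are not taken from the paper.

```python
import math


def composition_stats(seq):
    """Base-composition statistics for one sequence.

    ASSUMED definitions (the table legend does not state them):
      rho(CG)           = f(CG) / (f(C) * f(G))
      (G-C)% difference = 100 * (G - C) / (G + C)
      (A-T)% difference = 100 * (A - T) / (A + T)
    """
    seq = seq.upper().replace("U", "T")  # treat RNA and DNA alike
    a, c, g, t = (seq.count(base) for base in "ACGT")
    n = a + c + g + t  # ambiguous symbols are ignored in base counts
    if n < 2:
        raise ValueError("need at least two unambiguous bases")
    # "CG" cannot overlap itself, so str.count gives the dinucleotide count
    f_cg = seq.count("CG") / (len(seq) - 1)
    return {
        "gc_percent": 100.0 * (g + c) / n,
        "rho_cg": f_cg / ((c / n) * (g / n)) if c and g else math.nan,
        "gc_diff": 100.0 * (g - c) / (g + c) if g + c else math.nan,
        "at_diff": 100.0 * (a - t) / (a + t) if a + t else math.nan,
    }


def separation_in_sds(rna_mean, genome_mean, genome_sd):
    """Distance between RNA and genomic means, in genomic SDs."""
    return abs(rna_mean - genome_mean) / genome_sd


# Toy sequence, for illustration only
print(composition_stats("GGCGAACGUU"))

# M.jannaschii values taken from Table 1
print(separation_in_sds(63.1, 31.4, 6.9))   # (G+C)%
print(separation_in_sds(8.1, 1.4, 36.9))    # (G-C)% difference
```

With the M.jannaschii values from the table, the RNA mean (G+C)% lies about 4.6 genomic SDs from the genomic mean, whereas the RNA mean (G–C)% difference lies only about 0.18 genomic SDs from it, which is consistent with the legend's remark that the latter statistic would discriminate poorly.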