2001 Apr 1;29(7):1464–1469. doi: 10.1093/nar/29.7.1464

Table 4. Over- and under-represented sequences in the IL sample.

IL under-represented oligonucleotides
4mer cagt(0.80) ctga(0.81) gatg(0.81) tcag(0.81) tgga(0.83)
5mer acctt(0.61) agcat(0.70) cacag(0.72) cagga(0.78) cagtg(0.74)
  ctcag(0.67) ctgac(0.61) ctgga(0.80) gaggt(0.69) gattg(0.58)
  gctga(0.78) ggagt(0.64) ggcac(0.66) gtgga(0.76) tcagg(0.71)
  tgcct(0.65) tggac(0.74) ttgcc(0.58)    
IL over-represented oligonucleotides
4mer cgaa(1.36) cgac(1.47) cgag(1.27) cgca(1.58) cgcg(2.49)
  cggc(1.51) cggt(1.46) ctcg(1.66) gccg(1.64) gcga(1.80)
  gcgc(1.94) ggcg(1.79) tcgc(1.48)    
5mer aaaaa(1.58) aaaag(1.38) aagcg(1.49) accgc(1.67) acgac(1.60)
  acgcg(2.22) actcg(1.77) agaaa(1.46) ccgaa(1.63) ccggt(2.17)
  cgacg(2.33) cgcag(1.65) cgccg(2.53) cgcga(2.95) cgcgc(3.17)
  cgcgg(2.20) cgctc(1.59) cggcg(2.36) cggtg(1.76) ctagc(1.84)
  ctcgc(2.31) ctcgt(1.68) gaaag(1.44) gccga(1.70) gccgg(1.54)
  gcgaa(2.13) gcgag(1.67) gcgat(1.99) gcgca(1.95) gcgcg(3.22)
  gcggc(1.85) gctcg(1.92) ggccg(1.71) ggcga(1.69) ggcgc(2.20)
  ggcgg(1.63) gggcg(1.85) gtgcg(1.83) taggg(1.94) tcgcg(3.86)
  tctcg(1.58) tgcga(1.90)      
IL under-represented codons with context
codon_N1 aca g(0.74) gat g(0.76) tca g(0.66) tcc a(0.80)  
codon_N1N2 aca gg(0.49) acc tt(0.65) cag ga(0.66) cca ga(0.58) cct ga(0.62)
  cgg at(0.26) ctg ac(0.62) ctg ca(0.63) gct ga(0.68) gct ta(0.27)
  ggc ac(0.64) gtg at(0.56) gtg ga(0.72) tac at(0.61) tcc at(0.54)
IL over-represented codons with context
codon_N1 aga a(1.62) aga g(1.51) agg g(1.56) atg t(1.32) ccc g(1.59)
  ccg a(1.95) cgc a(1.62) cgc g(2.18) ctc g(1.57) gcc g(1.82)
  gcg c(2.10) gcg g(1.91) ggc g(2.22) tcg g(1.66)  
codon_N1N2 acc gc(2.06) aga aa(1.85) aga ga(1.83) cac gc(2.00) ccg aa(2.46)
  ccg ag(2.29) ccg gt(2.90) cga cg(3.66) cgc ac(2.12) cgc ag(1.96)
  cgc cg(2.08) cgc ga(3.19) cgc gg(2.63) ctc gt(1.75) gcc ga(1.73)
  gcg aa(2.65) gcg cc(2.25) gcg cg(3.51) gcg gc(2.44) gcg gt(2.00)
  ggc ga(1.78) ggc gc(3.55) tca cg(2.21) tcg cg(4.01) tcg gg(2.53)
  tgc ga(2.45) tta gg(2.48)      

List of all 4–5mer oligonucleotides, and of codons with N1 and N1N2 context, that are over- or under-represented in the IL sample compared with all 200 random IC subsets.

The number in parentheses (k) beside every motif represents the corresponding relative abundance in the IL sample compared to the whole IC sample and was calculated using the formula: k = (N_IL / N_IC) × (L_IC / L_IL), where N_IL and N_IC are the occurrences of the examined sequence in the IL and IC samples, respectively, and L_IL and L_IC are the sizes of the samples.
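As an illustration only, the relative-abundance formula above could be computed as in the following minimal Python sketch. The function names and the example counts are hypothetical, not taken from the paper. Counting overlapping occurrences and measuring sample size in nucleotides are assumptions; the legend does not state how the authors handled either point.

```python
def count_occurrences(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlaps (an assumption)."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        # Advance by one position so overlapping matches are also found.
        start = seq.find(motif, start + 1)
    return count


def relative_abundance(n_il: int, n_ic: int, l_il: int, l_ic: int) -> float:
    """Return k = (N_IL / N_IC) * (L_IC / L_IL).

    n_il, n_ic: occurrences of the motif in the IL and IC samples.
    l_il, l_ic: sizes of the IL and IC samples (assumed here to be
                total sequence length in nucleotides).
    """
    if n_ic == 0 or l_il == 0:
        raise ValueError("N_IC and L_IL must be non-zero")
    return (n_il / n_ic) * (l_ic / l_il)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the arithmetic.
    k = relative_abundance(n_il=30, n_ic=1000, l_il=10_000, l_ic=500_000)
    print(f"k = {k:.2f}")
```

Under this definition, k > 1 means the motif is more frequent per unit of sequence in the IL sample than in the whole IC sample, and k < 1 means it is less frequent.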

Note that putative exonic splicing enhancers are among the IL under-represented sequences, and putative exonic splicing silencers are among the IL over-represented sequences.