Figure 1.
Characterization of SSRs found in human coding and non-coding transcripts. (A) Representation rate of all triplet repeats in the human RNAome compared with their occurrence in the human genome. Tracts of at least 5 consecutive triplet repeats were analyzed in the genome, protein-coding transcripts, lncRNAs, pseudogenes and circRNAs. Positive and negative values represent the fold over- and under-representation, respectively. Repeats capable of forming stable RNA structures are marked with an asterisk. (B) Number of non-coding transcripts containing at least 5 consecutive disease-relevant SSRs. No RNAs were found for AUUCU, GGCCUG and GGGGCC repeat tracts. (C) Distribution of the lengths of tracts containing at least 5 consecutive disease-relevant triplet repeats in the genome and RNAome. Asterisks denote statistical significance: * P-value < 0.05; ** P-value < 0.01; *** P-value < 0.001. Boxes and whiskers represent the minimum and maximum values.
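The tract criterion used throughout the figure (at least 5 consecutive copies of a triplet) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `find_triplet_tracts`, its parameters and the toy sequence are hypothetical, and the sketch assumes an uppercase RNA string and counts only perfect, uninterrupted repeats.

```python
import re


def find_triplet_tracts(seq, motif, min_repeats=5):
    """Return (start, n_repeats) for each tract of `motif` repeated
    at least `min_repeats` times in a row within `seq`.

    Hypothetical helper for illustration only; interrupted or
    imperfect repeats are not detected.
    """
    # (?:MOTIF){5,} matches 5 or more back-to-back copies, greedily,
    # so each reported tract is as long as possible from its start.
    pattern = re.compile(r"(?:%s){%d,}" % (re.escape(motif), min_repeats))
    return [
        (m.start(), len(m.group()) // len(motif))
        for m in pattern.finditer(seq)
    ]


# Toy sequence (made up): one CAG tract of 6 units, one of 4 units
# (below the threshold), and one CUG tract of 5 units.
toy = "AAA" + "CAG" * 6 + "UU" + "CAG" * 4 + "G" + "CUG" * 5
cag_tracts = find_triplet_tracts(toy, "CAG")
cug_tracts = find_triplet_tracts(toy, "CUG")
```

In this toy sequence only the 6-unit CAG tract and the 5-unit CUG tract pass the threshold; the 4-unit CAG run is ignored. The tract lengths returned this way correspond to the kind of values whose distribution is summarized in panel (C).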