Figure 1.
Quantification of exon usage. (A) Example gene model in the reference genome (green) and alignments of RNA-seq reads (upper panel). Sequenced fragments whose alignments fall fully within one exonic region are shown as a gray box; alignments that map to two (or more) exonic regions are shown as shorter gray boxes connected by a horizontal line. For a particular exon (highlighted in orange), we consider two strategies to quantify its usage, as illustrated in Panels B and C (see ‘Materials and Methods’ section for the formal description). (B) In the first strategy, sequenced fragments are counted into two groups: those that map fully or partially to the exon (λ) and those that map to the rest of the exons (ϵ). θREUC is defined as the ratio of λ to ϵ, and the relative exon usage coefficient (REUC) for the exon in sample j is the ratio of θREUC in that sample to the mean θREUC across all samples. (C) In the second strategy, sequenced fragments are also counted into two groups: those that map fully or partially to the exon (λ) and those that align to exons both upstream and downstream of the exon under consideration but not to the exon itself (ρ). The latter represent transcripts from which the exon was spliced out. θRSIC is then defined as the ratio of λ to ρ. The relative spliced-in coefficient (RSIC) for the exon in sample j is the ratio of θRSIC in that sample to the mean θRSIC across all samples. Note that while differences in exon usage due to alternative splicing are reflected in both REUCs and RSICs, differences due to alternative transcription initiation or termination are reflected only in REUCs. (D) Heatmap representations of the REUCs for three exonic regions (E004, E005 and E006) of the gene 5-Aminolevulinate Synthase 1, computed using subset A of the GTEx data. The rows of the heatmaps correspond to the eight tissues, and each column corresponds to one individual. 
The horizontal color patterns for exon E005 indicate elevated inclusion of this exon in cerebellum and cerebellar cortex compared with the rest of the brain cell types. (E) RNA-seq samples from two cell types (cortex and cerebellum) from individual 12ZZX (also indicated by the arrows below each heatmap in Panel D) are displayed as sashimi plots. The three exonic regions presented in Panel D are shown. The middle exon, E005, is an untranslated cassette exon (ENSEMBL identifier ENSE00002267562) that is spliced out more frequently in cortex than in cerebellum.
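
The two coefficients described in Panels B and C can be sketched in a few lines of Python. This is a minimal illustration that follows only the wording of the caption (ratio of counts per sample, divided by the mean ratio across samples); the formal definition in ‘Materials and Methods’ may differ, for example in scale or in the handling of zero counts. The function name `relative_coefficients` and all count values are hypothetical and are not taken from the GTEx data.

```python
from statistics import mean


def relative_coefficients(numerators, denominators):
    """Return theta_j / mean(theta) for each sample j,
    where theta_j = numerators[j] / denominators[j].

    Assumption: all denominators are non-zero; no pseudocount is applied.
    """
    thetas = [n / d for n, d in zip(numerators, denominators)]
    mean_theta = mean(thetas)
    return [t / mean_theta for t in thetas]


# Hypothetical fragment counts for one exon in three samples.
lam = [30, 60, 90]     # lambda: fragments mapping fully or partially to the exon
eps = [300, 300, 300]  # epsilon: fragments mapping to the rest of the exons
rho = [10, 10, 30]     # rho: fragments joining upstream and downstream exons

reuc = relative_coefficients(lam, eps)  # theta_REUC = lambda / epsilon
rsic = relative_coefficients(lam, rho)  # theta_RSIC = lambda / rho

print("REUC per sample:", reuc)
print("RSIC per sample:", rsic)
```

In this toy example the third sample has the highest REUC but not the highest RSIC, because its count of exon-skipping fragments (ρ) is also higher. Under the caption's definitions, a value above 1 means the exon is used (or spliced in) more in that sample than on average across samples.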
