2022 Nov 10;13:6803. doi: 10.1038/s41467-022-34568-z

Fig. 2. Characterization of unannotated transcripts identified in cancer cell lines.


a Composition of different unannotated transcript types. match_Refjunction: multi-exon with at least one junction match; contain_Ref: containment of reference (reverse containment); intergenic: no overlap with annotated genes; retain_Refintron: retained intron(s), all or partial introns matched or retained; within_Refintron: fully contained within a reference intron; overlap_Refexon: other same-strand overlap with reference exons. b Bar plots show the numbers of transcripts matching a single gene (non-readthrough transcripts) or more than one gene (readthrough transcripts). c Associations between unannotated transcripts and hallmarks. Color bars represent different hallmarks, and bar lengths indicate the number of associated unannotated transcripts. The labels in the inner circle indicate the chromosomes. Red links indicate positive correlations, and blue links indicate negative correlations. d The heatmap shows representative unannotated transcripts from different gene types, including protein-coding genes, lncRNAs, intergenic genes, and pseudogenes. Bar plots on the right show, for each transcript, the number of cancer types with a significant result in the Cox proportional hazards model, the differential expression analysis, and the association analysis of tumor stages. e Identification of the unannotated transcript AC092803.3-u1 in A2780 cells by 3′ RACE and Sanger sequencing. f Comparison of the expression levels of AC092803.3-u1 and AC092803.3-a1 across cancer cell lines (n = 1017). P, p-value from a two-sided Wilcoxon rank-sum test. Each box represents the IQR and median of expression for each transcript; whiskers indicate 1.5 times the IQR. g Comparison of survival risk and expression levels between the AC092803.3-u1 and AC092803.3-a1 transcripts in individual tumor types (n = 9 paired tumor and normal samples for CHOL, n = 72 for KIRC, n = 43 for HNSC). Each box represents the IQR and median of expression in each sample group; whiskers indicate 1.5 times the IQR. The log-rank test was used for survival analysis, and a two-sided Student’s t test was used for differential expression analysis.
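
For readers who want to reproduce the box-plot summaries described for panels f and g, the following is a minimal sketch of how the median, IQR, and whisker ends could be computed. It is not the authors' code. The function name `box_summary`, the inclusive quartile method, and the Tukey-style reading of "1.5 times the IQR" (whiskers end at the most extreme observation within 1.5 × IQR of the box) are all assumptions, and the expression values in the usage lines are made up for illustration.

```python
from statistics import median, quantiles


def box_summary(values):
    """Summarize one box of a box plot (hypothetical helper, not from the paper).

    Assumes Tukey-style whiskers: each whisker ends at the most extreme
    observation lying within 1.5 * IQR of the nearer quartile.
    """
    data = sorted(values)
    # Quartiles by linear interpolation over the observed range (an assumption;
    # plotting libraries differ in their default quartile method).
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "median": median(data),
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
    }


# Made-up expression values for a single transcript, for illustration only.
example_expression = [1, 2, 3, 4, 5, 6, 7, 8, 100]
summary = box_summary(example_expression)
print(summary)
```

With this reading, the value 100 in the made-up data falls outside the upper fence and would be drawn as an outlier rather than covered by the whisker.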