2020 Aug 25;10:14169. doi: 10.1038/s41598-020-71040-8

Figure 1.

Schematic illustration of gene fusions, the workflow, and the number of gene fusions in human cancer. (A) Schematic description of gene fusion formation. Fusions are formed mainly via balanced and unbalanced chromosomal rearrangements, such as translocations, deletions, inversions and insertions. This usually leads to the formation of a fusion gene with the 5′ end of Gene 1 and the 3′ end of Gene 2. If the fusion occurs between two protein-coding genes, then depending on whether the reading frame is preserved and where exactly the fusion occurs, a fusion protein with features and domains from both partners may be produced. Other possible outcomes include a full or truncated 3′ gene under the control of the promoter of the 5′ gene.
(B) Workflow used in this study. The analysis progressed from the total set of fusions discovered by the DEEPEST method (ref. 22) towards more specific kinase- or TF-containing, protein-producing oncofusions. We started with the TCGA-based fusion set from Dehghannasiri et al. (2019), for which we generated protein sequences with AGFusion. Domains were added by matching sequences to UniProt proteins annotated with Pfam domains, after which non-unique entries were dropped. Fusions were classified as protein producing if both gene fragments were predicted to produce > 30 amino acids (AA) of protein sequence. Within this set, the two most prominent protein groups were protein kinases (PK) and transcription factors (TF), and thus we focused further analysis on the 1,811 unique fusions containing a protein kinase or transcription factor, using the full protein-producing fusion set for comparison. Known interactions for the wild-type proteins of the fusion partners were obtained from the IMEx consortium and used to estimate the maximal foreseeable effect on signaling pathways from Reactome. Finally, TCGA gene expression quantification data were used to probe observable effects of kinase/TF fusions, using other protein-producing fusions as background.
(C) Top: Breakdown of samples and fusion mutations by TCGA project. The largest single contributor of samples with fusions was the TCGA breast invasive carcinoma project (BRCA), which had the highest number of both samples and identified fusion mutations. Bottom: Proportion of protein-producing fusions that include protein kinase (PK) or transcription factor (TF) genes.
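
The filtering steps described in panel (B) can be illustrated with a minimal Python sketch. It is not the authors' code: the `Fusion` record, its field names, and both function names are hypothetical, and the per-fragment amino acid counts are assumed to come from an upstream sequence prediction step. Only the > 30 AA threshold and the restriction to protein kinase or transcription factor partners are taken from the caption.

```python
from dataclasses import dataclass

# Threshold from the caption: each fragment must be predicted to yield > 30 AA.
MIN_FRAGMENT_AA = 30


@dataclass(frozen=True)
class Fusion:
    """Hypothetical record for one predicted fusion (field names are assumptions)."""
    five_prime_gene: str
    three_prime_gene: str
    five_prime_aa: int   # predicted amino acids contributed by the 5' partner
    three_prime_aa: int  # predicted amino acids contributed by the 3' partner


def is_protein_producing(fusion, min_aa=MIN_FRAGMENT_AA):
    """True if both gene fragments contribute more than min_aa amino acids."""
    return fusion.five_prime_aa > min_aa and fusion.three_prime_aa > min_aa


def select_pk_tf_fusions(fusions, pk_genes, tf_genes):
    """Keep protein-producing fusions where either partner is a PK or TF gene."""
    genes_of_interest = set(pk_genes) | set(tf_genes)
    selected = []
    for fusion in fusions:
        if not is_protein_producing(fusion):
            continue
        partners = {fusion.five_prime_gene, fusion.three_prime_gene}
        if partners & genes_of_interest:
            selected.append(fusion)
    return selected
```

Called with a list of `Fusion` records and two sets of gene symbols, `select_pk_tf_fusions` returns the subset analogous to the kinase/TF fusion set in the caption, while the fusions that pass `is_protein_producing` alone correspond to the comparison background.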