Abstract
As fusion detection NGS techniques are adopted by clinical labs, assay performance comparison is urgently needed. We compared four fusion-detection assay platforms on a pilot cohort of 24 prostate cancer samples: (1) Oncomine Comprehensive panel v3 (OCAv3); (2) AmpliSeq comprehensive panel v3 (AICv3); (3) the solid tumor panel of FusionPlex; and (4) the human oncology panel of QIAseq. The assays were compared for the detection of different types of fusions based on whether the partner gene or the breakpoints are known. All assays detected the fusion with known gene partners and a known breakpoint, represented by TMPRSS2-ERG. A fusion with known partners but an unknown breakpoint, TMPRSS2-ETV4, was reported by OCAv3 and FusionPlex, but not by AICv3 because the specific breakpoint was not in the manifest, nor by QIAseq because the panel did not target the exact exons involved. For fusions with unknown partners, FusionPlex identified the largest number of ETV1 fusions because it had the highest exon coverage for ETV1. Among these, SNRPN-ETV1 and MALAT1-ETV1 were novel findings. To determine reportability of low-level calls of highly prevalent fusions, such as TMPRSS2-ERG, we propose the use of percent fusion reads over total number of reads per sample instead of the fusion read count.
Keywords: NGS, fusion, assay comparison, prostate cancer, ETS genes
INTRODUCTION
The significance of gene fusions has been well recognized for both diagnosis and treatment of cancer, including hematologic malignancies and solid tumors(1–4). This highlights the importance of accurate detection of gene fusions for patient care.
Multiple techniques are used clinically to detect gene fusions. G-banding cytogenetics detects chromosomal translocations that may be used to infer gene fusions. Reverse transcription polymerase chain reaction (RT-PCR) is used to detect fusion transcripts if the breakpoints are well established and are concentrated to a small region that can be adequately amplified by routine polymerase. However, gene fusions with heterogeneous breakpoints and highly variable partners often occur, e.g. ALK, MLL, and FGFR1/2 rearrangements. A significant portion of clinically important gene rearrangements do not result in a known fusion transcript, e.g. MYC and EVI1 rearrangements. Immunohistochemistry (IHC) detecting the overexpression of the protein that results from a fusion event is sometimes used, e.g. CCND1 fusions and cyclin D1 (BCL1) IHC for mantle cell lymphoma. Fluorescent in situ hybridization (FISH), using either dual-fusion or break-apart probes, is most frequently used to detect gene fusions clinically. Although regarded as the gold standard, FISH can be challenged by complex and cryptic rearrangements. In addition, FISH is typically carried out as multiple rounds of reflex studies, which add cost and turn-around time to patient care.
Advancements in next generation sequencing (NGS) enable the assessment of numerous potential fusions in a single experiment. RNA-based NGS is currently the most common approach in multi-target fusion detection. The library preparation can be achieved through amplicon-based or capture-based methods. Empirically, for DNA-based NGS assays, capture-based methods are more labor-intensive and require more input material, but can be used for larger panels compared with amplicon-based methods. Among amplicon-based methods, earlier versions could detect fusions only if both partners were targeted by the assay. These methods use the traditional PCR approach, where both forward and reverse primers are gene-specific. In the event of a fusion, the library will contain amplicons that span the known fusion junction. Fusion identification then relies on the ability of the analysis software to match sequence reads to a reference database of fusion junction sequences, known as the fusion manifest file. These are exemplified by the RNA-based AmpliSeq panels (Table 1). In this study, we compared Oncomine™ Comprehensive Assay v3 (Ion Torrent™, ThermoFisher, CA) and AmpliSeq for Illumina Comprehensive Panel v3 (Illumina, CA).
Table 1:
Library prep, sequencing, and analysis methods
| | OCAv3 by ThermoFisher (n=14) | AICv3 by Illumina (n=23) | FusionPlex by ArcherDX (n=23) | QIAseq by QIAGEN (n=24) |
|---|---|---|---|---|
| Input RNA | 22.5 ng | 40∼100 ng | 100 ng | 50 ng |
| Panel | Oncomine™ Comprehensive Assay v3 (OCAv3) | AmpliSeq for Illumina comprehensive panel v3 (AICv3) | FusionPlex solid tumor | QIAseq Targeted RNAscan Human Oncology |
| Library preparation chemistry | AmpliSeq | AmpliSeq | Anchored Multiplex PCR (AMP™) | Single Primer Extension Chemistry (SPE) |
| Sequencing: instrument; chip/flow cell; read setting | Ion S5 XL; 540; 200 bp | MiSeq; V2; 151/151 bp | MiSeq; V3; 151/151 bp | MiSeq; V2; 231/71 bp |
| Gene count on panel* | 51 driver genes | 51 driver genes | 53 | 225 |
| Recommended read count | 1.5 M | 300 K | 2.5 M | 7.5 M |
| Actual read count: median (min – max) | 3.4 M (2.1 – 8.5 M) | 840 K (540 K – 1.5 M) | 2.4 M (1.1 – 3.3 M) | 2.7 M (2.3 – 3.4 M) |
| Analysis software | Ion Reporter | RNA Amplicon | Archer Analysis | Biomedical Genomic workbench |
* A total of 19 genes were common among all panels, although with variable exon coverage: ALK, BRAF, ERG, ETV4, FGFR1, FGFR2, FGFR3, JAK2, KRAS, MYB, NRG1, NTRK1, PDGFRA, PDGFRB, PPARG, RAF1, RET, ROS1, and TMPRSS2.
Our comparison also included newer techniques that allow open-ended PCR to detect fusions as long as one of the fusion partners is targeted by the assay. The Anchored Multiplex PCR (AMP™, ArcherDX, CO) method and the Single Primer Extension Chemistry (SPE, QIAseq, QIAGEN, Hilden, Germany) both use specific primers for one partner gene and universal primers for the other (Table 1): AMP™ by ArcherDX performs nested PCR while SPE by QIAGEN carries out one round of PCR. Bioinformatically, unlike the alignment process of traditional amplicon-based fusion assays, sequencing reads from open-ended PCR technologies are first assembled de novo into contigs, without a reference, and the contigs are then aligned to the reference genome/transcriptome to identify fusions. In addition, the PCR amplification of these two assays incorporates molecular barcodes, enabling the assays to determine the number of cDNA molecules, termed “unique reads”, that contained the fusion before PCR. The analysis process following AMP™ by ArcherDX also calculates the number of unique start sites that are represented by each gene-specific primer as a result of the random priming on the opposite end of the cDNA fragment. Unique reads and unique start sites are important parameters for assessing the library complexity of the whole sample as well as the accuracy of each fusion call. As these techniques are adopted by clinical laboratories, assay performance comparison is urgently needed. The current study compared the above four assays on the same set of specimens in a blinded manner.
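The two barcode-derived metrics described above can be illustrated with a toy example. This is a minimal sketch rather than either vendor's implementation: the `Read` record, the barcode strings, the start-site positions, and the function name are all hypothetical, and the rule that reads sharing a barcode and start site come from one original molecule is an assumption (production pipelines also tolerate sequencing errors within barcodes).

```python
from collections import namedtuple

# Hypothetical read record: the molecular barcode and the alignment start of
# the randomly primed end (the end opposite the gene-specific primer).
Read = namedtuple("Read", ["barcode", "start_site"])


def summarize_fusion_reads(reads):
    """Return (unique reads, unique start sites) for reads supporting one fusion.

    Assumption: reads with the same barcode and start site are PCR copies of
    a single cDNA molecule and are therefore counted once.
    """
    unique_molecules = {(r.barcode, r.start_site) for r in reads}
    unique_start_sites = {start for _, start in unique_molecules}
    return len(unique_molecules), len(unique_start_sites)


reads = [
    Read("AACGT", 1001), Read("AACGT", 1001),  # PCR duplicates of one molecule
    Read("GGTCA", 1001),                       # same start site, different molecule
    Read("TTAGC", 1017),
    Read("CCGAT", 1030), Read("CCGAT", 1030),  # PCR duplicates of one molecule
]
unique_reads, unique_start_sites = summarize_fusion_reads(reads)
print(unique_reads, unique_start_sites)
```

Six raw reads collapse to fewer unique molecules, which is why unique counts say more about library complexity than raw read counts do.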
MATERIALS AND METHODS
Samples, library preparation, and sequencing
The study was approved by the institutional review boards at Fred Hutchinson Cancer Research Center (FHCRC) and the University of Washington (UW). The study used frozen metastatic tumor samples from men with castration-resistant prostate cancer (n=24, University of Washington Prostate Cancer Donor Autopsy Program). Samples were from individual patients and included 13 lymph node metastases, 8 liver metastases, and one each of diaphragm, spleen, and periaortic metastasis. The frozen samples had been stored at −80°C for 0 to 15 (median 7.5) years at the time of RNA isolation. Following H&E review, the frozen tissue blocks were macro-dissected to achieve maximum tumor content. The tumor percentages of the tissue blocks were not recorded during macro-dissection. These were then processed with RNA STAT-60 (Tel-Test). RNA concentration was measured using the Qubit™ RNA HS Assay Kit (ThermoFisher). Samples were tested on: (1) ThermoFisher platform - performed by the Lifelab using the Oncomine Comprehensive Assay kit (OCAv3) and sequenced on the S5 XL sequencer; (2) Illumina platform - performed by Illumina using AmpliSeq for Illumina comprehensive panel v3 (AICv3), sequenced on a MiSeq, and analyzed using RNA Amplicon; (3) ArcherDX platform - libraries were prepared in house using the FusionPlex solid tumor kit, sequenced on a MiSeq, and analyzed using Archer Analysis 5.1; and (4) QIAGEN platform - libraries were prepared using the QIAseq Targeted RNAscan Oncology panel, sequenced on a MiSeq, and analyzed using QIAGEN CLC Biomedical Genomics Workbench. See Figure 1 for the samples that were assessed by each assay and Table 1 for a detailed comparison of the methods. Based on the information provided by the manufacturers, the input RNA amounts needed are as follows: 20 ng for OCAv3; 40 ng for AICv3; 20 ng for FusionPlex with the range being 10 to 250 ng; and 10 to 250 ng for QIAseq.
Figure 1:
Fusions detected in the study and samples tested on each assay. Green cells in the top panel denote the presence of fusion. Grey dots in the lower panel indicate samples tested by each assay.
Design of the comparison
The current study aimed to compare the process end-to-end in terms of its ability to call different types of fusions rather than to compare individual components of each assay. To achieve this goal, the following design choices were made: (1) the same RNA samples were tested on the various assays; (2) we defined the scope of the comparison to be within the 19 genes in common for all assays, out of which ERG, ETV1, ETV4, and TMPRSS2 were shown to be involved in fusions (Figure 1); (3) the calling criteria recommended by the manufacturers were used to determine reportability of a given finding; and (4) in the case of TMPRSS2-ERG, false-positive cutoffs for the different assays were calculated using the same method.
Fusion calling criteria
A given call was presented for evaluation based on the following criteria per the manufacturers’ recommendations: (1) for OCAv3, a fusion that aligns to a specific fusion isoform sequence in the manifest and has more than 40 reads, or a fusion that aligns to a non-targeted fusion sequence and has more than 1000 reads; (2) for AICv3, the data analysis was performed at Illumina using RNA Amplicon 1.1.3.0 with default metrics; (3) for Archer Analysis, reads that passed strong-evidence filters, including a fusion present in Quiver, a database of known cancer-related fusions, or a fusion with more than five unique reads, three unique start sites, and 10% reads of gene specific primer 2 (GSP2); and (4) for QIAseq, a fusion with more than 15 supporting reads and a P-value smaller than 0.001.
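The numeric criteria above can be written as simple predicates. This is a minimal sketch with hypothetical function and parameter names, not code from any vendor's software. AICv3 is omitted because its default metrics are not listed in the text. For FusionPlex, the comparison operators on start sites and GSP2 percentage (at least 3 and at least 10%) are an assumed reading of the sentence above, which states "more than" only for the unique read count.

```python
def ocav3_reportable(read_count, in_manifest):
    """OCAv3: >40 reads for a manifest isoform, >1000 reads for a non-targeted fusion."""
    threshold = 40 if in_manifest else 1000
    return read_count > threshold


def fusionplex_reportable(in_quiver, unique_reads, unique_start_sites, pct_gsp2_reads):
    """FusionPlex: present in Quiver, or supported by read-level evidence.

    Assumption: start sites and GSP2 percentage are treated as minimums
    (>= 3 and >= 10%); the source does not spell out these operators.
    """
    read_evidence = (
        unique_reads > 5 and unique_start_sites >= 3 and pct_gsp2_reads >= 10.0
    )
    return in_quiver or read_evidence


def qiaseq_reportable(supporting_reads, p_value):
    """QIAseq: >15 supporting reads and a P-value below 0.001."""
    return supporting_reads > 15 and p_value < 0.001


# Hypothetical calls, for illustration only.
print(ocav3_reportable(read_count=500, in_manifest=False))
print(fusionplex_reportable(False, unique_reads=6, unique_start_sites=3, pct_gsp2_reads=12.0))
print(qiaseq_reportable(supporting_reads=16, p_value=0.0005))
```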
NGS call verification
The TMPRSS2-ERG, TMPRSS2-ETV4, SNRPN-ETV1 and MALAT1-ETV1 fusions were verified using RT-PCR. Briefly, RNA samples were reverse transcribed using SuperScript IV Reverse Transcriptase (Thermo Fisher). cDNA was then PCR amplified using Platinum™ Taq High Fidelity DNA Polymerase (Thermo Fisher). The TMPRSS2-ERG was amplified with primers targeting exon 1 of TMPRSS2 (NM_005656.3; AGCTAAGCAGGAGGCGGA) and exon 4 of ERG (NM_004449; CGACTGGTCCTCACTCACAA; supplementary figure 1A). The TMPRSS2-ETV4 was amplified with primers targeting the above sequence of TMPRSS2 and the exon 6/7 junction of ETV4 (NM_001986; GGCACTGGAGTAAAGGCACT). The SNRPN-ETV1 was amplified with primers targeting the exon 1/2 junction of SNRPN (NM_022804.2; GCAAGGGATCGCTTACACCT) and the exon 8/9 junction of ETV1 (NM_001163147.1; CATACAGCCTGAGTCATATGCAA). The MALAT1-ETV1 was amplified with primers targeting exon 1 of MALAT1 (NR_002819.2; TTCCGGTGATGCGAGTTGTT) and the above sequence of ETV1. The breakpoint sequences of TMPRSS2-ETV4, SNRPN-ETV1, and MALAT1-ETV1 were verified using Sanger sequencing. All three ETV1 fusions were verified using ETV1 break-apart FISH (CytoTest, Rockville, MD; supplementary figure 1B), including a CytoGreen labeled probe (760 Kb) targeting 5’-ETV1 and a CytoOrange labeled probe (790 Kb) targeting 3’-ETV1. For each sample, 25 to 50 intact and non-overlapping interphase nuclei were enumerated manually using a 100x oil immersion lens on a Zeiss Z1 microscope (Carl Zeiss Canada Ltd, Canada).
Establishing false-positive cutoffs for TMPRSS2-ERG fusion calls
Due to the high prevalence of TMPRSS2-ERG fusions in the cohort, low-level calls were detected in some samples. We therefore established false-positive cutoffs using negative samples defined by RT-PCR. Out of the 12 RT-PCR negative samples, eight were tested on all four assays and were used to define false-positive cutoffs for: (1) OCAv3: % of fusion reads over total mapped reads (%TMR); (2) AICv3: % of fusion reads over total on-target aligned reads (%TAR); (3) FusionPlex: % of unique fusion reads over total unique RNA reads and % of unique start sites of a fusion call over total unique RNA start sites; and (4) QIAseq: % of unique fusion reads over total uniquely mapped reads (Supplementary table 1). For all parameters, we used the mean plus three times the standard deviation from the eight RT-PCR negative samples to define the false-positive cutoff.
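The cutoff calculation described above can be sketched in a few lines. The input values are hypothetical, not the study data, and the function names are illustrative. The text does not say whether the sample or population standard deviation was used, so the sample estimator (`statistics.stdev`, n − 1 denominator) is an assumption here.

```python
from statistics import mean, stdev


def percent_fusion_reads(fusion_reads, total_reads):
    """Fusion reads as a percentage of the per-sample denominator."""
    return 100.0 * fusion_reads / total_reads


def false_positive_cutoff(negative_values):
    """Mean plus three standard deviations of values from negative samples.

    Assumption: sample standard deviation (n - 1); the source does not
    specify the estimator.
    """
    return mean(negative_values) + 3 * stdev(negative_values)


# Hypothetical percent-fusion-read values (in %) from eight RT-PCR negative samples.
negatives = [0.0, 0.01, 0.02, 0.0, 0.03, 0.01, 0.02, 0.01]
cutoff = false_positive_cutoff(negatives)

# A hypothetical sample with 50 fusion reads out of 1,000,000 total reads.
sample_pct = percent_fusion_reads(50, 1_000_000)
print(f"cutoff = {cutoff:.4f}%, sample = {sample_pct:.4f}%, reportable = {sample_pct > cutoff}")
```

Dividing by the per-sample total makes the value comparable across libraries sequenced to different depths, which a raw fusion read count is not.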
Analysis of formalin-fixed and paraffin-embedded (FFPE) tissue using OCAv3 and FusionPlex
A multiplex FFPE Reference Standard (Horizon Discovery, UK), known to have five fusions, was tested on OCAv3 and FusionPlex. These fusions are CCDC6-RET, EML4-ALK, ETV6-NTRK3, SLC34A2-ROS1, and TPM3-NTRK1. RNA isolation was performed using the ReliaPrep™ FFPE Total RNA Miniprep System (Promega, WI) following the manufacturer’s recommended procedure. Library preparation was performed using the OCAv3 and FusionPlex solid tumor kits. The libraries were then sequenced using 530 chips on an S5 XL sequencer (supplementary table 3).
RESULTS
The quality of RNA samples and libraries
The frozen samples yielded RNA specimens with integrity numbers (RIN score) ranging from 7 to 8.9 (median = 7.5) with sufficient quantity for all assays. Libraries by OCAv3, AICv3, and FusionPlex contained numbers of reads that met or exceeded the manufacturers’ recommendations (Table 1). Libraries by QIAseq had lower numbers of reads than recommended because of the high number of samples sequenced in parallel. Based on data from this study, this did not lead to missed calls by QIAseq.
ETS fusions detected in our study
In our cohort, five gene fusions were identified involving the ETS family transcription factors (Figure 1): TMPRSS2-ERG (n=12), TMPRSS2-ETV4 (n=1), MALAT1-ETV1 (n=1), SLC45A3-ETV1 (n=1), and SNRPN-ETV1 (n=1).
Three fusion-detection scenarios and fusion calls by each assay
The above fusions represent three scenarios in fusion detection: (1) Fusions with known partners with exons that are targeted by the assay and with breakpoints that are known to the assay, such as TMPRSS2-ERG; (2) Fusions with known partners with exons that are targeted by the assay but with breakpoints unknown to the assay, such as TMPRSS2-ETV4; and (3) Fusions with unknown partners, such as the three ETV1 fusions. The results are presented below for each scenario.
(1). Known fusion partners and known breakpoint
In this scenario, all assays successfully detected the fusion. TMPRSS2-ERG represents a fusion type with both partner genes targeted by the assays and isoforms present in the manifest file of OCAv3 and AICv3. All samples known to have this fusion by RT-PCR were shown consistently as positive by all assays (Table 2). OCAv3, AICv3, and FusionPlex targeted exons from both genes that were involved in the fusion. QIAseq does not target exon 1 of TMPRSS2 (Table 2), so its detection of the fusion resulted from single primer extension from ERG only. The assays called multiple breakpoints for the fusion, but all samples contained the most well-known fusion isoform, which joins exon 1 of TMPRSS2 ([hg19] chr21:42880008) and exon 4 of ERG (chr21:39817544).
Table 2:
Detailed exon coverage of each assay and the detection of fusions with known partners and known breakpoint (TMPRSS2-ERG) or unknown breakpoint (TMPRSS2-ETV4)
| | 5’ | 3’ | Detection |
|---|---|---|---|
| TMPRSS2-ERG | TMPRSS2 (NM_005656.3) Exon 1 | ERG (NM_004449) Exon 4 | |
| OCAv3 | Exon 1–5 | Exon 2–6 | 100% (4/4) |
| AICv3 | Exon 1–5 | Exon 2–6 | 100% (12/12) |
| FusionPlex | Exon 1–6 | Exon 2–11 | 100% (12/12) |
| QIAseq | Exon 2–5 | Exon 2–11 | 100% (12/12) |
| TMPRSS2-ETV4 | TMPRSS2 (NM_005656.3) Exon 1 | ETV4 (NM_001986) Exon 3 | |
| OCAv3 | Exon 1–5 | Exon 3 | 1/1 |
| AICv3 | Exon 1–5 | Exon 3 | 0/1* |
| FusionPlex | Exon 1–6 | Exon 2, 3–10 | 1/1 |
| QIAseq | Exon 2–5 | Exon 9 | 0/1 |
* Fusion reads were present in the fastq but not called out due to the absence of the isoform in the manifest file (communication with Illumina)
(2). Known fusion partners but unknown breakpoint
In this scenario, all assays except AICv3 performed as expected based on the library prep chemistry and the analysis process, as shown by both negative and positive results. TMPRSS2-ETV4 represents a fusion type with known partners but unknown breakpoint. The fusion occurred between exon 1 of TMPRSS2 and exon 3 of ETV4, consistent with the literature(5). Both of the partner exons were targeted by three assays: OCAv3, AICv3, and FusionPlex (Table 2 and supplementary table 2), while neither exon was targeted by QIAseq. The basepair location of the breakpoint was different from that in the manifest files of the first two assays and was therefore “unknown” (Figure 2). The fusion was called by OCAv3 but, because of the unknown breakpoint, not by AICv3. FusionPlex, using de novo assembly, also recognized the fusion. Further troubleshooting confirmed that fusion reads were present in the fastq file generated by AICv3 but were not called by the software (communication with Illumina).
Figure 2:
The TMPRSS2-ETV4 fusion reported by OCAv3 and FusionPlex and verified using RT-PCR and Sanger sequencing. (A) displays the fusion in Ion Reporter Genomic Viewer (IRGV) for sample 6. The marked horizontal line and the two segments on top denote the location of the fusion breakpoint known to the manifest. The nucleotide sequence used as the reference in the manifest is displayed at the bottom. The middle portion displays the alignment of all the fusion reads (light grey) in relation to the reference and examples of a few reads in light blue. (B) displays the JBrowser view of the fusion breakpoint with the highest number of unique reads in sample 6: the “BED GSP2” track shows the position and direction of the target gene specific primers (GSP); the “BED Contigs” track shows the contiguous consensus sequence; and the “BAM-SNPs/Coverage” track displays read coverage, which is followed by examples of individual sequencing reads that encompassed the fusion breakpoint. (C) displays Sanger sequencing results of the RT-PCR amplicon targeting TMPRSS2-ETV4 and the fusion junction. All genomic coordinates are based on human genome build GRCh37/Hg19.
(3). Detection of fusions with unknown partners
In this scenario, the assays performed as expected based on the library prep chemistry and the analysis process, as shown by both negative and positive results. The three ETV1 rearrangements exemplify fusions occurring between a known gene (on the panel) and an unknown partner (not on the panel). In addition, the breakpoint on the known gene occurred at three different exon locations: ETV1 (NM_004956) fused with SLC45A3 (NM_033102), SNRPN (NM_022804.2), and MALAT1 (NR_002819.2) at exons 7, 8, and 9 of the ETV1 gene, respectively (Table 3). OCAv3 and AICv3 did not target both fusion partners of ETV1 and did not report the events. OCAv3 was tested on two of the samples, and AICv3 on all three samples with ETV1 fusions. FusionPlex detected all three ETV1 fusions due to its high exon coverage of the ETV1 gene. QIAseq detected the SLC45A3-ETV1 fusion via targeting exon 1 of SLC45A3. We verified all ETV1 fusion calls using ETV1 break-apart FISH, and the two novel fusions, SNRPN-ETV1 and MALAT1-ETV1, using Sanger sequencing following RT-PCR (supplementary figure 1B and 1C).
Table 3:
Detection of fusions with unknown partners
| | Panel information: gene (exons) | SLC45A3(E1)-ETV1(E7) | SNRPN(E2)-ETV1(E8) | MALAT1(E1)-ETV1(E9) |
|---|---|---|---|---|
| OCAv3 | SLC45A3 (E1); ETV1 (E4–5) | Not tested | No | No |
| AICv3 | SLC45A3 (E1); ETV1 (E4–5) | No | No | No |
| FusionPlex | ETV1 (E3–13) | Yes | Yes | Yes |
| QIAseq | SLC45A3 (E1); ETV1 (E12) | Yes | No | No |
Fusion breakpoint validation for TMPRSS2-ETV4
The unknown breakpoint in TMPRSS2-ETV4 led us to further investigate this fusion (Figure 2).
By OCAv3, the most prevalent fusion breakpoint was mapped to chr21:42,880,012 for TMPRSS2 and chr17:41,622,739 for ETV4. Interestingly, a 4-bp homologous sequence was observed at the fusion junction between the two partners, and a low-level subset of reads showed heterogeneous breakpoints within this region on the TMPRSS2 side (Figure 2A). FusionPlex mapped the fusion to the same junction but reported the basepair location as chr21:42,880,008 for TMPRSS2 and chr17:41,622,735 for ETV4 (Figure 2B); the 4-bp offset on each side is explained by the 4-bp homologous sequence at the junction.
Sanger sequencing verified the TMPRSS2-ETV4 fusion breakpoint (Figure 2C), including the 4-bp sequence at the junction that can be mapped to either gene. The sequencing results were notably noisier on the TMPRSS2 side of the fusion junction compared to the ETV4 side, corroborating the variable breakpoints observed in the TMPRSS2 gene by the NGS assays.
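The 4-bp coordinate difference between the two assays follows from how microhomology makes a junction position ambiguous: when the bases flanking the breakpoint are identical in both partners, the join can be placed anywhere within the shared stretch without changing the fusion sequence. The sketch below illustrates this with short made-up sequences; they are not the TMPRSS2 or ETV4 sequences, and the function name is hypothetical.

```python
def breakpoint_shift_range(five_prime_ref, three_prime_ref, i, j):
    """Return (left, right): how far the junction five_prime_ref[:i] + three_prime_ref[j:]
    can slide in each direction without changing the fusion sequence."""
    right = 0
    while (i + right < len(five_prime_ref) and j + right < len(three_prime_ref)
           and five_prime_ref[i + right] == three_prime_ref[j + right]):
        right += 1
    left = 0
    while (i - left > 0 and j - left > 0
           and five_prime_ref[i - left - 1] == three_prime_ref[j - left - 1]):
        left += 1
    return left, right


# Made-up partner sequences sharing the 4-bp stretch "CAGG" at the junction.
gene_a = "ACGTACCAGGTTCA"   # 5' partner; "CAGG" occupies positions 6-9
gene_b = "TTGACAGGCCGATA"   # 3' partner; "CAGG" occupies positions 4-7

# One caller places the junction before the shared bases, another after them.
fusion_before = gene_a[:6] + gene_b[4:]
fusion_after = gene_a[:10] + gene_b[8:]
print(fusion_before == fusion_after)
print(breakpoint_shift_range(gene_a, gene_b, 6, 4))
print(breakpoint_shift_range(gene_a, gene_b, 10, 8))
```

Both placements describe the same transcript, so two tools can report coordinates that differ by the length of the homology on both partners while agreeing on the fusion.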
Establishment of false positive cutoff for TMPRSS2-ERG, a highly prevalent fusion in our cohort
Although the most common isoform of TMPRSS2-ERG fusion was successfully detected by all assays, low level reads were detected in RT-PCR negative samples (supplementary table 1). A highly prevalent fusion, with over-expression of the fusion transcript, may increase the background noise of sensitive detection methods such as NGS, potentially derived from low-level cross-contamination, which can occur during extraction as well as library preparation. A false positive cutoff was needed to determine reportability.
We used read percentages from RT-PCR negative samples to define the false positive cutoff and determine reportability. Because the absolute read count and the absolute unique read count fluctuate significantly with the total read count per sample, we first defined a denominator using the total read count or total unique read count per sample and then calculated the following for each sample (also see methods): % of fusion reads over total mapped reads (%TMR) for OCAv3, % of fusion reads over total on-target aligned reads (%TAR) for AICv3, % of unique fusion reads over total unique RNA reads and % of fusion unique start sites over total unique RNA start sites for FusionPlex, and % of unique fusion reads over total uniquely mapped reads for QIAseq (Supplementary table 1). The numbers from eight RT-PCR negative samples tested by all assays were used, and the average plus three times the standard deviation was calculated as the false positive cutoff. Based on the cutoffs, the fusion was called in a total of 12 samples (Figure 3).
Figure 3:
The TMPRSS2-ERG fusion reported based on false positive cutoffs established using RT-PCR as a reference method. (A) OCAv3: percent of fusion read count over total mapped reads (%TMR); (B) AICv3: percent of fusion read count over total on-target aligned reads (%TAR); (C) FusionPlex: percent of unique fusion read count over total unique RNA reads; and (D) QIAseq: percent of unique fusion read count over total unique reads. The grey and black boxes under the X-axis denote the RT-PCR call for TMPRSS2-ERG fusion. Each black dot represents a sample. On each panel, the first group from the left with eight samples were used to define false-positive cutoffs, which were calculated as the average plus three times’ standard deviation. The cutoff value is noted as the grey dashed line. The second group (negative) and third group (positive) from the left demonstrated the validation of the cutoff on the remaining samples.
The false positive cutoffs clearly delineated RT-PCR negative vs positive samples, with the values in positive samples ranging from 3- to 50-fold of the cutoff. For OCAv3, the %TMR in RT-PCR negative samples ranged between 0 and 0.28% with the false-positive cutoff calculated as 0.35%, and the RT-PCR positive samples showed %TMR of 3.43% or higher (Figure 3A). For AICv3, the %TAR in RT-PCR negative samples ranged between 0.002% and 0.03% with the false-positive cutoff calculated as 0.04%, and the RT-PCR positive samples showed %TAR of 0.51% or higher (Figure 3B). Two parameters were assessable for FusionPlex: the % of unique reads in RT-PCR negative samples ranged between 0 and 0.004% with the false-positive cutoff calculated as 0.01%, and the RT-PCR positive samples showed % of unique reads of 0.56% or higher (Figure 3C); the percentage of unique start sites of fusion reads among total unique RNA start sites in RT-PCR negative samples ranged between 0 and 0.02% with the false-positive cutoff calculated as 0.04%, and the RT-PCR positive samples showed % of total RNA start sites of 0.14% or higher (supplementary table 1). For QIAseq, the % of unique reads in RT-PCR negative samples ranged between 0 and 0.001% with the false-positive cutoff calculated as 0.002%, and the RT-PCR positive samples showed % of unique reads of 0.02% or higher.
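The margin between each cutoff and the lowest RT-PCR positive value can be checked directly from the numbers reported above. The sketch below only restates those figures (all in percent); the dictionary layout and function name are illustrative, and because the inputs are the rounded values from the text, the resulting ratios are approximate.

```python
# (false-positive cutoff, lowest value among RT-PCR positive samples), in percent,
# transcribed from the paragraph above.
reported = {
    "OCAv3 %TMR": (0.35, 3.43),
    "AICv3 %TAR": (0.04, 0.51),
    "FusionPlex % unique reads": (0.01, 0.56),
    "FusionPlex % unique start sites": (0.04, 0.14),
    "QIAseq % unique reads": (0.002, 0.02),
}


def fold_over_cutoff(cutoff, value):
    """Ratio of an observed value to the false-positive cutoff."""
    return value / cutoff


folds = {
    name: fold_over_cutoff(cutoff, lowest_positive)
    for name, (cutoff, lowest_positive) in reported.items()
}
for name, fold in folds.items():
    print(f"{name}: lowest positive is about {fold:.1f}-fold of the cutoff")
```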
The OCAv3 and FusionPlex detected expected fusions on FFPE reference standard
The FFPE reference standard was known to have five fusions: CCDC6-RET, EML4-ALK, ETV6-NTRK3, SLC34A2-ROS1, and TPM3-NTRK1. As shown in supplementary table 3, OCAv3 targets all 10 genes involved in the fusions. An input amount of 20 ng RNA was used for the assay, which yielded 2.2 million reads. FusionPlex targets the six driver genes involved in the fusions: ALK, ETV6, NTRK1, NTRK3, RET, and ROS1. An input amount of 200 ng was used for the assay, which yielded 2.2 million reads. All fusions were detected by both assays, and the breakpoint locations were consistent between the results of the two assays and the information provided by the manufacturer of the FFPE reference standard (supplementary table 4).
DISCUSSION
RNA-based fusion NGS panels are beginning to be used clinically and assay comparisons are urgently needed. In the current study, we compared the calls by four commercial assays with different chemistries for the detection of different fusion transcripts. We used a pilot cohort of frozen tissues with high tumor content, which yielded RNA of good quality (RIN 7 to 8.9) and sufficient quantity. Our observations led us to select OCAv3 and the solid tumor panel of FusionPlex for further in-house clinical validation. This study did not compare all four assays on poor-quality RNA samples, such as those from FFPE tissues, although we achieved successful runs with OCAv3 and FusionPlex using a multiplex FFPE Reference Standard. The study did not include low-quantity specimens or samples with low tumor content. We also did not compare the different bioinformatics processes used by the four assays: because of the significant differences in panel design, library preparation, and sequencing platform, such a comparison could not be truly “fair”, although it could be technically meaningful. Rather, we took the approach of an end-to-end comparison to evaluate the assays in their totality.
Our comparison showed that all four assays successfully detected fusions with known partners and known breakpoints, exemplified by TMPRSS2-ERG. The intrachromosomal rearrangements involving the androgen regulated TMPRSS2 and ETS transcription factors are among the first recurrent fusions identified in commonly occurring carcinomas(6). The most common type of rearrangement involves the fusion of TMPRSS2 to the ERG oncogene, which has been reported in nearly half of prostate cancer patients. Examples of such oncogenic fusions with two well-known partners are also found in other cancers, such as BCR-ABL1 in chronic myeloid leukemia, PML-RARA in acute promyelocytic leukemia, and EML4-ALK in non-small-cell lung cancer. Although the TMPRSS2-ERG fusion is known to have heterogeneous breakpoints, the most common breakpoints were well represented among TMPRSS2-ERG positive samples in the current study, which enabled the successful detection of the fusion by all four assay platforms.
The difference between the platforms in detecting known fusions with novel breakpoints was noted during the identification of TMPRSS2-ETV4. The fusion breakpoint was at a location novel to the manifests of OCAv3 and AICv3. It was called by the former but not the latter. Fusion breakpoints are typically not precise, which presents challenges for fusion calling algorithms that are based on manifest files. Although one might consider updating the manifest for better detection in the future, it is important that the software has built-in algorithms to identify transcripts that are not identical but somewhat similar to the ones in the manifest file. This challenge is minimal for analysis algorithms using de novo assembly, hence the fusion was successfully called by FusionPlex.
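The contrast between strict and tolerant manifest matching can be shown with a toy example. This is only a sketch of the general idea and does not represent how Ion Reporter, RNA Amplicon, or any other vendor software works: the gene names, junction sequences, the `min_ratio` threshold, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical manifest: fusion isoform name -> expected junction sequence.
MANIFEST = {
    "GENEA-GENEB.A1B4": "ACGTACGGATCCTTAGCAGT",
    "GENEA-GENEC.A1C3": "ACGTACGGATGGCATTCGAA",
}


def exact_match(read, manifest):
    """Strict lookup: the read must contain a manifest junction verbatim."""
    for name, junction in manifest.items():
        if junction in read:
            return name
    return None


def closest_match(read, manifest, min_ratio=0.85):
    """Tolerant lookup: return the most similar manifest entry above a threshold.

    Assumption: SequenceMatcher.ratio() stands in for a real alignment score.
    """
    best_name, best_ratio = None, 0.0
    for name, junction in manifest.items():
        ratio = SequenceMatcher(None, read, junction, autojunk=False).ratio()
        if ratio > best_ratio:
            best_name, best_ratio = name, ratio
    return best_name if best_ratio >= min_ratio else None


# A read from a GENEA-GENEC fusion whose breakpoint differs from the manifest
# isoform by two extra bases ("CA") at the junction.
read = "ACGTACGGATCAGGCATTCGAA"
print(exact_match(read, MANIFEST))
print(closest_match(read, MANIFEST))
```

The strict lookup misses the read even though it clearly derives from a listed gene pair, which mirrors the situation where fusion reads are present in the fastq file but no call is made.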
The de novo assembly approach paired with anchored PCR or single primer extension enables the detection of fusions with one partner gene targeted by the assays. This was exemplified by the detection of ETV1 fusions in the current cohort. The ETV1 gene rearranges with multiple partners, the majority of which are unknown(7). The location of ETV1 exons involved in the rearrangements is also highly variable. Therefore, FusionPlex, due to higher exon coverage of ETV1, detected novel fusions. The three fusions detected were SLC45A3-ETV1, SNRPN-ETV1, and MALAT1-ETV1.
SLC45A3, encoding Solute Carrier Family 45 Member 3, also known as Prostein, was identified initially as a prostate cancer-associated gene(8). Due to its prostate-specific and androgen-inducible expression, SLC45A3 is considered a Class II 5’ partner gene for ETS genes in prostate cancer, and has been reported to rearrange with ERG, ETV1, and ETV4(9–11).
SNRPN is well known for its association with Prader-Willi Syndrome (PWS). It is an imprinted gene located within the PWS critical region on chromosome 15 and encodes two polypeptides: the small nuclear ribonucleoprotein polypeptide N and the SNRPN upstream reading frame (SNURF) polypeptide.
The highest expression of SNRPN is seen in brain and its role as a fusion partner gene in prostate cancer is unknown(12). It was reported by one study to have higher expression in hormone-refractory prostate cancers (HRPCs) than in hormone-sensitive prostate cancers (HSPCs)(13). To our knowledge, this is the first time that SNRPN/SNURF was reported as a fusion partner of ETV1.
MALAT1, also known as the Alpha gene, is transcribed as a noncoding RNA, the metastasis-associated lung adenocarcinoma transcript 1. The MALAT1 gene was first discovered as a transcript with increased expression in non-small cell lung cancers prior to metastasis(14). In recent years, it has been reported to show increased expression in prostate cancer compared with adjacent normal tissue(15). In a mouse xenograft model, blocking MALAT1 expression using antisense oligonucleotides prevented metastasis formation after tumor implantation(16). MALAT1 is also the primary fusion partner of the T-cell transcription factor EB (TFEB) in the MiT family translocation renal cell carcinoma (tRCC). The MALAT1-TFEB fusion leads to promoter substitution so that the entire coding sequence of the TFEB gene is linked to the 5’ regulatory region of MALAT1(17). The fusion leads to upregulation of TFEB transcription and translation in the cancer cells. To our knowledge, this is the first time that MALAT1 has been reported to form a fusion with ETV1.
From TMPRSS2-ERG, we observed that highly prevalent fusions with established breakpoints can challenge clinical reporting by appearing as low-level calls that may represent either a low-level subclone or cross-contamination. It is therefore important for clinical labs to establish false-positive cutoffs; we recommend using samples deemed negative by an orthogonal method, such as RT-PCR. Below the cutoffs, low-level subclone fusion cannot be distinguished from low-level cross contamination with certainty. Prostate cancer is known to be heterogeneous genetically. The possibility that the low-level calls represent true sub-clones cannot be entirely ruled out. This however may be less of an issue for metastasis. At the discretion of the clinical lab, the low-level calls may be considered unreportable or of unknown clinical significance. In addition, we chose to use percent of fusion reads over total reads per sample to establish the calling criteria. Future studies are needed to evaluate the utility of this method on highly prevalent fusions in other diseases. In addition, other reference methods may be explored, such as FISH and IHC, depending on the identity of the fusion. The difference in the false positive cutoff values observed between assays can be further investigated, but it may be attributed to variation in panel size, raw vs unique reads, and the read alignment process.
In summary, the current study compared RNA-based fusion assays from four vendors in terms of the detectability of different types of fusions on a pilot cohort of prostate cancer samples. Our study highlights the importance of cross-platform comparison before validating and implementing these assays clinically. The critical observations from the cross-platform comparisons are (1) heterogeneity in fusion breakpoints presents unique challenges in read alignment for fusion-detecting NGS assays; (2) exon coverage needs to be evaluated carefully during panel selection and design; and (3) to determine reportability of low-level reads for highly prevalent fusions, false-positive cutoffs can be established using percent of fusion reads. Such an approach can be used to augment the classic approach of establishing the limit of detection using serial dilutions, which is commonly used in DNA-based NGS assays.
Supplementary Material
Highlights.
Four fusion-detection NGS tests compared
Whether fusion partners and breakpoints are known or unknown affects detectability
OCAv3 and FusionPlex selected for further clinical validation
Use percent fusion reads over total read count to assess reportability of low-level calls
ACKNOWLEDGEMENTS
We thank Colm Morrissey and Bryce Lakely of the Department of Urology at the University of Washington for providing the RNA samples used in this study. We thank the patients and their families, Celestia Higano, Evan Yu, Elahe Mostaghel, Heather Cheng, Bruce Montgomery, Mike Schweizer, Funda Vakar-Lopez, Lawrence True and the rapid autopsy teams for their contributions to the University of Washington Medical Center Prostate Cancer Donor Rapid Autopsy Program. We thank the following individuals and teams for processing samples and providing input for data analysis: Efren Ballesteros Villagrana and John B. Williamson (Thermo Fisher Scientific); members of the Illumina Solutions Center (Illumina); Francesco Lescai and Song Tian (QIAGEN).
Funding:
The Institute for Prostate Cancer Research and the Pacific Northwest Prostate Cancer SPORE P50CA097186 provided support for biospecimen collection and processing. This study was supported by an ArcherDX challenge grant and Hyundai hope scholars grant.
Disclosure/Conflict of Interest:
ArcherDX supplied the ArcherDX library preparation reagents through an ArcherDX challenge grant. ThermoFisher, Illumina, and QIAGEN supplied reagents and performed the library preparation, sequencing, and preliminary data processing for their respective panels: the Oncomine™ Comprehensive Assay v3 (OCAv3) by ThermoFisher, the AmpliSeq for Illumina comprehensive panel v3 (AICv3) by Illumina, and the QIAseq Targeted RNAscan Human Oncology panel by QIAGEN.
REFERENCES
- 1. Druker BJ, Sawyers CL, Kantarjian H, et al. Activity of a specific inhibitor of the BCR-ABL tyrosine kinase in the blast crisis of chronic myeloid leukemia and acute lymphoblastic leukemia with the Philadelphia chromosome. The New England journal of medicine 2001;344:1038–42.
- 2. Soda M, Choi YL, Enomoto M, et al. Identification of the transforming EML4-ALK fusion gene in non-small-cell lung cancer. Nature 2007;448:561–6.
- 3. Delattre O, Zucman J, Plougastel B, et al. Gene fusion with an ETS DNA-binding domain caused by chromosome translocation in human tumours. Nature 1992;359:162–5.
- 4. Parker M, Mohankumar KM, Punchihewa C, et al. C11orf95-RELA fusions drive oncogenic NF-kappaB signalling in ependymoma. Nature 2014;506:451–5.
- 5. Tomlins SA, Mehra R, Rhodes DR, et al. TMPRSS2:ETV4 gene fusions define a third molecular subtype of prostate cancer. Cancer research 2006;66:3396–400.
- 6. Tomlins SA, Rhodes DR, Perner S, et al. Recurrent fusion of TMPRSS2 and ETS transcription factor genes in prostate cancer. Science (New York, NY) 2005;310:644–8.
- 7. Attard G, Clark J, Ambroisine L, et al. Heterogeneity and clinical significance of ETV1 translocations in human prostate cancer. British journal of cancer 2008;99:314–20.
- 8. Xu J, Kalos M, Stolk JA, et al. Identification and characterization of prostein, a novel prostate-specific protein. Cancer research 2001;61:1563–8.
- 9. Esgueva R, Perner S, C JL, et al. Prevalence of TMPRSS2-ERG and SLC45A3-ERG gene fusions in a large prostatectomy cohort. Modern pathology : an official journal of the United States and Canadian Academy of Pathology, Inc 2010;23:539–46.
- 10. Barros-Silva JD, Paulo P, Bakken AC, et al. Novel 5’ fusion partners of ETV1 and ETV4 in prostate cancer. Neoplasia (New York, NY) 2013;15:720–6.
- 11. Tomlins SA, Laxman B, Dhanasekaran SM, et al. Distinct classes of chromosomal rearrangements create oncogenic ETS gene fusions in prostate cancer. Nature 2007;448:595–9.
- 12. Reed ML, Leff SE. Maternal imprinting of human SNRPN, a gene deleted in Prader-Willi syndrome. Nature genetics 1994;6:163–7.
- 13. Tamura K, Furihata M, Tsunoda T, et al. Molecular features of hormone-refractory prostate cancer cells by genome-wide gene expression profiles. Cancer research 2007;67:5117–25.
- 14. Ji P, Diederichs S, Wang W, et al. MALAT-1, a novel noncoding RNA, and thymosin beta4 predict metastasis and survival in early-stage non-small cell lung cancer. Oncogene 2003;22:8031–41.
- 15. Ren S, Liu Y, Xu W, et al. Long noncoding RNA MALAT-1 is a new potential therapeutic target for castration resistant prostate cancer. The Journal of urology 2013;190:2278–87.
- 16. Gutschner T, Hammerle M, Eissmann M, et al. The noncoding RNA MALAT1 is a critical regulator of the metastasis phenotype of lung cancer cells. Cancer research 2013;73:1180–9.
- 17. Kuiper RP, Schepens M, Thijssen J, et al. Upregulation of the transcription factor TFEB in t(6;11)(p21;q13)-positive renal cell carcinomas due to promoter substitution. Human molecular genetics 2003;12:1661–9.