Scientific Reports. 2025 Jul 1;15:20740. doi: 10.1038/s41598-025-08039-6

A targeted next-generation sequencing panel for identification of clinically relevant mutation profiles in solid tumours

Kakoli Das 1, Mandy Li Ian Tay 1, Elena Yaqing Yong 1, Khoon Leong Chuah 1
PMCID: PMC12214661  PMID: 40593313

Abstract

Targeted next-generation sequencing (NGS) using multigene panels has become an effective tool for comprehensive genomic analysis in cancer, overcoming limitations of single-gene assays. Nonetheless, outsourcing these assays to external laboratories and the extended turnaround time (~3 weeks) required for obtaining results may impede timely clinical management of cancer patients. We developed an oncopanel targeting 61 cancer-associated genes and validated its efficacy by performing NGS on 43 unique samples including clinical tissues, external quality assessment samples, and reference controls. The assay detected 794 mutations including all 92 known variants from orthogonal methods. Overall performance measures of the assay showed 99.99% repeatability and 99.98% reproducibility. Likewise, sensitivity to detect unique variants was 98.23%, with specificity at 99.99%, precision at 97.14% and accuracy at 99.99%, all at 95% CI. Notably, clinically actionable mutations were observed in key genes such as KRAS, EGFR, ERBB2, PIK3CA, TP53 and BRCA1. The average turnaround time from sample processing to results was reduced to 4 days. These findings demonstrate a sensitive, high-throughput oncopanel that is suitable for use in routine clinical testing. The shorter turnaround time of the assay has the potential to significantly improve patient care by facilitating more timely and personalized clinical interventions.

Keywords: Targeted sequencing, Cancer, Genomics, Somatic mutation, Oncopanel, NGS

Subject terms: Medical research, Oncology

Introduction

Classic histo-morphological diagnosis by immunohistochemistry and fluorescence in-situ hybridization is undoubtedly the gold standard in pathology. However, it is often necessary to characterize a tumour and identify genomic biomarkers for better clinical solutions. While single gene mutation assays1–3 in our laboratory are used for specific mutations in lung and colorectal tumours, they are limited in their capacity for comprehensive tumour profiling, for detailed reporting with treatment options, and for conserving tissue samples for future testing.

In the genomic era, rapid, accurate, and cost-effective genomic assays are crucial for precision medicine4,5 and are preferred over multiple single-gene mutation assays. Genotype-based therapeutic approaches using NGS have enabled the identification of actionable cancer mutations, and NGS has become a preferred decision-making tool over multiple single-gene tests. Panel-based targeted NGS is key for effective treatment, improving detection rates for personalized therapy with higher coverage and greater confidence in identifying somatic mutations compared to whole exome or genome sequencing, which may yield many variants of uncertain significance6,7.

Targeted NGS comprises (a) a target enrichment method to prepare the DNA libraries, which can be amplicon-PCR based8 or hybridization-capture based with custom designed biotinylated oligonucleotides9, (b) a sequencing chemistry, either ion semiconductor sequencing chemistry10 or a sequencing-by-synthesis chemistry11,12, and (c) a sequencing platform, which can be either the Ion S5 (ThermoFisher Scientific, CA, USA) or the MiSeq (Illumina, USA) benchtop sequencer. Both platforms are efficient, with benefits and drawbacks that depend on the validated gene panels, the type of tissues processed, and the DNA extraction methods used prior to sequencing13.

Our laboratory currently relies on external providers for NGS testing, which, although effective, has a turnaround time (TAT) of about 3 weeks and incurs high costs due to overseas sample shipping; these challenges affect cancer patients needing urgent care. Additionally, we have experienced a recent increase in NGS test requests. While we aim to deliver rapid, reliable, in-house genomic services, we have faced difficulties optimising commercial gene panels with our sequencing platform, coupled with the high cost of NGS kits. Therefore, we decided to develop a custom pan-cancer panel focusing on genes with frequently altered regions in cancer patients to enhance diagnostic and treatment strategies. The cost of the in-house NGS assay and bioinformatics analysis was lower than that of the NGS assay performed by the external laboratories, including the shipping charges.

We applied a hybridization-capture based DNA target enrichment method using library kits (Sophia Genetics, Saint-Sulpice VD, Vaud, Switzerland) compatible with the automated MGI SP-100RS library preparation system (MGI Tech, Shenzhen, China). This is an open platform that supports third-party kits and offers faster, more reliable, and more efficient processing, with reduced human error and contamination risk and greater consistency than the manual library preparation method14. Sequencing was performed using the MGI DNBSEQ-G50RS sequencer with cPAS sequencing technology14 for precise sequencing with high SNP and Indel detection accuracy15. The TTSH-oncopanel’s performance was assessed with Sophia DDM™ software, which uses machine learning for rapid variant analysis and visualization of mutated and wild type hotspot positions. The software connects molecular profiles to clinical insights through OncoPortal™ Plus, classifying somatic variations by clinical significance in a four-tiered system16. The TAT of the assay17, from sample processing to report generation, was 4 days.

We evaluated the oncopanel performance in three steps: (a) analyzing sequencing quality using reference standards and tumour samples, (b) assessing the somatic mutation landscape in 40 diverse tumour specimens for reliability and concordance with other NGS methods, and (c) examining the clinical relevance of the TTSH-oncopanel for routine clinical testing.

Results

Results of variant analysis

Sequencing quality metrics

Based on the requirements for performance measures developed by Sophia DDM™, the sequencing runs met the expected ranges for the quality assessment of the sequencing data. The average percentage of processed reads with average base call quality ≥ 20 in the four sequencing runs was > 99%, within the expected range (85–100%) (Table 1). The percentage of target regions with at least 25× to 1000× molecular coverage was analysed, and the average percentage of target region with coverage ≥ 100× unique molecules was > 98%, within the expected range (95–100%). The coverage 10% quantile metric ranged between 251× and 329× across the four sequencing runs. Likewise, the median percentage of target region with coverage between 0.2× and 5× the median coverage was also > 98%, and the median coverage uniformity was > 99% in each of the four sequencing runs (Table 1). No known mutational hotspots were associated with the regions of low coverage (< 0.2× median coverage). The median read coverage was 1671× (469×–2320×) and the median read length was 144 bp (112–179 bp) for all the samples. Mapped read counts of the samples ranged between 2.2 and 7.1 million reads (Supplementary Table S1).

Table 1.

Sequencing quality control metrics showing the mean and median percentage across the four sequencing runs with 16 samples in each run.

| Sample quality metrics | Run1 (n = 16) | Run2 (n = 16) | Run3 (n = 16) | Run4 (n = 16) |
|---|---|---|---|---|
| Median (range) | | | | |
| Coverage uniformity | 99.97% (99.74–99.98%) | 99.83% (98.64–99.97%) | 99.88% (96.29–99.92%) | 99.89% (93.82–99.97%) |
| Median read coverage | 2102 (1627–2257) | 2234 (1184–2313) | 1169 (469–2320) | 1563 (924–1700) |
| On target reads | 78.59% (72.41–79.38%) | 75.98% (62.57–79%) | 76.92% (69.34–82.32%) | 80.15% (76.78–82.32%) |
| Number of mapped reads | 6,835,792 (6,169,212–6,954,418) | 6,891,007 (5,910,302–7,120,014) | 4,447,429 (2,298,866–6,970,342) | 5,936,997 (5,684,822–6,072,324) |
| Mean (expected range) | | | | |
| Reads with average base call quality ≥ 20 | 97.99% (75–100%) | 97.44% (75–100%) | 97.46% (75–100%) | 97.10% (75–100%) |
| Processed reads with average base call quality ≥ 20 | 99.48% (85–100%) | 99.39% (85–100%) | 99.38% (85–100%) | 99.39% (85–100%) |
| Target region coverage ≥ 100× | 99.95% (95–100%) | 98.38% (95–100%) | 99.82% (95–100%) | 99.65% (95–100%) |
| Coverage 10% quantile metric across target regions | 329× (≥ 250×) | 298× (≥ 250×) | 313× (≥ 250×) | 251× (≥ 250×) |
| Target region with coverage < 0.2× median coverage | 0.06% | 1.05% | 0.20% | 0.20% |
| Target region with coverage between 0.2× and 5× median coverage | 99.93% | 98.94% | 99.70% | 99.53% |
| Target region with coverage > 5× median coverage | 0% | 0% | 0.09% | 0.26% |
| Target region that fulfils screened position criteria | 100% (95–100%) | 100% (95–100%) | 100% (95–100%) | 100% (95–100%) |

The five variant categories were then determined based on the characterized and confirmed sample data sets for all the target regions covered by the panel. A total of 593 true positives (TP; 534 SNPs and 59 INDELs) and 339,661 true negatives (TN) were identified in the nine characterized samples (Supplementary Table S2). The observed quality metrics at 95% CI were 97.14% sensitivity, 99.99% specificity, 97.14% precision and 99.99% accuracy. A total of 738 TP (640 SNPs and 98 INDELs) and 2233 extra variants (EV; 2054 SNPs and 179 INDELs) were identified, with no false positive (FP) or false negative (FN) results, in all 64 samples (Supplementary Table S2). A sensitivity of 98.23% was observed at 95% CI for all these samples.

Analytical performance

The amount of DNA required for the NGS assay was determined by titrating HD701 at varying inputs (10–100 ng) and sequencing the resulting libraries. NGS detected all 13 mutations within the expected allelic frequencies when the DNA input was ≥ 50 ng, although two of the EGFR mutations were of low quality, whereas only eight mutations were detected when the DNA input was ≤ 25 ng (Supplementary Table S3a). Therefore, ≥ 50 ng of DNA input was confirmed as the requisite amount for targeted sequencing using the TTSH-oncopanel.

For the limit of detection of the NGS assay, sensitivity was 100% for variants with > 3.0% VAF in the undiluted sample when compared with the expected VAFs of HD701 (Supplementary Table S3b). The detected variants showed reduced VAFs across the dilution series. Below a VAF of 2.9%, a greater percentage of variants were either of low quality with high background noise or not determined. The minimum detected VAF was therefore determined as 2.9% for both SNVs and INDELs.

Replicate analysis

Assay reproducibility (inter-run precision) was assessed by comparing the first replicate of the 15 unique samples with the second replicate. The detected variants and their variant fractions were highly consistent between the replicate runs. A few inconsistent variants were, however, observed during the analysis and were filtered out (Supplementary Table S4a). For example, two out of three replicates of HD701 exhibited the variant EGFR L858R with a VAF below 2.9% and were consequently excluded from analysis, while only one of the three was retained as a confirmed variant. Similarly, a small repeat region of the gene KIT was called in the SG060 replicates as an INDEL in one and an SNV in the other. Discrepancies in the number of confirmed variants were also observed in the patient tumour samples S26 and S40; these were attributed to low quality variants in one of the replicates, which were excluded from analysis. Additionally, a replicate of S46 lacked sufficient read support for a PALB2 variant (c.409G > T), which was filtered out from the reproducibility analysis. The overall reproducibility of the TTSH-oncopanel was 99.99% for total variants and 99.98% for unique variants at 95% CI (Fig. 1a).

Fig. 1.


Scatter plot of variant fractions for inter- and intra-run replicates. (a) Reproducibility scatter plot of variant fractions for 15 inter-run replicates. (b) Repeatability scatter plot of variant fractions for 5 intra-run replicates. The grey dotted lines represent the theoretical standard deviation (1 and 3 sigma) of variant fractions based on the assumption of high starting materials and an observed coverage of 100×. The red dashed line is the identity. The blue square(s) in the lower-left corner represent the low variant fraction threshold (SNP = 2.9%, INDEL = 2.9%). An inconsistent result in one replicate was due to a variant fraction below the threshold, high background noise, or a variant call quality below the quality threshold; such variants were filtered out.

Likewise, assay repeatability (intra-run precision) was assessed by sequencing 5 samples (2 unique samples) indexed with different barcodes and sequenced in duplicates or triplicates within a single run. The sequencing and variant repeatability analysis showed two inconsistent variants between the replicates of FFPE samples S13, S20 and S46 (Supplementary Table S4b). These inconsistencies arose either because the VAF fell below the threshold, so that the variant was marked as low quality and/or detected as background noise, or because read support was insufficient; the variants were therefore excluded from analysis. The overall repeatability of the TTSH-oncopanel for total variants and unique variants was 99.99% at 95% CI (Fig. 1b).
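The 1 and 3 sigma bands described in the legend of Fig. 1 can be illustrated with a simple sampling model. The sketch below assumes that the observed variant fraction follows binomial sampling at a fixed coverage (100×, as stated in the legend). The binomial model and the helper names `vaf_sigma` and `within_band` are illustrative assumptions, not the Sophia DDM implementation.

```python
from math import sqrt


def vaf_sigma(vaf, coverage=100):
    """Binomial standard deviation of an observed variant fraction (0..1)."""
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be a fraction between 0 and 1")
    return sqrt(vaf * (1.0 - vaf) / coverage)


def within_band(vaf_a, vaf_b, coverage=100, n_sigma=3):
    """Hypothetical check: do two replicate fractions agree within n_sigma?

    The expected fraction is taken as the mean of the two replicates.
    """
    expected = (vaf_a + vaf_b) / 2.0
    return abs(vaf_a - vaf_b) <= n_sigma * vaf_sigma(expected, coverage)


# Illustrative values only, not data from the study.
print(vaf_sigma(0.5))           # sigma for a heterozygous-like fraction
print(within_band(0.10, 0.14))  # small difference between replicates
print(within_band(0.10, 0.40))  # large difference between replicates
```

Under this model the band is widest near a variant fraction of 50% and narrows towards 0% and 100%, which is why low-fraction variants near the 2.9% threshold are the most sensitive to replicate-to-replicate noise.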

Long term test reproducibility

The TTSH-oncopanel was also evaluated for long-term reproducibility by repeatedly testing the mutation positive control, HD701. The control showed 9 SNVs ≥ 3% VAF (Supplementary Table S5). All alterations were successfully detected in all the repeat tests with the detected variants exhibiting a coefficient of variation less than 0.1x.
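As a minimal sketch of how a coefficient of variation can be computed for one variant across repeat tests, the example below divides the sample standard deviation by the mean. The VAF values are hypothetical and chosen for illustration; they are not taken from Supplementary Table S5.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean of the values."""
    if len(values) < 2:
        raise ValueError("at least two measurements are required")
    avg = mean(values)
    if avg == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return stdev(values) / avg


# Hypothetical VAFs (%) for a single variant across five repeat runs.
vafs = [5.0, 5.2, 4.8, 5.1, 4.9]
cv = coefficient_of_variation(vafs)
print(f"CV = {cv:.4f}")
```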

Molecular landscape of the tumour samples

The Sophia DDM results were compared with external NGS data and CAP for the 40 samples (28 FFPE and 12 EQA). At least 92 CVs in 38 out of 40 samples were observed, which were 100% concordant with the orthogonal genomic data from external laboratories (Supplementary Table S6). Only one (S13) of the 38 samples was MSI-H. Variants detected in two of the FFPE samples (S23 and S25) by Sophia DDM could not be confirmed as these were not detected in the orthogonal methods, either because these variants were not covered by the external laboratory targeted panel or their VAF was below the sensitivity of the assay. These two samples therefore only harboured EVs (Supplementary Table S7).

The TTSH-oncopanel detected a total of 794 mutations (629 SNVs and 165 INDELs), excluding synonymous and intronic variants, in 42 of the 61 genes in the panel across the 40 tumour samples. These mutations included frameshifts (n = 31), inframe variants (n = 17), intergenic or promoter region variants (n = 3), nonsense mutations (n = 10), splice site variants (n = 20), 3′ and 5′ UTR variants (n = 30) and missense mutations (n = 683). Out of the 794 mutations, 702 EVs were detected exclusively by our assay and 92 were CVs (Table 2).

Table 2.

Variant analysis in the tumour samples of patients and EQA used in the study.

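| Tumour type | N = 40 | Sample no | SNV (n = 629): CV | SNV (n = 629): EV | INDEL (n = 165): CV | INDEL (n = 165): EV | Mutations (SNV + INDEL), n = 794 |
|---|---|---|---|---|---|---|---|
| Bladder | 2 | S7 | 3 | 14 | 0 | 6 | 23 |
| | | S11 | 3 | 23 | 0 | 5 | 31 |
| Brain | 4 | S15 | 2 | 11 | 0 | 4 | 17 |
| | | S19 | 1 | 16 | 0 | 7 | 24 |
| | | S23 | 0 | 17 | 0 | 6 | 23 |
| | | S24 | 1 | 14 | 1 | 2 | 18 |
| Breast | 1 | S18 | 2 | 16 | 0 | 5 | 23 |
| Colon | 3 | S12 | 1 | 17 | 0 | 6 | 24 |
| | | S14 | 2 | 12 | 0 | 6 | 20 |
| | | S21 | 2 | 19 | 0 | 7 | 28 |
| Gastric | 3 | S3 | 1 | 13 | 0 | 6 | 20 |
| | | S4 | 1 | 17 | 0 | 6 | 24 |
| | | S6 | 1 | 22 | 1 | 6 | 30 |
| Lung | 3 | S1 | 2 | 20 | 2 | 4 | 28 |
| | | S32 | 1 | 10 | 0 | 4 | 15 |
| | | S34 | 1 | 16 | 0 | 5 | 22 |
| Nasal cavity | 1 | S16 | 1 | 16 | 0 | 4 | 21 |
| Ovary | 1 | S28 | 1 | 16 | 1 | 2 | 20 |
| Pancreas | 5 | S2 | 2 | 17 | 0 | 4 | 23 |
| | | S9 | 2 | 14 | 0 | 5 | 21 |
| | | S20 | 2 | 14 | 0 | 5 | 21 |
| | | S22 | 1 | 13 | 0 | 6 | 20 |
| | | S27 | 2 | 15 | 0 | 6 | 23 |
| Renal pelvis | 1 | S26 | 5 | 18 | 0 | 3 | 26 |
| Spine | 2 | S13 | 1 | 19 | 1 | 6 | 27 |
| | | S17 | 1 | 19 | 0 | 6 | 26 |
| Female genital tract | 1 | S37 | 1 | 24 | 0 | 6 | 31 |
| Skin | 1 | S25 | 0 | 17 | 0 | 7 | 24 |
| EQA sample 1 | 1 | S38 | 3 | 8 | 0 | 0 | 11 |
| EQA sample 2 | 1 | S39 | 1 | 8 | 2 | 0 | 11 |
| EQA sample 3 | 1 | S40 | 4 | 8 | 0 | 0 | 12 |
| EQA sample 4 | 1 | S41 | 2 | 9 | 1 | 0 | 12 |
| EQA sample 5 | 1 | S42 | 2 | 8 | 1 | 0 | 11 |
| EQA sample 6 | 1 | S43 | 3 | 8 | 1 | 0 | 12 |
| EQA sample 7 | 1 | S44 | 2 | 8 | 2 | 0 | 12 |
| EQA sample 8 | 1 | S45 | 3 | 8 | 1 | 0 | 12 |
| EQA sample 9 | 1 | S46 | 2 | 9 | 3 | 0 | 14 |
| EQA sample 10 | 1 | S47 | 3 | 8 | 1 | 0 | 12 |
| EQA sample 11 | 1 | S48 | 1 | 8 | 1 | 0 | 10 |
| EQA sample 12 | 1 | S49 | 3 | 8 | 1 | 0 | 12 |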

EQA: external quality assessment; CV: confirmed variant; EV: extra variant.

BRCA1 (11.21% of 794, 89 variants) was the most frequently mutated gene, followed by BRIP1 (10.45%), TP53 (9.95%), BRCA2 (9.45%) and BARD1 (7.68%). The other genes (n = 37) each accounted for a narrow range of 0.1–7% of the 794 mutations. Among the cancer types, the bladder cancer sample S11 and female genital tract cancer sample S37 (3.9% of 794 each), followed by gastric cancer sample S6 (3.8%), carried the highest number of mutations (Supplementary Table S7). Notably, we observed numerous examples of mutations present in one sample but not in the others. Examples of such ‘private’ mutations included AKT1 E17K (sample S43), CDK4 V57_L59del (sample S25), CDKN2A D108G (sample S26), FGFR3 N653T (sample S37), GNAQ Q209P (sample S17), GNAS R201H (sample S4), SF3B1 R775G (sample S22) and SMAD4 Y353F (sample S26).

Next, we analyzed the somatic variants with amino acid changes that were recurrent in the samples. Recurrent mutations are identical somatic mutations found at the same genomic location in two or more cancer patients. Recurrently mutated genes and regulatory elements provide a growth advantage to tumour cells and are used in the prediction of cancer drivers. One such driver gene is TP53, which is known to regulate core oncogenic pathways; mutations in TP53 occur at an early stage of tumourigenesis across multiple tumours18. In the current study, TP53 harbored 28 distinct mutations, among which known oncogenic driver mutations such as S215G, R175H, R273H and C135F were recurrent in two or more samples. Similarly, other known oncogenic mutations such as BRAF (V600R), EGFR (T790M and L858R), ERBB2 (Y772_A775dup), KRAS (G12D and G12V), and PIK3CA (E542K and H1047R) were also recurrent.

We further validated two confirmed mutations identified within the EGFR hotspot regions in lung cancer sample S1 using the single gene mutation Idylla qPCR assay (Biocartis, Mechelen, Belgium), which confirmed the presence of these mutations. This supported the accuracy of the TTSH-oncopanel for the detection of somatic mutations. Furthermore, the breast cancer sample S18, which exhibited an ERBB2 copy number of 3.3 (CN 3.3) by NGS, also showed a HER2 copy number of 3.33 (CN 3.33) by FISH, although the overall HER2/CEP17 signal ratio status was negative (< 2.0). This highlighted the concordance between the two methods. In contrast, the EGFR exon 20 insertion (D770delinsGY) detected by NGS in brain cancer sample S24 was not detected by the Idylla single gene assay. This was most likely due to the absence of the EGFR target region in the single gene assay. These findings suggest the necessity of comprehensive analysis by NGS, which makes more information available to the clinician for patient care management.

We further examined whether the TTSH-oncopanel could identify copy number alterations (CNAs) in known positive samples identified by the previous NGS method. The NGS assay detected CNAs in 24 genes in 16/40 samples. Nine of the 24 genes (ERBB2, FGFR1, FGFR2, FGFR3, MDM2, MET, KRAS, PIK3CA, and TERT) showed concordance with other NGS methods. Among the cancer types, ERBB2 CNAs were detected in brain tumour samples S24 (CN 69.1) and S19 (CN 10.3) (Fig. 2), S7 bladder tumour (CN 4.6) and S18 breast tumour (CN 3.3). Similarly, CNAs in the FGFR1 gene were identified in S32 lung tumour (CN 15.8), S23 brain tumour (CN 5.1) and S16 nasal cavity tumour (CN 3.5) samples, suggesting that gene amplifications with CN > 6.0 could be a marker of cancer progression in these tumours.

Fig. 2.


Copy number amplification levels in the genes detected by NGS. Plot shows CNAs in 24 genes of the TTSH-oncopanel in 16 patient FFPE tumours with ERBB2 exhibiting the highest copy numbers (CN 69.1) in patient S24.

Clinically actionable somatic mutations

Somatic variants identified by targeted NGS can be translated to ‘actionable mutations’ in the clinic. These actionable mutations or molecular targets with clinical utility may guide treatment using existing and potential new drugs. For this, pathogenicity prediction levels were determined in the variants and the ‘pathogenic’ and ‘likely pathogenic’ mutations were further analyzed using the OncoPortal™ Plus functionality for strong (Tier I) and potential (Tier II) clinical significance.

Out of the 794 mutations in the forty samples, 69 (8.5%) mutations were classified as pathogenic, and 31 (3.9%) mutations were likely pathogenic that conferred cancer predisposition (Supplementary Table S7). The clinical patients harboring pathogenic mutations (n = 58) were individually checked for their clinical associations within the disease or another disease on the OncoPortal™ Plus platform. Forty-five variants (78%) were identified as having Tier I or Tier II clinical significance within the disease (15/45, 33%) or another disease (30/45, 67%) for which therapy was available. These actionable variants were either sensitive (39/45, 87%) or resistant (6/45, 13%) to targeted therapy or experimental drugs in the patient disease or another disease. Clinically actionable mutations were detected in bladder cancer (ERBB2 S310F), colon cancer (KRAS G12D & TP53 R175H, E204*), lung cancer (EGFR L747_T751del, T790M, C797S & TP53 N263Tfs), pancreatic cancer (KRAS G12D, G12V, Q61R & TP53 R196*), ovarian cancer (BRCA1 N909Kfs & TP53 V173M) and breast cancer (PIK3CA H1047R) (Fig. 3).

Fig. 3.


Actionable alterations (Tier I or Tier II) detected by TTSH Oncopanel. Patients harboring pathogenic (level A) or likely pathogenic (level B) mutations that are clinically actionable (Tier I or Tier II) and sensitive or resistant to the targeted therapy.

Discussion

Targeted NGS brings a unique advantage for the detection of somatic alterations using a single platform in a wide range of solid tumours together with the quality, type, and amount of input materials for genomic profiling. In the present study, we developed and validated a 61-gene targeted TTSH-oncopanel to examine a wide array of molecular alterations that are frequently detected across pan-cancer models, using the MGI library prep system and sequencer, which proved valuable in this regard. We used the Sophia Genetics custom designed panel for our validation study because of its compatibility with the MGI automated library prep platform. The automated system not only enhanced efficiency but also minimized the risk of contamination and human pipetting errors and ensured a streamlined workflow that contributed to an increased library yield.

Our panel of 61 genes was optimized to provide exceptional coverage uniformity throughout the entire target regions and sensitivity for SNVs, INDELs and CNAs, resulting in high data quality. Accurate detection of variant allelic frequencies ≥ 3% ensured precise identification of patient somatic alterations. We achieved 100% concordance with previously analysed genomic data from other NGS methods (FoundationOne/Tempus XT/CAP) on the same samples, validating the accuracy of our variant calling. Furthermore, our assay demonstrated high reproducibility in both intra- and inter-assay comparisons, affirming the reliability and precision of the TTSH-oncopanel and our NGS protocol. NGS assays on retrospective FFPE samples present unique challenges, such as achieving the optimal coverage that is pivotal for assay accuracy. We demonstrated that a minimal input of DNA (50 ng) was sufficient for effective target enrichment, facilitating genomic profiling from small biopsies.

The actionable oncogenic driver mutations detected in the study demonstrated the immediate clinical applicability of the TTSH-oncopanel using the OncoPortal™ Plus platform. For example, our assay detected three EGFR mutations (L747_T751del, T790M and C797S) in lung cancer patient S1, which are targetable with Tier I and Tier II therapies19,20. Our assay further revealed that the EGFR mutations co-occurred with other mutations, specifically TP53 variants. TP53 is known to be mutated in 50% of solid tumours, especially in exon 8, and its co-occurrence with EGFR mutations in advanced lung cancer has recently been suggested21. The TTSH-oncopanel NGS assay therefore offers a comprehensive approach to tumour characterization and a thorough assessment of the genomic markers.

The NGS assay also detected variants that were unavailable from the single gene assay. It detected an EGFR variant (exon 20ins) that was undetected by the Idylla single gene assay, highlighting the restricted scope of the Idylla assay’s detection capabilities. Likewise, we also observed KRAS mutations in the five pancreatic and two colon cancer samples, which are key factors in the disease pathogenesis and prognosis, in accordance with previous studies22,23. The KRAS G12D mutation also co-existed with TP53 mutations R175H & E204*, indicating the relevance of multigene panel assessment for detecting clinically actionable mutations that may have been overlooked by more limited approaches. Besides these known alterations, we also detected EVs exclusively in our NGS analysis, such as the RAD54L variant in bladder cancer sample S7; recent studies have shown that this homologous recombination repair gene plays an important role in bladder cancer progression by regulating cell cycle and cell senescence24. In addition to the somatic mutations, our study also identified gene amplifications, notably of the ERBB2 gene, which was amplified in other cancer types besides breast cancer, as reported by other studies25–27.

The in-house NGS assay integrated two assays, the single gene assay for mutation detection and fluorescence in situ hybridization for copy number analysis. This comprehensive ‘two in one’ gene panel approach could potentially spare the patient from undergoing multiple tests on various platforms. However, the assay was unable to detect gene fusions, as it was designed to detect only SNVs/Indels and CNVs. The assessment of gene fusions, such as those involving the ALK and ROS-1 genes, which are important in both the detection and treatment of lung cancer, is currently achieved by the Idylla cartridge-based RT-PCR assay in the laboratory. Moreover, an independent analysis by an external laboratory remains to be performed. The study included a limited number of patients, as these were the previously characterized samples available for the retrospective validation, for which the results of an alternative method were known.

In conclusion, our findings demonstrate that the NGS assay using the TTSH-oncopanel, benchtop MGI instruments and the Sophia DDM bioinformatics pipeline can detect clinically significant somatic alterations with high accuracy, sensitivity, precision and reproducibility. The assay does not require a bioinformatics expert in the laboratory to analyze the data and therefore reduces turnaround time to only 4 days from sample processing to clinical reporting via integration of third-party cloud software. This approach could serve as a model for other laboratories seeking to optimize their genomic workflows for increased efficiency and reliability. Application of our panel in clinical decision-making may become feasible in the future.

Methods

Targeted panel design

We identified genes with known somatic alterations in cancer through our laboratory database search and published work of other NGS laboratories5–8,19,21,28–30 and selected 518 target regions covering 95,923 bp across a 61-gene panel for somatic mutation analysis, with 48 genes included for copy number analysis, based on known cancer-related somatic alterations. Primer specificity was verified, and a custom solid tumour solution with barcodes was provided by Sophia Genetics for targeted DNA library preparation (Supplementary Table S8).

Samples for validation

A total of 43 specimens were used in the validation study, including: (a) 28 FFPE tissue samples from patients, retrospectively obtained from Tan Tock Seng Hospital (TTSH), Singapore, tested between 2018 and 2022 using alternative NGS methods (FoundationOne/Tempus XT/CAP), (b) 12 EQA DNA samples from CAP (Illinois, USA) (Fig. 4), and (c) 3 reference standard DNA samples: (i) HD701 multiplex reference standard (Horizon Inc., Cambridge, UK), (ii) SG060, the HD701 cell line embedded in FFPE (Sophia Genetics) and (iii) NA24385 (Coriell Institute for Medical Research, NJ, USA), a negative control DNA.

Fig. 4.


Distribution of 40 tumour types for the validation study. EQA samples are the external quality assessment DNA samples received from CAP for laboratory performance. The other tumour types are the patient FFPE samples. All the samples were obtained retrospectively and tested for somatic alterations by an alternative NGS method.

Genomic DNA was extracted from 28 FFPE 5 µm sections (≥ 30% tumour content) using the Promega ReliaPrep™ system (Promega Corporation, WI, USA). DNA libraries for 64 samples (including replicates) were prepared using Sophia Genetics kits on the MGISP-100 system. The library preparation process involved DNA fragmentation, end repair, adapter ligation, multiplex PCR, and hybridization-capture based target enrichment using the 61-gene panel probes (Supplementary Table S9). The risk of cross contamination between samples in a run was mitigated by using unique dual indices (UDI). The libraries were circularized to form DNA nanoballs (DNB)31 and sequenced on the MGI DNBSEQ-G50RS sequencer with 2 × 100 paired-end sequencing. Base calling and quality checks were performed, and multiplexed CAL files were de-multiplexed into individual FASTQ files.

Sequencing data quality analysis

Variant calling used a pipeline developed on the Sophia DDM bioinformatics platform (version 5.5.71)32. Accordingly, raw sequencing data were subjected to a series of filtering steps to obtain high confidence variant calls with 100× minimum coverage. Low quality reads with poor mapping quality or an average Phred score ≤ 20 were trimmed. Clean reads were aligned to the human reference genome (hg19). Soft clipping was applied to unaligned bases at read ends, and mapping metrics such as on-target rate and median sequencing depth were assessed. PCR duplicates were removed. Coverage heterogeneity was measured as the percentage of base pairs in the target region with coverage below 0.2× or above 5× the median target coverage. Coverage uniformity was defined as the inverse of coverage heterogeneity, set at ≥ 97% for each sample.
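The coverage heterogeneity and uniformity definitions above can be sketched in a few lines. The example assumes that "inverse" means the complement (100% minus heterogeneity), which is consistent with the values in Table 1; the function name `coverage_uniformity` and the toy depth values are hypothetical and do not reflect the Sophia DDM implementation.

```python
from statistics import median


def coverage_uniformity(depths, low=0.2, high=5.0):
    """Return (heterogeneity %, uniformity %) for a list of per-base depths."""
    if not depths:
        raise ValueError("depths must not be empty")
    med = median(depths)
    # A position is heterogeneous if its depth is below low x median
    # or above high x median.
    outliers = sum(1 for d in depths if d < low * med or d > high * med)
    heterogeneity = 100.0 * outliers / len(depths)
    # Assumption: uniformity is the complement of heterogeneity.
    return heterogeneity, 100.0 - heterogeneity


# Toy example: ten positions, one far below and one far above the median.
depths = [1000, 1100, 950, 1020, 980, 1005, 990, 1010, 150, 6000]
het, uni = coverage_uniformity(depths)
print(f"heterogeneity = {het:.1f}%, uniformity = {uni:.1f}%")
```

A sample would pass the criterion stated above when the returned uniformity is at least 97%.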

Analytical performance

The analytical performance of our variant analysis pipeline was evaluated for detecting single nucleotide variants (SNVs) and small insertion and deletions (INDELs). All the analyses were based on the comparison of the retained variant list generated by Sophia DDM with the confirmed variants (CVs) analysed previously by other targeted NGS methods. Two types of data sets were used in this study:

  1. Characterized samples: The reference samples (HD701, SG060, NA24385) and their replicates with at least one fully characterized region overlapping the target region. Such region(s) had been characterized by Sanger, ddPCR and/or by another massively parallel sequencing method specifying the status of each position (both presence and absence of variants). The target regions that were not characterized in the NA24385 characterized sample were labelled as ‘uncharacterized regions’.

  2. Confirmed samples: All the samples with at least one confirmed variant, with or without characterized regions.

All the reported variants were used in the evaluation of quality measures. Sensitivity was defined as the percentage of confirmed variants detected in all the samples (Supplementary Table S10). Specificity, accuracy and precision were determined only on the characterized regions, as uncharacterized regions might contain variants that were not confirmed. Consistency was determined for inter- and intra-run replicates. Each position within the target regions of the gene panel was therefore taken into consideration in the calculation of these statistics.

Five variant categories were determined from these data sets for all the target regions covered by the panel: (a) true positives (TP): confirmed variants detected by the Sophia DDM platform; (b) false negatives (FN): confirmed variants not detected by Sophia DDM; (c) false positives (FP): non-confirmed variants detected by Sophia DDM within characterized regions; (d) extra variants (EV): non-confirmed variants detected by Sophia DDM in uncharacterized regions; and (e) true negatives (TN): positions in characterized regions with no variant detected and no confirmed variant specified. For each sample, all screened positions (TP+EV+FP+TN+FN) were determined by subtracting undefined positions from the target region of the panel. Quality measures were calculated based on these variant categories. The expected range for the average percentage of the target region that fulfilled the screened position criteria was set at 95% to 100% (Supplementary Table S10).
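As an illustration of how quality measures follow from these categories, the sketch below uses the standard textbook definitions of sensitivity, specificity, precision and accuracy. The exact formulas used in the study are given in Supplementary Table S10 and may differ; the function name is hypothetical, and extra variants (EV) are assumed to be excluded because they lie in uncharacterized regions.

```python
def _ratio(numerator, denominator):
    """Safe division that returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def quality_measures(tp, fp, tn, fn):
    """Standard confusion-matrix measures, returned as fractions in [0, 1].

    tp, fp, tn, fn are counts of screened positions per category.
    EV counts are deliberately not an input (assumption: they are
    excluded because their true status is unknown).
    """
    return {
        "sensitivity": _ratio(tp, tp + fn),   # confirmed variants detected
        "specificity": _ratio(tn, tn + fp),   # variant-free positions called clean
        "precision": _ratio(tp, tp + fp),     # detected variants that are confirmed
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
    }
```

For example, with hypothetical counts of 90 TP, 10 FP, 890 TN and 10 FN, `quality_measures(90, 10, 890, 10)` gives a sensitivity and precision of 0.90 and an accuracy of 0.98. Because true negatives are counted per position, TN dominates in a panel of this kind, which is why specificity and accuracy tend to sit very close to 100%.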

Variant call accuracy was defined as the percentage of correct calls for each variant class. To determine this, a series of filters was applied to the raw SNP/INDEL calls, covering problematic or noisy regions for SNPs, undefined positions exhibiting a low variant fraction (< 2% for SNPs and < 2% for INDELs) and any INDELs ≥ 10 bp located in homopolymer regions. Information about these variants was obtained from several sources: identifiers from OMIM33, dbSNP34 and COSMIC35; allele frequencies from the 1000 Genomes Project36, ExAC37 and gnomAD38; prediction scores from SIFT39, PolyPhen240 and MutationTaster41; and clinical significance from ClinVar42. All the detected variants were manually reviewed using Integrative Genomics Viewer28.
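The filtering criteria above can be sketched as a simple predicate. This is a hypothetical illustration rather than the Sophia DDM filter: the `Call` record and its field names are invented for the example, and it is assumed that calls matching any of the three criteria are excluded from the retained list.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """Hypothetical minimal record for one raw variant call."""
    kind: str                      # "SNV" or "INDEL"
    vaf: float                     # variant allele fraction, 0..1
    length: int = 1                # INDEL length in bp
    in_homopolymer: bool = False   # call lies in a homopolymer region
    in_noisy_region: bool = False  # call lies in a problematic/noisy region


def passes_filters(call, min_vaf=0.02, homopolymer_indel_bp=10):
    """Return True if the call survives the three filters described in the text."""
    if call.in_noisy_region:
        return False
    # The text states the same 2% cut-off for SNPs and INDELs.
    if call.vaf < min_vaf:
        return False
    if (call.kind == "INDEL" and call.in_homopolymer
            and call.length >= homopolymer_indel_bp):
        return False
    return True


def retained(calls):
    """Keep only the calls that pass every filter."""
    return [c for c in calls if passes_filters(c)]
```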

To determine the limit of detection, serial dilutions (5%, 10%, 25%, 50% and 70%) of DNA from HD701 (Horizon Discovery, UK), which carries 13 cancer somatic variants (10 SNVs and 3 INDELs) at 1% to 33% allelic frequencies, were prepared against another negative control, SeraSeq® TNA wild type (WT) mix (catalogue no. 0710-1580, SeraCare, MA, USA). Subsequently, libraries were prepared and sequenced. The limit of detection for SNVs and INDELs was targeted at 3% allelic frequency.

Assay reproducibility (inter-run precision) was assessed by sequencing 15 unique samples (2 FFPE, 10 EQA and 3 reference samples) in duplicate across separate runs. Assay repeatability (intra-run precision) was assessed by sequencing 5 unique samples (2 FFPE, 1 EQA and 2 reference samples) indexed with different barcodes and sequenced in duplicate within a single run by one operator on the same day and system. Reproducibility and repeatability were each based on the assessment of sequencing reproducibility and variant reproducibility of the samples and were defined as the product of the two (Supplementary Table S10). The acceptance criteria for both inter-run and intra-run precision were set at ≥ 95%.

Sophia DDM detected only copy number gains, by normalizing target region coverage within each sample and across all samples in the same sequencing batch. Whole-gene amplifications with a large average copy number (above 3.25) were used to confidently detect amplifications ≥ 6.0 copies. Therefore, ≥ 3.25 copies were considered a copy number gain, while the confidence threshold for amplification was ≥ 6 copies.
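The two copy number thresholds stated above can be expressed as a small classifier. This is a sketch only: the function name and output labels are hypothetical, and it takes an already normalized average copy number as input, since the normalization itself is internal to Sophia DDM and is not reproduced here.

```python
def classify_copy_number(copies, gain_threshold=3.25, amplification_threshold=6.0):
    """Classify a normalized whole-gene average copy number.

    >= 6.0 copies  -> "amplification" (confidence threshold in the text)
    >= 3.25 copies -> "gain"
    otherwise      -> "no gain detected" (losses are not called by the assay)
    """
    if copies >= amplification_threshold:
        return "amplification"
    if copies >= gain_threshold:
        return "gain"
    return "no gain detected"
```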

Microsatellite instability (MSI) status was determined by the MSI algorithm module of Sophia DDM, which uses NGS alignments of six well-characterized simple sequence repeats (SSRs) within long homopolymers. The sites are BAT_25, BAT_26, CAT_25, NR_21, NR_22 and NR_27. The algorithm classifies microsatellite status into three categories: (a) MSS (microsatellite stable), with a score below 6; (b) MSI-HC (MSI with high confidence), with a score > 14; and (c) MSI-LC (MSI with low confidence), with a score between 6 and 14.
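The score cut-offs for the three MSI categories can be summarized in a short sketch. Only the mapping from score to category is shown; how Sophia DDM computes the score from the six sites is not described in the text and is not modelled here. The function name is hypothetical, and both boundary values (6 and 14) are assumed to fall in the low-confidence category.

```python
def classify_msi(score):
    """Map an MSI score to the category names used in the text.

    score < 6        -> "MSS"    (microsatellite stable)
    6 <= score <= 14 -> "MSI-LC" (MSI, low confidence; boundaries assumed inclusive)
    score > 14       -> "MSI-HC" (MSI, high confidence)
    """
    if score < 6:
        return "MSS"
    if score > 14:
        return "MSI-HC"
    return "MSI-LC"
```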

Variant classification

Molecular testing information is crucial for clinical decision-making. Sequence variants were therefore classified by their pathogenicity levels (pathogenic, likely pathogenic, or likely benign) by Sophia DDM and annotated using the OncoPortal™ Plus. Detected variants were categorized into four tiers based on 2017 AMP/ASCO/CAP guidelines16. Variants were considered clinically significant if they were clinically actionable or provided prognostic and diagnostic insights for the disease. Clinically actionable variants were those for which targeted therapy was available or the drug was undergoing clinical trials or preclinical studies for the disease or another disease.

Supplementary Information

Supplementary Information. (140.2KB, xlsx)

Acknowledgements

The authors would like to thank Dr Ryan Teo and Ms Tuty Muliana Binte Ismail for technical support during the validation study.

Author contributions

K.D. was responsible for conceptualization, methodology, investigation, data curation, project administration, and wrote the original draft of the manuscript. M.L.I.T. and E.Y.Y. conducted the validation experiments and contributed to the data. K.L.C. supervised the study. All the authors reviewed and approved the final version of the manuscript.

Funding

No funding was received for conducting this study.

Data availability

The datasets generated and/or analysed during the current study are available in the NCBI dbSNP repository, https://www.ncbi.nlm.nih.gov/SNP/snp_viewTable.cgi?handle=MOLECULARPATH.

Declarations

Competing interests

The authors declare no competing interests.

Ethics approval

The study was approved by the institutional review board, the Clinical Research and Innovation Office, Tan Tock Seng Hospital, Singapore, for the use of anonymised data and human biological materials. This study was conducted in compliance with the Dutch code of conduct for responsible use of human tissue in medical research. Tissue specimens and clinicopathological data were handled in an anonymized manner and in compliance with the Declaration of Helsinki (1964).

Consent to participate

The need for informed consent was waived by the Clinical Research and Innovation Office, Tan Tock Seng Hospital, Singapore.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

The online version contains supplementary material available at 10.1038/s41598-025-08039-6.

References

  • 1. Carr, T. H. et al. Defining actionable mutations for oncology therapeutic development. Nat. Rev. Cancer 16, 319–329. 10.1038/nrc.2016.35 (2016).
  • 2. Haiduk, T. et al. Comparison of Biocartis IDYLLA cartridge assay with Qiagen GeneReader NGS for detection of targetable mutations in EGFR, KRAS/NRAS, and BRAF genes. Exp. Mol. Pathol. 120, 104634. 10.1016/j.yexmp.2021.104634 (2021).
  • 3. Nkosi, D., Casler, V. L., Syposs, C. R. & Oltvai, Z. N. Utility of Select Gene Mutation Detection in Tumors by the Idylla Rapid Multiplex PCR Platform in Comparison to Next-Generation Sequencing. Genes (Basel) 13. 10.3390/genes13050799 (2022).
  • 4. Kris, M. G. et al. Using multiplexed assays of oncogenic drivers in lung cancers to select targeted drugs. JAMA 311, 1998–2006. 10.1001/jama.2014.3741 (2014).
  • 5. Cheng, D. T. et al. Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets (MSK-IMPACT): A hybridization capture-based next-generation sequencing clinical assay for solid tumor molecular oncology. J. Mol. Diagn. 17, 251–264. 10.1016/j.jmoldx.2014.12.006 (2015).
  • 6. Yatabe, Y. et al. Multiplex gene-panel testing for lung cancer patients. Pathol. Int. 70, 921–931. 10.1111/pin.13023 (2020).
  • 7. Garcia, E. P. et al. Validation of OncoPanel: A targeted next-generation sequencing assay for the detection of somatic variants in cancer. Arch. Pathol. Lab. Med. 141, 751–758. 10.5858/arpa.2016-0527-OA (2017).
  • 8. Simen, B. B. et al. Validation of a next-generation-sequencing cancer panel for use in the clinical laboratory. Arch. Pathol. Lab. Med. 139, 508–517. 10.5858/arpa.2013-0710-OA (2015).
  • 9. Frampton, G. M. et al. Development and validation of a clinical cancer genomic profiling test based on massively parallel DNA sequencing. Nat. Biotechnol. 31, 1023–1031. 10.1038/nbt.2696 (2013).
  • 10. Heather, J. M. & Chain, B. The sequence of sequencers: The history of sequencing DNA. Genomics 107, 1–8. 10.1016/j.ygeno.2015.11.003 (2016).
  • 11. Kumar, K. R., Cowley, M. J. & Davis, R. L. Next-generation sequencing and emerging technologies. Semin. Thromb. Hemost. 45, 661–673. 10.1055/s-0039-1688446 (2019).
  • 12. Ravi, R. K., Walton, K. & Khosroheidari, M. MiSeq: A next generation sequencing platform for genomic analysis. Methods Mol. Biol. 1706, 223–232. 10.1007/978-1-4939-7471-9_12 (2018).
  • 13. Marine, R. L. et al. Comparison of Illumina MiSeq and the Ion Torrent PGM and S5 platforms for whole-genome sequencing of picornaviruses and caliciviruses. J. Virol. Methods 280, 113865. 10.1016/j.jviromet.2020.113865 (2020).
  • 14. Huang, J. et al. A reference human genome dataset of the BGISEQ-500 sequencer. Gigascience 6, 1–9. 10.1093/gigascience/gix024 (2017).
  • 15. Li, Q. et al. Reliable multiplex sequencing with rare index mis-assignment on DNB-based NGS platform. BMC Genom. 20, 215. 10.1186/s12864-019-5569-5 (2019).
  • 16. Li, M. M. et al. Standards and guidelines for the interpretation and reporting of sequence variants in cancer: A joint consensus recommendation of the Association for Molecular Pathology, American Society of Clinical Oncology, and College of American Pathologists. J. Mol. Diagn. 19, 4–23. 10.1016/j.jmoldx.2016.10.002 (2017).
  • 17. Blum, A., Wang, P. & Zenklusen, J. C. SnapShot: TCGA-analyzed tumors. Cell 173, 530. 10.1016/j.cell.2018.03.059 (2018).
  • 18. Wang, Y. Y. et al. Mapping p53 mutations in low-grade glioma: a voxel-based neuroimaging analysis. AJNR Am. J. Neuroradiol. 36, 70–76. 10.3174/ajnr.A4065 (2015).
  • 19. Mok, T. S. et al. Osimertinib or platinum-pemetrexed in EGFR T790M-positive lung cancer. N. Engl. J. Med. 376, 629–640. 10.1056/NEJMoa1612674 (2017).
  • 20. Wu, Y. L. et al. Osimertinib in resected EGFR-mutated non-small-cell lung cancer. N. Engl. J. Med. 383, 1711–1723. 10.1056/NEJMoa2027071 (2020).
  • 21. Hao, F., Gu, L. & Zhong, D. TP53 mutation mapping in advanced non-small cell lung cancer: A real-world retrospective cohort study. Curr. Oncol. 29, 7411–7419. 10.3390/curroncol29100582 (2022).
  • 22. Luo, J. KRAS mutation in pancreatic cancer. Semin. Oncol. 48, 10–18. 10.1053/j.seminoncol.2021.02.003 (2021).
  • 23. Rahman, S., Garrel, S., Gerber, M., Maitra, R. & Goel, S. Therapeutic targets of KRAS in colorectal cancer. Cancers (Basel) 13. 10.3390/cancers13246233 (2021).
  • 24. Wang, Y. et al. Rad54L promotes bladder cancer progression by regulating cell cycle and cell senescence. Med. Oncol. 39, 185. 10.1007/s12032-022-01751-7 (2022).
  • 25. Bang, Y. J. et al. Trastuzumab in combination with chemotherapy versus chemotherapy alone for treatment of HER2-positive advanced gastric or gastro-oesophageal junction cancer (ToGA): a phase 3, open-label, randomised controlled trial. Lancet 376, 687–697. 10.1016/S0140-6736(10)61121-X (2010).
  • 26. Lae, M. et al. Assessing HER2 gene amplification as a potential target for therapy in invasive urothelial bladder cancer with a standardized methodology: results in 1005 patients. Ann. Oncol. 21, 815–819. 10.1093/annonc/mdp488 (2010).
  • 27. Langer, C. J. et al. Trastuzumab in the treatment of advanced non-small-cell lung cancer: is there a role? Focus on Eastern Cooperative Oncology Group study 2598. J. Clin. Oncol. 22, 1180–1187. 10.1200/JCO.2004.04.105 (2004).
  • 28. Thorvaldsdottir, H., Robinson, J. T. & Mesirov, J. P. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform. 14, 178–192. 10.1093/bib/bbs017 (2013).
  • 29. Bailey, M. H. et al. Comprehensive characterization of cancer driver genes and mutations. Cell 174, 1034–1035. 10.1016/j.cell.2018.07.034 (2018).
  • 30. Blum, A., Wang, P. & Zenklusen, J. C. SnapShot: TCGA-analyzed tumors. Cell 173(2), 530. 10.1016/j.cell.2018.03.059 (2018).
  • 31. Korostin, D. et al. Comparative analysis of novel MGISEQ-2000 sequencing platform vs Illumina HiSeq 2500 for whole-genome sequencing. PLoS ONE 15, e0230301. 10.1371/journal.pone.0230301 (2020).
  • 32. Lepkes, L. et al. Performance of in silico prediction tools for the detection of germline copy number variations in cancer predisposition genes in 4208 female index patients with familial breast and ovarian cancer. Cancers (Basel) 13. 10.3390/cancers13010118 (2021).
  • 33. Amberger, J. S., Bocchini, C. A., Schiettecatte, F., Scott, A. F. & Hamosh, A. OMIM.org: Online Mendelian Inheritance in Man (OMIM(R)), an online catalog of human genes and genetic disorders. Nucleic Acids Res. 43, D789–798. 10.1093/nar/gku1205 (2015).
  • 34. Sherry, S. T. et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 29, 308–311. 10.1093/nar/29.1.308 (2001).
  • 35. Forbes, S. A. et al. COSMIC: somatic cancer genetics at high-resolution. Nucleic Acids Res. 45, D777–D783. 10.1093/nar/gkw1121 (2017).
  • 36. Genomes Project, C. et al. A global reference for human genetic variation. Nature 526, 68–74. 10.1038/nature15393 (2015).
  • 37. Karczewski, K. J. et al. The ExAC browser: displaying reference data information from over 60 000 exomes. Nucleic Acids Res. 45, D840–D845. 10.1093/nar/gkw971 (2017).
  • 38. Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature 536, 285–291. 10.1038/nature19057 (2016).
  • 39. Ng, P. C. & Henikoff, S. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res. 31, 3812–3814. 10.1093/nar/gkg509 (2003).
  • 40. Adzhubei, I., Jordan, D. M. & Sunyaev, S. R. Predicting functional effect of human missense mutations using PolyPhen-2. Curr. Protoc. Hum. Genet. Chapter 7, Unit 7.20. 10.1002/0471142905.hg0720s76 (2013).
  • 41. Schwarz, J. M., Rodelsperger, C., Schuelke, M. & Seelow, D. MutationTaster evaluates disease-causing potential of sequence alterations. Nat. Methods 7, 575–576. 10.1038/nmeth0810-575 (2010).
  • 42. Landrum, M. J. et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 46, D1062–D1067. 10.1093/nar/gkx1153 (2018).

Articles from Scientific Reports are provided here courtesy of Nature Publishing Group