Genome Biology. 2025 Dec 1;26:409. doi: 10.1186/s13059-025-03882-2

Evaluation of false positive and false negative errors in targeted next generation sequencing

Youngbeen Moon 1,#, Young-Ho Kim 2,#, Jong-Kwang Kim 1, Chung Hwan Hong 3, Eun-Kyung Kang 3, Hye Won Choi 3, Dong-eun Lee 4, Tae-Min Kim 5, Seong Gu Heo 6,7,8, Namshik Han 9,10,11, Kyeong-Man Hong 1,3
PMCID: PMC12670792  PMID: 41327433

Abstract

Background

Next-generation sequencing (NGS) has become an indispensable diagnostic tool across various diseases. However, sequencing and analysis errors remain major barriers to clinical implementation. In cancer diagnostics, detecting low-level somatic variants is particularly challenging due to tumor heterogeneity and contamination from normal cells.

Results

We assess targeted next-generation sequencing (T-NGS) performance using reference-standard DNA mixtures of homozygote hydatidiform mole and heterozygote blood DNA at varying ratios, analyzed by certified NGS providers. Analytical sensitivity differs by up to 13.9-fold, and false positive (FP) error rates vary up to 615-fold, depending on provider and pipeline. For identical raw data, DRAGEN and the in-house pipeline differ by up to 36.3-fold in FP error rates. Moderately recurrent FP-prone alleles, although representing only 5.37% of all FP sites, contribute to 36.7% of total FP errors in the Geninus in-house result. Among 22 discordant variant calls between DRAGEN and in-house analyses, more than half of them are not confirmed by single base extension assays, indicating likely false positives. Compared to DRAGEN, a conventional BWA + GATK Mutect2 pipeline maintains equivalent sensitivity but produces a 4-fold increase in FP errors, along with a notable enrichment of recurrent FP-prone alleles.

Conclusions

T-NGS results from certified providers exhibit substantial variability in both sensitivity and FP error rates. Conventional pipelines not only increase FP errors but also accumulate recurrent FP-prone alleles. These findings underscore the urgent need for standardized pipelines and rigorous quality control measures to ensure the reliability of T-NGS in clinical diagnostics.

Supplementary Information

The online version contains supplementary material available at 10.1186/s13059-025-03882-2.

Keywords: NGS, False negative error, False positive error, Quality control, Reference standard, Mutation, Cancer

Background

Next-generation sequencing (NGS) technology allows the simultaneous examination of millions of DNA variants and is utilized by over 11,000 laboratories and companies in the USA, generating revenue of $8.8 billion in 2020 [1, 2] (https://www.ncbi.nlm.nih.gov/gtr/). This technology has greatly advanced the discovery of numerous cancer-associated and disease-causing mutations [3–5]. However, despite its widespread adoption for clinical diagnostics, high error rates have been reported across various NGS platforms, ranging from 0.26% to 12.86% [6]. For instance, a study involving 20,000 samples showed a specific NGS panel test with a false positive (FP) error rate of 1.3%, prompting authors to recommend confirmation with Sanger sequencing [7]. Additionally, discrepant variant calls have previously been reported at rates of up to 43% between the Genomics of Drug Sensitivity in Cancer (GDSC) and Cancer Cell Line Encyclopedia (CCLE) databases, both constructed using whole-exome sequencing [8]. In our earlier study using targeted NGS to re-examine these discrepant sites, we found that most discrepancies were false negatives resulting from the lower sensitivity of whole-exome sequencing [9]. These findings collectively underscore the urgent need to evaluate NGS sensitivity and specificity.

To address these challenges, the Food and Drug Administration (FDA) has issued guidance to streamline the regulatory framework for NGS testing [10, 11]. Currently, clinical laboratories performing NGS tests in the USA are overseen by the Centers for Medicare and Medicaid Services (CMS) under Clinical Laboratory Improvement Amendments (CLIA) regulations [12]. Numerous studies focusing on NGS quality control have emerged from initiatives like the Sequencing Quality Control 2 (SEQC2) project, highlighting best practices for cancer mutation detection through whole-genome and whole-exome sequencing [13, 14]. These efforts also include the establishment of reference samples for NGS sequencing [15–17], single-cell RNA sequencing [18–20], and the performance assessment of various DNA sequencing platforms [21]. Together, these studies emphasize the necessity for further development of NGS validation technologies and the introduction of more independent reference-standard materials.

T-NGS has been adopted for clinical testing to detect mutations in tumor samples, which aids in the selection of optimal anti-cancer agents [22–27]. However, mutation detection in tumor samples is significantly more challenging than in genetic diseases, primarily due to the genetic heterogeneity of cancer cells within tumor masses and the presence of contaminating normal cells [28–31]. Therefore, evaluating the lower detection limits and specificity of mutation calls is crucial for T-NGS testing. The use of mixtures of various cell lines or spiked mutation sequences as reference standards has been proposed to improve T-NGS quality control [32–35]. Although studies within the SEQC2 project have compared the sensitivity and accuracy of T-NGS across eight Pan-Cancer panels [36, 37] using reference standards, further investigation is still needed to ensure the broad applicability of these standards in laboratories and clinics. Quantitative pre-validation of each variant allele using whole genome/exome sequencing and droplet digital PCR methods is necessary when employing mixtures of cell lines or DNAs as reference standards [16]. Furthermore, the presence of minor subclonal cancer cells within a cell line can complicate accurate estimation of T-NGS specificity [9].

In this study, we developed a novel strategy for evaluating NGS kits and services that circumvents the need for pre-characterization studies on reference standards. By utilizing reference-standard samples consisting of mixtures of hydatidiform mole DNA and blood genomic DNA at different ratios, we specifically evaluated the lower detection limits and false positive error rates from T-NGS results generated by three accredited NGS service providers, certified by either the College of American Pathologists (CAP) or the Ministry of Food and Drug Safety (Korean FDA).

Results

Evaluation of T-NGS sensitivity using in-house bioinformatic analysis on reference-standard DNAs

To evaluate T-NGS sensitivity, mixtures of homozygote DNA (hydatidiform mole; only homozygous alleles) and heterozygote DNA (blood) were prepared in varying ratios and used as reference standards. Mixtures are denoted “CH,” indicating the percentage of homozygote DNA (e.g., CH10 = 10%).

Only SNPs in coding or splice-site exonic regions were analyzed. Alleles were classified by VAF thresholds into null (N, homozygous reference base), homozygous variant (Ho), and heterozygous variant (He), as described in Methods. Sensitivity-informative alleles were defined as genomic sites where a variant is expected to be present in only one of the two DNA sources (either the homozygote or the heterozygote), but not both. These comprised three sensitivity-informative pair types: Ho–N (homozygous variant in homozygote DNA, null in heterozygote DNA), N–Ho (null in homozygote DNA, homozygous variant in heterozygote DNA), and N–He (null in homozygote DNA, heterozygous variant in heterozygote DNA). As illustrated in Fig. 1A, 11 sensitivity-informative alleles were identified: three N–Ho, three Ho–N, and five N–He pairs.
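The expected VAF of each pair type follows from the mixing ratio alone. The Python sketch below illustrates that arithmetic under two assumptions that are not stated in this section: the CH percentage equals the fraction of genome copies contributed by the homozygote DNA, and every site is diploid with no copy-number differences. The function name `expected_vaf` is illustrative and does not come from the study.

```python
def expected_vaf(pair_type: str, ch_percent: float) -> float:
    """Expected variant allele fraction (0-1) in a CH mixture.

    ch_percent is the percentage of homozygote (mole) DNA, e.g. 10 for CH10.
    Assumes equal genome copies per unit of DNA and diploid sites.
    """
    h = ch_percent / 100.0       # fraction of homozygote DNA in the mixture
    if pair_type == "Ho-N":      # homozygous variant only in homozygote DNA
        return h
    if pair_type == "N-Ho":      # homozygous variant only in heterozygote DNA
        return 1.0 - h
    if pair_type == "N-He":      # heterozygous variant only in heterozygote DNA
        return 0.5 * (1.0 - h)
    raise ValueError(f"not a sensitivity-informative pair type: {pair_type}")
```

Under these assumptions, a Ho–N site in CH10 has an expected VAF of 10%, while an N–He site in the same mixture has an expected VAF of 45%.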

Fig. 1

Evaluation of T-NGS sensitivity by estimating false negatives (FN) in mixed reference standards. A Sensitivity assessment using DNAs prepared by mixing hydatidiform mole and blood DNAs at varying ratios. Sensitivity-informative alleles were classified as N-Ho, Ho-N, or N-He based on variant status of homozygote and heterozygote DNAs. Expected VAFs (eVAFs) are shown for mixtures CH5–CH95 (numbers = % homozygous DNA). Detected variants are marked in red, with base proportions indicated by character size. B–D Detection of Ho-N and N-Ho alleles in Companies AA (B), BB (C), and CC (D). HLA alleles are marked above graphs; FN alleles are shown within blue rectangles. The X-axis shows chromosomal position, and the Y-axis shows observed VAF (oVAF). In labels, expected VAFs are shown as percentages and abbreviated as “eVAF” (e.g., eVAF5 = 5%)

T-NGS was performed by three certified service providers using both reference-standard mixtures and unmixed DNA samples: Company AA (Macrogen, Seoul, Korea), Company BB (Theragen Bio, Seoul, Korea), and Company CC (Geninus, Seoul, Korea). The respective T-NGS kits employed were the Axen™ Cancer Panel (Company AA), SureSelect™ Cancer CGP Assay (Company BB), and CancerSCAN™ (Company CC). For Ho–N and N–Ho pairs, observed VAFs were compared to expected dilution values. The 95% detection thresholds were ~ 20% for Company AA and ~ 5% for Company BB (Figs. 1B–C). In Company CC, many HLA alleles showed persistent false negatives without expected dilution effects—typically observed as a linear relationship between expected and observed VAFs from the other variants, suggesting alignment or calling errors (Fig. 1D). After excluding HLA variants, Company CC’s sensitivity threshold was ~ 1%. Overall, sensitivity across companies ranged from 1% to 20%, consistent with the 1–5% range reported by the SEQC consortium [36].

The other sensitivity-informative allele pairs, N-He pairs—comprising a null allele in the homozygote DNA and a heterozygous allele in the heterozygote DNA—were also analyzed for Companies AA to CC. In addition, DRAGEN-based analysis was performed using FASTQ files from all three companies to further assess sensitivity.

Discrepancies in variant calling for HLA genes

In Company CC, in-house analysis of HLA genes showed substantial discrepancies from expected VAF values (Fig. 1D). Ho-N and N-Ho pairs revealed frequent false negatives, whereas DRAGEN calls aligned more closely with expected values (Fig. 2A and B). Several sensitivity-informative alleles identified by in-house methods were designated as uninformative in DRAGEN analysis. Despite similar BAM-level VAFs between methods (Fig. 2C and D), variant calls diverged, indicating errors arose from calling rather than read alignment. For N-He pairs, in-house analysis again showed weaker correlations and more false negatives than DRAGEN (Fig. 2E and F). Overall, these results indicate that in-house methods introduce greater errors in HLA variant calling than DRAGEN.

Fig. 2

Discrepancies in HLA variant calling between in-house and DRAGEN analyses (Company CC). Ho-N, N-Ho, and N-He variants in the HLA region were compared between DRAGEN and in-house methods. A–D Detection of Ho-N and N-Ho variants from VCF (A, B) and BAM (C, D) data. E–F Detection of N-He variants from DRAGEN (E) and in-house (F). The X-axis shows alleles by chromosome/position, and the Y-axis shows the observed VAF (oVAF) at each expected VAF (eVAF), where expected VAFs are shown as percentages, labeled ‘eVAF’. False negatives are marked within blue rectangles. Sensitivity-informative alleles are indicated above the graphs, with DRAGEN classifications (N-He, N-N, NC, Ho-He, Ho-Ho) noted for in-house calls. NC = not called

Correlation between expected and observed VAFs

The reliability of VAF measurements was assessed by comparing observed VAFs with expected values from variant dilutions in mixed reference standards [38]. Strong linear correlations were observed in in-house analyses from Companies BB, CC, and DD (R² >0.99; Fig. 3), whereas Company AA showed reduced correlation (R² = 0.9585) and poor detection of low-VAF variants (Fig. 3A). DRAGEN analysis of the same FASTQ files improved both correlation (R² = 0.9713) and low-VAF detection (Additional file 1: Fig. S1A), indicating that AA’s reduced performance stemmed from its in-house pipeline. In contrast, DRAGEN analysis produced only minor differences for Companies BB and CC (Additional file 1: Figs. S1B, S1C).
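The R² values above summarize how closely observed VAFs track the dilution series. The sketch below shows one way such a value can be computed, assuming R² is the squared Pearson correlation between expected and observed VAFs. The section does not state which regression model the authors used, and the numbers in the example are hypothetical.

```python
def r_squared(expected, observed):
    """Squared Pearson correlation between expected and observed VAFs."""
    n = len(expected)
    mean_e = sum(expected) / n
    mean_o = sum(observed) / n
    # Sums of cross-products and squared deviations around the means
    s_eo = sum((e - mean_e) * (o - mean_o) for e, o in zip(expected, observed))
    s_ee = sum((e - mean_e) ** 2 for e in expected)
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    return s_eo * s_eo / (s_ee * s_oo)


# Hypothetical dilution series (VAFs in percent), not data from the study
evaf = [1.0, 5.0, 10.0, 25.0, 50.0]
ovaf = [0.8, 5.3, 9.6, 26.1, 49.0]
fit_quality = r_squared(evaf, ovaf)
```

A high R² alone does not guarantee good low-VAF detection, because the statistic is dominated by the high-VAF points. This is why the low-VAF region is inspected separately in Fig. 3.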

Fig. 3

Correlation of observed vs. expected variant allele fractions (oVAF vs. eVAF) in in-house analyses. A–D Correlation plots for Companies AA (A), BB (B), CC (C), and DD (D). The X-axis shows oVAF, and the Y-axis shows eVAF. R² and P values are indicated, with allele counts shown inside circles. The lower-left region (boxed) is re-drawn at the bottom right with adjusted scaling

All three companies used T-NGS kits without unique molecular identifiers (UMIs), which are known to improve sensitivity and specificity [39–41]. To evaluate the impact of UMIs, Company CC generated a new raw data set using a UMI-based TruSight™ Oncology 500 (TSO500) Assay kit, with the same reference standards but a different sample set. The results (labeled as Company DD), processed with the DRAGEN pipeline, showed a high correlation between observed and expected VAFs (R² = 0.9943; Fig. 3D).

Comparison of T-NGS sensitivity across bioinformatics methods

After excluding HLA variants and sites with < 10 reads, sensitivity was assessed from merged sensitivity-informative alleles (Fig. 4A). Probit regression estimated the 95% detection thresholds to range from 1.606% (Company DD) to 22.25% (Company AA in-house), representing a 13.9-fold difference. Fisher’s exact test ranked sensitivity as DD > CC-I > CC-D ≈ BB-D > AA-D > BB-I > AA-I, consistent with the order of 95% detection thresholds (DD 1.606% > CC-I 1.722% > CC-D 2.193% > BB-D 3.003% > BB-I 4.328% ≈ AA-D 4.554% > AA-I 22.25%) (Fig. 4A). Here, company identifiers (AA–DD) are combined with analytic methods (D = DRAGEN, I = in-house, and in silico) to indicate the specific analysis results.
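A 95% detection threshold of this kind can be obtained by fitting a probit curve to detection counts and solving for the eVAF at which the fitted detection probability reaches 0.95. The sketch below is a minimal standard-library version, not the authors' procedure. It assumes the model P(detect) = Φ(a + b·log10(eVAF)) and fits it by Fisher scoring, and the detection counts are hypothetical. Whether the study used a log-transformed eVAF is not stated in this section.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def fit_probit(log_evaf, detected, total, max_iter=100, tol=1e-10):
    """Fit P(detect) = Phi(a + b * log_evaf) by Fisher scoring on grouped counts."""
    a, b = 0.0, 1.0  # arbitrary starting values
    for _ in range(max_iter):
        g_a = g_b = i_aa = i_ab = i_bb = 0.0
        for x, k, n in zip(log_evaf, detected, total):
            eta = a + b * x
            # Clamp the probability to avoid division by zero at the extremes
            p = min(max(STD_NORMAL.cdf(eta), 1e-9), 1.0 - 1e-9)
            phi = STD_NORMAL.pdf(eta)
            score = (k - n * p) * phi / (p * (1.0 - p))
            weight = n * phi * phi / (p * (1.0 - p))
            g_a += score            # gradient of the log-likelihood
            g_b += score * x
            i_aa += weight          # expected (Fisher) information matrix
            i_ab += weight * x
            i_bb += weight * x * x
        det = i_aa * i_bb - i_ab * i_ab
        step_a = (i_bb * g_a - i_ab * g_b) / det
        step_b = (i_aa * g_b - i_ab * g_a) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return a, b


def lod95(a, b):
    """eVAF (same units as the input) at which fitted detection reaches 95%."""
    z95 = STD_NORMAL.inv_cdf(0.95)
    return 10 ** ((z95 - a) / b)


# Hypothetical detection counts per expected VAF (percent); not study data
evaf_percent = [0.5, 1.0, 2.5, 5.0, 10.0]
detected = [3, 8, 15, 19, 20]
total = [20, 20, 20, 20, 20]
a, b = fit_probit([math.log10(v) for v in evaf_percent], detected, total)
threshold = lod95(a, b)
```

In practice a statistics package with a tested generalized linear model routine would be preferable to a hand-written fit. The point here is only to show how the threshold is derived from the fitted curve.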

Fig. 4

Sensitivity of T-NGS methods across companies (AA–DD) and analytical approaches (in-house, DRAGEN, and in silico). A Comparison of sensitivity across companies and analytical approaches using in-house and DRAGEN methods. B Comparison of sensitivity between DRAGEN and in silico methods in results from Companies BB and CC. P values at different eVAFs are shown in the lower panels of the graphs. The X-axis represents the expected variant allele fraction (eVAF), and the Y-axis shows the detection rate of the alleles at each eVAF. In the figure labels, companies (AA–DD) are combined with analytic methods (D = DRAGEN, I = in-house, and in silico) to indicate results

Although sequencing depth generally correlates with sensitivity [42–44], the relationship was not linear. Company CC had ~ 3-fold greater depth than BB but similar sensitivity, while CC and DD had comparable depths yet DD performed better (Additional file 2: Table S1). Per-base sequence quality (Additional file 1: Fig. S2, Additional file 2: Table S1) showed no major issues aside from a slight decline in AA, suggesting that factors such as target capture efficiency, amplification, and variant-calling parameters also strongly affect sensitivity. These findings emphasize the importance of direct quantitative assessment of T-NGS sensitivity rather than relying solely on sequencing depth or quality metrics.
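A simple sampling model illustrates why depth sets only an upper bound on sensitivity. The sketch below assumes ideal binomial sampling of reads with no sequencing errors, and a caller that requires a fixed minimum number of variant-supporting reads. The value of that minimum is a hypothetical parameter, not a setting reported for any pipeline in this study.

```python
from math import comb


def prob_at_least(min_alt_reads: int, depth: int, vaf: float) -> float:
    """Probability of sampling at least min_alt_reads variant reads.

    Assumes each of `depth` reads independently carries the variant
    with probability `vaf` (binomial sampling, no sequencing error).
    """
    p_fewer = sum(
        comb(depth, i) * vaf**i * (1.0 - vaf) ** (depth - i)
        for i in range(min_alt_reads)
    )
    return 1.0 - p_fewer


# Chance of seeing >= 5 variant reads for a 1% VAF allele at two depths
p_shallow = prob_at_least(5, 300, 0.01)
p_deep = prob_at_least(5, 1000, 0.01)
```

Under this model, greater depth always raises the chance that enough variant reads are sampled. Real pipelines can still miss such variants through capture bias, alignment problems, or strict filters, which is in line with the non-linear relationship described above.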

Estimation of T-NGS sensitivity by in silico dilution experiment for T-NGS results

In silico dilution experiments were performed using T-NGS results from Companies BB and CC (Additional file 1: Fig. S3), where read ratios from CH100 and CH0 were resampled to mimic CH0.5–CH99.5 mixtures (see Methods). With HLA variants excluded, the 95% limits of detection increased with in silico dilution: 1.60-fold (4.79% vs. 3.00%) for Company BB and 2.03-fold (4.43% vs. 2.19%) for Company CC (Fig. 4B), indicating ~ 2-fold lower sensitivity. Notably, in Company CC, many HLA gene variants from in silico results showed poor linearity and high false-negative rates, consistent with real dilution experiments (Additional file 1: Fig. S3).
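The resampling idea can be illustrated per site. The sketch below is hypothetical and is not the procedure described in Methods, which this section does not reproduce: it draws each simulated read from the CH100 or CH0 read pool according to the target mixing ratio, then draws its allele from that pool's observed allele counts. All function and parameter names are illustrative.

```python
import random


def in_silico_vaf(alt100, depth100, alt0, depth0, ch_percent, depth, rng):
    """Simulate the variant allele fraction at one site in a virtual CH mixture.

    alt100/depth100: variant and total read counts in the unmixed CH100 sample.
    alt0/depth0:     variant and total read counts in the unmixed CH0 sample.
    ch_percent:      target percentage of CH100 (homozygote) reads.
    depth:           number of reads to simulate at the site.
    """
    frac_ch100 = ch_percent / 100.0
    alt_reads = 0
    for _ in range(depth):
        if rng.random() < frac_ch100:
            # Read drawn from the CH100 pool
            alt_reads += rng.random() < alt100 / depth100
        else:
            # Read drawn from the CH0 pool
            alt_reads += rng.random() < alt0 / depth0
    return alt_reads / depth


rng = random.Random(0)  # fixed seed so the simulation is repeatable
# Ho-N site (variant in every CH100 read, absent from CH0) in a virtual CH10 mix
simulated = in_silico_vaf(500, 500, 0, 500, 10, 20000, rng)
```

A simulation of this kind reproduces the sampling noise of dilution but not wet-lab effects such as capture or amplification bias in a physically mixed sample.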

Evaluation of false positive (FP) errors in T-NGS

The specificity of NGS for detecting low-level VAF alleles has rarely been assessed due to methodological limitations [36, 45–47]. Using reference-standard DNAs, we estimated false positive (FP) errors after excluding variants in HLA genes and sites with < 100 reads. FP alleles were defined as variants present in mixed DNA samples but absent from both homozygote and heterozygote DNAs.

To quantify specificity, FP errors were classified into three allele-pair types (Fig. 5A): (1) R–R pairs: both DNAs carry the reference allele. (2) V–V pairs: both DNAs carry the variant allele. (3) R–V pairs: one DNA carries a variant allele and the other the reference allele. This classification allowed systematic evaluation across pipelines.
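The pair classification and the per-megabase normalization used in the following paragraphs can be sketched as below. The labels follow Fig. 5, the function names are illustrative, and the target size in the example is hypothetical, since panel sizes are not given in this section.

```python
def pair_type(homozygote_has_variant: bool, heterozygote_has_variant: bool) -> str:
    """Classify a site by the alleles carried by the two unmixed DNAs."""
    if homozygote_has_variant and heterozygote_has_variant:
        return "V-V"
    if not homozygote_has_variant and not heterozygote_has_variant:
        return "R-R"
    return "R-V"  # variant in exactly one of the two DNAs


def fp_rate_per_mb(fp_count: int, evaluated_bases: int) -> float:
    """False positive calls per megabase of evaluated target sequence."""
    return fp_count / evaluated_bases * 1_000_000


# Hypothetical example: 6 FP calls over 1.2 Mb of evaluated target
example_rate = fp_rate_per_mb(6, 1_200_000)
```

Dividing a per-Mb rate by 10⁶ gives a per-base rate, so 4.22 per Mb corresponds to 4.22 × 10⁻⁶ per base.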

Fig. 5

Estimation of false positive (FP) error rates in targeted NGS. A Method for calculating FP errors using allele pairs: R-R (reference in both DNAs), R-V (reference in one DNA and variant in the other), and V-V (variant in both DNAs). Example: FP rate = 0.13 (4/30) with one error from R-R, two from V-V, and one from R-V (errors in red). B–D FP error rates in R-R (B), R-V (C), and V-V (D) pairs at VAF cutoffs of 0 (left), 0.01 (middle), and 0.05 (right). Red circles in CC-D and CC-I mark the CH10 sample (index hopping); red boxes in CC-D mark CH0.5 and CH99.5 (second-batch samples). The percentage of homozygous DNA follows “CH.” Y-axis: FP error rate per Mb/kb of target regions. X-axis: companies (AA–DD) and analysis type (D = DRAGEN, I = in-house)

Marked variability in FP error rates was observed. R–R pairs ranged from 12.09 per Mb (AA-D) to 2,570 per Mb (DD) (Fig. 5B), excluding AA in-house due to very low sensitivity. R–V pairs showed higher rates, especially in CC (1,741 per Mb) and DD (5,196 per Mb) (Fig. 5C). V–V pairs showed extremely high FP rates (> 0.5% in four methods), 22-fold higher than R–R in DD and 37,271-fold higher in CC (Fig. 5D), disproportionately inflating total FP estimates.

Sample-specific artifacts contributed to FP variability. In Company CC, the CH10 sample showed exceptionally high FP errors in R-R pairs (38,715 per Mb in-house; 1,903 per Mb DRAGEN), consistent with index hopping. Second-batch samples (CH0.5, CH99.5) also exhibited elevated FP rates in R-R pairs (Fig. 5B), whereas second-batch samples in Company BB did not differ from first-batch results.

FP errors from reference base calls at V–V pairs (VVR errors)

Most FP errors in V–V pairs resulted from erroneous reference base calls, termed VVR errors. Excluding VVR errors reduced total FP rates by ~ 60–70% in Company BB, from 13.9 to 4.22 per Mb (DRAGEN) and from 12.07 to 4.83 per Mb (in-house) (Additional file 1: Fig. S4).

VVR errors likely stem from bioinformatics bias favoring reference base calls. Because the reference base can be directly inferred from the reference genome, while variant calls rely on stricter conditions, pipelines may preferentially call reference alleles at very low VAFs. Supporting this, diluted reference bases in mixed DNA samples were detected at > 80% rates, even at eVAF = 0.5% (Additional file 1: Fig. S5). This indicates preferential reference base calling during variant analysis.

Importantly, most R–V and VVR errors were removed by applying VAF cutoffs (1% or 5%) (Fig. 5C and D), suggesting that these errors predominantly occur at very low allele fractions. Together, these findings indicate that VVR errors should not be considered true FP events but rather artifacts of biased reference base calling.
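Applying a VAF cutoff is a simple post-filter on the call set. The sketch below assumes that calls with a VAF at or above the cutoff are retained; whether the study used an inclusive or exclusive boundary is not stated here. The call records are hypothetical.

```python
def apply_vaf_cutoff(calls, cutoff):
    """Keep only calls whose observed VAF is at or above the cutoff (0-1 scale)."""
    return [call for call in calls if call["vaf"] >= cutoff]


# Hypothetical FP candidate calls; positions and VAFs are made up
candidates = [
    {"site": "chr1:1000", "vaf": 0.004},
    {"site": "chr2:2000", "vaf": 0.020},
    {"site": "chr3:3000", "vaf": 0.080},
]
kept_at_1_percent = apply_vaf_cutoff(candidates, 0.01)
kept_at_5_percent = apply_vaf_cutoff(candidates, 0.05)
```

The same cutoff also removes true variants below that allele fraction, so it trades sensitivity for specificity.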

Comparison of total FP error rates after removing VVR errors

After excluding VVR errors, FP error rates were recalculated (Fig. 6A). Excluding the low-sensitivity in-house results from Company AA (~ 22%), the lowest FP rates were observed in Company BB (median: 4.22 per Mb, DRAGEN; 4.83 per Mb, in-house). DRAGEN results from Companies CC and AA were intermediate (5.77 and 12.09 per Mb, respectively), while Company DD showed the highest FP rate (2,596.39 per Mb), yielding a maximum 615-fold difference (2,596.39/4.22).
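The fold differences quoted in this article are ratios of the reported medians or thresholds. The short check below recomputes three of them from values given in the text; the helper name `fold_difference` is illustrative.

```python
def fold_difference(higher: float, lower: float) -> float:
    """Ratio of two rates or thresholds, expressed as a fold difference."""
    return higher / lower


# Values taken from the text (FP rates per Mb, detection thresholds in percent)
fp_fold_dd_vs_bb = fold_difference(2596.39, 4.22)         # DD vs. BB DRAGEN
fp_fold_cc_inhouse_vs_dragen = fold_difference(209.46, 5.77)
sensitivity_fold = fold_difference(22.25, 1.606)           # AA in-house vs. DD
```

These ratios round to 615, 36.3, and 13.9, matching the fold differences reported in the abstract.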

Fig. 6

Total false positive (FP) error rates. A–C FP error rates with VAF cutoffs of 0% (A), 1% (B), and 5% (C). Statistical significance is shown above the graphs as P values (red indicates P < 0.05). Red circles in CC-D and CC-I indicate the CH10 sample (index hopping), and red boxes in CC-D and CC-I indicate the CH0.5 and CH99.5 samples (second batch). The percentage of homozygous DNA is indicated by the value following “CH.” Y-axis: FP error rate per Mb/kb of target regions. X-axis: companies (AA–DD) and analysis type (D = DRAGEN, I = in-house)

Applying VAF cutoffs (1% or 5%) markedly reduced FP error rates in Company DD (Fig. 6B and C). At the 5% cutoff, its FP rate fell below those of Companies AA (P = 0.0056) and BB (P = 0.0039).

The impact of DRAGEN varied across providers. In Company CC, DRAGEN reduced FP errors by 97.2% (5.77 vs. 209.46 per Mb; P = 0.0010). Conversely, Company AA showed increased FP errors with DRAGEN (12.09 vs. 0 per Mb; P = 0.0286), while Company BB exhibited a modest, non-significant reduction (0.875-fold; P = 0.7175). These differences likely reflect pipeline stringency: Company CC’s in-house method appeared permissive compared to DRAGEN defaults, whereas Company AA’s in-house pipeline applied stricter filters.

FP error-prone alleles

To investigate the origin of false positives (FPs), we examined recurrence of FP alleles in R–R pairs from Company CC (Additional file 1: Fig. S6). Nearly half of FP alleles (42.2%, 728/1725) recurred (FP calls in ≥ 2 of 11 mixtures) across multiple reference-standard mixtures, and just 5.4% moderately recurrent alleles (FP calls in ≥ 5 of 11 mixtures, N = 198) accounted for 36.7% of all FP events (N = 3688), indicating the presence of recurrent FP error–prone sites. Eliminating such sites could substantially reduce FP rates in T-NGS analyses.
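Recurrence can be tallied by counting, for each FP allele, the number of mixtures in which it was called. The sketch below is an illustration with made-up allele identifiers, not the authors' script. It returns both the fraction of FP sites that are recurrent and the share of all FP events those sites contribute.

```python
from collections import Counter


def recurrence_summary(fp_by_mixture, min_mixtures):
    """Summarize FP alleles called in at least `min_mixtures` mixtures.

    fp_by_mixture maps a mixture name to the set of FP alleles called in it.
    Returns (fraction of FP sites that are recurrent,
             fraction of FP events contributed by those sites).
    """
    # Number of mixtures in which each allele was called as an FP
    counts = Counter(
        allele for alleles in fp_by_mixture.values() for allele in set(alleles)
    )
    total_sites = len(counts)
    total_events = sum(counts.values())
    recurrent = {a: c for a, c in counts.items() if c >= min_mixtures}
    return len(recurrent) / total_sites, sum(recurrent.values()) / total_events


# Hypothetical FP calls in three mixtures
toy_calls = {
    "CH5": {"chr1:100:A>G", "chr2:200:C>T"},
    "CH10": {"chr1:100:A>G"},
    "CH25": {"chr1:100:A>G", "chr3:300:G>A"},
}
site_fraction, event_share = recurrence_summary(toy_calls, min_mixtures=2)
```

In this toy input, one of three FP sites is recurrent but contributes three of five FP events. A small set of sites accounting for a large share of errors is the pattern the paragraph above describes.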

Batch effects were also evident. Although total FP rates were higher in the second batch (Fig. 6A and B), highly recurrent alleles (FP calls in ≥ 8 of 11 mixtures) were less frequent in CH0.5 and CH99.5 samples from the second batch (Additional file 1: Fig. S7A). Similar batch-to-batch variation occurred in Company BB’s in-house results (Additional file 1: Fig. S7B), where some FP alleles appeared only in the second batch. Importantly, these variants were still detectable at the BAM level in both in-house and DRAGEN pipelines and in DRAGEN VAF data, suggesting that discrepancies stem from downstream bioinformatic processing rather than raw sequence quality.

Impact of conventional analysis on FP errors

Conventional analysis (BWA + GATK Mutect2) of FASTQ data from Company BB markedly increased FP errors compared to DRAGEN and in-house pipelines. R–R FP errors rose to a median of 19.92 per Mb, 4.72- and 4.82-fold higher than DRAGEN (4.22 per Mb, P < 0.0001) and in-house (4.13 per Mb, P < 0.0001), respectively (Fig. 7A). Low-VAF heterozygous variants (He-low, VAF < 0.1) also increased significantly (median 19.31 per Mb), representing 10.67- and 8-fold increases over DRAGEN (1.81 per Mb, P < 0.0001) and in-house (2.41 per Mb, P < 0.0001) results (Fig. 7B, Additional file 1: Fig. S8). These He-low variants, often forming HeL pairs (He-low–N, N–He-low, He-low–He-low), recurred across samples and appeared to represent FP-prone artifacts.

Fig. 7

Increased false positive (FP) error rates and sensitivity by conventional analysis (Company BB). A Higher FP error rates in R-R pairs by conventional analysis than by DRAGEN or in-house methods (P < 0.0001). B Increased HeL and HeLL pairs in conventional analysis (P < 0.0001). C Sensitivity of conventional analysis compared with DRAGEN, in-house, and in silico analyses. P values are shown below graphs. For A and B: X-axis = company (AA–DD) and analysis type (D = DRAGEN, I = in-house, C = conventional); Y-axis = FP errors per Mb. For C: X-axis = expected VAF, Y-axis = detection rate. eVAF1.125 represents combined results of eVAF1 and eVAF1.25

Expanding the FP definition to include HeL and HeLL (0.01 > VAF > 0 in both DNAs) pairs further elevated error rates, particularly in the in-house results from Company CC and UMI-based results from Company DD (Additional file 1: Fig. S9). In contrast to increased FP error rates, the conventional pipeline showed no difference in sensitivity compared to DRAGEN, although it exhibited higher sensitivity than the in-house and in silico analyses (Fig. 7C).

Conventional analysis also showed batch-to-batch variability, especially in HeL pair detection (Additional file 1: Fig. S8C), underscoring the susceptibility of FP-prone alleles to technical variation across sequencing runs.

Validation of FP and inconsistent variant calls with single base extension assays

To assess whether discordant variants between pipelines represented true variants or false positives, we performed single base extension (SBE) assays on 22 target sites in CH100 (homozygous) and CH0 (heterozygous) DNAs. Several discrepancies were observed between SBE and VCF calls. For example, an A variant in NOTCH2 showed high VAFs by SBE (0.70 in CH100, 0.62 in CH0) but was absent or < 0.1 in DRAGEN and in-house VCFs (Fig. 8A). An A variant in MDC1 was detected by SBE (VAF = 0.29 in CH0) but not by DRAGEN (Fig. 8B). Conversely, a G variant in FANCD2 was absent in SBE but present in DRAGEN (0.16) and in-house (0.08) VCFs (Fig. 8C). Similarly, a T variant in ZNF217 was absent by SBE but appeared in in-house calls (0.10 and 0.07; Fig. 8D).

Fig. 8

Representative single base extension (SBE) validation of inconsistently called variants. A Variant classified as N–He-low (DRAGEN/in-house); SBE shows He–He. B Variant called as N–N (DRAGEN) but He-low–He-low (in-house); SBE shows N–He. C Variant called as He-low–N (both methods); SBE shows N–N. D Variant called as N–N (DRAGEN) but He-low–He-low (in-house); SBE also shows N–N. Gene names are shown above each SBE result. Upper panels: SBE assays of homozygous (CH100) and heterozygous (CH0) DNAs, with variant bases marked by arrows. The X-axis represents the length of extended products, and the Y-axis represents signal intensity in raw SBE assay data. Lower panels: variant information (chromosomal position, observed VAFs from VCF and BAM by DRAGEN/in-house, and companies reporting the variant)

Expanding validation to 18 additional sites showed that 11 variants had SBE VAFs < 0.01, indicating likely false positives (2 in Fig. 8, 9 in Additional file 1: Fig. S10). Four variants showed weak signals (VAF 0.01–0.03; Additional file 1: Fig. S11), while seven displayed strong signals (VAF > 0.1; 2 in Fig. 8, 5 in Additional file 1: Fig. S12). Five of the strong variants were in NOTCH2 and MDC1, both paralog-rich genes prone to mapping artifacts. Excluding these, only two variants (in APOBEC3B and FANCD2) showed clear SBE signals, though their VAFs were inconsistent with VCF values, underscoring challenges in T-NGS variant calling.

Comparison of SBE results to VCF and BAM data (Additional file 2: Table S2) showed that DRAGEN VCF calls had the highest concordance with SBE, matching in 14 of 22 variants. In contrast, only 5 variants matched DRAGEN BAM, 2 matched in-house VCFs, and 1 matched in-house BAM files. These findings indicate that DRAGEN more effectively filters spurious variants and is less prone to false positives than in-house pipelines.
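Concordance of this kind is a count of sites at which a pipeline's classification agrees with the SBE result. The sketch below uses hypothetical site names and classifications. How the authors defined a match (for example, tolerance on VAF values) is not described in this section, so exact agreement of class labels is an assumption.

```python
def concordance(sbe_calls, pipeline_calls):
    """Count sites where the pipeline classification equals the SBE result.

    Both arguments map a site identifier to a classification label
    such as "N-N" or "N-He". Returns (matching sites, compared sites).
    """
    shared_sites = sbe_calls.keys() & pipeline_calls.keys()
    matches = sum(sbe_calls[s] == pipeline_calls[s] for s in shared_sites)
    return matches, len(shared_sites)


# Hypothetical classifications at three validated sites
sbe = {"site1": "N-N", "site2": "N-He", "site3": "He-He"}
vcf = {"site1": "N-N", "site2": "N-N", "site3": "He-He"}
matched, compared = concordance(sbe, vcf)
```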

Discussion

The sensitivity and specificity of next-generation sequencing (NGS) results from various service providers and commercial kits have not been extensively studied [45, 48–51], despite their critical role in clinical decision-making, particularly in cancer genomics. To address this gap, we evaluated targeted NGS (T-NGS) results using a reference-standard approach involving mixtures of homozygote hydatidiform mole DNA and heterozygote blood DNA. Our analysis revealed striking variability among service providers, with sensitivity differing by 13.9-fold and specificity by up to 615-fold.

While prior studies assessed T-NGS sensitivity [20, 52–56], many relied on less precise approaches. By employing reference standards and controlled dilutions, we obtained more accurate estimates, ranging from 1.61% to 22.25%. Company AA’s in-house pipeline failed to detect variants below 20% allele fraction, whereas re-analysis with DRAGEN improved sensitivity, underscoring the importance of bioinformatics settings.

HLA regions posed particular challenges. In Company CC, expected dilution effects were not observed, indicating alignment and variant-calling failures in this complex locus [57–60]. DRAGEN performed better than in-house methods but still left many false negatives, highlighting the need for HLA-specific refinements in T-NGS pipelines.

Although accurate variant calling is essential for interpreting NGS results [61, 62], false positive (FP) errors from preferential base calls were substantial. Many of them arose in V–V pairs, producing disproportionately high FP error rates. Detection of reference bases with < 0.5% allele fraction across platforms confirmed a systemic bias toward reference calling. Excluding such errors (VVR) reduced FP rates by 60–70%, underscoring their critical role in specificity estimation.

Few studies have evaluated T-NGS specificity comprehensively, especially for low-VAF variants [7, 45–47]. In this study, FP error rates varied widely (4.22–2,596 per Mb). For Company CC, DRAGEN reduced FP errors 36-fold compared to in-house (5.77 vs. 209.46 per Mb), whereas Company AA showed increased FP errors under DRAGEN, likely reflecting stricter in-house filtering. These findings show that pipeline stringency strongly influences specificity, even with the same Illumina sequencing data.

Discrepant variant calls have previously been reported at rates of up to 43% between the GDSC and CCLE cancer cell line databases [8]. While one study attributed these inconsistencies to genetic or transcriptional evolution rather than sequencing artifacts [63], our earlier work using targeted NGS demonstrated that many arose from false negatives in whole-exome sequencing [9]. Extending this finding, the present study identified substantial discordance between DRAGEN and in-house pipelines applied to the same data. Among 22 discrepant variant calls, 11–15 (50–68%) were not confirmed by single base extension (SBE) assays, suggesting that they were false positives. DRAGEN showed the highest concordance with SBE (14/22), indicating greater robustness in filtering spurious calls compared to in-house pipelines. These results underscore the importance of rigorous FP filtering when interpreting NGS data.

Many research labs and clinical providers continue to use conventional pipelines based on BWA [64] and GATK Mutect2 [65, 66]. In our evaluation using default settings, conventional analysis demonstrated comparable sensitivity to the DRAGEN analysis but resulted in a 4–5-fold increase in FP errors and an 8–11-fold rise in low-VAF calls of ‘HeL’ and ‘HeLL’ compared to DRAGEN or in-house pipelines. Many of the variants with low VAFs recurred across samples, suggesting they represent FP-prone artifacts. Thus, the in-house pipeline appears to prioritize specificity over sensitivity, whereas the DRAGEN method demonstrates a more balanced performance between sensitivity and specificity.

Company DD uniquely employed unique molecular identifiers (UMIs) to improve accuracy [39, 41]. UMIs significantly enhanced sensitivity but paradoxically yielded the highest FP error rate, contradicting the common assumption that UMIs always improve specificity [39, 40]. Applying a 5% VAF cutoff reduced FP rates below those of non-UMI datasets, emphasizing the need to optimize filtering parameters, even with advanced technologies.

The SEQC2 project evaluated sensitivity and specificity across eight pan-cancer panels [36], reporting detection limits of 1–5% allele fraction—consistent with our findings (1.61–4.55%) after excluding Company AA and DD. Our study revealed even greater variability in sensitivity, likely reflecting random or process-based errors. The FP rates in SEQC2 (10⁻⁵ to 10⁻⁶) were also comparable to ours (2.09 × 10⁻⁴ to 4.22 × 10⁻⁶). Unlike SEQC2, which relied on pre-characterized reference standards, our approach did not require prior variant annotation, further emphasizing the need for robust and universally accepted reference materials for T-NGS evaluation.

While some FP-prone sites in T-NGS have been identified [67–69], they are rarely reported to end-users. Our study highlights reproducibly error-prone sites and proposes practical detection methods. Excluding these alleles from clinical reports could substantially reduce FP calls. However, batch-to-batch and sample-to-sample variation complicates this task, reinforcing the need for additional FP-reduction strategies, including report-level filtering of known error-prone loci.

Several limitations of our study should be noted: (1) Variant types: We focused on base substitutions, not structural variants, deletions, or amplifications. Future studies should broaden this scope. (2) Sample type: We used freshly prepared DNA mixtures, while clinical samples are often FFPE, which may degrade performance due to DNA damage [70]. (3) Platform scope: Only Illumina systems were tested. Cross-platform evaluations would offer more generalizable insights. (4) Pipeline transparency: Lack of technical details from service providers limited deeper interpretation of observed differences.

Overall, this study highlights substantial variability in the sensitivity and specificity of T-NGS results from certified providers, underscoring the need to rigorously quantify these metrics for clinical interpretation—particularly given the current emphasis on novelty over practical implementation in precision medicine models [71]. As the field progresses, the use of complex, unstandardized bioinformatics pipelines may compromise reproducibility. Incorporating well-characterized reference standards and systematically benchmarking both commercial and open-source tools (e.g., Amazon Genomics CLI, nf-core) will be essential to enhancing the clinical reliability and utility of T-NGS.

Conclusions

T-NGS sensitivity and specificity vary widely among certified providers. Balancing sensitivity and specificity requires robust reference standards, transparent pipelines, and optimized filtering strategies. Mixtures of homozygous and heterozygous DNAs serve as effective reference materials and should be adopted broadly to standardize T-NGS evaluation and improve reliability in precision oncology.

Methods

Preparation of reference-standard DNA materials

Blood samples were obtained with informed consent and used in accordance with the Declaration of Helsinki. The study received approval from the Institutional Review Board (IRB) of the National Cancer Center in Korea. Genomic DNA was extracted from blood samples using the DNeasy Blood and Tissue Kit (Qiagen, Valencia, CA). Human hydatidiform mole DNA (NA07489, Coriell, Camden, NJ) was further purified using QIAamp DNA Mini Kits to remove RNA contamination. Reference-standard DNAs (CH0.5–CH99.5) were prepared by mixing hydatidiform mole DNA (homozygote DNA) and blood-derived genomic DNA (heterozygote DNA) at ratios ranging from 99.5:0.5 to 0.5:99.5. DNA concentration and purity were assessed using a Nanodrop 8000 UV-vis spectrometer (Thermo Scientific, Waltham, MA) and a Qubit 2.0 Fluorometer (Life Technologies, Grand Island, NY). The reference standards were sent to Company AA (Macrogen, Seoul, Korea), Company BB (Theragen Bio, Seoul, Korea), and Company CC (Geninus, Seoul, Korea). The following kits were used: AA, Axen™ Cancer Panel; BB, SureSelect™ Cancer CGP Assay; CC, CancerSCAN™; and DD, TruSight™ Oncology 500 (TSO500) Assay. A sample input of 200 ng was used for each kit, except for the DD kit of TSO500, which required 30 ng.
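For a two-component mixture such as these reference standards, the VAF expected at a site is the mixing-fraction-weighted average of the VAFs in the two source DNAs. The sketch below illustrates that arithmetic. It assumes that both DNAs contribute genome copies in proportion to their mass fraction and that the site has the same copy number in both; the function name `expected_vaf` is hypothetical.

```python
def expected_vaf(fraction_a: float, vaf_a: float, vaf_b: float) -> float:
    """Expected VAF in a mixture of DNA A and DNA B.

    fraction_a: fraction of the mixture contributed by DNA A (0 to 1)
    vaf_a, vaf_b: VAF of the allele in pure DNA A and pure DNA B
    """
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must be between 0 and 1")
    return fraction_a * vaf_a + (1.0 - fraction_a) * vaf_b


# An allele that is heterozygous (VAF 0.5) in DNA A and absent in DNA B,
# with DNA A making up 5% of the mixture.
print(expected_vaf(0.05, 0.5, 0.0))
```

Under these assumptions, a heterozygous allele present in only one component appears at half that component's mixing fraction, which is why low mixing ratios probe the low-VAF range.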

Second batch samples, including CH0.5 and CH99.5, which were not included in the first batch, were sent separately to Company CC, and their raw data were produced at a different time. Similarly, additional second batch samples, including CH0.5, CH1, CH2.5, CH97.5, CH99, and CH99.5, were sent separately to Company BB, with their raw data also produced at a different time.

Raw data production for T-NGS analyses

The quality and quantity of DNA were re-evaluated following each company’s standard protocols. Target capture, paired-end NGS library construction, and sequencing were performed according to each company’s methods, all employing Illumina sequencing technology. Companies AA and CC are CAP-accredited, while Company BB is accredited by the Korean FDA for T-NGS services. The TSO500 kit (DD), which incorporates unique molecular identifiers (UMIs), was used by Company CC for raw data production.

Variant detection and alignment

Variants were analyzed using each company’s proprietary bioinformatics pipelines as well as the DRAGEN system (version 4.2; Illumina, San Diego, CA, USA), which employed the GRCh37 (hg19) reference genome. In-house analyses for Companies AA to CC were conducted using each company’s own alignment and variant calling tools under their respective conditions. Among these, only Company BB provided a BED file, while variant analyses for Companies AA and CC were restricted to exonic regions of target genes. For the DRAGEN system, default settings were applied for all steps.

For conventional analysis of the data from Company BB, sequencing reads from FASTQ files were aligned to the human reference genome (GRCh37) using BWA-MEM (v0.7.19) with default parameters [64], generating BAM files. PCR duplicates were marked, and base quality score recalibration was performed using GATK (v4.6.2) following the GATK Best Practices workflow [65, 66]. Somatic single nucleotide variants (SNVs) and small insertions/deletions (indels) were then called using GATK Mutect2 with default settings. The resulting variant calls were annotated using Funcotator, employing the default data sources provided by GATK.
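The order of steps in such a conventional workflow can be sketched as a list of command lines. The Python helper below only assembles the commands and does not run them. Several details are assumptions for illustration rather than the settings used in this study: the file names, the read-group string, the use of samtools for coordinate sorting, the known-sites resource passed to BaseRecalibrator, and the Funcotator data-source path.

```python
from typing import List


def conventional_pipeline_commands(
    sample: str,
    ref_fasta: str,
    fastq_r1: str,
    fastq_r2: str,
    known_sites_vcf: str,     # assumed resource for base recalibration
    funcotator_sources: str,  # assumed path to Funcotator data sources
    threads: int = 8,
) -> List[List[str]]:
    """Return the command lines of a BWA-MEM + GATK Mutect2 workflow, in order."""
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"
    sam = f"{sample}.sam"
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"
    recal_table = f"{sample}.recal.table"
    recal_bam = f"{sample}.recal.bam"
    vcf = f"{sample}.mutect2.vcf.gz"
    annotated = f"{sample}.funcotated.vcf"
    return [
        # 1. Align paired-end reads to the reference.
        ["bwa", "mem", "-t", str(threads), "-R", read_group,
         "-o", sam, ref_fasta, fastq_r1, fastq_r2],
        # 2. Coordinate-sort the alignments (samtools is an assumption here).
        ["samtools", "sort", "-o", sorted_bam, sam],
        # 3. Mark PCR duplicates.
        ["gatk", "MarkDuplicates", "-I", sorted_bam, "-O", dedup_bam,
         "-M", f"{sample}.dup_metrics.txt"],
        # 4. Build the base quality score recalibration table.
        ["gatk", "BaseRecalibrator", "-R", ref_fasta, "-I", dedup_bam,
         "--known-sites", known_sites_vcf, "-O", recal_table],
        # 5. Apply the recalibration.
        ["gatk", "ApplyBQSR", "-R", ref_fasta, "-I", dedup_bam,
         "--bqsr-recal-file", recal_table, "-O", recal_bam],
        # 6. Call somatic SNVs and small indels (tumor-only mode).
        ["gatk", "Mutect2", "-R", ref_fasta, "-I", recal_bam, "-O", vcf],
        # 7. Annotate the calls.
        ["gatk", "Funcotator", "-R", ref_fasta, "-V", vcf, "-O", annotated,
         "--output-file-format", "VCF",
         "--data-sources-path", funcotator_sources,
         "--ref-version", "hg19"],
    ]


commands = conventional_pipeline_commands(
    "CH5", "GRCh37.fa", "CH5_R1.fastq.gz", "CH5_R2.fastq.gz",
    "known_sites.vcf.gz", "funcotator_dataSources",
)
for command in commands:
    print(" ".join(command))
```

Each inner list could be passed to `subprocess.run(command, check=True)` once the tools and reference files are in place; keeping the steps as data makes the order of operations easy to inspect and to compare against another pipeline.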

For analysis of sequencing data from Company DD, the Illumina DRAGEN TSO500 pipeline was employed (Illumina), using default parameters [72]. The pipeline was executed with sample metadata formatted in accordance with Illumina standards and raw sequencing output in the form of base call (BCL) files. These BCL files were converted to FASTQ format and aligned to the human reference genome (GRCh37) using the DRAGEN hardware-accelerated alignment algorithm, producing BAM files. Variant calling was performed using the DRAGEN variant caller configured for the TSO500 panel, enabling detection of SNVs and small indels. Variant quality was assessed based on DRAGEN’s default thresholds. Functional annotation of the identified variants was conducted using the TSO500 Analysis Software (v2.1.1) and Illumina-provided annotation resources.

In silico dilution and variant calling

To simulate variant allele frequency (VAF) gradients without additional wet-lab experiments, we employed an in silico dilution strategy using raw paired-end FASTQ files generated from two distinct DNA sources—homozygous and heterozygous samples. Each FASTQ read pair (R1 and R2) was subsampled at predefined ratios (e.g., 0.5:99.5, 1:99, …, 99.5:0.5) to generate synthetic mixtures, while preserving read pairing integrity. These computationally generated datasets were then aligned to the human reference genome using the DRAGEN Bio-IT platform (Illumina). Variant calling and annotation were performed using DRAGEN’s built-in pipelines with default settings.
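A minimal sketch of this kind of pair-preserving subsampling is given below. It is not the authors' implementation: the function names are hypothetical, reads are kept by an independent random draw per pair, and the mixing step assumes that the two source datasets have similar sequencing depth, so that the kept fractions translate directly into the mixing ratio.

```python
import random
from itertools import islice
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, ...]  # header, sequence, '+', qualities


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield FASTQ records as 4-line tuples."""
    iterator = iter(lines)
    while True:
        record = tuple(islice(iterator, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def subsample_pairs(r1_lines: Iterable[str], r2_lines: Iterable[str],
                    keep_fraction: float, seed: int) -> List[Tuple[Record, Record]]:
    """Keep each read pair with probability `keep_fraction`.

    R1 and R2 are read in lockstep and a single random draw decides the
    fate of both mates, so pairing is preserved.
    """
    rng = random.Random(seed)
    kept = []
    for mate1, mate2 in zip(read_fastq(r1_lines), read_fastq(r2_lines)):
        if rng.random() < keep_fraction:
            kept.append((mate1, mate2))
    return kept


def mix_sources(source_a, source_b, fraction_a: float, seed: int = 0):
    """Combine pairs from two sources at roughly fraction_a : (1 - fraction_a).

    Each source is an (r1_lines, r2_lines) tuple. Equal depth in both
    sources is assumed.
    """
    pairs_a = subsample_pairs(source_a[0], source_a[1], fraction_a, seed)
    pairs_b = subsample_pairs(source_b[0], source_b[1], 1.0 - fraction_a, seed + 1)
    return pairs_a + pairs_b
```

In practice the line iterables would come from opened FASTQ files (for example via `gzip.open(path, "rt")`), and the kept pairs would be written back to separate R1 and R2 files in the same order.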

Estimation of T-NGS sensitivity

Detection sensitivity in NGS is typically expressed as the Limit of Detection (LOD), following guidelines from the American College of Medical Genetics and Genomics (ACMG) [12]. To estimate the sensitivity of T-NGS, we applied probit regression, a generalized linear model with a probit (inverse cumulative normal distribution) link function, to all or part of the observed data. The LOD was then determined from the fitted regression line as the lowest concentration at which 95% of alleles are detected.
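The LOD calculation can be illustrated with a small, dependency-free probit fit. This is a sketch under stated assumptions rather than the analysis code of the study: the predictor is taken to be log10 of the allele fraction in percent, the model is fitted by Fisher scoring, and the detection counts in the example are invented for illustration.

```python
import math
from statistics import NormalDist
from typing import Sequence, Tuple

_STD_NORMAL = NormalDist()


def fit_probit(x: Sequence[float], detected: Sequence[float],
               total: Sequence[float], max_iter: int = 100,
               tol: float = 1e-10) -> Tuple[float, float]:
    """Fit P(detected) = Phi(a + b * x) by Fisher scoring; return (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(max_iter):
        u0 = u1 = i00 = i01 = i11 = 0.0
        for xi, yi, ni in zip(x, detected, total):
            eta = a + b * xi
            p = min(max(_STD_NORMAL.cdf(eta), 1e-12), 1.0 - 1e-12)
            density = _STD_NORMAL.pdf(eta)
            variance = p * (1.0 - p)
            score = (yi - ni * p) * density / variance
            weight = ni * density * density / variance
            u0 += score
            u1 += score * xi
            i00 += weight
            i01 += weight * xi
            i11 += weight * xi * xi
        det = i00 * i11 - i01 * i01
        if det == 0.0:
            raise ValueError("singular information matrix")
        # Solve the 2x2 system (information matrix) * step = score.
        step_a = (i11 * u0 - i01 * u1) / det
        step_b = (i00 * u1 - i01 * u0) / det
        a += step_a
        b += step_b
        if abs(step_a) < tol and abs(step_b) < tol:
            break
    return a, b


def limit_of_detection(a: float, b: float, hit_rate: float = 0.95) -> float:
    """Allele fraction (same unit as the input levels) detected at `hit_rate`."""
    return 10.0 ** ((_STD_NORMAL.inv_cdf(hit_rate) - a) / b)


# Invented example: alleles detected out of 100 at each allele fraction (%).
levels_percent = [0.5, 1.0, 2.5, 5.0, 10.0]
n_detected = [34, 58, 84, 95, 99]
n_total = [100] * 5
a_hat, b_hat = fit_probit([math.log10(v) for v in levels_percent],
                          n_detected, n_total)
print(f"LOD95 = {limit_of_detection(a_hat, b_hat):.2f}%")
```

Working on the log10 scale is a common choice for dilution series but is an assumption here; fitting on the untransformed fraction would only require dropping the `math.log10` call and the `10 **` back-transformation.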

Reference-standard DNAs were used to estimate sensitivity. Alleles were classified by VAF thresholds as follows: 0 = null (N); 0 < VAF < 0.01 = He-ll; 0.01 ≤ VAF < 0.1 = He-low; 0.10–0.75 = heterozygous (He); and > 0.90 = homozygous (Ho). Sensitivity-informative pairs were then defined by comparing classifications between homozygous and heterozygous DNAs, yielding N–Ho, Ho–N, and N–He pairs. All other pairs were classified as sensitivity-uninformative alleles. Variant absence at sensitivity-informative sites in DNA mixtures was counted as a false negative (FN), and the VAFs at these sites were used to evaluate linearity across dilution ratios.
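The classification rules above translate directly into a short function. The sketch below makes two assumptions that the text does not spell out: the upper bound of the heterozygous range (0.75) is treated as inclusive, and in each pair the first class refers to the homozygote DNA and the second to the heterozygote DNA. VAFs between 0.75 and 0.90 are not assigned a class in the text and are returned as `None`.

```python
from typing import Optional


def classify_vaf(vaf: float) -> Optional[str]:
    """Map a VAF to N, He-ll, He-low, He or Ho; None if no class applies."""
    if vaf == 0:
        return "N"
    if 0 < vaf < 0.01:
        return "He-ll"
    if 0.01 <= vaf < 0.10:
        return "He-low"
    if 0.10 <= vaf <= 0.75:
        return "He"
    if vaf > 0.90:
        return "Ho"
    return None  # 0.75 < VAF <= 0.90 is not assigned in the text


# (class in homozygote DNA, class in heterozygote DNA); order is an assumption.
SENSITIVITY_INFORMATIVE_PAIRS = {("N", "Ho"), ("Ho", "N"), ("N", "He")}


def is_sensitivity_informative(vaf_homozygote_dna: float,
                               vaf_heterozygote_dna: float) -> bool:
    """True if the site forms an N-Ho, Ho-N or N-He pair."""
    pair = (classify_vaf(vaf_homozygote_dna),
            classify_vaf(vaf_heterozygote_dna))
    return pair in SENSITIVITY_INFORMATIVE_PAIRS


print(classify_vaf(0.005), classify_vaf(0.05), classify_vaf(0.5))
print(is_sensitivity_informative(0.0, 0.48))
```

Sites of this kind are informative because the variant is contributed by only one of the two DNAs, so its expected VAF in a mixture scales with the mixing ratio.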

Estimation of T-NGS specificity

Specificity was evaluated using results from the reference-standard DNAs. Specificity-informative alleles were defined as those present in T-NGS results from mixed reference-standard DNAs but absent from both DNA1 and DNA2. Specificity-informative alleles were categorized into three pairs: 1) R-R pairs (reference bases in both homozygote and heterozygote DNAs); 2) V-V pairs (no reference bases in either homozygote or heterozygote DNA); and 3) V-R pairs (both reference and variant bases in homozygote and heterozygote DNAs). FP errors in V-V pairs (VVR errors) were excluded from total FP error calculations.

For FP error analysis, HeL pairs included He-low–He-low, He-low–N, and N–He-low combinations, while HeLL pairs included He-ll–He-ll, He-ll–N, and N–He-ll.
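The FP definitions above can be expressed with simple set operations. In this sketch an allele is identified by a hypothetical `(chromosome, position, ref, alt)` tuple, and the class labels are assumed to follow the N, He-ll and He-low scheme defined for the sensitivity analysis. The function names are illustrative only.

```python
from typing import Optional, Set, Tuple

Allele = Tuple[str, int, str, str]  # chromosome, position, ref, alt


def false_positive_alleles(mixture_calls: Set[Allele],
                           dna1_calls: Set[Allele],
                           dna2_calls: Set[Allele]) -> Set[Allele]:
    """Alleles called in a mixture but absent from both source DNAs."""
    return mixture_calls - (dna1_calls | dna2_calls)


HEL_PAIRS = {("He-low", "He-low"), ("He-low", "N"), ("N", "He-low")}
HELL_PAIRS = {("He-ll", "He-ll"), ("He-ll", "N"), ("N", "He-ll")}


def low_vaf_pair_category(class_dna1: str, class_dna2: str) -> Optional[str]:
    """Return 'HeL' or 'HeLL' for the low-VAF pair types, else None."""
    pair = (class_dna1, class_dna2)
    if pair in HEL_PAIRS:
        return "HeL"
    if pair in HELL_PAIRS:
        return "HeLL"
    return None


# Hypothetical calls, for illustration only.
dna1 = {("chr1", 100, "A", "G")}
dna2 = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")}
mixture = {("chr1", 100, "A", "G"), ("chr3", 300, "G", "A")}
print(false_positive_alleles(mixture, dna1, dna2))
print(low_vaf_pair_category("He-low", "N"))
```

Because the two source DNAs are sequenced alongside the mixtures, this comparison needs no external truth set: any allele that appears only in a mixture cannot have come from either component.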

Validation of variants by single base extension

To assess the presence of variants in homozygote and heterozygote DNAs, target regions flanking the variant sites were amplified by PCR using forward and reverse primers (FP and RP; Additional file 2: Table S3) under the following cycling conditions: 28 cycles of 40 s at 95 °C, 40 s at 60 °C, and 60 s at 72 °C. Annealing temperatures were adjusted to 70 °C for MDC1 and 64 °C for APOBEC3B. In addition, 5% DMSO was added to the PCR reaction buffer for targets in APOBEC3B and NOTCH2. PCR products were purified using the AxyPrep™ PCR Cleanup Kit (Axygen Scientific, Inc., Union City, CA) to remove residual primers.

The purified PCR products were subjected to single base extension (SBE) using the SNaPshot Kit (Applied Biosystems, Foster City, CA, USA) and extension primers (EP; Additional file 2: Table S3) under the following cycling conditions: 25 cycles of 10 s at 96 °C, 5 s at 50 °C, and 30 s at 60 °C. To eliminate non-specific extension products, 1 μl of alkaline phosphatase (Roche, Mannheim, Germany) was added to the reaction. Subsequently, 1 μl of the treated product was mixed with 14 μl of HiDi™ formamide (Applied Biosystems) and 0.5 μl of GeneScan-120 LIZ size standard (Applied Biosystems), then denatured at 95 °C for 5 min.

The extended products were analyzed on a 3500 Genetic Analyzer (Applied Biosystems), and peak profiles were assessed using GeneMapper software (version 6.1, Applied Biosystems).

Supplementary Information

13059_2025_3882_MOESM1_ESM.docx (9.7MB, docx)

Additional file 1. Fig. S1 Correlation of observed variant allelic fractions and expected values. Fig. S2 Per base sequence quality of raw T-NGS data for DNA2 sample from each company. Fig. S3 Detection of Ho-N, N-Ho, and N-He alleles by in silico and conventional methods. Fig. S4 Total false positive errors in T-NGS variant calls for Company BB with exclusion or inclusion of VVR errors. Fig. S5 Detection rate of diluted reference bases in mixed DNA reference standards. Fig. S6 False positive (FP) error–prone alleles in the in-house results from Company CC. Fig. S7 Batch-to-batch difference in the in-house results from Companies BB and CC. Fig. S8 False positive (FP) errors across bioinformatics methods (DRAGEN, in-house, and conventional) for Company BB. Fig. S9 Total false positive (FP) error rates including HeL and HeLL pairs. Fig. S10 Single base extension (SBE) results suggesting N–N pairs. Fig. S11 Single base extension (SBE) results suggesting variants with VAFs between 0.01 and 0.03. Fig. S12 Single base extension (SBE) results showing strong peak signals.

13059_2025_3882_MOESM2_ESM.xlsx (26.6KB, xlsx)

Additional file 2. Table S1. Quality control summary of analyzed samples from four companies. Table S2. Comparison of single base extension (SBE) results with DRAGEN (D) and in-house (I) analyses. Table S3. Primer information for single base extension assays.

Acknowledgements

The Bioinformatics Analysis Team assisted with DRAGEN-based raw data analysis; the Biostatistics Collaboration Team supported the statistical analyses; and the Genetic Analysis Team supported the single base extension assays at the Research Core Center, National Cancer Center, Korea.

Peer review information

Andrew Cosgrove and Claudia Feng were the primary editors of this article and managed its editorial process and peer review in collaboration with the rest of the editorial team. The peer-review history is available in the online version of this article.

Author contributions

Conceptualization, K-M.H.; methodology, E-K.K., H.W.C., S.G.H., Y-H.K.; in vitro assay, H.W.C.; data acquisition and formal analysis, Y.M., J-K.K.; statistical analysis, D-e.L.; data curation, Y.M., C.H.H., Y-H.K.; writing-original draft preparation, Y.M., C.H.H., K-M.H.; writing-review and editing, T-M.K., S.G.H., N.H.; supervision and funding acquisition, K-M.H.

Funding

This research was supported by the intramural program of the National Cancer Center, Korea (grant 2311430 to K-M. H.).

Data availability

All T-NGS FASTQ files used for this study are publicly available in the NCBI Sequence Read Archive (SRA) under the BioProject accession number PRJNA1134909 (https://www.ncbi.nlm.nih.gov/sra/PRJNA1134909) [73].

Declarations

Ethics approval and consent to participate

Blood samples were obtained with informed consent, and the study received approval from the Institutional Review Board (IRB) of the National Cancer Center in Korea.

Consent for publication

All authors read and approved the final manuscript. All participants provided consent for publication.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Youngbeen Moon and Young-Ho Kim contributed equally to this work.

References

  • 1.Goodwin S, McPherson JD, McCombie WR. Coming of age: ten years of next-generation sequencing technologies. Nat Rev Genet. 2016;17:333–51. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Slatko BE, Gardner AF, Ausubel FM. Overview of Next-Generation sequencing technologies. Curr Protoc Mol Biol. 2018;122:e59. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Nangalia J, Campbell PJ. Genome sequencing during a patient’s journey through cancer. N Engl J Med. 2019;381:2145–56. [DOI] [PubMed] [Google Scholar]
  • 4.Consortium ITP-CAWG. Pan-cancer analysis of whole genomes. Nature. 2020;578:82–93. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Moore L, Cagan A, Coorens THH, Neville MDC, Sanghvi R, Sanders MA, Oliver TRW, Leongamornlert D, Ellis P, Noorani A, et al. The mutational landscape of human somatic and germline cells. Nature. 2021;597:381–6. [DOI] [PubMed] [Google Scholar]
  • 6.Quail MA, Smith M, Coupland P, Otto TD, Harris SR, Connor TR, et al. A tale of three next generation sequencing platforms: comparison of ion Torrent, Pacific biosciences and illumina miseq sequencers. BMC Genomics. 2012;13:341. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Mu W, Lu HM, Chen J, Li S, Elliott AM. Sanger confirmation is required to achieve optimal sensitivity and specificity in next-generation sequencing panel testing. J Mol Diagn. 2016;18:923–32. [DOI] [PubMed] [Google Scholar]
  • 8.Hudson AM, Yates T, Li Y, Trotter EW, Fawdar S, Chapman P, Lorigan P, Biankin A, Miller CJ, Brognard J. Discrepancies in cancer genomic sequencing highlight opportunities for driver mutation discovery. Cancer Res. 2014;74:6390–6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Kim YH, Song Y, Kim JK, Kim TM, Sim HW, Kim HL, et al. False-negative errors in next-generation sequencing contribute substantially to inconsistency of mutation databases. PLoS One. 2019;14:e0222535. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Food US, Administration D. Considerations for design, development, and analytical validation of next generation sequencing (NGS)-based in vitro diagnostics (IVDs) intended to aid in the diagnosis of suspected germline diseases: guidance for stakeholders and Food and Drug Administration staff. Silver Spring (MD): U.S. Food and Drug Administration; 2018. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/considerations-design-development-and-analytical-validation-next-generation-sequencing-ngs-based.
  • 11.Luh F, Yen Y. FDA guidance for next generation sequencing-based testing: balancing regulation and innovation in precision medicine. NPJ Genom Med. 2018;3:28. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Rehm HL, Bale SJ, Bayrak-Toydemir P, Berg JS, Brown KK, Deignan JL, Friez MJ, Funke BH, Hegde MR, Lyon E, et al. ACMG clinical laboratory standards for next-generation sequencing. Genet Med. 2013;15:733–47. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Xiao W, Ren L, Chen Z, Fang LT, Zhao Y, Lack J, Guan M, Zhu B, Jaeger E, Kerrigan L, et al. Toward best practice in cancer mutation detection with whole-genome and whole-exome sequencing. Nat Biotechnol. 2021;39:1141–50. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Zhao Y, Fang LT, Shen TW, Choudhari S, Talsania K, Chen X, Shetty J, Kriga Y, Tran B, Zhu B, et al. Whole genome and exome sequencing reference datasets from a multi-center and cross-platform benchmark study. Sci Data. 2021;8:296. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Fang LT, Zhu B, Zhao Y, Chen W, Yang Z, Kerrigan L, Langenbach K, de Mars M, Lu C, Idler K, et al. Establishing community reference samples, data and call sets for benchmarking cancer mutation detection using whole-genome sequencing. Nat Biotechnol. 2021;39:1151–60. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Jones W, Gong B, Novoradovskaya N, Li D, Kusko R, Richmond TA, Johann DJ Jr., Bisgin H, Sahraeian SME, Bushel PR, et al. A verified genomic reference sample for assessing performance of cancer panels detecting small variants of low allele frequency. Genome Biol. 2021;22:111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Mercer TR, Xu J, Mason CE, Tong W, Consortium MS. The sequencing quality control 2 study: establishing community standards for sequencing in precision medicine. Genome Biol. 2021;22:306. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Natarajan KN, Miao Z, Jiang M, Huang X, Zhou H, Xie J, Wang C, Qin S, Zhao Z, Wu L, et al. Comparative analysis of sequencing technologies for single-cell transcriptomics. Genome Biol. 2019;20:70. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Chen W, Zhao Y, Chen X, Yang Z, Xu X, Bi Y, Chen V, Li J, Choi H, Ernest B, et al. A multicenter study benchmarking single-cell RNA sequencing technologies using reference samples. Nat Biotechnol. 2021;39:1103–14. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Deveson IW, Gong B, Lai K, LoCoco JS, Richmond TA, Schageman J, et al. Evaluating the analytical validity of circulating tumor DNA sequencing assays for precision oncology. Nat Biotechnol. 2021;39:1115–28. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 21.Foox J, Tighe SW, Nicolet CM, Zook JM, Byrska-Bishop M, Clarke WE, Khayat MM, Mahmoud M, Laaguiby PK, Herbert ZT, et al. Performance assessment of DNA sequencing platforms in the ABRF Next-Generation sequencing study. Nat Biotechnol. 2021;39:1129–40. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Kim H, Yun JW, Lee ST, Kim HJ, Kim SH, Kim JW. Korean society for genetic diagnostics clinical guidelines C: Korean society for genetic diagnostics guidelines for validation of Next-Generation Sequencing-Based somatic variant detection in hematologic malignancies. Ann Lab Med. 2019;39:515–23. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Lee SH, Lee B, Shim JH, Lee KW, Yun JW, Kim SY, Kim TY, Kim YH, Ko YH, Chung HC, et al. Landscape of actionable genetic alterations profiled from 1,071 tumor samples in Korean cancer patients. Cancer Res Treat. 2019;51:211–22. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Lee J, Choi S, Jung D, Jung Y, Kim JH, Jung S, Lee WS. Mutational characterization of colorectal cancer from Korean patients with targeted sequencing. J Cancer. 2021;12:7300–10. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Jung K, Lee S, Na HY, Kim JW, Lee JC, Hwang JH, Kim JW, Kim J. NGS-based targeted gene mutational profiles in Korean patients with pancreatic cancer. Sci Rep. 2022;12:20937. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Suh KJ, Kim SH, Kim YJ, Shin H, Kang E, Kim EK, et al. Clinical application of next-generation sequencing in patients with breast cancer: real-world data. J Breast Cancer. 2022;25:366–78. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Jang M, Pak HY, Heo JY, Lim H, Choi YL, Shim HS, et al. Trends and clinical characteristics of next-generation sequencing-based genetic panel tests: an analysis of Korean nationwide claims data. Cancer Res Treat. 2024;56:27–36. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Fisher R, Pusztai L, Swanton C. Cancer heterogeneity: implications for targeted therapeutics. Br J Cancer. 2013;108:479–85. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Rubben A, Araujo A. Cancer heterogeneity: converting a limitation into a source of biologic information. J Transl Med. 2017;15:190. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Zhang J, Spath SS, Marjani SL, Zhang W, Pan X. Characterization of cancer genomic heterogeneity by next-generation sequencing advances precision medicine in cancer treatment. Precis Clin Med. 2018;1:29–48. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 31.Abbasi A, Alexandrov LB. Significance and limitations of the use of next-generation sequencing technologies for detecting mutational signatures. DNA Repair (Amst). 2021;107:103200. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Frampton GM, Fichtenholtz A, Otto GA, Wang K, Downing SR, He J, Schnall-Levin M, White J, Sanford EM, An P, et al. Development and validation of a clinical cancer genomic profiling test based on massively parallel DNA sequencing. Nat Biotechnol. 2013;31:1023–31. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Shin HT, Choi YL, Yun JW, Kim NKD, Kim SY, Jeon HJ, Nam JY, Lee C, Ryu D, Kim SC, et al. Prevalence and detection of low-allele-fraction variants in clinical cancer samples. Nat Commun. 2017;8:1377. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Miura T, Yasuda S, Sato Y. A simple method to estimate the in-house limit of detection for genetic mutations with low allele frequencies in whole-exome sequencing analysis by next-generation sequencing. BMC Genom Data. 2021;22:8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 35.Sahraeian SME, Fang LT, Karagiannis K, Moos M, Smith S, Santana-Quintero L, Xiao C, Colgan M, Hong H, Mohiyuddin M, Xiao W. Achieving robust somatic mutation detection with deep learning models derived from reference data sets of a cancer sample. Genome Biol. 2022;23:12. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Gong B, Li D, Kusko R, Novoradovskaya N, Zhang Y, Wang S, Pabon-Pena C, Zhang Z, Lai K, Cai W, et al. Cross-oncopanel study reveals high sensitivity and accuracy with overall analytical performance depending on genomic regions. Genome Biol. 2021;22:109. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Gong B, Kusko R, Jones W, Tong W, Xu J. Ultra-deep multi-oncopanel sequencing of benchmarking samples with a wide range of variant allele frequencies. Sci Data. 2022;9:288. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Li W, Huang X, Patel R, Schleifman E, Fu S, Shames DS, et al. Analytical evaluation of circulating tumor DNA sequencing assays. Sci Rep. 2024;14:4973. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Kou R, Lam H, Duan H, Ye L, Jongkam N, Chen W, Zhang S, Li S. Benefits and challenges with applying unique molecular identifiers in next generation sequencing to detect low frequency mutations. PLoS ONE. 2016;11:e0146638. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Tsagiopoulou M, Maniou MC, Pechlivanis N, Togkousidis A, Kotrova M, Hutzenlaub T, Kappas I, Chatzidimitriou A, Psomopoulos F. UMIc: A preprocessing method for UMI deduplication and reads correction. Front Genet. 2021;12:660366. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Sun J, Philpott M, Loi D, Li S, Monteagudo-Mesas P, Hoffman G, Robson J, Mehta N, Gamble V, Brown T Jr., et al. Correcting PCR amplification errors in unique molecular identifiers to generate accurate numbers of sequencing molecules. Nat Methods. 2024;21:401–5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 42.Jennings LJ, Arcila ME, Corless C, Kamel-Reid S, Lubin IM, Pfeifer J, et al. Guidelines for validation of Next-Generation Sequencing-Based oncology panels: a joint consensus recommendation of the association for molecular pathology and college of American pathologists. J Mol Diagn. 2017;19:341–65. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Petrackova A, Vasinek M, Sedlarikova L, Dyskova T, Schneiderova P, Novosad T, Papajik T, Kriegova E. Standardization of sequencing coverage depth in NGS: recommendation for detection of clonal and subclonal mutations in cancer diagnostics. Front Oncol. 2019;9:851. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Deshpande D, Chhugani K, Chang Y, Karlsberg A, Loeffler C, Zhang J, Muszynska A, Munteanu V, Yang H, Rotman J, et al. RNA-seq data science: from Raw data to effective interpretation. Front Genet. 2023;14:997383. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 45.Matthijs G, Souche E, Alders M, Corveleyn A, Eck S, Feenstra I, Race V, Sistermans E, Sturm M, Weiss M, et al. Guidelines for diagnostic next-generation sequencing. Eur J Hum Genet. 2016;24:2–5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Bauer P, Kandaswamy KK, Weiss MER, Paknia O, Werber M, Bertoli-Avella AM, Yuksel Z, Bochinska M, Oprea GE, Kishore S, et al. Development of an evidence-based algorithm that optimizes sensitivity and specificity in ES-based diagnostics of a clinically heterogeneous patient population. Genet Med. 2019;21:53–61. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Rose Brannon A, Jayakumaran G, Diosdado M, Patel J, Razumova A, Hu Y, Meng F, Haque M, Sadowska J, Murphy BJ, et al. Enhanced specificity of clinical high-sensitivity tumor mutation profiling in cell-free DNA via paired normal sequencing using MSK-ACCESS. Nat Commun. 2021;12:3770. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Salk JJ, Schmitt MW, Loeb LA. Enhancing the accuracy of next-generation sequencing for detecting rare and subclonal mutations. Nat Rev Genet. 2018;19:269–85. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Harismendy O, Ng PC, Strausberg RL, Wang X, Stockwell TB, Beeson KY, Schork NJ, Murray SS, Topol EJ, Levy S, Frazer KA. Evaluation of next generation sequencing platforms for population targeted sequencing studies. Genome Biol. 2009;10:R32. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Bacher U, Shumilov E, Flach J, Porret N, Joncourt R, Wiedemann G, Fiedler M, Novak U, Amstutz U, Pabst T. Challenges in the introduction of next-generation sequencing (NGS) for diagnostics of myeloid malignancies into clinical routine use. Blood Cancer J. 2018;8:113. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Song P, Wu LR, Yan YH, Zhang JX, Chu T, Kwong LN, Patel AA, Zhang DY. Limitations and opportunities of technologies for the analysis of cell-free DNA in cancer diagnostics. Nat Biomed Eng. 2022;6:232–45. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 52.Newman AM, Bratman SV, To J, Wynne JF, Eclov NC, Modlin LA, Liu CL, Neal JW, Wakelee HA, Merritt RE, et al. An ultrasensitive method for quantitating Circulating tumor DNA with broad patient coverage. Nat Med. 2014;20:548–54. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Dai P, Wu LR, Chen SX, Wang MX, Cheng LY, Zhang JX, Hao P, Yao W, Zarka J, Issa GC, et al. Calibration-free NGS quantitation of mutations below 0.01% VAF. Nat Commun. 2021;12:6123. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Willey JC, Morrison TB, Austermiller B, Crawford EL, Craig DJ, Blomquist TM, Jones WD, Wali A, Lococo JS, Haseley N, et al. Advancing NGS quality control to enable measurement of actionable mutations in Circulating tumor DNA. Cell Rep Methods. 2021;1:100106. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Cheng C, Xiao P. Evaluation of the correctable decoding sequencing as a new powerful strategy for DNA sequencing. Life Sci Alliance. 2022;5:e202101294. [DOI] [PMC free article] [PubMed]
  • 56.Cheng C, Fei Z, Xiao P. Methods to improve the accuracy of next-generation sequencing. Front Bioeng Biotechnol. 2023;11:982111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Hosomichi K, Shiina T, Tajima A, Inoue I. The impact of next-generation sequencing technologies on HLA research. J Hum Genet. 2015;60:665–73. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Weimer ET, Montgomery M, Petraroia R, Crawford J, Schmitz JL. Performance characteristics and validation of Next-Generation sequencing for human leucocyte antigen typing. J Mol Diagn. 2016;18:668–75. [DOI] [PubMed] [Google Scholar]
  • 59.Ka S, Lee S, Hong J, Cho Y, Sung J, Kim HN, et al. Hlascan: genotyping of the HLA region using next-generation sequencing data. BMC Bioinformatics. 2017;18:258. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60. Sverchkova A, Burkholz S, Rubsamen R, Stratford R, Clancy T. Integrative HLA typing of tumor and adjacent normal tissue can reveal insights into the tumor immune response. BMC Med Genomics. 2024;17:37.
  • 61. Groza C, Kwan T, Soranzo N, Pastinen T, Bourque G. Personalized and graph genomes reveal missing signal in epigenomic data. Genome Biol. 2020;21:124.
  • 62. Koboldt DC. Best practices for variant calling in clinical sequencing. Genome Med. 2020;12:91.
  • 63. Ben-David U, Siranosian B, Ha G, Tang H, Oren Y, Hinohara K, Strathdee CA, Dempster J, Lyons NJ, Burns R, et al. Genetic and transcriptional evolution alters cancer cell line drug response. Nature. 2018;560:325–30.
  • 64. Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. In arXiv e-prints, vol. arXiv:1303.3997; 2013.
  • 65. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA. The genome analysis toolkit: a mapreduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20:1297–303.
  • 66. DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M, et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011;43:491–8.
  • 67. Pfeifer SP. From next-generation resequencing reads to a high-quality variant data set. Heredity (Edinb). 2017;118:111–24.
  • 68. Tomkova M, Schuster-Bockler B. DNA modifications: naturally more error prone? Trends Genet. 2018;34:627–38.
  • 69. Davis EM, Sun Y, Liu Y, Kolekar P, Shao Y, Szlachta K, et al. SequencErr: measuring and suppressing sequencer errors in next-generation sequencing data. Genome Biol. 2021;22:37.
  • 70. Zhang Y, Blomquist TM, Kusko R, Stetson D, Zhang Z, Yin L, Sebra R, Gong B, Lococo JS, Mittal VK, et al. Deep oncopanel sequencing reveals within block position-dependent quality degradation in FFPE processed samples. Genome Biol. 2022;23:141.
  • 71. Markowetz F. All models are wrong and yours are useless: making clinical prediction models impactful for patients. NPJ Precis Oncol. 2024;8:54.
  • 72. Behera S, Catreux S, Rossi M, Truong S, Huang Z, Ruehle M, Visvanath A, Parnaby G, Roddey C, Onuchic V, et al. Comprehensive genome analysis and variant detection at scale using DRAGEN. Nat Biotechnol. 2025;43:1177–91.
  • 73. Moon Y, Kim YH, Kim JK, Hong CH, Kang EK, Choi HW, Lee D-E, Kim T-M, Heo SG, Han N, Hong K-M. Evaluation of false positive and false negative errors in targeted next generation sequencing. Datasets. NCBI Seq Read Archive 2024, PRJNA1134909 (https://www.ncbi.nlm.nih.gov/sra/PRJNA1134909).

Associated Data

Supplementary Materials

13059_2025_3882_MOESM1_ESM.docx (9.7MB, docx)

Additional file 1. Fig. S1 Correlation of observed variant allelic fractions and expected values. Fig. S2 Per base sequence quality of raw T-NGS data for DNA2 sample from each company. Fig. S3 Detection of Ho-N, N-Ho, and N-He alleles by in silico and conventional methods. Fig. S4 Total false positive errors in T-NGS variant calls for Company BB with exclusion or inclusion of VVR errors. Fig. S5 Detection rate of diluted reference bases in mixed DNA reference standards. Fig. S6 False positive (FP) error–prone alleles in the in-house results from Company CC. Fig. S7 Batch-to-batch difference in the in-house results from Companies BB and CC. Fig. S8 False positive (FP) errors across bioinformatics methods (DRAGEN, in-house, and conventional) for Company BB. Fig. S9 Total false positive (FP) error rates including HeL and HeLL pairs. Fig. S10 Single base extension (SBE) results suggesting N–N pairs. Fig. S11 Single base extension (SBE) results suggesting variants with VAFs between 0.01 and 0.03. Fig. S12 Single base extension (SBE) results showing strong peak signals.

13059_2025_3882_MOESM2_ESM.xlsx (26.6KB, xlsx)

Additional file 2. Table S1. Quality control summary of analyzed samples from four companies. Table S2. Comparison of single base extension (SBE) results with DRAGEN (D) and in-house (I) analyses. Table S3. Primer information for single base extension assays.

Data Availability Statement

All T-NGS FASTQ files used in this study are publicly available in the NCBI Sequence Read Archive (SRA) under BioProject accession number PRJNA1134909 (https://www.ncbi.nlm.nih.gov/sra/PRJNA1134909) [73].


Articles from Genome Biology are provided here courtesy of BMC
