Author manuscript; available in PMC: 2022 May 15.
Published in final edited form as: Cancer. 2020 Dec 21;127(10):1576–1589. doi: 10.1002/cncr.33393

Ultra-sensitive detection of tumor-specific mutations in saliva of patients with oral cavity squamous cell carcinoma

Ashwini Shanmugam 1,*, Arun K Hariharan 1,*, Rifat Hasina 2,*, Jayalakshmi R Nair 1, Shanmukh Katragadda 1, Sivaraj Irusappan 1, Aarthi Ravichandran 1, Vamsi Veeramachaneni 1, Radhakrishna Bettadapura 1, Muddasir Bhati 3, Veena Ramaswamy 1, Vishal US Rao 4, Ritvi K Bagadia 4, Ashwini Manjunath 1, Manjunath NML 4, Monica Charlotte Solomon 5, Shiuli Maji 1, Urvashi Bahadur 1, Chetan Bettegowda 6, Nickolas Papadopoulos 6, Mark W Lingen 7, Ramesh Hariharan 1, Vaijayanti Gupta 1, Nishant Agrawal 2, Evgeny Izumchenko 8
PMCID: PMC8084899  NIHMSID: NIHMS1654758  PMID: 33405231

Abstract

Background:

Oral cavity squamous cell carcinoma (OCSCC) is the most common head and neck malignancy. While survival of patients with advanced stage disease remains ~20–60%, when detected at early stage, survival approaches 80%, posing a pressing need for a well-validated profiling method for patients with high risk of developing OCSCC. Tumor DNA detection in saliva may provide a robust biomarker platform that overcomes limitations of current diagnostic tests. However, there is no routine saliva-based screening method for patients with OCSCC.

Methods:

We have designed a custom next generation sequencing panel with unique molecular identifiers that covers coding regions of 7 frequently mutated genes in OCSCC, and applied it on DNA extracted from 121 treatment-naïve OCSCCs and matched preoperative saliva specimens.

Results:

Using stringent variant calling criteria, mutations were detected in 106 tumors, consistent with a predicted detection of at least 88%. Moreover, mutations identified in primary malignancies were also detected in 93% of saliva samples. To ensure that variants are not errors resulting in false positive calls, we performed a multistep analytical validation of this approach: (i) re-sequencing of 46 saliva samples confirmed 88% of somatic variants; (ii) no functionally relevant mutations were detected in saliva samples from 11 healthy subjects without history of tobacco and alcohol use; (iii) using a panel of 7 synthetic loci across 8 sequencing runs, we confirmed that our platform is reproducible and provides sensitivity on par with droplet digital PCR.

Conclusions:

These data highlight the feasibility of somatic mutation identification in driver genes in saliva collected upon OCSCC diagnosis.

Keywords: OCSCC, mutation, saliva, oral rinse, early detection, NGS, liquid biopsy

Precis:

We have designed a custom NGS test with unique molecular identifiers that covers the coding regions of 7 frequently mutated genes in OCSCC (this minimal gene set predicted incidence of at least one somatic aberration in 88% of OCSCC patients). Our results demonstrate that this quick, sensitive, and non-invasive method can be used for detection of low frequency tumor-associated mutations in saliva specimens collected from patients with OCSCC.

Introduction

Oral cavity squamous cell carcinoma (OCSCC) accounts for nearly 50% of all head and neck cancers1. In 2018 alone, there were an estimated 355,000 oral cancer cases and 177,000 deaths worldwide2, with India having the highest case burden of 120,000. OCSCC is notorious for poor prognosis, which reflects its propensity to present as clinically advanced disease upon diagnosis1,3,4. Despite numerous therapeutic advances, the long-term survival for patients with HPV-negative OCSCC has remained ~55%, and earlier detection is critical5–10. If caught early, a patient has a survival rate of 80%, in sharp contrast to survival of 20–60% when diagnosed in the later stages. Consumption of alcohol, tobacco products, betel quid, and areca nut increases the risk of OCSCC. While prominent in oropharyngeal cancer, the prevalence of human papillomavirus (HPV) infection in OCSCC is low (~2.2%) and its significance remains debatable11–14.

Current standard detection methods for OCSCC include the conventional visual and tactile exam (CVTE) followed by tissue biopsy and histologic evaluation. However, their use as a modality for large-scale population-based screening has recognized limitations. First, a sampling bias may lead to underdiagnosis or misdiagnosis, particularly in diffuse and/or multifocal lesions. Second, these procedures are invasive, require specialized expertise, and are associated with pain/discomfort, sometimes leading to treatment delay15–17. As early diagnosis is crucial for reducing the mortality rate of OCSCC patients, several adjunctive screening devices/tests (such as hand-held light-based devices for assessing autofluorescence/tissue reflectance) have recently emerged with claims of enhancing the identification and prognostication of oral lesions18–22. Recently, the Council on Scientific Affairs of the American Dental Association (ADA) conducted a comprehensive systematic review of the published literature with the goal of providing primary care clinicians with practical, real-world recommendations regarding the clinical utility of the commercially available adjuncts/tests in the context of screening for oral potentially malignant disorders23,24. The meta-analysis concluded that there is insufficient evidence that any of the current devices/tests has sufficient diagnostic accuracy to be used in conjunction with the CVTE, underscoring the need for molecular-based biomarkers.

Somatic mutations are one of the hallmarks of carcinogenic progression and allow reliable differentiation between cancer and normal tissues. The exclusive nature of tumor-defining driver genetic alterations makes them attractive biomarkers with a theoretical specificity approaching 100% when detectable. Although head and neck squamous cell carcinoma (HNSCC) is considered to be a vastly heterogeneous disease with respect to the specific mutations harbored in a given tumor, our group and others have revealed that HNSCC is largely driven by tumor suppressor mutations, with TP53 mutated in 86% of HPV-negative samples, followed by FAT1, CDKN2A, and NOTCH1 mutations in approximately 20% of samples25–27. In addition, mutations in the oncogene PIK3CA were also identified in approximately 20% of head and neck malignancies. As such, genetic alterations in frequently mutated genes may serve as predictive biomarkers for detection of HNSCC despite the considerable inter-tumoral heterogeneity among HNSCC patients. Mutation-based DNA biomarkers have several distinct advantages: unlike RNA and protein, they have no physiologic background and are not influenced by signaling changes induced during disease progression or therapy. Furthermore, unlike RNA or protein-based assays, DNA-based alterations should theoretically be found at appreciable levels only within cancer cells and not normal cells, allowing high specificity. Moreover, DNA is stable and amplifiable and may better stoichiometrically correlate with disease burden. Collectively, cancer-specific genetic mutations allow for tremendous specificity.

It is known that as a tumor grows, it sheds tumor cells and DNA into various body fluids, including saliva28–32. The presence of tumor cells and tumor-derived DNA (tDNA) in saliva of patients with head and neck cancer is well-documented33–35. Several studies have shown that mutations in tDNA exactly correspond to mutations in the primary tumor31,36–38, and could therefore be used as a surrogate for a biopsy31. Given its non-invasive and easy-to-collect nature, saliva is an attractive matrix for inexpensive screening, diagnosis, monitoring, and post-treatment surveillance of OCSCC, including for cases which require serial sample collection over time. While tDNA is often present at frequencies less than 1% of all DNA in saliva samples, including in patients with significant disease burden29,31, remarkable advances in next-generation sequencing (NGS) technologies (such as incorporation of molecular tagging and advancements in algorithms for background noise suppression) have made NGS a highly sensitive and specific platform for rare allele detection39–42. We have designed a custom NGS test with unique molecular identifiers (UMIs) that covers the coding regions of 7 frequently mutated genes in OCSCC. This minimal gene set, derived from the analysis of three independent public datasets, predicted incidence of at least one somatic aberration in 88% of patients with OCSCC, a prediction subsequently confirmed by our study. This suggests that targeted biomarker sequencing may provide a viable alternative for identifying somatic mutations to whole exome sequencing (WES), which is associated with long processing times, high cost, and difficulties in confidently calling variants due to the low sequencing depth. We recruited 121 treatment-naïve OCSCC patients, 44% of whom had early-stage disease (Stages I and II), and sequence-profiled DNA isolated from the primary tumor tissue and matched pre-operative oral rinse using this test. Saliva samples from eleven healthy subjects were also profiled in duplicate for clinical specificity. In this paper, we present a low-cost, rapid and accurate NGS-based test with high clinical utility aimed at detecting mutations in the oral rinse for early diagnosis and potential screening of OCSCC.

Materials and Methods

Ethics and Patient Recruitment

The study was approved by the Medical Ethics Committees of the three participating cancer centers, namely (a) HCG Cancer Centre, Bangalore, (b) HCG Panda Cancer Hospital, Cuttack, and (c) Tata Memorial Hospital, Mumbai. 121 treatment-naïve patients clinically diagnosed with OCSCC were enrolled into the study after obtaining their informed written consent. Staging was performed using the American Joint Committee on Cancer guidelines; clinical staging was used wherever histopathological evaluation was unavailable (9 subjects in the cohort). 44% of the subjects (n = 53) had early stage disease (Stages I and II) while the cancer was advanced (Stages III and IV) in the remaining 68 subjects (56%). Three-quarters of the cohort were male and 52% of the subjects were above the age of 50 (n = 64). Eleven age-matched normal subjects with no history of tobacco usage or alcohol consumption, and with no prior oral cancer or pre-cancer lesions were recruited. Detailed demographic and clinicopathological data for all individuals used in this study is presented in Supplementary Table S1 and summarized in Table 1.

Table 1.

Summary of patient demographic and clinicopathological data.

Subject Demographics
OCSCC
Total Enrolled 121
Age
Mean (±SD) 49.2 (±12.8)
≤40 29 (23.97%)
41–50 34 (28.10%)
51–60 30 (24.79%)
61–70 21 (17.36%)
≥71 7 (5.79%)
Gender
Female 30 (24.8%)
Male 91 (75.2%)
Stage-wise
Stage I 17 (14%)
Stage II 36 (29.8%)
Stage III 26 (21.5%)
Stage IV 42 (34.7%)
Early (I and II) 53 (43.8%)
Late (III and IV) 68 (56.2%)
Risk Factors
Tobacco Use (including betel quid and areca nut) 56 (46.28%)
Alcohol Consumption 4 (3.31%)
Both 38 (31.40%)
Unknown 23 (19.01%)
Healthy Controls
Total Enrolled 11
Age
Mean (±SD) 47 (±4.45)
Gender
Female 5 (45.45%)
Male 6 (54.55%)

Control Samples for Analytical Validation

To determine the sensitivity of the panel, the Seraseq® ctDNA Mutation Mix v2 at a variant allele frequency (VAF) of 0.25% (SeraCare Life Sciences Inc., Milford, MA, USA) was used. The specificity of the panel was evaluated using genomic DNA from the NA12878 cell line (Coriell Institute for Medical Research, Camden, NJ, USA).

Sample Collection

Matched primary tumor and oral rinse samples were collected from each subject. For formalin-fixed, paraffin-embedded (FFPE) samples with ≥20% tumor content, the entire histological section was processed. For tumors with neoplastic content of <20%, the tumor areas were marked by the pathologist and scraped from the FFPE block for downstream processing. Oral rinse samples were collected prior to surgery or biopsy. Subjects were requested to swish 15 ml of 0.9% saline solution in their mouths for 15–30 seconds before spitting it into a collection tube. Immediately after collection, the oral rinse was centrifuged at 3,000 × g for 10 minutes at 4°C. The resulting pellet was resuspended in 10 ml of ThinPrep® PreservCyt® Solution (Hologic, Inc., Marlborough, MA, USA), which allows the long-term preservation of saliva samples at room temperature. Primary tumor samples from surgery or biopsy were formalin fixed and paraffin embedded as per standard protocols. Both sample types were transported at room temperature to the central NGS testing laboratory.

Selection of Genes and Panel Design

Since one purpose of the study was to design a low-cost test for OCSCC, we developed a panel to maximize the number of unique patients who could be profiled with a minimal panel footprint. Three datasets were used to identify the genes: (a) OSCC tumors from the TCGA Head and Neck Squamous Cell Carcinoma dataset (n = 329)43, (b) the ICGC Gingivo-buccal cohort (n = 50)44, and (c) the MD Anderson Oral Squamous Cell Carcinoma cohort (n = 40)45. The sample IDs and cohort details of the three public datasets are given in Supplementary Table S2. Seven genes were identified that would cover at least 85% of the cohort across the datasets, namely CASP8, PIK3CA, FAT1, CDKN2A, NOTCH1, HRAS, and TP53. Hybridization probes were designed to capture all coding exons in the selected genes and were manufactured by IDT (Integrated DNA Technologies, Coralville, IA, USA) with 2x tiling for the target regions. The total number of target bases for this panel was 29.8 kbp.
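One way to think about choosing a minimal gene set that covers a target fraction of patients is as a set-cover problem. The sketch below uses a greedy heuristic; this is an illustrative assumption, since the text does not state which selection procedure was used, and the gene-to-sample map in the usage note is toy data rather than values from the public datasets. The function name `greedy_panel` is hypothetical.

```python
def greedy_panel(mutated_samples, n_samples, target=0.85):
    """Greedily pick genes until `target` fraction of samples is covered.

    mutated_samples: dict mapping gene name -> set of sample IDs carrying
    at least one mutation in that gene (toy input, assumed format).
    Returns (ordered list of chosen genes, fraction of samples covered).
    """
    covered = set()
    panel = []
    remaining = dict(mutated_samples)
    while remaining and len(covered) / n_samples < target:
        # Choose the gene that adds the most not-yet-covered samples.
        gene = max(remaining, key=lambda g: len(remaining[g] - covered))
        gain = remaining.pop(gene) - covered
        if not gain:
            break  # no remaining gene improves coverage
        panel.append(gene)
        covered |= gain
    return panel, len(covered) / n_samples


# Toy example with 10 samples (IDs 1..10); not real cohort data.
toy = {
    "TP53": {1, 2, 3, 4, 5, 6},
    "FAT1": {5, 6, 7, 8},
    "CASP8": {8, 9},
    "GENE_X": {1, 2},
}
panel, fraction = greedy_panel(toy, n_samples=10, target=0.85)
```

In this toy case the heuristic stops once three genes cover 9 of the 10 samples. A greedy choice is not guaranteed to give the smallest possible panel, and it ignores the clinical and biological curation described in the Results.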

Tumor DNA Extraction and Profiling

DNA from FFPE tissue was extracted using the QIAamp DNA FFPE Tissue Kit (Qiagen, Hilden, Germany) as per manufacturer’s recommendations. DNA quality was assessed by estimating percent amplifiable DNA using Alu-based qPCR quantification. Libraries were then prepared from 200 ng of FFPE DNA using the KAPA HyperPlus Kit (Roche, Basel, Switzerland) with IDT’s xGen Dual Index unique molecular identifier (UMI) Adapters (Integrated DNA Technologies, Coralville, IA, USA) for molecular and sample-based barcoding. Targeted enrichment was performed using IDT’s custom synthesized xGen Lockdown Probes (Integrated DNA Technologies, Coralville, IA, USA) with modifications to hybridization temperature and time. Library quality was assessed using an Agilent TapeStation 2200 (Agilent Technologies, Santa Clara, CA, USA), and libraries were quantified using a Qubit 2.0 Fluorometer (Invitrogen, Carlsbad, CA, USA). The final libraries were sequenced on a MiSeq (Illumina, Inc., San Diego, CA, USA), with loading optimized to achieve 1–2 million reads per sample.

Oral Rinse DNA Extraction and Profiling

DNA from oral rinse was isolated using the QIAamp DNA Mini Kit (Qiagen, Hilden, Germany) as per manufacturer’s instructions. An input of 100 ng was used to prepare libraries, following the same process as described above. Final libraries were sequenced on a NextSeq500 (Illumina, Inc., San Diego, CA, USA). The library loading was optimized such that 35–45 million reads were sequenced per sample.

Bioinformatics and Interpretation Pipeline

All bioinformatics analyses were carried out on Strand NGS (ver. 3.3).

Alignment and Read Filters

Paired 150 bp reads were aligned against the GRCh37 (hg19) human genome assembly with minimum alignment identity set to 90%. Read pairs with identical UMI tags mapping to the exact same genomic position were clustered into UMI families. A consensus read was created from each UMI family. At each position in the read, the consensus base was that which occurred in more than 60% of reads in the UMI family. N was used to indicate bases where no consensus could be obtained. Quality of the consensus base was set to the maximum base quality at that position within the family. Consensus reads with alignment identity less than 95% and reads with indeterminate bases (Ns) were filtered out. Additionally, partially aligned and translocated reads were filtered out. Reads were removed if either paired read failed any of the above filters. An additional filter of UMI family size ≥3 (i.e. a consensus read must have been derived from at least three raw reads) was applied in case of the oral rinse samples. Quality control (QC) parameters such as total reads, average coverage, and percentage of reads on-target (% on-target), were determined for both raw reads and consensus reads. An average of 1,066 unique consensus reads (2,050X raw reads) per sample per base with an average of 2.26% LC and 86% on-target reads was achieved in FFPE samples. In saliva samples, an average coverage of 8,879X unique consensus reads (56,117X raw reads) per sample per base was achieved after applying read filters to eliminate background noise, with an average of 86% on-target reads. The data is presented in Supplementary Figure S1 and summarized in Table 2.
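The consensus rules described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Strand NGS implementation: reads in a family are assumed to be already grouped by UMI and position, to have equal length, and to be given as `(sequence, qualities)` tuples; the function names are hypothetical. The text does not say whether the maximum quality is taken over all reads or only over reads supporting the consensus base, so the sketch takes it over the whole family.

```python
from collections import Counter


def consensus_read(family, min_fraction=0.60, min_family_size=1):
    """Collapse one UMI family into a consensus (sequence, qualities) pair.

    family: list of (sequence, qualities) tuples of equal length.
    Returns None if the family is smaller than `min_family_size`
    (the text applies a minimum of 3 for oral rinse samples).
    """
    if len(family) < min_family_size:
        return None
    length = len(family[0][0])
    bases, quals = [], []
    for i in range(length):
        column = [seq[i] for seq, _ in family]
        base, count = Counter(column).most_common(1)[0]
        if count / len(family) > min_fraction:
            bases.append(base)
            # Assumption: maximum quality over all reads in the family.
            quals.append(max(q[i] for _, q in family))
        else:
            bases.append("N")  # no base exceeds the 60% majority
            quals.append(0)
    return "".join(bases), quals


def passes_n_filter(consensus):
    """Consensus reads with indeterminate bases (Ns) are discarded."""
    return consensus is not None and "N" not in consensus[0]


family = [
    ("ACGT", [30, 30, 30, 20]),
    ("ACGT", [25, 35, 30, 38]),
    ("ACGA", [30, 30, 30, 30]),
]
sequence, qualities = consensus_read(family)
```

Here the last position is supported by 2 of 3 reads (about 67%), which exceeds the 60% cut-off, so the consensus keeps the majority base. The alignment identity, partial alignment, and read-pair filters are not modeled.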

Table 2.

Sequencing metrics.

| Sample type | Platform | Total raw reads | Total consensus reads | % Reads mapped to targets | Average coverage (raw reads) | Average coverage (consensus reads) |
| --- | --- | --- | --- | --- | --- | --- |
| FFPE | MiSeq | 1,098,707 | 580,917 | 86 | 2,050 | 1,066 |
| Oral rinse | NextSeq500 | 30,024,859 | 8,247,771 | 86 | 56,117 | 8,879 |

Variant Calling

Single nucleotide variants (SNVs) and insertions/deletions (InDels) were detected from the final read list using a binomial SNV caller46. Only variants supported by at least 5 different consensus reads were called in both FFPE and saliva samples. FFPE tissue is prone to noise from formalin fixation artifacts, which in our assay can be suppressed with the use of UMIs. A subset of the FFPE data was evaluated for reproducibility of UMI-based variant calling at thresholds between 2% and 5% VAF. The data were >99% reproducible at 4% VAF; therefore, the threshold for variant detection in FFPE was set at 4% VAF. For tDNA from oral rinse, the threshold was set at 0.1% VAF, on par with emerging literature on somatic variant calling in liquid biopsies47–49. In addition to the requirement of 5 consensus reads for variant calling in oral rinse, we only considered high quality consensus reads that had been created from at least 3 raw reads. For both matrices, the pile-up at each base position of the panel was analyzed after discarding bases with quality <20 and read pairs with different base calls at the position. Variants in homopolymer stretches and those with high strand and tail distance bias were also filtered out.
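The two numeric thresholds in this section (minimum supporting consensus reads and matrix-specific minimum VAF) can be expressed as a simple filter. The sketch below covers only those two rules; it does not reproduce the binomial caller or the homopolymer, strand, and tail-distance filters, and the names `Site` and `passes_thresholds` are hypothetical.

```python
from dataclasses import dataclass

# Thresholds stated in the text; the dictionary keys are illustrative names.
VAF_THRESHOLD = {"ffpe": 0.04, "oral_rinse": 0.001}
MIN_SUPPORTING_CONSENSUS_READS = 5


@dataclass
class Site:
    alt_reads: int  # consensus reads supporting the variant allele
    depth: int      # total consensus reads covering the position

    @property
    def vaf(self) -> float:
        return self.alt_reads / self.depth if self.depth else 0.0


def passes_thresholds(site: Site, matrix: str) -> bool:
    """Apply the read-support and VAF cut-offs for 'ffpe' or 'oral_rinse'."""
    return (
        site.alt_reads >= MIN_SUPPORTING_CONSENSUS_READS
        and site.vaf >= VAF_THRESHOLD[matrix]
    )


# At the average oral rinse consensus depth of 8,879X, a 0.1% VAF variant
# corresponds to roughly 9 supporting consensus reads.
rinse_site = Site(alt_reads=9, depth=8879)
rinse_called = passes_thresholds(rinse_site, "oral_rinse")
```

The same site would not pass the 4% FFPE threshold, which illustrates why the two matrices are evaluated with separate cut-offs.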

Interpretation

StrandOms, a clinical genomics interpretation and reporting platform from Strand Life Sciences, was used to prioritize and interpret variants identified in the samples. The primary tumor and oral rinse samples were interpreted independently. StrandOms contains an annotation engine with algorithms to identify the impact of the variant from public databases (dbSNP, 1000 Genomes, COSMIC, etc.) and bioinformatics prediction tools, along with proprietary content (data from over 15,000 samples) on genes, diseases, and therapeutic impact of somatic variants. A list of variants is categorized and annotated with likely functional effects as described in Sen et al.46. The final list of clinically relevant somatic variants was shortlisted manually using the following rules. A variant was excluded as germline if it had a recorded population allele frequency above 0.01 in any public database. The list was further pruned by checking against an internal database of germline mutations from 15,000 patients of the same ethnicity to avoid any cohort-specific germline variants. Additionally, if a variant had a variant allele frequency (%VAF) >30% in both the matched FFPE and saliva samples, it was eliminated as a likely germline variant and not included in the concordance analysis. The final list of clinically relevant variants includes functionally damaging and likely functionally damaging events, including variants of unknown significance. Therefore, the variants on the final shortlist for each sample are most likely somatic.
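The three germline-exclusion rules listed above can be written as a single predicate. This is a sketch of the stated rules only, not of StrandOms itself; the function name, argument names, and the database labels in the usage example are hypothetical.

```python
def is_likely_germline(
    population_afs,
    in_internal_germline_db,
    tumor_vaf,
    rinse_vaf,
    max_population_af=0.01,
    max_shared_vaf=0.30,
):
    """Return True if any of the three exclusion rules from the text fires.

    population_afs: dict of database name -> recorded allele frequency
    (fractions, e.g. 0.02 for 2%); an empty dict means no record.
    tumor_vaf / rinse_vaf: VAF as fractions in the matched FFPE and
    oral rinse samples.
    """
    # Rule 1: population allele frequency above 0.01 in any public database.
    if any(af > max_population_af for af in population_afs.values()):
        return True
    # Rule 2: present in the internal germline database.
    if in_internal_germline_db:
        return True
    # Rule 3: VAF above 30% in both matched samples.
    return tumor_vaf > max_shared_vaf and rinse_vaf > max_shared_vaf


# A variant at 48% in tumor and 51% in rinse is treated as likely germline.
shared_high = is_likely_germline({}, False, tumor_vaf=0.48, rinse_vaf=0.51)
# A variant at 35% in tumor but 0.4% in rinse is retained.
tumor_only = is_likely_germline({}, False, tumor_vaf=0.35, rinse_vaf=0.004)
```

Rule 3 requires the high VAF in both matrices, so a clonal tumor mutation that appears only at low frequency in the oral rinse is retained.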

Reproducibility and Concordance Analysis

For reproducibility analysis, four primary tumor samples and 57 oral rinse samples (46 from OCSCC subjects and 11 from healthy subjects) were processed in duplicate and sequenced independently. In primary tumor samples, we assessed reproducibility of all variants at ≥4% VAF. Overall reproducibility in FFPE was calculated as follows:

$$100 \times \frac{\text{Number of variants reproduced in the replicates at } \geq 4\%\ \text{VAF}}{\text{Total number of variants at } \geq 4\%\ \text{VAF}}$$

In case of oral rinse samples, all variants in the range of 0.2 to 30% VAF from one replicate were assessed for their presence in the other replicate at variant calling threshold of 0.1% VAF. Reproducibility was calculated as the fraction of variants from a sample present in its replicate. Note that reproducibility was calculated in both directions – percentage of variants called in replicate 1 present in replicate 2 and vice versa. Overall reproducibility in oral rinse was calculated as follows:

$$100 \times \frac{\text{Number of variant calls reproduced in the replicates at } \geq 0.1\%\ \text{VAF}}{\text{Total number of variant calls between 0.2 and 30\% VAF}}$$
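The bidirectional oral rinse calculation can be sketched as below. Each replicate is assumed to be a dictionary mapping a variant identifier to its VAF as a fraction. Pooling the two directions into one percentage is an assumption made for illustration, since the text does not specify how the two directions were combined. The variant labels in the example are placeholders.

```python
def reproduced(query, other, lo=0.002, hi=0.30, call_threshold=0.001):
    """Count query-replicate variants (0.2-30% VAF) found in the other
    replicate at or above the 0.1% calling threshold.

    Returns (number reproduced, number queried).
    """
    queried = [v for v, vaf in query.items() if lo <= vaf <= hi]
    hits = sum(1 for v in queried if other.get(v, 0.0) >= call_threshold)
    return hits, len(queried)


def reproducibility(rep1, rep2):
    """Pooled percentage over both directions (assumed pooling)."""
    hits_12, n_12 = reproduced(rep1, rep2)
    hits_21, n_21 = reproduced(rep2, rep1)
    total = n_12 + n_21
    return 100 * (hits_12 + hits_21) / total if total else float("nan")


# Placeholder variants; VAFs are fractions (0.005 = 0.5%).
rep1 = {"variant_a": 0.005, "variant_b": 0.0025, "variant_c": 0.48}
rep2 = {"variant_a": 0.0015, "variant_c": 0.50, "variant_d": 0.003}
percent = reproducibility(rep1, rep2)
```

In this example `variant_c` is outside the 0.2 to 30% query window in both replicates, and `variant_a` is queried only from replicate 1 because its VAF in replicate 2 is below 0.2%.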

For concordance analysis, the variants obtained from interpreting the primary tumor sample were assessed for their presence in the matched saliva sample. Sample-pairs were called concordant if at least one tumor-specific mutation was found in the matched oral rinse specimen. Overall concordance was calculated as follows:

$$100 \times \frac{\text{Number of concordant sample pairs}}{\text{Total number of primary tumor samples with clinically relevant variants}}$$
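The concordance definition translates directly into a short function. In this sketch each sample pair is assumed to be a tuple of two sets of variant identifiers (tumor, oral rinse); the function name and the placeholder variant labels are hypothetical.

```python
def concordance(pairs):
    """Percentage of tumor samples with at least one clinically relevant
    variant whose matched oral rinse shares at least one of those variants.

    pairs: iterable of (tumor_variants, rinse_variants) sets.
    """
    # Only tumors with at least one reported variant enter the denominator.
    informative = [(tumor, rinse) for tumor, rinse in pairs if tumor]
    if not informative:
        return float("nan")
    concordant = sum(1 for tumor, rinse in informative if tumor & rinse)
    return 100 * concordant / len(informative)


pairs = [
    ({"v1", "v2"}, {"v1"}),   # concordant: v1 is shared
    ({"v3"}, {"v4"}),         # discordant: no shared variant
    (set(), {"v5"}),          # excluded: tumor has no reported variant
    ({"v6"}, {"v6", "v7"}),   # concordant
]
percent_concordant = concordance(pairs)
```

The third pair does not count against concordance because the denominator is restricted to tumors with clinically relevant variants.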

ddPCR

The 20 μl ddPCR reaction containing Supermix (Bio-Rad), primers, mutant and wild-type probes, and template DNA was loaded into a droplet generator. The emulsion was transferred into a 96-well plate, sealed, and cycled using a C-1000 thermal cycler (Bio-Rad) under the following conditions: 10 min hold at 95°C, then 45 cycles of 95°C for 15 s and 60°C for 60 s. After amplification, the plate was transferred to a droplet reader, from which raw fluorescence amplitude data were extracted to the Quantasoft software for downstream analysis.

Data Availability

Raw targeted NGS data generated in this study will be available from the corresponding authors upon reasonable request.

Results

Design of custom oral cancer panel

Tumor-specific mutations in body fluids are generally present at low frequencies. NGS-based tests for detecting variants present below 0.5% VAF require the use of unique molecular identifiers (UMIs) for noise suppression in conjunction with high depth of sequencing (typically >50,000x per locus). Consequently, costs of such tests would be prohibitively high for large gene panels. We aimed to select an optimal number of genes to build a panel that would cover >85% of OCSCC patients in any cohort without redundancies. To this end, we obtained WES data from 3 independent publicly available OCSCC databases (n = 419): the TCGA-HNSC dataset, the MD Anderson oral squamous cell carcinoma dataset, and the ICGC gingivo-buccal cohort (Supplementary Table S2). For the TCGA-HNSC dataset, only tumors of the oral cavity ( ) were included in the analysis, while other anatomical sites (larynx, hypopharynx, oropharynx and tonsil) were excluded. For each cohort, we selected a minimum number of frequently mutated genes such that at least 80% of the patients harbored at least one genomic alteration in any gene in the panel. TP53, FAT1 and CASP8 were the top three mutated genes in all three databases. From the remaining database-specific genes, we prioritized CDKN2A, NOTCH1, PIK3CA and HRAS for their clinical and biological significance in OCSCC (Figure 1A). Subsequently, a panel was designed with probes covering the coding exons of these seven genes. With this panel design, we observed that 82%, 89%, and 90% of the MD Anderson, TCGA, and ICGC cohorts, respectively, presented at least one mutation (Figure 1B). Upon combining cohorts from the three datasets, approximately 88% of the subjects were represented by at least one mutation in this panel. Our analysis of the public data shows that the remaining 12% of the cohort is represented by a long tail of genes whose inclusion would significantly add to the sequencing costs per sample. This gives confidence that our panel should be able to profile >85% of OCSCC patients on a population-based level.
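As a quick consistency check, the combined figure of approximately 88% can be recovered as a sample-size-weighted average of the per-cohort percentages. The sketch below uses the cohort sizes and rounded percentages quoted in the text, so the result is approximate; the function name is hypothetical.

```python
# (cohort size, fraction with at least one mutation in the 7-gene panel),
# taken from the rounded values quoted in the text.
cohorts = {
    "MD Anderson": (40, 0.82),
    "TCGA": (329, 0.89),
    "ICGC": (50, 0.90),
}


def pooled_coverage(cohorts):
    """Sample-size-weighted percentage of subjects covered by the panel."""
    total = sum(n for n, _ in cohorts.values())
    covered = sum(n * fraction for n, fraction in cohorts.values())
    return 100 * covered / total


pooled_percent = pooled_coverage(cohorts)
total_subjects = sum(n for n, _ in cohorts.values())
```

With these inputs the pooled value comes out at about 88.5% of 419 subjects, in line with the approximately 88% reported for the combined cohorts.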

Figure 1: Selection of genes for the targeted OCSCC panel.

A. A minimal set of genes in which mutations would represent >85% of distinct samples in the dataset was identified from three different WES studies on OCSCC patients, namely the TCGA-HNSC dataset restricted to OCSCC data (n=329), the ICGC dataset (n=50) and the MD Anderson dataset (n=40). The intersection on the Venn diagram represents the core set of genes (TP53, FAT1 and CASP8) that captures the highest number of OCSCC samples across the three datasets. PIK3CA, HRAS, NOTCH1, and CDKN2A were manually curated for their clinical and biological significance in OCSCC, after evaluating the top 20 genes in each dataset. B. Bar chart shows the proportion of distinct samples (in each dataset and all three datasets combined) carrying at least one mutation in the 7-gene panel.

Ultra-deep targeted sequencing of primary OCSCC tumors

We first applied this targeted sequencing approach to DNA extracted from 121 treatment-naïve FFPE-derived primary OCSCC surgical specimens (Table 1). These patients had not been treated with chemotherapy or radiation before their tumor biopsy, so the spectrum of changes will largely reflect lesions in their naturally occurring malignant state. We obtained 86% average on-target coverage with an average consensus depth of 1,066X (2,050X raw read depth) across all sequenced tumor samples (Table 2, upper row). Using the stringent variant calling criteria of at least 5 mutant reads at 4% VAF, followed by filtering for functionally relevant somatic variants (see Methods section for details), 106 (87.6%) of the 121 sequenced specimens had at least one mutation detected in the seven genes included in our panel. Missense mutations were the most common type of variant identified in the cohort, constituting 48.6% of the 278 variants identified, followed by nonsense mutations (28.8%). Mutations were detected in 75.5% of stage I/II and 97% of stage III/IV tumors (Figure 2A). Of the 106 samples in which variants were found, 71 carried more than one reported mutation (21 in early and 50 in late stage disease). TP53 (n = 91) was the most frequently mutated gene, followed by CDKN2A (n = 34), FAT1 (n = 33), and CASP8 (n = 28). While mutation frequencies in our cohort of 121 OCSCC tumors largely resembled the mutation pattern in the combined patient cohort from the TCGA, ICGC, and MD Anderson WES datasets (Figure 2B), 6 of 7 genes in our sequencing panel showed a higher mutation frequency in the study cohort than in the publicly available data (Figure 2B). Detection of additional variants is likely due to the much higher coverage achieved with our targeted sequencing approach compared to the coverage usually obtained with WES in FFPE samples, which typically ranges between 70X and 100X43–45. While there was a trend toward a higher proportion of variants detected in patients with advanced disease, we did not find an enrichment of a specific mutation by stage (Supplementary Tables S3A and S3B).
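The stage-wise detection rates can be cross-checked against the overall count of 106 of 121 tumors. The sketch below back-calculates per-stage counts from the rounded percentages in the text and the stage group sizes in Table 1; because the inputs are rounded, the recovered counts are an inference rather than reported values.

```python
# (number of subjects from Table 1, detection rate quoted in the text)
stage_groups = {
    "early (I/II)": (53, 0.755),
    "late (III/IV)": (68, 0.97),
}


def detected_counts(groups):
    """Nearest whole number of tumors with a detected mutation per group."""
    return {name: round(n * rate) for name, (n, rate) in groups.items()}


counts = detected_counts(stage_groups)
total_detected = sum(counts.values())
total_subjects = sum(n for n, _ in stage_groups.values())
overall_percent = 100 * total_detected / total_subjects
```

The back-calculated counts sum to the 106 tumors reported overall, which corresponds to 87.6% of the 121 specimens.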

Figure 2: Targeted sequencing of primary OCSCC malignancies.

A. Heatmap shows a sample-wise mutation distribution across the sequenced primary tumor specimens (106 subjects where at least one mutation was detected by the targeted 7 gene panel). Top panel: number of mutations per patient; Right panel: mutational frequency for each gene included in the targeted sequencing panel. The gender and histopathological stage classification are indicated in the strip chart below the heat map. B. Histogram comparing frequency distribution of mutations per gene in our study cohort and combined (TCGA, MD Anderson and ICGC) public datasets (n=419).

To validate the reproducibility of our sequencing and analytical workflow, new libraries were prepared from DNA extracted from 4 FFPE samples (subjects OC-02–021, OC-020–035, OC-03–008, and OC-03–015). These libraries were sequenced and analyzed independently using a variant calling threshold of 4% VAF. All variants (somatic and germline) detected in these 4 samples were considered for analysis. The reproducibility analysis confirmed over 99% of the variants, and the prevalence of the mutant reads was highly consistent between the two independent sequencing runs (Supplementary Table S4). Taken together, these observations support the credibility of targeted biomarker sequencing as a promising screening approach.

Analytical validation of the assay performance for low frequency variants detection in saliva

Unlike the FFPE tumor specimens, which contain a higher degree of neoplastic cellularity (Supplementary Table S1), body fluids contain only small amounts of tDNA, and high DNase activity in saliva specimens further enhances tDNA degradation and reduces its quality. As such, we performed a rigorous multistep analytical validation of our sequencing method to ensure its ability to reliably detect mutant variants at low tDNA concentrations. To this end, a reference synthetic positive control containing seven well-characterized mutations in the TP53 and PIK3CA genes at 0.25% VAF was ordered from SeraCare (Supplementary Table S5). This sample was sequenced across 8 independent runs. Orthogonal validation of the variants in the positive control by droplet digital PCR (ddPCR) assays was provided by the manufacturer. Notably, all expected variants were reproducibly detected across all independent sequencing runs (Figure 3A), thereby establishing the analytical sensitivity of the test at 100%. Furthermore, sensitivity for mutation detection was on par with that of the ddPCR assay, a gold-standard method for detecting low prevalence tumor-associated mutations (Figure 3A). To assess the analytical specificity of the targeted sequencing panel for detecting single nucleotide variations (SNVs) and InDels, we used a well-characterized reference material derived from the NA12878 normal cell line. From the Sanger-sequenced regions of NA12878 which were confirmed to be devoid of variants, five regions, comprising 665 bp in the HRAS gene, overlapped with probes in our gene panel. This negative control sample was sequenced across nine independent runs with only 2 false positive SNV calls detected at 0.1% VAF, resulting in a specificity of ~99.97% (Figure 3B).
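The reported sensitivity and specificity can be reproduced with a short calculation. The sketch assumes that every expected variant in every run counts as one positive, and that every base of the 665 bp variant-free region in every run counts as one negative; the text does not spell out the denominators, so these are assumptions that happen to match the quoted figures.

```python
def sensitivity(detected, expected):
    """Percentage of expected variant observations that were detected."""
    return 100 * detected / expected


def specificity(false_positives, negatives):
    """Percentage of true-negative positions with no variant call."""
    return 100 * (negatives - false_positives) / negatives


# Positive control: 7 known variants sequenced in 8 runs, all detected.
expected_positives = 7 * 8
sens = sensitivity(detected=expected_positives, expected=expected_positives)

# Negative control: 665 variant-free bases sequenced in 9 runs,
# with 2 false positive SNV calls in total.
negative_positions = 665 * 9
spec = specificity(false_positives=2, negatives=negative_positions)
```

Under these assumptions the negative control contributes 5,985 evaluated positions, and two false positives give a specificity that rounds to 99.97%.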

Figure 3: Analytical validation of the assay performance for low frequency variant detection.

A. A positive control containing synthetic loci with 7 known mutations in TP53 and PIK3CA genes (at 0.25% VAF) was sequenced across 8 independent sequencing runs (solid colored lines). The same input material was also assessed by ddPCR assay (dashed black line). All mutations were detected in each one of the sequencing runs with expected VAF and remarkable concordance between ddPCR and NGS generated VAF values. B. Mutation free genomic loci containing a region of 665 bases in HRAS gene from a well-characterized contemporary normal NA12878 cell line was sequenced across 9 independent sequencing runs. Table summarizes the number of false positive calls detected in each sequencing run at >0.1% VAF.

Targeted sequencing of the matched pre-treatment oral rinse specimens

As the next step, we applied the targeted sequencing panel used for the 121 OCSCC tumors described in Figure 2 to DNA extracted from the matched pre-treatment salivary oral rinses collected from these patients. Sequencing on an Illumina NextSeq sequencer resulted in an average consensus read depth of 8,879X per sample after applying read filters to eliminate background noise (with an average of 86% on-target reads) (Table 2, bottom row). Based on our previously published experience with detecting tumor-associated mutations in bodily fluids31, the cut-off for variant calling in oral rinses was set at 0.1%. Additionally, a variant was required to be supported by 5 distinct consensus reads with a minimum family size of 3 to ensure that it was not a sequencing artifact. Of the 121 oral rinse samples, 95.87% (n = 116) had at least one somatic variant identified. The oral rinse samples had, on average, 3 somatic mutations, with 75% of samples having ≥2 somatic mutations, similar to the primary tumors. Missense mutations accounted for 45.35% of the 377 variants identified in the cohort, while nonsense mutations constituted 28.9%. The top four mutated genes in the oral rinse specimens were the same as those observed in the primary tumors, with TP53 remaining the most mutated gene (n = 91), followed by FAT1 (n = 50), CDKN2A (n = 49), and NOTCH1 (n = 43) (Supplementary Table S6).

In liquid biopsies, including saliva, background noise introduced during library preparation and known errors of sequencing-by-synthesis chemistry50,51 may contribute to false positives at low frequencies. We therefore re-sequenced 46 of the 121 oral rinse specimens to ensure that the detected variants were not errors resulting in false positive calls. For the reproducibility analysis, we checked for the presence of all somatic variants called at ≥0.2% VAF, a more stringent approach than evaluating reproducibility from a selected set of variants52,53. It is important to note that with a variant calling threshold of 0.1% VAF, even variants genuinely present at 0.1% in the biological sample will often manifest in the data at frequencies slightly above or below the threshold, due to experimental variation. Hence, for the reproducibility analysis, we set the variant query range to 0.2–30% VAF in one replicate, and found that 87.6% of these variants were detected in the other replicate at 0.1% VAF and above (Supplementary Table S7). We further assessed the oral rinse specimens collected from 15 patients for whom no variants were reported in the primary FFPE tumors. Only two of these subjects showed clinically actionable variants at ≥0.2% VAF. To validate that these variants were true somatic mutations and not sequencing artifacts, the specimens were re-sequenced in an independent run. Both variants were present at >0.1% allele frequency in the replicate, indicating high sensitivity of mutation detection in the saliva of OCSCC subjects. Notably, of the 11 oral rinse specimens collected from confirmed normal subjects without a visible oral cavity lesion and without a history of tobacco usage, only one sample showed a single variant at >0.2% VAF, and one subject carried a mutation between 0.1 and 0.2% VAF. However, these variants were not reproducible in the independent re-sequencing analysis (Supplementary Table S7). Furthermore, 9 oral rinse samples in which no mutations were detected remained free of genetic aberrations during the re-sequencing analysis, further supporting the specificity of the oral rinse sequencing assay.
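The replicate comparison described in this section, in which variants at 0.2–30% VAF in one replicate are looked up at ≥0.1% VAF in the other, can be sketched as follows. The function name, the dictionary layout (variant key mapped to VAF as a fraction), and the toy values are assumptions for illustration and do not reproduce the study data.

```python
def reproducibility(rep_a, rep_b, query_min=0.002, query_max=0.30, confirm_min=0.001):
    """Return (query, confirmed, rate) for a pair of replicates.

    rep_a, rep_b: dict mapping a variant key to its VAF as a fraction.
    Variants in rep_a within [query_min, query_max] form the query set;
    a query variant counts as confirmed if rep_b reports it at >= confirm_min.
    """
    query = [v for v, vaf in rep_a.items() if query_min <= vaf <= query_max]
    confirmed = [v for v in query if rep_b.get(v, 0.0) >= confirm_min]
    rate = len(confirmed) / len(query) if query else float("nan")
    return query, confirmed, rate


if __name__ == "__main__":
    # Toy replicates with made-up variant keys and VAFs.
    rep_a = {"geneA:v1": 0.004, "geneB:v2": 0.0015, "geneC:v3": 0.02, "geneD:v4": 0.5}
    rep_b = {"geneA:v1": 0.0012, "geneC:v3": 0.0008}
    query, confirmed, rate = reproducibility(rep_a, rep_b)
    print(query, confirmed, rate)
```

Querying at 0.2% while confirming at 0.1% gives a variant that sits near the calling threshold some room to fluctuate between runs, which is the rationale the text gives for the asymmetric thresholds.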

Concordance between primary tumor and oral rinse specimens

To assess whether saliva-derived DNA is a good matrix for non-invasive detection of cancer in OCSCC patients, we evaluated whether somatic mutations present in the solid tumor were represented in the saliva. Given that the oral rinse specimens contain a significant number of non-tumor cells from the oral cavity, allele frequencies of tumor-associated mutations are expected to be <1% VAF29,31,54–56. With the stringent filters applied by our variant calling algorithm, which only calls a sample concordant if the variant is present at ≥0.1% VAF, the overall concordance was 93.4% (99 of 106 sample pairs) (Figure 4 and Supplementary Table S8). The high concordance in mutation distribution across the tested genes between primary tumors and saliva specimens is shown in Figure 5. Although 93% of the OCSCC tumors in our cohort could have been detected in paired oral rinse specimens by an even smaller 5-gene panel (TP53, CDKN2A, FAT1, CASP8 and NOTCH1) (Figure 5H), all 7 genes in our assay are mutated in at least 5% of the patients with OCSCC (Figure 2B). Therefore, inclusion of PIK3CA and HRAS in the panel may increase the detection rate in a larger cohort or in population-based screening. While the average allele frequency across primary tumor variants was 22.84% (S.D. ±16.05), the concordant variants in the oral rinses were detected at a mean of 0.68% VAF (S.D. ±0.665). Interestingly, in late stage cancers (Stages III and IV), the concordance between mutations detected in primary tumors and matched oral rinse specimens was as high as 97%, decreasing to 88% in patients with early stage disease (Stages I and II) (Figure 4). Taken together, our data support the feasibility of reliable somatic mutation identification in driver genes in saliva samples for OCSCC diagnosis, even at early stages of the disease.
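The sample-level concordance figure can be reproduced arithmetically, and the concordance rule (a pair is concordant if any tumor mutation is found in the paired oral rinse at ≥0.1% VAF, as stated in the Figure 4 legend) can be sketched in a few lines of Python. The function names and data layout are hypothetical; only the threshold and the 99 of 106 counts come from the text.

```python
def sample_is_concordant(tumor_variants, rinse_vafs, min_vaf=0.001):
    """A pair is concordant if any tumor variant appears in the rinse at >= min_vaf.

    tumor_variants: iterable of variant keys called in the primary tumor.
    rinse_vafs: dict mapping variant key to VAF (fraction) in the oral rinse.
    """
    return any(rinse_vafs.get(v, 0.0) >= min_vaf for v in tumor_variants)


def concordance_percent(n_concordant, n_pairs):
    """Percentage of tumor/rinse pairs that are concordant, to one decimal place."""
    return round(100.0 * n_concordant / n_pairs, 1)


if __name__ == "__main__":
    # Toy pair with made-up variant keys: one of two tumor variants is recovered.
    print(sample_is_concordant(["geneA:v1", "geneB:v2"], {"geneB:v2": 0.0068}))
    # Counts reported in the text: 99 concordant pairs out of 106.
    print(concordance_percent(99, 106))
```

Because a single recovered mutation is enough, this is a per-sample measure and will generally be higher than a per-variant concordance computed over the same data.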

Figure 4: Concordance between primary tumor and oral rinse specimens.

Percentage of OCSCC samples with functional mutations identified in primary tumor biopsies that were also detected in paired pre-treatment oral rinse specimens. Bar chart indicates concordance seen in patients with early stage (I and II) disease, late stage (III and IV) disease, and combined concordance across the entire cohort. Samples with more than one functional mutation in the primary tumor were considered to be concordant if any one of the mutations was detected in the paired oral rinse specimen.

Figure 5. Mutations distribution in primary tumor biopsies and matched oral rinse specimens.

A-G. Lollipop plots show the landscape of genetic aberrations detected in primary tumors (top) and matched oral rinses (bottom) in each gene included in the sequencing panel. The variants are color coded by the mutation type (red – nonsense, green – missense, blue – deletion, violet – insertion). Gene domains are indicated in the bottom of each panel. H. Table depicts a dynamic increase in cumulative detection in the oral rinse with addition of each gene to the sequencing panel.
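The cumulative detection summarized in panel H can be thought of as the fraction of samples carrying at least one mutation in the genes added so far. A minimal sketch of that bookkeeping is given below; the function name, the per-sample gene sets, and the gene order are toy assumptions and do not reflect the cohort data.

```python
def cumulative_detection(sample_genes, gene_order):
    """Fraction of samples detected after adding each gene in turn.

    sample_genes: list of sets, one per sample, holding the mutated genes
                  found in that sample's oral rinse.
    gene_order:   genes in the order they are added to the panel.
    Returns a list of (gene, cumulative_fraction) tuples.
    """
    detected = set()  # indices of samples already detected by an earlier gene
    result = []
    for gene in gene_order:
        for idx, genes in enumerate(sample_genes):
            if gene in genes:
                detected.add(idx)
        result.append((gene, len(detected) / len(sample_genes)))
    return result


if __name__ == "__main__":
    # Five toy samples; the last one carries no panel mutation.
    samples = [{"TP53"}, {"TP53", "FAT1"}, {"CDKN2A"}, {"HRAS"}, set()]
    for gene, frac in cumulative_detection(samples, ["TP53", "CDKN2A", "FAT1", "HRAS"]):
        print(gene, frac)
```

In the toy data, adding FAT1 does not raise the cumulative fraction because its only carrier is already detected through TP53, which mirrors how co-mutated genes add little to panel-level detection.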

Discussion

Despite improved locoregional control and reduced treatment-related morbidity, 5-year survival for patients with OCSCC remains low, in part due to failure in early diagnosis. While early detection of OCSCC substantially increases overall survival5–10, histopathologic examination of incisional tissue biopsy (a gold standard approach for cancer diagnosis) is invasive, costly, and depends on examiner experience15–17. Novel strategies based on detection of genetic biomarkers offer new hope for improved diagnosis of cancer. However, a single tumor biopsy may fall short of accurately capturing clinically relevant genetic variants in a heterogeneous malignancy57,58, resulting in improper molecular classification of the lesion and subsequently, inadvertent down-staging of the disease. Therefore, there is a pressing need for a non-invasive, rapid, accurate, and cost-effective screening approach that would overcome these challenges.

Over the last decade, there has been increasing interest in liquid biopsies, that is, the detection of cancer-specific biomarkers in patients’ body fluids59,60. While a majority of liquid biopsy-based diagnostic tests for solid malignancies rely on serum or plasma specimens59,60, saliva is a better medium for detection of OCSCC. Saliva is in direct contact with oral cavity lesions, and its collection is non-invasive, painless, and requires minimal training, making saliva an ideal biofluid for screening individuals at high risk of developing OCSCC and for early diagnosis of the disease33–35,60. Using PCR-based assays, several retrospective studies, including those by members of our group, have reported that tumor-specific mutations are detectable in the saliva of patients with OCSCC31,36–38. Furthermore, saliva-based detection of tumor DNA performed better than plasma-based detection, especially in patients with early stage disease29. However, the clinical adoption of PCR or ddPCR assays as a routine screening practice for OCSCC detection is hindered by their low scalability (they can only interrogate a limited set of variants) and limited multiplex capability. Targeted NGS technology overcomes these complications, and offers an advantageous approach for high-throughput and highly sensitive detection of tumor-specific variants in small biopsies, FFPE-derived material, and saliva specimens61,62. However, this method is not widely used for OCSCC diagnosis, and its accuracy is yet to be confirmed.

This motivated us to develop an ultra-deep NGS-based assay for rapid sequencing of the entire coding regions of 7 frequently mutated driver genes in OCSCC. We have focused on targeted sequencing rather than a strategy based on WES, whole genome sequencing, copy number analysis, epigenetic changes, expression analysis, and/or proteomics52,53,63. While each of these classes of alterations plays a critical role in carcinogenesis, our goal was to develop a highly specific and easily reproducible diagnostic and screening platform that could be widely used in the clinical setting. We focused on minimizing the panel size with the goal of achieving at least 85% overall clinical utility for the entire panel. Selected genes were rank-ordered and mutated in at least 5% of the patients in each of the three publicly available databases. Mutations in these genes were, for the most part, mutually exclusive, and the genes have well-characterized clinical and biological significance in OCSCC.

A targeted, UMI-tagged NGS panel with a small footprint was designed to accurately call low-level somatic variants at 0.1% VAF. Targeted panels that do not use UMIs rely on modeling of background sequencing errors, which can distinguish true positives from background noise at a minimum of 0.3 to 0.5% VAF64,65. Molecular tagging substantially lowers the limit of detection, which is essential for reliable detection of rare alleles in body fluids. Previous target enrichment attempts have steered away from hybridization-based approaches for rare allele detection, primarily due to the high percentage of off-target capture. In our assay, we circumvent this problem by applying two rounds of hybridization. Targeted hybridization approaches have previously been explored in the context of pan-cancer panels with footprints of 16 or more genes52,53. While a large panel size reduces off-target capture, it requires a higher number of reads per sample to achieve sufficient depth for variant allele detection at 0.1%, which increases the cost and limits the use of such assays for early detection screening. The targeted saliva-based seven-gene panel used in this study costs a fraction of other plasma-based NGS panels currently available on the market (such as FoundationOne® Liquid CDx and Guardant360® CDx). Therefore, targeted dual-capture hybridization enrichment coupled with a UMI-tagged minimal panel footprint provides a sensitive and cost-effective alternative for accurate detection of low frequency alleles in the complex genomic background of saliva specimens.

To test this sequencing workflow, we first applied it to DNA extracted from 121 FFPE-derived primary OCSCC tumors. Overall, 86% of reads mapped to the reference sequence and the average depth was over 1,000X across all tested specimens (14-fold higher than the ~70X that could be achieved with WES of FFPE-derived samples43–45). Furthermore, nearly 99% of mutations were concurrently detected in two parallel sequencing runs, supporting the high reproducibility of this targeted sequencing approach. Notably, somatic mutations were detected in 88% of the specimens, confirming the clinical utility of this gene panel predicted from the public datasets. While we acknowledge that we will not be able to identify patients with low-prevalence mutations of unknown biologic and clinical relevance, inclusion of rarely mutated genes has limited prognostic utility in a heterogeneous population of patients with OCSCC and would also substantially increase the screening cost.

Cell-free DNA (cfDNA) is the source material for most liquid biopsy assays. However, in saliva, DNA is extracted primarily from shed mucosal cells. Compared to cfDNA, which is subject to degradation by high DNase activity, DNA extracted from the cellular fraction of saliva is far less fragmented, thereby increasing detection accuracy of rare alleles. As such, salivary oral rinse specimens used in this study were collected by asking the patients to swish and gargle with saline in order to increase the cellular fraction. As high sequencing depth is required for accurate low-level variant calling, these oral rinse specimens were sequenced on an Illumina NextSeq sequencer, resulting in an average depth of 8,879X consensus reads. Such depth overcomes the shortcomings of sequencing even highly degraded material. At ≥0.1% VAF, independent re-sequencing of 46 oral rinse specimens confirmed presence of mutant alleles with 87.7% concordance, and 93.4% of mutations detected in primary tumors were also identified in matched oral rinse specimens. Furthermore, although early stage malignancies have lower levels of neoplastic cells and are therefore more likely to yield false negative results, an 88% concordance in early stage disease confirms the success of our sequencing and analytical approach. Somatic mutations that were not seen in the primary tumors were detected in five of the sequenced oral rinse specimens. While these results are consistent with previous reports on higher mutational prevalence in body fluids29,31 and the nature of these mutations remains to be investigated, these variants were most likely missed due to undersampling of the heterogeneous primary tumor, suggesting that saliva is highly representative of the intratumor mutational heterogeneity.
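Why depth matters for calling at 0.1% VAF can be illustrated with a simple binomial model. The sketch below assumes, as an idealization that is not taken from the paper, that each consensus read at a site independently carries the variant with probability equal to the true VAF, and asks how likely it is to observe at least 5 variant-bearing consensus reads at a given depth. It ignores the family-size requirement, capture bias, and sequencing error, so it is a rough guide only; the function name is hypothetical.

```python
from math import comb


def prob_at_least_k(depth, vaf, k=5):
    """P(X >= k) for X ~ Binomial(depth, vaf).

    depth: number of consensus reads covering the site.
    vaf:   true variant allele frequency as a fraction.
    k:     minimum number of variant-bearing consensus reads required.
    """
    # Sum the probabilities of seeing 0..k-1 variant reads, then take the complement.
    p_fewer = sum(
        comb(depth, i) * vaf**i * (1.0 - vaf) ** (depth - i) for i in range(k)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    # 8,879 is the average consensus depth reported for the oral rinses.
    for depth in (1000, 5000, 8879):
        print(depth, round(prob_at_least_k(depth, 0.001), 3))
```

Under this model, a variant at exactly 0.1% VAF is unlikely to reach 5 supporting reads at around 1,000X but is recovered in the large majority of draws at the reported average depth, which is consistent with the text's emphasis on ultra-deep sequencing.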

Taken together, our results demonstrate that this quick, sensitive, cost-efficient, and non-invasive method can be used for detection of low frequency tumor-associated mutations in salivary oral rinse specimens collected from patients with OCSCC. These findings provide the foundation for using this sequencing platform for risk assessment by screening high-risk individuals, early detection, monitoring during treatment, and tumor surveillance after completion of treatment. With an annual incidence of over 350,000 new cases of OCSCC and approximately two-thirds of these cases occurring in developing nations, the value of this tool in addressing the continuing challenges in screening the high risk population will likely increase over time.

Supplementary Material

Sup tab S2

Supplementary Table S1: Detailed demographic and clinicopathological data of the OCSCC study cohort.

Sup fig S1

Supplementary Figure S1: NGS summary data. Boxplots showing quality control metrics for all reads and final filtered reads across FFPE-derived primary OCSCC tumors (A) and oral rinse specimens (B). Each boxplot depicts the 25th percentile, the median and the 75th percentile of each metric.

Sup tab S4

Supplementary Table S3: A. Variants shortlisted for clinical relevance in FFPE samples. B. Variant summary by stage and gene.

Sup tab S5

Supplementary Table S4: Reproducibility summary in FFPE samples.

Sup tab S6

Supplementary Table S5: Sensitivity analysis using SeraCare positive control.

Sup tab S8

Supplementary Table S7: Reproducibility summary of somatic variants in saliva samples.

Sup tab S7

Supplementary Table S6: Variants shortlisted for clinical relevance in saliva samples.

Sup tab S9

Supplementary Table S8: Concordance status of variants found in primary tumors and matched saliva samples.

Sup tab S3

Supplementary Table S2: Sample IDs and details of the cohorts from the three public datasets TCGA, ICGC and MD Anderson.

Acknowledgments:

We thank our patients for their courage and altruism. We would like to thank Ananta Deo, Meghana Gowda & Pallavi Birajdar (Strand Life Sciences) for their role in patient recruitment and sample collection. This work was supported by Tata Trusts through Tata Centre for Development at the University of Chicago award to N.A. and R.H., the following NIH grants: R01DE028674 to N.A. and M.W.L., R01DE027809 to E.I. and U01CA230691 to N.A., C.B. and N.P., and the philanthropic support of Jill and Ozzie Giglio.

Footnotes

Conflict of Interest: Arun K Hariharan, Ashwini Shanmugam, Sivaraj Irusappan, Jayalakshmi R Nair, Shiuli Maji, Urvashi Bahadur, Vamsi Veeramachaneni, Radhakrishna Bettadapura, Ashwini Manjunath, Aarthi Ravichandran, Ramesh Hariharan, Shanmukh Katragadda, Veena Ramaswamy and Vaijayanti Gupta are affiliated with Strand Life Sciences. The remaining authors declare no conflict of interest.

References

  • 1. Dumache R. Early Diagnosis of Oral Squamous Cell Carcinoma by Salivary microRNAs. Clin Lab 63, 1771–1776 (2017).
  • 2. Bray F, et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 68, 394–424 (2018).
  • 3. Hunter KD, Parkinson EK & Harrison PR. Profiling early head and neck cancer. Nat Rev Cancer 5, 127–135 (2005).
  • 4. Bettendorf O, Piffko J & Bankfalvi A. Prognostic and predictive factors in oral squamous cell cancer: important tools for planning individual therapy? Oral Oncol 40, 110–119 (2004).
  • 5. Axell T, Pindborg JJ, Smith CJ & van der Waal I. Oral white lesions with special reference to precancerous and tobacco-related lesions: conclusions of an international symposium held in Uppsala, Sweden, May 18–21 1994. International Collaborative Group on Oral White Lesions. J Oral Pathol Med 25, 49–54 (1996).
  • 6. Forastiere A, Koch W, Trotti A & Sidransky D. Head and neck cancer. N Engl J Med 345, 1890–1900 (2001).
  • 7. Rhodus NL. Oral cancer: leukoplakia and squamous cell carcinoma. Dent Clin North Am 49, 143–165, ix (2005).
  • 8. Shiboski CH, Shiboski SC & Silverman S Jr. Trends in oral cancer rates in the United States, 1973–1996. Community Dent Oral Epidemiol 28, 249–256 (2000).
  • 9. Siegel RL, Miller KD & Jemal A. Cancer Statistics, 2017. CA: a cancer journal for clinicians 67, 7–30 (2017).
  • 10. Warnakulasuriya S. Global epidemiology of oral and oropharyngeal cancer. Oral Oncol 45, 309–316 (2009).
  • 11. Poling JS, et al. Human papillomavirus (HPV) status of non-tobacco related squamous cell carcinomas of the lateral tongue. Oral Oncol 50, 306–310 (2014).
  • 12. Castellsague X, et al. HPV Involvement in Head and Neck Cancers: Comprehensive Assessment of Biomarkers in 3680 Patients. J Natl Cancer Inst 108, djv403 (2016).
  • 13. Lingen MW, et al. Low etiologic fraction for high-risk human papillomavirus in oral cavity squamous cell carcinomas. Oral Oncol 49, 1–8 (2013).
  • 14. Zafereo ME, et al. Squamous cell carcinoma of the oral cavity often overexpresses p16 but is rarely driven by human papillomavirus. Oral Oncol 56, 47–53 (2016).
  • 15. Allen K & Farah CS. Screening and referral of oral mucosal pathology: a check-up of Australian dentists. Aust Dent J 60, 52–58 (2015).
  • 16. Khurshid Z, et al. Role of Salivary Biomarkers in Oral Cancer Detection. Adv Clin Chem 86, 23–70 (2018).
  • 17. Ford PJ & Farah CS. Early detection and diagnosis of oral cancer: strategies for improvement. Journal of Cancer Policy 1, e2–e7 (2013).
  • 18. Lingen MW, Kalmar JR, Karrison T & Speight PM. Critical evaluation of diagnostic aids for the detection of oral cancer. Oral Oncol 44, 10–22 (2008).
  • 19. Patton LL, Epstein JB & Kerr AR. Adjunctive techniques for oral cancer examination and lesion diagnosis: a systematic review of the literature. J Am Dent Assoc 139, 896–905; quiz 993–894 (2008).
  • 20. Rethman MP, et al. Evidence-based clinical recommendations regarding screening for oral squamous cell carcinomas. J Am Dent Assoc 141, 509–520 (2010).
  • 21. Macey R, et al. Diagnostic tests for oral cancer and potentially malignant disorders in patients presenting with clinically evident lesions. The Cochrane database of systematic reviews, Cd010276 (2015).
  • 22. Walsh T, et al. Clinical assessment to screen for the detection of oral cavity cancer and potentially malignant disorders in apparently healthy adults. The Cochrane database of systematic reviews, Cd010173 (2013).
  • 23. Lingen MW, et al. Evidence-based clinical practice guideline for the evaluation of potentially malignant disorders in the oral cavity: A report of the American Dental Association. J Am Dent Assoc 148, 712–727.e710 (2017).
  • 24. Lingen MW, et al. Adjuncts for the evaluation of potentially malignant disorders in the oral cavity: Diagnostic test accuracy systematic review and meta-analysis-a report of the American Dental Association. J Am Dent Assoc 148, 797–813.e752 (2017).
  • 25. Agrawal N, et al. Exome sequencing of head and neck squamous cell carcinoma reveals inactivating mutations in NOTCH1. Science 333, 1154–1157 (2011).
  • 26. Stransky N, et al. The mutational landscape of head and neck squamous cell carcinoma. Science 333, 1157–1160 (2011).
  • 27. Comprehensive genomic characterization of head and neck squamous cell carcinomas. Nature 517, 576–582 (2015).
  • 28. Anglim PP, et al. Identification of a panel of sensitive and specific DNA methylation markers for squamous cell lung cancer. Mol Cancer 7, 62 (2008).
  • 29. Bettegowda C, et al. Detection of circulating tumor DNA in early- and late-stage human malignancies. Sci Transl Med 6, 224ra224 (2014).
  • 30. Diehl F, et al. Circulating mutant DNA to assess tumor dynamics. Nat Med 14, 985–990 (2008).
  • 31. Wang Y, et al. Detection of somatic mutations and HPV in the saliva and plasma of patients with head and neck squamous cell carcinomas. Sci Transl Med 7, 293ra104 (2015).
  • 32. Schubert AD, et al. Somatic mitochondrial mutation discovery using ultra-deep sequencing of the mitochondrial genome reveals spatial tumor heterogeneity in head and neck squamous cell carcinoma. Cancer letters 471, 49–60 (2020).
  • 33. Lousada-Fernandez F, et al. Liquid Biopsy in Oral Cancer. in Int J Mol Sci, Vol. 19 (2018).
  • 34. Wang X, Kaczor-Urbanowicz KE & Wong DT. Salivary biomarkers in cancer detection. Med Oncol 34, 7 (2017).
  • 35. Cristaldi M, et al. Salivary Biomarkers for Oral Squamous Cell Carcinoma Diagnosis and Follow-Up: Current Status and Perspectives. Frontiers in Physiology 10 (2019).
  • 36. Diehl F, et al. Analysis of mutations in DNA isolated from plasma and stool of colorectal cancer patients. Gastroenterology 135, 489–498 (2008).
  • 37. Kinde I, et al. Evaluation of DNA from the Papanicolaou test to detect ovarian and endometrial cancers. Sci Transl Med 5, 167ra164 (2013).
  • 38. Wang Y, et al. Detection of tumor-derived DNA in cerebrospinal fluid of patients with primary tumors of the brain and spinal cord. Proc Natl Acad Sci U S A 112, 9704–9709 (2015).
  • 39. Kinde I, Wu J, Papadopoulos N, Kinzler KW & Vogelstein B. Detection and quantification of rare mutations with massively parallel sequencing. Proc Natl Acad Sci U S A 108, 9530–9535 (2011).
  • 40. Newman AM, et al. An ultrasensitive method for quantitating circulating tumor DNA with broad patient coverage. Nat Med 20, 548–554 (2014).
  • 41. Newman AM, et al. Integrated digital error suppression for improved detection of circulating tumor DNA. Nat Biotechnol 34, 547–555 (2016).
  • 42. Kennedy SR, et al. Detecting ultralow-frequency mutations by Duplex Sequencing. Nat Protoc 9, 2586–2606 (2014).
  • 43. Hoadley KA, et al. Cell-of-Origin Patterns Dominate the Molecular Classification of 10,000 Tumors from 33 Types of Cancer. Cell 173, 291–304.e296 (2018).
  • 44. Mutational landscape of gingivo-buccal oral squamous cell carcinoma reveals new recurrently-mutated genes and molecular subgroups. Nat Commun 4, 2873 (2013).
  • 45. Pickering CR, et al. Integrative genomic characterization of oral squamous cell carcinoma identifies frequent somatic drivers. Cancer Discov 3, 770–781 (2013).
  • 46. Sen M, et al. StrandAdvantage test for early-line and advanced-stage treatment decisions in solid tumors. Cancer Med 6, 883–901 (2017).
  • 47. Hodara E, et al. Multiparametric liquid biopsy analysis in metastatic prostate cancer. JCI insight 4 (2019).
  • 48. Onidani K, et al. Monitoring of cancer patients via next-generation sequencing of patient-derived circulating tumor cells and tumor DNA. Cancer science 110, 2590–2599 (2019).
  • 49. Nakagaki T, et al. Targeted next-generation sequencing of 50 cancer-related genes in Japanese patients with oral squamous cell carcinoma. Tumour biology : the journal of the International Society for Oncodevelopmental Biology and Medicine 40, 1010428318800180 (2018).
  • 50. Schirmer M, D’Amore R, Ijaz UZ, Hall N & Quince C. Illumina error profiles: resolving fine-scale variation in metagenomic sequencing data. BMC bioinformatics 17, 125 (2016).
  • 51. Nakamura K, et al. Sequence-specific error profile of Illumina sequencers. Nucleic acids research 39, e90 (2011).
  • 52. Lanman RB, et al. Analytical and Clinical Validation of a Digital Sequencing Panel for Quantitative, Highly Accurate Evaluation of Cell-Free Circulating Tumor DNA. PLoS One 10, e0140712 (2015).
  • 53. Clark TA, et al. Analytical Validation of a Hybrid Capture-Based Next-Generation Sequencing Clinical Assay for Genomic Profiling of Cell-Free Circulating Tumor DNA. J Mol Diagn 20, 686–702 (2018).
  • 54. Mattox AK, et al. Applications of liquid biopsies for cancer. Sci Transl Med 11 (2019).
  • 55. Mazurek AM, Rutkowski T, Fiszer-Kierzkowska A, Małusecka E & Składowski K. Assessment of the total cfDNA and HPV16/18 detection in plasma samples of head and neck squamous cell carcinoma patients. Oral oncology 54, 36–41 (2016).
  • 56. Hamana K, et al. Monitoring of circulating tumour-associated DNA as a prognostic tool for oral squamous cell carcinoma. British journal of cancer 92, 2181–2184 (2005).
  • 57. Cao W, et al. Multiple region whole-exome sequencing reveals dramatically evolving intratumor genomic heterogeneity in esophageal squamous cell carcinoma. Oncogenesis 4, e175–e175 (2015).
  • 58. Gabusi A, et al. Prognostic impact of intra-field heterogeneity in oral squamous cell carcinoma. Virchows Archiv 476, 585–595 (2020).
  • 59. Siravegna G, Marsoni S, Siena S & Bardelli A. Integrating liquid biopsies into the management of cancer. Nature reviews. Clinical oncology 14, 531–548 (2017).
  • 60. Arantes LMRB, Carvalho A.C.d., Melendez ME & Carvalho AL. Serum, plasma and saliva biomarkers for head and neck cancer. Expert Review of Molecular Diagnostics 18, 112–185 (2018).
  • 61. Zakrzewski F, et al. Targeted capture-based NGS is superior to multiplex PCR-based NGS for hereditary BRCA1 and BRCA2 gene analysis in FFPE tumor samples. BMC Cancer 19, 396 (2019).
  • 62. Hirotsu Y, et al. Dual-molecular barcode sequencing detects rare variants in tumor and cell free DNA in plasma. Sci Rep 10, 3391 (2020).
  • 63. Cohen JD, et al. Detection and localization of surgically resectable cancers with a multi-analyte blood test. Science 359, 926–930 (2018).
  • 64. Balaji SA, et al. Analysis of solid tumor mutation profiles in liquid biopsy. Cancer Med 7, 5439–5447 (2018).
  • 65. Bennett CW, Berchem G, Kim YJ & El-Khoury V. Cell-free DNA and next-generation sequencing in the service of personalized medicine for lung cancer. Oncotarget 7, 71013–71035 (2016).

Data Availability Statement

Raw targeted NGS data generated in this study will be available from the corresponding authors upon reasonable request.
