Summary
Recent studies show that aneuploidies and driver gene mutations precede cancer diagnosis by many years1–4. We assess whether these genomic signals can be used for early detection and pre-emptive cancer treatment using the neoplastic precursor lesion Barrett’s esophagus, as an exemplar5. Shallow whole genome sequencing of 777 biopsies, sampled from 88 patients in Barrett’s surveillance over a period of up to 15 years shows that genomic signals can distinguish progressive from stable disease even ten years prior to histopathological transformation. These findings are validated on two independent cohorts of 76 and 248 patients. These methods are low cost and applicable to standard clinical biopsy samples. Compared with current management guidelines based on histopathology and clinical presentation, genomic classification enables earlier treatment for high risk patients as well as reduction of unnecessary treatment and monitoring for patients who are unlikely to develop cancer.
Early diagnosis of cancer is one of the best strategies to improve patient survival and decrease treatment-related side-effects that contribute to poorer health; however, this strategy poses a risk of overtreatment6. Therefore, accurate biomarkers of early cancer progression are needed to stratify patients. Copy number (CN) alterations, though common in cancer, are rarely found in normal tissues, raising the question of whether these signals could help diagnose patients earlier.
This strategy can be tested in esophageal adenocarcinoma (EAC), which has a 5-year survival rate of less than 20%7. Its precursor tissue is known as Barrett’s esophagus (BE); however, the risk for a patient with BE progressing to EAC is only around 0.3% per annum8. Current surveillance programs focus on the presence and grade of dysplasia in BE patients as determined by histopathological examination of biopsies. Low- and high-grade dysplasia (LGD, HGD) are used as surrogates for early cancer transformation and trigger intervention, commonly by endoscopic resection and radiofrequency ablation9,10. Additional risk factors for progression include increasing age, male gender, greater length of the BE segment, and tobacco use at the initial evaluation, although these are not yet part of clinical guidelines11.
Improvements to risk assessment have focused on identifying individual molecular biomarkers, particularly p53 expression12–16 and DNA methylation changes17,18. However, identification of mutational biomarkers for progression has been difficult, due to the low frequency of recurrent point mutations in either BE19 or EAC20,21. Instead, EAC and BE are characterized by early and frequent genomic (CN and structural) instability20–24. As ongoing genomic instability leads to a large extent of clonal diversity, multiple investigations have focused on the heterogeneity and diversity of BE tissues25 as markers of higher risk26–29.
We investigated genome-wide CN instability as a marker for risk of progression using shallow whole genome sequencing (sWGS; average depth 0.4x) in a retrospective, demographically matched, case-control cohort of patients (n=88) with all available endoscopy samples (n=777) collected during clinical surveillance for BE (Fig. 1a). sWGS was chosen as the protocol as it provides a genome-wide perspective on CN and the level of genomic instability and has been optimised for use in formalin-fixed paraffin embedded (FFPE) samples30.
CN patterns were examined at multiple levels of the esophagus to understand how patients who progress differ from non-progressors. We observed that the genomes of individual progressive patients display a generalized disorder across the genomes that varies between samples and over time (Fig. 1b). Additionally, CN changes were not confined to cytological atypia (e.g. LGD, HGD), since similar profiles were observed for the non-dysplastic BE (NDBE) samples (n=518; Fig. 1c-d).
The CN information and a measure of overall complexity (Methods; Supplementary Fig. 1) were used to generate a cross-validated elastic-net regularized logistic regression model of progression and classification with the endpoint HGD or intramucosal cancer (IMC; Methods). The model was subsequently validated using an independent cohort of 76 patients (n=213 samples), alongside an orthogonal validation of the Seattle BE Study SNP array samples (n=1272)31.
This model was designed to be independent of demographic risk factors11, as our cohort was matched for sex, BE segment length, age at diagnosis, and smoking status (Supplementary Table 1). We used the area under the receiver operating characteristic (ROC) curve (AUC) to evaluate the model training performance. As the model included the diagnostic samples with the most extreme CN (e.g. HGD and IMC), we additionally trained a model excluding these and found that the AUC concordance was high (Supplementary Fig. 2a), indicating that the model was not sensitive to extreme samples. Aggregating predictions either per-endoscopy (mean or max sample predictions) or per-patient (mean or max predictions excluding HGD/IMC samples) did not measurably increase the prognostic accuracy (Supplementary Fig. 2b), suggesting that a single sample (e.g. pooled 4-quadrant biopsy) may be sufficient for prediction, which could be ideal for clinical application.
Using all sample predictions generated by the model we evaluated the relative risk (RR) across the cohort. Those samples with the highest RR were more than 20 times more likely to progress than average, while those with the lowest RR were 10 times less likely (Fig. 2a). This information enabled us to calibrate risk classifications based on the enrichment of samples from progressor or non-progressor patients to maximize the sensitivity of our classes: ‘low’ (Pr≤0.3; sensitivity=0.87, specificity=0.65), ‘moderate’ (0.3<Pr<0.5), or ‘high’ (Pr≥0.5; sensitivity=0.72, specificity=0.82).
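The risk classes above can be written as a small lookup. The following is a minimal Python sketch, not the authors' R implementation; it assumes the quoted cut-offs (low at or below 0.3, high at or above 0.5, moderate strictly in between), and the function name `classify_risk` and the example probabilities are hypothetical.

```python
def classify_risk(pr: float) -> str:
    """Map a predicted probability of progression to a risk class.

    Cut-offs follow the text: low (Pr <= 0.3), high (Pr >= 0.5),
    moderate for anything strictly in between.
    """
    if not 0.0 <= pr <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if pr <= 0.3:
        return "low"
    if pr < 0.5:
        return "moderate"
    return "high"


# Hypothetical per-sample probabilities, for illustration only.
examples = [0.05, 0.30, 0.42, 0.50, 0.91]
labels = [classify_risk(p) for p in examples]
print(labels)
```

Both boundary values are inclusive on the outer classes, so a sample at exactly 0.3 is low risk and one at exactly 0.5 is high risk.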
Samples from patients who progressed were classified as “high risk” for progression independent of histopathology (Fig. 2b). Most importantly, CN profiles in NDBE samples that belonged to progressor patients were classified as high risk in 60.5% (104/172) while in non-progressor patients 64.7% (224/346) of samples were classified as “low risk”.
The model was then used to predict and classify risks per-sample for the validation cohort (76 patients, 213 samples). 78/142 (55%) samples from non-progressor patients were classified as low risk, and 55/71 (77%) of samples from patients who progressed were classified as high risk. As in the discovery cohort, high risk classification of progressor patient samples was largely independent of histopathology (Fig. 2c). Similarly, when we used our model to classify the historical Seattle study patient dataset (n=248, samples=1273 SNP array) we again found that samples from progressors were classified as high risk regardless of pathology (Supplementary Fig. 3-4). However, in this case the algorithm unsurprisingly suffers a loss of accuracy due to the differences in the methodology (see Supplemental Methods and Results for complete analysis and endpoint differences).
When sample classifications were plotted according to their spatial distribution in the segment and time of collection in the clinical history, strikingly concordant patterns emerged. Most progressive patient samples are classed as high risk throughout the disease history, while non-progressive patient samples are consistently low risk (Fig. 2d, Supplementary Fig. 5). This concordance is evident when we plot the highest risk at each timepoint per patient (Fig. 3a). For patients that progress, 50% (8/16) of endoscopies had at least one sample classified as high risk 8 or more years prior to transformation. This classification is in accordance with current diagnostic guidelines requiring only a single dysplastic sample to recommend treatment for a patient (Fig. 3b). Cases which lack early CN patterns of progression acquired these over the following years, leading to 78% (18/23) of endoscopies with at least one high risk sample one to two years prior to HGD/IMC diagnosis.
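Taking the highest risk at each timepoint amounts to a maximum over the samples of one endoscopy. The sketch below illustrates that aggregation in Python under stated assumptions: the tuple layout, patient identifiers, dates and probabilities are all hypothetical, and the 0.5 cut-off is the high-risk threshold quoted earlier.

```python
from collections import defaultdict

HIGH_RISK = 0.5  # high-risk cut-off quoted in the text


def max_risk_per_endoscopy(samples):
    """Return {(patient, endoscopy_date): highest sample probability}.

    samples: iterable of (patient_id, endoscopy_date, probability).
    """
    best = defaultdict(float)  # probabilities are >= 0, so 0.0 is a safe start
    for patient, date, pr in samples:
        key = (patient, date)
        best[key] = max(best[key], pr)
    return dict(best)


# Hypothetical records: two samples from one endoscopy, then single samples.
samples = [
    ("P1", "2005-03", 0.12),
    ("P1", "2005-03", 0.64),
    ("P1", "2007-06", 0.22),
    ("P2", "2006-01", 0.08),
]
per_endoscopy = max_risk_per_endoscopy(samples)
flagged = sorted(k for k, v in per_endoscopy.items() if v >= HIGH_RISK)
print(flagged)
```

A single high-risk sample is enough to flag the endoscopy, mirroring the guideline logic in which one dysplastic sample triggers a treatment recommendation.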
More interesting were the patients who have not yet progressed but display a consistent pattern of high-risk endoscopies. Two patients were high risk in every sequenced sample, while the remaining patients displayed a mix of risks at each timepoint (Fig. 2d), presenting what could be clonal diversity in very early progression to EAC (follow-up for these patients continues) and resulting in consistent high-risk over time (Fig. 3a).
Statistical algorithms can be improved by increasing the size of the dataset. We therefore conducted sub-sampling of the discovery cohort with increasing numbers of patients and model training as described (Methods). With each increment in the number of patients the predictive accuracy of the model increased, reaching a (cross validated) AUC of 0.89 (specificity=0.83, sensitivity=0.82) when combining all discovery and validation patients (n=164; Supplementary Fig. 6), indicating that a larger knowledge bank of CN and progression data from BE will continue to improve the precision of patient stratification and the sensitivity of the model by adding stronger statistical signals and accounting for broader biological variation.
Current guidelines for the management of BE focus on the length of the BE segment and the presence or absence of LGD/HGD in any biopsy sample taken during endoscopy32,33. Most of our patients were in surveillance prior to the current treatment recommendations for LGD, and hence we can compare a set of recommendations based on the current guidelines33 with our model applying similar criteria, but overlaying our risk classifications (Fig. 4a). We applied these recommendations across our entire discovery cohort (88 patients) and evaluated the first two endoscopies available excluding the endpoint (Fig. 4b, Supplementary Table 4). Using these criteria at the patient’s second surveillance endoscopy available (i.e. several years prior to transformation), 54% of progressor patients (19/35) would have received earlier treatment. Only 5 of these patients had repeat LGD diagnoses that could recommend earlier treatment or more aggressive surveillance under current pathology-based guidelines. 40% of progressor patients (14/35) would continue to receive yearly surveillance per current guidelines. The remaining 6% (2/35) would have been recommended reduced surveillance (3-5 years); however, they would not have been diagnosed any differently under current guidelines as they were consistently NDBE. One patient (13) may have had delayed treatment, but this would also have occurred under current guidelines, as no dysplasia was identified prior to transformation. 51% of patients who have not progressed (21/40) would have less frequent endoscopies, 33% (13/40) would continue to receive yearly surveillance per current guidelines, and 17% (7/40) would have had potentially unnecessary treatment compared to current guidelines. Three patients from our discovery cohort are shown with the guidelines compared (Fig. 4c-d) as examples.
Furthermore, the increasing sensitivity of the model as samples are taken closer to the endpoint is evident as most progressive patients are recommended treatment at their penultimate endoscopy while none would be recommended longer surveillance times (Supplementary Table 4).
Recent evidence from large-scale pan-cancer studies has suggested that genomic alterations are present many years before detectable disease1 in many cancer types. BE constitutes a known pre-malignant condition with historical follow-up to test whether genomic medicine can contribute to early cancer detection. Previous studies of BE progression have shown that genomic and epigenetic changes are present prior to cancer progression and differ in patients who do ultimately develop cancer, including: p53 expression12,14; DNA methylation changes17,18; CN losses and copy neutral loss of heterozygosity26,28,34; and high clonal diversity27.
However, our analysis has shown that even highly variable CN profiles generated from the entire biopsy sample (not dissected or separated) translate into surprisingly stable predictions of a patient’s risk of progression. Further, these single-sample predictions were as accurate as aggregated data from multiple biopsies across the entire endoscopy or patient, showing that despite high levels of divergence there are common patterns of CN alterations indicative of progressive disease. This level of predictive power using a genome-wide algorithm is more challenging to achieve with a focussed biomarker approach given the disease heterogeneity.
Perhaps most interesting for biomarker investigations is that, while our statistical model selects some genomic regions of instability as features that are known to be early drivers of EAC (e.g. TP53; Supplementary Fig. 7), few other features have any clearly associated tumour suppressor genes or other cancer-related activity (Supplementary Table 3). The heterogeneous nature of BE would partly explain the differences between the features our model selects as significantly contributing to progression and those found in previous studies28; however, there is currently no clear functional explanation for most of the features identified. It is likely that the sum of many small changes and the breakdown of gene regulatory control fuels oncogenicity.
While this study provides good evidence that genomic changes can predict future cancer risk, it is limited by the relatively small number of patients in the cohort, particularly progressive patients. Future studies that include more longitudinal genomic data will improve the sensitivity and specificity estimates of this model.
Ultimately, the combined use of low-cost genomic technologies, standard clinical samples and statistical modelling presented here is an example of how genomic medicine can be implemented for early detection for cancer. This demonstrates that genomic risk stratification has a realistic potential to enable earlier intervention for high-risk conditions, and at the same time reduce the intensity of monitoring and even reduce overtreatment in cases of stable disease.
Methods
Patient cohorts
A nested case-control cohort of 90 patients was initially recruited to this study from patients that had been under surveillance for BE in the East of England from 2001 to 2016 for a total of 632 person years. Permission to analyse existing clinical diagnostic samples was approved by the relevant institutional ethics committees (REC 14-NW-0252). Cases comprised 45 patients who progressed from NDBE to HGD or IMC with a minimum follow-up of 1 year (mean 4.6 ± 3.7 years). Controls were 45 patients who had not progressed beyond LGD starting from NDBE with a minimum follow-up of 3 years (mean 6.7 ± 3.2 years). Cases and controls were matched for age, gender, and length of BE segment (Supplementary Table 1). Patients had endoscopies at intervals determined by clinical guidelines with 4-quadrant biopsies taken every 2cm of BE length (Seattle protocol). One non-progressor patient revoked consent prior to analysis, and a second non-progressor was later removed during analysis when multiple comorbidities affecting the esophagus were identified. A total of 777 samples were sequenced with 773 passing our post-processing quality control.
An independent unmatched cohort of 76 patients was subsequently selected from patients under surveillance for BE in the East of England from 2001 to 2018 for model validation. This cohort comprised 18 patients who had progressed from NDBE to HGD or IMC with a minimum follow-up of 1 year (mean 6.1 ± 3.4 years) and 58 patients who had not progressed beyond LGD starting from NDBE with a minimum follow-up of 1.5 years (mean 5.4 ± 3.0 years). The earliest available endoscopy samples subsequent to initial BE diagnosis were obtained to assess future risk. No endpoint samples (e.g. HGD or IMC) were included. This cohort was selected from available samples with no attempt to match demographics; however, no significant differences were found between the groups (Supplementary Table 2). A total of 219 samples were sequenced from this cohort, with 213 passing our post-processing quality control.
Each sample from both cohorts was graded by multiple expert GI histopathologists using current clinical guidelines for IMC, HGD, LGD, indeterminate (ID), and NDBE. A single biopsy graded as HGD or IMC was considered the endpoint for progression as patients were immediately recommended treatment in the clinic. Since 2014 patients with LGD are also routinely treated with RFA making prospective analysis of the real rate of progression difficult.
All patients had previously given informed consent to be part of the following studies: Progressor study (REC -10/H0305/52), Barrett’s Biomarker Study (REC -01/149), OCCAMs (REC 07/H0305/52 and 10/H0305/1), BEST (REC 06/Q0108/272), BEST2 (REC 10/H0308/71), Barrett’s Gene Study (REC 02/2/57), Time & TIME 2 (REC 09/H0308/118), NOSE study (REC 08/H0308/272), Sponge study (REC 03/306).
Patient samples from the Seattle Barrett’s Esophagus Study31, which use SNP arrays as an orthogonal measure of CN with an endpoint of EAC, were also included for further validation (Supplemental Methods and Results).
Tissue Sample Processing and p53 IHC
Formalin fixed, paraffin embedded (FFPE) tissue samples from routine surveillance endoscopies were processed from scrolls, without microdissection as this protocol aims to be clinically relevant. Following the Seattle protocol for endoscopic surveillance 4-quadrant biopsies were taken every 1-2cm of the Barrett’s length at each endoscopy per patient. At each 1-2cm length the quadrant biopsies were pooled for sequencing as a single sample to ensure sufficient DNA (75ng) was present.
An additional section at each level of the Barrett’s segment (n=88, n=590 sections) was stained by immunohistochemistry (IHC) using a monoclonal antibody for wild-type and mutant p53 (NCL-L-p53-D07) at the NHS Addenbrooke’s Hospital UK on the Leica BOND-MAX™ system using Bond Polymer Refine Detection reagents (Leica Microsystems UK Ltd., Milton Keynes, UK) and graded by an expert pathologist as aberrant (absent or over-expressed) or normal35,36.
Shallow whole genome sequencing pipeline
Single-end 50-base pair sequencing was performed at a depth of 0.4X on the Illumina HiSeq platform. Sequence alignment was performed with BWA37 v.0.7.15, and pre-processing of the reads for mappability, GC content, and filtering was performed with QDNAseq30 using 50kb bins. Only autosomal sequences are retained after filtering due to low-depth mappability and GC correction. Samples were segmented for CN analysis using the piecewise constant fit function (pcf) in the R Bioconductor `copynumber` v1.16 package38. Input to this function was the GC adjusted read counts from QDNAseq (Supplementary Fig. 8).
Post-processing quality control
Per-segment residuals were calculated and the overall variance across the median absolute deviation of the segment residuals was derived as a per-sample quality control measure. This measure was developed using an additional set of samples (n=233), from fresh-frozen tumor tissue, FFPE cell-line tissue, and FFPE patient samples. No relationship was found between sample age and data quality, and post-segmentation quality issues were not resolvable (Supplementary Fig. 9). Therefore, samples with a mean variance of the segment residuals greater than 0.008 were excluded from analysis. This excluded more than 73% (171/233) of the quality control samples across all sample types (FFPE patient, FFPE cell line, fresh-frozen tumor). In the discovery cohort we excluded 0.5% (4/777) of samples, and in the validation cohort 2% (6/219) of samples.
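The exclusion rule can be illustrated with a short Python sketch. It implements one plausible reading of the text, namely the mean across segments of the variance of the bin-level residuals, compared against the 0.008 cut-off; the exact statistic (including the role of the median absolute deviation) is defined in the authors' R package and may differ. The data layout, function names and example values are hypothetical.

```python
import statistics

QC_THRESHOLD = 0.008  # cut-off quoted in the text


def mean_residual_variance(segments):
    """Mean over segments of the variance of (bin value - fitted segment value).

    segments: list of (fitted_value, [bin values]) pairs. This layout and the
    choice of statistic are assumptions made for illustration.
    """
    per_segment = [
        statistics.pvariance([v - fit for v in bins]) for fit, bins in segments
    ]
    return statistics.mean(per_segment)


def passes_qc(segments, threshold=QC_THRESHOLD):
    """A sample is kept when its score does not exceed the threshold."""
    return mean_residual_variance(segments) <= threshold


# Hypothetical samples: one with tight residuals, one with a noisy segment.
clean = [(0.0, [0.01, -0.01, 0.0, 0.02, -0.02]),
         (0.5, [0.51, 0.49, 0.50, 0.52, 0.48])]
noisy = [(0.0, [0.01, -0.01, 0.0, 0.02, -0.02]),
         (0.5, [0.9, 0.1, 0.5, 1.0, 0.0])]
print(passes_qc(clean), passes_qc(noisy))
```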
Statistical methods
We encoded all CN data on a genome-wide scale by taking a weighted average of the segmented values per 5Mb windows, and mean standardization per genomic window. In order to evaluate chromosomal instability on a larger scale we averaged the segmented values across chromosome arms and adjusted each 5Mb window by the difference between the window and the arm. The resulting data was 589 5Mb windows and 44 chromosome arms. We additionally included a measure of genomic complexity (‘cx’) by summing, per-sample, the 5Mb windows that had CN values two standard deviations from the mean.
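Two steps of this encoding are easy to show in code: the overlap-weighted average of segment values per 5Mb window, and the complexity count. The Python sketch below assumes half-open segment coordinates and already standardized window values for the complexity count; the per-window standardization and the chromosome-arm adjustment are omitted. All names and example values are hypothetical and this is not the authors' R code.

```python
WINDOW = 5_000_000  # 5Mb window size, as in the text


def window_average(segments, win_start, win_end):
    """Overlap-weighted mean of segment values within one window.

    segments: list of (start, end, value) with half-open coordinates.
    Returns None when no segment overlaps the window.
    """
    total = 0.0
    weight = 0.0
    for start, end, value in segments:
        overlap = min(end, win_end) - max(start, win_start)
        if overlap > 0:
            total += overlap * value
            weight += overlap
    return total / weight if weight else None


def encode_windows(segments, chrom_length):
    """Tile one chromosome into 5Mb windows (the last may be shorter)."""
    return [
        window_average(segments, s, min(s + WINDOW, chrom_length))
        for s in range(0, chrom_length, WINDOW)
    ]


def complexity(z_scores, cutoff=2.0):
    """Count windows at least `cutoff` standard deviations from the mean."""
    return sum(1 for z in z_scores if abs(z) >= cutoff)


# Hypothetical 12Mb chromosome with a gain starting at 4Mb.
segments = [(0, 4_000_000, 0.0), (4_000_000, 12_000_000, 1.0)]
windows = encode_windows(segments, 12_000_000)
cx = complexity([0.1, -2.5, 2.0, 1.9])
print(windows, cx)
```

The first window mixes 4Mb at 0.0 with 1Mb at 1.0, so its weighted average is 0.2, which shows why length weighting matters when a breakpoint falls inside a window.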
We performed elastic-net regression with the R glmnet39 package to fit regression models with varying regularization parameters. 5-fold cross validation repeated 10 times was performed on a per-patient basis removing all samples from 20% of patients in each fold. This process was performed in three conditions: using all samples; excluding HGD/IMC samples; excluding LGD/HGD/IMC. The two exclusion conditions were performed in order to assess the contribution of dysplasia to the classification rate of the model.
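The key property of this cross-validation is that folds are formed over patients, so no patient contributes samples to both the training and the held-out set. The Python sketch below shows only that splitting step (the model fitting itself was done by the authors with glmnet in R); the function name, the seed and the example patient labels are hypothetical.

```python
import random
from collections import defaultdict


def patient_folds(sample_patients, n_folds=5, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) with patients kept intact.

    sample_patients: one patient identifier per sample, in sample order.
    """
    by_patient = defaultdict(list)
    for idx, patient in enumerate(sample_patients):
        by_patient[patient].append(idx)
    patients = sorted(by_patient)
    rng = random.Random(seed)
    for rep in range(n_repeats):
        shuffled = patients[:]
        rng.shuffle(shuffled)
        for fold in range(n_folds):
            held_out = set(shuffled[fold::n_folds])  # roughly 1/n_folds of patients
            test_idx = sorted(i for p in held_out for i in by_patient[p])
            train_idx = sorted(
                i for p in patients if p not in held_out for i in by_patient[p]
            )
            yield rep, fold, train_idx, test_idx


# Hypothetical cohort: ten samples from six patients.
sample_patients = ["A", "A", "B", "C", "C", "C", "D", "E", "E", "F"]
splits = list(patient_folds(sample_patients))
print(len(splits))
```

Splitting by sample instead would let near-identical biopsies from one patient appear on both sides of a fold and inflate the apparent accuracy.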
The model was additionally tuned on two parameters: 1) QDNAseq bin size; and 2) elastic-net regression penalty, between 0 (ridge) and 1 (lasso). We assessed the cross-validation classification performance of the model at multiple QDNAseq bin sizes, and at multiple regression penalties. We selected the final QDNAseq bin size by comparing the leave-one-patient-out predictions from the discovery cohort to the model predictions for the validation cohort. This was done to minimize the batch errors in the raw data (Supplementary Figs. 10-11). For the regression penalty parameter, all models had a cross-validation classification rate of 72-75%. We therefore selected the parameter that limited the number of coefficients (n=74) and was not full lasso (e.g. 0.9). Coefficients determining the logarithmic relative risk change stemming from a unit change were calculated for each genomic region selected.
Subsequently, a leave-one-patient-out analysis (excluding all samples of an individual) was performed to generate predictions for all samples from a single individual and to estimate the overall model accuracy as the area under the ROC curve, computed with the R pROC40 package.
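The two ingredients of this analysis, holding out every sample of one patient at a time and scoring the pooled predictions by AUC, can be sketched in plain Python. This is an illustration rather than the authors' pROC-based code: the AUC is computed through its pairwise (Mann-Whitney) definition, and the function names, labels and scores are hypothetical.

```python
def leave_one_patient_out(sample_patients):
    """Yield (patient, train_idx, test_idx), holding out one patient at a time."""
    for patient in sorted(set(sample_patients)):
        test_idx = [i for i, p in enumerate(sample_patients) if p == patient]
        train_idx = [i for i, p in enumerate(sample_patients) if p != patient]
        yield patient, train_idx, test_idx


def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical held-out predictions: 1 = progressor sample, 0 = non-progressor.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.1, 0.8, 0.5]
auc = roc_auc(labels, scores)
lopo = list(leave_one_patient_out(["A", "A", "B", "C"]))
print(round(auc, 3), lopo[0])
```

In a full analysis a model would be refitted on `train_idx` for each held-out patient and its predictions for `test_idx` pooled before calling `roc_auc`.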
Acknowledgements
We thank the patients who donated tissue samples to this project. The laboratory of R.C.F. is funded by a Core Programme Grant from the Medical Research Council (RG84369). This work was also funded by a United European Gastroenterology Research Prize (RG76026). We thank the Human Research Tissue Bank, which is supported by the UK National Institute for Health Research (NIHR) Cambridge Biomedical Research Centre, from Addenbrooke’s Hospital. Additional infrastructure support was provided from the Cancer Research UK–funded Experimental Cancer Medicine Centre. We also thank Brian J. Reid, Patricia C. Galipeau, and Carissa A. Sanchez from the Fred Hutchinson Cancer Research Center for their time and help in understanding their data, as well as Alexander Wolfgang Jung from the EMBL-EBI for his time.
Footnotes
Data availability
Sequencing data and associated metadata that support this study have been deposited in the European Genome-phenome Archive under accession EGAD00001006033. The code and model that support these findings have been provided as an R package in a GitHub repository (https://github.com/gerstung-lab/BarrettsProgressionRisk).
Author contributions
S.K. developed the statistical methods, analysed data, and wrote the manuscript and supporting information with input from E.G., R.C.F., and M.G. E.G. put together the discovery cohort, developed the sWGS methods, generated the sWGS data, and curated the clinical information with support from A.V.J. S.K. and E.G. are joint first authors. The initial processing pipeline was developed by D.C.W., D.J.W. and M.E., who also provided input to the data analysis for the sWGS data. W.J., R.R., C.K. and A.M. identified, collected, and assessed pathology for patient samples. S.A., A.B., and C.K. sequenced the validation cohort and QC samples. R.C.F. initiated the study and jointly supervised it with M.G.; R.C.F. and M.G. are joint corresponding authors.
References
- 1. Gerstung M, et al. The evolutionary history of 2,658 cancers. Nature. 2020;578:122–128. doi: 10.1038/s41586-019-1907-7.
- 2. Mitchell TJ, et al. Timing the Landmark Events in the Evolution of Clear Cell Renal Cell Cancer: TRACERx Renal. Cell. 2018. doi: 10.1016/j.cell.2018.02.020.
- 3. Lee JJ-K, et al. Tracing Oncogene Rearrangements in the Mutational History of Lung Adenocarcinoma. Cell. 2019. doi: 10.1016/j.cell.2019.05.013.
- 4. Abelson S, et al. Prediction of acute myeloid leukaemia risk in healthy individuals. Nature. 2018;559:400–404. doi: 10.1038/s41586-018-0317-6.
- 5. Gregson EM, Bornschein J, Fitzgerald RC. Genetic progression of Barrett’s oesophagus to oesophageal adenocarcinoma. Br J Cancer. 2016;115:403–410. doi: 10.1038/bjc.2016.219.
- 6. Esserman LJ, et al. Addressing overdiagnosis and overtreatment in cancer: A prescription for change. Lancet Oncol. 2014;15:e234–e242. doi: 10.1016/S1470-2045(13)70598-9.
- 7. Siegel RL, Miller KD, Jemal A. Cancer statistics. CA Cancer J Clin. 2016;66:7–30. doi: 10.3322/caac.21332.
- 8. Masclee GMC, Coloma PM, De Wilde M, Kuipers EJ, Sturkenboom MCJM. The incidence of Barrett’s oesophagus and oesophageal adenocarcinoma in the United Kingdom and the Netherlands is levelling off. Aliment Pharmacol Ther. 2014;39:1321–1330. doi: 10.1111/apt.12759.
- 9. Phoa KN, et al. Radiofrequency ablation vs endoscopic surveillance for patients with Barrett esophagus and low-grade dysplasia: A randomized clinical trial. JAMA - J Am Med Assoc. 2014;311:1209–1217. doi: 10.1001/jama.2014.2511.
- 10. Shaheen NJ, et al. Radiofrequency Ablation in Barrett’s Esophagus with Dysplasia. N Engl J Med. 2009;360:2277–2288. doi: 10.1056/NEJMoa0808145.
- 11. Parasa S, et al. Development and Validation of a Model to Determine Risk of Progression of Barrett’s Esophagus to Neoplasia. Gastroenterology. 2018;154:1282–1289.e2. doi: 10.1053/j.gastro.2017.12.009.
- 12. Younes M, et al. p53 protein accumulation predicts malignant progression in Barrett’s metaplasia: a prospective study of 275 patients. Histopathology. 2017;71:27–33. doi: 10.1111/his.13193.
- 13. Pettit K, Bellizzi A. Evaluation of p53 Immunohistochemistry Staining Patterns in Barrett Esophagus With Low-Grade Dysplasia. Am J Clin Pathol. 2015;144:A382–A382.
- 14. Sikkema M, et al. Aneuploidy and Overexpression of Ki67 and p53 as Markers for Neoplastic Progression in Barrett’s Esophagus: A Case–Control Study. Am J Gastroenterol. 2009;104:2673–2680. doi: 10.1038/ajg.2009.437.
- 15. Keswani RN, Noffsinger A, Waxman I, Bissonnette M. Clinical use of p53 in Barrett’s esophagus. Cancer Epidemiol Biomarkers Prev. 2006;15:1243–9. doi: 10.1158/1055-9965.EPI-06-0010.
- 16. Reid BJ, et al. Predictors of progression in Barrett’s esophagus II: baseline 17p (p53) loss of heterozygosity identifies a patient subset at increased risk for neoplastic progression. Am J Gastroenterol. 2001;96:2839–2848. doi: 10.1111/j.1572-0241.2001.04236.x.
- 17. Alvi MA, et al. DNA methylation as an adjunct to histopathology to detect prevalent, inconspicuous dysplasia and early-stage neoplasia in Barrett’s esophagus. Clin Cancer Res. 2013;19:878–888. doi: 10.1158/1078-0432.CCR-12-2880.
- 18. Jin Z, et al. A multicenter, double-blinded validation study of methylation biomarkers for progression prediction in Barrett’s esophagus. Cancer Res. 2009;69:4112–4115. doi: 10.1158/0008-5472.CAN-09-0028.
- 19. Weaver JMJ, et al. Ordering of mutations in preinvasive disease stages of esophageal carcinogenesis. Nat Genet. 2014;46:837–843. doi: 10.1038/ng.3013.
- 20. Secrier M, et al. Mutational signatures in esophageal adenocarcinoma define etiologically distinct subgroups with therapeutic relevance. Nat Genet. 2016;48:1131–1141. doi: 10.1038/ng.3659.
- 21. Frankell AM, et al. The landscape of selection in 551 esophageal adenocarcinomas defines genomic biomarkers for the clinic. Nat Genet. 2019;51:506–516. doi: 10.1038/s41588-018-0331-5.
- 22. Nones K, et al. Genomic catastrophes frequently arise in esophageal adenocarcinoma and drive tumorigenesis. Nat Commun. 2014;5:5224. doi: 10.1038/ncomms6224.
- 23. Blum A, et al. RNA Sequencing Identifies Transcriptionally-Viable Gene Fusions in Esophageal Adenocarcinomas. Cancer Res. 2016. doi: 10.1158/0008-5472.CAN-16-0979. canres.0979.2016.
- 24. The Cancer Genome Atlas Research Network. Integrated genomic characterization of oesophageal carcinoma. Nature. 2017. doi: 10.1038/nature20805.
- 25. Ross-Innes CS, et al. Whole-genome sequencing provides new insights into the clonal architecture of Barrett’s esophagus and esophageal adenocarcinoma. Nat Genet. 2015;47:1038–1046. doi: 10.1038/ng.3357.
- 26. Maley CC, et al. Genetic clonal diversity predicts progression to esophageal adenocarcinoma. Nat Genet. 2006;38:468–473. doi: 10.1038/ng1768.
- 27. Martinez P, et al. Dynamic clonal equilibrium and predetermined cancer risk in Barrett’s oesophagus. Nat Commun. 2016;7:12158. doi: 10.1038/ncomms12158.
- 28. Li X, et al. Assessment of esophageal adenocarcinoma risk using somatic chromosome alterations in longitudinal samples in Barrett’s esophagus. Cancer Prev Res. 2015;8:845–856. doi: 10.1158/1940-6207.CAPR-15-0130.
- 29. Martinez P, et al. Evolution of Barrett’s Esophagus through space and time at single-crypt and whole-biopsy levels. Nat Commun. 2018:1–12. doi: 10.1038/s41467-017-02621-x. in press.
- 30. Scheinin I, et al. DNA copy number analysis of fresh and formalin-fixed specimens by whole-genome sequencing: improved correction of systematic biases and exclusion of problematic regions. Genome Res. 2014:1–24. doi: 10.1101/gr.175141.114.
- 31. Li X, et al. Temporal and spatial evolution of somatic chromosomal alterations: A case-cohort study of Barrett’s esophagus. Cancer Prev Res. 2014;7:114–127. doi: 10.1158/1940-6207.CAPR-13-0289.
- 32. Shaheen NJ, Falk GW, Iyer PG, Gerson LB. ACG Clinical Guideline: Diagnosis and Management of Barrett’s Esophagus. Am J Gastroenterol. 2016;111:30–50. doi: 10.1038/ajg.2015.322.
- 33. Fitzgerald RC, et al. British Society of Gastroenterology guidelines on the diagnosis and management of Barrett’s oesophagus. Gut. 2014;63:7–42. doi: 10.1136/gutjnl-2013-305372.
- 34. Stachler MD, et al. Paired exome analysis of Barrett’s esophagus and adenocarcinoma. Nat Genet. 2015;47:1047–55. doi: 10.1038/ng.3343.
- 35. Kaye PV, et al. Novel staining pattern of p53 in Barrett’s dysplasia - the absent pattern. Histopathology. 2010;57:933–935. doi: 10.1111/j.1365-2559.2010.03715.x.
- 36. Kaye PV, et al. Barrett’s dysplasia and the Vienna classification: Reproducibility, prediction of progression and impact of consensus reporting and p53 immunohistochemistry. Histopathology. 2009;54:699–712. doi: 10.1111/j.1365-2559.2009.03288.x.
- 37. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60. doi: 10.1093/bioinformatics/btp324.
- 38. Nilsen G, et al. Copynumber: Efficient algorithms for single- and multi-track copy number segmentation. BMC Genomics. 2012;13:591. doi: 10.1186/1471-2164-13-591.
- 39. Friedman J, Hastie T, Tibshirani R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J Stat Softw. 2010;33:1–22.
- 40. Robin X, et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics. 2011;12:77. doi: 10.1186/1471-2105-12-77.