Metastatic colorectal adenocarcinoma tumor purity assessment from whole exome sequencing data

Noura Tbeileh; Luika Timmerman; Aras N Mattis; Kan Toriguchi; Yosuke Kasai; Carlos Corvera; Eric Nakakura; Kenzo Hirose; David B Donner; Robert S Warren; Eveliina Karelehto

PLoS One. 2023 Apr 6;18(4):e0271354. doi: 10.1371/journal.pone.0271354

Editor: Elizabeth Christie

PMCID: PMC10079084 PMID: 37022995

Abstract

Tumors rich in stroma are associated with advanced stage and poor prognosis in colorectal adenocarcinoma (CRC). Abundance of stromal cells also has implications for genomic analysis of patient tumors, as it may prevent detection of somatic mutations. As part of our efforts to interrogate stroma-cancer cell interactions and to identify actionable therapeutic targets in metastatic CRC, we aimed to determine the proportion of stroma embedded in hepatic CRC metastases by performing computational tumor purity analysis based on whole exome sequencing (WES) data. Unlike previous studies focusing on histopathologically prescreened samples, we used an unbiased in-house collection of tumor specimens. WES data from CRC liver metastasis samples were used to evaluate stromal content and to assess the performance of three in silico tumor purity tools: ABSOLUTE, Sequenza and PureCN. Matching tumor-derived organoids were analyzed as a high-purity control, as they are enriched in cancer cells. Computational purity estimates were compared to those from a histopathological assessment conducted by a board-certified pathologist. According to all computational methods, metastatic specimens had a median tumor purity of 30% whereas the organoids were enriched for cancer cells with a median purity estimate of 94%. In line with this, variant allele frequencies (VAFs) of oncogenes and tumor suppressor genes were undetectable or low in most patient tumors, but higher in matching organoid cultures. A positive correlation was observed between VAFs and in silico tumor purity estimates. Sequenza and PureCN produced concordant results whereas ABSOLUTE yielded lower purity estimates for all samples. Our data show that unbiased sample selection combined with molecular, computational, and histopathological tumor purity assessment is critical to determine the level of stroma embedded in metastatic colorectal adenocarcinoma.

Introduction

Tumors are heterogeneous mixtures of cancer cells and non-cancerous stromal elements, such as fibroblasts, endothelial cells and immune cells [1]. The proportion of malignant cells versus stroma within the tumor mass is termed purity. Histologically, low tumor purity, i.e., a high level of stroma embedded within the tumor mass, has been linked to poor prognosis in colorectal cancer (CRC) [2, 3]. Transcriptomic analyses and molecular subtyping of CRC also support the conclusion that an abundance of stromal cells within the tumor, particularly cancer-associated fibroblasts, correlates with worse survival [4]. Additionally, low tumor purity hinders genomic and transcriptomic characterization of the malignant cells, for example by preventing accurate detection of somatic variants in cancer driver genes, thus leading to false negative findings [5].

To determine tumor purity, pathologists visually examine hematoxylin and eosin (H&E)-stained tumor sections and estimate the fraction of malignant and stromal cells within the sample. Recently, computational tools employing data generated by molecular assays such as single nucleotide polymorphism arrays, DNA methylation, RNA-sequencing, and whole exome sequencing (WES) have been developed to assess tumor purity. Such in silico tools can be differentiated by the signals they use to infer purity, e.g., copy number alterations (CNAs), somatic mutations, loss of heterozygosity signals, allelic fraction values, or deep learning models [6]. The computational tools strive to provide unbiased estimates of tumor purity, but the accuracy of such estimates has varied between tools depending on the method of inference used. Previous assessments of in silico purity tools have used public databases such as TCGA for their sample sets [6, 7]. While the TCGA CRC database is extensive, the samples in it have been histologically prescreened to omit those with tumor purity below 60%, thus leading to biased sampling [8].

Here we aimed to overcome this sampling bias by investigating tumor purity in a cohort of randomly selected in-house colorectal adenocarcinoma liver metastasis (CRCLM) specimens. WES was performed on tissue specimens collected from chemotherapy-naïve and treated patients, as well as on matching tumor organoid cultures, and tumor purity was estimated from these data using three in silico tumor purity tools: ABSOLUTE [9], Sequenza [10], and PureCN [11]. Overall, we observed lower tumor purity than was previously reported [8] for primary CRC samples, with a median computed purity below 50% across all CRCLM patient tumors and all computational methods used in this study. Variant allele frequencies (VAFs) of the pathogenic mutations found in patient tumors were consistent with the computational tumor purity estimates. As expected, organoid cultures were enriched for cancer cells with median estimated purity above 90% and high VAFs. Additionally, we found varying concordance among the in silico tools and between the computational and pathologist tumor purity estimates.

Materials and methods

Samples and consent

Eighteen patients underwent resection for colorectal adenocarcinoma liver metastasis at the University of California, San Francisco (UCSF). The research protocol was approved by the institutional review board of University of California San Francisco (IRB#10–05031). This study was exempt from informed consent as only excess archival patient tissue was collected and as samples were subsequently anonymized.

No minors were involved in these studies. One half of the excess tissue sample from resected liver metastases and adjacent normal liver was snap frozen by transfer to a Dewar flask containing liquid nitrogen, transported to the laboratory, and stored at −80°C. The remaining portion of the tumor specimen was used to generate a mouse xenograft and to isolate tumor organoids. All animal experiments were carried out by members of the UCSF Preclinical Therapeutics Core Facility in accordance with the University of California San Francisco animal care and use committee (IACUC# AN179937-03A). Animals were anesthetized using isoflurane (2–4%) during subcutaneous (SC) patient-derived xenograft (PDX) implantation and, at endpoint, euthanized by asphyxiation with CO2 followed by cervical dislocation, as recommended by the Panel on Euthanasia of the American Veterinary Medical Association. An overview of the samples used for this study is shown in Table 1.

Table 1. Overview of patient colorectal adenocarcinoma liver metastasis samples.

Patient ID	Primary tumor location	Chemotherapy	Matched normal DNA	Tumor organoids	Sample type used for tumor DNA	Pathologist assessment
CR1106	Colon	Yes	Yes	PDXO	Snap frozen	FFPE
CR1107	Rectum	Yes	Yes	PDO	FFPE	FFPE
CR1116	Colon	Yes	Yes	PDXO	Snap frozen	FFPE
CR1119	Colon	Yes	Yes	PDXO	Snap frozen	FFPE
CR1121	Colon	Yes	Yes	PDO	Snap frozen	FFPE
CR1123	Rectum	Yes	Yes	PDXO	Snap frozen	FFPE
CR550	Colon	Yes	n/a	n/a	Snap frozen	FFPE
CR611	Colon	Yes	n/a	n/a	Snap frozen	FFPE
CR623	Colon	Yes	n/a	n/a	Snap frozen	n/a
CR644	Colon	No	n/a	n/a	Snap frozen	FFPE
CR655	Colon	Yes	n/a	n/a	Snap frozen	n/a
CR661	Colon	No	n/a	n/a	Snap frozen	n/a
CR674	Colon	Yes	n/a	n/a	Snap frozen	n/a
CR692	Colon	No	n/a	n/a	Snap frozen	n/a
CR703	Colon	Yes	n/a	n/a	Snap frozen	FFPE
CR704	Colon	No	n/a	n/a	Snap frozen	n/a
CR719	Colon	No	n/a	n/a	Snap frozen	FFPE
CR726	Colon	No	n/a	n/a	Snap frozen	FFPE

PDO; patient-derived organoids, PDXO; patient-derived xenograft organoids, FFPE; formalin-fixed paraffin-embedded, n/a; not available.

Organoid culture

Tumor organoids were generated either directly from patient tumor (patient-derived organoids, PDOs) or from the xenograft tumor (patient-derived xenograft organoids, PDXOs). For two patients, xenografts did not take but organoid lines could be established directly from the patient tumors. Conversely, PDOs could not be generated directly from four patient tumor specimens and PDXOs were used instead. All organoids were generated by the method previously described by Kondo et al. [12] with some modifications. Briefly, patient or PDX tumor specimens were minced using a scalpel and digested using Liberase (Sigma) and DNase (Qiagen) in a gentleMACS Octo Dissociator (Miltenyi) for 1 hour at 37°C. The dissociated tumor suspension was then filtered sequentially, first through 500 μm, 250 μm and 100 μm filters to remove undigested material, and then through a 40 μm filter, which retained small clusters of cells while allowing individual cells to pass through. The cell clusters were transferred to ultra-low attachment plates and cultured in organoid media (DMEM/F12 Glutamax (Gibco), 1x PenStrep/Glutamine (Invitrogen), 1x STEMPRO hESC SFM (Invitrogen), 0.1 mM beta-mercaptoethanol, 8 ng/mL bFGF (Invitrogen), 1.8% BSA (Invitrogen) and 2% growth factor reduced Matrigel (Corning)). DNA for whole exome sequencing was extracted from low passage organoids (p < 5) except for CR1107 PDOs, for which additional passaging (p = 16) was necessary to obtain sufficient yield.

Whole exome sequencing

RNA-free genomic DNA was extracted from fresh frozen tumor specimens and from organoids using the Macherey-Nagel NucleoSpin Tissue mini kit according to manufacturer instructions. DNA from patient CR1107 tissues was extracted from formalin-fixed paraffin-embedded (FFPE) sections of tissue at Novogene Co., Ltd. in Beijing, China. Whole exome capture and sequencing (WES) was performed at Novogene Co., Ltd. at a sequencing depth of 100X. Briefly, genomic DNA was randomly sheared into short fragments of 180–280 bp. The fragments were end repaired, A-tailed, and further ligated with Illumina adapters. The fragments with adapters were PCR amplified, size selected, and purified. The prepared libraries were hybridized in buffer with biotin-labeled probes, and magnetic beads with streptavidin captured the exons of genes. Subsequently, non-hybridized fragments were washed out and probes were digested. The captured libraries were enriched by PCR amplification. Library quality was assessed using Qubit and real-time PCR for quantification, and a bioanalyzer for size distribution detection. Quantified libraries were pooled and sequenced on an Illumina platform with PE150. The paired-end clean reads were mapped to the reference genome (GRCh38) with the Burrows-Wheeler Aligner (BWA) [13]. The resulting BAM files were sorted and indexed with SAMtools, and duplicate reads were marked with Picard [14]. The WES data from this study have been deposited in the NIH dbGaP repository with accession number phs003059.

Variant calling

To detect pathogenic variants in patient tumors, single nucleotide polymorphisms were called from BAM files using GATK’s HaplotypeCaller and annotated with ANNOVAR [15, 16]. These variants were then filtered down to those classified as pathogenic in the ClinVar database [17]. For the in silico tumor purity tools described below, somatic and germline variants were called using GATK4’s Mutect version 2.2 according to GATK Best Practices. Tumor-only samples were called with af-only-gnomad.hg38.vcf.gz as the germline resource (from Broad Institute’s Google Cloud Bucket) with the flags --genotype-germline-sites true --genotype-pon-sites true to keep germline mutations in the output VCF. Matched tumor samples and organoids were run using the same flags with the addition of their matched normal BAM file.
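As an illustration of the two calling modes described above, the following minimal sketch assembles a Mutect2 argument list for a tumor-only run and for a matched run. It is not the pipeline used in this study: the function name, file paths and sample name are hypothetical, only the flags named in the text plus the standard -R, -I, -O, --germline-resource and -normal arguments are used, and nothing is executed.

```python
from typing import List, Optional


def mutect2_command(
    tumor_bam: str,
    reference_fasta: str,
    output_vcf: str,
    germline_resource: str = "af-only-gnomad.hg38.vcf.gz",
    normal_bam: Optional[str] = None,
    normal_sample_name: Optional[str] = None,
) -> List[str]:
    """Assemble a `gatk Mutect2` argument list (hypothetical helper; runs nothing)."""
    cmd = [
        "gatk", "Mutect2",
        "-R", reference_fasta,
        "-I", tumor_bam,
        "--germline-resource", germline_resource,
        # Keep germline and panel-of-normals sites in the output VCF,
        # as described in the text.
        "--genotype-germline-sites", "true",
        "--genotype-pon-sites", "true",
        "-O", output_vcf,
    ]
    if normal_bam is not None:
        # A matched run adds the normal BAM and names its sample.
        if normal_sample_name is None:
            raise ValueError("a matched normal BAM needs its sample name for -normal")
        cmd += ["-I", normal_bam, "-normal", normal_sample_name]
    return cmd


# Hypothetical paths and sample names, for illustration only.
tumor_only = mutect2_command("CR550_tumor.bam", "GRCh38.fa", "CR550.vcf.gz")
matched = mutect2_command(
    "CR1121_tumor.bam", "GRCh38.fa", "CR1121.vcf.gz",
    normal_bam="CR1121_normal.bam", normal_sample_name="CR1121_normal",
)
print(" ".join(tumor_only))
print(" ".join(matched))
```

Returning a list rather than a shell string keeps the arguments unambiguous if the command is later passed to a process launcher.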

ABSOLUTE

ABSOLUTE infers purity from relative copy number profiles in the provided segmentation file. Ambiguous cases are resolved through pre-computed statistical models of cancer karyotypes based on a diverse sample reference collection. This algorithm also attempts to account for copy number alterations and point mutations in tumor subclones [9]. A segmentation file for each clinical and organoid sample with a matched normal liver sample was produced using GATK4’s ModelSegments CNA workflow. This file contains total segmented copy ratios for the tumor sample. ABSOLUTE used this file as input and was run with otherwise default parameters. For each sample, purity was estimated with the max.non.clonal parameter set to 5% (the default), 30% and 50% to account for tumor heterogeneity. Purity for each run was accepted as the maximum log-likelihood solution.
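To make the link between copy ratio and purity concrete, the sketch below evaluates the standard mixture relation for the expected copy ratio of a segment, (purity × segment copy number + 2 × (1 − purity)) / (purity × tumor ploidy + 2 × (1 − purity)). This is a simplified illustration rather than ABSOLUTE's implementation; the function name and the choice of a diploid tumor are assumptions.

```python
def expected_copy_ratio(purity: float, segment_cn: int, tumor_ploidy: float = 2.0) -> float:
    """Expected tumor/normal copy ratio of a segment in a tumor-stroma mixture.

    Assumes stromal cells are diploid; `tumor_ploidy` is the average tumor
    copy number (assumed 2.0 here purely for illustration).
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must lie between 0 and 1")
    mixed_segment = purity * segment_cn + 2.0 * (1.0 - purity)
    mixed_genome = purity * tumor_ploidy + 2.0 * (1.0 - purity)
    return mixed_segment / mixed_genome


# Compare a low-purity tumor with a high-purity organoid.
for purity in (0.30, 0.94):
    loss = expected_copy_ratio(purity, segment_cn=1)
    gain = expected_copy_ratio(purity, segment_cn=3)
    print(f"purity {purity:.2f}: one-copy loss -> {loss:.2f}, one-copy gain -> {gain:.2f}")
```

Under these assumptions a one-copy loss shifts the ratio from 1.00 to 0.85 at 30% purity but to 0.53 at 94% purity, which illustrates why copy number signals become harder to separate from noise in stroma-rich samples.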

Sequenza

Sequenza performs allele-specific segmentation before applying a probabilistic model to segmented data, taking into account the average sequencing depth ratio of tumor versus normal and B allele frequency and estimating model parameters through a maximum a posteriori approach to infer purity and ploidy [10]. Tumor, organoid, and matched normal BAM files were analyzed using the workflow described in the Sequenza User Guide. Input files were preprocessed using sequenza-utils, the Python library accompanying the Sequenza R-package. The preprocessed files were analyzed using the Sequenza R-package. Purity for each sample was accepted as the first solution (of “cellularity”) in the confints_CP.txt file output which was determined through a maximum likelihood estimation and the 95% confidence interval.
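The idea of fitting purity jointly to depth ratio and B allele frequency can be sketched with a toy grid search. The example below is a least-squares stand-in, not Sequenza's maximum a posteriori model: ploidy is fixed at 2, the candidate allelic states are limited to one to four copies, and all function names and the simulated segments are hypothetical.

```python
from typing import List, Tuple

# Candidate allelic states (total copies, minor-allele copies) tried per segment.
STATES = [(cn, minor) for cn in range(1, 5) for minor in range(cn // 2 + 1)]


def expected_signals(purity: float, cn: int, minor: int, ploidy: float = 2.0) -> Tuple[float, float]:
    """Expected (depth ratio, B allele frequency) for a segment, assuming diploid stroma."""
    segment = purity * cn + 2.0 * (1.0 - purity)
    genome = purity * ploidy + 2.0 * (1.0 - purity)
    baf = (purity * minor + (1.0 - purity)) / segment
    return segment / genome, baf


def segment_error(purity: float, depth_ratio: float, baf: float) -> float:
    """Squared error to the best-fitting candidate state at a given purity."""
    return min(
        (depth_ratio - r) ** 2 + (baf - b) ** 2
        for r, b in (expected_signals(purity, cn, minor) for cn, minor in STATES)
    )


def fit_purity(segments: List[Tuple[float, float]]) -> float:
    """Return the grid purity (0.05 to 1.00) with the smallest total error."""
    grid = [i / 100 for i in range(5, 101)]
    return min(grid, key=lambda p: sum(segment_error(p, r, b) for r, b in segments))


# Simulate noise-free segments at 40% purity and recover the purity.
true_states = [(1, 0), (2, 1), (3, 1), (2, 0)]
simulated = [expected_signals(0.40, cn, minor) for cn, minor in true_states]
print(fit_purity(simulated))
```

With real data the signals are noisy and ploidy is unknown, so several purity and ploidy combinations can fit almost equally well; this is the ambiguity that the probabilistic models of the actual tools are designed to handle.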

PureCN

PureCN employs a likelihood model on segmented data that identifies artifacts caused by incorrect read alignment or contamination of DNA from other individuals, incorporates the important information provided by somatic point mutations from VCF input, uses copy number and SNV information jointly, and supports uneven tiling of targets across the genome to give the best purity estimate [11]. PureCN was used to call the purities of tumor-only samples using its tumor-only mode according to PureCN’s best practices. Data was segmented through PureCN’s internal segmentation method. GC-normalized coverages were calculated for all samples using PureCN’s script Coverage.R. A normal database was then built using all normal sample coverages using NormalDB.R. Then PureCN’s main script was run to infer purity taking as input the tumor sample’s normalized GC coverage file, VCF from Mutect, normal database, and baits interval file obtained from Agilent. Tumor and organoid samples with a matched normal were run using a similar workflow. The normal database for each run contained all normal samples except the one used in the matched run. PureCN’s main script took as input the tumor or organoid normalized GC coverage file, the matched normal normalized GC coverage file, the normal database, VCF from Mutect, and baits interval file. Accepted purity for every sample was determined to be the maximum likelihood solution determined by PureCN for each sample.
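The leave-one-out construction of the normal database for matched runs can be sketched as follows. This is a bookkeeping illustration only: the helper name and coverage file names are hypothetical, and the patient IDs are those listed in Table 1 as having matched normal DNA.

```python
from typing import Dict, List


def leave_one_out_normals(normal_coverage: Dict[str, str]) -> Dict[str, List[str]]:
    """For each patient, list every normal coverage file except that patient's own."""
    return {
        patient: [path for other, path in normal_coverage.items() if other != patient]
        for patient in normal_coverage
    }


# Hypothetical GC-normalized coverage file names for the six matched normals.
patients = ["CR1106", "CR1107", "CR1116", "CR1119", "CR1121", "CR1123"]
normal_coverage = {p: f"{p}_normal_coverage_loess.txt" for p in patients}

databases = leave_one_out_normals(normal_coverage)
for patient, files in databases.items():
    print(patient, "->", len(files), "normals in database")
```

Each matched run would then use a database of five normals, so that the matched normal is not counted both as the matched sample and as part of the reference database.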

Pathologist estimate

For tumor samples with an available FFPE tumor specimen, a board-certified gastrointestinal pathologist from UCSF estimated the percentage of tumor cells within the tumor area of the tissue section. Additionally, proportions of necrosis and fibrosis were estimated within the tumor area. All estimates were based on 4μm thick H&E-stained sections of the specimens.

Statistical analysis

The Spearman rank test was used to measure correlation between the variant allele frequencies and in silico tumor purity assay results. The mean tumor purity between chemotherapy naïve and treated patient specimens was compared by unpaired two-tailed t test with alpha level of 0.05. Data were analyzed using GraphPad Prism 9 (GraphPad Software Inc., La Jolla, CA).

Results

We employed three in silico tools, ABSOLUTE, Sequenza and PureCN, to estimate tumor purity in CRCLM from 6 patients for whom matching normal, tumor and tumor-derived organoid WES data were available. Results for the patient tumor samples and the matching organoids are displayed for each tool in Fig 1A and 1B. Based on the TCGA guidelines, we consider purity less than 60% low and more than 60% high [20]. Median estimated purities for the patient tumors by all methods were low (ABSOLUTE 22%, Sequenza 42.5% and PureCN 48.5%) whereas estimates for the tumor-derived organoids were higher (ABSOLUTE 35%, Sequenza 100% and PureCN 93%). ABSOLUTE consistently produced lower purity estimates than the other two tools. Higher tumor purity was expected for the organoid samples as stromal components are typically lost during organoid generation [18, 19]. To test this further, we used normal DNA from patient CR1121 and matching CR1121 tumor organoids to compare the three tools against a sample set with known ratios of normal and tumor DNA (Fig 1C). Organoid and normal DNA were mixed at the following ratios (organoid%/normal%): 75/25, 30/70 and 15/85. All tools predicted the pure 100% organoid sample to have tumor purity of 95–100% whereas estimates for the mixed samples were variable. Sequenza and PureCN estimated the 75/25 sample to have purity of 58% while the ABSOLUTE result was lower at 26%. Purity estimates for the 30/70 and 15/85 samples from all tools were underestimates, with results ranging between 10% and 20%. We subsequently employed the only tool of the three compatible with tumor-only data, PureCN, to analyze additional CRCLM tumors from 12 patients for whom no matching normal or organoid DNA was available. As shown in Fig 2, with few exceptions we again observed low tumor purity (median 46.5%, range 17–89%).
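The nominal purity of the mixed samples can be worked through with a short sketch. It assumes that the expected purity of a mixture is the organoid DNA fraction multiplied by the organoid's own purity, and it uses only the values reported above for the 75/25 sample. The second function is an additional assumption: it converts a DNA mass fraction into a cell fraction for a hypothetical tumor ploidy, since cells with more DNA contribute more mass per cell. Function names and the ploidy value of 3 are illustrative.

```python
def expected_mixture_purity(organoid_fraction: float, organoid_purity: float) -> float:
    """Expected purity when organoid DNA of a given purity is diluted with normal DNA."""
    return organoid_fraction * organoid_purity


def dna_fraction_to_cell_fraction(tumor_dna_fraction: float, tumor_ploidy: float) -> float:
    """Convert a tumor DNA mass fraction into a tumor cell fraction (normal cells diploid)."""
    tumor_cells = tumor_dna_fraction / tumor_ploidy
    normal_cells = (1.0 - tumor_dna_fraction) / 2.0
    return tumor_cells / (tumor_cells + normal_cells)


# Estimates reported in the text for the 75/25 (organoid/normal) mixture.
reported_75_25 = {"Sequenza": 0.58, "PureCN": 0.58, "ABSOLUTE": 0.26}

# The pure organoid sample was estimated at 95-100% purity.
low = expected_mixture_purity(0.75, 0.95)
high = expected_mixture_purity(0.75, 1.00)
print(f"expected purity for 75/25: {low:.3f} to {high:.3f}")
for tool, estimate in reported_75_25.items():
    print(f"{tool}: reported {estimate:.2f}, shortfall vs. lower bound {low - estimate:.3f}")

# Hypothetical ploidy of 3: the same DNA mixture corresponds to fewer tumor cells.
print(f"cell fraction at ploidy 3: {dna_fraction_to_cell_fraction(0.75, 3.0):.3f}")
```

If the organoid genome were not diploid, part of the apparent underestimation could reflect the difference between DNA fraction and cell fraction; the ploidy of the CR1121 organoids is not stated here, so this remains a possibility rather than a finding.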

Fig 1 — ABSOLUTE was run with the max.non.clonal parameter set to either 30% or 50%. CRCLM tumors (A) and tumor-derived organoids (B) from 6 patients for whom matching normal DNA was available were analyzed. (C) Tumor purity estimates of samples with known ratios of normal and tumor DNA from patient CR1121. PDO; patient-derived tumor organoids, PDXO; patient-derived xenograft organoids.

A pathologist estimated the tumor purity of FFPE sections as the percentage of tumor cells within the tumor area. A representative H&E-stained section of a CRCLM patient tumor (CR726) is shown in Fig 3. Pathologist tumor purity estimates for all available samples are shown in Table 2. The proportions (%) of necrosis and fibrosis within the tumor area were also assessed. In silico tumor purity results from the PureCN tool are included in Table 2 for comparison. For approximately half of the samples, pathologist tumor purity estimates were similar to those obtained in silico, but for the remaining samples the estimates differed.

Fig 3 — Tumor area is outlined with cyan and the zoomed in area within it in yellow.

Table 2. Pathologist estimate of tumor purity based on H&E stained sections of FFPE CRCLM tumor samples from 12 patients.

	Pathologist			In silico
Patient ID	Tumor purity ^a	Necrosis ^b	Fibrosis ^b	Tumor purity ^c
CR1106	20	25	10	59
CR1107	60	50	10	38
CR1116	65	70	5	91
CR1119	70	10	10	72
CR1121	80	30	20	32
CR1123	35	15	20	23
CR550	0	100	0	19
CR611	45	50	15	42
CR644	60	17	40	89
CR703	40	72	8	32
CR719	30	62	8	51
CR726	20	50	30	17

^a % of tumor cells within tumor area

^b % of necrosis/fibrosis within tumor area

^c In silico tumor purity results using PureCN tool
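The agreement between the two purity columns of Table 2 can be examined with a short sketch that computes a Spearman rank correlation (with average ranks for ties) and counts samples whose two estimates lie within a chosen tolerance. The analysis in this study was performed in GraphPad Prism; the pure-Python functions and the 15-point tolerance below are illustrative assumptions, and no significance test is included.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    return cov / sqrt(sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry))


# Tumor purity columns of Table 2, in row order (CR1106 ... CR726).
pathologist = [20, 60, 65, 70, 80, 35, 0, 45, 60, 40, 30, 20]
purecn = [59, 38, 91, 72, 32, 23, 19, 42, 89, 32, 51, 17]

rho = spearman_rho(pathologist, purecn)
within_15 = sum(abs(p - c) <= 15 for p, c in zip(pathologist, purecn))
print(f"Spearman rho = {rho:.2f}; {within_15} of {len(purecn)} samples agree within 15 points")
```

The tolerance is arbitrary; changing it changes how many samples count as "similar", which is one reason a rank correlation is a useful complement to a simple count.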

Tumor WES data from all 18 patients was screened for known pathogenic variants and the variant allele frequencies (VAF) were investigated. As shown in Fig 4A, relatively few pathogenic variants were detected and apart from patient tumors CR1116, CR692, and CR644, variants exhibited low VAFs. For example, APC and TP53 mutations across all tumors on average had VAFs of 19% and 26%, respectively. In contrast, the majority of the sequencing reads from tumor organoids harbored the same pathogenic variants as their parent tumors but with higher VAFs (Fig 4B). Furthermore, we found significant positive correlations between the VAF of the most prevalent pathogenic variant in each sample and the in silico tumor purity estimates from Sequenza and PureCN algorithms and from ABSOLUTE when run with max.non.clonal parameter set to 50% (Fig 5A–5D). However, there was no significant correlation between the pathologist tumor purity assessment and the VAFs (Fig 5E), nor between the pathologist purity assessment and the PureCN in silico purity results (Fig 5F).

Fig 4 — Variant allele frequencies of the pathogenic single nucleotide variants detected in CRCLM tumors (A) from 18 patients and in tumor organoids (B) derived from 6 of these patients. PDO; patient-derived tumor organoids, PDXO; patient-derived xenograft organoids. Amino acid change of each variant is displayed within the heatmap cells.

Fig 5 — (A) and (B) ABSOLUTE, (C) Sequenza, (D) PureCN, or (E) pathologist tumor purity assessment. (F) Correlation between the PureCN in silico tumor purity results and pathologist purity assessment. Each dot represents a patient tumor specimen or a tumor-derived organoid sample.

Lastly, we stratified the tumor samples by patient chemotherapy status prior to hepatectomy and compared the PureCN tumor purity estimates between chemotherapy-naïve and treated patients. As shown in Fig 6, we found no statistically significant difference between the two groups.

Discussion

Determining tumor purity is important because of the role that stroma plays in cancer progression and because of the confounding effect low purity has on molecular analyses of the malignant cells. Here we determined tumor purity in colorectal cancer liver metastasis specimens from 18 patients as well as in matching tumor organoids from 6 patients. By using whole-exome sequencing data and three in silico tumor purity tools, ABSOLUTE, Sequenza, and PureCN, we found lower tumor purity than has previously been reported for CRC [8]. This may be specific for liver metastases as previous reports have focused on primary CRC tumors. However, the lower median purity estimates we observed might also be due to the unbiased selection of the tumors we employed in this study as opposed to studies using the TCGA database which is composed mainly of samples from patients without prior chemotherapy treatment and high tumor purity as determined by histopathological evaluation [20].

Pathogenic mutations in APC, KRAS and TP53 are typically found in 80%, 40–50% and 70% of metastatic CRC tumors, respectively [21, 22]. We found only a few of the common pathogenic variants in CRC and detected KRAS mutations in only 2 out of 18 (11.1%) samples. Additionally, VAFs of the mutations detected were low, meaning that only a minority of the sequencing reads harbored the variant while most were wildtype. In contrast, most of the detected mutations in patient tumors exhibited high VAFs in matching organoids. This was expected as protocols used to generate organoids generally enrich for rapidly growing epithelial cells and deplete stromal components [18, 19]. Overall, the VAF data agreed with the output of in silico purity tools, particularly of Sequenza and PureCN, both of which suggested low tumor purity for this CRCLM cohort.
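The relationship between VAF and purity can be made explicit with a simple mixture model. The sketch below assumes a clonal mutation, diploid stromal cells, and a known local tumor copy number and mutant copy count, none of which were determined per variant in this study; the function names and the scenarios chosen are illustrative. Under these assumptions the expected VAF is (purity × mutant copies) / (purity × tumor copy number + 2 × (1 − purity)).

```python
def expected_vaf(purity: float, mutant_copies: int, tumor_cn: int) -> float:
    """Expected VAF of a clonal mutation in a tumor-stroma mixture (diploid stroma)."""
    return purity * mutant_copies / (purity * tumor_cn + 2.0 * (1.0 - purity))


def purity_from_vaf(vaf: float, mutant_copies: int, tumor_cn: int) -> float:
    """Invert expected_vaf; a result above 1 means the assumed state cannot explain the VAF."""
    return 2.0 * vaf / (mutant_copies + vaf * (2.0 - tumor_cn))


# Mean VAFs reported in the Results for APC (19%) and TP53 (26%).
scenarios = {
    "heterozygous, diploid": (1, 2),
    "one mutant copy, other allele deleted": (1, 1),
    "copy-neutral LOH (two mutant copies)": (2, 2),
}
for gene, vaf in (("APC", 0.19), ("TP53", 0.26)):
    for label, (mutant_copies, tumor_cn) in scenarios.items():
        implied = purity_from_vaf(vaf, mutant_copies, tumor_cn)
        print(f"{gene} VAF {vaf:.2f}, {label}: implied purity {implied:.2f}")
```

For a heterozygous mutation in a diploid region the implied purity is simply twice the VAF, so mean VAFs of 19% and 26% would correspond to purities of roughly 38% and 52%, in the same range as the Sequenza and PureCN medians; the other scenarios imply lower purity.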

The computational tools used in this study identified low median tumor purity across the sample set. Sequenza and PureCN gave similar results while ABSOLUTE produced lower estimates for most samples. Variation in results between Sequenza and PureCN may be attributed to the differences in copy number segmentation methods and the data utilized. PureCN relies on a database of normal samples and on the matched normal sample whereas Sequenza compares tumor data only to the matched normal sample [7]. Since the normal database used in this study consisted of normal samples collected from only 6 patients, PureCN’s algorithm may have performed sub-optimally. In general, Sequenza and PureCN provide more informative output than ABSOLUTE. While ABSOLUTE only provided a selection of possible solutions, Sequenza and PureCN provided segmentation output, and several additional visualizations to better inform the user of how their optimal solution was chosen. With ABSOLUTE we observed that the max.non.clonal parameter greatly affected the tumor purity estimate. A default setting of 5% often did not produce an output for our samples which may indicate a high level of cancer cell heterogeneity. Attempting to take sub-clonality into account, we set the max.non.clonal parameter at 30% or 50% and obtained purity estimates for all samples. However, these were consistently lower than with the other computational tools. PureCN was the only tool that estimated purity from tumor only samples with no matching normal data available. We found significant positive correlation between the VAF results and the tumor purity estimates from all three computational tools. However, the correlation was strongest with Sequenza and PureCN. When analyzing mixed samples with known ratios of normal and tumor DNA, all tools underestimated purity. Notably, ABSOLUTE failed to predict high purity for a sample that consisted of 75% cancer DNA and resulted in a purity estimate of 26%, irrespective of the algorithm’s max.non.clonal parameter.

For some of the samples, similar tumor purity estimates were obtained from in silico analysis and from the pathologist. However, we did not find a statistically significant correlation between the results of the pathologist and the output of the computational tools. This lack of correlation has been reported by others [6, 7] and was suggested to result from the qualitative nature of pathologist estimates and the failure of the assessed slides to fully account for tumor heterogeneity. For our sample set, this lack of correlation might also be explained by the sample evaluated by the pathologist not being from the same area of the tumor that was used to extract genomic DNA for WES. The only exception was the tumor sample from patient CR1107, for which the same FFPE sample was used for both DNA extraction and sectioning. Despite this, the in silico and pathologist purity estimates for this tumor were discordant. Establishing a standard whereby pathological and molecular samples are derived from adjacent tumor pieces may produce better agreement between these two types of purity analyses. In addition, laser-capture microdissection combined with genomic measurement might be employed to yield more accurate estimates of tumor purity. Importantly, the pathologist was also able to provide estimates of necrosis and fibrosis in addition to tumor purity, parameters not currently available with the in silico tools.

Neoadjuvant chemotherapy has been reported to result in enrichment of cancer-associated fibroblasts within the residual tumor mass [23]. We noted a slightly lower mean purity i.e., higher stromal content, in tumors from patients who had received neoadjuvant chemotherapy prior to hepatectomy. However, this finding was not statistically significant. Additional specimens, with information on the type and the timing of the neoadjuvant therapy prior to surgery, are needed to investigate the effect of chemotherapy on tumor purity.

Our data show that metastatic CRC tumors often have an unappreciated abundance of stromal cells that genomic and transcriptomic studies of prescreened databases such as TCGA underestimate. Further research is needed to evaluate whether this is a feature of liver metastases or is found across all stages of CRC tumors. We found considerable variation between tumor purity estimates from different in silico tools as well as from pathologist estimates. Therefore, molecular assays, both genomic and transcriptomic, and in silico tools should be employed together with histopathological assessment to estimate tumor purity more accurately.

Data Availability

All relevant data are within the paper. Whole-exome sequences of patients cannot be made publicly available for patient confidentiality reasons. This data is deposited in the dbGaP repository (accession number phs003059), for restricted access for researchers who meet the dbGaP criteria for access to confidential data.

Funding Statement

This work was supported by a University of California Cancer Research Coordinating Committee award #C21CR2154 to RSW (https://ucop.edu/research-initiatives/programs/crcc/index.html) and in part by a gift from the Edmund Wattis Littlefield Foundation to RSW. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

1. Hanahan D, Coussens LM. Accessories to the Crime: Functions of Cells Recruited to the Tumor Microenvironment. Cancer Cell. 2012 Mar 20;21(3):309–22. doi: 10.1016/j.ccr.2012.02.022
2. Huijbers A, Tollenaar RAEM, van Pelt GW, Zeestraten ECM, Dutton S, McConkey CC, et al. The proportion of tumor-stroma as a strong prognosticator for stage II and III colon cancer patients: validation in the VICTOR trial. Ann Oncol. 2013 Jan 1;24(1):179–85.
3. West NP, Dattani M, McShane P, Hutchins G, Grabsch J, Mueller W, et al. The proportion of tumour cells is an independent predictor for survival in colorectal cancer patients. Br J Cancer. 2010 May 11;102(10):1519–23. doi: 10.1038/sj.bjc.6605674
4. Calon A, Lonardo E, Berenguer-Llergo A, Espinet E, Hernando-Momblona X, Iglesias M, et al. Stromal gene expression defines poor-prognosis subtypes in colorectal cancer. Nat Genet. 2015 Apr;47(4):320–9. doi: 10.1038/ng.3225
5. Cheng J, He J, Wang S, Zhao Z, Yan H, Guan Q, et al. Biased Influences of Low Tumor Purity on Mutation Detection in Cancer. Front Mol Biosci. 2020;7:343. doi: 10.3389/fmolb.2020.533196
6. Yadav VK, De S. An assessment of computational methods for estimating purity and clonality using genomic data derived from heterogeneous tumor tissue samples. Brief Bioinform. 2015 Mar;16(2):232–41. doi: 10.1093/bib/bbu002
7. Haider S, Tyekucheva S, Prandi D, Fox NS, Ahn J, Xu AW, et al. Systematic Assessment of Tumor Purity and Its Clinical Implications. JCO Precis Oncol. 2020 Sep 4;4:PO.20.00016. doi: 10.1200/PO.20.00016
8. Aran D, Sirota M, Butte AJ. Systematic pan-cancer analysis of tumour purity. Nat Commun. 2015 Dec 4;6(1):8971. doi: 10.1038/ncomms9971
9. Carter SL, Cibulskis K, Helman E, McKenna A, Shen H, Zack T, et al. Absolute quantification of somatic DNA alterations in human cancer. Nat Biotechnol. 2012 May;30(5):413–21. doi: 10.1038/nbt.2203
10. Favero F, Joshi T, Marquard AM, Birkbak NJ, Krzystanek M, Li Q, et al. Sequenza: allele-specific copy number and mutation profiles from tumor sequencing data. Ann Oncol. 2015 Jan;26(1):64–70. doi: 10.1093/annonc/mdu479
11. Riester M, Singh AP, Brannon AR, Yu K, Campbell CD, Chiang DY, et al. PureCN: copy number calling and SNV classification using targeted short read sequencing. Source Code Biol Med. 2016 Dec 15;11(1):13. doi: 10.1186/s13029-016-0060-z
12. Kondo J, Ekawa T, Endo H, Yamazaki K, Tanaka N, Kukita Y, et al. High-throughput screening in colorectal cancer tissue-originated spheroids. Cancer Sci. 2019 Jan;110(1):345–55. doi: 10.1111/cas.13843
13. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009 Jul 15;25(14):1754–60. doi: 10.1093/bioinformatics/btp324
14. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009 Aug 15;25(16):2078–9. doi: 10.1093/bioinformatics/btp352
15. Poplin R, Ruano-Rubio V, DePristo MA, Fennell TJ, Carneiro MO, Van der Auwera GA, et al. Scaling accurate genetic variant discovery to tens of thousands of samples [Internet]. Genomics; 2017 Nov [cited 2021 Dec 13]. Available from: http://biorxiv.org/lookup/doi/10.1101/201178
16. Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010 Sep;38(16):e164. doi: 10.1093/nar/gkq603
17. Landrum MJ, Lee JM, Benson M, Brown GR, Chao C, Chitipiralla S, et al. ClinVar: improving access to variant interpretations and supporting evidence. Nucleic Acids Res. 2018 Jan 4;46(D1):D1062–7. doi: 10.1093/nar/gkx1153
18. Fujii M, Sato T. Somatic cell-derived organoids as prototypes of human epithelial tissues and diseases. Nat Mater. 2021 Feb;20(2):156–69. doi: 10.1038/s41563-020-0754-0
19. Drost J, Clevers H. Organoids in cancer research. Nat Rev Cancer. 2018 Jul;18(7):407–18. doi: 10.1038/s41568-018-0007-6
20. Muzny DM, Bainbridge MN, Chang K, Dinh HH, Drummond JA, Fowler G, et al. Comprehensive molecular characterization of human colon and rectal cancer. Nature. 2012 Jul;487(7407):330–7. doi: 10.1038/nature11252
21. Yaeger R, Chatila WK, Lipsyc MD, Hechtman JF, Cercek A, Sanchez-Vega F, et al. Clinical Sequencing Defines the Genomic Landscape of Metastatic Colorectal Cancer. Cancer Cell. 2018 Jan 8;33(1):125–136.e3. doi: 10.1016/j.ccell.2017.12.004
22. Mendelaar PAJ, Smid M, van Riet J, Angus L, Labots M, Steeghs N, et al. Whole genome sequencing of metastatic colorectal cancer reveals prior treatment effects and specific metastasis features. Nat Commun. 2021 Jan 25;12(1):574. doi: 10.1038/s41467-020-20887-6
23.Lotti F, Jarrar AM, Pai RK, Hitomi M, Lathia J, Mace A, et al. Chemotherapy activates cancer-associated fibroblasts to maintain colorectal cancer-initiating cells by IL-17A. J Exp Med. 2013. Dec 16;210(13):2851–72. doi: 10.1084/jem.20131195 [DOI] [PMC free article] [PubMed] [Google Scholar]

PLoS One. doi: 10.1371/journal.pone.0271354.r001

Decision Letter 0

Elizabeth Christie

17 Feb 2022

PONE-D-21-40240

Metastatic colorectal adenocarcinoma tumor purity assessment from whole exome sequencing data

PLOS ONE

Dear Dr. Timmerman,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please address all of the reviewers' comments, and consider expanding the Data Availability Statement to clarify what criteria researchers must meet to obtain the WES data. Alternatively, consider depositing the WES data in a repository for controlled-access data.

Please submit your revised manuscript by Apr 03 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Elizabeth Christie

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For more information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

3. PLOS requires an ORCID iD for the corresponding author in Editorial Manager on papers submitted after December 6th, 2016. Please ensure that you have an ORCID iD and that it is validated in Editorial Manager. To do this, go to ‘Update my Information’ (in the upper left-hand corner of the main menu), and click on the Fetch/Validate link next to the ORCID field. This will take you to the ORCID site and allow you to create a new iD or authenticate a pre-existing iD in Editorial Manager. Please see the following video for instructions on linking an ORCID iD to your Editorial Manager account: https://www.youtube.com/watch?v=_xcclfuvtxQ


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: No

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Tbeileh et al. studied the proportion of stromal cells in hepatic CRC metastases by performing computational tumor purity analysis based on whole exome sequencing data. They used an unbiased in-house collection of tumor specimens. Matching tumor-derived organoids were also utilized as a mostly pure cancer control. They applied three in silico tumor purity tools: ABSOLUTE, Sequenza and PureCN. They found that patient metastasis specimens had a median tumor purity below 50%. In line with this, some of the VAFs of oncogenes and tumor suppressor genes were undetectable or low in the patient tumors, although a positive correlation was observed between VAFs and in silico tumor purity estimates. Sequenza and PureCN produced concordant results whereas ABSOLUTE yielded lower purity estimates for all samples.

Focusing on metastatic CRC with unbiased samples and showing lower cancer cell purity than expected are interesting findings. There are concerns, as follows.

Comments

The platforms of tumor purity estimation in this study are relatively old. Please explain why they applied these three methods among many others.

Doi: 10.1093/bioinformatics/btz406.

DOI: 10.1200/PO.20.00016 JCO

Organoids were used as a control of mostly pure cancer cells in this study. Since it is difficult to understand the context, it should be briefly noted in abstract why the organoids were used.

The method for organoid preparation and culture should be described in more detail, especially the timing of sampling after preparation or culture. A longer culture period might select for cancer cells adapted to the culture conditions.

The timings of the operations after chemotherapy should be shown.

Histopathological analysis should be described in more detail, since it makes a big difference whether the estimation is based on tumor/stromal ‘area’ or ‘cells’. It would be better to show representative HE images along with an explanation of how the histopathological analysis was performed (per area, per cell, and so on). It would also be better to perform the histopathological analysis at multiple sites to show the heterogeneous cancer cell purity within one tumor.

P3L42 Since the molecular/computational and histopathological estimations are not consistent in this study, the conclusion of the abstract, which says only that combining the analyses is important, is not appropriate.

P6L91-93

Please describe the relationship between snap frozen samples and fresh samples. Were they from the same samples or from different sites?

P10L180

‘tumor’ would be better as ‘cancer cells’

P11-P12

The purity estimation results are described as ‘low’ and ‘high’. Please make the criteria for low and high clearer.

Table 2

Since this table is not compared with the other types of estimation, it is difficult for a reader to interpret the data. It would be better to show the tumor purity in Figure 1 as well, at least.

It seems that the histopathological estimation is performed by areas but not cells. If so, it is not surprising that the estimation by ‘area’ is not consistent with the estimation by ‘cells’.

In addition to CR1107, CR1121 is also consistent with in silico estimation.

Figure 1C

ABSOLUTE might yield lower purity estimates for higher purity samples (Figure 1B), although when the purity is lower, it might be superior to the others (Figure 1C). Therefore, the conclusion ‘ABSOLUTE yielded lower purity estimates for all samples’ might not be appropriate.

P14L249, Figure 5

It is hard to say ‘a trend’ from this data.

Figure 3

Please explain the discrepancy of APC mutation of CR1106 between A and B
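
Reviewer #1's point above about estimation by ‘area’ versus ‘cells’ can be made concrete with a small sketch. It assumes, purely for illustration, that tumor and stromal regions differ in nucleated-cell density; the function name and the density values are hypothetical and are not taken from the manuscript.

```python
def tumor_cell_fraction(tumor_area_fraction, tumor_density, stroma_density):
    """Convert an area-based tumor fraction into a cell-based fraction.

    tumor_area_fraction: share of the section area occupied by tumor (0..1).
    tumor_density, stroma_density: nucleated cells per unit area in each
    compartment (hypothetical inputs; any consistent unit works).
    """
    if not 0.0 <= tumor_area_fraction <= 1.0:
        raise ValueError("tumor_area_fraction must be between 0 and 1")
    if tumor_density <= 0 or stroma_density <= 0:
        raise ValueError("densities must be positive")
    tumor_cells = tumor_area_fraction * tumor_density
    stroma_cells = (1.0 - tumor_area_fraction) * stroma_density
    return tumor_cells / (tumor_cells + stroma_cells)


if __name__ == "__main__":
    # Equal densities: area fraction and cell fraction coincide.
    print(tumor_cell_fraction(0.5, 1.0, 1.0))
    # Stroma three times as cell-dense as tumor (illustrative assumption):
    # the same 50% tumor area corresponds to a lower tumor cell fraction.
    print(tumor_cell_fraction(0.5, 1.0, 3.0))
```

Under these assumptions, sequencing-based purity, which reflects the share of DNA-contributing cells, would only match an area-based histological estimate when the two compartments have similar cell densities.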

Reviewer #2: In this manuscript Tbeileh et al. explore the use of three in silico tools which utilise WES data to assess the tumour purity in liver mets associated with colorectal cancer. Whilst the premise is a good one, the limited number of samples overall, and the limited number with pathology review, make it difficult to draw significant conclusions. I recommend that the authors revise the manuscript and incorporate additional samples in their analysis.

1. Pathologist assessment was performed on only 6 cases. This sample size is rather small and, given the inconsistencies in the performance of the 3 tools, it is very difficult to draw any conclusions. This aspect of the work would be strengthened with a larger number of samples. This would then allow the authors to confirm (or not) the trends described.

2. The in silico tumour purity values for Sequenza and PureCN were similar for most samples but were very different for cases PDXO1107 and PDXO1123. It is reasonable to assume (as also mentioned by the authors) that organoid samples which have been enriched for epithelial cells would have a reasonably high purity. Can the authors comment on this difference, explain the low purity levels estimated by PureCN, and comment on the accuracy of this tool?

3. Sample CR1106 was estimated to have 3% tumour and 62% stroma by the pathologist. This was the lowest of all the samples. The VAF for APC of the associated organoid was very high but was not detected at all in the original tumour. Do the authors believe that there is a minimum purity threshold that needs to be met for the tools to be useful?

4. Fig 4 shows the correlation between the VAF of the most prevalent pathogenic variant for each sample and the in silico tumour purity results. Can the authors provide details of what the variant was for each sample?

5. Can the authors explain why they chose to use a mix of PDOs (n=2) and PDXOs (n=4)? Again, these numbers are small and introduce additional variables.
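
The relationship underlying Reviewer #2's comments 3 and 4, between tumor purity and the VAF of a clonal somatic variant, can be sketched as follows. This is the standard mixture expectation rather than the authors' analysis code: for purity p, m mutant copies per cancer cell, tumor copy number C_t and normal copy number C_n at the locus, the expected VAF is p·m / (p·C_t + (1 − p)·C_n). The function names, the default copy numbers and the sequencing depth of 100 reads are assumptions made for illustration.

```python
def expected_vaf(purity, mutant_copies=1, tumor_cn=2, normal_cn=2):
    """Expected VAF of a clonal somatic variant in a tumor/normal mixture."""
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    if mutant_copies > tumor_cn:
        raise ValueError("mutant_copies cannot exceed tumor_cn")
    # Total alleles at the locus contributed by cancer and non-cancer cells.
    total_alleles = purity * tumor_cn + (1.0 - purity) * normal_cn
    if total_alleles == 0:
        raise ValueError("no alleles at this locus")
    return purity * mutant_copies / total_alleles


def expected_alt_reads(purity, depth, **locus):
    """Expected number of variant-supporting reads at a given read depth."""
    return expected_vaf(purity, **locus) * depth


if __name__ == "__main__":
    # Purities quoted in the text: 3% (pathologist estimate for CR1106),
    # 30% and 94% (median computational estimates from the abstract).
    for purity in (0.03, 0.30, 0.94):
        vaf = expected_vaf(purity)  # heterozygous variant, diploid locus
        reads = expected_alt_reads(purity, depth=100)  # depth is hypothetical
        print(f"purity={purity:.2f} VAF={vaf:.3f} alt_reads_at_100x={reads:.1f}")
```

Under these assumptions, a heterozygous variant in a diploid region at 3% purity is expected at a VAF of about 1.5%, which corresponds to only one or two supporting reads at the hypothetical depth of 100, so a variant that is clearly present in the organoid could plausibly go undetected in the parent tumor.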

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2023 Apr 6;18(4):e0271354. doi: 10.1371/journal.pone.0271354.r002

Author response to Decision Letter 0

12 May 2022

We have uploaded a reviewer response document, a revised manuscript with and without the changes tracked, and a new set of figures in the proper format per the editor's request. We have no changes to make to the financial disclosures. The data availability statement has been changed to indicate that the data will be housed at dbGaP for limited general research use (GRU). Some limitation is necessary since our data are whole exome sequences, which, with some effort, may be used to identify an individual.

Our manuscript meets PLOS ONE's style requirements, and my ORCID number is correct. We added a section at the end of the methods section of the manuscript titled "Data Availability" which states that the data is (will be) housed at the NIH web site named dbGaP, per an email from Radovan Lumban-Tobing, and another from Edrian Nim Tolentino. Per their request we have also added consent information into the manuscript.

Specific reviewer's comments are addressed in the document "Response to Reviewers", uploaded today.

Attachment

Submitted filename: Response to Reviewers.docx

Additional data file (27.2KB, docx)

PLoS One. doi: 10.1371/journal.pone.0271354.r003

Decision Letter 1

Elizabeth Christie

31 May 2022

PONE-D-21-40240R1

Metastatic colorectal adenocarcinoma tumor purity assessment from whole exome sequencing data

PLOS ONE

Dear Dr. Timmerman,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but there are some minor points raised by Reviewer #1 to be addressed. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Jul 15 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

We look forward to receiving your revised manuscript.

Kind regards,

Elizabeth Christie

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

Reviewer #2: (No Response)

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: (No Response)

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: Yes

Reviewer #2: (No Response)

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

Reviewer #2: (No Response)

**********

6. Review Comments to the Author

Reviewer #1: I should have mentioned them in the first comments, but I hope the following comments will help to make the manuscript better.

Abstract

1) Overall, metastatic specimens had a median tumor purity below 50% whereas the organoids were enriched for cancer cells with purity estimates above 90%.

As the author added in the revised manuscript, adding ‘according to all computational methods’ would be better than just ‘overall’. Where is the value of ‘90%’ described in the results? Actually, the average of Figure 1B is 70%.

2) In line with this, VAFs of oncogenes and tumor suppressor genes were undetectable or low in patient tumors, but high in matching organoid cultures.

Since some of the VAFs are 100% in Figure 4, it would be better to put ‘in some cases’ in this sentence.

3) Figure

In Figure 5F, the x-axis label ‘in silico tumor purity % (PureCN)’ would be better than just ‘PureCN’.

Reviewer #2: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

PLoS One. 2023 Apr 6;18(4):e0271354. doi: 10.1371/journal.pone.0271354.r004

Author response to Decision Letter 1

15 Jun 2022

Response to Reviewers' comments for the manuscript: Metastatic colorectal adenocarcinoma purity assessment from whole exome sequencing

Reviewer #1: I should have mentioned them in the first comments, but I hope the following comments will help to make the manuscript better.

1. Overall, metastatic specimens had a median tumor purity below 50% whereas the organoids were enriched for cancer cells with purity estimates above 90%.

We thank the reviewer for pointing out this error. We neglected to update this part of the abstract when we re-analyzed data for our previous resubmission. We have now modified the abstract text to show the purity median for all patient samples and for all organoids (across all computational methods), line 37. Median tumor purity results are given separately for each computational tool in the results section. Thus we have harmonized the description of the analysis results in the abstract with the analysis in the body of the manuscript.
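
The distinction drawn in this response, between a median pooled across all computational methods and medians reported separately per tool, can be sketched as follows. The purity values are hypothetical placeholders, not data from the study, and the function names are illustrative only.

```python
from statistics import mean, median

# Hypothetical purity estimates (fractions) per tool; NOT values from the study.
EXAMPLE_ESTIMATES = {
    "ABSOLUTE": [0.20, 0.25, 0.30],
    "Sequenza": [0.30, 0.35, 0.60],
    "PureCN": [0.30, 0.40, 0.55],
}


def per_tool_medians(estimates):
    """Median purity for each tool separately."""
    return {tool: median(values) for tool, values in estimates.items()}


def pooled_values(estimates):
    """All purity estimates from all tools in one flat list."""
    return [value for values in estimates.values() for value in values]


if __name__ == "__main__":
    pooled = pooled_values(EXAMPLE_ESTIMATES)
    print("per-tool medians:", per_tool_medians(EXAMPLE_ESTIMATES))
    print("pooled median:", median(pooled))
    # The mean can differ from the median when estimates are skewed,
    # which is why the summary statistic should be named explicitly.
    print("pooled mean:", round(mean(pooled), 3))
```

With skewed estimates the pooled mean and pooled median diverge, which is the kind of mismatch the reviewer flagged when comparing the abstract with Figure 1B.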

2. In line with this, VAFs of oncogenes and tumor suppressor genes were undetectable or low in patient tumors, but high in matching organoid cultures.

Since some of the VAFs are 100% in Figure 4, it would be better to put ‘in some cases’ in this sentence.

We are unsure where in this sentence the reviewer wishes to include ‘in some cases’. We slightly modified the sentence in the abstract and hope it is clearer now, line 40.

3. In Figure 5F, the x-axis label ‘in silico tumor purity % (PureCN)’ would be better than just ‘PureCN’.

We have modified the figure legend as suggested.

Attachment

Submitted filename: Response to Reviewers.docx

Additional data file (16.1KB, docx)

PLoS One. doi: 10.1371/journal.pone.0271354.r005

Decision Letter 2

Elizabeth Christie

29 Jun 2022

Metastatic colorectal adenocarcinoma tumor purity assessment from whole exome sequencing data

PONE-D-21-40240R2

Dear Dr. Timmerman,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Elizabeth Christie

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

Reviewer #1: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

Reviewer #1: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

Reviewer #1: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

Reviewer #1: Yes

**********

6. Review Comments to the Author

Reviewer #1: All comments have been well addressed.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

**********

PLoS One. doi: 10.1371/journal.pone.0271354.r006

Acceptance letter

Elizabeth Christie

28 Mar 2023

PONE-D-21-40240R2

Metastatic colorectal adenocarcinoma tumor purity assessment from whole exome sequencing data

Dear Dr. Timmerman:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Elizabeth Christie

Academic Editor

PLOS ONE


[pone.0271354.ref019] 19.Drost J, Clevers H. Organoids in cancer research. Nat Rev Cancer. 2018. Jul;18(7):407–18. doi: 10.1038/s41568-018-0007-6 [DOI] [PubMed] [Google Scholar]

[pone.0271354.ref020] 20.Muzny DM, Bainbridge MN, Chang K, Dinh HH, Drummond JA, Fowler G, et al. Comprehensive molecular characterization of human colon and rectal cancer. Nature. 2012. Jul;487(7407):330–7. doi: 10.1038/nature11252 [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0271354.ref021] 21.Yaeger R, Chatila WK, Lipsyc MD, Hechtman JF, Cercek A, Sanchez-Vega F, et al. Clinical Sequencing Defines the Genomic Landscape of Metastatic Colorectal Cancer. Cancer Cell. 2018. Jan 8;33(1):125–136.e3. doi: 10.1016/j.ccell.2017.12.004 [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0271354.ref022] 22.Mendelaar PAJ, Smid M, van Riet J, Angus L, Labots M, Steeghs N, et al. Whole genome sequencing of metastatic colorectal cancer reveals prior treatment effects and specific metastasis features. Nat Commun. 2021. Jan 25;12(1):574. doi: 10.1038/s41467-020-20887-6 [DOI] [PMC free article] [PubMed] [Google Scholar]

[pone.0271354.ref023] 23.Lotti F, Jarrar AM, Pai RK, Hitomi M, Lathia J, Mace A, et al. Chemotherapy activates cancer-associated fibroblasts to maintain colorectal cancer-initiating cells by IL-17A. J Exp Med. 2013. Dec 16;210(13):2851–72. doi: 10.1084/jem.20131195 [DOI] [PMC free article] [PubMed] [Google Scholar]


