PLOS ONE. 2020 Dec 17;15(12):e0241148. doi: 10.1371/journal.pone.0241148

Transcriptomic and proteomic intra-tumor heterogeneity of colorectal cancer varies depending on tumor location within the colorectum

Sigrid Salling Árnadóttir 1,2, Trine Block Mattesen 1,2, Søren Vang 1,2, Mogens Rørbæk Madsen 3, Anders Husted Madsen 2,3, Nicolai Juul Birkbak 1,2, Jesper Bertram Bramsen 1,2,, Claus Lindbjerg Andersen 1,2,‡,*
Editor: Amanda Ewart Toland 4
PMCID: PMC7746197  PMID: 33332369

Abstract

Background

Intra-tumor heterogeneity (ITH) of colorectal cancer (CRC) complicates molecular tumor classification, such as transcriptional subtyping. Differences in cellular states, biopsy cell composition, and tumor microenvironment may all lead to ITH. Here we analyze ITH at the transcriptomic and proteomic levels to ascertain whether subtype discordance between multiregional biopsies reflects relevant biological ITH or lack of classifier robustness. Further, we study the impact of tumor location on ITH.

Methods

Multiregional biopsies from stage II and III CRC tumors were analyzed by RNA sequencing (41 biopsies, 14 tumors) and multiplex immune protein analysis (89 biopsies, 29 tumors). CRC subtyping was performed using consensus molecular subtypes (CMS), CRC intrinsic subtypes (CRIS), and TUMOR types. ITH-scores and network maps were defined to determine the origin of heterogeneity. A validation cohort was used with one biopsy per tumor (162 tumors).

Results

Overall, inter-tumor transcriptional variation exceeded ITH, and subtyping calls were frequently concordant between multiregional biopsies. Still, some tumors had high transcriptional ITH and were classified discordantly. Subtyping of proximal MSS tumors was discordant for 50% of the tumors, and this ITH was related to differences in the microenvironment. Subtyping of distal MSS tumors was less discordant, and here the ITH was more cancer-cell related. The subtype discordancy reflected actual molecular ITH within the tumors. The relevance of the subtypes was reflected at the protein level, where several inflammation markers were significantly increased in immune-related transcriptional subtypes, a finding that was verified in an independent cohort (Wilcoxon rank sum test; p<0.05). Unsupervised hierarchical clustering of the protein data identified large ITH at the protein level, as the multiregional biopsies clustered together for only 9 out of 29 tumors.

Conclusion

Our transcriptomic and proteomic analyses show that tumor location along the colorectum influences the ITH of CRC, which in turn influences the concordance of subtyping.

Introduction

Colorectal cancer (CRC) is one of the leading causes of cancer-related deaths worldwide [1]. UICC TNM staging divides the disease into prognostic subgroups; however, within each subgroup there is great variability in response to therapy and clinical outcome [2]. In accordance with this, recent molecular studies have shown great inter-tumor heterogeneity within each UICC TNM stage [3–5]. It has been suggested that this heterogeneity complicates development of novel treatment strategies and biomarkers [6, 7].

The heterogeneity may partly be rooted in human embryogenesis, as different embryonic layers give rise to the proximal and the distal colon. Other factors related to the location in the colon may also play a role in forming the heterogeneity, such as the different physiological functions of the proximal colon, the distal colon, and the rectum. Furthermore, there are differences in bacterial composition and basal immune activity from proximal colon to rectum [8–10]. Many CRC features are distinct for tumors of the proximal colon compared to the distal colon or rectum [11] (for reviews see [12–15]). Hypermutated tumors with microsatellite instability (MSI) are frequently located in the proximal colon, while tumors in the distal colon or rectum commonly exhibit microsatellite stability (MSS) and chromosomal instability (CIN) [16]. At the same time, infiltration of cytotoxic T cells has been linked to a good prognosis in tumors located in the proximal colon, but not in tumors of the distal colon [17].

Besides this inter-tumor heterogeneity, multiregional biopsy studies have shown that CRC commonly exhibits intra-tumor heterogeneity (ITH), which increases the complexity even further. Most studies of ITH have been performed with a cancer-cell-centric focus, with emphasis on genetic sub-clones of cancer cells [18–20]. However, ITH would also be expected at the transcriptional level, due to local differences within the tumor in both the cancer cells and the tumor microenvironment (TME). While genetic profiles are relatively stable, transcriptional profiles change during the cell cycle, during cell differentiation, and in response to local signaling. Hence it may be speculated that transcriptional ITH is highly dynamic and that it varies depending on tumor location.

In the last decade, several methods have been developed for classifying CRC tumors into homogeneous molecular subtypes based on transcriptional profiling. Some of these methods use cancer-cell-specific transcripts, while others use all transcripts, from immune, stromal, and cancer cells. Isella and colleagues developed a subtyping approach, the CRC Intrinsic Subtypes (CRIS), which was based solely on cancer-cell-related transcripts [5]. CRIS subtypes have been reported to be robust across multiregional biopsies, because they are not influenced by the contribution from the TME [21]. The Consensus Molecular Subtypes (CMS) divide CRC tumors into four subtypes without considering the origin of the RNA [3]. Our previously published TUMOR type (TT) classifier also uses all transcripts for classification, but distinguishes between cancer, stroma, and immune cell transcripts [4]. Hence, combining these three classifiers may enhance our understanding of the origin of transcriptional ITH within tumors.

The primary aim of this study was to characterize transcriptional ITH of CRC by sampling multiregional biopsies from each tumor. By combining three subtyping approaches we obtain information about each biopsy from each tumor, and thereby insight into the characteristics of the cancer cells and the TME in each tumor area. In this way, subtyping becomes a useful tool for understanding the origin of ITH and whether it changes depending on the tumor location within the colorectum. The origin of the heterogeneous transcripts was further analyzed through calculation of tumor-specific ITH-scores and generation of network maps. In addition, we explored the relation between subtyping, tumor location, and inflammation at the protein level.

Materials and methods

Study design

The aim of this study was to characterize transcriptomic and proteomic ITH of CRC using multiregional biopsies from primary tumors. From the transcriptomic data, the degree of ITH, the biology behind ITH, and the impact of tumor location were explored.

Patient and sample collection

Previously untreated patients diagnosed with TNM stage II and III colorectal tumors larger than 3 cm in diameter were consecutively enrolled at The Surgical Research Unit at Herning Regional Hospital, Denmark, in the period from 2014 to 2017. The study was approved by the Central Denmark Regional Committees on Health Ethics (J. no. 1-10-72-221-14), and all patients gave written informed consent. From each tumor, three to five samples were collected from spatially distinct sites of the luminal surface to address ITH. Samples were collected immediately after surgery, snap-frozen in liquid nitrogen, and stored at -80°C for later analysis. For a sample overview, see S1 Table.

MSI status

The MSI status was determined with a pentaplex polymerase chain reaction of quasimonomorphic mononucleotide repeats [22]. Tumors were defined as MSI if >3 out of 5 PCR markers were positive, as previously described [4].

RNA purification, sequencing, and data processing

RNA was purified using the RNeasy mini kit (Qiagen, Hilden, Germany). RNA quality was assessed using Agilent RNA 6000 Nano/Pico kits on an Agilent 2100 Bioanalyzer (Agilent Technologies, CA, USA). RNA concentration was measured using the Qubit RNA HS assay kit (Thermo Fisher Scientific, MA, USA). RNA sequencing was performed at the NGS Core facility, Department of Molecular Medicine, Aarhus University Hospital, as previously described [23]. In short, ribosomal RNA was removed using the Ribo-Zero Gold rRNA Removal Kit (Illumina, CA, USA), leaving both coding and non-coding RNA for whole-transcriptome sequencing. Synthesis of directional libraries for paired-end sequencing was performed using the ScriptSeq v2 RNA-seq Library Preparation Kit (Illumina) following the manufacturer’s instructions. A minimum of 34 million read pairs (median 65 million read pairs) were sequenced per sample on an Illumina NextSeq500 using high output flow cells (Illumina). Data processing of the paired raw sequence reads was performed using TopHat2 [24], with mapping to the human reference genome hg19. FPKM values were calculated using Cufflinks [25], while raw read counts were calculated using HTSeq and Gencode v19 transcript information [26]. RNA data from four patients have been published before [18]; however, the raw RNA sequencing data were reanalyzed with an updated pipeline.

RNA sequencing of TNM stage II and III samples from the validation cohort was performed by polyA-sequencing as previously described [4].

Protein extraction

Proteins were extracted from 10–15 tissue slices of 10 μm each. These were transferred to an Eppendorf tube containing 200 μl cold RIPA buffer supplemented with freshly added 1 mM phenylmethylsulfonyl fluoride (PMSF) and 1x protease inhibitor cocktail (PIC) (10x, P8320; Sigma-Aldrich, MO, USA). After vortexing (1 min), the samples were incubated on ice for 30 minutes, followed by centrifugation at 14,000 rpm for 10 minutes at 4°C. The supernatant was transferred into new cold tubes and stored at -80°C. Protein quantification was performed using the Pierce™ BCA Protein Assay Kit (Thermo Fisher Scientific) following the manufacturer’s protocol. Sample aliquots were diluted in RIPA buffer with PMSF and PIC to obtain 0.4 μg/μl. These were transferred into a 96-well plate and stored at -80°C until analysis.
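The dilution to 0.4 μg/μl follows from C1·V1 = C2·V2. The sketch below is a hypothetical helper for that arithmetic, not part of the published protocol; the function name and the example stock concentration are assumptions made for illustration.

```python
def diluent_volume_ul(stock_conc_ug_per_ul, stock_volume_ul, target_conc_ug_per_ul=0.4):
    """Return the buffer volume (in ul) to add so the aliquot reaches the target concentration."""
    if stock_conc_ug_per_ul < target_conc_ug_per_ul:
        raise ValueError("stock is already more dilute than the target")
    # C1 * V1 = C2 * V2  ->  V2 = C1 * V1 / C2
    final_volume_ul = stock_conc_ug_per_ul * stock_volume_ul / target_conc_ug_per_ul
    return final_volume_ul - stock_volume_ul


# Hypothetical aliquot: 10 ul of extract measured at 2.0 ug/ul by BCA.
buffer_ul = diluent_volume_ul(2.0, 10.0)
```

In this invented case the aliquot would be brought to a final volume of 50 μl by adding 40 μl of supplemented RIPA buffer.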

Multiplex proximity extension assay (PEA) analysis and quality control

Multiplex PEA analysis was performed by technicians at Olink Proteomics (Uppsala, Sweden). In short, protein samples (0.4 μg/μl) in RIPA buffer were mixed with oligonucleotide-labeled antibodies from the Immuno-Oncology panel covering 96 proteins. Each protein is targeted by two antibodies; when these bind their target protein and thereby come into proximity of each other, their oligonucleotides anneal and a PCR target sequence is formed. This was followed by an amplification reaction, and the results were subsequently measured using standard qPCR. From the resulting Ct values, normalized protein expression (NPX) values were calculated using the Olink® NPX Manager, which uses both internal and external controls for normalization. All flagged (failed) samples and all proteins with an overall detection level below 75% were removed prior to data analysis.
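The 75% detection filter can be sketched as follows. This is a minimal illustration that assumes "detection level" means the fraction of samples whose NPX value exceeds a per-protein limit of detection (LOD); the text does not define it, and the protein names, NPX values, and LOD values are invented.

```python
def detection_rate(npx_values, lod):
    """Fraction of samples whose NPX value lies above the limit of detection."""
    above = sum(1 for value in npx_values if value > lod)
    return above / len(npx_values)


def filter_proteins(npx_by_protein, lod_by_protein, min_rate=0.75):
    """Keep only proteins detected in at least min_rate of the samples."""
    return {
        protein: values
        for protein, values in npx_by_protein.items()
        if detection_rate(values, lod_by_protein[protein]) >= min_rate
    }


# Toy data (assumed): four samples, two hypothetical proteins.
npx = {"PROT_A": [5.1, 4.8, 6.0, 5.5], "PROT_B": [0.2, 0.1, 1.9, 0.3]}
lod = {"PROT_A": 1.0, "PROT_B": 1.0}
kept = filter_proteins(npx, lod)
```

Here PROT_B is detected in only one of four samples and is dropped, while PROT_A is retained.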

Clustering and heatmaps

For the RNA heatmap, only protein coding genes were included (gene list obtained online from [27]). Log2(CPM) values were calculated based on TMM normalized raw counts using edgeR [28], with a filter of log2(CPM)>1. For the protein heatmap, NPX values were used for all proteins from the immuno-oncology panel with confident detection (n = 68). For both approaches, row z-scores were calculated for each gene/protein. Unsupervised clustering and heatmaps were generated using the function aheatmap from the R package NMF [29], with 1 - Pearson’s correlation as the distance measure and Ward.D2 linkage. Bootstrapping of the clustering was performed to evaluate the significance of patient-specific clusters using the pvclust R package (1000 repetitions), and significance was estimated by Approximately Unbiased (AU) p-values: clusters were considered significant for AU values ≥95, which corresponds to a significance p-value ≤0.05 [30].
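The two numerical ingredients of the heatmaps, row z-scores and the 1 - Pearson correlation distance, can be written out in a few lines. The sketch below is an illustration in plain Python, not the R code used in the study; it assumes the sample standard deviation (n - 1 denominator) for the z-scores and leaves out the Ward linkage and bootstrapping steps, which were carried out with the R packages named above.

```python
import math
import statistics


def row_zscores(values):
    """Center one gene/protein row on its mean and scale by its sample SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def pearson_distance(x, y):
    """Return 1 - Pearson correlation between two equally long profiles."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


z = row_zscores([1.0, 2.0, 3.0])
d_same = pearson_distance([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])      # perfectly correlated
d_opposite = pearson_distance([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])  # perfectly anti-correlated
```

The distance ranges from 0 for perfectly correlated profiles to 2 for perfectly anti-correlated ones, so samples are grouped by the shape of their expression profile rather than by absolute level.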

Subtype classification

Molecular subtypes were assigned to each sample based on FPKM values. CMS subtypes were assigned using the nearest-centroid single sample predictor CMS classifier and log2 transformed FPKM values [3]. CRIS subtypes were assigned using the CRIS classifier (the predictCRIS-ClassKTSP function) in R (available online from [5]). TT were assigned using the Tumor Subtype Classifier [4]. Caleydo Stratomex was used to visualize the connection between subtypes [31].

ITH-score and tumor specific network maps

An ITH-score was calculated for each gene based on differences in RNA expression between multiregional biopsies from each tumor. This was done by calculating the standard deviation (STD) of log2(CPM) values for each gene within each tumor. A high ITH-score was defined as STD >0.5. Stroma scores for each gene were defined based on the xenograft studies by Isella et al. [32]. Genes with Stroma scores >0.5 were considered “Stromal genes”. The gene copy number alteration (CNA) scores were defined for each gene as the standard deviation of the GISTIC2 copy numbers of all samples within the COREAD cohort available at the UCSC XENA Public Data Hubs [33]. Genes with CNA scores >0.5 were considered “CNA genes” (i.e. affected by chromosomal copy number alterations in CRC). Genes were defined as housekeeping genes if included on the “Human housekeeping genes revisited” list published by Eisenberg et al. [34].
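The ITH-score definition can be illustrated with a short sketch. It assumes the sample standard deviation (n - 1 denominator), which the text does not specify; the gene names and log2(CPM) values are invented.

```python
import statistics

HIGH_ITH_THRESHOLD = 0.5  # STD cutoff stated in the text


def ith_scores(log2_cpm_by_gene):
    """Map each gene to the sample SD of its log2(CPM) across one tumor's biopsies."""
    return {gene: statistics.stdev(values) for gene, values in log2_cpm_by_gene.items()}


def high_ith_genes(scores, threshold=HIGH_ITH_THRESHOLD):
    """Return the genes whose ITH-score exceeds the threshold, sorted by name."""
    return sorted(gene for gene, score in scores.items() if score > threshold)


# Toy tumor (assumed values): three biopsies, two hypothetical genes.
tumor = {"GENE_A": [5.0, 5.1, 4.9], "GENE_B": [2.0, 4.0, 6.0]}
scores = ith_scores(tumor)
flagged = high_ith_genes(scores)
fraction_high = len(flagged) / len(scores)
```

Applying the same function to the biopsies of all tumors pooled together would give the corresponding inter-tumor score.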

Single sample gene set enrichment analysis (ssGSEA) was performed for all samples using the ssGSEAProjection module v 9.1.1 [35] of the Genepattern bioinformatics platform [36] using log2(CPM) values and the Molecular Signatures Database (MsigDB) gene set collection v6.2 as input [35]. For tumors with the largest amount of high ITH-score genes, tumor specific network maps were created using the Enrichment Map app [37] and the Cytoscape software [38]. These network maps were based on inputting the 5000 gene sets with the most varying ssGSEA enrichment scores between multiregional biopsies for each tumor. The generated maps were colored according to differences in ssGSEA enrichment score between biopsies using median normalized values for each gene set.
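The selection and coloring steps for the network maps can be sketched as below. Two assumptions are made because the text does not give the details: "most varying" is taken as the highest standard deviation of the ssGSEA score across a tumor's biopsies, and "median normalized" is taken as subtracting each gene set's median across those biopsies. Gene set names and scores are invented.

```python
import statistics


def top_varying_gene_sets(scores_by_set, n_top):
    """Rank gene sets by the SD of their enrichment score across biopsies; keep n_top."""
    ranked = sorted(
        scores_by_set,
        key=lambda name: statistics.stdev(scores_by_set[name]),
        reverse=True,
    )
    return ranked[:n_top]


def median_normalize(values):
    """Express each biopsy's score relative to the gene set's median in this tumor."""
    median = statistics.median(values)
    return [v - median for v in values]


# Toy ssGSEA scores (assumed) for three biopsies of one tumor.
scores = {
    "SET_IMMUNE": [0.10, 0.60, 0.15],
    "SET_METABOLISM": [0.40, 0.38, 0.41],
    "SET_CELL_CYCLE": [0.20, 0.30, 0.45],
}
selected = top_varying_gene_sets(scores, n_top=2)
relative = {name: median_normalize(scores[name]) for name in selected}
```

Positive values in `relative` would then correspond to gene sets that are more active in a biopsy than in the other biopsies of the same tumor, and negative values to less active ones.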

Statistical analysis

All statistics were performed in R with functions from the stats package. Tumors were classified as belonging to the left- or right-side cluster in the protein heatmap based on the majority of biopsies. Fisher’s exact test was performed using fisher.test with default settings. The significance of differences between subtyping groups was tested using the Wilcoxon rank sum test, with the function wilcox.test, paired = FALSE.
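The majority assignment and the Fisher's exact test can be illustrated as follows. This is a sketch in plain Python rather than the R calls used in the study; the two-sided p-value sums the probabilities of all tables that are no more likely than the observed one, which is the usual definition. How ties between clusters were resolved is not stated in the text, so the tie behavior here is an assumption.

```python
from collections import Counter
from math import comb


def majority_cluster(biopsy_clusters):
    """Assign a tumor to the cluster holding most of its biopsies.

    Ties fall to the cluster seen first (an assumption, not from the text).
    """
    return Counter(biopsy_clusters).most_common(1)[0][0]


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k counts in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating point ties.
    return sum(p for p in (prob(k) for k in range(lo, hi + 1)) if p <= observed * (1 + 1e-7))


side = majority_cluster(["left", "left", "right"])
# Invented counts: rows = MSI/MSS tumors, columns = left/right cluster.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```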

Results

ITH of CRC subtypes varies between proximal and distal tumors

To assess how tumor location affects ITH at the level of transcriptional subtypes, CRC subtyping with the CMS, CRIS, and TT classifiers was applied to total RNA sequencing data from multiregional biopsies (n = 3) from 14 tumors. Since these classifiers use different gene sets and approaches for subtyping, they reflect different kinds of ITH; e.g. ITH arising from inter-biopsy differences in the cell type distribution of both cancer epithelium and stromal cells (CMS and TT), or ITH arising from transcriptional differences between the cancer cell populations in the biopsies (CRIS). For MSI tumors from the proximal colon, all three classifiers concordantly called high immune subtypes (CMS1+CRIS-A+SSC) in all biopsies (Fig 1A). For MSS tumors of the proximal colon, the multiregional classifications were less consistent; for these tumors, all three classifiers showed inconsistent classification in 50% of tumors, though not necessarily in the same tumors (Fig 1A and 1B). For tumors of the distal colon and rectum, the rate of concordant calls for the multiregional biopsies was much higher (Fig 1A). The CMS classifier showed 100% concordance and the TT classifier 80%, while the CRIS classifier classified only 40% of the tumors concordantly (Fig 1A and 1B). The low concordance for the cancer-cell-transcript-based CRIS subtypes indicates that the main reasons for transcriptional ITH in the distal tumors are cancer cell related. The proximal and distal colon tumors differed in their overall subtype distribution and in their composition of subtype calls between classifiers (Fig 1C). The proximal colon tumors were typically more immune-related (CMS1, CRIS-A, CRIS-B, and SSC) than the distal colon tumors (Fig 1C). Furthermore, samples classified as CMS2+CIN were typically CRIS-B or CRIS-E in the proximal tumors, but CRIS-C or CRIS-E in the distal tumors. Overall, CRIS-E (originally described as Paneth cell-like) was often present in the heterogeneously called tumors (Fig 1A and 1C).
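The concordance rates reported above amount to asking, per classifier, whether all biopsies of a tumor received the same call. A minimal sketch of that bookkeeping is given below; the tumor identifiers and calls are invented and do not reproduce Fig 1, and the handling of unclassified or uncertain calls (which the text does not detail) is left out.

```python
def is_concordant(calls):
    """A tumor is concordant when every biopsy received the same subtype call."""
    return len(set(calls)) == 1


def concordance_rate(calls_by_tumor):
    """Fraction of tumors whose multiregional biopsies were all called alike."""
    flags = [is_concordant(calls) for calls in calls_by_tumor.values()]
    return sum(flags) / len(flags)


# Invented calls for one classifier; not taken from the study data.
example_calls = {
    "T1": ["CMS2", "CMS2", "CMS2"],
    "T2": ["CMS1", "CMS3", "CMS2"],
    "T3": ["CMS2", "CMS2", "CMS2"],
    "T4": ["CMS3", "CMS3", "CMS2"],
}
rate = concordance_rate(example_calls)
```

Running the same function separately on the proximal and distal tumors, and once per classifier, would give the kind of breakdown summarized in Fig 1B.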

Fig 1. Tumor location influences ITH of transcriptional CRC subtyping.


(A) All biopsies were classified using three different classifiers: CMS, CRIS, and TT. Each circle depicts a tumor, and each piece a biopsy. All tumors are presented three times, once for each classifier. * denotes CMS subtype calls that were only called when using the SSP.nearest method; in contrast, the SSP.predicted method left these samples unclassified. CRIS classifications with multiple colors are due to uncertain subtype calls. (B) Discordant or concordant subtype calls within tumors, for all tumors, proximal tumors, and distal tumors (grey = discordant subtype calls within a tumor, blue = concordant subtype calls). (C) Caleydo Stratomex plots of the distribution and correlation between subtypes called for each biopsy. Top = highlighted for proximal tumors, bottom = highlighted for distal tumors. (CMS = consensus molecular subtypes, CRIS = CRC intrinsic subtypes, TT = tumor types).

ITH of overall transcriptional profile

To investigate the overall ITH at the gene expression level, we performed unsupervised clustering of all tumor biopsies using transcriptomic profiles including coding genes (log2(CPM)>1) (Fig 2A). For 11 out of 14 patients all biopsies clustered together, indicating that the main clustering factor was the tumor of origin (Fig 2A; pair-wise inter-sample correlations in RNA expression profiles are given in S1A Fig). The remaining three tumors that did not cluster in tumor-specific clusters (P05, P13, and P17; indicated by arrows) were among the discordantly subtyped tumors when using the classifiers, particularly tumors P05 and P13 (Fig 1A). Hence, the ITH detected with the classifiers, which use a subset of transcripts, was also present when assessing ITH across all transcripts. In line with this, tumors that were concordantly subtyped had a significantly higher intra-tumor correlation at the transcriptional level than tumors with discordantly subtyped biopsies (p = 0.0037, Wilcoxon rank sum test) (S1B Fig). However, the majority of biopsies cluster in tumor-specific clusters. Furthermore, the biopsies also tend to cluster based on the subtype combination called across the classifiers (Fig 2A). This is especially pronounced for the CMS and TT subtypes, while the CRIS subtypes are more intermixed. For some heterogeneous tumors (P05 and P13), the biopsies cluster with biopsies from other tumors with the same subtype combination, rather than in tumor-specific clusters. Looking at all biopsies, they split into three main clusters. The left-most cluster consists primarily of immune-related subtypes (CMS1 + CRIS-A/CRIS-B + SSC). The middle cluster is more mixed; however, all metabolic subtyped (CMS3 + Goblet) biopsies are included within this cluster. The right-most cluster contains subtypes related to chromosomal CNAs (CMS2 + CRIS-C/CRIS-E + CIN) (Fig 2A). Overall, these clusters indicate that both the cancer cell transcripts and the TME contribute to the clustering of the samples.

Fig 2. Transcriptional ITH.


(A) Transcriptional heatmap of protein coding genes. For 11/14 tumors, the multiregional biopsies cluster in tumor-specific clusters (marked with orange). Biopsies from the remaining 3 tumors are combined in clusters (marked in black). The clusters indicated with an asterisk (*) were statistically significant as evaluated by bootstrapping (Approximately Unbiased (AU) values ≥95). Arrows below the heatmap indicate discordantly clustering biopsies (black: P05, red: P13, blue: P17). Annotations above the heatmap illustrate tumor location, MSI/MSS status, and subtype calls for all three classifiers. The majority of samples cluster according to their subtype combination. (B-D) Distribution of ITH-score for three gene panels; the four colors represent quartiles ranging from low to high ITH-score. Patient-specific ITH-scores are calculated as the standard deviation for each gene between all biopsies from each tumor. Inter-tumor denotes variation between tumors and is calculated as the STD between all biopsies from all tumors. (B) A panel of stromal genes (n = 618) is enriched for high ITH-score genes. (C) A panel of housekeeping genes (n = 3415) is more stable both within and between tumors. (D) Transcripts related to common copy number alterations (CNAs) are equally distributed in all four categories for most patients (n = 831). P05 has a larger fraction of high ITH-score genes related to CNAs.

To explore which types of genes varied the most within cancers, we calculated a tumor-specific ITH-score for each gene. This was done using the standard deviation (STD) of the log2(CPM) values for each gene among the multiregional biopsies of the tumor. Furthermore, we did this across biopsies from all tumors to obtain an inter-tumor heterogeneity score. As the TME content may vary between biopsies, we investigated a panel of stromal transcripts to see whether these showed high ITH-scores. Generally, stromal genes were characterized by having high ITH-scores in most samples (Fig 2B). Furthermore, their expression also showed high inter-tumor variation (Fig 2B: last column). This was different from housekeeping genes, which had lower ITH-scores both within each tumor and between tumors (Fig 2C), as expected for housekeeping RNA. As CNAs may influence gene transcript levels, we investigated a panel of genes located in genomic regions commonly affected by CNAs in CRC. As expected, the inter-tumor variation was larger than the intra-tumor variation (Fig 2D). Generally, the distal tumors had higher levels of ITH in relation to CNA than the proximal tumors, which may explain the higher number of discordant subtype calls in the distal tumors when using the cancer-cell-specific CRIS classifier (Fig 1). However, this was also evident for one proximal tumor (P05), where the CNA-related transcripts had a high ITH-score, indicating ITH of CNA events within that tumor (Fig 2D).

ITH of molecular pathways match heterogeneously called subtypes

The fraction of genes with a high ITH-score (STD > 0.5) varied between tumors (Fig 3A). The tumors whose biopsies did not cluster together (Fig 2A), were among those with the highest number of high ITH-score genes (Fig 3A), and in agreement, the multiregional biopsies of these tumors were often discordantly called by the CRC classifiers (Figs 1A and 3B).

Fig 3. ITH of molecular pathways.


(A) Barplot quantifying the number of genes with high ITH-score (STD > 0.5) for each tumor and for all tumors combined. * denotes tumors with the highest number of high ITH-score genes, which are included in B-D. (B) Subtyping results for each biopsy from tumors P13 and P05. (C-D) Tumor-specific network maps for two tumors illustrating the 5000 ssGSEA terms with the highest ITH for tumor P13 in (C) and tumor P05 in (D). Yellow dots/font indicate mechanisms that are upregulated in the sample compared to the other samples from the same tumor, while blue indicates downregulated mechanisms.

Next, we investigated whether the genes with high ITH-score were involved in specific biological processes. Gene set enrichment analysis using ssGSEA supported that tumors displaying high ITH at the gene level also exhibited high ITH with regard to the activity of biological processes. This ITH was plotted in tumor-specific network maps, based on up- or down-regulation of the most varying molecular mechanisms (Fig 3C and 3D + S2 Fig). For the two proximal tumors (P05 and P13) with the highest number of high ITH-score genes, the network maps pinpointed biological functions and processes likely underlying the heterogeneous subtype calls (Fig 3C and 3D). Biopsies from the first of these tumors (P13) were classified into three different classes using both the CMS and TT classifiers (Fig 3B). In the network maps, biopsy 1 had increased activity of metabolic pathways (Fig 3C left), which matched the classified CMS3 and Goblet subtypes. Biopsy 2 was classified as CMS1 + CRIS-A + SSC, all immune-related subtypes. This corresponded well with the fact that this biopsy exhibited upregulated immune processes compared to the other biopsies, as well as low metabolic activity (Fig 3C middle). Biopsy 3 was classified as CMS2 + CIN, which matched the low immune and low metabolic activity in this network map (Fig 3C right). Network maps from the second heterogeneous tumor (P05) showed the same tendencies (Fig 3D). Biopsy 1 had higher activity in metabolic gene sets compared to the other biopsies, which matched the called metabolic subtypes (CMS3 and Goblet; Fig 3D left). Even though biopsies 2 and 3 were classified concordantly, there were some differences in their network maps. Biopsy 2 had increased activity with regard to cell cycle and DNA replication, while biopsy 3 had high activity of Wnt signaling (Fig 3D middle and right). These network maps illustrate the dynamic state of RNA transcripts, as differences in ongoing cellular functions cause ITH at the RNA level. Varying biological processes were also observed for the remaining two tumors with high ITH-score genes (P4 and P17) (Fig 2A), even though these were classified concordantly (S2 Fig). These differences included varying activity of processes related to metabolism, extracellular matrix (ECM), cell cycle, and immune processes, indicating that the transcriptional ITH is only partly captured by the subtyping classifiers.

Tumor location within the colon influences immune infiltration

Given that stromal RNAs exhibited particularly pronounced ITH (Fig 2B), we explored the ITH of the TME further by measuring the protein levels of 92 cancer and immune-related proteins in 29 tumors (3–4 biopsies per tumor) using antibody-based PEA analysis (Fig 4). Unsupervised hierarchical clustering of the protein data identified large ITH at the protein level. The multiregional biopsies clustered together for only 9 out of 29 tumors. The two main clusters formed were associated with distinct biological and clinical features. The left cluster was significantly enriched for MSI tumors (75%), while the right cluster was enriched for MSS tumors (95%) (p<0.0005, Fisher’s exact test) (Fig 4A; pair-wise inter-sample correlations in protein expression profiles are given in S1C Fig).

Fig 4. ITH of immune response on protein level.


(A) Heatmap of protein levels based on the immuno-oncology panel. All columns represent a sample; biopsies in tumor-specific clusters are marked with orange (9/29 tumors). Annotations indicate tumor location and MSI/MSS status. Row-side trees marked with red represent a TAM Inflammation Panel and Tcyt Cell Response Panel. (B) Boxplots showing calculated sample-means for TAM Inflammation panel (top) and the Tcyt cell panel (bottom) for each tumor. Fill colors (1–8) indicate tumor location as illustrated in the schematic figure of the colon and rectum. Red/blue border colors indicate MSI/MSS status. Each bar illustrates results from all biopsies from each tumor. (TAM = tumor associated macrophage, Tcyt = cytotoxic T cell).

The proteins analyzed included several inflammation proteins connected to tumor associated macrophages (TAMs) [39]. The cluster analysis revealed high intra-sample expression correlation between these TAM-related inflammation proteins (Fig 4A). The TAM inflammation proteins define a panel that pinpoints samples in both the left and right sample clusters that are likely to be inflamed (Fig 4A). Likewise, another protein panel with chemokines and granzymes related to an active cytotoxic T (Tcyt) cell response also showed intra-sample expression correlation (Fig 4A).

To analyze the ITH of the TAM inflammation- and Tcyt response-panels within each tumor, we calculated sample means of protein expression for both panels. Some tumors had highly varying levels of each panel between the multiregional biopsies, indicating ITH in relation to the immune activity (Fig 4B). However, it was not the same tumors that exhibited ITH for the two panels. Taking the tumor location into consideration, we observed different patterns for the TAM inflammation panel and the Tcyt cell response panel (Fig 4B). The TAM inflammation levels were highest in MSI tumors, independent of the location within the proximal colon. For MSS tumors, the inflammation levels showed less ITH and overall higher levels in the proximal colon tumors, compared to the distal colon. In contrast, the Tcyt levels were highest in the tumors located in the cecum independent of MSI/MSS status (Fig 4B). This indicates that a certain environment may be present in the cecum, leading to a higher Tcyt cell response in the tumors independent of the mutational profile of the cancer cells [40].
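The per-biopsy panel means can be sketched as below. The protein identifiers are placeholders, since the panel membership is not listed here, and the max-min range is used as one simple way to summarize variation between a tumor's biopsies; that choice is an assumption and not a measure defined in the text.

```python
import statistics


def panel_mean(npx_by_protein, panel):
    """Mean NPX over the proteins of one panel for a single biopsy."""
    return statistics.mean([npx_by_protein[protein] for protein in panel])


def tumor_panel_spread(biopsies, panel):
    """Return the per-biopsy panel means and their max-min range within one tumor."""
    means = [panel_mean(biopsy, panel) for biopsy in biopsies]
    return means, max(means) - min(means)


# Placeholder panel and invented NPX values for three biopsies of one tumor.
example_panel = ["P1", "P2"]
example_biopsies = [
    {"P1": 2.0, "P2": 4.0, "P3": 9.0},
    {"P1": 3.0, "P2": 5.0, "P3": 1.0},
    {"P1": 6.0, "P2": 8.0, "P3": 0.0},
]
means, spread = tumor_panel_spread(example_biopsies, example_panel)
```

A tumor with a large spread for one panel need not have a large spread for another, which is the pattern described above for the TAM inflammation and Tcyt panels.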

Inflammation at protein level varies between transcriptional subtypes

Multiregional biopsy analysis revealed that some tumors show large variation between biopsies in regards to the TAM inflammation on protein level (Fig 4B). We wanted to see whether this ITH on protein level was related to the transcriptional subtypes (Fig 5A). For the CMS classifier, the CMS1 classified biopsies showed high levels of inflammation, even within the highly heterogeneous tumor P13 (Fig 5A–left). However, for the remaining subtypes (CMS2 and CMS3), the inflammation levels were more similar. This was also the case for the TT classifier, where the biopsies with the highest levels of inflammation were classified as immune-related subtypes (SSC and Stroma) (Fig 5A–right). In contrast, the immune-related CRIS subtypes (CRIS-A and CRIS-B) showed varying inflammation levels (Fig 5A–middle).

Fig 5. TAM Inflammation on protein level varies between subtypes.


(A) Dotplots showing TAM inflammation panel (mean NPX) protein level for multiregional biopsies from each tumor (n = 14) with all three classifiers (CMS, CRIS, and TT). Colors indicate transcriptional subtype. Proximal to distal tumor location is indicated by an arrow. (B) Boxplots for each classifier (CMS, CRIS, TT), showing protein inflammation panel mean for all samples (n = 41) grouped based on transcriptional subtype. (C) Boxplots for each classifier, showing protein inflammation panel mean for samples from a validation cohort (n = 162), grouped based on transcriptional subtype. (* = p<0.05; ** = p<0.01; *** = p<0.001, Wilcoxon rank sum test).

Next, we wanted to see whether TAM inflammation at the protein level was related to the transcriptional subtypes when all biopsies were grouped by subtype. The protein level was significantly higher in the subtypes related to immune signaling, including CMS1, CRIS-A, CRIS-B, SSC, and Stroma (p<0.05, Wilcoxon rank sum test) (Fig 5B). This was furthermore verified in an independent validation cohort with single biopsies from 162 stage II and III CRC tumors (Fig 5C), supporting the notion that the transcription-based subtypes are linked to inflammation at the protein level.
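For readers unfamiliar with the Wilcoxon rank sum test, the sketch below computes the statistic and an exact two-sided p-value by enumerating all possible group relabelings. It is an illustration only: it assumes small groups without tied values, it is not the R routine used for the analysis, and the panel means are invented.

```python
from itertools import combinations


def rank_sum_statistic(x, y):
    """Number of (x, y) pairs with x > y (the W statistic, assuming no ties)."""
    return sum(1 for a in x for b in y if a > b)


def exact_rank_sum_p(x, y):
    """Exact two-sided p-value by enumerating every relabeling of the pooled data."""
    pooled = list(x) + list(y)
    if len(set(pooled)) != len(pooled):
        raise ValueError("this sketch assumes no tied values")
    w_obs = rank_sum_statistic(x, y)
    null = []
    for idx in combinations(range(len(pooled)), len(x)):
        chosen = set(idx)
        group_x = [pooled[i] for i in idx]
        group_y = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        null.append(rank_sum_statistic(group_x, group_y))
    lower = sum(1 for w in null if w <= w_obs) / len(null)
    upper = sum(1 for w in null if w >= w_obs) / len(null)
    return min(1.0, 2.0 * min(lower, upper))


# Invented panel means: immune-related subtype versus the remaining biopsies.
immune = [4.0, 5.5, 6.1, 7.2]
other = [1.2, 2.3, 3.1]
w = rank_sum_statistic(immune, other)
p = exact_rank_sum_p(immune, other)
```

With complete separation of the two toy groups, only one of the 35 possible relabelings is as extreme as the observed one, giving a two-sided p-value of 2/35.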

Discussion

Most previous studies of ITH in CRC have focused on genetic ITH of cancer cells [18–20]. Here we focused on transcriptional ITH to explore the degree of ITH, its origin, and whether tumor location influences it. The presence or absence of ITH was assessed by subtype classification of multiregional tumor biopsies using the well-established CRC subtype classifiers CMS, TT, and CRIS. They reflect different kinds of ITH: either ITH arising from differences in cell type distributions (CMS and TT) or ITH arising from transcriptional differences between the cancer cells in the multiregional biopsies (CRIS). Interestingly, we observed that the degree of ITH of subtyping varied depending on tumor location (Fig 1). This was especially pronounced for the CMS classifier, which called half of the proximal tumors with discordant subtypes, while all multiregional biopsies of the distal tumors were called concordantly. The same tendency was observed for the TT classifier, while the CRIS classifier showed the opposite pattern. The CRIS classifications were often discordant within the distal tumors, which matches the increased level of high ITH-score genes related to CNA in the distal tumors (Fig 2). We have previously shown that CRIS subtypes are sensitive to CNAs, and that ITH at the CNA level influenced the CRIS type called [18], which might be part of the explanation.

We found that clustering of the total transcriptomic profile of protein-coding genes led to tumor-specific clustering for 11 out of 14 tumors (Fig 2A). This indicates that the variation between tumors often exceeds the transcriptomic ITH. A recent study showed similar results for non-small cell lung cancer, where the majority of samples clustered tumor-specifically [41]. However, for a few tumors the ITH at the transcriptional level was so large that the samples resembled other tumors more than they resembled each other. To investigate the source of this ITH, we calculated tumor-specific ITH-scores for each gene. A large proportion of genes of stromal origin had high ITH-scores, indicating that the stromal TME contributes to part of the ITH in all tumors (Fig 2B). This heterogeneity of the stromal compartment existed even though we sampled multiregional biopsies from histologically similar tumor areas. Even larger variations have been reported when comparing different areas, such as the central tumor and the invasive front of the tumor [6]. For the most heterogeneous tumors, we created tumor-specific network maps based on ssGSEA terms (Fig 3). Here we saw that gene sets related to metabolism, cell cycle, Wnt signaling, and immune responses varied in activity between the multiregional biopsies. Importantly, these variations matched the subtypes called for each biopsy, which were furthermore related to inflammation at the protein level (Fig 5). This indicates that the heterogeneously called subtypes within tumors reflect actual molecular differences between the sampled sites. It supports the view that ITH of subtyping might not be a flaw in the classifier, but rather the reality of the given tumors.
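A tumor-specific, per-gene ITH-score can be illustrated as follows. The sketch assumes, purely for illustration, that the score is the standard deviation of log-scale expression across the biopsies of one tumor and that genes above a cutoff of 0.5 count as high-ITH genes; the definition actually used is the one given in the Methods. The function names and the expression values are hypothetical.

```python
import statistics

def ith_scores(expression):
    """Per-gene ITH-score for one tumor (assumed here: sample standard deviation).

    expression: dict mapping gene -> list of log-scale expression values,
    one value per biopsy of the same tumor (at least two biopsies).
    """
    return {gene: statistics.stdev(values) for gene, values in expression.items()}

def high_ith_genes(expression, threshold=0.5):
    """Genes whose assumed ITH-score exceeds the (assumed) threshold."""
    scores = ith_scores(expression)
    return sorted(gene for gene, score in scores.items() if score > threshold)

# Made-up log2 expression values for three biopsies of one tumor.
tumor_expression = {
    "COL1A1": [5.0, 7.0, 6.0],   # varies between biopsies
    "ACTB": [10.0, 10.1, 9.9],   # nearly constant between biopsies
}
print(ith_scores(tumor_expression))
print(high_ith_genes(tumor_expression))
```

Annotating the high-scoring genes with their likely origin (for example stromal or CNA-related) would then show which compartment drives the heterogeneity of a given tumor.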

The higher number of discordantly called tumors (by CMS and TT) in the proximal colon indicates that the TME may vary more in proximal tumors than in distal tumors. To look further into this, we analyzed immune signaling at the protein level, and found varying ITH within tumors and different levels of immune response related to tumor location (Fig 4). Some tumors had very uniform immune signaling in all biopsies, while others showed large variation between sampling sites. These findings suggest that local immune environments exist within the tumors, and that a single biopsy might not suffice when determining the prognostic impact of the immune response. A recent study by Cremonesi and colleagues highlighted the importance of the microbiota for stimulating chemokine production from cancer cells [42]. They showed that the microbiota influenced T cell trafficking to the tumor, and thereby influenced the prognosis of the patients. One may speculate whether the microbiota also exhibits ITH in CRC, complicating things even further. In relation to T cell trafficking, we saw that tumors from the cecum exhibited high levels of Tcyt cell response at the protein level. This could be due to the immune environment normally present in the cecum [40], which in turn might be linked to the microbiota. Unfortunately, it is beyond the scope of this study to explore this association further. However, we generally see higher inflammation in the proximal colon MSS tumors than in the distal colon MSS tumors (Fig 4).

The MSI tumors analyzed had the highest levels of TAM-related inflammation, independent of tumor location even within the proximal colon. For the MSS/CIN tumors, in contrast, a higher inflammation level was often observed in the proximal colon compared to the distal colon and rectum (Fig 4). In-depth characterization of CRC tumors has previously shown that MSS/CIN tumors often have similar CNA profiles independent of tumor location in the proximal or distal colon [16]. Taken together, this may indicate that the location alone influences the inflammatory response in these MSS tumors. Interestingly, several tumors were highly heterogeneous in this regard. Further studies of the link between ITH, immune response, and tumor location would be of interest. It remains to be seen whether the dichotomy of distinguishing between the proximal and distal colon will eradicate some heterogeneity, or whether a more fluid gradient is present along the colon, as suggested by Yamauchi and colleagues [43].

Conclusion

Our results indicate that ITH of CRC is influenced by the tumor location within the colorectum. By classifying the biopsies into subtypes, we found that the microenvironment more often led to transcriptional ITH in the proximal colon than in the distal colon. Importantly, the subtyping heterogeneity between biopsies seems to be due to actual ITH within tumors, since the discordant subtypes matched the biological processes within the biopsies and the inflammation at the protein level. In contrast, the distal tumors were primarily classified as discordant due to cancer-cell-related heterogeneity. Hence, tumor subtyping based on a single biopsy may turn out to be problematic, as tumor location influences the transcriptional ITH in CRC, a topic that may be explored further in future studies. Overall, our results suggest that tumor location, inter-tumor heterogeneity, and ITH should preferably be considered together in future attempts to establish clinically relevant biomarkers for CRC.

Supporting information

S1 Fig. Inter-sample correlations in RNA expression and protein expression profiles.

(A) Correlation matrix showing the pair-wise, inter-sample correlations (Pearson’s r) in RNA expression profiles (12,593 RNAs with average expression >1 (log2(CPM))). (B) Plot showing the intra-tumor correlation in RNA expression between biopsies (Pearson’s r; Y-axis) according to the number of classifiers (CMS, CRIS, TT) that exhibit discordant subtype calls for a tumor sample (X-axis; no discordant calls is ‘0’, whereas values 1, 2 and 3 indicate the number of classifiers that exhibit discordant subtype calls for a tumor). Biopsies that have discordant calls exhibit lower correlation in RNA expression to the other biopsies from the same tumor than biopsies with no discordant calls. Red bars indicate average values for each category. The p-value indicates that biopsies with no discordant calls for any classifier (X-axis = 0) exhibit significantly higher intra-tumor correlation in RNA expression than biopsies from tumors with discordant classifier calls (X-axis = 1–3) as evaluated by a Wilcoxon rank sum test (WRS). (C) Correlation matrix showing the pair-wise, inter-sample correlations (Pearson’s r) in protein expression profiles (68 proteins).

(TIF)
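The pair-wise correlations summarized in S1 Fig can be illustrated with a short sketch that computes Pearson’s r between expression profiles and averages it separately over biopsy pairs from the same tumor and from different tumors. This is not the code used to generate the figure; the function names, sample identifiers, and expression values are hypothetical.

```python
import math
from itertools import combinations

def pearson_r(a, b):
    """Pearson correlation coefficient between two equally long profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def mean_pairwise_r(profiles, same_tumor):
    """Mean Pearson's r over biopsy pairs.

    profiles: dict mapping (tumor_id, biopsy_id) -> expression profile.
    same_tumor: True for intra-tumor pairs, False for inter-tumor pairs.
    """
    values = [
        pearson_r(profiles[p], profiles[q])
        for p, q in combinations(profiles, 2)
        if (p[0] == q[0]) == same_tumor
    ]
    return sum(values) / len(values)

# Made-up log-scale profiles over four genes for two tumors.
example_profiles = {
    ("T1", "a"): [1.0, 2.0, 3.0, 4.0],
    ("T1", "b"): [1.1, 2.0, 3.2, 3.9],
    ("T2", "a"): [4.0, 3.0, 2.0, 1.0],
    ("T2", "b"): [4.2, 2.9, 2.1, 1.0],
}
print(mean_pairwise_r(example_profiles, same_tumor=True))
print(mean_pairwise_r(example_profiles, same_tumor=False))
```

A higher intra-tumor than inter-tumor mean corresponds to the situation in which inter-tumor variation exceeds ITH.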

S2 Fig. Tumor specific network maps.

(A) Subtyping results for each biopsy from tumors P04 and P17. (B-C) Tumor-specific network maps for two tumors illustrating the 5000 ssGSEA terms with the highest ITH for tumor P04 in (B) and tumor P17 in (C). Yellow dots/font indicate mechanisms that are upregulated in the sample compared to the other samples from the same tumor, while blue indicates downregulated mechanisms.

(TIF)

S1 Table. Sample overview.

Sample information about multiregional biopsies, including PatientID, SampleID, Tumor_location, MSI_MSS, Age, Gender, Subtyping results, and RNA sequencing read depth.

(XLSX)

Acknowledgments

We thank Pamela Celis, Jesper B. Kristensen, Lisbet Kjeldsen, and Susie L. Larsen for excellent technical assistance. Furthermore, we thank the staff at the NGS Core Center at Aarhus University Hospital, Denmark and the Analysis Service at Olink Proteomics, Uppsala, Sweden. The Danish Cancer Biobank is acknowledged for providing biological material.

List of abbreviations

AU

approximately unbiased

CIN

chromosomal instability

CNA

copy number alterations

CMS

consensus molecular subtypes

CRC

colorectal cancer

CRIS

CRC intrinsic subtypes

FPKM

fragments per kilobase of transcript per million mapped reads

ITH

intra-tumor heterogeneity

LogCPM

logarithmic counts per million

NPX

normalized protein expression

MSI

microsatellite instability

MSS

microsatellite stability

PEA

proximity extension assay

PIC

protease inhibitor cocktail

PMSF

phenylmethylsulfonyl fluoride

ssGSEA

single sample gene set enrichment analysis

TAM

tumor associated macrophage

Tcyt

Cytotoxic T cell

TME

tumor microenvironment

TT

TUMOR types

Data Availability

All relevant data are available on EGA (accession no. EGAS00001004668).

Funding Statement

This research was supported by Aarhus University (SSA), The Dagmar Marshalls Foundation (SSA), Aage and Johanne Louis-Hansen’s Foundation (SSA), the Novo Nordisk Foundation (NNF16OC0023182) (JBB), the Danish Cancer Society (R40-A1965_11_S2, R56-A3110-12-S2, R107-A7035, R133-A8520), the Danish Council for Independent Research (Medical Sciences) (DFF-0602-02128B, DFF–4183-00619) (CLA), and the National Cancer Institute (R01 CA207467) (CLA). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA: a cancer journal for clinicians. 2018;68(6):394–424.
  • 2. Puppa G, Sonzogni A, Colombari R, Pelosi G. TNM staging system of colorectal carcinoma: a critical appraisal of challenging issues. Arch Pathol Lab Med. 2010;134(6):837–52. doi:10.1043/1543-2165-134.6.837
  • 3. Guinney J, Dienstmann R, Wang X, de Reynies A, Schlicker A, Soneson C, et al. The consensus molecular subtypes of colorectal cancer. Nature medicine. 2015;advance online publication. doi:10.1038/nm.3967
  • 4. Bramsen JB, Rasmussen MH, Ongen H, Mattesen TB, Ørntoft M-BW, Árnadóttir SS, et al. Molecular-Subtype-Specific Biomarkers Improve Prediction of Prognosis in Colorectal Cancer. Cell reports. 2017;19(6):1268–80. doi:10.1016/j.celrep.2017.04.045
  • 5. Isella C, Brundu F, Bellomo SE, Galimi F, Zanella E, Porporato R, et al. Selective analysis of cancer-cell intrinsic transcriptional traits defines novel clinically relevant subtypes of colorectal cancer. Nat Commun. 2017;8:15107. doi:10.1038/ncomms15107
  • 6. Dunne PD, McArt DG, Bradley CA, Reilly PG, Barrett HL, Cummins R, et al. Challenging the Cancer Molecular Stratification Dogma: Intratumoral Heterogeneity Undermines Consensus Molecular Subtypes and Potential Diagnostic Value in Colorectal Cancer. Clinical Cancer Research. 2016;22(16):4095. doi:10.1158/1078-0432.CCR-16-0032
  • 7. Molinari C, Marisi G, Passardi A, Matteucci L, De Maio G, Ulivi P. Heterogeneity in Colorectal Cancer: A Challenge for Personalized Medicine? International journal of molecular sciences. 2018;19(12):3733. doi:10.3390/ijms19123733
  • 8. Szmulowicz UM, Hull TL. Colonic Physiology. In: Beck DE, Roberts PL, Saclarides TJ, Senagore AJ, Stamos MJ, Wexner SD, editors. The ASCRS Textbook of Colon and Rectal Surgery. New York, NY: Springer New York; 2011. p. 23–39. doi:10.1007/s11605-011-1460-7
  • 9. Donaldson GP, Lee SM, Mazmanian SK. Gut biogeography of the bacterial microbiota. Nat Rev Microbiol. 2016;14(1):20–32. doi:10.1038/nrmicro3552
  • 10. Mowat AM, Agace WW. Regional specialization within the intestinal immune system. Nature Reviews Immunology. 2014;14:667. doi:10.1038/nri3738
  • 11. Missiaglia E, Jacobs B, D'Ario G, Di Narzo AF, Soneson C, Budinska E, et al. Distal and proximal colon cancers differ in terms of molecular, pathological, and clinical features. Ann Oncol. 2014;25(10):1995–2001. doi:10.1093/annonc/mdu275
  • 12. Lee GH, Malietzis G, Askari A, Bernardo D, Al-Hassi HO, Clark SK. Is right-sided colon cancer different to left-sided colorectal cancer?–A systematic review. European Journal of Surgical Oncology (EJSO). 2015;41(3):300–8. doi:10.1016/j.ejso.2014.11.001
  • 13. Tamas K, Walenkamp AME, de Vries EGE, van Vugt MATM, Beets-Tan RG, van Etten B, et al. Rectal and colon cancer: Not just a different anatomic site. Cancer Treatment Reviews. 2015;41(8):671–9. doi:10.1016/j.ctrv.2015.06.007
  • 14. Yang SY, Cho MS, Kim NK. Difference between right-sided and left-sided colorectal cancers: from embryology to molecular subtype. Expert Review of Anticancer Therapy. 2018;18(4):351–8. doi:10.1080/14737140.2018.1442217
  • 15. Gallois C, Pernot S, Zaanan A, Taieb J. Colorectal Cancer: Why Does Side Matter? Drugs. 2018;78(8):789–98. doi:10.1007/s40265-018-0921-7
  • 16. Cancer Genome Atlas N. Comprehensive molecular characterization of human colon and rectal cancer. Nature. 2012;487(7407):330–7. doi:10.1038/nature11252
  • 17. Berntsson J, Svensson MC, Leandersson K, Nodin B, Micke P, Larsson AH, et al. The clinical impact of tumour-infiltrating lymphocytes in colorectal cancer differs by anatomical subsite: A cohort study. International journal of cancer Journal international du cancer. 2017;141(8):1654–66. doi:10.1002/ijc.30869
  • 18. Arnadottir SS, Jeppesen M, Lamy P, Bramsen JB, Nordentoft I, Knudsen M, et al. Characterization of genetic intratumor heterogeneity in colorectal cancer and matching patient-derived spheroid cultures. Molecular oncology. 2018;12(1):132–47. doi:10.1002/1878-0261.12156
  • 19. Uchi R, Takahashi Y, Niida A, Shimamura T, Hirata H, Sugimachi K, et al. Integrated Multiregional Analysis Proposing a New Model of Colorectal Cancer Evolution. PLoS Genet. 2016;12(2):e1005778. doi:10.1371/journal.pgen.1005778
  • 20. Suzuki Y, Ng SB, Chua C, Leow WQ, Chng J, Liu SY, et al. Multiregion ultra-deep sequencing reveals early intermixing and variable levels of intratumoral heterogeneity in colorectal cancer. Molecular oncology. 2017;11(2):124–39. doi:10.1002/1878-0261.12012
  • 21. Dunne PD, Alderdice M, O'Reilly PG, Roddy AC, McCorry AMB, Richman S, et al. Cancer-cell intrinsic gene expression signatures overcome intratumoural heterogeneity bias in colorectal cancer patient classification. 2017;8:15657.
  • 22. Suraweera N, Duval A, Reperant M, Vaury C, Furlan D, Leroy K, et al. Evaluation of tumor microsatellite instability using five quasimonomorphic mononucleotide repeats and pentaplex PCR. Gastroenterology. 2002;123(6):1804–11. doi:10.1053/gast.2002.37070
  • 23. Hedegaard J, Lamy P, Nordentoft I, Algaba F, Høyer S, Ulhøi Benedicte P, et al. Comprehensive Transcriptional Analysis of Early-Stage Urothelial Carcinoma. Cancer cell. 2016;30(1):27–42. doi:10.1016/j.ccell.2016.05.004
  • 24. Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biology. 2013;14(4):R36. doi:10.1186/gb-2013-14-4-r36
  • 25. Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotech. 2010;28(5):511–5. doi:10.1038/nbt.1621
  • 26. Pyl PT, Anders S, Huber W. HTSeq - a Python framework to work with high-throughput sequencing data. Bioinformatics. 2014;31(2):166–9. doi:10.1093/bioinformatics/btu638
  • 27. NIH_GDC-Data-Portal. TCGA gene list download center 2019 [25.02.2019]. Available from: https://portal.gdc.cancer.gov/exploration?facetTab=genes.
  • 28. McCarthy DJ, Smyth GK, Robinson MD. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2009;26(1):139–40. doi:10.1093/bioinformatics/btp616
  • 29. Gaujoux R, Seoighe C. A flexible R package for nonnegative matrix factorization. BMC Bioinforma. 2010;11. doi:10.1186/1471-2105-11-367
  • 30. Suzuki R, Shimodaira H. Pvclust: an R package for assessing the uncertainty in hierarchical clustering. Bioinformatics. 2006;22(12):1540–2. doi:10.1093/bioinformatics/btl117
  • 31. Lex A, Streit M, Schulz HJ, Partl C, Schmalstieg D, Park PJ, et al. StratomeX: Visual Analysis of Large-Scale Heterogeneous Genomics Data for Cancer Subtype Characterization. Computer Graphics Forum. 2012;31(3pt3):1175–84. doi:10.1111/j.1467-8659.2012.03110.x
  • 32. Isella C, Terrasi A, Bellomo SE, Petti C, Galatola G, Muratore A, et al. Stromal contribution to the colorectal cancer transcriptome. Nature genetics. 2015. doi:10.1038/ng.3224
  • 33. Goldman M, Craft B, Hastie M, Repečka K, Kamath A, McDade F, et al. The UCSC Xena platform for public and private cancer genomics data visualization and interpretation. bioRxiv. 2019:326470.
  • 34. Eisenberg E, Levanon EY. Human housekeeping genes, revisited. Trends Genet. 2013;29(10):569–74. doi:10.1016/j.tig.2013.05.010
  • 35. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA. 2005;102. doi:10.1073/pnas.0506580102
  • 36. Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, Mesirov JP. GenePattern 2.0. Nature genetics. 2006;38:500. doi:10.1038/ng0506-500
  • 37. Merico D, Isserlin R, Stueker O, Emili A, Bader GD. Enrichment map: a network-based method for gene-set enrichment visualization and interpretation. PloS one. 2010;5(11):e13984. doi:10.1371/journal.pone.0013984
  • 38. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome research. 2003;13(11):2498–504. doi:10.1101/gr.1239303
  • 39. Ribatti D, Nico B, Crivellato E, Vacca A. Macrophages and tumor angiogenesis. Leukemia. 2007;21:2085. doi:10.1038/sj.leu.2404900
  • 40. Paski SC, Wightman R, Robert ME, Bernstein CN. The Importance of Recognizing Increased Cecal Inflammation in Health and Avoiding the Misdiagnosis of Nonspecific Colitis. The American Journal Of Gastroenterology. 2007;102:2294. doi:10.1111/j.1572-0241.2007.01389.x
  • 41. Lee W-C, Diao L, Wang J, Zhang J, Roarty EB, Varghese S, et al. Multiregion gene expression profiling reveals heterogeneity in molecular subtypes and immunotherapy response signatures in lung cancer. Modern Pathology. 2018;31(6):947–55. doi:10.1038/s41379-018-0029-3
  • 42. Cremonesi E, Governa V, Garzon JFG, Mele V, Amicarella F, Muraro MG, et al. Gut microbiota modulate T cell trafficking into human colorectal cancer. Gut. 2018;67(11):1984–94. doi:10.1136/gutjnl-2016-313498
  • 43. Yamauchi M, Morikawa T, Kuchiba A, Imamura Y, Qian ZR, Nishihara R, et al. Assessment of colorectal cancer molecular features along bowel subsites challenges the conception of distinct dichotomy of proximal versus distal colorectum. Gut. 2012;61(6):847. doi:10.1136/gutjnl-2011-300865

Decision Letter 0

Amanda Ewart Toland

10 Aug 2020

PONE-D-20-20744

Transcriptomic and proteomic intra-tumor heterogeneity of colorectal cancer varies depending on tumor location within the colorectum

PLOS ONE

Dear Dr. Andersen,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

1.  Statistical significance/meaning is missing from multiple of the data illustrated in the figures.  For the data in figure 2 perform a bootstrap analysis to determine if the heatmap/dendrogram clustering are significant.  For the data in Figure 4 perform a correlation coefficient analysis.

2.  Per both reviewers there are multiple places in which more details are needed.  For example, the total reads for RNA-seq are needed.  Include the other details the reviewers have requested. 

3.  Clarify reviewer 1's question on whether ITH was due to the different transcripts the classifiers were using or were inherent in the data, perhaps this could be made clearer?

4.  Correct the typographical errors and fix the number inconsistencies identified by the reviewers.

5.  Have a collaborator or a scientific writing service read the manuscript and provide suggestions on how to improve readability. 

6. Confirm that the RNA-seq data have been deposited into a repository for access and include the access information.

Please submit your revised manuscript by Sep 24 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Amanda Ewart Toland, Ph.D.

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For more information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Árnadóttir et al, have RNA sequenced 41 biopsies from 14 stage II or III CRC tumours. They also carried out multiplex immune protein analysis on 89 biopsies from 29 tumours. The amount of ITH was explored by analysing the RNA sequencing data from multi-region samples. By combining three different RNA sequencing based classifiers they hoped to establish an understanding on the amount of ITH present. Specifically, they combined CRC Intrinsic Subtypes (CRIS), The Consensus Molecular Subtypes (CMS) and TUMOR type (TT). This was an interesting paper to read. Below are some minor points:

The paper has been written well and is clear, with only two minor points. There was one typo present on line 51 “outcome [2] In” is missing a full stop. A grammatical mistake was present on line 393 "It might be speculated, whether the microbiota exhibit ITH in CRC complicating things even further."

Regarding the work, it was a little difficult to discern whether or not the ITH was due to the different transcripts the classifiers were using or were inherent in the data, perhaps this could be made clearer?

Regarding the statistical analysis, one comment regarding the statistical analysis, on the heatmap and dendrogram on Figure 2a - it might be useful to also bootstrap to determine whether or not the clusters are significant.

Reviewer #2: The manuscript by Arnadottir et al describes an interesting multiregional sampling of multiple tumors (14 tumors with 41 regions for RNA-seq; 29 tumors with 89 regions for multiplex immunoassays). Similar DNA sequencing types of studies have revealed widespread genetic and epigenetic ITH. The authors find that sometimes the regions are concordant for the measurements within a tumor, and sometimes not. They show that tumor location within the colorectum influence the type of expression ITH. Overall, these findings are of interest. The manuscript is complex and difficult to follow in many places. This reviewer found it difficult to get a “big picture” impression of the data. However, the overall data are important because they show how one biopsy may not be adequate for reproducible classification using the illustrated schemes. Several comments:

1) Fig 1 presents the overall data well. However, the Results for the RNA-seq data (page 8) are based on classification schemes, with 50% of the tumors having at least one biopsy with a different classification. It would be useful to:

a) provide the RNA-seq depth (reads per sample)

b) provide some statistical summary of the raw data (ie the correlation of CPM values between samples from the same tumor)

c) Provide the 50% misclassification rate information in the Abstract

2) Fig 2 shows that using RNA expression levels, the different biopsies from the same patients tend to cluster with individual patients. This Figure is hard to decipher. It might be useful to put arrows to indicate samples that do not cluster by patient (ie such as P05-Tc). The manuscript also states that samples “cluster” by subtype, but this claim is not all that obvious in Fig 2A because some subtypes (say for example CRIS) seem more randomly distributed along the horizontal axis. This statement should be clarified.

3) Fig 3 looks at the most discordantly expressed genes (STD>.5) and does a supervised gene enrichment analysis. Such types of analysis almost always give an output. The authors describe their analysis but provide little insights on how this type of analysis can explain expression ITH.

4) Fig 4 show abundant protein level ITH with only 9 of 30 tumors (why is it 29 in the Abstract?) having its multiple regions clustering together. A simple, more standard statistical description such as a correlation coefficient (within and between tumors) could provide more meaning to this data. This poor concordance information between biopsies from the same tumor should be mentioned in the Abstract.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2020 Dec 17;15(12):e0241148. doi: 10.1371/journal.pone.0241148.r002

Author response to Decision Letter 0


26 Sep 2020

PONE-D-20-20744

Transcriptomic and proteomic intra-tumor heterogeneity of colorectal cancer varies depending on tumor location within the colorectum

PLOS ONE

Dear Dr. Andersen,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Response from Authors: We are indeed very happy that PLOS ONE finds the manuscript to have merit for publication. As described below, we have addressed the comments made by the reviewers and we would like to thank both the editor and reviewers for their constructive remarks.

1. Statistical significance/meaning is missing from multiple of the data illustrated in the figures. For the data in figure 2 perform a bootstrap analysis to determine if the heatmap/dendrogram clustering are significant. For the data in Figure 4 perform a correlation coefficient analysis.

Response from Authors: A bootstrap analysis for the data in Figure 2 and a correlation coefficient analysis for the data in Figure 4 (new S1 Fig) have been added as requested. For further details, please see the responses to the reviewers below.

2. Per both reviewers there are multiple places in which more details are needed. For example, the total reads for RNA-seq are needed. Include the other details the reviewers have requested.

Response from Authors: RNA-seq read depths (total reads) have been added to the Excel table ‘S1_Table.xlsx’. Please also see our responses below.

3. Clarify reviewer 1's question on whether ITH was due to the different transcripts the classifiers were using or were inherent in the data, perhaps this could be made clearer?

Response from Authors: This question has been addressed; please see the response to Reviewer #1 below. The following sentences have been added:

Line 197: ‘Since these classifiers use different gene sets and approaches for subtyping, they reflect different kinds of ITH; e.g. ITH arising due to inter-biopsy differences in cell type distribution of both cancer epithelium and tumor stromal cells (CMS and TT) or ITH arising due to transcriptional differences between the cancer cell populations in the biopsies (CRIS).’

Line 380: ‘Here the presence or absence of ITH was assessed by subtype classification of multiregional tumor biopsies using the well-established CRC subtype classifiers CMS, TT and CRIS. They reflect different kinds of ITH; either ITH arising due to differences in cell type distributions (CMS and TT) or ITH arising due to transcriptional differences between the cancer cells in the multiregional biopsies (CRIS).’

4. Correct the typographical errors and fix the number inconsistencies identified by the reviewers.

Response from Authors: These errors have been corrected.

5. Have a collaborator or a scientific writing service read the manuscript and provide suggestions on how to improve readability.

Response from Authors: The manuscript has been reviewed by a collaborator to ease readability.

6. Confirm that the RNA-seq data have been deposited into a repository for access and include the access information.

Response from Authors: The RNA-seq data are being deposited in the European Genome-phenome Archive (EGA) database (accession number EGAS00001004668). As the data are personal and potentially sensitive, the National Committee on Health Research Ethics and the Danish Data Protection Agency require that both the European General Data Protection Regulation (GDPR) and Danish data protection laws are complied with unconditionally before data access can be granted. This means that requesters must agree to enter into a data access agreement with the Danish data controller, which among other things ensures that the data will only be used for statistical and/or research purposes and that the confidentiality of the data subjects will be preserved at all times. We confirm that the authors had no special access to the data, and that qualifying readers can obtain the same data the authors had access to.

Data availability statement:

‘RNA sequencing data is deposited at the European Genome-phenome Archive (EGA, https://www.ebi.ac.uk/ega/; accession number EGAS00001004668), which is hosted by the European Bioinformatics Institute (EBI) and the Centre for Genomic Regulation (CRG). Data sharing is only possible for the sole purpose of carrying out statistical or scientific research of significant importance to society. Requests for data access should be directed to the EGAS00001004668 Data Access Committee through the EGA website.’

5. Review Comments to the Author

Reviewer #1: Árnadóttir et al, have RNA sequenced 41 biopsies from 14 stage II or III CRC tumours. They also carried out multiplex immune protein analysis on 89 biopsies from 29 tumours. The amount of ITH was explored by analysing the RNA sequencing data from multi-region samples. By combining three different RNA sequencing based classifiers they hoped to establish an understanding on the amount of ITH present. Specifically, they combined CRC Intrinsic Subtypes (CRIS), The Consensus Molecular Subtypes (CMS) and TUMOR type (TT). This was an interesting paper to read. Below are some minor points:

The paper has been written well and is clear, with only two minor points. There was one typo present on line 51 “outcome [2] In” is missing a full stop. A grammatical mistake was present on line 393 "It might be speculated, whether the microbiota exhibit ITH in CRC complicating things even further."

Response from Authors:

A full stop has been added in line 51.

The grammatical error has been corrected, the sentence now reads (Line 418): ‘One may speculate if the microbiota also exhibits ITH in CRC complicating things even further.’

Regarding the work, it was a little difficult to discern whether or not the ITH was due to the different transcripts the classifiers were using or were inherent in the data, perhaps this could be made clearer?

Response from Authors: The origin of the ITH is now explained further, to highlight that the differences in ITH reported by the three classifiers (as discordant subtype calls) are due to the different transcripts the classifiers use. Overall, the ITH observed between biopsies is reflected by many transcripts: we now present a correlation matrix in S1A Fig and a summary figure in S1B Fig, which illustrate that biopsies from patients with discordant subtype calls exhibit a poorer intra-patient correlation in RNA expression (>12,000 RNAs) than biopsies from patients with concordant subtype calls. This indicates that ITH affects many transcripts and goes beyond the transcripts included in each of the classifiers.

Still, among the many transcripts that exhibit ITH, the different classifiers focus on different transcript subsets. The CRIS classifier was originally developed by selecting only transcripts originating from the cancer epithelial cells. Therefore, CRIS classification will primarily capture the ITH caused by different cancer cell clones in the multiregional biopsies. In contrast, the CMS and TT classifiers were developed using all transcripts (from both the epithelial cells and the surrounding tumor stroma cells). Consequently, CMS and TT additionally capture ITH arising from differences in the abundance of non-tumor (e.g. immune) cell types in the biopsy, on top of differences in the cancer cells.

In accordance with these differences in the classifier transcripts, we find that the CMS classifier exhibits discordant calls exclusively in the proximal colon, which is known to exhibit higher and varying immune cell infiltration (e.g. PMCID: PMC6048410), whereas the CRIS classifier exhibits most discordance in the immune-cell-poorer distal/rectal tumors, where cancer epithelial transcripts are expected to contribute relatively more to ITH. To introduce these considerations in the manuscript, the following sentences have been added:

Line 197: ‘Since these classifiers use different gene sets and approaches for subtyping, they reflect different kinds of ITH; e.g. ITH arising due to inter-biopsy differences in cell type distribution of both cancer epithelium and tumor stromal cells (CMS and TT) or ITH arising due to transcriptional differences between the cancer cell populations in the biopsies (CRIS).’

Line 232: ‘Hence, the ITH detected with the classifiers, which use a subset of transcripts, are also present when assessing ITH across all transcripts. In line with this, tumors that were concordantly subtyped had a significantly higher intra-tumor correlation at transcriptional level, compared to tumors with discordantly subtyped biopsies (p = 0.0037, Wilcoxon rank sum test) (S1B Fig).’

Line 380: ‘Here the presence or absence of ITH was assessed by subtype classification of multiregional tumor biopsies using the well-established CRC subtype classifiers CMS, TT and CRIS. They reflect different kinds of ITH; either ITH arising due to differences in cell type distributions (CMS and TT) or ITH arising due to transcriptional differences between the cancer cells in the multiregional biopsies (CRIS).’
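To illustrate the comparison quoted in Line 232, the following is a minimal sketch of a two-sided exact Wilcoxon rank sum test applied to per-tumor intra-tumor correlation values. The function names and the toy correlation values are assumptions made for illustration only; the response does not state which implementation (exact or normal approximation) was used, and the numbers below are not the study's data.

```python
from itertools import combinations


def midranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_exact(group_a, group_b):
    """Two-sided exact rank sum p-value by enumerating all group assignments."""
    ranks = midranks(list(group_a) + list(group_b))
    n_a, n = len(group_a), len(ranks)
    expected = n_a * (n + 1) / 2          # rank sum of group A under the null
    observed = abs(sum(ranks[:n_a]) - expected)
    hits = total = 0
    for idx in combinations(range(n), n_a):
        total += 1
        if abs(sum(ranks[i] for i in idx) - expected) >= observed - 1e-9:
            hits += 1
    return hits / total


# Hypothetical intra-tumor Pearson r values (invented, not from the paper).
concordant = [0.95, 0.93, 0.96, 0.94]
discordant = [0.80, 0.85, 0.78, 0.88]
p_value = rank_sum_exact(concordant, discordant)
print(f"two-sided exact p = {p_value:.4f}")
```

Full enumeration is only practical for small cohorts such as the 14 tumors described here; larger groups would normally use a normal approximation instead.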

Regarding the statistical analysis, one comment on the heatmap and dendrogram in Figure 2a: it might be useful to also bootstrap to determine whether or not the clusters are significant.

Response from Authors: A bootstrap analysis has been performed for the data in figure 2a and the figure has been updated with information on the significance of tumor-specific clustering. The following information has been added to the material and methods:

Line 156: ‘Clustering bootstrapping was performed to evaluate the significance of patient-specific clusters using the pvclust R package (1000 repetitions) and significance were estimated by Approximately Unbiased (AU) p-values: Clusters were considered significant for AU values ≥95, which indicates a significance p-value ≤0.05 [30].’

And in the figure legend for figure 2a:

Line 249: ‘The clusters indicated with asterisk (*) were statistical significant as evaluated by bootstrapping (Approximately Unbiased (AU) values ≥95).’
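The idea behind bootstrapping a dendrogram can be sketched as follows. This is a simplified illustration, not the authors' pvclust analysis: it computes only a plain bootstrap proportion (the share of gene resamples in which a tumor's biopsies form their own branch), not the multiscale Approximately Unbiased (AU) value, and it assumes single linkage with a correlation distance of 1 - r. All function names and the toy profiles are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either profile is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def is_single_linkage_cluster(members, samples, dist):
    """True if `members` form one branch of a single-linkage dendrogram."""
    members = list(members)
    outside = [s for s in samples if s not in members]
    if not outside:
        return True
    # Shortest link from the group to any sample outside it.
    gap = min(dist[a][b] for a in members for b in outside)
    # The group is a branch if it is connected by links shorter than that gap.
    reached, frontier = {members[0]}, [members[0]]
    while frontier:
        current = frontier.pop()
        for other in members:
            if other not in reached and dist[current][other] < gap:
                reached.add(other)
                frontier.append(other)
    return len(reached) == len(members)


def bootstrap_support(profiles, members, n_boot=200, seed=0):
    """Share of gene-level bootstrap resamples in which `members` cluster together."""
    rng = random.Random(seed)
    samples = list(profiles)
    n_genes = len(profiles[samples[0]])
    hits = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n_genes) for _ in range(n_genes)]  # resample genes
        res = {s: [profiles[s][i] for i in idx] for s in samples}
        dist = {a: {b: 1 - pearson_r(res[a], res[b]) for b in samples}
                for a in samples}
        hits += is_single_linkage_cluster(members, samples, dist)
    return hits / n_boot


# Toy expression profiles (invented): two tumors, two biopsies each.
toy = {
    "P01-Ta": [1, 2, 3, 4, 5, 6, 7, 8],
    "P01-Tb": [1.2, 2.1, 2.9, 4.3, 5.1, 5.8, 7.2, 8.1],
    "P02-Ta": [8, 7, 6, 5, 4, 3, 2, 1],
    "P02-Tb": [7.9, 7.2, 6.1, 4.8, 4.1, 3.2, 1.9, 1.2],
}
support = bootstrap_support(toy, ["P01-Ta", "P01-Tb"])
print(f"bootstrap support for the P01 cluster: {support:.2f}")
```

A plain bootstrap proportion is known to be biased as a p-value, which is the motivation for the AU correction that pvclust reports alongside it.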

Reviewer #2: The manuscript by Arnadottir et al describes an interesting multiregional sampling of multiple tumors (14 tumors with 41 regions for RNA-seq; 29 tumors with 89 regions for multiplex immunoassays). Similar DNA sequencing types of studies have revealed widespread genetic and epigenetic ITH. The authors find that sometimes the regions are concordant for the measurements within a tumor, and sometimes not. They show that tumor location within the colorectum influences the type of expression ITH. Overall, these findings are of interest. The manuscript is complex and difficult to follow in many places. This reviewer found it difficult to get a “big picture” impression of the data. However, the overall data are important because they show how one biopsy may not be adequate for reproducible classification using the illustrated schemes.

Several comments:

1) Fig 1 presents the overall data well. However, the Results for the RNA-seq data (page 8) are based on classification schemes, with 50% of the tumors having at least one biopsy with a different classification. It would be useful to:

a) provide the RNA-seq depth (reads per sample)

Response from Authors: The RNA-seq read depths (total reads per sample) have been added to the Excel file ‘S1_Table.xlsx’.

Moreover, the following sentence has been added to the manuscript:

Line 118: ‘A minimum of 34 million read pairs (median 65 million read pairs) were sequenced per sample on an Illumina NextSeq500 using high output flow cells (Illumina).’

b) provide some statistical summary of the raw data (i.e. the correlation of CPM values between samples from the same tumor)

Response from Authors: We have included correlation matrixes as S1 Fig, which contain the Pearson’s correlation r-values for pairwise inter-sample comparisons of CPM values from all 41 samples with RNA data and protein expression values for 89 samples with protein data.

The following lines have been added to the manuscript:

Line 230: ‘…pair-wise inter-sample correlations in RNA expression profiles are given in S1A Fig.’

Line 322: ‘…pair-wise inter-sample correlations in protein expression profiles are given in S1C Fig).’
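As a minimal sketch of the kind of summary described in this response, the snippet below computes pairwise Pearson correlations between samples and averages them separately for pairs from the same tumor and pairs from different tumors. The function names, the sample keys and the toy values are assumptions for illustration; the actual S1 Fig matrices were computed by the authors on CPM and protein expression values and are not reproduced here.

```python
import math
from itertools import combinations
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def within_between_correlation(profiles):
    """profiles maps (tumor_id, biopsy_id) -> list of expression values.

    Returns (mean within-tumor r, mean between-tumor r)."""
    within, between = [], []
    for (key_a, prof_a), (key_b, prof_b) in combinations(profiles.items(), 2):
        r = pearson_r(prof_a, prof_b)
        (within if key_a[0] == key_b[0] else between).append(r)
    return mean(within), mean(between)


# Toy log2(CPM)-like values, invented for illustration only.
toy = {
    ("P01", "Ta"): [1.0, 2.0, 3.0, 4.0, 5.0],
    ("P01", "Tb"): [1.1, 2.1, 2.9, 4.2, 5.0],
    ("P02", "Ta"): [5.0, 1.0, 4.0, 2.0, 3.0],
    ("P02", "Tb"): [4.8, 1.2, 4.1, 2.2, 2.9],
}
within_r, between_r = within_between_correlation(toy)
print(f"within-tumor r = {within_r:.2f}, between-tumor r = {between_r:.2f}")
```

A within-tumor mean well above the between-tumor mean would indicate that biopsies resemble their own tumor more than other tumors, which is the pattern the reviewer asked to see quantified.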

c) Provide the 50% misclassification rate information in the Abstract

Response from Authors: This information has been added to the abstract (line 35), which now reads: ‘Subtyping of proximal MSS tumors were discordant for 50% of the tumors, this ITH was related to differences in the microenvironment. Subtyping of distal MSS tumors were less discordant, here the ITH was more cancer-cell related’.

2) Fig 2 shows that using RNA expression levels, the different biopsies from the same patients tend to cluster with individual patients. This Figure is hard to decipher. It might be useful to put arrows to indicate samples that do not cluster by patient (such as P05-Tc). The manuscript also states that samples “cluster” by subtype, but this claim is not all that obvious in Fig 2A because some subtypes (say for example CRIS) seem more randomly distributed along the horizontal axis. This statement should be clarified.

Response from Authors: Arrows have been added to Fig 2, indicating the location of the samples from the heterogeneous tumors P05, P13 and P17. The following description has been added to the figure legend (line 250): ‘Arrows below the heatmap indicate discordant clustering biopsies (black: P05, red: P13, blue: P17).’

The statement about clustering of subtypes has been clarified (line 236), it now reads: ‘…However, the majority of biopsies cluster in tumor-specific clusters. Furthermore, the biopsies also tend to cluster based on the subtype combination called across the classifiers. This is especially pronounced for the CMS and TT subtypes while the CRIS subtypes are more intermixed. For some heterogeneous tumors (P05 and P13), their biopsies cluster with biopsies from other tumors with the same subtype combination, rather than clustering in tumor-specific clusters.’

3) Fig 3 looks at the most discordantly expressed genes (STD>.5) and does a supervised gene enrichment analysis. Such types of analysis almost always give an output. The authors describe their analysis but provide little insights on how this type of analysis can explain expression ITH.

Response from Authors: The following sentence has been added (line 308): ‘These network maps illustrates the dynamic state of RNA transcripts, as differences in ongoing cellular functions cause ITH on RNA level.’
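The gene selection the reviewer refers to (STD > 0.5) can be sketched as a per-gene standard deviation across the biopsies of one tumor. This is an assumption-laden illustration: the gene names and values are invented, and whether the authors used the sample or the population standard deviation, or computed it per tumor, is not stated in this correspondence.

```python
from statistics import stdev


def high_ith_genes(expression, threshold=0.5):
    """Return genes whose expression varies strongly between biopsies.

    expression maps gene -> log2 expression values, one per biopsy of a
    single tumor. Uses the sample standard deviation (an assumption).
    """
    selected = {}
    for gene, values in expression.items():
        spread = stdev(values)
        if spread > threshold:
            selected[gene] = spread
    return selected


# Hypothetical log2 expression values for three biopsies of one tumor.
toy_tumor = {
    "GENE_A": [5.0, 5.1, 4.9],   # stable across biopsies
    "GENE_B": [2.0, 3.5, 5.0],   # strongly variable across biopsies
    "GENE_C": [7.0, 7.2, 6.9],   # stable across biopsies
}
variable = high_ith_genes(toy_tumor)
print(sorted(variable))
```

Genes passing such a filter would then be the input to the enrichment step; as the reviewer notes, an enrichment analysis on any gene list tends to return terms, so the filter itself does not establish biological relevance.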

4) Fig 4 shows abundant protein level ITH with only 9 of 30 tumors (why is it 29 in the Abstract?) having their multiple regions cluster together. A simple, more standard statistical description such as a correlation coefficient (within and between tumors) could provide more meaning to this data. This poor concordance information between biopsies from the same tumor should be mentioned in the Abstract.

Response from Authors: ‘9 out of 30’ is an error, it has been corrected to ‘9 out of 29’.

We have included correlation matrixes as S1 Fig, which contain the Pearson’s correlation r-values for pairwise inter-sample comparisons of CPM values from all 41 samples with RNA data and protein expression values for 89 samples with protein data.

The information about the poor concordance on protein level between biopsies from the same tumor has been added to the abstract.

The following lines have been added to the manuscript:

Line 230: ‘…pair-wise inter-sample correlations in RNA expression profiles are given in S1A Fig.’

Line 322: ‘…pair-wise inter-sample correlations in protein expression profiles are given in S1C Fig).’

Line 40 (abstract): ‘Unsupervised hierarchical clustering of the protein data identified large ITH at protein level; as the multiregional biopsies clustered together for only 9 out of 29 tumors.’

Attachment

Submitted filename: Response to reviewers.docx

Decision Letter 1

Amanda Ewart Toland

9 Oct 2020

Transcriptomic and proteomic intra-tumor heterogeneity of colorectal cancer varies depending on tumor location within the colorectum

PONE-D-20-20744R1

Dear Dr. Andersen,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Amanda Ewart Toland, Ph.D.

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: (No Response)

Reviewer #2: The authors have improved the manuscript. Describing "heterogeneity" is complex and the authors have done a good job in describing ITH at the protein and transcript level in CRCs

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Acceptance letter

Amanda Ewart Toland

9 Dec 2020

PONE-D-20-20744R1

Transcriptomic and proteomic intra-tumor heterogeneity of colorectal cancer varies depending on tumor location within the colorectum

Dear Dr. Andersen:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Amanda Ewart Toland

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. Inter-sample correlations in RNA expression and protein expression profiles.

    (A) Correlation matrix showing the pair-wise, inter-sample correlations (Pearson’s r) in RNA expression profiles (12,593 RNAs with average expression >1 (log2(CPM))). (B) Plot showing the intra-tumor correlation in RNA expression between biopsies (Pearson’s r; Y-axis) according to the number of classifiers (CMS, CRIS, TT) that exhibit discordant subtype calls for a tumor sample (X-axis; no discordant calls is ‘0’, whereas values 1, 2 and 3 indicate the number of classifiers that exhibit discordant subtype calls for a tumor). Biopsies that have discordant calls exhibit lower correlation in RNA expression to the other biopsies from the same tumor than biopsies with no discordant calls. Red bars indicate average values for each category. The p-value indicates that biopsies with no discordant calls for any classifier (X-axis = 0) exhibit significantly higher intra-tumor correlation in RNA expression than biopsies from tumors with discordant classifier calls (X-axis = 1–3) as evaluated by a Wilcoxon rank sum test (WRS). (C) Correlation matrix showing the pair-wise, inter-sample correlations (Pearson’s r) in protein expression profiles (68 proteins).

    (TIF)

    S2 Fig. Tumor specific network maps.

    (A) Subtyping results for each biopsy from tumors P04 and P17. (B-C) Tumor-specific network maps illustrating the 5000 ssGSEA terms with the highest ITH for tumor P04 in (B) and tumor P17 in (C). Yellow dots/font indicate mechanisms that are upregulated in the sample compared to the other samples from the same tumor, while blue indicates downregulated mechanisms.

    (TIF)

    S1 Table. Sample overview.

    Sample information about multiregional biopsies, including PatientID, SampleID, Tumor_location, MSI_MSS, Age, Gender, Subtyping results, and RNA sequencing read depth.

    (XLSX)

    Attachment

    Submitted filename: Response to reviewers.docx

    Data Availability Statement

    All relevant data are available on EGA (accession no. EGAS00001004668).


    Articles from PLoS ONE are provided here courtesy of PLOS
