Summary
Although breast cancer mortality is largely caused by metastasis, clinical decisions are based on analysis of the primary tumor and on lymph node involvement but not on the phenotype of disseminated cells. Here, we use multiplex imaging mass cytometry to compare single-cell phenotypes of primary breast tumors and matched lymph node metastases in 205 patients. We observe extensive phenotypic variability between primary and metastatic sites and that disseminated cell phenotypes frequently deviate from the clinical disease subtype. We identify single-cell phenotypes and spatial organizations of disseminated tumor cells that are associated with patient survival, and we find a weaker survival association for high-risk phenotypes in the primary tumor. We show that p53 and GATA3 in lymph node metastases provide prognostic information beyond clinical classifiers and can be measured with standard methods. Molecular characterization of disseminated tumor cells is an untapped source of clinically applicable prognostic information for breast cancer.
Keywords: breast cancer, lymph node metastases, multiplexed imaging, single-cell analysis, imaging mass cytometry, metastasis, metastatic process, prognostics signature, disseminating tumor cells, survival analysis
Highlights
• Tumor cell phenotypes differ in primary breast tumors and matched lymph node metastases
• High- and low-risk disseminated cell phenotypes and prognostic markers are identified
• Tumor cell phenotypic heterogeneity and spatial mixing are associated with good outcome
• Disseminated tumor cell phenotypes are an untapped source of clinical information
Fischer et al. use multiplex imaging mass cytometry to analyze single tumor cell phenotypes in a breast cancer patient cohort with matched primary tumors and lymph node metastases. They report high phenotypic variability between these tissue sites and describe disseminated cell phenotypes and spatial organization associated with patient prognosis.
Introduction
Guided by histopathological classification and molecular characterization, targeted treatment has significantly improved the prognosis of patients with breast cancer.1,2,3,4,5,6,7,8,9 Still, approximately one-third of breast cancer patients eventually develop distant metastases and succumb to disease.10 Although mortality is largely caused by metastases, the histologic and genetic analysis on which clinical decisions are based is performed on the primary tumor that is removed during treatment. Disseminated tumor cells in nodal metastases, while also an important classifier of breast cancer risk, are not commonly characterized for standard clinical markers and have yet to be studied in large patient cohorts with single-cell approaches. For these reasons, little is known about how disseminating cell phenotypes relate to those of the primary tumor or to disease outcome.
The spread of tumor cells to sentinel lymph nodes (LNs) is an important prognostic factor in breast cancer and a key criterion in treatment decisions.11,12,13,14 Nodal status quantifies the number and/or locations of LNs with metastases in clinical TNM (tumor-node-metastasis) staging of breast cancer patients.13 The favorable nodal status pN0 defines patients without detected metastases in sentinel LNs. Node-positive patients are categorized as pN1, pN2, or pN3; these categories indicate increasing numbers of axillary LNs with metastases or increasing distance of involved non-axillary LNs from the primary tumor and are associated with increasingly worse prognosis. Node-positive breast cancer patients typically receive adjuvant chemotherapy.12,15
Nodal status does not, however, provide information about the morphological, molecular, or cellular characteristics of LN metastases, in contrast to measures such as tumor grade or molecular subtype that are used to guide clinical decisions. Based on current single-marker thresholds for classifying tumors as estrogen receptor (ER)-positive or human epidermal growth factor receptor 2 (HER2)-positive (1% and 10% of positive single cells, respectively, according to American Society of Clinical Oncology/College of American Pathologists [ASCO/CAP] guidelines), consistent molecular subtype classification of primary tumors and LN metastases has been reported.16,17 However, several other studies indicate that tumor grade and molecular subtype markers (ER, progesterone receptor [PR], HER2, and Ki-67) are not always consistently expressed through tumor progression or at primary and metastatic sites.18,19,20,21,22,23 Furthermore, some patients with high nodal status survive long term whereas some node-negative patients do not, indicating that these features do not fully capture patient prognosis.24,25 Single-cell studies have shown substantial intra- and inter-tumor heterogeneity in primary tumors associated with tumor type and patient outcome,26,27,28,29,30,31 but which single-cell phenotypes disseminate is unknown.
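As a rough illustration of how such single-marker thresholds operate, the following Python sketch classifies one sample from per-cell positivity calls. Only the 1% (ER) and 10% (HER2) cutoffs come from the text; the function names, the input layout, the output labels, and the choice to treat a sample exactly at the cutoff as positive are assumptions made for this example.

```python
from typing import Sequence

# Cutoffs quoted in the text, expressed as fractions of positive cells.
ER_CUTOFF = 0.01
HER2_CUTOFF = 0.10


def positive_fraction(calls: Sequence[bool]) -> float:
    """Fraction of cells called positive for a single marker."""
    if len(calls) == 0:
        raise ValueError("no cells provided")
    return sum(calls) / len(calls)


def classify_sample(er_calls: Sequence[bool], her2_calls: Sequence[bool]) -> str:
    """Hypothetical helper: label a sample by thresholding each marker separately."""
    er_pos = positive_fraction(er_calls) >= ER_CUTOFF      # assumed: cutoff is inclusive
    her2_pos = positive_fraction(her2_calls) >= HER2_CUTOFF
    return f"ER{'+' if er_pos else '-'}/HER2{'+' if her2_pos else '-'}"


# Toy sample of 1,000 cells: 2% ER-positive cells, 5% HER2-positive cells.
er = [True] * 20 + [False] * 980
her2 = [True] * 50 + [False] * 950
label = classify_sample(er, her2)
```

In this toy sample, 98% of the cells are ER-negative, yet the sample is labeled ER-positive. A sample-level label of this kind therefore says little about which cell phenotype predominates.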
We reasoned that paired analyses of cellular phenotypes in primary tumors and matched LN metastases could reveal biologically and clinically relevant features of cancer progression. We thus used 40-plex imaging mass cytometry (IMC)32 to perform such an analysis in 205 breast cancer patients using a tissue microarray (TMA)-based approach. Although ASCO/CAP classifications largely matched between paired primary tumors and LN metastases, the most abundant cell phenotype in the LN rarely coincided with that in the primary tumor core. The predominant disseminated cell phenotype was, however, typically detectable as a cellular subpopulation in the matched primary tumor (78% of sample pairs), indicating that cell phenotypes with metastatic potential can usually be identified in primary tumors.
There were large variations in cellular phenotypes between primary and LN sites, including frequent deviation of the disseminated cells from the clinically assigned molecular subtype of the primary tumor. HER2 was the most stable clinical marker across tissue pairs, and cells representing the triple-negative subtype were enriched in LN metastases. For tumors clinically defined as luminal but with predominantly triple-negative LN metastases, we observed triple-negative cells in the primary tumor, which by definition are not considered in clinical decisions. We confirmed the presence of these cells with immunofluorescence imaging of whole primary tumor sections. We identified single-cell phenotypes and spatial organizations of tumor cells in LN metastases that were associated with both favorable and poor outcomes, identified marker signatures reflective of the prognostic disseminated phenotypes, and validated these using immunohistochemistry (IHC). We have thus identified metastatic biomarkers that are measurable with standard clinical methods to yield prognostic information beyond nodal status and classification of the primary tumor.
Results
We compared the single-cell phenotypic landscape of primary tumors with that of disseminated cells in matched LN metastases in a cohort of breast cancer patients with clinicopathological and long-term survival information. The sample set comprised TMA cores of pathologist-selected tumor regions from 771 primary tumors and 271 LN metastases collected prior to treatment, with matched samples from primary tumor and LN metastasis for 205 patients. We stained TMAs with a 40-plex breast-cancer-specific antibody panel targeting the clinical markers ER, PR, HER2, and Ki-67 as well as signaling proteins, epigenetic factors, oncogene products, markers of treatment resistance and invasive potential, and tumor microenvironment components such as endothelial, mesenchymal, and immune cell types (Figure S1). High-dimensional images were acquired using IMC, segmented into more than 2 million single cells, and analyzed on the basis of single-cell marker expression and cellular organization within the tissue architecture (Figures 1 and S1).32,33
Figure 1.
Single-cell phenotypes of primary breast tumors and matched LN metastases characterized by IMC
(A) Heatmap of Z-scored mean marker expressions of the epithelial phenotype clusters identified using PhenoGraph across the cells of all tumors in this study. Markers used for clustering, additional markers of interest, and spatial single-cell features are indicated. The fractions of cells in each cluster found in primary breast cancers and LN metastases are shown (right). Relative fractions of cells shared between the clusters and clinically assigned molecular subtypes are visualized as bubble plots (left), and absolute cell counts are displayed in the adjacent bar plots.
(B) Scatterplot relating the log fold changes (FCs) of tumor phenotype clusters between primary tumors and LN metastases to their overall abundance as log counts per million (log CPM). Significantly differentially abundant clusters identified by edgeR (p < 0.01) are highlighted in color. Low-abundance clusters with a log CPM below 12 are not labeled with the cluster number.
(C) Heatmap of Z-scored mean marker expressions of non-epithelial phenotype clusters identified using PhenoGraph (otherwise equivalent to A).
(D) t-Distributed stochastic neighbor embedding dimensionality reduction map of 227,820 randomly subsampled cells (10% from each image) colored by tissue type of origin (top) or single-cell phenotype (bottom).
See also Figure S1.
We identified the single-cell phenotypes in all 771 primary tumor and 271 metastatic LN tissues based on high-dimensional single-cell marker expression. We initially classified cells as epithelial or non-epithelial and then identified single-cell phenotypes using unsupervised PhenoGraph clustering34 on each group with a selected set of markers. Within the epithelial cell group, we identified 59 tumor cell clusters or phenotypes; most were found at both the primary and metastatic LN sites, and there were no marker intensity shifts between sites (Figure 1A). These tumor cell phenotypes reflected all clinical subtypes and were very similar to previously identified phenotypes.26,27 We identified multiple non-luminal (luminal cytokeratin [CK]-negative) and triple-negative (ER-, PR-, and HER2-negative) subtypes, further distinguished from each other by varying expression of basal CKs, c-MYC, EGFR, p53, CAIX, Ki-67, and CD15 (Figure 1A).35 We also identified luminal cell phenotypes (CK7-, CK8/18-, or panCK-positive but CK5/14-negative) with and without ER, PR, and/or HER2 expression and with varying levels of luminal CKs, E/P-cadherin, p-mTOR, GATA3, and androgen receptor (AR) (Figure 1A). Bcl-2, proposed as a prognostic breast cancer marker,36,37,38 strongly correlated with both GATA3 and ER expression in our analysis (Spearman correlations of 0.67 in both cases) and added little discriminatory information. We also observed a correlation between AR and HER2 expression, particularly in tumors of the HER2 molecular subtype, supporting prior evidence that AR signaling contributes to HER2-driven malignancy independent of ER (Figure 1A).39 None of the single-cell phenotypes were found exclusively in LN metastases, but cell phenotypes with high expression of ER, PR, GATA3, Bcl-2, p-mTOR, and the luminal CKs were enriched in primary tumor tissues (e.g., clusters 8, 9, 20, and 49) (Figure 1A). 
Differential abundance analysis confirmed this and showed that a phenotype with low hormone receptor expression (cluster 34) and certain triple-negative phenotypes (clusters 29, 37, 58, and 59) were significantly more abundant in LN metastases than in primary tumor tissues (Figure 1B).
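To make the abundance scale used in Figure 1B more concrete, here is a minimal Python sketch of log counts per million and a log fold change between two sites. It is not edgeR: normalization, dispersion estimation, and statistical testing are omitted. The function names, the pseudocount, the base-2 logarithm, and the sign convention (positive means enriched at the second site) are assumptions for illustration.

```python
import math


def log2_cpm(count: float, total: float, pseudo: float = 0.5) -> float:
    """log2 of counts per million; a small pseudocount keeps zero counts finite."""
    return math.log2((count + pseudo) / total * 1e6)


def log2_fold_change(count_a: float, total_a: float,
                     count_b: float, total_b: float,
                     pseudo: float = 0.5) -> float:
    """Difference of log2 CPM values: positive when the cluster is enriched at site b."""
    return log2_cpm(count_b, total_b, pseudo) - log2_cpm(count_a, total_a, pseudo)


# Toy numbers: a cluster with 100 of 100,000 cells at one site and 400 of 100,000 at the other.
fc = log2_fold_change(100, 100_000, 400, 100_000, pseudo=0.0)
```

Under the base-2 assumption, a log CPM of 12 corresponds to 2^12 = 4,096 cells per million, or roughly 0.4% of all cells.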
Clustering of non-epithelial cells identified 31 stromal and immune cell phenotype clusters (Figure 1C). We separated immune cells into macrophages, B cells, CD4+ and CD8+ T cells, dendritic cells, granulocytes, and antigen-presenting cells (HLA-DR+) (Figure 1C). Cells of all immune and stromal clusters were found in both primary tumor and LN metastases but, as expected, most stromal cells were from primary tumors, and certain CD4+ T cells and B cells, including highly proliferative ones, were strongly enriched in the LNs (Figures 1C and 1D).
Single-cell phenotypes differ in paired primary tumors and LN metastases
To relate disseminated tumor cells to the single-cell phenotypes in the primary tumor, we compared primary tumors with matched LN metastases in 205 patients. The primary tumor almost always contained cells corresponding to the clinical molecular subtype (Figure 2A), and only minor variability in subtype was observed between matched samples based on ASCO/CAP thresholds for hormone-receptor-positive (1% positive cells at the time of classification) or HER2-positive (10% positive cells) classification (Figure 2B). However, multiplex analysis of all cells revealed substantial differences between the tumor cell phenotypes at the two sites of the same patient (Figures 2C and S2A). The most abundant cell phenotype differed between primary tumor and LN metastasis in 84% of cases (Figures 2C and 2D), based on analysis of single available cores representing approximately 1,800 cells from each site. However, the most abundant disseminated phenotype cluster was detected as a subpopulation in 78% of primary tumors (Figures 2C and S2B).
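The paired comparison described above can be sketched in a few lines of Python. The sketch assumes each core is represented as a list of per-cell cluster labels and uses the 5-cell presence threshold given in the Figure 2D legend. The function names and the returned dictionary are hypothetical, and ties for the most abundant phenotype are broken arbitrarily here.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence

MIN_CELLS = 5  # presence threshold taken from the Figure 2D legend


def predominant(labels: Sequence[Hashable]) -> Hashable:
    """Most abundant phenotype label in one core (ties resolved arbitrarily)."""
    return Counter(labels).most_common(1)[0][0]


def compare_pair(primary_labels: Sequence[Hashable],
                 ln_labels: Sequence[Hashable],
                 min_cells: int = MIN_CELLS) -> Dict[str, object]:
    """Relate the predominant LN phenotype to the matched primary tumor core."""
    primary_counts = Counter(primary_labels)
    ln_top = predominant(ln_labels)
    return {
        "ln_predominant": ln_top,
        "same_predominant": predominant(primary_labels) == ln_top,
        "detected_in_primary": primary_counts[ln_top] >= min_cells,
        "fraction_in_primary": primary_counts[ln_top] / len(primary_labels),
    }


# Toy pair: the primary core is dominated by cluster 20, the LN core by cluster 27.
primary = [20] * 900 + [27] * 100
lymph_node = [27] * 800 + [20] * 50
result = compare_pair(primary, lymph_node)
```

In this toy pair the predominant phenotypes differ between sites, but the predominant LN phenotype is still detectable as a 10% subpopulation of the primary core.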
Figure 2.
Single-cell phenotypes differ in paired primary tumors and LN metastases
(A) Bar plot of the number of patients for whom the clinical classification corresponded to the classification based on ASCO/CAP thresholds for hormone receptors (HRs) and HER2 applied to IMC measurements of the primary tumor core. Patients for whom this is not the case, but whose primary tumor ASCO/CAP-based classification matches that of the matched LN metastasis (primary = LN met), are indicated.
(B) Flow diagram relating ASCO/CAP-based classification of the primary tumor core (left) to that of the matched LN metastasis core (right). Block sizes represent the number of patients with the indicated classification, and lines connect matched primary tumor and LN metastasis samples from the same patient.
(C) Flow diagram relating the predominant primary tumor phenotype (left) to the most abundant phenotype identified in the matched LN metastasis (right). Blocks are colored according to the most abundant tumor phenotype, and the block sizes represent the number of patients with the indicated phenotype. Lines connecting primary tumors and LN metastases are colored according to the predominant phenotype of the primary tumor, and their width represents the number of patients shared between blocks.
(D) Bars indicate the number of patients for whom the most abundant disseminated single-cell phenotype matches the predominant phenotype in the primary tumor (left) or is identified (at least 5 cells) in the primary tumor (right). The ranked bar plot displays the fraction of cells in the primary tumors corresponding to the cell phenotype that is most abundant in the matched LN metastases. Bar color indicates the most abundant disseminated tumor phenotype.
(E) Log FCs in frequency of tumor phenotypes between the matched primary breast and LN site identified as significant (p < 0.01) by paired-design differential abundance testing using edgeR and with overall abundance above 12 log CPM.
See also Figures S2–S4.
For primary or LN tissues analyzed separately, we detected strong positive correlations only among cluster densities of similar phenotypes (Figure S2C), as previously observed.26 However, when matched primary and LN metastatic tissues were compared, we observed positive correlations between different phenotypes including phenotypes characterized by different clinical molecular markers (Figure S2C). Accordingly, categorizing patients based on the cellular composition of their primary tumor or LN metastasis yielded different groups (Figure S3).
Clinical marker expression in paired primary tumors and LN metastases
We compared clinical marker expression in matched primary and metastatic samples. Molecular subtype assignments based on ASCO/CAP criteria matched well between sites (Figures 2B and S4A) but multiplexed single-cell phenotypes showed substantial variability (Figures 2C and S4B). Overall, the fraction of tissues with high hormone receptor expression was smaller in LN metastases compared with primary tumors (Figures 2C and S4B). Certain hormone-receptor-positive cell phenotypes (e.g., epithelial cluster 20) formed the bulk of some primary tumors but were not detected in LN metastases (Figures 1A, 1B, and 2C); the disseminated cells of these primary tumors were frequently negative for hormone receptors (e.g., epithelial cluster 27, Figure 2C). Other cell phenotypes, such as those with high HER2 expression (e.g., epithelial cluster 17) of the HER2 or luminal B (HER2+) clinical subtype, dominated both the primary tumor and corresponding LN metastasis in most cases (Figure 2C), as we observed previously.40 Nevertheless, as in previous studies,23,40,41,42 our analysis did identify a few HER2-positive primary tumors that resulted in predominantly HER2-negative or even luminal hormone-receptor-positive HER2-negative LN metastases (Figure 2C; predominant primary/LN phenotype pairs 17/21, 17/4, and 17/11). Conversely, we also observed rare cases in which predominantly luminal HER2-negative primary tumors yielded HER2-positive LN metastases (phenotype pairs 8/17 and 11/17, Figure 2C).
Most triple-negative luminal CK-negative primary tumors were associated with similar cell phenotypes in the LN metastases and only rarely gave rise to LN metastases with luminal expression patterns (e.g., phenotype pairs involving cluster 29 and clusters 27, 34, or 22 and phenotype pairs involving clusters 1 or 5 and clusters 11, 37, or 7; Figure 2C). Of greater interest, we observed several cases of lower-risk luminal (i.e., hormone-receptor-positive) primary tumors in which the dominant disseminated cell phenotype was triple-negative (Figures 2C and S4B). These disseminated cells in some cases expressed luminal CKs (e.g., phenotype pair 20/27) and in other cases did not (e.g., phenotype pairs involving cluster 21, 53, or 9 and clusters 29, 42, 37, or 58), the latter showing characteristics of the most aggressive triple-negative breast cancer. The metastatic LN sites in these cases therefore contain higher-risk molecular phenotypes than those revealed by the clinical classification of the primary tumor.
Given the potential clinical implications of this observation, we further investigated triple-negative subpopulations in whole primary tumor sections of ten patients with luminal primary tumors and predominantly triple-negative cells in the LN metastases, using immunofluorescence imaging (Figures 3 and S5; STAR Methods). As expected, most non-epithelial cells as well as epithelial cells of a clinically defined triple-negative control sample were classified as negative for ER, PR, and HER2 (Figures 3A and 3B), with all marker-positive epithelial cells in the triple-negative tumor stemming from healthy ducts. In all the luminal tumors, however, we identified both clinical-marker-positive and triple-negative tumor cell populations (Figures 3A and 3B), in accordance with our single-core multiplex analysis. These triple-negative tumor subpopulations made up less than 15% of tumor cells in 80% of cases (Figures 3B and S2B) and were spatially dispersed (Figures 3C, 3D, and S5). Thus, whole-slide immunofluorescence imaging confirmed the presence of triple-negative cell subpopulations in primary tumors clinically classified as hormone-receptor-positive luminal tumors.
Figure 3.
Triple-negative subpopulations identified in whole sections of luminal primary tumors
(A) Density distributions of single-cell immunofluorescence-based expression of the clinical marker combination (ER, PR, and HER2, combined in the Cy7 fluorescence channel), compared between luminal tumors and the triple-negative reference (left), between tumor and stromal cells (middle), and between clinical marker combination-positive and -negative single cells (right).
(B) Fractions of clinical marker combination-positive and -negative tumor cells in a TNBC reference and in ten luminal primary tumors with predominantly triple-negative tumor cells in the LN metastasis.
(C) Spatial maps of clinical marker-positive and -negative single cells in whole primary tumor sections.
(D) Example images of luminal primary tumor regions containing clinical marker-positive and -negative cells. Bottom row shows magnified regions (indicated with white boxes, top row).
See also Figure S5.
Taken together, these analyses revealed extensive variability in the predominant cellular phenotypes of matched primary tumor and metastatic LN tissues, although the most abundant disseminated cell phenotype was identified as a subpopulation in the primary tumor in most cases. Disseminated tumor cell phenotypes were enriched for triple-negative cells, and these were frequently observed in patients clinically classified as having luminal disease.
Identification of dissemination-prone tumor cells
Next, we assessed pairwise differential abundances of tumor single-cell phenotype clusters between primary tumors and their matched LN metastases with the goal of identifying tumor cell phenotypes that have a propensity to disseminate. Cells with high levels of ER, PR, GATA3, and luminal CKs were more abundant in primary tumors than matched metastases (epithelial clusters 8, 9, 20, 21, 49, and 53; Figure 2E) as were tumor cells with low luminal expression patterns (epithelial cluster 38), with high expression of p-mTOR (epithelial cluster 4), and with some triple-negative cell phenotypes including EGFR-positive (epithelial cluster 5), c-MYC-positive (epithelial cluster 15), and hypoxic cells (epithelial cluster 44; Figure 2E). The cells more abundant in LN metastases than in matched primary tumors included HER2-positive cells with otherwise low marker expression (epithelial cluster 51) and two types of triple-negative cell phenotypes, one expressing high levels of p53 (epithelial cluster 29) and the other highly proliferative and expressing vimentin (epithelial cluster 37; Figure 2E). These findings are consistent with the fact that triple-negative and HER2-positive tumors are considered aggressive. Thus, we identified dissemination-prone cell phenotypes that are enriched in LNs relative to primary tumors, consistent with our observation that the predominant phenotype in LN metastases rarely coincided with that in the primary tumor (Figures 2C and 2D).
Disseminated tumor phenotypes are associated with patient survival
The dissemination of tumor cells to the LNs, reflected by the nodal status, is an important prognostic factor, but current clinical evaluation does not take into account the phenotypes of disseminated cells.15 To determine which disseminated tumor phenotypes are associated with patient survival and how these phenotypes relate to clinical status, we used a lasso-regularized Cox proportional hazards model. We input categorical presence or absence information of all tumor cell phenotypes in the LN metastases (using a threshold of 5 cells to define presence) as well as the clinical classifications (primary tumor grade, molecular subtype of primary tumor, and nodal status), thus controlling for these clinical categories. The model identified a predictive set of disseminating cell phenotypes in LN metastases that stratify node-positive patients by survival, even when taking into account clinical classification of the primary tumor. Disseminating cells that were luminal CK-negative and triple-negative and that expressed high levels of p53 (epithelial cluster 29) were associated with poor patient survival (Figure 4A). Since cells of cluster 29 were more abundant in LN metastases and rarely predominant in primary tumors (Figure 2), these may represent dissemination-prone cells with an increased metastatic potential. Conversely, hormone-receptor-positive disseminated cells were associated with better node-positive patient outcome when identified in LN metastases (in order of increasing effect size: epithelial clusters 34, 23, and 9) (Figure 4A). The largest positive prognosis effect was associated with epithelial cell cluster 9 (Figure 4A). These cells were hormone-receptor-positive with high levels of GATA3 and were more abundant in primary tumor tissues than in LNs (Figures 2E and 4A). This good-prognosis disseminated phenotype may occasionally be drained to LNs but may have only minor invasive potential or may be sensitive to standard therapy.
Figure 4.
Disseminated tumor cell phenotypes are associated with patient survival
(A) Hazard ratios of stratifying predictors selected by a lasso-regularized Cox proportional hazards model for overall survival. Inputs were categorical presence (at least 5 cells) or absence information for all tumor cell phenotypes in the LN and the clinical classifications of the patients (primary tumor grade, molecular subtype of primary tumor, and nodal status) for the 213 patients with LN metastasis samples and survival information available.
(B) Kaplan-Meier survival curves and 95% confidence intervals (CIs) (gray) for patients with exclusively good- or bad-prognosis survival-associated phenotypes as identified in (A) (n = 92 good-prognosis patients, 31 bad-prognosis patients).
(C) Bar plot of the number of patients with different nodal status in the good- and bad-prognosis groups in (B).
(D) Hazard ratios of stratifying features selected by a lasso-regularized Cox proportional hazards model for overall survival. Inputs were average tumor cell marker expression levels in the LN metastases.
(E) Average tumor cell expression levels of the survival-associated LN metastasis markers p53 and GATA3 in a regular Cox proportional hazards model (hazard ratios and 95% CIs displayed) accounting for the clinical patient classifications (primary tumor grade, molecular subtype of primary tumor, and nodal status) for the 213 patients with LN metastasis samples and survival information available. The reference groups for the categorical predictors are grade 3, nodal status pN1, and molecular subtype luminal B (HER2−). Patients without information available for one of the clinical classifications were treated as a separate category referred to as NA.
In (A) and (D), hazard ratios are displayed without CIs because CIs of individual coefficients are not meaningful after feature selection.
We further analyzed patients with disseminated cells of exclusively bad- or exclusively good-prognosis survival-associated phenotypes. Patients with LN metastases containing cells from cluster 29, but not the good-prognosis clusters 9, 23, 34, and 50, had a drastically worse prognosis than those with the opposite pattern (Figure 4B). Although the patients with the worse prognosis were slightly enriched for nodal status pN3 and the better prognosis patients for pN2, approximately 70% of patients in both groups were of nodal status pN1 and could therefore not be distinguished on this basis (Figure 4C). Therefore, phenotyping of the disseminated tumor cells in the LNs provided prognostic information not evident in the primary tumor or in nodal status alone.
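The grouping and survival comparison described above can be sketched as follows. The cluster numbers are those named in the text. The function names, the representation of each patient as a set of clusters present in the LN metastasis, and the toy follow-up data are assumptions for illustration; the estimator is a plain Kaplan-Meier product-limit calculation without confidence intervals.

```python
from typing import List, Optional, Sequence, Set, Tuple

BAD_CLUSTERS = {29}               # poor-prognosis phenotype named in the text
GOOD_CLUSTERS = {9, 23, 34, 50}   # good-prognosis phenotypes named in the text


def prognosis_group(present: Set[int]) -> Optional[str]:
    """Assign a patient to a group only if exclusively bad or exclusively good clusters are present."""
    has_bad = bool(BAD_CLUSTERS & present)
    has_good = bool(GOOD_CLUSTERS & present)
    if has_bad and not has_good:
        return "bad"
    if has_good and not has_bad:
        return "good"
    return None  # mixed pattern or neither: not part of this comparison


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Product-limit survival estimate at each distinct event time (event: 1 = death, 0 = censored)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy follow-up data (years), invented for this example.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

A patient whose LN metastasis contains both cluster 29 and one of the good-prognosis clusters falls into neither group in this sketch, mirroring the restriction to exclusively good- or bad-prognosis patterns.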
Prognostic LN biomarkers
Since clinical practice relies on single-marker IHC, we sought to simplify our identified prognostic disseminated phenotypes to more applicable LN risk signatures based on only a few markers. Lasso-regularized Cox proportional hazards modeling identified high levels of p53 and high levels of GATA3 as having the strongest unfavorable and favorable prognostic effects, respectively (Figure 4D). Importantly, controlling for standard clinical classifiers revealed that LN metastasis expression of p53 and GATA3 provided more prognostic information for node-positive patients than primary tumor histopathological and molecular classification as well as nodal status (Figure 4E).
To demonstrate that p53 and GATA3 have the potential to be used as diagnostic markers with clinical methods, we used standard IHC to stain these markers in LN metastasis sections adjacent to those previously analyzed with IMC (Figures 5A and 5B). Average tumor cell expression levels of p53, GATA3, and clinical markers measured with IMC and IHC were strongly correlated (Figure 5C). Both the average p53 and GATA3 expression levels and the fractions of highly expressing cells, as measured with IHC, were significantly associated with patient survival in a Cox proportional hazards model (Figures 5D and 5E), and adding these markers as covariates to a model that already contained LN metastasis expression of ER, PR, and Ki-67 significantly improved performance (Figures 5F and 5G). Thus, expression levels of p53 and GATA3 in LN metastases measured using standard clinical methods yield prognostic information not captured by primary tumor classification, by nodal status, or by measuring ER, PR, or Ki-67 in the LNs.
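The Figure 5 legend describes likelihood ratio tests comparing a smaller model (ER, PR, Ki-67) with a bigger model that additionally includes p53 and GATA3. A minimal Python sketch of such a test is given below, assuming each added marker enters as a single covariate so that the models differ by two parameters. The function name and the log-likelihood values are invented for illustration and are not results from the study.

```python
import math
from typing import Tuple


def lrt_two_extra_params(loglik_small: float, loglik_big: float) -> Tuple[float, float]:
    """Likelihood ratio test for nested models that differ by exactly 2 parameters.

    With 2 degrees of freedom, the chi-square survival function has the
    closed form exp(-x / 2), so no statistics library is needed.
    """
    stat = 2.0 * (loglik_big - loglik_small)
    if stat < 0:
        raise ValueError("the bigger model should not fit worse than the nested model")
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical (partial) log-likelihoods of the two fitted Cox models.
stat, p_value = lrt_two_extra_params(loglik_small=-410.0, loglik_big=-404.0)
```

The closed form holds only for two degrees of freedom. If a marker were encoded with more than one coefficient, a general chi-square survival function would be needed instead.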
Figure 5.
Prediction based on IHC stains of key markers
(A and B) Representative images of sequential sections of LN metastasis cores stained with standard IHC (top) and IMC (bottom) compared for (A) p53 and (B) GATA3. Scale bars are indicated in micrometers.
(C) Scatterplots showing correlation of the average tumor cell expression levels measured with IHC and IMC in sequential sections for the indicated markers. Pearson’s correlation coefficient is indicated.
(D and E) Hazard ratios and 95% CIs returned by a Cox proportional hazards model provided with (D) the average expression levels of p53 and GATA3 and (E) the fraction of highly expressing tumor cells for p53 and GATA3, both based on IHC measurements.
(F and G) Hazard ratios and 95% CIs as in (D) and (E), but where the model was additionally provided with (F) average expression levels and (G) the fraction of highly expressing tumor cells for the indicated clinical markers as well as for p53 and GATA3. Vertical brackets and significance stars indicate likelihood ratio tests for the comparison of nested models. The smaller model was based only on the clinical markers (ER, PR, Ki-67), and the bigger model additionally included p53 and GATA3.
Intra-LN metastasis heterogeneity is associated with patient survival
We assessed whether cellular heterogeneity of the metastatic lesion is informative about patient prognosis. We quantified tumor cell heterogeneity using Shannon entropy43 and assessed its association with overall patient survival. We found that increased intra-lesion heterogeneity in the LN metastases was strongly associated with better patient outcome even when we controlled for clinical features such as tumor grade, molecular subtype, and nodal status (Figures 6A and 6B). When the same analysis was performed on the primary tumors we observed no such association, and there was also no association between heterogeneity of matched primary tumors and LN metastases (Figure 6A). We observed these patterns of association with heterogeneity when modeling primary and disseminated sites independently (594 primary tumors and 213 LN metastases with survival information) and simultaneously (166 matched samples with survival information). The favorable prognostic effect of LN metastasis heterogeneity remained even when controlling for all clinical classifiers and the identified LN metastasis risk markers p53 and GATA3 (Figure 6C).
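For orientation, Shannon entropy of the phenotype composition of one lesion can be computed as in the short Python sketch below. The input format (one cluster label per tumor cell), the function name, and the use of the natural logarithm are assumptions, since the text does not specify these details.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def shannon_entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (natural log) of the phenotype frequencies in one lesion."""
    if len(labels) == 0:
        raise ValueError("no cells provided")
    counts = Counter(labels)
    n = len(labels)
    # sum of p * log(1 / p) over observed phenotypes; equals 0.0 for a single phenotype
    return sum((c / n) * math.log(n / c) for c in counts.values())


# Toy lesions: one dominated by a single phenotype, one with four equally abundant phenotypes.
uniform_lesion = [29] * 100
diverse_lesion = [9] * 25 + [23] * 25 + [34] * 25 + [50] * 25
h_uniform = shannon_entropy(uniform_lesion)
h_diverse = shannon_entropy(diverse_lesion)
```

A lesion consisting of a single phenotype scores 0, and the score grows both with the number of phenotypes and with how evenly they are represented; four equally abundant phenotypes give ln 4, about 1.39.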
Figure 6.
LN metastasis heterogeneity is associated with patient survival
(A) Hazard ratios of predictors selected by a Cox proportional hazards model for overall survival of the 166 patients with matched primary tumor and LN metastasis samples and survival information available. Inputs were primary tumor and intra-lesion LN metastasis heterogeneity and clinical patient categories (primary tumor grade, molecular subtype of primary tumor, and nodal status). 95% CIs are indicated. The reference groups for the categorical predictors are grade 3, nodal status pN1, and molecular subtype luminal B (HER2−). Patients without information available for one of the clinical classifications were treated as a separate category (NA).
(B) Example images of LN metastases (upper row) and Kaplan-Meier survival curves (lower row) of the patients with LN metastases in different heterogeneity quartiles. The black curve represents survival across all patients with LN metastasis. Dashed lines indicate 95% CIs. Scale bars are indicated in micrometers.
(C) Hazard ratios of predictors selected by a Cox proportional hazards model for overall survival of the 213 patients with LN metastasis samples and survival information. The model accounts for the clinical patient categories, the identified LN metastasis risk biomarkers p53 and GATA3, and LN metastasis heterogeneity. The x axis is on a log scale, and 95% CIs are indicated.
(D) Correlation between intra-lesion heterogeneity (Shannon entropy calculated using all tumor cells) and tissue mixing of different phenotypes (average intra-community Shannon entropy) in the LN samples from 271 patients.
The schematics show increasing local heterogeneity. See also Figure S6.
To determine whether heterogeneity in LN metastases was correlated with spatially separated phenotypes or intermixed cells, we assessed the levels of local heterogeneity within communities of neighboring tumor cells.26 We calculated a mixing score for each tissue that reflects the average intra-community (i.e., local) heterogeneity. In this scoring system, low values occur when a single tumor cell phenotype dominates the tissue or when there is no physical contact between different phenotypes, and high values occur when cells of many different phenotypes interact in every community. We observed a strong correlation between LN metastasis heterogeneity and mixing of cellular phenotypes (i.e., a high score) and also a significant association of mixing with improved survival (Figures 6D and S6A). These findings indicate that better prognosis is associated with physically intermixed cell phenotypes instead of spatially separate but diverse subpopulations. Poor-prognosis LN metastases with low heterogeneity and low cellular mixing, produced by a single dominating tumor phenotype, were enriched for phenotypes considered more aggressive in primary tumors (triple-negative and HER2+) (Figure S6B). Correspondingly, community types composed of triple-negative or HER2+ cells, including the high-risk epithelial cluster 29, showed low average intra-community heterogeneity (Figure S6C). Furthermore, community types with high average heterogeneity were preferentially composed of hormone-receptor-positive cells, generally considered less aggressive and more responsive to treatment (Figure S6C). In summary, heterogeneity and spatial mixing of cell phenotypes in LN metastases, but not in primary tumors, were associated with improved patient prognosis.
Features of the primary tumor are associated with prognostic disseminating cells
Phenotyping of disseminated cells in LN metastases may improve breast cancer diagnostics for node-positive patients but would not be useful to guide clinical decisions before LN metastases are detectable. Ideally, features of the primary tumor that are indicative of risk-associated disseminating cells should be identified. Based on binary presence/absence measures for all identified phenotypes, we observed a general pattern of correlation between the good-prognosis, hormone-receptor-positive disseminated cell phenotypes and the presence of luminal and hormone-receptor-positive phenotypes in the primary tumor and a correlation between the high-risk p53-positive triple-negative disseminated cell cluster and non-luminal cell clusters in the primary tumor tissue (Figure S7). Primary tumor phenotypes that positively predicted the presence of risk-associated disseminated cell phenotypes (clusters 9, 34, 23, 50, and 29) always included the same phenotype as the disseminated cells in question (Figure S7), suggesting that the presence of these phenotypes in the primary tumor can be used as an indicator for risk.
Despite the frequent presence of the triple-negative p53-positive high-risk phenotype (cluster 29) in primary tumors of patients who also had these cells present in the LNs, fewer than 25% of these patients were clinically categorized as having triple-negative breast cancer (TNBC), and the remaining tumors were from all molecular subtypes (Figure 7A). This is in agreement with our observation that there are triple-negative cell subpopulations in luminal primary tumors (Figure 3).
Figure 7.
Primary tumor biomarkers of systemic disease
(A) Flow diagram of primary tumors and corresponding LN metastases in which disseminated cells of epithelial clusters 9 and 29 were identified (threshold for presence of 5 cells). The blocks represent the number of primary tumors (left) and LN metastases (right) with the indicated phenotypes. Outermost vertical bars indicate the clinical classification of patients (order not meaningful).
(B) Kaplan-Meier survival curves and 95% CIs (gray) of patients with primary tumors or LNs that contained cells of cluster 29 but not of cluster 9 (cyan), that contained cells of cluster 29 (blue), and that did not contain cluster 29 cells (black). Analyzed were LN metastases, node-positive primary tumors including or excluding clinically diagnosed TNBCs, and node-negative primary tumors, as indicated.
See also Figure S7.
In the 395 samples of node-positive patients, we found that the two disseminated phenotypes with the greatest opposing impacts, epithelial clusters 9 (good prognosis) and 29 (poor prognosis), were rarely both present in the same tissue sample, whether primary tumor or LN metastasis (Figure 7A). For cluster 9, the majority of positive LNs originated from primary tumors in which this phenotype was also present (Figure 7A), although cluster 9 cells were also found in primary tumors with no matched disseminated phenotypes (Figures 2D and 7A). In node-positive patients with primary tumors containing cluster 29 cells, the same phenotype was identified in most LN metastases (Figure 7A). Without considering information from a matched LN sample, node-positive patients with cells of cluster 29 in the primary tumor, and without cells of cluster 9, had poorer overall survival than those with primary tumors lacking cells belonging to this cluster (Figure 7B). This was the case even when we excluded high-risk, clinically diagnosed triple-negative samples (Figure 7B), thereby demonstrating the independent prognostic power of this dissemination-prone phenotype. Interestingly, the small fraction of node-negative patients who had cells of cluster 29 in the primary tumor did not have poor prognosis (44 of 256 node-negative patients; Figure 7B), possibly because many factors affect the capacity of tumor cells to disseminate.
In summary, the dissemination-prone tumor cell phenotypes we identified have strong prognostic value for breast cancer patients. Our findings emphasize the importance of identifying subpopulations of cell phenotypes within the primary tumor and LN metastasis for improved diagnosis and treatment of disseminated cancers and argue for the use of biomarkers relevant to systemic as well as localized disease.
Discussion
Our analysis of complex single-cell phenotypes in primary breast tumors and LN metastatic tissues revealed substantial phenotypic variability between primary tumors and LN metastases from individual patients. The predominant single-cell phenotype in the LN was not usually the predominant phenotype in the primary tumor but was identified as a subpopulation in most cases. Our data suggest that subpopulations of dissemination-prone phenotypes, including triple-negative phenotypes, can be present in primary tumors of all molecular subtypes of breast cancer. Currently, clinical decisions are based on histologic stratification of primary tumors, but complex subpopulations, especially of triple-negative cells defined by the absence of multiple markers, are not detectable using single-marker IHC. Multiplex single-cell analysis has the power to identify subpopulations of dissemination-prone cell phenotypes in the primary tumors that may not be detected using standard assays.
Survival analysis identified disseminating tumor cell phenotypes associated with distinct prognoses, indicative of different metastatic potentials or, alternatively, different therapeutic responses. For instance, we identified hormone-receptor-positive disseminated cells associated with good prognosis; these are potentially more sensitive to standard endocrine therapy than other phenotypes. Patients with LN metastases of these lower-risk phenotypes, who typically receive chemotherapy in addition to endocrine therapies, could benefit from studies assessing risks of overtreatment.44,45 We also identified a triple-negative disseminated phenotype with high p53 expression that was associated with poor prognosis and that was informative even when taking nodal status into account. Notably, we observed this phenotype in patients with primary tumors spanning all clinically assigned molecular subtypes, showing that this high-risk phenotype is not captured by conventional subtyping of the primary tumor. Thus, the phenotypes of metastatic cells, frequently distinct from those of the primary tumor, might have a strong influence on disease outcome. Our data strongly indicate that the phenotypes of disseminated cells should inform clinical decisions. To enable this, we translated our findings to standard clinical methods. We used standard IHC measurements of p53 and GATA3 in LN metastases to show that these markers serve as a simple signature for patient risk and provide prognostic information beyond current clinical criteria.
HER2 was the clinical marker with expression most consistently correlated in the primary tumor and disseminated cells, but we did observe cases with mismatched HER2 expression. In some cases with HER2-positive disseminated cells but predominantly HER2-negative cells in the primary tumor, we found that the patient had been clinically classified as HER2+, suggesting that our sampling of a single TMA core may not have captured the region of HER2 expression. Nevertheless, our data were sufficient to identify the most abundant disseminated phenotype as a primary tumor subpopulation in 78% of the matched cases. Also, our key observation that poor-prognosis triple-negative cells can be detected in primary tumors of all molecular subtypes would not be affected by regional variability of markers or undersampling. Rather, these observations suggest that even with limited sampling we detect poor-prognosis phenotypes that go unmeasured during clinical classification of patients.
Finally, we showed that increased heterogeneity of the cellular phenotypes of LN lesions was strongly associated with better patient prognosis. This was surprising, as cellular heterogeneity has been shown in some cases to increase the risk that a treatment-resistant phenotype with metastatic capacity is present.30,46,47 Since we observed no association between heterogeneity and patient survival in the primary tumors, it is unlikely that the association between LN metastasis heterogeneity and survival was a function of the antibody panel. Conceivably, less advanced and more therapy-sensitive tumor subtypes may tend to differentiate48,49 instead of spreading clonally after seeding of LN metastases, thereby creating the intermixed heterogeneous tumor tissues that we observed to be associated with a better prognosis. The association between low levels of heterogeneity in LNs and poor prognosis may be reflective of either monoclonal metastases seeded by an aggressive non-differentiating subtype or, in the possible case of parallel seeding, a highly invasive phenotype that spreads quickly and outcompetes less aggressive subtypes.
Although treatment based on molecular stratification of primary tumors has significantly improved breast cancer patient prognosis, metastases remain the primary cause of death from this disease. The current clinical standard does not take phenotypes of disseminated cells into account. Here, we have identified risk-associated disseminated phenotypes with prognostic value beyond current clinical classifications and have shown that these phenotypes were frequently present in the primary tumor, although typically as rare subpopulations. The high-risk disseminated phenotypes we identified are frequently missed by current clinical criteria for primary tumor classification. These phenotypes could serve as biomarkers for high-risk disseminated cells and likelihood of metastatic disease progression. Disseminated cell phenotyping could improve prognostic and therapeutic accuracy in node-positive patients.
Limitations of the study
Retrospective studies like this one must work with the information available for the cohort. Specific follow-up treatment information was not available for our cohort, but it is very likely that patients received standard therapy at the time, which was endocrine therapy for luminal primary tumors and chemotherapy in cases of high primary tumor burden, nodal involvement, or metastatic disease. The cohort was assembled prior to the establishment of anti-HER2 therapy. Furthermore, TMAs only sample a limited amount of tumor. Hence, some of the phenotypic variation we observed between matched tissue pairs may be attributed to limited sampling of either the primary tumor or the LN lesion. Increased sampling may yield more precise estimates of the prevalence of different cellular subpopulations. We note, however, that the TMAs in our study were of representative tumor cores and taken by trained pathologists, as is standard practice in retrospective cohort studies. Finally, in the immunofluorescence analysis of whole primary tumor sections, we identified tumor cells based on panCK positivity. This approach does not capture tumor cells that may have undergone epithelial-mesenchymal transition and thus do not express panCK. This does not affect our conclusion that there are triple-negative (ER-, PR-, and HER2-negative) tumor cells present in the analyzed tumors, but we may have underestimated the actual number.
STAR★Methods
Key resources table
REAGENT or RESOURCE | SOURCE | IDENTIFIER |
---|---|---|
Antibodies | ||
D1H2 | Cell Signaling | Cat# 4499; RRID:AB_10544537 |
C36B11 | Cell Signaling | Cat# 9733; RRID:AB_2616029 |
EP1601Y | Abcam | Cat# ab214586; RRID:AB_869890 |
10/Fibronectin | BD Biosciences | Cat# 610078; RRID:AB_397486 |
TAL 1B5 | Abcam | Cat# ab20181; RRID:AB_445401 |
C51 | Cell Signaling | Cat# 4546; RRID:AB_2134843 |
HI98 | Biolegend | Cat# 301902; RRID:AB_314194 |
KP1 | Thermo Fisher | Cat# 14-0688-82; RRID:AB_11151139 |
polyclonal_PA5-16722 | Thermo Fisher | Cat# PA5-16722; RRID:AB_10980222 |
1A4 | Abcam | Cat# ab7817; RRID:AB_262054 |
D21H3 | Cell Signaling | Cat# 5741; RRID:AB_10695459 |
9E10 | Biolegend | Cat# 626802; RRID:AB_2148451 |
3B5 | BD Biosciences | Cat# 554299; RRID:AB_395352 |
polyclonal_A0452 | Dako | Cat# A0452; RRID:AB_2335677 |
HTA28 | Biolegend | Cat# 641002; RRID:AB_1227659 |
EP1347Y | Abcam | Cat# ab216655; RRID:AB_2864379 |
100 | Biolegend | Cat# 658702; RRID:AB_2562959 |
EP1 | Epitomics | Cat# IR084 |
EPR4097 | Abcam | Cat# ab108398; RRID:AB_10863604 |
SP1 | Abcam | Cat# ab187260; RRID:AB_2927684 |
EP2 | Epitomics | Cat# AC-0028EU |
SP2 | Abcam | Cat# ab239793; RRID:AB_2927687 |
7F5 | Cell Signaling | Cat# 2527; RRID:AB_10695803 |
polyclonal_CD44 | R&D Systems | Cat# AF3660; RRID:AB_10971655 |
D6F11 | Cell Signaling | Cat# 5153; RRID:AB_10691711 |
UCHL1 | Biolegend | Cat# 304202; RRID:AB_314418 |
HI100 | Biolegend | Cat# 304120; RRID:AB_493763 |
2B11 | Thermo Fisher | Cat# 14-9457-82; RRID:AB_11063696 |
L50-823 | BD Biosciences | Cat# 558686; RRID:AB_2108590 |
L26 | Thermo Fisher | Cat# 14-0202-82; RRID:AB_10734340 |
C8/144B | Thermo Fisher | Cat# 14-0085-82; RRID:AB_11150240 |
polyclonal_CA9_AF2188 | R&D Systems | Cat# AF2188; RRID:AB_416562 |
36/E-Cadherin | BD Biosciences | Cat# 610182; RRID:AB_397581 |
B56 | BD Biosciences | Cat# 556003; RRID:AB_396287 |
D38B1 | Cell Signaling | Cat# 4267; RRID:AB_2246311 |
D57.2.2E | Cell Signaling | Cat# 4858; RRID:AB_916156 |
Polyclonal_CD4_AF-379 | R&D Systems | Cat# AF-379; RRID:AB_354469 |
poly vwf | Millipore | Cat# AB7356; RRID:AB_92216 |
EPR3094 | Abcam | Cat# ab207090; RRID:AB_2889382 |
49F9 | Cell Signaling | Cat# 2976; RRID:AB_490932 |
RCK105 | Abcam | Cat# ab9021; RRID:AB_306947 |
AE3 | Millipore | Cat# MAB1611; RRID:AB_2134409 |
AE1 | Millipore | Cat# MAB1612; RRID:AB_2132794 |
C92-605 | BD Biosciences | Cat# 559565; RRID:AB_397274 |
F21-852 | BD Biosciences | Cat# 552596; RRID:AB_394437 |
Biological samples | ||
ZTMA 21 (TMA containing mostly primary tumor cores) | University Hospital Zurich | Ethics: BASEC-PB-2019-1111; Table S1 |
ZTMA 25 (TMA containing mostly lymph node metastasis cores) | University Hospital Zurich | Ethics: BASEC-PB-2019-1111; Table S1 |
ZTMA 26 (TMA containing mostly primary tumor cores) | University Hospital Zurich | Ethics: BASEC-PB-2019-111; Table S1 |
Deposited data | ||
IMC raw data of ZTMA 21 and 25 | Zenodo | Zenodo: https://doi.org/10.5281/zenodo.7494413 |
IMC raw data of ZTMA 26, all TMAs’ extracted single-cell data and associated metadata (incl. patient data), IF images and extracted single-cell data of whole primary breast cancer sections | Zenodo | Zenodo: https://doi.org/10.5281/zenodo.7494509 |
IMC image stacks and single-cell masks in tiff format, Scans and extracted single-cell data of IHC stains on ZTMA 25 | Zenodo | Zenodo: https://doi.org/10.5281/zenodo.7494727 |
Software and algorithms | ||
Processing and analysis code for this study | GitHub and Zenodo | Github: https://github.com/BodenmillerGroup/BC_LN_metastses; Zenodo: https://doi.org/10.5281/zenodo.7650864 |
IMC preprocessing pipeline | GitHub | Github: https://github.com/BodenmillerGroup/ImcSegmentationPipeline |
Resource availability
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Bernd Bodenmiller (bernd.bodenmiller@uzh.ch).
Materials availability
This study did not generate new unique reagents.
Experimental model and subject details
Human subjects
The patient samples and clinical information associated with the breast cancer cohort described in this study were obtained from University Hospital Zurich after approval by the Institutional Review Board (IRB) and state ethics office (Table S1). Pathologists designed and constructed three tissue microarrays (TMAs) (ZTMA21, ZTMA25 and ZTMA26) of core biopsies from different tissue types of breast cancer patients with a focus on primary tumors and LN metastases. The patients in this cohort were diagnosed between 1995 and 2010 and clinically categorized according to the guidelines at the time. The samples were taken at diagnosis and before treatment. The TMAs contain a single 0.6-mm diameter core per patient and available tissue type. Images were obtained of 771 primary tumors, 271 LN metastases, 49 tumor recurrences, 41 distant metastases, and 37 healthy breast tissue samples from 890 patients, resulting in 1245 images. For 263 patients more than one tissue type was available and for 212 patients both a primary tumor and an LN metastasis core were available. For analysis, a minimum threshold of 100 tumor cells was applied per image, in order to exclude cores that may have missed the tumor. As a result, the number of patients used for paired primary tumor and LN metastasis analysis was reduced to 205. This project was approved by the local Commission of Ethics (BASEC-PB-2019-111).
Method details
Antibody panel
The antibody panel was designed to focus on breast cancer-specific epitopes but also to distinguish epithelial, mesenchymal, endothelial, and the main immune cell types (Figures 1 and S1). More detailed information can be found in Table S2.
Preparation and staining
Using the antibody panel described in Table S2 and Figure S1, slides were stained as previously described.50 Briefly, slides were incubated at 60°C for 1 h and then deparaffinized in xylene for 1 h before being rehydrated in a graded alcohol series (ethanol:deionized water, 100:0, 90:10, 80:20, 70:30, 50:50, 0:100; 5 min each). Antigen retrieval was conducted with Tris-EDTA (pH 9) buffer at 95°C in a NxGen decloaking chamber (Biocare Medical) for 20 min. Following cooling, slides were blocked with 3% BSA and 5% goat serum in 150 mM NaCl, 50 mM Tris-HCl, pH 7.6 (TBS) for 1 h. Samples were first stained and incubated with untagged rabbit anti-ERα antibodies and all metal-tagged rabbit clones overnight at 4°C. Samples were then washed in TBS before incubation with metal-tagged anti-rabbit secondary antibody (30 min at room temperature) to increase ERα signal, and then an additional incubation with anti-ERα antibodies (10 min at room temperature) to block excess secondary. After washing with TBS again, samples were incubated with all remaining (rabbit) metal-tagged antibodies overnight at 4°C (Table S2). Following incubation, slides were washed with TBS. Finally, samples were incubated with 0.5 μM Cell-ID Intercalator-Ir (Fluidigm, 201192B) for detection of DNA. After 5 min, slides were rinsed with TBS and then briefly in water and air dried.
Imaging mass cytometry
Multiplexed images of the TMA cores were acquired using a Hyperion Imaging System (Fluidigm). A square area around each core of a TMA was acquired at 400 Hz, and the raw data were preprocessed using commercial software (Fluidigm). The cores within a TMA were acquired in a randomized order, and the three TMAs were acquired in one continuous run in order of increasing ZTMA number. In cases of unexpectedly interrupted acquisitions, the remaining part of the core was acquired as a separate image, so that for a few patients the same core was split into two images. Signal spillover between channels was compensated on the single-cell level using the CATALYST51,52 R package (v.1.12.1).
Quantification and statistical analysis
Data processing
The commercial image format was converted to OME-TIFF files, based on which we segmented single cells using a combination of ilastik (v.1.3.3)53 and CellProfiler (v.3.1.8),54 following the workflow available at https://github.com/BodenmillerGroup/ImcSegmentationPipeline.55 High-dimensional mean marker expressions and spatial single-cell features were extracted using CellProfiler.55 Cells on the edges of the cores were marked as special cases for spatial analyses, and a circular area of each core was recorded for downstream density calculations. Even with high-quality segmentation, single cells extracted from images of tissue sections represent tissue slices with potentially overlapping cell fragments. Therefore, the extracted single-cell marker expression can include some signal from neighboring cells, especially in densely packed tissues. Cellular neighborhoods were extracted by expanding the circumference of each cell by 4 pixels (4 μm) and recording the IDs of the overlapping neighbor cells using CellProfiler.55
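The neighborhood definition described above (expand each cell by 4 pixels and record which other cells the expanded outline overlaps) can be illustrated with a minimal sketch. This is not the CellProfiler implementation used in the study. It assumes a segmentation mask stored as a nested list of integer cell IDs with 0 as background, and it assumes a Euclidean disk as the expansion shape. The function name `neighbors_by_expansion` is hypothetical.

```python
def neighbors_by_expansion(mask, radius=4):
    """Return {cell_id: set of neighbor cell_ids} for a labeled mask.

    Two cells are neighbors if any pixel of one lies within `radius`
    pixels (Euclidean distance, an assumption) of a pixel of the other,
    which is equivalent to the expanded cell overlapping the other cell.
    """
    rows, cols = len(mask), len(mask[0])
    # Pixel offsets covered by a disk of the given radius.
    offsets = [
        (dr, dc)
        for dr in range(-radius, radius + 1)
        for dc in range(-radius, radius + 1)
        if dr * dr + dc * dc <= radius * radius
    ]
    neighbors = {}
    for r in range(rows):
        for c in range(cols):
            cell = mask[r][c]
            if cell == 0:  # background pixel
                continue
            found = neighbors.setdefault(cell, set())
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    other = mask[rr][cc]
                    if other != 0 and other != cell:
                        found.add(other)
    return neighbors


# Toy one-row mask: cells 1 and 2 are 4 pixels apart, cell 3 is farther away.
toy_mask = [[1, 1, 0, 0, 0, 2, 0, 0, 0, 0, 0, 3]]
toy_neighbors = neighbors_by_expansion(toy_mask, radius=4)
```

In the toy mask, cells 1 and 2 end up as mutual neighbors, while cell 3 has none. At the 1 μm pixel size stated in the text, a radius of 4 pixels corresponds to 4 μm.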
Data transformation and normalization
The pixel values in the area corresponding to each cell were averaged into mean single-cell marker expressions. The single-cell expression data were normalized between 0 and 1 for each marker using the 99th percentile normalization to account for outliers.
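As an illustration of the normalization step, the following minimal sketch scales one marker by its 99th percentile. The text does not spell out the exact formula, so two choices here are assumptions: values above the 99th percentile are clipped to 1, and the percentile is computed by linear interpolation. The function names are hypothetical, and the study itself performed this step in R.

```python
def percentile(values, q):
    """Linear-interpolation percentile of a list, with q in [0, 1]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def normalize_marker(values, q=0.99):
    """Scale one marker to [0, 1] by its q-th percentile; clip outliers to 1."""
    cap = percentile(values, q)
    if cap <= 0:  # marker without signal: avoid division by zero
        return [0.0 for _ in values]
    return [min(max(v, 0.0) / cap, 1.0) for v in values]


# Toy marker with 101 cells and mean intensities 0, 1, ..., 100.
toy_expression = [float(i) for i in range(101)]
toy_normalized = normalize_marker(toy_expression)
```

Dividing by the 99th percentile rather than the maximum keeps a handful of extremely bright cells from compressing the dynamic range of the remaining cells.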
Analysis workflow
All downstream analyses were conducted in R (v.3.6.3). A size threshold was applied to mark extremely small or extremely large cells as potential mis-segmentations and remove them from analyses. Only cores containing a minimum of 100 tumor cells were considered for analysis, in order to exclude potentially misleading cores, which may have missed the tumor bulk. Whenever binary presence or absence of a phenotype in a tissue is reported, the threshold for presence was 5 cells rather than 1, in order to increase robustness of the analysis. The “predominant” phenotype of a tissue was defined as the most abundant phenotype identified in the core.
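The per-core rules stated above (at least 100 tumor cells per core, at least 5 cells for a phenotype to count as present, and the most abundant phenotype as the "predominant" one) can be sketched as follows. This is an illustrative Python sketch, not the authors' R code. The function and constant names are hypothetical, and the input is assumed to be a list of phenotype cluster labels, one per tumor cell in a core.

```python
from collections import Counter

MIN_TUMOR_CELLS = 100    # cores with fewer tumor cells are excluded
PRESENCE_THRESHOLD = 5   # minimum cell count for a phenotype to be "present"


def summarize_core(tumor_cell_phenotypes):
    """Return present phenotypes and the predominant phenotype of one core.

    Returns None when the core holds fewer than MIN_TUMOR_CELLS tumor cells.
    """
    counts = Counter(tumor_cell_phenotypes)
    if sum(counts.values()) < MIN_TUMOR_CELLS:
        return None
    present = {p for p, n in counts.items() if n >= PRESENCE_THRESHOLD}
    # Ties are resolved by first occurrence here; the text does not
    # specify a tie-breaking rule, so this is an assumption.
    predominant = counts.most_common(1)[0][0]
    return {"present": present, "predominant": predominant}


# Toy core: 80 cells of cluster 9, 16 of cluster 29, 4 of cluster 50.
toy_core = [9] * 80 + [29] * 16 + [50] * 4
toy_summary = summarize_core(toy_core)
too_small = summarize_core([9] * 99)
```

In the toy core, cluster 50 falls below the presence threshold and is treated as absent, even though four of its cells were detected.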
Clustering
To identify single-cell phenotypes, we first separated the epithelial and non-epithelial cells in order to subsequently use unsupervised clustering on each subset separately, with a selected set of markers. We used the R package mclust (v.5.4.6)56 to apply a Gaussian mixture model to the mean single-cell expressions of panCK, in order to separate panCK-positive from panCK-negative cells. To assess the expression of the remaining markers, we used an unsupervised clustering step on all markers with a nearest neighbor parameter of 50 using the RPhenoGraph34 implementation from cytofkit (v.1.10.0).57 This granular clustering step split the single cells into 130 clusters. We then assigned each cluster to the epithelial or non-epithelial group based on the majority vote (more than 50%) of panCK-positive or -negative single cells. Inspection of the cluster marker means identified one “non-epithelial” cluster with positive expression of tumor markers HER2 and c-MYC that clustered with other tumor clusters in a hierarchical clustering. This cluster, which contained a small number of cells, was then reassigned to the tumor group. The epithelial and non-epithelial cells were then separately pooled for subclustering with the respective markers of interest. Subclustering was performed using the RPhenoGraph implementation of the cytofkit R package (v.1.10.0)57 with a nearest neighbor parameter of 100 for epithelial and 50 for non-epithelial cells. The higher nearest neighbor parameter for epithelial cell clustering was chosen to balance the high number of epithelial markers and large inter-patient variability among tumor cells; a lower nearest neighbor parameter would have resulted in exploding numbers of clusters. Epithelial cell clustering resulted in 59 clusters and non-epithelial cell clustering in 31 clusters. The Z-scored mean marker expressions of the clusters are displayed in Figures 1A and 1C. The heatmaps were constructed using the R package ComplexHeatmap (v.2.2.0).58
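The majority-vote step, in which each granular cluster is assigned to the epithelial or non-epithelial group according to the panCK status of its cells, can be sketched as below. The sketch assumes that the per-cell panCK calls from the Gaussian mixture model are already available as booleans. Sending exact 50% ties to the non-epithelial group is an assumption of this sketch, as the text only states "more than 50%". The function name is hypothetical.

```python
from collections import defaultdict


def assign_clusters_by_majority(cluster_ids, panck_positive):
    """Label each cluster by the majority panCK status of its cells.

    cluster_ids:    one cluster ID per cell
    panck_positive: one boolean per cell (True if panCK-positive)
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for cluster, is_positive in zip(cluster_ids, panck_positive):
        totals[cluster] += 1
        if is_positive:
            positives[cluster] += 1
    return {
        cluster: "epithelial"
        if positives[cluster] > 0.5 * totals[cluster]
        else "non-epithelial"
        for cluster in totals
    }


# Toy data: cluster "a" is 3/4 panCK-positive, cluster "b" is 1/4.
toy_clusters = ["a", "a", "a", "a", "b", "b", "b", "b"]
toy_panck = [True, True, True, False, True, False, False, False]
toy_groups = assign_clusters_by_majority(toy_clusters, toy_panck)
```

A vote of this kind is only a first pass. As described above, one cluster labeled non-epithelial by the vote was reassigned to the tumor group after inspection of its marker means.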
Differential abundance of phenotypes between matched primary tumors and LN metastases
Differential abundance testing was conducted using R functions of the edgeR package (v.3.28.0),59 inspired by the workflow described and implemented in the R package diffcyt (v.1.6.0).60 We used negative binomial generalized linear models to model overdispersed count data. We first estimated the negative binomial dispersion for each cluster without estimating a trend and then tested for differences in the abundance of clusters between the matched conditions by fitting a negative binomial generalized log-linear model for each cluster and then applying likelihood ratio tests to the coefficients. The differential abundance analysis in Figure 1B is based on a non-paired design but for the analyses in Figure 2E the model was provided with the patient IDs, thereby accounting for the paired design of the primary tumors and LN metastases. Normalization factors, calculated via the trimmed mean of M-values method, were included to adjust for composition effects. This analysis was conducted separately on epithelial and non-epithelial cells. A significance threshold of p < 0.01 was used to identify the differentially abundant phenotypes.
Spatial communities
Spatial tumor cell communities representing highly interconnected groups of neighboring tumor cells were identified using the Louvain algorithm61 on the topological neighborhood graph of tumor cells (non-tumor cells were excluded from graph). The Louvain R implementation from the igraph package (v.1.2.5) was used with default parameters. Tumor cell communities were clustered into community types based on their single-cell phenotype compositions (minimum-maximum normalized absolute numbers of cells of each phenotype cluster) as described previously26 using the RPhenoGraph implementation from the cytofkit R package with 30 nearest neighbors (v.1.10.0).57
Heterogeneity and tissue mixing quantification
Tumor cell heterogeneity within each core was quantified using Shannon entropy43 on the epithelial phenotype counts by means of the R package entropy (v.1.2.1).62 In addition to the tumor cell heterogeneity quantified across the whole tissue core, we calculated intra-community heterogeneities based on the epithelial phenotype counts within each spatially defined tumor cell community. All intra-community heterogeneities of a tissue core were then averaged into a tissue mixing (or local heterogeneity) score that quantifies the level of physical mixing among different tumor phenotypes. Tissue mixing scores are lowest when an entire tissue is dominated by a single tumor phenotype or when multiple phenotypes are completely spatially separated into distinct communities. High mixing scores occur in tissues with many different tumor phenotypes in every single community.
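To make the difference between whole-core heterogeneity and the tissue mixing score concrete, the following minimal Python sketch computes both from phenotype labels. It uses the plug-in Shannon entropy H = -Σ p_i log p_i with natural logarithms. The choice of logarithm base is an assumption of this sketch, and the function names are hypothetical. The study itself used the R package entropy.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Plug-in Shannon entropy (natural log) of a list of phenotype labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    # p * log(1/p) summed over phenotypes equals -sum(p * log(p)).
    return sum((n / total) * math.log(total / n) for n in counts.values())


def mixing_score(communities):
    """Average intra-community entropy across all communities of one core."""
    if not communities:
        raise ValueError("at least one community is required")
    return sum(shannon_entropy(c) for c in communities) / len(communities)


# Two toy cores with identical overall composition (10 A cells, 10 B cells).
separated = [["A"] * 10, ["B"] * 10]     # phenotypes in distinct communities
intermixed = [["A", "B"] * 5] * 2        # phenotypes mixed in every community

overall_separated = shannon_entropy([c for com in separated for c in com])
overall_intermixed = shannon_entropy([c for com in intermixed for c in com])
mixing_separated = mixing_score(separated)
mixing_intermixed = mixing_score(intermixed)
```

Both toy cores have the same whole-core entropy (ln 2), but only the intermixed core has a non-zero mixing score. This is the distinction the score is designed to capture.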
Survival analysis
The R package survival (v.3.1–12)63 was used to calculate Kaplan-Meier survival curves and Cox proportional-hazards models. The number of samples used for survival analysis slightly deviated from the described total number of samples because survival information was not available for every patient in the cohort. All survival models were adjusted for the clinical classifications (grade, molecular subtype, and nodal status). In a few cases information about one of the clinical classifications was not available for a patient. In order to avoid excluding such patients from survival analysis, we simply considered “not available” (NA) as a separate category for a given classification (e.g., Grade 1, Grade 2, Grade 3, Grade NA). The hazard ratios of the NA categories are displayed in the figures as we always display all covariates that were included in a standard Cox proportional-hazards model, but these categories were never used as the reference level and thus do not affect the hazard ratios of the other categories. The reference levels for the categorical covariates are indicated in the figure legends. When the number of covariates for a survival model was too high compared to the number of samples (e.g., 59 epithelial phenotypes and 213 LN metastasis samples) lasso-regularized Cox proportional-hazards models were applied to avoid overfitting (cv.glmnet function of the R package glmnet (v.3.0–2)64,65 with the family “cox” option). To increase robustness, the cross validation was run 500 times, and the mean error curves were averaged before choosing the optimal lambda value. Hazard ratios of all active covariates after regularization at the minimum cross validation error are displayed in the figures. Confidence intervals of hazard ratios after lasso selection are not displayed, as p values and confidence intervals of individual coefficients are invalid after feature selection.
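The repeated cross-validation scheme described above (run the cross validation many times, average the error curves, then pick the lambda with the lowest mean error) can be sketched as follows. The sketch assumes that every run was evaluated on the same lambda grid and that the per-run error curves have already been computed, for example by repeated calls to cv.glmnet in R. The function name and the toy numbers are hypothetical.

```python
def select_lambda(lambdas, error_curves):
    """Average CV error curves over runs and return the best lambda.

    lambdas:      lambda grid shared by all runs
    error_curves: one list of CV errors per run, aligned with `lambdas`
    Returns (best_lambda, mean_curve).
    """
    if not error_curves:
        raise ValueError("at least one error curve is required")
    if any(len(curve) != len(lambdas) for curve in error_curves):
        raise ValueError("every error curve must match the lambda grid")
    n_runs = len(error_curves)
    mean_curve = [
        sum(curve[i] for curve in error_curves) / n_runs
        for i in range(len(lambdas))
    ]
    best_index = min(range(len(lambdas)), key=mean_curve.__getitem__)
    return lambdas[best_index], mean_curve


# Toy example with three runs instead of 500.
toy_lambdas = [0.1, 0.05, 0.01]
toy_curves = [
    [3.0, 1.0, 2.0],
    [3.0, 2.0, 1.5],
    [3.0, 1.5, 2.5],
]
best_lambda, mean_curve = select_lambda(toy_lambdas, toy_curves)
```

In the toy example, the second run on its own would favor lambda = 0.01, whereas the averaged curve selects 0.05. Averaging over many runs reduces this dependence on a single random fold assignment.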
Immunofluorescence quantification
For the identification of triple-negative subpopulations in 10 whole sections of luminal primary tumors, we used standard immunofluorescence imaging. We combined unconjugated primary rabbit ER, PR, and HER2 antibodies into the same secondary channel (Cy7) and additionally imaged panCK (Cy3) and DAPI in order to reveal triple-negative tumor cells (Cy7-negative and Cy3-positive). Fluorescence images were acquired using a ZEISS slide scanner and segmented into approximately 6 million single cells, which were then classified into epithelial and non-epithelial cells based on a random forest object classifier using QuPath (v.0.2.2).66 We applied a Gaussian mixture model to the log-transformed Cy7 single-cell expressions to distinguish cells positive for ER, PR, and HER2 from cells that did not express these markers (R package mclust v.5.4.6).56 We included a clinically triple-negative tumor as a true negative reference. Tumor cells with low expression of panCK were identified based on a Gaussian mixture model applied to the log-transformed Cy3 single-cell expressions (R package mclust v.5.4.6)56 and excluded from analysis to account for staining and tissue quality effects.
Immunohistochemical quantification
Immunohistochemical stains on sequential sections of the LN metastasis TMA (ZTMA25) were performed by the University Hospital Zurich for the established clinical markers ER, PR, and Ki-67 and additionally for p53 and GATA3. Using QuPath (v.0.2.2),66 the single cells in each core of the scanned TMA were segmented and classified into tumor and non-tumor cells with a random forest object classifier. Average tumor cell expression levels and the fraction of highly expressing tumor cells were extracted for each core and marker. Subsequently, associations between p53 and GATA3 expression levels in the LNs and patient survival were assessed while controlling for the expression levels of the clinically established markers (Figure 5).
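The two per-core summaries described above (mean tumor cell expression and fraction of highly expressing tumor cells) can be sketched in a few lines of Python. The record layout, the function name `summarise_cores`, and the threshold that defines "highly expressing" are all assumptions for illustration; the source does not state how the threshold was set.

```python
from collections import defaultdict
from statistics import fmean


def summarise_cores(cells, marker, high_threshold):
    """Per core: mean marker level and fraction of tumor cells above `high_threshold`.

    cells: iterable of dicts with keys "core", "is_tumor", and the marker name
    (a hypothetical layout, not the QuPath export format).
    """
    per_core = defaultdict(list)
    for cell in cells:
        if cell["is_tumor"]:  # non-tumor cells are ignored
            per_core[cell["core"]].append(cell[marker])
    return {
        core: {
            "mean": fmean(values),
            "fraction_high": sum(v > high_threshold for v in values) / len(values),
        }
        for core, values in per_core.items()
    }


# Toy single-cell measurements (hypothetical values, not study data).
cells = [
    {"core": "A1", "is_tumor": True, "p53": 0.2},
    {"core": "A1", "is_tumor": True, "p53": 1.8},
    {"core": "A1", "is_tumor": False, "p53": 2.5},
    {"core": "A1", "is_tumor": True, "p53": 1.0},
    {"core": "B2", "is_tumor": True, "p53": 0.1},
    {"core": "B2", "is_tumor": True, "p53": 0.3},
]
summary = summarise_cores(cells, marker="p53", high_threshold=0.9)
print(summary)
```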
Predicting disseminated phenotypes based on the primary tumor
To identify phenotypes in the primary tumor that are predictive of the presence of a disseminated phenotype of interest, we applied logistic regression models to predict the binary presence or absence of the phenotype of interest among the cells in the LN metastasis. Lasso regularization was used to select the stratifying covariates. We used the cv.glmnet function of the R package glmnet (v.3.0–2)64 with family “binomial” and 10-fold cross-validation. To increase robustness, the cross-validation was run 500 times, and the mean cross-validated error curves were averaged across runs before choosing the optimal lambda value. The phenotype predictors were based on binary presence or absence values. The active predictors after lasso regularization at the minimum cross-validation error are displayed in Figure S7.
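The binary predictor and response encoding that feeds such a model can be sketched as below. This covers only the data preparation, not the lasso fit itself. The input layout, the function name `presence_table`, and the rule that a single cell suffices for "presence" are assumptions of this sketch; the source does not specify a minimum cell count.

```python
def presence_table(cells, phenotypes, target):
    """Build binary predictors (primary tumor) and a binary response (LN metastasis).

    cells: iterable of (patient, site, phenotype) tuples, with site being
    "primary" or "ln" (a hypothetical layout).
    Returns (patients, X, y) for patients that have cells at both sites.
    """
    primary, ln = {}, {}
    for patient, site, phenotype in cells:
        store = primary if site == "primary" else ln
        store.setdefault(patient, set()).add(phenotype)
    patients = sorted(set(primary) & set(ln))
    # One row per patient, one 0/1 column per primary-tumor phenotype.
    X = [[int(p in primary[pt]) for p in phenotypes] for pt in patients]
    # Response: is the phenotype of interest present in the LN metastasis?
    y = [int(target in ln[pt]) for pt in patients]
    return patients, X, y


# Toy cell table (hypothetical phenotype labels, not study data).
cells = [
    ("P1", "primary", "ep_01"), ("P1", "primary", "ep_02"), ("P1", "ln", "ep_07"),
    ("P2", "primary", "ep_02"), ("P2", "ln", "ep_02"),
    ("P3", "primary", "ep_01"),  # no LN cells, so P3 is dropped
]
patients, X, y = presence_table(cells, phenotypes=["ep_01", "ep_02"], target="ep_07")
print(patients, X, y)
```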
Image visualization
Pixel or single-cell level example images shown in the figures were generated using the cytomapper R package.67
Acknowledgments
We thank patients who donated tumor samples, the B.B. laboratory for fruitful discussions, and S. Dettwiler, Tissue Biobank USZ, for technical support. B.B. was funded by an SNSF project grant, an NIH grant (UC4 DK108132), the CRUK IMAXT Grand Challenge, and the European Research Council (ERC) under the European Union's Horizon 2020 Program under the ERC grant agreement no. 866074 (“Precision Motifs”). H.W.J. was funded by the SystemsX Transitional Post-Doctoral Fellowship, the Canadian Institute of Health Research Post-Doctoral Fellowship, and the Cancer Research Society Scholarship for the Next Generation of Scientists.
Author contributions
J.R.F., H.W.J., and B.B. planned the project. J.R.F. conceived the data analysis approaches and performed data processing, image quantification, and data analysis. H.W.J. proposed the comparison of LN metastases to their primary tumors in this cohort and performed all experiments related to this analysis. J.R.F. and H.W.J. performed biological analysis and interpretation. Z.V., P.S., and H.M. provided patient samples and clinical information. J.R.F., H.W.J., N.d.S., and B.B. wrote and revised the manuscript with input from all authors. B.B. directed the project. All authors read and approved the final manuscript.
Declaration of interests
J.R.F. and B.B. have founded and are shareholders of Navignostics, a spin-off from University of Zurich. B.B. is a member of the board of directors and J.R.F. is the CEO. J.R.F., H.W.J., and B.B. have made a patent application related to this work, licensed to Navignostics.
Published: March 14, 2023
Footnotes
Supplemental information can be found online at https://doi.org/10.1016/j.xcrm.2023.100977.
Contributor Information
Hartland Warren Jackson, Email: hjackson@lunenfeld.ca.
Bernd Bodenmiller, Email: bernd.bodenmiller@uzh.ch.
Supplemental information
Clinical information corresponding to each TMA core, including both patient and core-level metadata.
Data and code availability
All data have been deposited on Zenodo (Zenodo: https://doi.org/10.5281/zenodo.7494413, Zenodo: https://doi.org/10.5281/zenodo.7494509, Zenodo: https://doi.org/10.5281/zenodo.7494727), and custom code has been deposited on GitHub and Zenodo (GitHub: https://github.com/BodenmillerGroup/BC_LN_metastses, Zenodo: https://doi.org/10.5281/zenodo.7650864). Both are publicly available as of the date of publication. DOIs are also listed in the key resources table. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
References
- 1. Coates A.S., Winer E.P., Goldhirsch A., Gelber R.D., Gnant M., Piccart-Gebhart M., Thürlimann B., Senn H.J., Panel Members. Tailoring therapies--improving the management of early breast cancer: St Gallen International Expert Consensus on the Primary Therapy of Early Breast Cancer 2015. Ann. Oncol. 2015;26:1533–1546. doi: 10.1093/annonc/mdv221.
- 2. Hammond M.E.H., Hayes D.F., Wolff A.C., Mangu P.B., Temin S. American Society of Clinical Oncology/College of American Pathologists guideline recommendations for immunohistochemical testing of estrogen and progesterone receptors in breast cancer. J. Oncol. Pract. 2010;6:195–197. doi: 10.1200/jop.777003.
- 3. Wolff A.C., Hammond M.E.H., Allison K.H., Harvey B.E., Mangu P.B., Bartlett J.M.S., Bilous M., Ellis I.O., Fitzgibbons P., Hanna W., et al. Human epidermal growth factor receptor 2 testing in breast cancer: American Society of Clinical Oncology/College of American Pathologists clinical practice guideline focused update. J. Clin. Oncol. 2018;36:2105–2122. doi: 10.1200/jco.2018.77.8738.
- 4. Sørlie T., Perou C.M., Tibshirani R., Aas T., Geisler S., Johnsen H., Hastie T., Eisen M.B., van de Rijn M., Jeffrey S.S., et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc. Natl. Acad. Sci. USA. 2001;98:10869–10874. doi: 10.1073/pnas.191367098.
- 5. Perou C.M., Sørlie T., Eisen M.B., van de Rijn M., Jeffrey S.S., Rees C.A., Pollack J.R., Ross D.T., Johnsen H., Akslen L.A., et al. Molecular portraits of human breast tumours. Nature. 2000;406:747–752. doi: 10.1038/35021093.
- 6. Varga Z., Noske A., Ramach C., Padberg B., Moch H. Assessment of HER2 status in breast cancer: overall positivity rate and accuracy by fluorescence in situ hybridization and immunohistochemistry in a single institution over 12 years: a quality control study. BMC Cancer. 2013;13:615. doi: 10.1186/1471-2407-13-615.
- 7. Stefanovic S., Wirtz R., Deutsch T.M., Hartkopf A., Sinn P., Varga Z., Sobottka B., Sotiris L., Taran F.A., Domschke C., et al. Tumor biomarker conversion between primary and metastatic breast cancer: mRNA assessment and its concordance with immunohistochemistry. Oncotarget. 2017;8:51416–51428. doi: 10.18632/oncotarget.18006.
- 8. Varga Z., Lebeau A., Bu H., Hartmann A., Penault-Llorca F., Guerini-Rocco E., Schraml P., Symmans F., Stoehr R., Teng X., et al. An international reproducibility study validating quantitative determination of ERBB2, ESR1, PGR, and MKI67 mRNA in breast cancer using MammaTyper. Breast Cancer Res. 2017;19:55. doi: 10.1186/s13058-017-0848-z.
- 9. Stocker A., et al. Differential prognostic value of positive HER2 status determined by immunohistochemistry or fluorescence in situ hybridization in breast cancer. Breast Cancer Research and Treatment. 2020;183:311–319. doi: 10.1007/s10549-020-05772-6.
- 10. Howlader N., Noone A.M., Krapcho M., Miller D., Bishop K., Kosary C.L., Yu M., Ruhl J., Tatalovich Z., Mariotto A., et al. SEER Cancer Statistics Review, 1975–2014. National Cancer Institute; 2017. https://seer.cancer.gov/csr/1975_2014/
- 11. Carter C.L., Allen C., Henson D.E. Relation of tumor size, lymph node status, and survival in 24,740 breast cancer cases. Cancer. 1989;63:181–187. doi: 10.1002/1097-0142(19890101)63:1<181::aid-cncr2820630129>3.0.co;2-h.
- 12. Muss H.B., Woolf S., Berry D., Cirrincione C., Weiss R.B., Budman D., Wood W.C., Henderson I.C., Hudis C., Winer E., et al. Adjuvant chemotherapy in older and younger women with lymph node-positive breast cancer. JAMA. 2005;293:1073–1081. doi: 10.1001/jama.293.9.1073.
- 13. Cserni G., Chmielik E., Cserni B., Tot T. The new TNM-based staging of breast cancer. Virchows Arch. 2018;472:697–703. doi: 10.1007/s00428-018-2301-9.
- 14. Tsuchiya A., Kanno M., Abe R. The impact of lymph node metastases on the survival of breast cancer patients with ten or more positive lymph nodes. Surg. Today. 1997;27:902–906. doi: 10.1007/bf02388136.
- 15. Cardoso F., Kyriakides S., Ohno S., Penault-Llorca F., Poortmans P., Rubio I.T., Zackrisson S., Senkus E., ESMO Guidelines Committee. Early breast cancer: ESMO Clinical Practice Guidelines for diagnosis, treatment and follow-up. Ann. Oncol. 2019;30:1194–1220. doi: 10.1093/annonc/mdz189.
- 16. Allison K.H., Hammond M.E.H., Dowsett M., McKernin S.E., Carey L.A., Fitzgibbons P.L., Hayes D.F., Lakhani S.R., Chavez-MacGregor M., Perlmutter J., et al. Estrogen and progesterone receptor testing in breast cancer: ASCO/CAP guideline update. J. Clin. Oncol. 2020;38:1346–1366. doi: 10.1200/JCO.19.02309.
- 17. Chavez-MacGregor M., Valero V. Stability of estrogen receptor status in breast carcinoma: a comparison between primary and metastatic tumors with regard to disease course and intervening systemic therapy. Breast Dis. 2011;22:270–271. doi: 10.1016/j.breastdis.2011.06.042.
- 18. Lindström L.S., Karlsson E., Wilking U.M., Johansson U., Hartman J., Lidbrink E.K., Hatschek T., Skoog L., Bergh J. Clinically used breast cancer markers such as estrogen receptor, progesterone receptor, and human epidermal growth factor receptor 2 are unstable throughout tumor progression. J. Clin. Oncol. 2012;30:2601–2608. doi: 10.1200/jco.2011.37.2482.
- 19. Sun Y., Liu X., Cui S., Li L., Tian P., Liu S., Li Y., Yin M., Zhang C., Mao Q., Wang J. The inconsistency of molecular subtypes between primary foci and metastatic axillary lymph nodes in Luminal A breast cancer patients among Chinese women, an indication for chemotherapy? Tumour Biol. 2016;37:9555–9563. doi: 10.1007/s13277-016-4844-1.
- 20. Cserni G. Tumour histological grade may progress between primary and recurrent invasive mammary carcinoma. J. Clin. Pathol. 2002;55:293–297. doi: 10.1136/jcp.55.4.293.
- 21. Varga Z., Caduff R., Pestalozzi B. Stability of the HER2 gene after primary chemotherapy in advanced breast cancer. Virchows Arch. 2005;446:136–141. doi: 10.1007/s00428-004-1164-4.
- 22. Xian Z., Quinones A.K., Tozbikian G., Zynger D.L. Breast cancer biomarkers before and after neoadjuvant chemotherapy: does repeat testing impact therapeutic management? Hum. Pathol. 2017;62:215–221. doi: 10.1016/j.humpath.2016.12.019.
- 23. Aitken S.J., Thomas J.S., Langdon S.P., Harrison D.J., Faratian D. Quantitative analysis of changes in ER, PR and HER2 expression in primary breast cancer and paired nodal metastases. Ann. Oncol. 2010;21:1254–1261. doi: 10.1093/annonc/mdp427.
- 24. Cockburn J.G., Hallett R.M., Gillgrass A.E., Dias K.N., Whelan T., Levine M.N., Hassell J.A., Bane A. The effects of lymph node status on predicting outcome in ER/HER2- tamoxifen treated breast cancer patients using gene signatures. BMC Cancer. 2016;16:555. doi: 10.1186/s12885-016-2501-0.
- 25. Sopik V., Narod S.A. The relationship between tumour size, nodal status and distant metastases: on the origins of breast cancer. Breast Cancer Res. Treat. 2018;170:647–656. doi: 10.1007/s10549-018-4796-9.
- 26. Jackson H.W., Fischer J.R., Zanotelli V.R.T., Ali H.R., Mechera R., Soysal S.D., Moch H., Muenst S., Varga Z., Weber W.P., Bodenmiller B. The single-cell pathology landscape of breast cancer. Nature. 2020;578:615–620. doi: 10.1038/s41586-019-1876-x.
- 27. Ali H.R., Jackson H.W., Zanotelli V.R.T., Danenberg E., Fischer J.R., Bardwell H., Provenzano E., CRUK IMAXT Grand Challenge Team, Rueda O.M., Chin S.F., et al. Imaging mass cytometry and multiplatform genomics define the phenogenomic landscape of breast cancer. Nat. Cancer. 2020;1:163–175. doi: 10.1038/s43018-020-0026-6.
- 28. Wagner J., Rapsomaniki M.A., Chevrier S., Anzeneder T., Langwieder C., Dykgers A., Rees M., Ramaswamy A., Muenst S., Soysal S.D., et al. A single-cell atlas of the tumor and immune ecosystem of human breast cancer. Cell. 2019;177:1330–1345.e18. doi: 10.1016/j.cell.2019.03.005.
- 29. Bedard P.L., Hansen A.R., Ratain M.J., Siu L.L. Tumour heterogeneity in the clinic. Nature. 2013;501:355–364. doi: 10.1038/nature12627.
- 30. Rye I.H., Trinh A., Saetersdal A.B., Nebdal D., Lingjaerde O.C., Almendro V., Polyak K., Børresen-Dale A.L., Helland Å., Markowetz F., Russnes H.G. Intratumor heterogeneity defines treatment-resistant HER2+ breast tumors. Mol. Oncol. 2018;12:1838–1855. doi: 10.1002/1878-0261.12375.
- 31. Keren L., Bosse M., Marquez D., Angoshtari R., Jain S., Varma S., Yang S.R., Kurian A., Van Valen D., West R., et al. A structured tumor-immune microenvironment in triple negative breast cancer revealed by multiplexed ion beam imaging. Cell. 2018;174:1373–1387.e19. doi: 10.1016/j.cell.2018.08.039.
- 32. Giesen C., Wang H.A.O., Schapiro D., Zivanovic N., Jacobs A., Hattendorf B., Schüffler P.J., Grolimund D., Buhmann J.M., Brandt S., et al. Highly multiplexed imaging of tumor tissues with subcellular resolution by mass cytometry. Nat. Methods. 2014;11:417–422. doi: 10.1038/nmeth.2869.
- 33. Bodenmiller B. Multiplexed epitope-based tissue imaging for discovery and healthcare applications. Cell Syst. 2016;2:225–238. doi: 10.1016/j.cels.2016.03.008.
- 34. Levine J.H., Simonds E.F., Bendall S.C., Davis K.L., Amir E.a.D., Tadmor M.D., Litvin O., Fienberg H.G., Jager A., Zunder E.R., et al. Data-driven phenotypic dissection of AML reveals progenitor-like cells that correlate with prognosis. Cell. 2015;162:184–197. doi: 10.1016/j.cell.2015.05.047.
- 35. Brooks S.A., Leathem A.J.C. Expression of the CD15 antigen (Lewis x) in breast cancer. Histochem. J. 1995;27:689–693. doi: 10.1007/bf02388541.
- 36. Ali H.R., Dawson S.J., Blows F.M., Provenzano E., Leung S., Nielsen T., Pharoah P.D., Caldas C. A Ki67/BCL2 index based on immunohistochemistry is highly prognostic in ER-positive breast cancer. J. Pathol. 2012;226:97–107. doi: 10.1002/path.2976.
- 37. Callagy G.M., Webber M.J., Pharoah P.D.P., Caldas C. Meta-analysis confirms BCL2 is an independent prognostic marker in breast cancer. BMC Cancer. 2008;8:153. doi: 10.1186/1471-2407-8-153.
- 38. Honma N., Horii R., Ito Y., Saji S., Younes M., Iwase T., Akiyama F. Differences in clinical importance of Bcl-2 in breast cancer according to hormone receptors status or adjuvant endocrine therapy. BMC Cancer. 2015;15:698. doi: 10.1186/s12885-015-1686-y.
- 39. He L., Du Z., Xiong X., Ma H., Zhu Z., Gao H., Cao J., Li T., Li H., Yang K., et al. Targeting androgen receptor in treating HER2 positive breast cancer. Sci. Rep. 2017;7:14584–14610. doi: 10.1038/s41598-017-14607-2.
- 40. Simon R., Nocito A., Hübscher T., Bucher C., Torhorst J., Schraml P., Bubendorf L., Mihatsch M.M., Moch H., Wilber K., et al. Patterns of her-2/neu amplification and overexpression in primary and metastatic breast cancer. J. Natl. Cancer Inst. 2001;93:1141–1146. doi: 10.1093/jnci/93.15.1141.
- 41. Edgerton S.M., Moore D., Merkel D., Thor A.D. erbB-2 (HER-2) and breast cancer progression. Appl. Immunohistochem. & Mol. Morphol. 2003:214–221. doi: 10.1097/00129039-200309000-00003.
- 42. Zidan J., Dashkovsky I., Stayerman C., Basher W., Cozacov C., Hadary A. Comparison of HER-2 overexpression in primary breast cancer and metastatic sites and its effect on biological targeting therapy of metastatic disease. Br. J. Cancer. 2005;93:552–556. doi: 10.1038/sj.bjc.6602738.
- 43. Shannon C.E. A mathematical theory of communication. Bell System Technical Journal. 1948;27:623–656. doi: 10.1002/j.1538-7305.1948.tb00917.x.
- 44. Montemurro F., Aglietta M. Hormone receptor-positive early breast cancer: controversies in the use of adjuvant chemotherapy. Endocr. Relat. Cancer. 2009;16:1091–1102. doi: 10.1677/ERC-09-0033.
- 45. Joerger M., Thürlimann B. Chemotherapy regimens in early breast cancer: major controversies and future outlook. Expert Rev. Anticancer Ther. 2013;13:165–178. doi: 10.1586/era.12.172.
- 46. Dagogo-Jack I., Shaw A.T. Tumour heterogeneity and resistance to cancer therapies. Nat. Rev. Clin. Oncol. 2018;15:81–94. doi: 10.1038/nrclinonc.2017.166.
- 47. McGranahan N., Swanton C. Biological and therapeutic impact of intratumor heterogeneity in cancer evolution. Cancer Cell. 2015;27:15–26. doi: 10.1016/j.ccell.2014.12.001.
- 48. Hart I.R., Easty D. Tumor cell progression and differentiation in metastasis. Semin. Cancer Biol. 1991;2:87–95.
- 49. Jögi A., Vaapil M., Johansson M., Påhlman S. Cancer cell differentiation heterogeneity and aggressive behavior in solid tumors. Ups. J. Med. Sci. 2012;117:217–224. doi: 10.3109/03009734.2012.659294.
- 50. Schapiro D., Jackson H.W., Raghuraman S., Fischer J.R., Zanotelli V.R.T., Schulz D., Giesen C., Catena R., Varga Z., Bodenmiller B. histoCAT: analysis of cell phenotypes and interactions in multiplex image cytometry data. Nat. Methods. 2017;14:873–876. doi: 10.1038/nmeth.4391.
- 51. Chevrier S., Crowell H.L., Zanotelli V.R.T., Engler S., Robinson M.D., Bodenmiller B. Compensation of signal spillover in suspension and imaging mass cytometry. Cell Syst. 2018;6:612–620.e5. doi: 10.1016/j.cels.2018.02.010.
- 52. Crowell H., Zanotelli V., Chevrier S., Robinson M. 2020. CATALYST: Cytometry dATa anALYSis Tools. R package version 1.12.2. https://github.com/HelenaLC/CATALYST
- 53. Berg S., Kutra D., Kroeger T., Straehle C.N., Kausler B.X., Haubold C., Schiegg M., Ales J., Beier T., Rudy M., et al. ilastik: interactive machine learning for (bio)image analysis. Nat. Methods. 2019;16:1226–1232. doi: 10.1038/s41592-019-0582-9.
- 54. McQuin C., Goodman A., Chernyshev V., Kamentsky L., Cimini B.A., Karhohs K.W., Doan M., Ding L., Rafelski S.M., Thirstrup D., et al. CellProfiler 3.0: Next-generation image processing for biology. PLoS Biol. 2018;16:e2005970. doi: 10.1371/journal.pbio.2005970.
- 55. Zanotelli W.V., ndamond, Strotton M. 2020. BodenmillerGroup/ImcSegmentationPipeline: IMC Segmentation Pipeline (Version v0.9.
- 56. Scrucca L., Fop M., Murphy T., Raftery A. Mclust 5: clustering, classification and density estimation using Gaussian finite mixture models. R J. 2016;8:289–317. doi: 10.32614/RJ-2016-021.
- 57. Chen H., Lau M.C., Wong M.T., Newell E.W., Poidinger M., Chen J. Cytofkit: a bioconductor package for an integrated mass cytometry data analysis pipeline. PLoS Comput. Biol. 2016;12:e1005112. doi: 10.1371/journal.pcbi.1005112.
- 58. Gu Z., Eils R., Schlesner M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics. 2016;32:2847–2849. doi: 10.1093/bioinformatics/btw313.
- 59. Robinson M.D., McCarthy D.J., Smyth G.K. A Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139–140. doi: 10.1093/bioinformatics/btp616.
- 60. Weber L. 2020. Diffcyt: Differential Discovery in High-Dimensional Cytometry via High-Resolution Clustering. R package version 1.8.8. https://github.com/lmweber/diffcyt
- 61. Blondel V.D., Guillaume J.-L., Lambiotte R., Lefebvre E. Fast unfolding of communities in large networks. J. Stat. Mech. 2008;2008:P10008. doi: 10.1088/1742-5468/2008/10/p10008.
- 62. Hausser J., Strimmer K. Entropy inference and the James-Stein estimator, with application to nonlinear gene association networks. J. Mach. Learn. Res. 2009;10:1469–1484. http://jmlr.csail.mit.edu/papers/v10/hausser09a.html
- 63. Borgan Ø. Vol. 20. Springer-Verlag; 2001. Modeling Survival Data: Extending the Cox Model; pp. 2053–2054.
- 64. Friedman J., Hastie T., Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J. Stat. Softw. 2010;33:1–22. http://www.jstatsoft.org/v33/i01/
- 65. Simon N., Friedman J., Hastie T., Tibshirani R. Regularization paths for Cox’s proportional hazards model via coordinate descent. J. Stat. Softw. 2011;39:1–13. doi: 10.18637/jss.v039.i05. http://www.jstatsoft.org/v39/i05/
- 66. Bankhead P., Loughrey M.B., Fernández J.A., Dombrowski Y., McArt D.G., Dunne P.D., McQuaid S., Gray R.T., Murray L.J., Coleman H.G., et al. QuPath: open source software for digital pathology image analysis. Sci. Rep. 2017;7:16878–16887. doi: 10.1038/s41598-017-17204-5.
- 67. Eling N., Damond N., Hoch T., Bodenmiller B. Cytomapper: an R/bioconductor package for visualisation of highly multiplexed imaging data. Bioinformatics. 2020;36:5706–5708. doi: 10.1093/bioinformatics/btaa1061.