Summary
Breast cancer immune response is important to patient outcome, but the prognostic interaction between tissue-infiltrating immune cell (TIIC) types is not well-characterized. We evaluated the associations between CD8+, FOXP3+, CD20+, and CD163+ TIICs and breast cancer-specific survival (BCSS). We developed an artificial intelligence algorithm in Halo to score TIIC percentage by compartment (overall, stromal, or intra-tumoral) in 99,051 microarray images from 12,285 female breast cancers. The associations between log-transformed TIIC scores and BCSS were assessed using Cox regression. CD8+ and FOXP3+ TIICs were associated with better BCSS in ER-negative disease; CD8+ and CD20+ TIICs were associated with a better prognosis in ER-positive disease; and CD163+ TIICs were associated with a poorer prognosis in ER-positive disease in multi-marker models. These results may have implications for breast cancer immunotherapy.
Subject areas: Oncology, Immunology, Artificial intelligence
Highlights
- Tissue-infiltrating CD8 T cells and B-cells associate with better outcomes in ER+ breast cancer
- Tissue-infiltrating CD8 and FOXP3 T cells associate with better outcomes in ER− breast cancer
- Tissue-infiltrating macrophages associate with poorer outcomes in ER+ breast cancer
Introduction
Survival after a diagnosis of breast cancer is affected by many patient and tumor characteristics, including age at diagnosis, tumor size, tumor grade, local and regional lymph node status, and expression of tumor biomarkers including estrogen receptor (ER), progesterone receptor (PR), KI67 and HER2. Several multiparameter molecular tests have been developed to aid prognostication and treatment decision-making in patients with breast cancer.1 In addition to the molecular characteristics of the neoplastic parenchyma, there is accumulating evidence that nontumor cells might also influence disease prognosis.
A wide variety of tissue-infiltrating immune cells (TIICs), such as cytotoxic T-lymphocytes (CD8+), T-helper lymphocytes (CD4+), B-lymphocytes (CD20+), natural killer cells (NK cells), and macrophages,2,3 characterize the immune landscape of tumors. TIICs can occur in direct, cell-to-cell contact with tumor cells (intra-tumoral TIICs) or within the connective tissue stroma surrounding tumor cells (i.e., stromal TIICs).4,5 The prognostic associations of TIICs have been evaluated across several studies, but these were limited by a wide range of study designs with varying endpoints, sample sizes, metrics for immune cell infiltration, and statistical methodologies. As a result, the findings of these studies are inconsistent, but some broad patterns have emerged. Total tissue infiltrating lymphocytes are associated with a greater pathological complete response to neoadjuvant chemotherapy and improved disease-free survival in women with triple-negative breast cancer (TNBC) or HER2-positive breast cancer.5,6,7,8 Of the specific immune cell types, CD8+ T-lymphocytes have been studied the most. 
A 2023 meta-analysis based on 14 studies found that CD8+ TIICs were associated with improved overall survival and disease-free survival, with similar associations for both intra-tumoral and peritumoral CD8+ TIICs.9 They have also been found to be associated with better outcomes for women with ER-negative tumors (both triple-negative and HER2-positive),10 ER-positive/HER2-negative tumors,10 and ER-positive tumors overall.11 FOXP3+ T-lymphocytes have been associated with improved pathological complete response to neoadjuvant chemotherapy and overall survival in triple-negative and HER2-positive breast cancer, with some evidence that this effect is restricted to FOXP3+ TIICs in the stromal compartment.12 Tissue-infiltrating B-lymphocytes (CD20+) have been associated with better outcomes, with few data on either ER-status specific effects or on tissue compartment effects.13 One study reported that tissue-infiltrating B-lymphocytes are associated with better outcomes for both patients with HER2-positive and triple-negative breast cancer.14 In contrast, tissue-infiltrating macrophages (CD68+ or CD163+) have been found to be associated with worse overall survival and progression-free survival.15
A few studies have evaluated more than one immune cell type in breast cancer. A high ratio of CD8+ to FOXP3+ TIICs has been associated with better disease-free survival in triple-negative16,17 and HER2-positive16 breast cancers, and worse disease-free survival in luminal (ER-positive) breast cancers.18 One study used deconvoluted bulk gene expression data to evaluate the role of all the above markers (CD8, FOXP3, CD20, CD163, and CD68).19 Increased CD20+ cells were associated with improved outcomes in patients with ER-positive breast cancer, with similar associations for CD20+ and CD8+ lymphocytes in patients with ER-negative breast cancer. Poorer outcomes were observed for FOXP3+ lymphocytes and CD68+ macrophages in patients with ER-positive disease, and for FOXP3+ lymphocytes and CD163+ macrophages in patients with ER-negative disease.
The aim of this study was to clarify the prognostic associations of subsets of TIICs, specifically CD8+, FOXP3+, CD20+, and CD163+, in ER-positive and ER-negative breast cancer by using machine learning algorithms for the high-throughput, quantitative assessment of TIICs, including their spatial localization within tissues. We addressed these aims in a large, multicenter study comprising more than 12,000 patients with clinically annotated breast cancer tissues on TMAs that were stained using IHC for four markers representing the major immune cell subtypes.
Results
There was a total of 99,051 core images from 12,285 patients after the application of initial case exclusion criteria. The patient and tumor characteristics of the cases are summarized in Table 1. The CD8+ TIIC scores were validated by comparing them to manual scoring by a pathologist (HRA) of 2,247 cores from the SEARCH study, as previously reported.10 A good correlation was observed for total score (Spearman’s Rho = 0.76), intra-tumoral score (Rho = 0.55), and stromal score (Rho = 0.62) (Figure S1).
Table 1.
Characteristics of 12,285 cases by the ER status
| Characteristic | ER negative | % | ER positive | % | ER missing | % |
|---|---|---|---|---|---|---|
| Number of cases | 2,721 | – | 8,649 | – | 915 | – |
| Age group (years) | | | | | | |
| 20–39 | 496 | 18 | 1,014 | 12 | 66 | 7 |
| 40–59 | 1,405 | 52 | 4,152 | 48 | 528 | 58 |
| 60–79 | 797 | 29 | 3,350 | 39 | 316 | 35 |
| 80+ | 23 | 1 | 133 | 2 | 5 | 1 |
| Grade | | | | | | |
| 1 | 98 | 3.6 | 1,526 | 18 | 102 | 11 |
| 2 | 606 | 22 | 4,538 | 53 | 295 | 32 |
| 3 | 1,951 | 72 | 2,270 | 26 | 334 | 37 |
| Missing | 66 | 2.4 | 315 | 4 | 184 | 20 |
| Tumor size (mm) | | | | | | |
| 1–19 | 912 | 34 | 3,909 | 45 | 288 | 32 |
| 20–49 | 1,280 | 47 | 3,093 | 36 | 295 | 32 |
| 50+ | 174 | 6 | 413 | 5 | 39 | 4 |
| Missing | 355 | 13 | 1,234 | 14 | 293 | 32 |
| Positive nodes | | | | | | |
| 0 | 1,447 | 53 | 4,511 | 52 | 349 | 38 |
| 1–3 | 696 | 26 | 2,306 | 27 | 204 | 22 |
| 4–9 | 268 | 10 | 709 | 8 | 99 | 11 |
| 10+ | 119 | 4.4 | 400 | 5 | 28 | 3 |
| Missing | 191 | 7 | 723 | 8 | 235 | 26 |
| Breast cancer death | | | | | | |
| No | 2,039 | 75 | 7,277 | 84 | 744 | 81 |
| Yes | 682 | 25 | 1,372 | 16 | 171 | 19 |
| Follow-up (years) | | | | | | |
| Mean (SD) | 8.26 (4) | | 9.22 (4) | | 8.06 (4) | |
The choice of untransformed or log-transformed TIIC scores was based on the results of an initial set of single-marker, complete case analyses. We ran 576 single-marker Cox regression models, given all possible combinations of four markers, partially- and fully-adjusted models, three tissue compartments, ER-positive and ER-negative models, six thresholds for minimum tissue area, and untransformed and transformed TIIC scores. There were 288 models for each of the analyses using the untransformed and log-transformed score variable. The log-transformed score was the better model in 235, based on the model log likelihoods. There were 96 models at each tissue area threshold, with the 0.05 mm2 threshold being best in 35 models and the 0.30 mm2 threshold being best in 22 models. The results for all 576 complete case analysis models are provided in Table S1.
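The model count above follows directly from the factorial design. As a minimal illustrative sketch (in Python, not the authors' R code), the grid can be enumerated as follows; the six area thresholds are assumed to run from 0.05 to 0.30 mm2 in steps of 0.05, and all variable names are hypothetical.

```python
from itertools import product

# Design factors for the single-marker, complete case models.
markers = ["CD8", "FOXP3", "CD20", "CD163"]
adjustments = ["partial", "full"]
compartments = ["overall", "stromal", "intra-tumoral"]
er_strata = ["ER-positive", "ER-negative"]
# Assumption: six evenly spaced minimum tissue area thresholds (mm2).
area_thresholds_mm2 = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
score_scales = ["untransformed", "log"]

# One tuple per model specification.
model_grid = list(product(markers, adjustments, compartments,
                          er_strata, area_thresholds_mm2, score_scales))

n_total = len(model_grid)
n_log = sum(1 for spec in model_grid if spec[5] == "log")
n_per_threshold = sum(1 for spec in model_grid if spec[4] == 0.05)

print(n_total, n_log, n_per_threshold)  # 576 288 96
```

The three counts correspond to the 576 models overall, the 288 models per score scale, and the 96 models at each tissue area threshold.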
Single marker analyses
Subsequent analyses were based on the data for all cases after multiple imputation, using a minimum tissue area threshold of 0.05 mm2 and the log-transformed percent tissue area TIIC score. In single marker analyses, CD8+ TIICs in both tumoral and stromal compartments were associated with improved BCSS in ER-positive and ER-negative disease (Figure 1). The effect sizes were similar in the partially and fully adjusted models. FOXP3+ TIICs were associated with a better prognosis in ER-negative and in ER-positive cases, but the effect in ER-positive disease was only observed in the fully adjusted models. CD20+ TIICs in both compartments were associated with better breast cancer-specific survival in both ER-positive and ER-negative disease, with no attenuation of the association in the fully adjusted models. Finally, CD163+ TIICs in both compartments were associated with better prognosis in ER-negative cases. In ER-positive cases, CD163+ TIICs were associated with a poorer prognosis in the partially adjusted models, but the effect was completely attenuated after adjusting for grade, size and number of positive nodes. The results for the single-marker, fully adjusted models did not vary substantially by tissue area exclusion threshold (Figure S2 and Table S1).
Figure 1.
Hazard ratio and 95% confidence interval for association between TIIC score and breast cancer-specific survival by marker, ER status, and tissue compartment in single marker models using imputed data
Brown = partially adjusted model (adjusted for age and study), teal = fully adjusted model (adjusted for age, tumor size, tumor grade, number of positive nodes, and study).
Multi-marker analyses
The results of the multi-marker, partially and fully adjusted models, including all four immune cell markers using the imputed data, are shown in Figure 2. CD8+ TIICs were associated with improved prognosis in ER-positive disease, with little difference between the tumor and stromal compartments. CD8+ TIICs were also associated with an improved prognosis in ER-negative disease, with a slightly stronger effect for CD8+ infiltration in the tumor compartment. CD20+ cells were associated with improved prognosis in ER-positive disease, with little difference between the tumor and stromal compartments. In ER-negative disease, CD20+ TIICs were associated with a better prognosis in partially adjusted models, but the effects were slightly attenuated in the fully adjusted models, with the association being nominally significant only for the stromal compartment. CD163+ TIICs were not associated with prognosis in ER-negative cases and were associated with a poorer prognosis in ER-positive cases, with similar effects in both stromal and tumoral compartments. FOXP3+ infiltration was not associated with outcome in ER-positive disease and was associated with a better prognosis in ER-negative disease. Overall, there was limited evidence of interstudy heterogeneity in the effect estimates (Table S2), despite the underlying heterogeneity in the study designs, including the year of diagnosis, the time of storage of pathology material, and methods for TMA construction. The fraction of the total variance in breast cancer-specific survival explained by the model that was accounted for by the TIIC scores was 11 percent for the ER-negative model and 3.8 percent for the ER-positive model. In comparison, the fraction of the variance explained for other prognostic factors was 8.6 percent and 2.3 percent for age, 9.1 percent and 7.7 percent for tumor size, 30 percent and 29 percent for number of positive lymph nodes, and 12 percent and 14 percent for tumor grade.
Figure 2.
Hazard ratio and 95% confidence interval for the association between TIIC score and breast cancer-specific survival by ER status, marker, and tissue compartment for partially and fully adjusted multi-marker models using imputed datasets
Subgroup analyses
We investigated whether the associations of TIICs with prognosis varied by tumor HER2 status and by whether the patient had been treated with adjuvant chemotherapy or not. The effect sizes for patients treated with chemotherapy compared with those for patients who did not receive chemotherapy, and for patients with HER2-positive tumors compared with those with HER2-negative tumors, are shown in Figure 3 and Table S3. We found little evidence that the effects vary by these subgroups, with no significant differences after adjusting for multiple testing.
Figure 3.
Log hazard ratios based on fully adjusted single and multi-marker models for the association between TIIC score and breast cancer-specific survival by subgroup based on treatment with chemotherapy and tumor HER2 status (p > 0.05 for all comparisons)
Discussion
We evaluated the association with breast cancer-specific mortality of the presence of tissue-infiltrating cytotoxic T cells (CD8), regulatory T cells (FOXP3), B-cells (CD20), and M2 macrophages (CD163), along with well-established prognostic factors in female breast cancer.
Our findings for the single marker analyses are broadly consistent with those in the literature, with some notable differences. We found a better prognosis for CD8+ and CD20+ TIICs in both ER-positive and ER-negative disease and for TIICs in both stromal and intra-tumoral compartments. Published studies have found that increased CD8+ TIICs predict better outcomes in ER-negative but not ER-positive breast cancer apart from the small subgroup of ER-positive that is also HER2-positive.9,10 We found no significant difference in the effect estimates for CD8 in ER-positive/HER2-negative and ER-positive/HER2-positive cases (Figure 3). While an association of CD20+ B-lymphocytes with survival for women with invasive breast cancer has been described previously, associations specific to ER-negative or ER-positive disease have not been shown. An association of FOXP3+ TIICs with better survival is well established in ER-positive and ER-negative disease12,20 and is supported by our single marker analyses. We found CD163+ macrophage infiltration to be associated with a poorer survival in ER-positive patients, although the association was completely attenuated on adjustment for other prognostic variables. This is similar to the few published studies that have reported on CD163+ macrophages in ER-positive disease.15 In contrast, we found infiltrating CD163+ macrophages to be associated with an improved survival in ER-negative breast cancer, whereas most published studies of ER-negative or triple-negative breast cancers have reported a poorer prognosis for CD163+ TIICs.15 The reasons for these differences are unclear. The difference between our results and those of previous studies for CD8+ and FOXP3+ in ER-positive cases is likely to be due to the substantially increased statistical power of our study to detect modest effects. The improved survival observed for CD163+ TIICs in ER-negative cases is harder to explain. Given the concordance of our findings for CD163+ TIICs with other studies in ER-positive cases, it seems unlikely that bias can be an explanation. Chance is a possible but unlikely explanation, given the opposite direction of effect in our study.
No published studies have reported on the effects of all four of these markers in multi-marker models. In fully adjusted models in ER-positive cases, CD8+ and CD20+ TIICs were associated with better survival, and CD163+ TIICs were associated with poorer survival. In fully adjusted models in ER-negative disease, CD8+ and FOXP3+ TIICs were associated with better survival, with no association for CD163+ TIICs or CD20+ TIICs.
Our study was based on samples of tumors of all sizes, grades, and subtypes from several countries, with the majority of cases being of white European ancestry. The study is representative of the European population. Given the similar associations of the well-established prognostic factors in diverse populations, it seems likely that the findings would also apply to other populations. We provided solid evidence to confirm known associations between CD8+, CD163+, and CD20+ TIICs and breast cancer survival in ER-negative disease and provided novel evidence for associations between those markers and ER-positive disease. There was little evidence for subgroup-specific effects for subgroups based on tumor HER2 status or subgroups based on treatment with adjuvant chemotherapy. This supports further consideration of the inclusion of ER-positive patients in clinical trials of immune modulators.
Limitations of the study
Manual validation of the automated scoring of the tissue-infiltrating immune cells was only performed for CD8. However, the appearance of CD8, FOXP3, CD20, and CD163-stained lymphocytes is similar, and it seems highly likely that the correlation observed for CD8 would be similar for the other markers. Given that manual scoring by a pathologist cannot be considered a “gold standard” – the correlation between the scoring by two pathologists of immunohistochemistry is often 0.5 to 0.8 – it is not clear how any given correlation between the pathologist and automated score would affect the interpretation of the observed positive associations. We have observed strong associations with the automated scores, which implies that the measure is biologically relevant.
We assumed that 971 patients with an unknown cause of death – out of 2,738 deaths within 15 years of follow-up – died of breast cancer. This will be associated with some misclassification, particularly in the period 10–15 years after diagnosis, when almost half of deaths are due to causes other than breast cancer (Table S4). Consequently, this may result in underestimation of the effect sizes. We therefore repeated all the analyses using breast cancer-specific mortality and censoring those with an unknown cause of death. As expected, we observed a small attenuation in the effect sizes when including the unknowns (Figure S3A), but the increased number of events improved power as evidenced by a smaller P-value (Figure S3B).
Breast tumors are spatially heterogeneous, and the cores sampled for tissue microarrays cannot capture all the heterogeneity. This is also likely to result in some underestimation of effect sizes. We explored the potential effect of this by comparing the complete case analysis results for two of the strongest single marker associations, that for CD8+ TIICs in both tissue compartments in ER-negative disease (HRfull = 0.81, p = 5.2 × 10⁻⁶) and CD20+ in ER-positive disease (HRfull = 0.82, p = 3.2 × 10⁻⁶), in subsets of the data based on cases represented by a single core and cases represented by two or more cores. The effect size for the CD8+ analysis was somewhat larger for the multicore case sample (HRfull = 0.79, 95% CI 0.70–0.88) than for the single core case sample (HRfull = 0.84, 95% CI 0.72–0.98). The difference between the samples for the CD20+ tumor infiltration score was more substantial (HRfull = 0.76, 95% CI 0.67–0.86, compared to HRfull = 0.87, 95% CI 0.78–0.98). This is consistent with the observation that B-lymphocytes tend to aggregate, whereas T cells do not.21
A key question relating to any prognostic biomarker is whether or not that biomarker is predictive of response to therapy (where response is measured on a relative scale, such as relative risk reduction). However, our data are from observational studies, and it is not possible to answer the question of whether or not the benefit of chemotherapy varies by tissue-infiltrating immune cells, a question that can only be reliably answered by a randomized controlled trial of chemotherapy.
Resource availability
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Paul Pharoah (paul.pharoah@cshs.org).
Materials availability
This study did not generate new materials.
Data and code availability
- Data: The tissue segmentation and TIIC scores generated by the Halo algorithm, together with the de-identified phenotype data, have been deposited in the European Genome Phenome Archive at https://ega-archive.org/datasets/EGAD50000002125 and are publicly available as of the date of publication.
- Code: The analysis code (R markdown) has been deposited on GitHub at https://github.com/paul-pharoah/breast-til and is publicly available as of the date of publication.
- Additional information: Any additional information required to reanalyze the data reported in this article is available from the lead contact upon request.
Acknowledgments
The authors thank the Histology and Genomics Cores at the Cancer Research UK Cambridge Institute for technical support, and all patients, researchers, clinicians, and administrative personnel who contributed to the studies taking part in BCAST. The CPSII study investigators acknowledge the contribution to this study from central cancer registries supported through the Centers for Disease Control and Prevention’s National Program of Cancer Registries and cancer registries supported by the National Cancer Institute’s Surveillance Epidemiology and End Results Program.
BCAC was supported by the Cancer Research UK grant: PPRPGMNov20∖100002 and by core funding from the NIHR Cambridge Biomedical Research Centre (NIHR203312). The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care. The BCAST project was supported by the Horizon 2020 Research and Innovation Programs of the European Union B (grant number: 633784), the NIHR Cambridge Biomedical Research Centre and US Federal funds from the National Cancer Institute and National Institutes of Health under contract number 75N91019D00024. The content of this publication does not necessarily reflect the views or policies of the US Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the US Government. AJB was supported by the NIH Oxford-Cambridge Scholars and Gates Cambridge Scholars programs. The funding of the contributing studies is listed in Table S5.
Author contributions
Conceptualization: M.G.C., P.D.P.P., and M.K.S. Data curation: F.M.B., R.K., M.K.B., and R.M. Formal analysis: A.J.B. and P.D.P.P. Funding acquisition: D.F.E., M.G.C., P.D.P.P., and M.K.S. Investigation: H.R.A., M.A., M.A.D., A.J.B., and J.W.M.M. Methodology: A.J.B., A.M.A., and P.L. Project administration: F.M.B., R.K., and R.L.M. Resources: A.A., H.R.A., I.A., T.U.A., A.B., C.B., H.B., K.B., S.B., A.C., C.C., F.Ca, F.Co, J.C., M.C., N.J.C., S.S.C., P.D., D.E., D.F.E., G.G., J.G., M.G.C., A.H.G., A.Ho, A.Hu, H.H., M.H., M.J.H., A.Jag, A.Jak, J.L.J., R.K., J.Li, J.Lis, J.Lu, S.M.L., A.M.M., J.W.M.M., J.L.M., K.M., T.M., W.E.M., N.O., A.P., T.C.P., P.D.P.P., M.U.R., B.S., M.K.S., L.R.T., W.T., A.Jvd.B, C.H.Mv.D, and S.Y. Software: M.A. and A.J.B. Supervision: M.A., M.G.C., R.L.M., P.D.P.P., and M.K.S. Visualization: A.J.B., A.M.A., M.G.C., P.D.P.P., and M.K.S. Writing of original draft: A.J.B., A.M.A., M.G.C., P.D.P.P., and M.K.S. Writing review and editing: all authors.
Declaration of interests
The authors declare no competing interests.
STAR★Methods
Key resources table
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Antibodies | ||
| CD163 mouse monoclonal | Novocastra | Cat# NCL-L-CD163; RRID: AB_2756375 |
| CD20 mouse monoclonal | Novocastra | Cat# NCL-L-CD20-L26; RRID: AB_563521 |
| CD8 rabbit monoclonal | Neomarkers | Cat# RM-9116-S; RRID: AB_149960 |
| FOXP3 mouse monoclonal | Abcam | Cat# ab20034; RRID: AB_445284 |
| Software and algorithms | ||
| Data analysis software | R Foundation for Statistical Computing | https://www.r-project.org/ |
| PathXL software | Philips Digital and Computational Pathology | – |
| Analysis code | R scripts | https://github.com/paul-pharoah/breast-til/ |
| Deposited data | ||
| De-identified case data | European Genome Phenome Archive | EGA: EGAD50000002125 |
Experimental model and study participant details
Twenty-two studies from the Breast Cancer Association Consortium (bcac.ccge.medschl.cam.ac.uk) provided 323 tissue microarrays for staining for CD8, FOXP3, CD20, and CD163 as part of a larger project (BCAST) to generate molecular pathology data on 15 markers in approximately 20,000 breast tumors. Arrayed tumors were from female breast cancer patients from Canada, Germany, the Netherlands, Poland, Singapore, the UK, and the USA who were diagnosed between 1961 and 2015. Clinicopathological data were available for ER status, age at diagnosis, tumor size, number of positive lymph nodes, vital status, cause of death, and follow-up time; methods of ascertaining vital status and cause of death are summarized in Table S6. Patients diagnosed before 2000, missing vital status or age at diagnosis, or without any follow-up time after entry into the study were excluded, resulting in a total of 12,285 cases from twenty studies in the analyses (Table 2). Of these, 915 were missing ER status data.
Table 2.
Summary of studies contributing eligible cases
| Study | Country | Cases (N)a | Year of diagnosisa | Age at diagnosis, mean [min, max] | Tissue microarray core size (mm) | Tissue microarrays (N) |
|---|---|---|---|---|---|---|
| ABCS | Netherlands | 461 | 2003–2011 | 42 [18, 49] | 0.6 | 14 |
| UKBGS | UK | 629 | 2004–2014 | 56 [24, 84] | 1.0 | 28 |
| CPS2 | USA | 330 | 2000–2009 | 73 [57, 87] | 1.0 | 7 |
| EPIC | Germany | 263 | 2000–2000 | 58 [40, 75] | 1.0 | 9 |
| ESTHER | Germany | 258 | 2001–2004 | 62 [50, 75] | 1.0 | 6 |
| MARIE | Germany | 1484 | 2001–2005 | 62 [49, 75] | 1.0 | 30 |
| MCBCS | USA | 196 | 2001–2008 | 56 [26, 83] | 0.6 | 4 |
| MMHS | USA | 151 | 2003–2013 | 66 [45, 89] | 0.6 | 4 |
| NEAT | UK | 1944 | 2000–2001 | 49 [26, 76] | 0.6 | 16 |
| ORIGO | Netherlands | 234 | 2000–2006 | 53 [22, 83] | 0.6 | 11 |
| PBCS | Poland | 1273 | 2000–2003 | 56 [27, 75] | 1.0 | 22 |
| PLCO | USA | 583 | 2000–2010 | 68 [55, 85] | 1.0 | 24 |
| POSH | UK | 835 | 2000–2007 | 36 [19, 41] | 1.0 | 39 |
| RBCS | Netherlands | 360 | 2000–2009 | 45 [25, 78] | 0.6 | 15 |
| SBCS | UK | 108 | 2012–2015 | 59 [24, 87] | 0.6 | 4 |
| SEARCH | UK | 2244 | 2000–2009 | 55 [23, 69] | 0.6 | 32 |
| SGBCC | Singapore | 178 | 2000–2013 | 54 [25, 81] | 1.0 | 2 |
| SKKDKFZS | Germany | 371 | 2000–2005 | 62 [27, 88] | 0.6 | 19 |
| SZBCS | Poland | 178 | 2002–2010 | 55 [31, 85] | 0.6 | 4 |
| UNC | USA | 205 | 2000–2009 | 61 [30, 91] | 2.0 | 25 |
| All | – | – | 2000–2015 | – | – | 315 |
a After the exclusion of cases diagnosed before 2000, missing vital status or age at diagnosis, or without any follow-up time after entry into the study.
All participants provided written informed consent. The ethics committees or institutional review boards responsible for oversight of the individual studies are listed in Table S7.
Method details
Immunohistochemistry
Tissue microarrays are paraffin blocks in cassettes containing multiple cylindrical cores of tumor with a diameter between 0.6 mm and 2 mm, extracted from formalin-fixed and paraffin-embedded surgical tumor tissue blocks. The specific locations from which the cores were taken were determined by pathologists for each study to be the most representative of the overall tumor structure/composition. Eleven studies used 0.6 mm cores, ten studies used 1 mm cores, and one study used 2 mm cores. Tissue microarrays were sent to the BCAST coordinating center at the University of Cambridge for sectioning and staining by the Histology Core at the Cancer Research UK Cambridge Institute. CD8, FOXP3, CD20, and CD163 were selected as markers of key components of the immune response: CD8+ for cytotoxic response/immune upregulation, FOXP3+ for cytotoxic response/immune downregulation, CD20+ for humoral immunity, and CD163+ for innate immunity/immune downregulation. Details of the antibodies used for each marker are presented in Table S8. The stained sections were then scanned at 20× using a Leica Aperio AT2 scanner, and the images were stored using PathXL software. The scanned images for each stained tissue microarray section were de-arrayed using study-specific tissue microarray maps, and an image of each stained core was exported as a JPEG file for downstream analysis. Tissue microarrays included more than one tissue core per case for about 50% of cases (Table S9). The dataset comprised 99,051 stained cores (25,017 CD8; 24,720 FOXP3; 24,697 CD20; and 24,167 CD163) from 12,285 cases (Table S10).
Image analysis
Immune cell tissue infiltration scores were generated using Halo, a digital pathology platform produced by Indica Labs (Indica Labs, Albuquerque, NM). The image analysis model was composed of two parts: an immune cell detection component and a tissue segmentation component (see supplementary methods for details). Immune cell detection was performed by a proprietary immune cell detection script with minor hyperparameter optimization. Development of the tissue segmentation component was an iterative process using 553 pathologist-validated annotations (MAD, MA) of core images (150 CD8+, 104 FOXP3+, 179 CD20+, 120 CD163+) to identify tumor, stroma, artifact, and glass (see supplementary methods).
The automated algorithm was then applied to the complete set of images to generate the total area of tumor, stroma and artifact with the area of each occupied by TIICs. Individual core images were excluded from subsequent analyses if no tumor or stromal tissue was detected. The core-level immune cell score was then given as percentage of the compartmental tissue area occupied by TIICs. The number of cores per case varied between studies; case-level scores were taken as the mean value for cases with multiple valid tissue core scores. The automated scoring was validated against pathologist scores for CD8+ in the SEARCH study, as well as expert pathologist consensus on 80 validation images (20 per marker).
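The scoring steps described above can be sketched as follows. This is a minimal Python illustration rather than the Halo pipeline or the authors' R code: the function names, the input layout (TIIC area and compartment tissue area per core, in mm2), and the use of a "greater than or equal to" comparison for the minimum tissue area are all assumptions made for the example.

```python
from statistics import mean

def core_score(tiic_area_mm2, compartment_area_mm2):
    """Percentage of the compartment area occupied by TIICs.

    Returns None when no tissue was detected in the compartment.
    """
    if compartment_area_mm2 <= 0:
        return None
    return 100.0 * tiic_area_mm2 / compartment_area_mm2

def case_score(cores, min_area_mm2=0.05):
    """Mean core-level score over the valid cores of one case.

    `cores` is a list of (tiic_area_mm2, compartment_area_mm2) pairs.
    Cores below the minimum tissue area are excluded (assumed rule).
    Returns None when no core is valid.
    """
    scores = [core_score(tiic, area) for tiic, area in cores
              if area >= min_area_mm2]
    scores = [s for s in scores if s is not None]
    return mean(scores) if scores else None

# Hypothetical case with three cores; the third is too small to be used.
example_cores = [(0.004, 0.20), (0.010, 0.25), (0.001, 0.02)]
example_score = case_score(example_cores)
```

In this hypothetical case the two valid cores score about 2% and 4%, so the case-level score is about 3%; the smallest core does not contribute.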
The maximum potential tissue area depends on the tumor core diameter and so the mean total tissue area detected by the tissue segmentation algorithm varied by tumor core diameter and by study (Figure S4). The proportion of cores with tissue area greater than 0.05 mm2, 0.1 mm2, 0.2 mm2 and 0.3 mm2 was 97 percent, 94 percent, 86 percent and 70 percent respectively.
Statistical analysis
Cox proportional hazards regression was used to assess the association between percent TIIC area and breast cancer-specific survival (BCSS). Time at risk was from the date of diagnosis, with the time under observation beginning at the date of recruitment (left truncation). Follow-up was censored at death, the time of last observation, or 15 years after diagnosis, whichever came first. The event of interest was death from breast cancer. Where cause of death was not available (971 of 2,738 deaths), any death was treated as due to breast cancer. Separate analyses were performed for ER-positive and ER-negative tumors. Partially adjusted models included age at diagnosis and study as covariates. Fully-adjusted models were adjusted for age at diagnosis, tumor grade, number of positive nodes, tumor size and study. Grade was found to violate the proportional hazards assumption (see supplementary methods) and so it was treated as a time-varying covariate with the log hazard ratio varying as a function of log time. Participants from studies with fewer than 6 deaths were pooled into either a group of studies with 0.6 mm cores on the arrays or a group of studies with 1 mm cores on the arrays.
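The follow-up definition above can be made concrete with a small sketch. The Python function below is illustrative only: its name, its arguments, and the cause-of-death labels are hypothetical, and it assumes that deaths from known non-breast causes are treated as censored observations. It returns the (start, stop, event) interval in years since diagnosis that a Cox model with delayed entry would use.

```python
def follow_up_interval(entry_years, last_seen_years, death_cause=None,
                       max_years=15.0):
    """Build a delayed-entry follow-up interval for one patient.

    entry_years     : years from diagnosis to study recruitment
    last_seen_years : years from diagnosis to death or last observation
    death_cause     : "breast", "unknown", "other", or None if alive
    Returns (start, stop, event), or None if there is no time at risk.
    """
    stop = min(last_seen_years, max_years)
    if stop <= entry_years:
        return None  # no follow-up after entry within the 15-year window
    # Deaths of unknown cause are counted as breast cancer deaths.
    died_of_breast_cancer = death_cause in ("breast", "unknown")
    # Deaths after the 15-year cut-off are censored at 15 years.
    event = died_of_breast_cancer and last_seen_years <= max_years
    return (entry_years, stop, event)

# Hypothetical patients.
print(follow_up_interval(1.5, 9.0, "breast"))    # (1.5, 9.0, True)
print(follow_up_interval(0.5, 17.0, "unknown"))  # (0.5, 15.0, False)
print(follow_up_interval(2.0, 6.0, "other"))     # (2.0, 6.0, False)
```

Starting the interval at recruitment rather than at diagnosis means a patient contributes to the risk set only while under observation.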
Age at diagnosis has been shown to have a nonmonotonic hazard function for women with ER-positive breast cancers.22 We, therefore, used the transformation of age at diagnosis described by Candido dos Reis and others for the ER-positive models, which generates two variables for age.22 The distribution of the TIIC scores (percentage of the tissue area occupied by TIICs) was highly skewed (Figure S5). We therefore log transformed the TIIC score and ran all the models twice using both untransformed and transformed scores. Partially- and fully-adjusted models were run for each marker individually and then multi-marker models including all four markers were run. TIIC scores based on a small area of tissue are likely to be unreliable, so we also evaluated the effect of using different thresholds for exclusion of cores based on area of tissue detected: we used thresholds for minimum tissue area of 0.05 mm2, 0.1 mm2, 0.15 mm2, 0.2 mm2, 0.25 mm2, and 0.3 mm2.
Missing data for TIIC scores reduced the sample size of the complete case analysis (i.e., cases with missing data removed) for the partially-adjusted multi-marker model to 7,278 ER-positive cases and 2,295 ER-negative cases with the core exclusion threshold of 0.05 mm². This was reduced further in the fully-adjusted models to 5,714 ER-positive cases and 1,868 ER-negative cases because of missing data for grade, tumor size and number of positive nodes. We therefore used multiple imputation by chained equations to impute missing data and maximise the sample size in all analyses. The scores for the four markers with study, age at diagnosis, ER status, grade, tumor size, number of positive nodes, follow-up time and breast cancer mortality were used in the imputation models, and each dataset was imputed 10 times. The results from the Cox regression analyses on each imputed dataset were combined using Rubin’s rules.23
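Rubin's rules combine the per-imputation results into a single estimate whose variance reflects both sampling error and the uncertainty introduced by imputation. The sketch below shows the standard calculation for one coefficient (for example, a log hazard ratio). The function name `pool_rubin` and the example numbers are illustrative assumptions, not values from the study.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(
    estimates: Sequence[float], std_errors: Sequence[float]
) -> Tuple[float, float]:
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    Returns the pooled estimate and its standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching estimates and standard errors, m >= 2")
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)  # variance across imputations, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical log hazard ratios from three imputed datasets.
log_hr, se = pool_rubin([-0.10, -0.20, -0.30], [0.05, 0.05, 0.05])
hazard_ratio = math.exp(log_hr)
```

The pooled standard error is larger than any single-imputation standard error whenever the estimates differ between imputed datasets, which is how the extra uncertainty from missing data enters the confidence intervals.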
We approximated the relative variation24 accounted for by TIIC scores as

$$1 - \frac{\mathrm{LR}^{\text{standard}}}{\mathrm{LR}^{\text{standard}+B}}$$

where LR is the likelihood ratio chi-squared statistic, the standard model includes study, age at diagnosis, grade, tumor size and number of positive nodes, and B denotes the inclusion of TIIC scores as predictors.
We evaluated between-study heterogeneity by running the fully-adjusted, multi-marker models for the imputed data using the 0.05 mm² tissue area exclusion threshold separately for each study. We tested for evidence of between-study heterogeneity using Cochran’s Q25 and I² as an estimate of the proportion of overall variance due to between-study variance.26
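Cochran's Q and I² can be computed directly from the study-specific estimates and their standard errors. The following sketch uses inverse-variance weights, which is the standard fixed-effect formulation. The function name `cochran_q_i2` and the example inputs are assumptions for illustration, and I² is returned as a proportion between 0 and 1.

```python
from typing import Sequence, Tuple


def cochran_q_i2(
    estimates: Sequence[float], std_errors: Sequence[float]
) -> Tuple[float, float]:
    """Return Cochran's Q and I^2 for per-study estimates (e.g. log hazard ratios)."""
    k = len(estimates)
    if k < 2 or k != len(std_errors):
        raise ValueError("need matching estimates and standard errors, k >= 2")
    weights = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    df = k - 1
    # I^2 = (Q - df) / Q, truncated at zero when Q is smaller than expected.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, i2


# Hypothetical per-study log hazard ratios with equal precision.
q_stat, i_squared = cochran_q_i2([0.0, 2.0, 4.0], [1.0, 1.0, 1.0])
```

Under the null hypothesis of no heterogeneity, Q follows a chi-squared distribution with k − 1 degrees of freedom, so the p value can be read from that distribution.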
All analyses and data visualisations were carried out using R,27 implemented in RStudio,28 with the R packages tidyverse,29 broom,30 ggforestplot,31 lemon,32 meta,33 mice,34 patchwork,35 and survival.36
Published: January 20, 2026
Footnotes
Supplemental information can be found online at https://doi.org/10.1016/j.isci.2026.114759.
References
- 1. Chowdhury A., Pharoah P.D., Rueda O.M. Evaluation and comparison of different breast cancer prognosis scores based on gene expression data. Breast Cancer Res. 2023;25:17. doi: 10.1186/s13058-023-01612-9.
- 2. Crusz S.M., Balkwill F.R. Inflammation and cancer: advances and new agents. Nat. Rev. Clin. Oncol. 2015;12:584–596. doi: 10.1038/nrclinonc.2015.105.
- 3. Hendry S., Salgado R., Gevaert T., Russell P.A., John T., Thapa B., Christie M., van de Vijver K., Estrada M.V., Gonzalez-Ericsson P.I., et al. Assessing Tumor-infiltrating Lymphocytes in Solid Tumors: A Practical Review for Pathologists and Proposal for a Standardized Method From the International Immunooncology Biomarkers Working Group: Part 1: Assessing the Host Immune Response, TILs in Invasive Breast Carcinoma and Ductal Carcinoma In Situ, Metastatic Tumor Deposits and Areas for Further Research. Adv. Anat. Pathol. 2017;24:235–251. doi: 10.1097/PAP.0000000000000162.
- 4. Savas P., Salgado R., Denkert C., Sotiriou C., Darcy P.K., Smyth M.J., Loi S. Clinical relevance of host immunity in breast cancer: from TILs to the clinic. Nat. Rev. Clin. Oncol. 2016;13:228–241. doi: 10.1038/nrclinonc.2015.215.
- 5. Salgado R., Denkert C., Demaria S., Sirtaine N., Klauschen F., Pruneri G., Wienert S., Van den Eynden G., Baehner F.L., Penault-Llorca F., et al. The evaluation of tumor-infiltrating lymphocytes (TILs) in breast cancer: recommendations by an International TILs Working Group 2014. Ann. Oncol. 2015;26:259–271. doi: 10.1093/annonc/mdu450.
- 6. Cortazar P., Zhang L., Untch M., Mehta K., Costantino J.P., Wolmark N., Bonnefoi H., Cameron D., Gianni L., Valagussa P., et al. Pathological complete response and long-term clinical benefit in breast cancer: the CTNeoBC pooled analysis. Lancet. 2014;384:164–172. doi: 10.1016/S0140-6736(13)62422-8.
- 7. Denkert C., von Minckwitz G., Brase J.C., Sinn B.V., Gade S., Kronenwett R., Pfitzner B.M., Salat C., Loi S., Schmitt W.D., et al. Tumor-infiltrating lymphocytes and response to neoadjuvant chemotherapy with or without carboplatin in human epidermal growth factor receptor 2-positive and triple-negative primary breast cancers. J. Clin. Oncol. 2015;33:983–991. doi: 10.1200/JCO.2014.58.1967.
- 8. Leon-Ferre R.A., Jonas S.F., Salgado R., Loi S., de Jong V., Carter J.M., Nielsen T.O., Leung S., Riaz N., Chia S., et al. Tumor-Infiltrating Lymphocytes in Triple-Negative Breast Cancer. JAMA. 2024;331:1135–1144. doi: 10.1001/jama.2024.3056.
- 9. Sun Y.P., Ke Y.L., Li X. Prognostic value of CD8(+) tumor-infiltrating T cells in patients with breast cancer: A systematic review and meta-analysis. Oncol. Lett. 2023;25:39. doi: 10.3892/ol.2022.13625.
- 10. Ali H.R., Provenzano E., Dawson S.J., Blows F.M., Liu B., Shah M., Earl H.M., Poole C.J., Hiller L., Dunn J.A., et al. Association between CD8+ T-cell infiltration and breast cancer survival in 12,439 patients. Ann. Oncol. 2014;25:1536–1543. doi: 10.1093/annonc/mdu191.
- 11. Mao Y., Qu Q., Chen X., Huang O., Wu J., Shen K. The Prognostic Value of Tumor-Infiltrating Lymphocytes in Breast Cancer: A Systematic Review and Meta-Analysis. PLoS One. 2016;11 doi: 10.1371/journal.pone.0152500.
- 12. Sun Y., Wang Y., Lu F., Zhao X., Nie Z., He B. The prognostic values of FOXP3(+) tumor-infiltrating T cells in breast cancer: a systematic review and meta-analysis. Clin. Transl. Oncol. 2023;25:1830–1843. doi: 10.1007/s12094-023-03080-1.
- 13. Qin Y., Peng F., Ai L., Mu S., Li Y., Yang C., Hu Y. Tumor-infiltrating B cells as a favorable prognostic biomarker in breast cancer: a systematic review and meta-analysis. Cancer Cell Int. 2021;21:310. doi: 10.1186/s12935-021-02004-9.
- 14. Garaud S., Buisseret L., Solinas C., Gu-Trantien C., de Wind A., Van den Eynden G., Naveaux C., Lodewyckx J.N., Boisson A., Duvillier H., et al. Tumor infiltrating B-cells signal functional humoral immune responses in breast cancer. JCI Insight. 2019;5 doi: 10.1172/jci.insight.129641.
- 15. Allison E., Edirimanne S., Matthews J., Fuller S.J. Breast Cancer Survival Outcomes and Tumor-Associated Macrophage Markers: A Systematic Review and Meta-Analysis. Oncol. Ther. 2023;11:27–48. doi: 10.1007/s40487-022-00214-3.
- 16. Asano Y., Kashiwagi S., Goto W., Kurata K., Noda S., Takashima T., Onoda N., Tanaka S., Ohsawa M., Hirakawa K. Tumour-infiltrating CD8 to FOXP3 lymphocyte ratio in predicting treatment responses to neoadjuvant chemotherapy of aggressive breast cancer. Br. J. Surg. 2016;103:845–854. doi: 10.1002/bjs.10127.
- 17. Miyashita M., Sasano H., Tamaki K., Hirakawa H., Takahashi Y., Nakagawa S., Watanabe G., Tada H., Suzuki A., Ohuchi N., Ishida T. Prognostic significance of tumor-infiltrating CD8+ and FOXP3+ lymphocytes in residual tumors and alterations in these parameters after neoadjuvant chemotherapy in triple-negative breast cancer: a retrospective multicenter study. Breast Cancer Res. 2015;17:124. doi: 10.1186/s13058-015-0632-x.
- 18. Fukui R., Fujimoto Y., Watanabe T., Inoue N., Bun A., Higuchi T., Imamura M., Morimoto K., Hirota S., Miyoshi Y. Association Between FOXP3/CD8 Lymphocyte Ratios and Tumor Infiltrating Lymphocyte Levels in Different Breast Cancer Subtypes. Anticancer Res. 2020;40:2141–2150. doi: 10.21873/anticanres.14173.
- 19. Ali H.R., Chlon L., Pharoah P.D.P., Markowetz F., Caldas C. Patterns of Immune Infiltration in Breast Cancer and Their Clinical Implications: A Gene-Expression-Based Retrospective Study. PLoS Med. 2016;13 doi: 10.1371/journal.pmed.1002194.
- 20. Engels C.C., Charehbili A., van de Velde C.J.H., Bastiaannet E., Sajet A., Putter H., van Vliet E.A., van Vlierberghe R.L.P., Smit V.T.H.B.M., Bartlett J.M.S., et al. The prognostic and predictive value of Tregs and tumor immune subtypes in postmenopausal, hormone receptor-positive breast cancer patients treated with adjuvant endocrine therapy: a Dutch TEAM study analysis. Breast Cancer Res. Treat. 2015;149:587–596. doi: 10.1007/s10549-015-3269-7.
- 21. Danenberg E., Bardwell H., Zanotelli V.R.T., Provenzano E., Chin S.F., Rueda O.M., Green A., Rakha E., Aparicio S., Ellis I.O., et al. Breast tumor microenvironment structures are associated with genomic features and clinical outcome. Nat. Genet. 2022;54:660–669. doi: 10.1038/s41588-022-01041-y.
- 22. Candido Dos Reis F.J., Wishart G.C., Dicks E.M., Greenberg D., Rashbass J., Schmidt M.K., van den Broek A.J., Ellis I.O., Green A., Rakha E., et al. An updated PREDICT breast cancer prognostication and treatment benefit prediction model with independent validation. Breast Cancer Res. 2017;19:58. doi: 10.1186/s13058-017-0852-3.
- 23. Rubin D.B. Multiple Imputation for Nonresponse in Surveys. John Wiley & Sons Inc.; 1987.
- 24. Harrell F.E. Regression Modeling Strategies, with Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis. 2nd Edition. Springer; 2015.
- 25. Cochran W.G. The combination of estimates from different experiments. Biometrics. 1954;10:101–129.
- 26. Higgins J.P.T., Thompson S.G. Quantifying heterogeneity in a meta-analysis. Stat. Med. 2002;21:1539–1558. doi: 10.1002/sim.1186.
- 27. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing; Vienna, Austria: 2021.
- 28. Posit team. RStudio: Integrated Development Environment for R. Posit Software, PBC; Boston, MA: 2026. http://www.posit.co/
- 29. Wickham H., Averick M., Bryan J., Chang W., McGowan L., François R., Grolemund G., Hayes A., Henry L., Hester J., et al. Welcome to the tidyverse. J. Open Source Softw. 2019;4:1686. doi: 10.21105/joss.01686.
- 30. Robinson D., Hayes A., Couch S. broom: Convert Statistical Objects into Tidy Tibbles. 2023. https://CRAN.R-project.org/package=broom
- 31. Scheinin I., Kalimeri M., Jagerroos V., Parkkinen J., Tikkanen E., Würtz P., Kangas A. ggforestplot: Forestplots of Measures of Effects and Their Confidence Intervals. 2024. https://github.com/nightingalehealth/ggforestplot
- 32. Edwards S.L. lemon: Freshing Up your ‘ggplot2’ Plots. 2024. https://CRAN.R-project.org/package=lemon
- 33. Balduzzi S., Rücker G., Schwarzer G. How to perform a meta-analysis with R: a practical tutorial. Evid. Based. Ment. Health. 2019;22:153–160. doi: 10.1136/ebmental-2019-300117.
- 34. van Buuren S., Groothuis-Oudshoorn K. mice: Multivariate Imputation by Chained Equations in R. J. Stat. Softw. 2011;45:1–67. doi: 10.18637/jss.v045.i03.
- 35. Pedersen T. patchwork: The Composer of Plots. 2022. https://CRAN.R-project.org/package=patchwork
- 36. Therneau T. A Package for Survival Analysis in R. 2024. https://CRAN.R-project.org/package=survival
Data Availability Statement
• Data: The tissue segmentation and TIIC scores generated by the Halo algorithm, together with the de-identified phenotype data, have been deposited in the European Genome-phenome Archive at https://ega-archive.org/datasets/EGAD50000002125 and are publicly available as of the date of publication.
• Code: The analysis code (R markdown) has been deposited on GitHub at https://github.com/paul-pharoah/breast-til and is publicly available as of the date of publication.
• Additional information: Any additional information required to reanalyze the data reported in this article is available from the lead contact upon request.



