This is a preprint.

It has not yet been peer reviewed by a journal.

The National Library of Medicine is running a pilot to include preprints that result from research funded by NIH in PMC and PubMed.

[Preprint]. 2026 Feb 4:2026.02.02.703365. [Version 1] doi: 10.64898/2026.02.02.703365

The CD8 immgenT framework as a universal reference of mouse CD8αβ T cell differentiation states

Giovanni Galletti 1, Anna-Maria Globig 2,3, Olga Barreiro 4, Taylor A Heim 5,6, Shuozhi Liu 1,3, Samantha M Borys 7, Odhran Casey 4, Alexander Monell 1, Dhruv Patravali 1, Nicole E Scharping 1, Sara Quon 1, Kennidy K Takehara 1,2, Amir Ferry 1, Kitty P Cheung 1, Ellen Duong 5,6, Tomoyo Shinkawa 8, Stefani Spranger 5,6,9, Samuel M Behar 8, Susan M Kaech 2,3, Ananda W Goldrath 1,2,*, David Zemmour 10,*; the immgenT Project
PMCID: PMC12889609  PMID: 41676596

Abstract

Mouse CD8 T cell differentiation has been studied extensively in models of infections and cancer, yet no unified framework spans the full spectrum of immunological contexts. We present the CD8 immgenT framework, integrating >200,000 single-cell transcriptomes and 128-plex surface proteomes from 734 samples spanning multiple perturbations, tissues, and timepoints. Unbiased analysis identifies 21 states encompassing naive, effector, circulating memory, tissue-resident memory, progenitor-exhausted, and terminally-exhausted compartments, among others. These states re-emerge with striking molecular convergence across acute/chronic infections, cancer, autoimmunity, aging, and homeostasis, showing that near-identical transcriptional programs support protective or dysfunctional outcomes depending on developmental history and microenvironment. Classic archetypes map to discrete clusters but exhibit unappreciated heterogeneity and overlap, cautioning against rigid nomenclature. We provide validated combinatorial markers, flow cytometry gating strategies, and immgenT reference-based integration for reproducible annotation of new datasets. This universal coordinate system harmonizes fragmented CD8 T cell literature and clarifies relationships across diverse immune challenges.

INTRODUCTION

CD8 T cells are critical mediators of protective immunity against intracellular pathogens and tumors. Upon antigen encounter, naïve CD8 T cells undergo massive clonal expansion and acquire diverse functional properties shaped by the intensity, duration, and anatomical location of stimulation. In acute resolving infections, most activated cells transiently adopt potent cytotoxic and migratory capabilities to clear the pathogen, while a fraction survives long-term with enhanced recall potential. Among long-lived populations, some recirculate through blood and lymphoid tissues to mount systemic responses upon re-challenge, whereas others permanently settle in non-lymphoid organs, providing immediate, localized protection at barrier and parenchymal sites such as skin, mucosa, liver, lung, and kidney. These tissue-anchored cells exhibit rapid cytokine and cytotoxic responses upon local antigen re-encounter and are increasingly recognized as key correlates of durable vaccine- and immunotherapy-induced protection. Their establishment depends on local environmental cues, including TGF-β, retinoic acid, and chemokine gradients1,2, which drive expression of integrins (e.g., CD103/αEβ7, CD49a/α1β1) and CXCR6 (ref. 3) for tissue retention and positioning. Transcriptional regulators including Hobit (Zfp683), Runx3, and others4 promote and reinforce long-term residence, although cells in different organs rely on partially distinct gene modules4–6, reflecting microenvironmental adaptation and complicating the definition of universal residency signatures.

In settings of persistent antigen exposure, such as chronic viral infections or tumors, CD8 T cells progressively lose effector functions, upregulate multiple inhibitory receptors (e.g., PD-1, TIM-3, LAG-3, TIGIT), and adopt altered metabolic and transcriptional states dominated by TOX (refs. 7–11). Within these populations, a subset retains stem-like self-renewal capacity and continuously generates differentiated progeny with reduced cytokine-producing and cytotoxic potential7,12–21. The dynamic balance between self-renewing and terminally differentiated dysfunctional compartments strongly influences the success of immune checkpoint blockade therapies22–28. Distinguishing long-term tissue-anchored T cells arising after acute resolved stimulation from those persisting under chronic antigen exposure remains challenging, as both share core residency programs and surface markers (e.g., CD103, CD49a, CXCR6), fueling debate about their developmental relationships and functional equivalence29,30.

Over the past two decades, the immunology community has categorized these context-dependent behaviors into canonical subsets (e.g., circulating central memory (TCM), effector memory (TEM), tissue-resident memory (TRM), progenitor-exhausted (TPEX), and terminally-exhausted (TEX) CD8 T cells), using a nomenclature that has proved invaluable for communication. However, widespread adoption of discrete labels has sometimes encouraged over-classification and obscured the profound molecular convergence evident from single-cell studies: near-identical transcriptional states can arise in dramatically different contexts and support protective or dysfunctional outcomes depending on developmental history and ongoing microenvironmental cues. This convergence, together with subtle laboratory variations in gating and naming of highly similar populations, has generated persistent ambiguity in the literature31. A unified, single-cell-resolution reference capturing CD8 T cell states across diverse perturbations, tissues, and time points could harmonize annotation and clarify biological relationships.

Here, we present the CD8 immgenT framework, an integrated single-cell transcriptomic and surface-protein (128-marker panel) reference of >200,000 CD8αβ T cells from 734 samples spanning 80 experiments, 45 immunological challenges, and 45 tissues. By combining unbiased clustering, CITE-Seq-derived combinatorial surface markers, topic modeling, and tetramer/TCR-based antigen-specific tracking, we resolve 21 robust CD8 T cell states that capture naive, effector, circulating memory, tissue-resident memory, progenitor-exhausted, and terminally-exhausted populations. We define precise phenotypic signatures, flow cytometry gating strategies, and interpretable gene programs (GP) for each state, uncover previously underappreciated heterogeneity, and establish a projection framework (immgenT reference-based integration, T-RBI) that enables rapid, consistent annotation of new CD8 T cell datasets onto this unified map. The CD8 immgenT framework provides the first comprehensive single-cell coordinate system for mouse CD8 T cell differentiation. This publicly accessible, deeply curated resource resolves fragmentation in the literature, offers immediately actionable tools for cell-state identification and isolation, and establishes a universal standard for mapping CD8 T cell heterogeneity.

RESULTS

Building the immgenT CD8αβ reference

As part of the immgenT initiative to systematically and comprehensively chart every mouse T cell across tissues and perturbations, we sequenced 206,160 CD8 T cells across 734 samples spanning 80 experiments, 45 immunological challenges (e.g., infections, cancer, autoimmunity), and 45 tissues (Fig. 1a,b, Extended Data Table 1 and immgenT companion articles (immgenT-Cosmology ms) for additional details). Among these, we tracked 26,928 antigen-specific cells identified by tetramer staining, congenic markers, or paired alpha/beta TCR sequencing (e.g., P14, GP33 tetramer+ cells, OT-I) (Fig. 1b). As established in the immgenT cosmology paper (immgenT-Cosmology ms), CD8αβ+ T cells formed distinct transcriptional clusters separate from other T cell lineages (Fig. 1c), with CITE-Seq protein measurements confirming uniform co-expression of CD8α and CD8β on each CD8 T cell (Extended Data Fig. 1a).

Figure 1. A universal single-cell framework of mouse CD8αβ T cell differentiation states.

a, schematic of the conditions and main tissues selected to build the immgenT framework; b, table summarizing main features of the CD8 T cell immgenT framework; c, UMAP projection of the CD3 T cell immgenT cosmology annotated by main T cell subsets; d, UMAP projection of the CD8 T cell immgenT framework annotated by main clusters; e, balloon plots showing mean proportion of immgenT conditions grouped by main categories across clusters (left) and presence/absence of antigen specific/endogenous CD8 T cells across the main tissues (right); f, cluster saturation graph showing number of clusters reaching more than 100 cells upon progressive inclusion of new immgenT experiments. Abbreviations: CNS, central nervous system; Ag, antigen; TREG, regulatory T cells; DP, double-positive; DN, double-negative; non-conv., non-conventional; cl, cluster; Prolif., proliferating.

To resolve CD8 T cell diversity, a CD8-focused Minimum-Distortion Embedding (MDE) was generated (immgenT-Cosmology ms), revealing a continuum of states across samples from 45 immunological challenges (Fig. 1d,e). Cluster saturation occurred after about half the immgenT experiments (IGT), defining 21 discrete T cell states with dynamic RNA and protein patterns (Fig. 1f, 2a). Non-consecutive cluster numbering reflects iterative reassignments of some initial clusters to other T cell lineages during analysis, while original labels were kept for consistency. Rare atypical states, including proliferating cells and “Miniverse”, were left unnumbered to prioritize the 19 core recurrent CD8 states. For external validation, we projected 8 published CD8 datasets32–39 (104,024 cells) onto the reference using T-RBI, a scVI/SCANVI-based integration and label-transfer method (Extended Data Fig. 1b,c, Extended Data Table 2).
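The saturation analysis described above (Fig. 1f) can be illustrated with a minimal sketch. It assumes a hypothetical list of (experiment, cluster) pairs, one per cell, and counts how many clusters exceed a cell-count threshold as experiments are added in a chosen order. The function name, input layout, experiment identifiers, and toy data are assumptions for illustration, not the immgenT pipeline.

```python
from collections import Counter


def cluster_saturation(cells, experiment_order, min_cells=100):
    """Return, for each prefix of experiment_order, the number of clusters
    that contain more than min_cells cells.

    cells: iterable of (experiment_id, cluster_id) pairs, one per cell
           (hypothetical layout, not the immgenT data format).
    """
    # Count cells per cluster within each experiment once.
    per_experiment = {}
    for experiment_id, cluster_id in cells:
        per_experiment.setdefault(experiment_id, Counter())[cluster_id] += 1

    running = Counter()  # cumulative cells per cluster
    curve = []
    for experiment_id in experiment_order:
        running.update(per_experiment.get(experiment_id, Counter()))
        curve.append(sum(1 for n in running.values() if n > min_cells))
    return curve


# Toy example (invented data) with a threshold of 2 cells instead of 100.
toy_cells = (
    [("IGT_A", "cl1")] * 3 + [("IGT_A", "cl2")] * 1
    + [("IGT_B", "cl2")] * 2 + [("IGT_B", "cl4")] * 3
    + [("IGT_C", "cl1")] * 5
)
toy_curve = cluster_saturation(toy_cells, ["IGT_A", "IGT_B", "IGT_C"], min_cells=2)
print(toy_curve)
```

In this toy run the curve stops rising once the third experiment is added, which is the behavior the text describes as saturation. The result depends on the order in which experiments are included.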

Figure 2. Recurrent CD8 T cell states across diverse immunological contexts reveal molecular convergence and heterogeneity.

a, balloon plot showing mean expression of selected genes (left) and proteins (right) across immgenT clusters; b-c, UMAP projection of the CD8 T cell immgenT framework (gray) colored by selected experiments, tissues, and T cell types as indicated; d-e, bar graph showing the mean gene-program loading of GP9 (d) and GP11 (e) by CD8 T cells from clusters 1 and 2 (left) and heatmap showing expression of the top 30 genes from each gene-program across the same clusters (right); f, stacked bar plot showing the fraction of CD8 T cells from clusters 1 and 2 that belongs to secondary lymphoid organs (SLO) or non-lymphoid tissues (NLT). Abbreviations: cl, cluster; Prolif., proliferating; ADT, Antibody-Derived Tag; HDM, house dust mite; LN, lymph node; KO, knock-out; LP, lamina propria; w.p.i., weeks post-infection; mo, months; D, day; VV, vaccinia virus; GP, gene-program; n., number.

We next mapped widely studied CD8 responses using CD62L and CD44 expression40,41. CITE-Seq protein data showed CD62L+CD44− (naive/resting) cells in three clusters, CD62L−CD44+ (antigen-experienced) in eight, CD62L+CD44+ (central memory-like) in four, and CD62L−CD44− in two (Extended Data Fig. 1d). This partitioning revealed molecular convergence and context-dependent deployment across the atlas (Fig. 2a–c). Indeed, all 21 clusters appeared in nearly every condition, including steady-state specific pathogen-free (SPF) mice (Fig. 2b,c, and Extended Data Fig. 1e), but proportions varied markedly: effector phases were dominated by clusters 14/15/28, memory by 4/5/10/12/13, persistent antigen by 7/10/11/12, and aging/bystander by 4/5 (detailed in Fig. 3–6). Genes considered diagnostic of archetypes (e.g., Tox, Pdcd1, and Tnfrsf18, which encodes GITR) were less selective than expected, appearing across multiple clusters; Cxcr6 and Tcf7 also marked unexpected clusters (Fig. 2a). Samples from diverse perturbations (e.g., house dust mite allergy, allotransplant, PD-1 KO or Foxp3-mutant autoimmunity, infections) occupied overlapping but distinct regions, showing that CD8 states are reusable yet utilized in condition-specific combinations (Fig. 2b,c). These findings indicate that legacy subsets capture only part of the molecular repertoire, motivating unbiased atlas exploration.

Figure 3. Effector CD8 T cells comprise three molecularly distinct recurrent states.

a, UMAP projection of the CD8 T cell immgenT framework (gray) colored by selected experiments and T cell types as indicated; b, scatter plots showing predicted CITE-Seq-based gating strategies for the isolation of the indicated clusters; c, scatter plots showing predicted CITE-Seq expression of CD11c and CD127 by clusters 14, 15, and 28; d, Gating enrichment projection over the CD8 T cell immgenT UMAP leveraging the strategy depicted in b and c; e, flow cytometry plots validating the gating strategies proposed in b and c; f, balloon plot showing the expression of a curated list of genes by cluster 14, 15, and 28; g, stacked bar graph showing the frequency of the indicated clusters across the top 50 samples and conditions; h, UMAP projection of the CD8 T cell immgenT framework (gray) colored by selected experiments and tissues as indicated. Abbreviations: d.p.i., days post-infection; Endo., endogenous; cl, cluster; D, day; w.p.i., weeks post-infection.

Figure 6. Exhaustion and residency-associated states form a continuum shaped by persistent antigen and tumor context.

a, UMAP projection of the CD8 T cell immgenT framework (gray) colored by CD44+ clusters among the selected conditions as indicated; b, scatter plots showing predicted CITE-Seq-based gating strategies for the isolation of the indicated clusters; c, Gating enrichment projection over the CD8 T cell immgenT UMAP leveraging the strategies depicted in b; d, bar graph showing the frequency of the clusters 10, 7, 11, and 22 across samples (i.e., top 10 for tumor, top 15 for other perturbations for each cluster); e-g, Fold-change versus fold-change plots showing correlation of genes expressed by the clusters 7, 10, and 11, in each paired comparison; h, UMAP projection of the CD8 T cell immgenT framework (gray) overlaid with the TPEX cluster as defined in a selected public dataset36 after T-RBI and annotated by original authors’ annotation or immgenT’s; i, volcano plot representing cluster 22-specific signature obtained by comparison with all the other clusters. Abbreviations: cl, cluster; annot., annotation; vs., versus.

We then highlighted classic models of acute/chronic lymphocytic choriomeningitis virus (LCMV) infection and three tumors (B16 melanoma, KRAS-driven (KP) lung adenocarcinoma, and pancreatic ductal adenocarcinoma (PDAC)) on the reference (Fig. 2c and Extended Data Fig. 1f). Effector-phase (7–8 dpi) splenic CD8 T cells from LCMV-Armstrong and Clone 13 converged on clusters 14, 15, and 28. At late timepoints (27–30+ dpi), LCMVarm memory was enriched in clusters 4/5/12/13, while LCMVcl13 favored 7 and 12, with cluster 7 largely absent in LCMVarm (Extended Data Fig. 1f). In non-lymphoid tissues (e.g., small intestine), day 7 responses spanned 14/15/28 but were dominated by cluster 10, which persisted as TRM-like. Patterns were consistent for endogenous and P14 cells. Tumor-infiltrating lymphocytes (TIL) mapped to 7/10/11, with cancer-specific biases: B16 melanoma across all three, KP lung to 7, PDAC to 11 (Fig. 2c). Notably, cluster 10 appeared in TRM-like and TIL settings, reflecting shared programs rather than fixed types. LCMV-derived T cell states are not universal, as many conditions favor only one or two clusters, while others combine states that, in LCMV models, arise at different timepoints or tissues (Fig. 2b,c). Thus, some clusters resemble classic archetypes (e.g., TRM-like, TEX-like), but their broad presence cautions against fixed identities.

This extends to the naive compartment: CD62L+CD44− cells mapped overwhelmingly to clusters 1 and 2 with near-identical canonical naive gene expression (e.g., Sell, Ccr7, Tcf7). To better capture this heterogeneity, we generated transcriptional gene-programs (GP) using topic modeling-inspired empirical Bayes matrix factorization42,43 (see immgenT companion articles (immgenT-GP ms; immgenT-Cosmology ms)). Signature genes for each program were identified by differential gene-expression analysis that ranks genes by their specific enrichment in one program relative to all others, yielding compact and biologically interpretable gene sets. Cluster 1 showed ~1.8-fold higher GP9 activity and was >80% from secondary lymphoid organs, while cluster 2 showed ~2.9-fold higher GP11 activity and included ~50% contribution from non-lymphoid tissues (Fig. 2d–f). Naive CD5HI- and CD5LO-derived signatures44 showed modest, preferential enrichment in clusters 1 and 2, respectively (Extended Data Fig. 1g), indicating heterogeneity within the naive compartment45,46. Separately, an atypical cluster we called “Miniverse” resembled clusters 1/2 but showed elevated Pecam1, Runx3, and Tox (Fig. 2a), suggesting a distinct resting population with partial priming features that warrants further study.
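The idea of ranking genes by their specific enrichment in one gene-program relative to all others can be sketched as follows. This is not the differential analysis used in the companion articles: here a simple specificity score (a gene's weight in the program of interest minus its largest weight in any other program) stands in for it. The function name, the input layout, the gene and program names, and all numbers are hypothetical.

```python
def rank_program_genes(weights, program, top_n=3):
    """Rank genes by how specific their weight is to one program.

    weights: dict mapping gene -> dict mapping program -> non-negative weight
             (hypothetical layout; the values used below are invented).
    The score is the weight in `program` minus the largest weight in any
    other program, a simple stand-in for a differential analysis.
    """
    scored = []
    for gene, by_program in weights.items():
        own = by_program.get(program, 0.0)
        others = [w for name, w in by_program.items() if name != program]
        scored.append((own - max(others, default=0.0), gene))
    # Highest specificity first; ties are broken alphabetically by gene name.
    scored.sort(key=lambda item: (-item[0], item[1]))
    # Keep only genes that are more strongly weighted in this program.
    return [gene for score, gene in scored[:top_n] if score > 0]


toy_weights = {
    "GeneA": {"GP_x": 0.9, "GP_y": 0.1},
    "GeneB": {"GP_x": 0.5, "GP_y": 0.5},
    "GeneC": {"GP_x": 0.2, "GP_y": 0.8},
    "GeneD": {"GP_x": 0.7, "GP_y": 0.4},
}
signature_x = rank_program_genes(toy_weights, "GP_x")
signature_y = rank_program_genes(toy_weights, "GP_y")
print(signature_x, signature_y)
```

A gene weighted equally in two programs (GeneB in the toy data) is excluded from both signatures, which is what keeps such gene sets compact.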

Together, these analyses demonstrate that the immgenT framework captures CD8 diversity from time, tissue, and antigen exposure; reveals overlapping states across perturbations (e.g., chronic infection and cancer); and shows clusters as reusable transcriptional states implemented in diverse immunological contexts. We next analyze CD8 heterogeneity in detail, starting with effector, memory, and chronic states in canonical LCMV and cancer models before broader contexts.

Transcriptional, clonal, and phenotypic heterogeneity of effector CD8 T cells

The clusters dominating day 7–8 responses after LCMVarm and LCMVcl13 infections (14, 15, 28) (Fig. 3a) prompted reexamination of effector-cell diversity and conditions. T-RBI mapping of a published day 8 LCMVarm spleen dataset37 confirmed effector subsets localize to these clusters (Extended Data Fig. 2a). Classically, day 7 CD8 T cells divide into KLRG1+CD127− terminally differentiated effectors and rare KLRG1−CD127+ memory precursors41,47,48. At peak expansion (day 7–8), antigen-specific CD8 T cells include “early-effector” cells49,50 (KLRG1−CD127−) that give rise to KLRG1+CD127− terminally differentiated cells and minor KLRG1−CD127+ precursors41,47,48, plus KLRG1+CD127+ effectors with intermediate T-bet, reduced IL-2, and proliferative potential51. Classic KLRG1/CD127 profiling leaves many CD127−, KLRG1-low/negative cells unaccounted for, underestimating the full short-lived effector compartment. The immgenT framework resolves the CD127− effector compartment into three distinct clusters, revealing underappreciated heterogeneity.

In the CD8 immgenT framework, surface-protein measurements by CITE-Seq showed clusters 14, 15, and 28 as CD44+CD62L−CD127−, with only cluster 14 expressing KLRG1 (Fig. 3b). Surprisingly, CD11c emerged as a more consistent marker across all three (specificity = 94%, Extended Data Table 3) (Fig. 3c). Thus, CD44+CD62L−CD127−CD11c+ captures the combined 14/15/28 effector family, with KLRG1 distinguishing cluster 14. Applying this gating strategy confirmed localization to the cluster 14/15/28 MDE region (Fig. 3d). Flow cytometry validated CD11c expression on splenic CD62L−CD44+ CD8 T cells (LCMVarm, day 7; Fig. 3e) and lung T cells (Flu-OVA, Extended Data Fig. 2b), with only a subset KLRG1+. CD11c was high specifically on day 7 effectors and low on naive (CD62L+CD44−) and day 30 memory (CD62L−CD44+) cells (Extended Data Fig. 2c,d), confirming acute-phase specificity. Notably, endogenous and P14 CD44+CD11c+ cells included all described effector subsets (including memory precursors and terminally differentiated effectors), highlighting CD11c’s ability to distinguish these populations from other T cell states (Extended Data Fig. 2e). CD11c levels on T cells are lower than on DCs, requiring bright fluorophores and careful titration (Extended Data Fig. 2f). Time-course analysis showed CD11c expression peaking at day 7 among endogenous CD44+ cells, consistent with published scRNA-Seq52 (Extended Data Fig. 2g). These data place clusters 14, 15, and 28 in the canonical short-lived/terminally differentiated effector compartment. While not new subsets, they show that the CD127− pool, classically viewed as homogeneous or KLRG1-subdivided, comprises at least three transcriptionally distinct states. CD11c outperforms KLRG1 in uniformly capturing this family across challenges and tissues.
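As an illustration of how a combinatorial gate such as CD44+CD62L−CD127−CD11c+ can be scored against cluster labels, the sketch below applies the gate to cells whose markers have already been thresholded and reports sensitivity and specificity. How the specificity in Extended Data Table 3 was computed is not stated here, so the standard definitions (TP/(TP+FN) and TN/(TN+FP)) are an assumption, as are the function names, the boolean marker layout, and the invented toy cells.

```python
def in_effector_gate(cell):
    """Boolean gate CD44+ CD62L- CD127- CD11c+ on pre-thresholded markers.

    cell: dict of marker -> bool (True = positive). Thresholding of the raw
    CITE-Seq signal is assumed to have been done upstream.
    """
    return (cell["CD44"] and not cell["CD62L"]
            and not cell["CD127"] and cell["CD11c"])


def gate_performance(cells, target_clusters):
    """Return (sensitivity, specificity) of the gate for target_clusters."""
    tp = fp = tn = fn = 0
    for cell in cells:
        gated = in_effector_gate(cell)
        is_target = cell["cluster"] in target_clusters
        if gated and is_target:
            tp += 1
        elif gated:
            fp += 1
        elif is_target:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def make_cell(cluster, positives):
    """Build a toy cell from its cluster label and set of positive markers."""
    markers = ("CD44", "CD62L", "CD127", "CD11c")
    cell = {marker: marker in positives for marker in markers}
    cell["cluster"] = cluster
    return cell


# Invented toy cells: four from the effector family, four from other clusters.
toy_cells = [
    make_cell(14, {"CD44", "CD11c"}),
    make_cell(15, {"CD44", "CD11c"}),
    make_cell(28, {"CD44", "CD11c"}),
    make_cell(15, {"CD44"}),                   # target cell missed by the gate
    make_cell(1, {"CD62L"}),
    make_cell(4, {"CD44", "CD62L", "CD127"}),
    make_cell(13, {"CD44", "CD127"}),
    make_cell(12, {"CD44", "CD11c"}),          # non-target cell inside the gate
]
sensitivity, specificity = gate_performance(toy_cells, {14, 15, 28})
print(sensitivity, specificity)
```

Adding a KLRG1 term to the same boolean expression would be one way to separate cluster 14 from clusters 15 and 28, following the text.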

Transcriptional comparison showed that the top differentially expressed genes differed among clusters 14, 15, and 28 (Fig. 3f). Many of these transcripts belonged to GP10 and GP25 (Extended Data Fig. 2h,i), which are also active in cluster 12, the terminally-differentiated effector memory (tTEM)/long-lived effector cell (LLEC) cluster discussed in Figure 4. TCR analysis revealed clonal overlap among 14/15/28 (Extended Data Fig. 2j), suggesting branched differentiation, interconversion, or both from shared progenitors. Cluster distribution across 734 samples showed enrichment in LCMVarm (lymphoid and non-lymphoid tissues), with cluster 15 most common overall and 14 prominent in LCMVarm (Fig. 3g,h). However, these clusters also appeared in other infections (Mycobacterium tuberculosis (MTB) lung, Chlamydia uterus), autoimmunity (scurfy mice), and cancers (B16, PDAC) under checkpoint blockade or ACT.
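Clonal overlap between clusters, as reported above for 14/15/28, can be quantified in several ways. The sketch below uses the Jaccard index of clonotype sets for each pair of clusters. The metric used for Extended Data Fig. 2j is not specified in this text, so Jaccard is an assumption, as are the function name, the (cluster, clonotype) input layout, and the invented clonotype identifiers.

```python
from itertools import combinations


def clonal_overlap(cells):
    """Jaccard index of clonotype sets for every pair of clusters.

    cells: iterable of (cluster_id, clonotype_id) pairs, one per cell
           (hypothetical layout).
    Returns a dict mapping (cluster_a, cluster_b) -> shared / union.
    """
    clonotypes = {}
    for cluster_id, clonotype_id in cells:
        clonotypes.setdefault(cluster_id, set()).add(clonotype_id)

    overlap = {}
    for a, b in combinations(sorted(clonotypes), 2):
        shared = clonotypes[a] & clonotypes[b]
        union = clonotypes[a] | clonotypes[b]
        overlap[(a, b)] = len(shared) / len(union)
    return overlap


# Invented toy data: clonotype c3 is found in all three clusters.
toy_tcr = [
    ("14", "c1"), ("14", "c2"), ("14", "c3"),
    ("15", "c2"), ("15", "c3"), ("15", "c4"),
    ("28", "c3"), ("28", "c5"),
]
toy_overlap = clonal_overlap(toy_tcr)
print(toy_overlap)
```

The Jaccard index ignores clone size, so an abundance-weighted measure may be preferable when a few expanded clones dominate.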

Figure 4. Circulating memory CD8 T cell states are shared across acute infection, cancer, autoimmunity, aging and homeostasis.

a, UMAP projection of the CD8 T cell immgenT framework (gray) colored by selected experiments and T cell types as indicated; b, UMAP projection of the CD8 T cell immgenT framework (gray) overlaid with blood and spleen P14 CD8 T cells isolated from a selected public dataset5 after T-RBI; c, scatter plots showing predicted CITE-Seq-based gating strategies for the isolation of the indicated clusters; d, balloon plot showing the expression of a curated list of genes by cluster 4, 5, 12, and 13; e, stacked bar graph showing the frequency of the indicated clusters across the top 50 samples and conditions; f, UMAP projection of the CD8 T cell immgenT framework (gray) colored by selected experiments and tissues as indicated. Abbreviations: d.p.i., days post-infection; tet, tetramer; cl, cluster; annot., annotation; mo, months; w.p.i., weeks post-infection; D, day.

Together, clusters 14, 15, and 28 represent interconnected, molecularly distinct short-lived effector states across infections, autoimmunity, and cancer. CD11c serves as a universal surface marker, offering a robust tool to identify and isolate the short-lived effector CD8 T cell compartment.

The circulating memory CD8 T cell compartment in secondary lymphoid organs is reused across immune states

Clusters 4, 12, and 13 (and smaller cluster 5) dominate late/memory phases (days 27–30+) of antigen-specific CD8 T cell responses (e.g., P14 or GP33 tetramer+) in spleen and lymph nodes after acute LCMVarm infection (Fig. 4a). A small fraction of tetramer+ cells from memory timepoints mapped to naive-like clusters 1–2, likely due to low-avidity or nonspecific binding from cross-reactivity/background staining. T-RBI projection of a public dataset from Crowl et al.5 (P14 cells, blood/spleen day 32 LCMVarm) confirmed these as “circulating memory” populations (Fig. 4b).

Surface-protein profiles aligned with classical delineations53–56 (Fig. 4c and Extended Data Fig. 3a): clusters 4 and 5 were CD62L+CD44+CD127+ (TCM-like); cluster 13 was CD62L−CD44+CD127+ (TEM-like); cluster 12 showed CD62L−CD44+CD127− with elevated KLRG1, variably termed LLEC or tTEM. Previous studies54,56 defined LLEC as a persistent memory subset with effector-like features (high KLRG1, granzyme B, CX3CR1), homeostatic proliferation, reduced IL-15 dependence, and tissue-entry flexibility upon rechallenge. Interestingly, tTEM are described55 as a KLRG1+-derived terminally-differentiated population with potent cytotoxicity and limited multipotency/recall. Despite contextual differences, these populations share core phenotypes (KLRG1+CD127LOCD62LLO) and transcriptional signatures, suggesting they represent the same or highly similar state. Very similar features also underlie the KLRG1+ “exhausted KLR” state arising in chronic LCMV and tumors57,58, revealing convergence across functional and dysfunctional lineages. CITE-Seq protein-based gating mapped these phenotypes to their clusters (Extended Data Fig. 3b), with independent flow cytometry confirming matching populations in day 30 LCMVarm endogenous CD8 T cells (Extended Data Fig. 3a).

Gene expression followed expected patterns with clusters 4, 5, and 13 expressing variable Tcf7, Id3, and Il7r (memory-associated). On the other hand, cluster 12 expressed Zeb2, lacked Il7r, maintained high effector genes (Gzma, Gzmb, Gzmk, Ccl5, Klrg1) (Fig. 4d), and was enriched for the effector program GP10 (shared with acute effector states; Extended Data Fig. 2h). Cluster 5 was enriched for GP16 (interferon-responsive, e.g., Ifit1, Ifit3, Isg15), explaining its distinction from cluster 4 and suggesting unique type I/II IFN responsiveness in a TCM-like subset59,60 (Extended Data Fig. 3c).

These states were not LCMV-restricted (Fig. 4e,f). Clusters 4/5/12/13 appeared in secondary lymphoid organs during other infections (e.g., “dirty” mice), non-lymphoid tissues (uterus in Chlamydia, lung in MTB), TIL/tumor-draining LNs, autoimmune models (Foxp3-mutant), and baseline homeostasis (liver, lung, colon, peritoneal cavity). Clusters 4 and 5 increased with aging in SPF mice (IGT29), consistent with virtual memory-like cells61. Bystander-activated CD8+ T cells in vaccinia scarification62 (IGT46) also mapped to 4/5/13 (Extended Data Fig. 3d). Notably, few CD44+KLRG1−CD127+CD11c+ cells, consistent with memory precursor phenotype (introduced in Extended Data Fig. 2), were included in cluster 13 (Extended Data Fig. 3e). Across this range of immunological conditions, cluster-defining gene signatures remained stable (Extended Data Fig. 3f–i).

In conclusion, circulating memory states classically described as TCM, TEM, and tTEM/LLEC map to clusters 4, 5, 12, and 13 in the immgenT framework. These represent recurring states adopted by CD8 T cells across infections, cancer, autoimmunity, aging, and homeostasis, including persistent-antigen settings where “true” memory cannot form. We therefore recommend caution against labeling all cells in these clusters strictly as TCM, TEM, or tTEM/LLEC.

A high-resolution molecular framework of tissue-resident memory CD8 T cells in non-lymphoid organs

Cluster 10 dominates antigen-specific CD8 T cells in non-lymphoid tissues at memory timepoints, such as day 30+ after LCMVarm infection (P14 cells) (Fig. 5a–c) and day 23 after Flu-OVA (OT-I cells) (Fig. 5d), establishing it as the principal state for canonical TRM CD8 T cells. T-RBI projection of a public dataset5 (P14 cells from small intestine and kidney, day 32 LCMVarm) confirmed canonical TRM across barrier and parenchymal tissues maps to cluster 10 (Fig. 5e). This prompted us to further characterize the molecular identity of TRM CD8 T cells across the extensive immgenT framework.

Figure 5. Tissue-resident memory CD8 T cells converge on a single dominant state despite extensive tissue-specific adaptation.

a, UMAP projection of the CD8 T cell immgenT framework (gray) highlighting cells from cluster 10; b, UMAP projection of the CD8 T cell immgenT framework (gray) colored by selected experiments and tissues from day 30–60 post-LCMVarm infection (IGT38/40); c, stacked bar graph showing the number of cells from the conditions as in b across the CD8 T cell immgenT clusters; d, UMAP projection as in a but colored by selected conditions and T cell types as indicated; e, UMAP projection of the CD8 T cell immgenT framework (gray) overlaid with small intestine and kidney P14 CD8 T cells isolated from a selected public dataset5 after T-RBI; f, scatter plots showing predicted CITE-Seq-based gating strategies for the isolation of the indicated clusters (left) with gating enrichment projection over the CD8 T cell immgenT UMAP leveraging this strategy (right); g, balloon plot showing the expression of a curated list of genes by CD8 T cells isolated from tissues at day 30+ after LCMVarm infection (spleen from healthy mouse included as reference); h, scatter plots showing the CITE-Seq expression of CD49a and CD103 by all the CD8 T cells from cluster 10 (density) or other clusters (gray) across selected tissues; i, bar graph showing the mean gene-program loading of GP26 and GP35 by CD8 T cells isolated from the indicated tissues at day 30 post-LCMVarm infection (IGT38, left) and heatmap showing expression of the top 10 genes from each gene-program across the same tissues (right). Abbreviations: d.p.i., days post-infection; Endo., endogenous; annot., annotation; cl, cluster; CNS, central nervous system; LN, lymph node; GP, gene-program; Med., mediastinal; Mes., mesenteric; SG, salivary gland; sIEL, small intestine epithelium; sLPL, small intestine lamina propria.

Given the pronounced tissue-dependent phenotypic heterogeneity of TRM, we surveyed the 128-marker CITE-Seq panel within cluster 10. It expressed CD44, CD69, and CD73, and lacked CD62L, KLRG1, and CD39 (Fig. 5f). Cluster 10-defining genes showed relatively stable expression across non-lymphoid versus lymphoid compartments from LCMVarm (day 30+ LCMVarm, IGT38/40) (Fig. 5g) and overlapped with published TRM signatures5,63,64 (Extended Data Fig. 4a). Protein expression of classic integrins CD49a and CD103 was highly heterogeneous within the cluster (Fig. 5h): CD103 predominated in small intestine/colon and CD49a in salivary gland/lung, with many tissues showing mixtures (CD103+CD49a−, CD103−CD49a+, double-positive). Thus, neither integrin is required for the core TRM program; rather, they reflect tissue-specific tuning. Additionally, CD73+CD39− expression distinguished cluster 10 from adjacent clusters 7 and 11 (Fig. 5f and Extended Data Fig. 4b; Fig. 6). CITE-Seq surface-protein information guided flow cytometry-like gating strategies (Extended Data Table 4). No single combination was universally sensitive/specific. CD69+CD103±/CD49a± was most sensitive (median 44%) (Extended Data Fig. 4c) and adding CD73+CD39− further improved purity (median 73% to 78%) (Extended Data Fig. 4d). CD62L−CD44+CD69+CD73+CD39−CD103±/CD49a± gates mapped tightly to cluster 10 (Fig. 5f). Flow cytometry validation confirmed inter- and intra-tissue variation among post-LCMV P14 memory cells (Extended Data Fig. 4e). Specifically, cells were predominantly CD73+ in lung/small intestine, CD39 negative in lung but mostly positive in salivary gland/small intestine, while CD103/CD49a remained largely consistent with the CITE-Seq gating. Thus, CITE-Seq-guided strategies (especially those anchored on CD73+CD39−) improve traditional integrin gating, yet truly universal surface proteins for TRM identification remain elusive.

Key TRM transcription factors showed expected patterns: Hic1 (refs. 5,65) and Zfp683 (refs. 63,65; encoding Hobit) were enriched in cluster 10; Runx3 (refs. 64,65) was broadly elevated across antigen-experienced clusters (Extended Data Fig. 4f), consistent with its conserved role in sustaining cytotoxic/residency programs64,66,67. Clusters 26 (expressing Hic1) and 25 (expressing both Hic1 and Zfp683) showed partial sharing of cluster 10 TRM transcriptional network with IEL-enriched CD8 subsets in gut/mammary gland (Extended Data Fig. 4g,h).

Gene-program analysis of IGT38 (TRM across tissues, day 30 LCMVarm) revealed cluster 10 heterogeneity driven by GP26 (e.g., Cxcr6, Nkg7, S100a6; enriched in prostate/salivary gland) and GP35 (e.g., Nr4a1, Klf6, Fos, Jun; selective in small intestine epithelium/lamina propria) (Fig. 5i). These likely reflect microenvironmental cues: TGF-β-rich intestinal niches promote Itgae and AP-1 responses1,68 while CXCL16+ niches in salivary gland/prostate support CXCR6 retention69. Notably, Fos, Jun, and Nr4a1 can be amplified by enzymatic digestion5,70,71, suggesting GP35 in gut TRM may partially reflect methodological stress/activation enhancement.

In conclusion, TRM cells consistently converge into cluster 10 across the immgenT framework. This dataset provides a unified resource for dissecting surface-marker phenotypes, transcriptional programs, and tissue-specific specializations defining TRM biology across infections and tissues.

Mapping exhaustion states in chronic antigen-driven CD8 T cell responses

The immgenT framework captures classic chronic stimulation/exhaustion contexts, including chronic infections (e.g., LCMVcl13), tumors (B16 (IGT35), KP lung (IGT95/96), PDAC (IGT64/65)), and autoimmunity, showing enrichment in clusters 7 and/or 11 (Fig. 6a, 2c, and Extended Data Fig. 1f). Cluster 10, while dominated by TRM, also included TIL-like cells, consistent with links between TRM and anti-tumor immunity (refs. 29, 30, 64, 72–76). T-RBI projection of independent TIL datasets from B16 melanoma and MC38 colorectal carcinoma confirmed localization to clusters 7, 10, and 11 (Extended Data Fig. 5a). Projection of TEX cells from chronic LCMVcl13 (ref. 36) mapped primarily to clusters 7 and 10 (Extended Data Fig. 5b). Cluster 10 also harbored cells from other persistent infections (e.g., MNV-CR6) (Extended Data Fig. 5c), indicating it encompasses a broader spectrum, potentially including progenitor and terminally exhausted states, beyond pure TRM.

CITE-Seq profiling showed clusters 10, 7, and 11 with mutually exclusive surface patterns relative to effector/memory states (Fig. 6b): cluster 7 lacked CD62L, CD11c, CD127, KLRG1, CD73, GITR, and TIM3, but expressed CD44 and CD39; cluster 11 shared this pattern but co-expressed GITR and TIM3 (ref. 77); cluster 10 lacked CD39, GITR, and TIM3, and expressed CD73 (Fig. 4f). These combinations enabled clean gating to clusters 7 and 11 in the MDE (Fig. 6c), providing practical flow cytometry strategies.
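The surface patterns listed above can be read as a small set of Boolean rules. The sketch below is only an illustration of that logic, assuming each marker has already been thresholded into positive/negative per cell; the function names and the dictionary layout are hypothetical and are not the gating implementation used in the study.

```python
# Markers named in the paragraph above; positivity is assumed to be pre-thresholded.
ALL_MARKERS = ("CD44", "CD39", "CD62L", "CD11c", "CD127", "KLRG1", "CD73", "GITR", "TIM3")
# Markers reported as absent from both cluster 7 and cluster 11.
SHARED_NEGATIVE = ("CD62L", "CD11c", "CD127", "KLRG1", "CD73")


def make_cell(*positive):
    """Build a toy cell: True for every marker listed as positive, False otherwise."""
    return {marker: marker in positive for marker in ALL_MARKERS}


def assign_gate(cell):
    """Return 'cluster 7', 'cluster 11', 'cluster 10', or None for one cell."""
    if cell["CD44"] and cell["CD39"] and not any(cell[m] for m in SHARED_NEGATIVE):
        if cell["GITR"] and cell["TIM3"]:
            return "cluster 11"
        if not cell["GITR"] and not cell["TIM3"]:
            return "cluster 7"
        return None  # only one of GITR/TIM3 positive: left ungated in this sketch
    if cell["CD73"] and not (cell["CD39"] or cell["GITR"] or cell["TIM3"]):
        return "cluster 10"
    return None
```

For example, `assign_gate(make_cell("CD44", "CD39"))` falls in the cluster 7 gate, whereas adding GITR and TIM3 moves the same cell to the cluster 11 gate. Real CITE-Seq data would additionally need per-marker thresholds, which this sketch does not attempt to choose.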

Across samples, cluster 10 predominated in non-tumor contexts (especially acute infection memory, e.g., LCMVarm, Flu), while present in tumors (Fig. 6d). Cluster 7 appeared in persistent-antigen infections (e.g., MCMV, MTB, LCMVcl13, MNV-CR6) and tumors. Cluster 11 was almost exclusively tumor-enriched. These clusters were represented at much lower frequency in autoimmune models (celiac-like, NOD, EAE; <0.01% of CD8 T cells) despite persistent antigen.

Transcriptionally, cluster 10 was the closest to cluster 7 (R²=0.604), while cluster 11 was the most distinct (R²=0.076) (Fig. 6e–g). Canonical exhaustion genes (Lag3, Tigit, Pdcd1, Tox) were not specific to 7/10/11 but showed strong coordinated expression in these clusters (Extended Data Fig. 5d). Their co-expression was detected in multiple gene programs (GP12, GP81, GP110), with cluster 11 showing higher activity (Extended Data Fig. 5e,f), suggesting convergence of regulatory modules rather than a unique exhaustion program. Additionally, TRM signatures (refs. 5, 63, 64) were comparably expressed in clusters 7 and 11 (Fig. 6g–i and Extended Data Fig. 5g–i). This shared residency program highlights another challenge in current T cell nomenclature, as both TRM and a subset of TIL map to the same state (i.e., cluster 10), revealing convergent molecular circuitry operating in immunologically distinct contexts such as acute infection and cancer (refs. 29, 30). For example, transcriptomes of cluster 10 OT-I cells from Flu-OVA lung versus KP lung cancer showed minimal differences (Extended Data Fig. 5j,k), with similar surface proteins (Extended Data Fig. 5l). Past work has indicated that TILs in the KP lung model are refractory to ICB but can be reinvigorated by cytokines (refs. 78, 79), emphasizing that understanding these differences will be important for developing new therapeutics. Together, these contrasting proteomic, transcriptional, and immunological profiles demonstrate that clusters 7, 10, and 11 represent related but distinct CD8 TIL states shaped by persistent antigen exposure, tumor type, tissue microenvironment, and possibly sequential differentiation within the tumor microenvironment. TCR repertoire analysis revealed partial clonal overlap among 7/10/11 in B16 and KP tumors (Extended Data Fig. 5m), suggesting common progenitors or dynamic intratumoral transitions.

Unlike other subsets, TPEX cells lacked a dedicated cluster. T-RBI mapping of a published LCMVcl13 TPEX dataset (ref. 36) placed them primarily in circulating memory-like cluster 4 and a rare state, cluster 22 (Fig. 6h). Cluster 22 showed a TPEX-like profile (e.g., Tcf7, Tox, Sell, Ccr7) (Fig. 6i); it appeared in B16 tumor-draining LNs and, unexpectedly, was enriched in late VV memory LNs (80 dpi) (Fig. 6d), illustrating T-RBI’s utility for revealing unexpected biology.

In conclusion, clusters 7 and 11 represent distinct CD8 T cell states arising predominantly under persistent antigenic stimulation in infection and cancer. Through surface-marker definitions, gating strategies, gene signatures, and external dataset integration, the immgenT framework resolves the full spectrum of CD8 exhaustion and residency-related differentiation.

ImmgenT as a universal framework for mapping T cell data

To illustrate the practical utility of the CD8 immgenT framework as a universal reference, we present two case studies demonstrating how T-RBI projection enables rapid deconvolution of CD8 T cell heterogeneity in new datasets, guided by the immgenT general annotation system.

In the first case, we selected a dataset of mouse anti-CD19 CAR CD8 T cells (ref. 39), representing an immune perturbation not included within the immgenT framework. The original study identified 11 clusters of CAR T cells generated in the context of B cell acute lymphoblastic leukemia (B-ALL) and CD19-expressing B16 melanoma (Fig. 7a). We projected the single-cell data onto the immgenT CD8 MDE, placing the query cells into a reusable, universal embedding instead of a study-specific UMAP (Fig. 7b). 95% of CAR T cells were assigned to immgenT CD8 clusters with high confidence (Extended Data Fig. 6a). A small fraction remained unassigned (5%), but these cells did not form coherent clusters and hence did not represent a novel CD8 T cell state absent from the immgenT reference (Extended Data Fig. 6b). Overlaying the study’s original clusters showed that T-RBI preserved the dataset’s internal diversity, rapidly resolving the cell states originally labelled as stem-like, effector-like, and exhausted-like (Fig. 7a,b). In addition to providing harmonized clustering and annotation of the query dataset, the integration provided greater granularity than originally described. A frequency dot plot revealed how each cluster of the published dataset was distributed across immgenT clusters (Fig. 7c, Extended Data Table 5). Stem-like (TS-like) cells split into two dominant states: 52% mapped to immgenT cluster 4 (TCM-like) and 33% to cluster 13 (TEM-like). TEX-like cells were heterogeneous, spanning clusters 7, 10, and 11 (as seen in Fig. 6). Effector-like cells mapped primarily to cluster 15 (effector-like; 47%). TRM-like cells mapped predominantly to cluster 10 (66%), with a minority mapping to cluster 22 (11%), which the immgenT framework previously associated with a TPEX-like population (described in Fig. 6).
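The percentages quoted above amount to a row-normalized cross-tabulation of original annotations against assigned immgenT clusters. A minimal sketch of that bookkeeping follows; the function name and the toy labels are invented for illustration, and the normalization direction (fractions within each original annotation) is an assumption based on how the percentages are worded in the text.

```python
from collections import Counter, defaultdict


def composition(original_labels, reference_clusters):
    """Fraction of each original annotation assigned to each reference cluster."""
    counts = defaultdict(Counter)
    for label, cluster in zip(original_labels, reference_clusters):
        counts[label][cluster] += 1
    return {
        label: {cluster: n / sum(per_label.values()) for cluster, n in per_label.items()}
        for label, per_label in counts.items()
    }


# Toy data (not from the study): six query cells with their two sets of labels.
original = ["TS-like", "TS-like", "TS-like", "TS-like", "TRM-like", "TRM-like"]
assigned = [4, 4, 13, 22, 10, 10]
table = composition(original, assigned)
```

Here `table["TS-like"]` holds the share of toy TS-like cells landing in clusters 4, 13, and 22, and each row sums to 1. Normalizing within each immgenT cluster instead would only require swapping the two arguments.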

Figure 7. T-RBI projection enables rapid, reproducible annotation of new CD8 T cell datasets onto the universal immgenT reference.

a, UMAP projection of the CD8 chimeric antigen receptor (CAR) T cells from a publicly available dataset (ref. 39); b, UMAP projection of the CD8 T cell immgenT framework (gray) overlaid with CD8 CAR T cells as in a but after T-RBI and annotated by authors’ original annotation or immgenT’s; c, dot plot showing the composition of the CD8 immgenT clusters by authors’ original annotations where each dot represents the proportion of cells within an immgenT cluster that originate from a specific authors’ original annotation as in a (Extended Data Table 5); d, UMAP projection of the scRNA-Seq layer of infection/tumor P14 CD8 T cells from a publicly available dataset (ref. 80); e, UMAP projection of the CD8 T cell immgenT framework (gray) overlaid with P14 T cells as in d but after T-RBI and annotated by authors’ original annotation or immgenT’s; f, dot plot showing the composition of the CD8 immgenT clusters by authors’ original annotations where each dot represents the proportion of cells within an immgenT cluster that originate from a specific authors’ original annotation as in d (Extended Data Table 6). Abbreviations: Prolif., proliferating; TS-like, stem-like T cells; TEFF-like, effector-like T cells; TEX-like, exhausted-like T cells; TNK-like, natural killer-like T cells; TRM-like, resident memory-like T cells; ISG, interferon-stimulated genes.

In the second case, we isolated the scRNA-Seq layer from a scRNA-Seq/scATAC-Seq paired dataset (ref. 80), comprising 46,062 TCR-transgenic P14 CD8 T cells across LCMVarm, LCMVcl13, and four GP33-expressing tumor models (B16 melanoma, mWnt triple-negative breast cancer, Kras(G12D) Trp53(R172H) Pdx1-Cre (KPC) PDAC, and B-ALL) (Fig. 7d). The breadth of this dataset, which captures antigen-specific T cell states across a wide variety of immunological conditions, prompted us to project it onto the immgenT reference. About 99% of the cells mapped to existing immgenT clusters with high confidence (Extended Data Fig. 6c,d) and without generating novel states, further validating the comprehensiveness of the immgenT framework (Fig. 7e). As in the previous case study, T-RBI improved granularity (Fig. 7f, Extended Data Table 6). For example, TCM as defined by the authors split into two states, with 38% assigned to cluster 4 and 43% assigned to cluster 13. On the other hand, T-RBI also collapsed some of the distinctions drawn by the authors onto the same molecular states. For example, the authors’ TR-tem and TRM showed dominant convergence on the same cluster 10. These results further highlight the immgenT framework’s ability to improve and streamline cluster annotation, but also underline the importance of integrating additional information (e.g., ATAC-Seq) to advance from a static molecular map toward a more dynamic model of CD8 T cell differentiation.

Overall, these case studies demonstrate how T-RBI projection onto the immgenT framework enables rapid, reproducible, and interpretable annotation of new single-cell CD8 datasets. Furthermore, to extend our proposed marker-based approach to all 21 clusters, we provide a predicted flow cytometry panel of 13 CITE-Seq markers enabling sampling across the entire CD8 MDE space (Extended Data Fig. 6e). We generated 21 cluster- or family-specific gating strategies, validated by flow cytometry (Fig. 3e and Extended Data Fig. 2b–f, 3a, 4e, 6f,g), and evaluated their sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) (Extended Data Fig. 6h, Extended Data Table 3). We developed an interactive online tool, Rosetta (https://rosetta.immgen.org/?q=&group=), allowing users to perform in silico gating on CITE-Seq protein data and visualize gated cells directly on the CD8 MDE. By anchoring external data into this universal CD8 reference, the immgenT general annotation system reveals previously undetected heterogeneity and provides a robust framework for harmonizing CD8 T cell states across studies.
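For readers less familiar with the four metrics, the sketch below shows how they follow from comparing a gate against cluster membership cell by cell. It uses the standard confusion-matrix definitions; the function name, input format, and toy vectors are assumptions for illustration and do not reproduce the evaluation code behind Extended Data Table 3.

```python
def gate_metrics(in_gate, in_cluster):
    """Sensitivity, specificity, PPV, and NPV of a gate for one target cluster.

    in_gate and in_cluster are equal-length sequences of booleans, one per cell.
    """
    tp = sum(g and c for g, c in zip(in_gate, in_cluster))
    fp = sum(g and not c for g, c in zip(in_gate, in_cluster))
    fn = sum(c and not g for g, c in zip(in_gate, in_cluster))
    tn = sum(not g and not c for g, c in zip(in_gate, in_cluster))

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # cluster cells captured by the gate
        "specificity": ratio(tn, tn + fp),  # non-cluster cells excluded by the gate
        "ppv": ratio(tp, tp + fp),          # gated cells that belong to the cluster
        "npv": ratio(tn, tn + fn),          # ungated cells that are outside the cluster
    }


# Toy example (invented): eight cells, three gated, three in the target cluster.
gated = [True, True, True, False, False, False, False, False]
member = [True, True, False, True, False, False, False, False]
metrics = gate_metrics(gated, member)
```

In this toy case the gate captures two of three cluster cells and two of three gated cells are true members, so sensitivity and PPV are both 2/3, while specificity and NPV are both 4/5. PPV corresponds to what the text elsewhere calls purity.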

DISCUSSION

The CD8 immgenT framework provides the first comprehensive, single-cell-resolution coordinate system for mouse CD8 T cell differentiation across an unprecedented breadth of immunological contexts. Integrating >200,000 CD8 T cells from 45 perturbations, 45 tissues, and multiple timepoints, we resolve 21 robust transcriptional and proteomic states that recur predictably according to the nature, duration, and anatomical location of antigenic stimulation. Classical models (acute LCMVarm for effector-to-memory transitions, chronic LCMVcl13 or tumor challenge for exhaustion-like states) organize the atlas, yet the same states generalize remarkably well across diverse viral, bacterial, parasitic, and fungal infections, cancer, autoimmunity, aging, and steady-state homeostasis in SPF mice. Although variable representations of nearly all 21 clusters exist at baseline, their proportions are dramatically reshaped by immunological challenge, indicating that CD8 T cell states are context-dependent rather than fixed cell types. This convergence establishes the immgenT annotation system for CD8 T cells as the immunological equivalent of artificial general intelligence: a single fixed reference (21-state atlas + T-RBI) can take any new CD8 T cell dataset, from any condition, tissue, or laboratory, and return standardized, reproducible state labels without retraining.

CD8 differentiation is largely dictated by two major axes: acute versus chronic antigen exposure and lymphoid versus non-lymphoid residence, with time and tissue microenvironment imposing additional fine-tuning. Acute resolving challenges generate transient effectors (clusters 14/15/28) that transition into circulating memory (clusters 4/5/12/13) or tissue-resident memory (predominantly cluster 10). Persistent antigen in chronic infection or malignancy drives progenitor and terminally exhausted compartments (clusters 22, 7, and 11). Tissue context exerts particularly strong influence on cluster 10 (canonical TRM), whose surface integrin usage (CD103 versus CD49a) and gene-program use vary across organs yet converge on a unified transcriptional identity. Circulating states exhibit tighter coherence, underscoring tissue-imposed reprogramming as a dominant source of CD8 heterogeneity.

A striking observation is the low frequency of canonical exhaustion-like states (clusters 7 and 11) in autoimmune models despite chronic inflammation and persistent self-antigen. Although exhaustion-like profiles have been described in some autoreactive T cells, potentially restraining aberrant responses (refs. 81–84), this low frequency suggests fundamental differences in antigen perception and response in autoimmunity versus chronic viral infection or cancer, possibly due to deletional tolerance, altered presentation, insufficient co-stimulation, or active suppression. T-RBI offers an immediate tool to interrogate additional autoimmune datasets for cryptic exhaustion-like populations.

The global MDE highlights both the power and sensitivity of the framework: it appears continuum-like when superimposing hundreds of conditions, yet discrete clusters remain robust in RNA, protein, and sample space. Cluster-defining signatures are stable across vastly different contexts, capturing convergent molecular endpoints rather than idiosyncrasies. Cluster 10 exemplifies this, reconciling extensive tissue-driven heterogeneity within a single coherent state. Immunologists have long sought unequivocal cell-surface markers; the framework explains their elusiveness as most CD8 states require combinatorial signatures and coordinated gene programs rather than single proteins.

The convergent nature of CD8 states, reaching similar molecular endpoints via diverse trajectories dictated by context, suggests shared regulatory pathways amenable to therapeutic targeting across diseases. Caution is warranted in assigning rigid functional labels. Bona fide TRM from acute infections map to cluster 10, yet subsets of TIL in melanoma and other cancers occupy the same state, blurring protective residency and dysfunctional persistence. Recent work shows tumor-resident exhausted T cells are ontologically, clonally, and functionally distinct from canonical TRM despite shared residency features, with TRM rewired to exhaustion under chronic antigen but TEX unable to form conventional TRM upon antigen withdrawal (refs. 29, 30). Similarly, clusters 4/5/12/13 recapitulate TCM, TEM, and tTEM/LLEC after acute infection, but identical states appear in cancer-draining LNs, autoimmune lesions, aged mice, and unchallenged animals. These findings argue against overly prescriptive nomenclature and favor context-aware, framework-based descriptions that acknowledge molecular convergence.

Recent community guidelines for T cell nomenclature recognize the limitations of rigid subsets, advocating a modular paradigm that denotes individual properties (e.g., stemness, residency, exhaustion) with concise descriptors (ref. 31). This aligns closely with immgenT: the 21 clusters provide an empirically derived coordinate system that anchors modular descriptors to molecular data, enabling precise, context-specific interpretations. Together, these efforts support viewing T cells as dynamic states rather than static subsets.

The public resources (T-RBI, Rosetta) extend utility: T-RBI harmonizes annotations and discovers rare subsets (e.g., placing TPEX in cluster 22 and revealing its enrichment at late vaccinia memory). Rosetta enables in silico gating on the 128-marker CITE-Seq panel. A streamlined 13-marker flow panel and validated strategies for all 21 states further democratize access (https://www.immgen.org/ImmGenT/). The immgenT companion articles together build a comprehensive view of T cell biology across lineages and contexts. The cosmology paper establishes the global landscape of mouse T cells, positioning CD8 states within this unified framework (immgenT-Cosmology ms). Additional companions connect CD8 states to the broader T cell universe: the gene-program paper defines compact, interpretable gene programs that capture major axes of variation across T cell states (immgenT-GP ms); the Treg reference reveals a conserved, low-dimensional state space for regulatory T cells with context-dependent abundance (immgenT-TREG ms); the TCR repertoire analysis documents recurrent public clonotypes, recombination biases, and tissue-specific clonal expansions (immgenT-TCR ms); and studies of normalized microbial exposure (“dirty” mice) demonstrate how environmental cues shift CD8 state proportions toward effector memory dominance without generating new identities (immgenT-Exposure ms). Despite its breadth, the framework primarily maps “where” cells reside in molecular space rather than “how” they arrive there. Key open questions include transcription-factor cascades, epigenetic landscapes, and lineage relationships. Integrating trajectory inference, CRISPR screens, ATAC-seq, and fate-mapping with T-RBI will transform this static map into a dynamic understanding of CD8 differentiation.

In summary, the CD8 immgenT framework resolves decades of fragmented literature into a cohesive reference. By providing robust states, validated markers, and publicly accessible tools, it equips the field to annotate heterogeneity reproducibly, uncover unexpected biology, and accelerate translational efforts in vaccination, cancer immunotherapy, and chronic infection.

ONLINE METHODS

Mice and cell lines

All experiments were performed using 6–8-week-old C57BL/6J mice obtained from The Jackson Laboratory (Bar Harbor, ME) and maintained under specific pathogen-free (SPF) conditions. Whenever possible, both male and female mice were included, with duplicate biological replicates per condition. Mice were euthanized by cervical dislocation (without CO2) between 8:30 and 9:30 AM to minimize circadian variation. All animal procedures were approved by the Institutional Animal Care and Use Committees (IACUC) of the participating institutions and complied with NIH guidelines. MC38-SIY and B16-SIY tumor experiments were performed using female C57BL/6 mice (6–8 weeks old) obtained from Taconic Biosciences and housed under SPF conditions at the MIT Koch Institute animal facility. MC38-SIY colon carcinoma and B16-SIY melanoma cell lines (Gajewski Laboratory, University of Chicago) were cultured in DMEM supplemented with 10% FBS, 1% penicillin/streptomycin, and 1× HEPES at 37 °C, 5% CO2. Cell lines were routinely tested for mycoplasma.

Tissue collection and single-cell suspension preparation

Tissues were dissected and placed in ice-cold phenol-free DMEM supplemented with 2% FCS and 10 mM HEPES (staining buffer) within 15 minutes of euthanasia. For primary tissues, single-cell suspensions were prepared using optimized protocols to enrich for lymphocytes, including enzymatic digestion (limited to ≤20 minutes at 37°C in glucose-containing medium to preserve epitopes, if possible) and/or density gradient centrifugation (e.g., Percoll) as appropriate for each tissue type (refer to ImmGen Cell Preparation and Sorting Protocol for details; https://www.immgen.org/ImmGenT/immgenT.SOP.pdf). Digestion enzymes were selected to avoid epitope cleavage (e.g., CD3). Final cell concentrations were adjusted to ≥10–20 × 10⁶ cells/mL (1–2 × 10⁶ cells/100 μL). Negative selection was avoided to prevent loss of rare T cell populations with atypical surface markers. Sample-specific details, including tissue origin, mouse pooling (when cell numbers were low), and sorting strategy, are provided in Extended Data Table 7 of the accompanying immgenT article (immgenT-Cosmology ms).

Experiments are numbered IGT1–96 (described in Extended Data Table 7, column “IGT”). For the spleen standard (included in each experiment for batch correction), spleens from age-matched C57BL/6J mice were homogenized through a 40 μm filter using a 1 mL syringe plunger, centrifuged at 500 × g for 5 minutes at 4°C, and subjected to red blood cell lysis with 1 mL ice-cold ACK Lysing Buffer for 2 minutes on ice. Cells were washed, resuspended in staining buffer, and counted using a hemocytometer with trypan blue exclusion. See immgenT companion articles (immgenT-Cosmology ms) for additional details.

Hashtagging, antibody-derived tag staining and two-step cell sorting

For primary tissue samples, 1–2 × 10⁶ cells were stained in 100 μL staining buffer with 1 μL TotalSeq-C Anti-Mouse Hashtag antibody (BioLegend, cat. nos. 155861–155879; unique hashtag per sample/replicate), viability dye, and fluorescent antibodies (e.g., anti-CD3-PE, anti-CD45-APC). For the spleen standard, 1 × 10⁶ splenocytes were stained in 50 μL staining buffer with 0.5 μL hashtag antibody and anti-CD45 conjugated to a fluorochrome distinct from that of the primary samples (e.g., FITC). All staining was performed on ice in the dark for 20 minutes, followed by two washes with staining buffer and resuspension in 200 μL for sorting.

Cells were then filtered into FACS tubes. Samples (each with a unique hashtag) were sorted on a flow cytometer (e.g., BD FACSAria) to enrich for live CD45+CD3+ T cells (or more specific subsets, as detailed in Extended Data Table 7, column “gating_strategy”), pooling up to 450,000 cells across samples into a single 1.5 mL low-bind tube (Eppendorf, cat. no. 022431021) containing 300 μL staining buffer. Additionally, 50,000–100,000 live CD45+ splenocytes (distinguishable by fluorochrome) were sorted into the same tube. If fewer than 500,000 total cells were obtained, unstained splenocytes were added as fillers. Sorting plots and FCS files were saved for records.

ADT reagents were prepared by equilibrating TotalSeq-C ImmGen T panel (128 antibodies; BioLegend, custom part no. 900004815) at room temperature for 5 minutes, centrifuging at 10,000 × g for 30 seconds, adding 27.5 μL staining buffer (yielding a 2× mix), vortexing, incubating for 30 minutes, and centrifuging at 14,000 × g for 10 minutes at 4°C before transferring to a low-bind tube on ice. Post-first sort, cells were centrifuged at 500 × g for 5 minutes at 4°C, resuspended in 20 μL staining buffer + 5 μL Fc block (BioLegend), and mixed with 25 μL 2× ADT mix. The mixture was incubated on ice in the dark for 20 minutes with gentle resuspension every 5–10 minutes. Cells were washed four times with 1 mL staining buffer (500 × g, 5 minutes, 4°C) to remove unbound ADTs and resuspended in 300 μL staining buffer for the second sort.

Two populations were sorted into a single 1.5 mL low-bind tube containing 100 μL staining buffer: ~50,000 live T cells from primary samples (e.g., CD45-APC+) and 5,000–10,000 spleen standard cells (e.g., CD45-FITC+). Unlabeled filler cells were excluded. Sorting plots and FCS files were saved. As a general guideline, to recover ~10,000 high-quality cells post-sequencing, ~20,000 cells were targeted for encapsulation (~50% efficiency), requiring 40,000–60,000 cells in the first sort (~30–50% recovery). See immgenT companion articles (immgenT-Cosmology ms) for additional details and TotalSeq-C custom mouse panel antibody list.
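The guideline above is a chain of two divisions, which can be sketched as follows. The function name and the treatment of efficiency and recovery as simple fractions are assumptions made for illustration; only the quoted numbers come from the text.

```python
def cells_needed(target_recovered, encapsulation_efficiency, sort_recovery):
    """Back-calculate cell inputs from the desired number of recovered cells.

    Returns (cells to load for encapsulation, cells to collect in the first sort).
    Both rates are fractions between 0 and 1.
    """
    to_encapsulate = target_recovered / encapsulation_efficiency
    to_first_sort = to_encapsulate / sort_recovery
    return to_encapsulate, to_first_sort


# ~10,000 recovered cells at ~50% encapsulation efficiency and 50% sort recovery.
best_case = cells_needed(10_000, 0.5, 0.5)
# Same target at the lower quoted recovery of 30%.
worst_case = cells_needed(10_000, 0.5, 0.3)
```

With 50% recovery this gives 20,000 cells to encapsulate and 40,000 cells in the first sort; at 30% recovery the first-sort requirement rises to roughly 67,000, so the 40,000–60,000 range in the text reads as a rounded working guideline rather than an exact bound.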

Single-cell encapsulation, library preparation, and sequencing

Cells were centrifuged at 500 × g for 5 minutes at 4°C, aspirated to 30 μL, and gently resuspended. Using the full 30 μL, samples were processed using the Chromium Next GEM Single Cell 5’ v2 Dual Index platform with Feature Barcoding for Cell Surface Protein and V(D)J Enrichment (10x Genomics, CG000330). GEM generation, barcoding, reverse transcription, and cDNA amplification were performed per manufacturer guidelines. Gene expression (GEX), TCR (VDJ), and TotalSeq-C (ADT) libraries were constructed separately, quantified (Qubit dsDNA HS assay), and quality-checked (Agilent Bioanalyzer High Sensitivity DNA assay). The three libraries were pooled based on molarity in the following proportions: 47.5% RNA, 47.5% Feature Barcode, and 5% TCR. The pooled libraries were sequenced on an Illumina NovaSeq S2 platform (100 cycles) using the 10x Genomics specifications: 26 cycles for Read 1, 10 cycles for Index 1, 10 cycles for Index 2, and 90 cycles for Read 2. See immgenT companion articles (immgenT-Cosmology ms) for additional details.
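As a worked illustration of molarity-based pooling at the stated proportions, the sketch below converts target molar shares into pipetting volumes. The library concentrations and the total pooled amount are invented placeholders, and the function is a hypothetical helper rather than part of the published protocol; it relies only on the identity 1 nM = 1 fmol/μL.

```python
# Target molar share of each library in the final pool, as stated in the text.
TARGET_FRACTIONS = {"GEX": 0.475, "ADT": 0.475, "VDJ": 0.05}


def pooling_volumes(concentrations_nM, total_fmol):
    """Volume (uL) of each library so that its molar share matches TARGET_FRACTIONS.

    Because 1 nM equals 1 fmol/uL, volume = fmol wanted / concentration in nM.
    """
    return {
        name: TARGET_FRACTIONS[name] * total_fmol / concentrations_nM[name]
        for name in TARGET_FRACTIONS
    }


# Hypothetical concentrations (nM); real values would come from Qubit/Bioanalyzer.
example = pooling_volumes({"GEX": 10.0, "ADT": 5.0, "VDJ": 2.0}, total_fmol=100.0)
```

With these placeholder inputs the pool would take about 4.75 μL of the gene expression library, 9.5 μL of the Feature Barcode (ADT) library, and 2.5 μL of the TCR library.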

Data Processing, Quality Control, Integration, and Annotation

Gene, hashtag, and ADT counts were generated using CellRanger (v7.1.0) aligned to mm10 (GRCm38) with Gencode M25 annotation. Sample demultiplexing used HTODemux (Seurat v4.1). Quality control excluded cells with <500 RNA counts, >10% mitochondrial reads, <500 ADT counts, non-specific ADT binding (isotype controls), or non-T cell identity (based on lineage gene signatures via AddModuleScore_UCell). In some experiments, CITE-seq data did not pass QC and were excluded (Extended Data Table 7, column “cite_seq”).
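The numeric thresholds above translate into a simple per-cell filter. The sketch below covers only the three count-based criteria; the field names are hypothetical, and the isotype-control and lineage-signature filters mentioned in the text are not modeled here.

```python
def passes_qc(cell, min_rna=500, max_mito_fraction=0.10, min_adt=500):
    """Apply the count-based QC thresholds quoted in the text to one cell.

    cell is a dict with 'rna_counts', 'mito_fraction' (0-1), and 'adt_counts'.
    Cells are excluded if RNA counts < 500, mitochondrial reads > 10%,
    or ADT counts < 500.
    """
    return (
        cell["rna_counts"] >= min_rna
        and cell["mito_fraction"] <= max_mito_fraction
        and cell["adt_counts"] >= min_adt
    )


# Toy cells (invented values) illustrating each failure mode.
cells = [
    {"id": "ok", "rna_counts": 2500, "mito_fraction": 0.04, "adt_counts": 1800},
    {"id": "low_rna", "rna_counts": 300, "mito_fraction": 0.04, "adt_counts": 1800},
    {"id": "high_mito", "rna_counts": 2500, "mito_fraction": 0.15, "adt_counts": 1800},
    {"id": "low_adt", "rna_counts": 2500, "mito_fraction": 0.04, "adt_counts": 120},
]
kept = [c["id"] for c in cells if passes_qc(c)]
```

Only the first toy cell survives. For experiments in which CITE-seq data were excluded, the ADT criterion would presumably be dropped, for example by passing `min_adt=0`.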

Comprehensive analysis of the gene expression data, including cell identity assignment and hierarchical partitioning into level 1 lineages and level 2 clusters, is fully detailed in the immgenT companion papers (immgenT-Cosmology ms). Briefly, data were integrated using scVI.totalVI (v1.2.0) with lane as batch covariate (30 dimensions, all genes/proteins). Dimensionality reduction used pymde.preserve_neighbors(), followed by clustering (Seurat FindClusters in TOTALVI space) and hierarchical refinement (Louvain at multiple resolutions with silhouette-guided merging/splitting). Main lineage annotations (CD4, CD8, Treg, γδT, etc.) used RNA and protein markers (Cd3e, Trbc1/2, Cd4, Cd8a/b, Foxp3, etc.), with corrections based on CITE-seq. Resting/activated/proliferating states were assigned using Sell/CD62L, Cd44/CD44, Mki67, and cell cycle scoring. Differential gene expression used limma-trend and/or FlashierDGE (EBMF semi-NMF on log-normalized data). For mapping external datasets to the immgenT reference, T-RBI used Scanpy, scVI/SCANVI, and pyMDE. See immgenT companion papers (immgenT-Cosmology ms) for additional details.

Flow cytometry

Cells were incubated with Fc block for 10 minutes at 4°C and then with the indicated antibodies for 20 minutes at 4°C in PBS supplemented with 2% bovine growth serum and 0.01% sodium azide. All reagents were titrated prior to use to determine optimal concentrations. Stained cells were acquired on a CytoFLEX (Beckman Coulter), BD LSRFortessa X-20, or Cytek Aurora and analyzed with BD FlowJo software version 10.

MC38-SIY and B16-SIY tumor experiments

Tumor cells were detached with 0.25% trypsin-EDTA, washed, and 2 × 10⁶ cells in 100 μL PBS were injected subcutaneously into the flank. Tumors were harvested 14 days post-implantation. Tumors were minced and digested for 20 min at 37 °C in RPMI containing 250 μg/mL Liberase (Sigma-Aldrich) and 50 μg/mL DNase I (Sigma-Aldrich), then mashed through a 70 μm strainer. Tumors of the same type were pooled, washed three times in chilled FACS buffer (PBS + 1% FBS + 2 mM EDTA), blocked with anti-CD16/32 (clone 93, BioLegend; 1:100), and stained with Fixable Viability Dye eFluor 780 (eBioscience; 1:2000) and fluorophore-conjugated antibodies. Live CD45+ cells were sorted on a BD FACSAria III into RPMI + 10% FBS. Sorted cells were resuspended in PBS + 0.04% BSA at ~1,000 cells/μL. Single-cell 3′ gene expression libraries (no CITE-seq or TCR sequencing) were generated using Chromium Single Cell 3′ Reagent Kits v2 (10x Genomics) and sequenced on an Illumina HiSeq 2000. Raw data were processed with Cell Ranger v3.0.1 (mm10 reference) and further analyzed with Seurat v3.2.2 (ref. 85). T-RBI was performed only on the CD8+ T cell fraction of this dataset.

Extended Data

Extended Data Figure 1. Comprehensive integration of the CD8αβ immgenT framework.

a, scatter plot showing CITE-Seq expression levels of CD8A and CD8B by the CD3 T cell immgenT cosmology annotated by main T cell subsets; b-c, UMAP projection of the CD8 T cell immgenT framework (gray) overlaid with all the public datasets (refs. 32–39) included in this study after T-RBI together (b) or individually (c); d, UMAP projection of the CD8 T cell immgenT framework annotated by main clusters with scatter plots partitioning the UMAP in four areas based on CITE-Seq expression levels of CD62L and CD44; e, UMAP projection of the CD8 T cell immgenT framework (gray) and highlighting cells from healthy 6–8 weeks old SPF C57BL/6J mice colored by immgenT clusters as indicated; f, UMAP projection of the CD8 T cell immgenT framework (gray) colored by selected experiments, tissues, and T cell types as indicated; g, UMAP plot showing the CD5HI and CD5LO signature (ref. 44) enrichment in clusters 1 and 2. Abbreviations: cl, cluster; SPF, specific pathogen-free; Prolif., proliferating; d.p.i., days post-infection; sig., signature.

Extended Data Figure 2. Molecular heterogeneity and validation of acute effector states.

a, UMAP projection of the CD8 T cell immgenT framework (gray) overlaid with a selected public dataset (ref. 37) after T-RBI; b-d, flow cytometry plots validating the expression of CITE-Seq-predicted markers from lung day 7 post-Flu-OVA infection (b) or the expression of CD11c from spleen at day 7 (c) and 30 (d) post-LCMVarm infection on CD62L+CD44− naive and CD62L−CD44+ activated T cells with histogram plots showing the median fluorescence intensity (MFI); e, flow cytometry plots showing expression of KLRG1 and CD127 by CD44+CD11c+ endogenous or P14 cells at day 7 post-LCMVarm infection from the spleen; f, flow cytometry plots showing expression of CD11c on CD8b negative leukocytes from spleen (LCMVarm day 7) or lung (Flu-OVA day 7) as reference; g, scRNA-Seq public dataset (ref. 52) showing the expression of Itgax (encoding CD11c) over time from spleen P14 CD8 T cells (top-left), and flow cytometry data showing CD11c protein expression at the same timepoints from P14 and CD44+ endogenous CD8 T cells as representative histograms (CD44+ endogenous only; top-right) or cumulative box plot (n=3 per timepoint; bottom) where only mice with at least 100 target cells per timepoint were included; h-i, heatmap showing expression of the top 10 genes from GP10 (h) and GP25 (i) across the CD8 T cell immgenT clusters; j, circos plots showing the top 50 TCR clones and their sharing between the clusters 14, 15, and 28, from the indicated conditions. Abbreviations: annot., annotation; D, day; Endo., endogenous; n., number; w.p.i., weeks post-infection; cl, cluster.

Extended Data Figure 3. Phenotypic and transcriptional signatures of circulating memory states.

a, flow cytometry plots validating the gating strategy proposed in Fig. 4c; b, gating enrichment projection over the CD8 T cell immgenT UMAP leveraging the strategies depicted in Fig. 4c; c, heatmap showing expression of the top 10 genes from gene-program GP16 across the CD8 T cell immgenT clusters; d, UMAP projection of the CD8 T cell immgenT framework (gray) highlighting cells from skinback tissue at day 80 post-vaccinia virus (VV) infection (IGT46) as indicated; e, projection of CD44+KLRG1−CD127+CD11c+ cells over the CD8 T cell immgenT UMAP; f-i, volcano plots representing cluster-defining signatures for clusters 4, 5, 12, and 13 when each is compared to all the other clusters (left) and balloon plots showing the top 10 genes upregulated or downregulated by each cluster signature with their expression across representative samples (right). Abbreviations: D, day; cl, cluster; VV, vaccinia virus; n., number; vs., versus.

Extended Data Figure 4. Surface-marker and transcriptional heterogeneity of tissue-resident memory CD8 T cells from cluster 10.

a, volcano plots representing the cluster 10-specific signature, with overlapping genes from the indicated publicly available signatures (refs. 5, 63, 64) highlighted in red; b, scatter plots showing the CITE-Seq expression of CD39 and CD73 by all the CD8 T cells from cluster 10 (density) or other clusters (gray) across selected tissues; c-d, box and whisker plots showing sensitivity (c) and positive predictive value (d) for the indicated CITE-Seq-predicted gating strategies from Fig. 5f and Extended Data Table 4; e, representative flow cytometry plots of P14 CD8 CD44+IV− T cells based on the gating strategy proposed in Fig. 5f; f, bar graph showing the average expression of selected TRM-associated transcription factors across the CD8 immgenT clusters; g, stacked bar graph showing the frequency of the indicated clusters across the top 50 samples and conditions; h, balloon plot showing the expression of a curated list of genes by clusters 10, 25, and 26. Abbreviations: sig., signatures; CNS, central nervous system; LN, lymph node; cl, cluster; s, strategy; D, day; TF, transcription factor.

Extended Data Figure 5. Detailed mapping of the exhaustion/residency continuum and progenitor-exhausted cells.

a, UMAP projection of the CD8 T cell immgenT framework (gray) overlaid with an independent TIL dataset (GSE316401) after T-RBI; b, UMAP projection of the CD8 T cell immgenT framework (gray) overlaid with the TEX cluster as defined in a selected public dataset (ref. 36) after T-RBI and annotated with either the original authors’ annotation or the immgenT annotation; c, UMAP projection of the CD8 T cell immgenT framework (gray) highlighting cells by selected conditions and T cell types as indicated; d, feature plots showing expression of selected exhaustion-associated genes within the immgenT CD8 dataset; e, heatmap showing the gene weights of Lag3, Tigit, Pdcd1, and Tox across all the immgenT-derived gene-programs; f, stacked bar plot showing combined mean factor loading for gene-programs GP110, GP12, and GP81 by clusters 7, 10, and 11; g, volcano plot representing the cluster 10-specific signature obtained by comparison with all the other clusters; h-i, volcano plots representing cluster-defining signatures for clusters 7 (h) and 11 (i) when each is compared to all the other clusters (left) and gene-set enrichment analysis (GSEA) displaying the enrichment score of publicly available TRM-associated signatures (refs. 5, 63, 64) (right); j, UMAP projection of the CD8 T cell immgenT framework (gray) highlighting cells by selected conditions and T cell types as indicated; k, volcano plot representing the genes differentially expressed between cluster 10 OT-I cells from the lung in KP lung cancer and in Flu-OVA infection; l, scatter plots showing the predicted cluster 10 CITE-Seq-based gating strategies (as in Fig. 5f) split by cluster 10 OT-I cells from KP lung cancer or lung Flu-OVA infection; m, circos plots showing the top 50 TCR clones and their sharing between clusters 10, 7, and 11 from the indicated conditions. Abbreviations: D, day; cl, cluster; GP, gene-program; vs., versus; NES, normalized enrichment score; FDR, false discovery rate.

Extended Data Figure 6. Performance and practical implementation of the T-RBI reference framework.

a, histogram plot showing the scANVI confidence score related to Fig. 7a-c; b, UMAP projection of a publicly available chimeric antigen receptor (CAR) T cell dataset (ref. 39) highlighting cells with unsuccessful assignment to the immgenT reference (~4% of all cells) in relation to Fig. 7a-c; c, histogram plot showing the scANVI confidence score related to Fig. 7d-f; d, UMAP projection of a publicly available infection/tumor P14 CD8 T cell dataset (ref. 80) highlighting cells with unsuccessful assignment to the immgenT reference (~1% of all cells) in relation to Fig. 7d-f; e, scatter plots showing predicted CITE-Seq-based gating strategies for the isolation of the indicated clusters with gating enrichment projections over the CD8 T cell immgenT UMAP; f, bar graph showing the frequency of the clusters identified by flow cytometry with the gating strategy depicted in Fig. 3b-e on splenocytes at day 7 post-LCMVarm infection; g, bar graph showing the frequency of the clusters identified by flow cytometry with the gating strategy depicted in Fig. 4c,d and Extended Data Fig. 3a on splenocytes at day 30 post-LCMVarm infection; h, box and whisker plots showing positive predictive value for the CITE-Seq-predicted gating strategies from Extended Data Fig. 6e and Extended Data Table 3. Abbreviations: cl, cluster; d.p.i., days post-infection.
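
The gating performance metrics named in these legends and in Extended Data Table 3 (sensitivity, specificity, PPV, and NPV) all follow from a two-by-two comparison of gate membership against cluster membership. The Python sketch below is illustrative only and assumes each cell carries two boolean labels, one for whether it falls in the gate and one for whether it belongs to the target cluster. The names `gate_metrics`, `GateMetrics`, `in_gate`, and `in_cluster` are hypothetical and are not taken from the immgenT code base.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class GateMetrics:
    sensitivity: float  # TP / (TP + FN): fraction of cluster cells captured by the gate
    specificity: float  # TN / (TN + FP): fraction of non-cluster cells excluded by the gate
    ppv: float          # TP / (TP + FP): fraction of gated cells that belong to the cluster
    npv: float          # TN / (TN + FN): fraction of ungated cells outside the cluster


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def gate_metrics(in_gate: Sequence[bool], in_cluster: Sequence[bool]) -> GateMetrics:
    """Score a gate against cluster labels (hypothetical helper, one entry per cell)."""
    if len(in_gate) != len(in_cluster):
        raise ValueError("in_gate and in_cluster must have one entry per cell")
    tp = fp = fn = tn = 0
    for gated, member in zip(in_gate, in_cluster):
        if gated and member:
            tp += 1      # gated cell from the target cluster
        elif gated:
            fp += 1      # gated cell from another cluster
        elif member:
            fn += 1      # target-cluster cell missed by the gate
        else:
            tn += 1      # other-cluster cell correctly left out
    return GateMetrics(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
    )


if __name__ == "__main__":
    # Toy labels for eight cells (made up for illustration, not real data).
    in_gate = [True, True, True, False, False, False, False, False]
    in_cluster = [True, True, False, True, False, False, False, False]
    print(gate_metrics(in_gate, in_cluster))
```

A gate with high PPV but low sensitivity yields a pure but incomplete population, which is why the legends report the two metrics separately.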

Supplementary Material

Supplement 1

Extended Data Table 1. Summary of samples and experiments in the CD8 immgenT framework.

media-1.xlsx (12.1KB, xlsx)

Supplement 2

Extended Data Table 2. List of external published CD8 T cell datasets projected onto the immgenT reference using T-RBI.

media-2.xlsx (11.2KB, xlsx)

Supplement 3

Extended Data Table 3. Performance metrics (sensitivity, specificity, PPV, and NPV) for CITE-seq-derived gating strategies across CD8 T cell states.

media-3.xlsx (11.6KB, xlsx)

Supplement 4

Extended Data Table 4. CITE-seq-guided flow cytometry-like gating strategies for identifying CD8_cl10 T cell state.

media-4.xlsx (10.3KB, xlsx)

Supplement 5

Extended Data Table 5. Distribution of original cluster annotations from a published CAR T cell dataset across immgenT CD8 T cell clusters.

media-5.xlsx (16.1KB, xlsx)

Supplement 6

Extended Data Table 6. Distribution of original cluster annotations from a published infection/tumor P14 CD8 T cell dataset across immgenT CD8 T cell clusters.

media-6.xlsx (20.6KB, xlsx)

Supplement 7

Extended Data Table 7. Sample-level metadata for the immgenT dataset.

media-7.xlsx (192.8KB, xlsx)

ACKNOWLEDGEMENTS

We acknowledge the Flow Cytometry Core Facility and the Sequencing Core Facility at the La Jolla Institute for Immunology, and the Brown University Flow Cytometry Core, for assistance with cell sorting. This work was supported by National Institutes of Health grants R24–072073 (ImmGen consortium), R01AI179952 (A.W.G.), R37AI067545 (A.W.G.), R01AI072117 (A.W.G.), R01AI150282 (A.W.G.), R01AI192333 (S.M.B.), R01AI172905 (S.M.B.), K00CA222711 (N.E.S.), F31AI176705 (K.K.T.), and F31DE032593 (S.M.B.). G.G. was a Cancer Research Institute Irvington Fellow supported by the Cancer Research Institute (CRI4145). A.M.G. was supported by a NOMIS Foundation Postdoctoral Fellowship. T.A.H. is supported by a postdoctoral fellowship from the Ludwig Center at MIT’s Koch Institute.

COLLABORATORS

Participants in the immgenT Project include:

Aaron Liu1, Alexander Chervonsky2, Alexandra Cassano2, Alia Welsh3, Amir Ferry11, Ananda Goldrath11, Andrea Lebron-Figueroa5, Ankit Malik2, Anna-Maria Globig4, Antoine Freuchet2, Bana Jabri2, Charlotte Imianowski6, Christophe Benoist5, Claire Thefaine7, Dan Kaplan6, Dania Mallah5, Dario Vignali6, David Sinclair5, David Zemmour2, Derek Bangs8, Domenic Abbondanza2, Enxhi Ferraj9, Eric Weiss6, Erin Lucas7, Evelyn Chang9, Gavyn Chern Wei Bee10, Giovanni Galletti11, Ian Magill5, Iliyan D Iliev12, Joonsoo Kang9, Jordan Voisine2, Josh Choi5, Julia Merkenschlager13, Jun R. Huh5, Katharine Block7, Ken Cadwell10, Kennidy K. Takehara11, Kevin Osum7, Laurent Brossay14, Laurent Gapin15, Liang Yang5, Lizzie Garcia-Rivera1, Marc K. Jenkins7, Maria Brbic16, Maria-Luisa Alegre2, Marion Pepper8, Mariya London17, Matthew Stephens2, Maurizio Fiusco16, Melanie Vacchio3, Michael Starnbach5, Michel Nussenzweig13, Mitch Kronenberg18, Myriam Croze19, Nalat Siwapornchai5, Nathan Morris12, Nicole E. Scharping11, Nika Abdollahi19, Nitya Mehrotra2, Odhran Casey5, Olga Barreiro del Rio5, Paul Thomas20, Peter Carbonetto2, Remy Bosselut3, Rocky Lai9, Sam Behar9, Sam Borys14, Sara E. Hamilton7, Sara Mostafavi8, Sara Quon11, Serge Candéias21, Shanelle Reilly14, Shanshan Zhang5, Siba Smarak Panigrahi16, Sofia Kossida19, Stefan Muljo3, Stefan Schattgen20, Stefani Spranger22, Steve Jameson7, Susan M. Kaech1, Takato Kusakabe12, Taylor Heim22, Tianze Wang8, Tomoyo Shinkawa9, Ulrich von Andrian5, Val Piekarsa5, Véronique Giudicelli19, Vijay Kuchroo5, Woan-Yu Lin12, Ziang Zhang2

1. NOMIS Center, Salk Institute for Biological Sciences, 2. The University of Chicago, 3. National Institutes of Health, 4. Allen Institute for Immunology, 5. Harvard Medical School, 6. Dept of Dermatology and Immunology, University of Pittsburgh, 7. University of Minnesota, 8. University of Washington, 9. UMass Chan Medical School, 10. University of Pennsylvania, 11. University of California San Diego, 12. Weill Cornell Medicine, 13. The Rockefeller University, 14. Brown University, 15. University of Colorado Anschutz Medical Campus, 16. Swiss Federal Institute of Technology, Lausanne, 17. New York University, 18. La Jolla Institute, 19. IMGT, Univ Montpellier, 20. St. Jude Children’s Research Hospital, 21. Alternative Energies and Atomic Energy Commission, Grenoble, 22. Massachusetts Institute of Technology

Footnotes

COMPETING INTERESTS

Authors declare no competing interests.

DATA AVAILABILITY

ImmgenT raw and processed data are available through GEO (accession GSE297097) and the immgenT portal (https://www.immgen.org/ImmGenT/), and can be visualized via Rosetta. MC38-SIY and B16-SIY tumor experiment data are available under GEO accession GSE316401. Code is available at https://github.com/immgen/immgen_t_git/. See the immgenT companion articles (immgenT-Cosmology ms) for additional details.

REFERENCES

1. Obers A. et al. Retinoic acid and TGF-beta orchestrate organ-specific programs of tissue residency. Immunity 57, 2615–2633 e2610 (2024).
2. Rahimi R.A. & Luster A.D. Chemokines: Critical Regulators of Memory T Cell Development, Maintenance, and Function. Adv Immunol 138, 71–98 (2018).
3. Christo S.N., Park S.L., Mueller S.N. & Mackay L.K. The Multifaceted Role of Tissue-Resident Memory T Cells. Annu Rev Immunol 42, 317–345 (2024).
4. Heeg M. & Goldrath A.W. Insights into phenotypic and functional CD8(+) T(RM) heterogeneity. Immunol Rev 316, 8–22 (2023).
5. Crowl J.T. et al. Tissue-resident memory CD8(+) T cells possess unique transcriptional, epigenetic and functional adaptations to different tissue environments. Nat Immunol 23, 1121–1131 (2022).
6. Scott M.C. et al. Deep profiling deconstructs features associated with memory CD8(+) T cell tissue residence. Immunity 58, 162–181 e110 (2025).
7. Blank C.U. et al. Defining ‘T cell exhaustion’. Nat. Rev. Immunol. 19, 665–674 (2019).
8. Alfei F. et al. TOX reinforces the phenotype and longevity of exhausted T cells in chronic viral infection. Nature 571, 265–269 (2019).
9. Khan O. et al. TOX transcriptionally and epigenetically programs CD8(+) T cell exhaustion. Nature 571, 211–218 (2019).
10. Scott A.C. et al. TOX is a critical regulator of tumour-specific T cell differentiation. Nature 571, 270–274 (2019).
11. Seo H. et al. TOX and TOX2 transcription factors cooperate with NR4A transcription factors to impose CD8(+) T cell exhaustion. Proc. Natl. Acad. Sci. 116, 12410–12415 (2019).
12. Leong Y.A. et al. CXCR5(+) follicular cytotoxic T cells control viral infection in B cell follicles. Nat. Immunol. 17, 1187–1196 (2016).
13. Utzschneider D.T. et al. T Cell Factor 1-Expressing Memory-like CD8(+) T Cells Sustain the Immune Response to Chronic Viral Infections. Immunity 45, 415–427 (2016).
14. Wu T. et al. The TCF1-Bcl6 axis counteracts type I interferon to repress exhaustion and maintain T cell stemness. Sci Immunol 1 (2016).
15. He R. et al. Follicular CXCR5-expressing CD8(+) T cells curtail chronic viral infection. Nature 537, 412–428 (2016).
16. Wieland D. et al. TCF1(+) hepatitis C virus-specific CD8(+) T cells are maintained after cessation of chronic antigen stimulation. Nat Commun 8, 15050 (2017).
17. Kallies A., Zehn D. & Utzschneider D.T. Precursor exhausted T cells: key to successful immunotherapy? Nature Reviews Immunology 20, 128–136 (2020).
18. Zehn D., Thimme R., Lugli E., de Almeida G.P. & Oxenius A. ‘Stem-like’ precursors are the fount to sustain persistent CD8(+) T cell responses. Nat Immunol 23, 836–847 (2022).
19. Brummelman J. et al. High-dimensional single cell analysis identifies stem-like cytotoxic CD8(+) T cells infiltrating human tumors. J. Exp. Med. 215, 2520–2535 (2018).
20. Galletti G. et al. Two subsets of stem-like CD8(+) memory T cell progenitors with distinct fate commitments in humans. Nature Immunology (2020).
21. Lugli E., Galletti G., Boi S.K. & Youngblood B.A. Stem, Effector, and Hybrid States of Memory CD8(+) T Cells. Trends Immunol. 41, 17–28 (2020).
22. Im S.J. et al. Defining CD8+ T cells that provide the proliferative burst after PD-1 therapy. Nature 537, 417–421 (2016).
23. Dammeijer F. et al. The PD-1/PD-L1-Checkpoint Restrains T cell Immunity in Tumor-Draining Lymph Nodes. Cancer Cell 38, 685–700 e688 (2020).
24. Fear V.S. et al. Tumour draining lymph node-generated CD8 T cells play a role in controlling lung metastases after a primary tumour is removed but not when adjuvant immunotherapy is used. Cancer Immunol Immunother 70, 3249–3258 (2021).
25. Miron M. et al. Human Lymph Nodes Maintain TCF-1(hi) Memory T Cells with High Functional Potential and Clonal Diversity throughout Life. J Immunol 201, 2132–2140 (2018).
26. Connolly K.A. et al. A reservoir of stem-like CD8(+) T cells in the tumor-draining lymph node preserves the ongoing antitumor immune response. Sci Immunol 6, eabg7836 (2021).
27. von Renesse J., Lin M.C. & Ho P.C. Tumor-draining lymph nodes - friend or foe during immune checkpoint therapy? Trends Cancer (2025).
28. Wijesinghe S.K.M. et al. Lymph-node-derived stem-like but not tumor-tissue-resident CD8(+) T cells fuel anticancer immunity. Nat Immunol 26, 1367–1383 (2025).
29. Burn T.N. et al. Antigen reactivity defines tissue-resident memory and exhausted T cells in tumors. Nat Immunol 27, 98–109 (2026).
30. Park S.L. et al. Tissue-resident exhausted and memory CD8(+) T cells have distinct ontogeny, function and role in disease. Nat Immunol 27, 110–125 (2026).
31. Masopust D. et al. Guidelines for T cell nomenclature. Nat Rev Immunol (2025).
32. Conceicao-Neto N. et al. AAV-HBV mouse model replicates the intrahepatic immune landscape of chronic HBV patients at single-cell level. Front Immunol 16, 1421712 (2025).
33. Nie J. et al. The transcription factor LRF promotes integrin beta7 expression by and gut homing of CD8alphaalpha(+) intraepithelial lymphocyte precursors. Nat Immunol 23, 594–604 (2022).
34. Tabula Muris C. A single-cell transcriptomic atlas characterizes ageing tissues in the mouse. Nature 583, 590–595 (2020).
35. Yang L. et al. Transcriptome landscape of double negative T cells by single-cell RNA sequencing. J Autoimmun 121, 102653 (2021).
36. Miller B.C. et al. Subsets of exhausted CD8(+) T cells differentially mediate tumor control and respond to checkpoint blockade. Nat. Immunol. 20, 326–336 (2019).
37. Chen Z. et al. TCF-1-Centered Transcriptional Network Drives an Effector versus Exhausted CD8 T Cell-Fate Decision. Immunity 51, 840–855 e845 (2019).
38. Pritykin Y. et al. A unified atlas of CD8 T cell dysfunctional states in cancer and infection. Mol Cell 81, 2477–2493 e2410 (2021).
39. Zhu Z. et al. FOXP1 and KLF2 reciprocally regulate checkpoints of stem-like to effector transition in CAR T cells. Nat Immunol 25, 117–128 (2024).
40. Lam N., Lee Y. & Farber D.L. A guide to adaptive immune memory. Nat Rev Immunol 24, 810–829 (2024).
41. Kaech S.M. & Cui W. Transcriptional control of effector and memory CD8+ T cell differentiation. Nat Rev Immunol 12, 749–761 (2012).
42. Liu Y. et al. Dissecting tumor transcriptional heterogeneity from single-cell RNA-seq data by generalized binary covariance decomposition. Nat Genet 57, 263–273 (2025).
43. Willwerscheid J., Carbonetto P. & Stephens M. ebnm: An R Package for Solving the Empirical Bayes Normal Means Problem Using a Variety of Prior Families. arXiv (2021).
44. Fulton R.B. et al. The TCR’s sensitivity to self peptide–MHC dictates the ability of naive CD8+ T cells to respond to foreign antigens. Nature Immunology 16, 107–117 (2015).
45. De Simone G. et al. CXCR3 Identifies Human Naive CD8+ T Cells with Enhanced Effector Differentiation Potential. The Journal of Immunology, ji1901072 (2019).
46. ElTanbouly M.A. et al. VISTA is a checkpoint regulator for naive T cell quiescence and peripheral tolerance. Science 367 (2020).
47. Joshi N.S. et al. Inflammation directs memory precursor and short-lived effector CD8(+) T cell fates via the graded expression of T-bet transcription factor. Immunity 27, 281–295 (2007).
48. Kaech S.M. et al. Selective expression of the interleukin 7 receptor identifies effector CD8 T cells that give rise to long-lived memory cells. Nat Immunol 4, 1191–1198 (2003).
49. Obar J.J. et al. Pathogen-induced inflammatory environment controls effector and memory CD8+ T cell differentiation. J Immunol 187, 4967–4978 (2011).
50. Plumlee C.R., Sheridan B.S., Cicek B.B. & Lefrancois L. Environmental cues dictate the fate of individual CD8+ T cells responding to infection. Immunity 39, 347–356 (2013).
51. Joshi N.S. et al. Increased numbers of preexisting memory CD8 T cells and decreased T-bet expression can restrain terminal differentiation of secondary effector and memory CD8 T cells. J Immunol 187, 4068–4076 (2011).
52. Kurd N.S. et al. Early precursors and molecular determinants of tissue-resident memory CD8(+) T lymphocytes revealed by single-cell RNA sequencing. Sci Immunol 5 (2020).
53. Omilusik K.D. & Goldrath A.W. Remembering to remember: T cell memory maintenance and plasticity. Current Opinion in Immunology 58, 89–97 (2019).
54. Renkema K.R. et al. KLRG1(+) Memory CD8 T Cells Combine Properties of Short-Lived Effectors and Long-Lived Memory. J Immunol 205, 1059–1069 (2020).
55. Milner J.J. et al. Delineation of a molecularly distinct terminally differentiated memory CD8 T cell population. Proc Natl Acad Sci U S A 117, 25667–25678 (2020).
56. Lucas E.D. et al. Circulating KLRG1(+) long-lived effector memory T cells retain the flexibility to become tissue resident. Sci Immunol 9, eadj8356 (2024).
57. McClory S.E. et al. The pseudokinase Trib1 regulates the transition of exhausted T cells to a KLR(+) CD8(+) effector state, and its deletion improves checkpoint blockade. Cell Rep 42, 112905 (2023).
58. Morgan D.M. et al. Expansion of tumor-reactive CD8(+) T cell clonotypes occurs in the spleen in response to immune checkpoint blockade. Sci Immunol 9, eadi3487 (2024).
59. Crouse J., Kalinke U. & Oxenius A. Regulation of antiviral T cell responses by type I interferons. Nat Rev Immunol 15, 231–242 (2015).
60. Kolumam G.A., Thomas S., Thompson L.J., Sprent J. & Murali-Krishna K. Type I interferons act directly on CD8 T cells to allow clonal expansion and memory formation in response to viral infection. J Exp Med 202, 637–650 (2005).
61. Chiu B.C., Martin B.E., Stolberg V.R. & Chensue S.W. Cutting edge: Central memory CD8 T cells in aged mice are virtual memory cells. J Immunol 191, 5793–5796 (2013).
62. Weiss E.S. et al. Epidermal Resident Memory T Cell Fitness Requires Antigen Encounter in the Skin. eLife Sciences Publications, Ltd; 2025.
63. Mackay L.K. et al. Hobit and Blimp1 instruct a universal transcriptional program of tissue residency in lymphocytes. Science 352, 459–463 (2016).
64. Milner J.J. et al. Runx3 programs CD8+ T cell residency in non-lymphoid tissues and tumours. Nature 552, 253–257 (2017).
65. Milner J.J. & Goldrath A.W. Transcriptional programming of tissue-resident memory CD8+ T cells. Current Opinion in Immunology 51, 162–169 (2018).
66. Cruz-Guilloty F. et al. Runx3 and T-box proteins cooperate to establish the transcriptional program of effector CTLs. J Exp Med 206, 51–59 (2009).
67. Shan Q. et al. The transcription factor Runx3 guards cytotoxic CD8(+) effector T cells against deviation towards follicular helper T cell lineage. Nat Immunol 18, 931–939 (2017).
68. Zhang Y., Feng X.H. & Derynck R. Smad3 and Smad4 cooperate with c-Jun/c-Fos to mediate TGF-beta-induced transcription. Nature 394, 909–913 (1998).
69. Heim T.A. et al. CXCR6 promotes dermal CD8+ T cell survival and transition to long-term tissue residence. J Immunol (2025).
70. Adam M., Potter A.S. & Potter S.S. Psychrophilic proteases dramatically reduce single-cell RNA-seq artifacts: a molecular atlas of kidney development. Development 144, 3625–3632 (2017).
71. O’Flanagan C.H. et al. Dissociation of solid tumor tissues with cold active protease for single-cell RNA-seq minimizes conserved collagenase-associated stress responses. Genome Biol 20, 210 (2019).
72. Byrne A. et al. Tissue-resident memory T cells in breast cancer control and immunotherapy responses. Nat Rev Clin Oncol 17, 341–348 (2020).
73. Nizard M. et al. Induction of resident memory T cells enhances the efficacy of cancer vaccine. Nature Communications 8 (2017).
74. Malik B.T. et al. Resident memory T cells in the skin mediate durable immunity to melanoma. Science Immunology 2, eaam6346 (2017).
75. Ganesan A.-P. et al. Tissue-resident memory features are linked to the magnitude of cytotoxic T cell responses in human lung cancer. Nature Immunology 18, 940–950 (2017).
76. Djenidi F. et al. CD8+CD103+ Tumor–Infiltrating Lymphocytes Are Tumor-Specific Tissue-Resident Memory T Cells and a Prognostic Factor for Survival in Lung Cancer Patients. The Journal of Immunology 194, 3475 (2015).
77. Nair R. et al. Deciphering T-cell exhaustion in the tumor microenvironment: paving the way for innovative solid tumor therapies. Front Immunol 16, 1548234 (2025).
78. Horton B.L. et al. Overcoming lung cancer immunotherapy resistance by combining nontoxic variants of IL-12 and IL-2. JCI Insight 8 (2023).
79. Horton B.L. et al. Lack of CD8(+) T cell effector differentiation during priming mediates checkpoint blockade resistance in non-small cell lung cancer. Sci Immunol 6, eabi8800 (2021).
80. Green W.D. et al. Enhancer-driven gene regulatory networks reveal transcription factors governing T cell adaptation and differentiation in the tumor microenvironment. Immunity 58, 1725–1741 e1729 (2025).
81. Baessler A. & Vignali D.A.A. T Cell Exhaustion. Annu Rev Immunol 42, 179–206 (2024).
82. Grebinoski S. et al. Autoreactive CD8(+) T cells are restrained by an exhaustion-like program that is maintained by LAG3. Nat Immunol 23, 868–877 (2022).
83. Smita S., Chikina M., Shlomchik M.J. & Tilstra J.S. Heterogeneity and clonality of kidney-infiltrating T cells in murine lupus nephritis. JCI Insight 7 (2022).
84. Tilstra J.S. et al. Kidney-infiltrating T cells in murine lupus nephritis are metabolically and functionally exhausted. The Journal of clinical investigation 128, 4884–4897 (2018).
85. Butler A., Hoffman P., Smibert P., Papalexi E. & Satija R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat Biotechnol 36, 411–420 (2018).

Articles from bioRxiv are provided here courtesy of Cold Spring Harbor Laboratory Preprints
