Skip to main content
Nature Portfolio logoLink to Nature Portfolio
. 2024 Jul 2;25(8):1445–1459. doi: 10.1038/s41590-024-01884-z

Pan-cancer profiling of tumor-infiltrating natural killer cells through transcriptional reference mapping

Herman Netskar 1,2,#, Aline Pfefferle 3,✉,#, Jodie P Goodridge 4, Ebba Sohlberg 3, Olli Dufva 5, Sarah A Teichmann 6,7, Demi Brownlie 8, Jakob Michaëlsson 3, Nicole Marquardt 8, Trevor Clancy 9,10, Amir Horowitz 11,12, Karl-Johan Malmberg 1,2,3,
PMCID: PMC11291284  PMID: 38956379

Abstract

The functional diversity of natural killer (NK) cell repertoires stems from differentiation, homeostatic, receptor–ligand interactions and adaptive-like responses to viral infections. In the present study, we generated a single-cell transcriptional reference map of healthy human blood- and tissue-derived NK cells, with temporal resolution and fate-specific expression of gene-regulatory networks defining NK cell differentiation. Transfer learning facilitated incorporation of tumor-infiltrating NK cell transcriptomes (39 datasets, 7 solid tumors, 427 patients) into the reference map to analyze tumor microenvironment (TME)-induced perturbations. Of the six functionally distinct NK cell states identified, a dysfunctional stressed CD56bright state susceptible to TME-induced immunosuppression and a cytotoxic TME-resistant effector CD56dim state were commonly enriched across tumor types, the ratio of which was predictive of patient outcome in malignant melanoma and osteosarcoma. This resource may inform the design of new NK cell therapies and can be extended through transfer learning to interrogate new datasets from experimental perturbations or disease conditions.

Subject terms: Innate lymphoid cells, Tumour immunology


Malmberg and colleagues generated a single-cell transcriptional reference map to investigate pan-cancer profiles of tumor-infiltrating natural killer cells.

Main

NK cells are innate lymphocytes that play a vital role in the immune response through their ability to directly kill transformed and virus-infected cells by orchestrating the early phase of the adaptive immune response1. NK cells are commonly divided into two functionally distinct subsets: CD56bright and CD56dim NK cells2,3. However, this is an oversimplified view of the repertoire. Mass cytometry profiling of NK cell repertoires at the single-cell level revealed an extensive phenotypic diversity comprising up to 100,000 unique subsets in healthy individuals4. Much of this diversity is based on combinatorial expression of stochastically expressed, germline-encoded activating and inhibitory receptors that bind to human leukocyte antigen (HLA) class I and tune NK cell function in a process termed NK cell education5,6. Another layer of diversity reflects the continuous differentiation through well-defined intermediate phenotypes from the naive CD56bright NK cells through CD62L+NKG2A+KIRCD57CD56dim NK cells to terminally differentiated, adaptive CD62LNKG2C+CD57+KIR+CD56dim NK cells, associated with past infection with cytomegalovirus (CMV)710. Given the increasing interest in harnessing the cytolytic potential of NK cells in cell therapy against cancer, it is of fundamental importance to understand the molecular programs and gene-regulatory circuits driving NK cell differentiation and the underlying functional diversification of the human NK cell repertoire.

Utilizing single-cell RNA sequencing (scRNA-seq), Crinier et al. discovered organ-specific signatures in human spleen NK cells and two major transcriptional clusters in blood-derived NK cells (PB-NK), corresponding to CD56dim (NK1) and CD56bright (NK2) NK cell subsets2. Bulk RNA and chromatin immunoprecipitation sequencing identified dominant transcription factor (TF) axes defining CD56bright (TCF1-LEF-MYC) and CD56dim (PRDM1) phenotypic subsets, respectively11. Later research reported additional diversity with unique transcriptional clusters, including interleukin (IL)-2- and type I interferon (IFN)-responding NK cell subsets12 and an intermediate CD56dimGzmK+ stage, potentially bridging CD56bright and CD56dim NK cells13. A comprehensive analysis unveiled a role for Bcl11b in driving NK cell differentiation toward the adaptive state, reciprocally suppressing early TFs such as RUNX2 and ZBTB16 (ref. 14). Combining gene expression analysis, chromatin accessibility and lineage tracing via mitochondrial DNA mutations, Rückert et al. revealed clonal expansions and a distinct inflammatory memory signature in adaptive NK cells15. Using a pan-cancer, single-cell atlas approach, Tang et al.16 identified a tumor-enriched dysfunctional CD56dimCD16hi NK cell population interacting with LAMP3+ dendritic cells in the TME. Hence, scRNA-seq and bulk RNA-seq usage have defined major transcriptional regulatory hubs during NK cell differentiation and identified a persistent memory state in human innate immunity. However, it remains unclear how the regulatory gene circuits that operate under homeostasis in healthy tissues are affected by cellular and/or soluble cues in the TME, resulting in perturbed functional states within tumor-infiltrating NK (TiNK) cells.

In the present study we established a single-cell transcriptional reference map that resolves gene expression trends and dominating TF–target interactions during NK cell differentiation in blood and normal tissues. Reference mapping enabled the analysis of cellular differences and gene programs in diseases and various conditions by contextualizing new datasets within a healthy transcriptional reference, facilitating the identification of new states not found in the literature17. We utilized our NK cell reference map, compiled from 44,640 PB-NK cells (12 donors) and 27,732 tissue-resident NK (TrNK) cells (136 donors), to query the regulons and functional states, as defined through gene expression signatures, of TiNK cells derived from 427 patients with 7 distinct solid tumors (38,982 TiNKs). We found that TrNK and TiNK cells have a clear tissue-residency signature but still share the dominant regulons of blood CD56bright and CD56dim NK cells. Of the six functional states identified in our pan-cancer atlas and confirmed in a spatial transcriptomics dataset, a dysfunctional stressed CD56bright state susceptible to TME-associated cellular communication and a cytotoxic effector CD56dim state were commonly enriched across tumor types. Stratification of patient survival data identified a high ratio of effector CD56dim to stressed CD56bright state to correlate with improved survival in patients with osteosarcoma and melanoma. This resource provides a granular view of cancer-specific alterations of solid TiNK cells, identifying how the TME can lead to NK cell dysfunction and may inspire new strategies to engineer cell therapy products with robust functional phenotypes resistant to TME-induced suppressive mechanisms.

Results

NK cell subset annotation using predictive gene signatures

To establish a pan-cancer atlas of TiNK cells, we first defined NK cell differentiation at the transcriptional level. We performed scRNA-seq of the total NK cell population from seven healthy donors and integrated our transcriptomes with five publicly available donor datasets2,18 using scVI (single-cell Variational Inference)19 (Supplementary Table 1). By retaining only cell-to-cell variation independent of sample-to-sample variation, the cells that initially clustered by donor and laboratory origin were successfully integrated into a homogeneous population and visualized using diffusion maps20 to preserve the continuous trajectories observed with biological differentiation (Fig. 1a). Although NK cell differentiation is best described as a continuum, CD56bright and CD56dim NK cells represent two distinct stages of differentiation. By performing gene signature scoring using AUCell21, we identified cells at the top of the diffusion map embedding scoring high for the CD56bright gene signature2, whereas the main body of the embedding exhibited increasing intensity of the CD56dim signature2 (Fig. 1b). Scoring of two independent gene signatures based on the CD56bright/dim regulon11 and proteome22 confirmed our results (Extended Data Fig. 1a,b).

Fig. 1. NK cell differentiation at the transcriptional level.

Fig. 1

a, Integration process of scRNA-seq data of NK cells from 12 donors and 4 different laboratories using scVI showing a UMAP based on the scVI latent representation, followed by a UMAP based on the diffusion map components. b, AUCell scores of gene signatures for CD56bright and CD56dim NK cell subsets. c, UMAP representation of five sorted subsets from a donor with an adaptive expansion (left) and a donor without an adaptive expansion (right). d, Heatmap depicting accuracy of our prediction model for subset annotation tested on the held-out 15% of cells from the subset-specific dataset (two donors). e, UMAP of the scANVI representation of both bulk and sorted NK cells, showing original annotation of NK cells (12 donors, left) and subset labels predicted (right) using the scANVI model trained with sorted subset data (2 donors). f, Dot plots showing the top three up- and downregulated genes between all pairs of subsets (x and y axes) as identified by the differential expression module in scANVI. These top genes were then visualized across all NK cell subsets within the differentiation spectrum (x axis), to highlight the continuous nature of NK cell differentiation. g, Diffusion map representation showing the predicted subset labels for the bulk data (top) and depicting Leiden clustering of the 12 donor NK cell dataset (bottom). h, Heatmap showing distribution of our annotated 12 donor NK cell subsets over the 5 Leiden clusters. i, Frequency (freq.) of annotated late CD56dim and adaptive NK cell subsets in donors with and without an adaptive NK cell expansion. Int., intermediate.

Source data

Extended Data Fig. 1. Peripheral blood NK cell subsets and sorting strategy.

Extended Data Fig. 1

(a, b) AUCell scores of gene signatures for CD56bright and CD56dim NK regulones (a) and proteomes (b). (c, d) Sorting strategy for phenotypically defined functional PB-NK cell subsets sequenced in one donor with (c) and one without (d) an adaptive expansion. (e) Heatmap depicting similarity between our five annotated transcriptional NK cell subsets (y-axis) and the Meta-NK defined NK subsets (x-axis). The scale represent gene set activity calculated by AUCell (a-b, e).

The relatively large and heterogeneous population of CD56dim NK cells is commonly phenotypically defined into functionally distinct subsets based on a selected number of inhibitory and activating receptors contributing to the functional tuning7. To identify predictive gene signatures associated with these functional stages encompassing NK cell differentiation, we sorted and sequenced equal numbers of CD56bright NK cells and four CD56dim NK cell subsets (NKG2A+KIRCD57, NKG2Aself-KIR+CD57, NKG2Anonself-KIR+CD57, NKG2Aself-KIR+CD57+ or NKG2A−/+self-KIR+CD57+NKG2C+) from two donors, one without and one with a large adaptive NK cell expansion (Fig. 1c and Extended Data Fig. 1c,d). Transcriptionally, the adaptive NK cell subset was the most distinct because the remaining CD56dim subsets exhibited a high degree of transcriptional overlap, while still ordering themselves along the previously defined maturation scheme (Fig. 1c). As previously observed in bulk RNA-seq data23, the transcriptomes of self and nonself KIR+ NK cells were highly similar even at the single-cell level and thus merged for subsequent analysis (Fig. 1c). The five transcriptionally distinct NK cell subsets were renamed to reflect their maturation stage: ‘CD56bright’, ‘early CD56dim’, ‘intermediate CD56dim’, ‘late CD56dim’ and ‘adaptive’ (Fig. 1c).

We next trained a semi-supervised model, scANVI (single-cell ANnotation using Variational Inference)24, to leverage our identified NK cell subset gene signatures to predict and infer subset annotation of compiled bulk NK cell scRNA-seq datasets. We first tested the accuracy of the prediction model (M1) on 15% of the subset-sorted NK cells (Fig. 1c) that were not included in the training of the model. Transcriptionally distinct subsets (CD56bright, adaptive) were annotated with high accuracy, whereas subsets exhibiting higher transcriptional overlap were annotated with slightly reduced accuracy (Fig. 1d). Using this model, we could annotate the total NK cell dataset comprising 23,253 single-cell transcriptomes across 12 donors at the subset level (Fig. 1e). The transcriptional profiles of the subsets are captured by the model and used to identify differentially expressed genes (DEGs). The overlapping sets of genes illustrate the transition between the subsets. (Fig. 1f). To validate our annotation model, we performed unbiased clustering (Leiden) of the total NK cell dataset (12 donors), identifying 5 clusters closely matching our annotated 5 NK cell subsets (Fig. 1g). A small portion of intermediate CD56dim-annotated NK cells clustered together with late CD56dim-annotated NK cells in cluster 4 (Fig. 1h), probably corresponding to more mature cells within the population. The subset stratification obtained through training of our model based on subset signatures, as well as the unbiased Leiden clustering, harmonizes well with the recently proposed NK1–3 nomenclature25 (Extended Data Fig. 1e). Having confirmed the validity of our five NK cell subsets, M1 was utilized to identify donors with an adaptive NK cell expansion, which were all confirmed to be CMV seropositive (Fig. 1i). Thus, this first scANVI model forms a basis to interrogate cellular states layered on top of the natural transcriptional changes with NK cell subsets at different stages of differentiation.

Fate-specific gene-regulatory networks

To decipher the regulatory gene pathways driving NK cell differentiation at the transcriptional level, we used Palantir26 and RNA velocity to calculate pseudotime27,28. Palantir identifies terminal cells based on a chosen starting cell, placing the remaining cells along a timeline (pseudotime). Defining the starting cell (blue) based on the lowest CD56dim score2 (Fig. 1b) identified two terminal cells (orange), predicted to be part of the late CD56dim and adaptive population, respectively (Fig. 2a). To validate this trajectory, we utilized the dynamic model implemented in scVelo27 to compute RNA velocity (spliced versus unspliced transcripts), inferring pseudotime without a predefined starting cell (Extended Data Fig. 2a,b). The resulting vector field and extrapolated pseudotime confirmed a trajectory starting within the CD56bright NK cell subset and terminating in the adaptive subset (Fig. 2b). Last, to infer developmental relationships at the resolution of the five subsets, representing functionally distinct subsets and proposed stages of NK cell differentiation7, we applied partition-based graph abstraction (PAGA)29 to quantify their connectivity and estimate transitions. In line with the two terminal fates (late CD56dim, adaptive) identified by Palantir, we analyzed donors with conventional and adaptive NK cells separately (Fig. 1i). In both types of donors, early CD56dim NK cells formed the connecting link between CD56bright and the remaining CD56dim populations (Fig. 2c,d). However, although adaptive donor NK cells continued their progression to intermediate CD56dim cells, terminating in the transcriptionally distinct adaptive population, conventional donors instead progressed toward intermediate/late CD56dim populations (Fig. 2c,d).

Fig. 2. GRNs defining conventional and adaptive NK cell fates.

Fig. 2

a, UMAP representation highlighting the starting cell (blue) with the lowest value CD56dim signature score and the two terminal cells (orange) as predicted by Palantir. b, UMAP representation of the data from the sorted subsets (two donors) showing the RNA velocity vector field as a stream plot and the inferred pseudotime. c,d, PAGA graph with directionality and transitions from RNA velocity analysis for the sorted subsets (2 donors) (c) and subset-inferred bulk donors (12 donors) stratified based on the presence or absence of adaptive expansion (d). e, Gene trends clustered into five overall trends of expression along pseudotime, showing expression of KLRC2, CD52 and IL32 in both terminal fates (pink, conventional fate; orange, adaptive fate). f, Inferred GRNs where dominant TFs for each trend are highlighted. g, Selection of regulons showing differential expression over pseudotime within the conventional and adaptive fate.

Extended Data Fig. 2. RNA velocity.

Extended Data Fig. 2

(a) Graphical depiction of inferring RNA velocity based on spliced vs unspliced transcripts. (b) RNA velocity plots for ZEB2 and BCL11B transcripts stratified by subset annotation in donors with an adaptive expansion.

Having established a temporal axis to NK cell differentiation, we utilized generalized additive models (GAMs) to compute gene expression trends as a function of time for each gene26, which clustered into five distinct trends (Fig. 2e). Genes varying in expression across the two terminal fates were depicted in their trends for each fate, exemplified by KLRC2, CD52 (refs. 15,18) and IL32 clustering into trend 1 in the conventional late CD56dim fate and trend 4 in the adaptive fate (Fig. 2e). Based on the two-fate model, we constructed gene-regulatory networks (GRNs)21 stratified by the five gene trends and identified the dominant TFs across pseudotime and their known downstream target genes (Fig. 2f). Trend 1 is dominated by genes that are downregulated with differentiation from CD56bright to CD56dim cells, including previously reported TFs (MYC, LEF1, RUNX2)11, RBPJ30 involved in Notch signaling, the retinoic acid receptor (RXRA) and TFs regulating ID2 expression (HOXA9, HOXA10)31 (Fig. 2e,f). Trend 2 genes, compared with trend 1, are upregulated during differentiation from early to intermediate CD56dim cells and include, among others, EGR1 (ref. 32) (cell survival, proliferation, apoptosis, regulation of TRAIL expression), BHLHE40 (refs. 33,34) (associated with NK cell activation and repression of RXRA) and IRF8 (refs. 35,36) (role in orchestrating adaptive response, essential NK cell gene) (Fig. 2e,f). TFs exhibiting less dynamic changes across pseudotime are clustered in trend 3, such as IKZF1 (ref. 37), XBP1 and KLF2, which play a role in regulating homeostatic proliferation, effector function and cytokine responsiveness38,39. TFs exhibiting higher expression at the start and end of pseudotime fall into trend 4, including STAT3 (cell survival, IFN-γ production) and DDIT3 (ref. 40) (stress response, metabolism). Last, expression of trend 5 genes steadily increases with differentiation, decreasing only during late differentiation, and includes previously reported TFs associated with CD56dim NK cells (MAF, PRDM1, TBX21)11, the AP-1 family member BATF, the ETS family member ETV7 and the Wnt target gene ASCL2 (Fig. 2e,f). The TF-based GRNs were further curated to only retain direct targets with significant motif enrichment, referred to as ‘regulons’ (denoted by ‘(+)’), expression of which was confirmed in an independent bulk RNA-seq dataset on sorted NK cell subsets. Regulon expression substantially differing between the conventional and adaptive fate includes conventional fate-associated BHLHE40 (ref. 34), IRF8 (refs. 35,36) and DDIT3 (ref. 40) and adaptive fate-associated MAF11, BATF and PRDM1 (ref. 41) regulons (Fig. 2g). Clustering of dominant TFs according to their temporal expression during NK cell differentiation revealed a set of highly connected regulatory circuits, expression of which diverged during terminal differentiation into one of the two cell fates: conventional or adaptive.

Transfer learning to generate pan-cancer atlas

Having transcriptionally defined NK cell differentiation in peripheral blood (PB), we proceeded to train a second model (M2) with publicly available scRNA-seq datasets encompassing 6 healthy tissues (prostate, lung, pancreas, skin, breast, brain) from a total of 136 donors using scVI19 to generate a healthy reference map (PB-NK + TrNK) (Fig. 3a and Supplementary Table 2). The tissue-specific datasets were integrated and annotated using scANVI and CellTypist42 was used to identify immune subsets of interest at the pan-tissue level (Fig. 3b and Extended Data Fig. 3a) and within individual tissues (Extended Data Fig. 3b–f). The annotation and integration steps were repeated for the scRNA-seq datasets from 7 solid tumors (prostate (PRAD), lung (NSCLC), melanoma (SKCM), pancreas (PAAD), breast cancer (BRAC), glioblastoma (GBM) and osteosarcoma (SARC)) from a total of 427 patients (Supplementary Tables 3 and 4), at the pan-cancer level (Fig. 3c,d) and within individual tumor types (Extended Data Fig. 4a–g). CellTypist-annotated innate lymphoid cells (ILCs) (Extended Data Fig. 5a,b) were further stratified into ILC1/2/3 based on previously described scRNA-seq signatures43. We could not identify ILC1s in both the tissue and the tumor datasets, but, importantly, ILC2- and ILC3-annotated cells scored highly for IL7R expression compared with CD56bright- and CD56dim-annotated NK cells, excluding contamination by ILC1s (Extended Data Fig. 5c,d).

Fig. 3. Pan-cancer atlas of healthy tissue-resident and solid TiNK cells.

Fig. 3

a, Graphic overview of healthy tissue datasets included in the analysis, with the number of donors denoted in brackets. b, UMAP representation showing integration of all healthy tissue datasets. c, Graphic overview of solid tumor datasets included in the analysis, with the number of donors denoted in brackets. d, UMAP representation showing integration of all solid tumor datasets. e, Scoring of tissue-residency signatures (sig.) in PB-NK cell subsets, as well as CD56bright- and CD56dim-annotated TrNK and TiNK subsets. f, UMAP representation showing integration of subset-annotated PB-NK, TrNK and TiNK cells. g,h, PAGA graphs (g) and connectivity heatmap (h) showing connectivity of PB-NK, TrNK and TiNK subsets across all tissues/tumor types, with individual tissues/tumor types highlighted (g). The scale represents gene set activity computed by AUCell (e). Panels a and c created with BioRender.com.

Extended Data Fig. 3. Healthy tissue dataset annotation using CellTypist.

Extended Data Fig. 3

(a) Heatmap depicting expression of signature genes of the main immune populations annotated by CellTypist across all tissue samples. (b-f) UMAP representation showing integration of all healthy tissue datasets, prostate (b), lung (c), pancreas (d), skin (e), breast (f), with individual cell subtypes annotated using CellTypist.

Extended Data Fig. 4. Solid tumor dataset annotation using CellTypist.

Extended Data Fig. 4

(a-g) UMAP representation showing integration of all solid tumor datasets, PRAD (a), NSCLC (b), SKCM (c), PAAD (d), BRAC (e), GBM (f), SARC (g), with individual cell subtypes annotated using CellTypist.

Extended Data Fig. 5. Tissue-residency scoring of NK cells.

Extended Data Fig. 5

(a, b) UMAP representation showing integration of all healthy tissue (a) and solid tumor (b) datasets, with lymphocytes populations visualized. (c, d) IL7R expression in annotated NK cells (CD56bright, CD56dim) and ILCs (ILC2, ILC3) in tissues (c) and tumors (d). (e, f) Dotplots depicting expression of genes defining the literature-TR and atlas-TR signatures in CD56bright and CD56dim subsets in healthy blood and across all tissue types (e) and stratified by individual tissues (f). (g) Tissue-residency scoring (atlas-TR) of CD56bright and CD56dim annotated NK cells in individual tissue and tumor types. The scale represents gene set activity calculated by AUCell (g).

To assess tissue-residency status in our annotated NK cells in the tissue- and tumor-derived datasets (Extended Data Fig. 5a,b), we utilized a literature-derived TR signature as well as our own atlas-derived TR (atlas-TR) signature (Fig. 3e). The atlas-TR signature is based on the top six genes differentially expressed by both CD56bright and CD56dim NK cells across tissue types when comparing with the corresponding subset in the blood-derived NK cells (Extended Data Fig. 5e,f). CD56bright NK cells scored generally higher for a TR signature compared with CD56dim NK cells in both normal tissue and tumors, with a more distinct TR signal (compared with PB-NK) achieved with the atlas-TR signature (Fig. 3e and Extended Data Fig. 5g). NK cells annotated in a healthy brain scored very low for tissue residency and thus we cannot exclude blood contamination in these samples (Extended Data Fig. 5g).

CD56bright- and CD56dim-annotated TiNK cells were mapped on to the reference map (PB-NK, TrNK) using transfer learning (scArches44) to generate the final model (M3), our pan-cancer NK atlas (Fig. 3f). CD56bright and CD56dim subsets from PB, tissues and tumors clustered together (Fig. 3g,h) and were more tightly connected than to their respective tissues/tumor origin, apart from skin-/SKCM-derived NK cells (Fig. 3g,h). Thus, differentiation stage had a greater influence on the NK cell transcriptome compared with tissue origin. Transfer learning facilitated incorporation of TiNK cells on to our healthy reference map of PB and TrNK cells, allowing for downstream systematic interrogation of cellular states within solid TiNK cells.

Altered NK cell subset frequencies across tissues and tumors

The TME is shaped by its cellular composition, in particular by the infiltrating immune cells, which in turn can be modulated by their surroundings. A pan-cancer comparison of the healthy tissue and tumor-annotated immune subtypes (Fig. 3b,d) identified an increased proportion of plasma cells and naive B cells, as well as a decreased proportion of CD56dim NK cells, classic monocytes, dendritic cells, NK T cells, and effector memory/effector T helper cells (helper TEM/EFF), effector memory/effector memory re-expressing CD45RA cytotoxic T cells (cytotoxic TEM/EMRA) and resident memory cytotoxic T cells (cytotoxic TRM) in the pan-cancer datasets (Fig. 4a). The fraction of CD56bright NK cells out of total immune cells was enriched in BRAC, whereas CD56dim NK cells were enriched in SKCM, but decreased in NSCLC and BRAC (Fig. 4a–c). We further annotated the CellTypist-identified NK cells at the subset level using our subset-trained model (M1) (Fig. 4d,e). Skewing of the CD56bright:CD56dim ratio between healthy blood or tissue and tumor was observed for most tumor types (Fig. 4d,e), including non-small cell lung cancer (NSCLC), which was independently validated by flow cytometry in an NSCLC cohort (Fig. 4f and Extended Data Fig. 6a). In line with this, we observed a general decrease in the intermediate CD56dim population within the TiNK cells (Fig. 4d,e). Protein-based annotation of the CD56dim population in the NSCLC cohort also identified a decrease of the early and intermediate CD56dim subset and a modest increase of the late CD56dim subsets in the NSCLC cohort compared with healthy blood controls (Fig. 4g and Extended Data Fig. 6b–e). Solid TiNK cells were enriched for a CD56bright transcriptional phenotype whereas intermediate CD56dim NK cells were reduced within the CD56dim compartment in solid tumors, findings that were verified at the protein level in an NSCLC cohort45.

Fig. 4. Cellular composition of pan-cancer cell atlas and subset distribution of TiNK cells.

Fig. 4

a, Heatmap depicting changes in immune subset proportion in tumor samples compared with healthy tissue samples at the pan-cancer level and within individual tumor types. b,c, Proportions of major immune subsets within healthy tissue (b) and tumor samples (c). d,e, Predicted subset annotations of CellTypist-identified NK cells in healthy tissue (d) and tumor samples (e) compared with annotated PB-NK cells. f,g, Frequency of CD56bright NK cells (f) and relative frequency of subsets (g) identified by flow cytometry in a cohort of healthy blood donors (n = 19) and central tumor samples from patients with NSCLC (n = 25), from 23 independent experiments. Data were analyzed using two-sample Student’s t-test with Bonferroni’s correction (a) and a two-tailed Mann–Whitney U-test (f): *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001. The bar graph in f represents the mean ± s.d, with the actual P value indicated (P < 0.0001).

Source data

Extended Data Fig. 6. Phenotyping of NK cells in NSCLC patient samples.

Extended Data Fig. 6

(a) Gating strategy for CD56bright and CD56dim NK cells in healthy donor (PBMC) and NSCLC samples (Tumor). (b-e) Representative plots of CD56dim NK cells (b) and quantification of NGK2A (c), KIR (d) and CD57 (e) expression in PBMC (n = 19) and NSCLC samples (n = 25), from 23 independent experiments. Data were analyzed using two-tailed Mann-Whitney test (c-e). All bar graphs represent the mean ± s.d. Actual p values are indicated.

Source data

Six functionally distinct cellular states of NK cells

TMEs of solid tumors are hostile and often immunosuppressive environments for immune cells to infiltrate46. Understanding how the TME can modulate NK cells at the transcriptional level can provide important insights into understanding the tumor-mediated immunosuppressive mechanisms and how to overcome them.

We implemented an unbiased approach (Milo47) to ascertain cellular states in our pan-cancer NK cell atlas by identifying 6,932 individual neighborhoods without pre-clustering based on cellular origin. Annotating individual neighborhoods as subset specific (>70% of cells in the neighborhood) identified TiCD56bright NK cells as having the most frequent, but also the most unique (differentially abundant), specific neighborhoods (Extended Data Fig. 7a). Notably, most neighborhoods were annotated as ‘mixed’, highlighting transcriptional similarities among NK cells found in PB, tissues and tumors (Extended Data Fig. 7a). The 6,932 neighborhoods were grouped into 6 distinctive neighborhood groups and tested for differential abundance of neighborhoods between TiNK cells and Ref-NK cells (Fig. 5a and Extended Data Fig. 7b). Neighborhood groups 1 and 2 consisted of neighborhoods significantly enriched for TiNK cells and group 6 included neighborhoods enriched for Ref-NK cells (Fig. 5b and Extended Data Fig. 7b).

Extended Data Fig. 7. Characterization of cellular states of NK cells identified in pan-cancer cell atlas.

Extended Data Fig. 7

(a) Beaswarm plot depicting differential abundance of TiNK or Ref-NK (PB-NK, TrNK) enriched neighborhoods, clustered based on subset annotation of individual neighborhoods. (b) TiNK fraction of cells in neighborhoods within each neighborhood group. The boxplot indicates the median with the interquartile range (IQR), whiskers extend to the farthest point within 1.5 times the IQR from the box. n is the number of neighborhoods in each group: group 1, n = 382; group 2, n = 1261; group 3, n = 1239; group 4, n = 871; group 5, n = 1427; group 6, n = 1752. (c-e) Volcano plots depicting differentially expressed genes (DEGs) between Group 4 vs. Group 3/5/6 (c), Group 5 vs. Group 3/4/6 (d), Group 6 vs. Group 3/4/5 (e). Differential expression analysis was performed using the findNhoodGroupMarkers method within the miloR package. Counts were aggregated per sample; groups were compared using edgeR and the adjusted p-values were used for the plots. (f-i) Gene set enrichment analysis (GSEA) for DEGs identified between Group 1 vs Group 2 (f), Group 3 vs Group 4/5/6 (g), Group 5 vs Group 3/4/6 (h) and Group 6 vs Group 3/4/5 (i). Volcano plots: log fold change cutoff at 0.5, p < 0.05. GSEA plots: p value cutoff 0.5 (red line).

Fig. 5. Distinct cellular states of NK cells identified in pan-cancer atlas.

Fig. 5

a, UMAP depicting neighborhood (Nhood) groups identified by Milo and computed using the scVI representation. b, Beeswarm plot depicting differential abundance of neighborhoods (TiNK versus Ref-NK enriched). Colored neighborhoods are differentially abundant at a false recovery rate (FDR) of 0.1. c, Pie charts showing distribution of NK subsets across neighborhood groups annotated using our annotation model (Fig. 1). d, Expression of dominant TF regulons of NK cell differentiation across NK cell states (neighborhood groups). e, Expression of TF regulons uniquely expressed across cellular states. f, Graphic representation of cellular states. g,h, Volcano plots depicting DEGs between group 1 versus group 2 (g) and group 3 versus group 4/5/6 (h) cellular states. Differential expression analysis was performed using the findNhoodGroupMarkers method within the miloR package. Counts were aggregated per sample; groups were compared using edgeR and the adjusted P values were used for the plots. i, Scoring of pathway gene signatures in NK cell states. Func., function; homeo., homeostasis. jn, Dot plots depicting selected genes belonging to stress response (j), immune suppression (k), metabolism (l), cytotoxicity (m) and chemokine/cytokine secretion (n). o, Pie charts depicting distribution of NK cell states in blood, tissues and tumors. Volcano plots: log(fold-change) cutoff at 0.5, P < 0.05. The scale represents regulon activity (d and e) or gene set activity (i) computed by AUCell.

Next, we visualized the distribution of NK cell subsets within each group using our annotation model (M1). Groups 1 and 2 were enriched for, but not exclusive to, CD56bright cells, whereas groups 3–6 were dominated by CD56dim NK cell subsets (Fig. 5c). The dominant TF regulons of PB-NK cell differentiation previously identified (Fig. 2f) confirmed groups 1 and 2 as two CD56bright states and groups 3–6 as four CD56dim NK cell states (Fig. 5d).

Cell-state-specific GRNs, DEGs, gene set enrichment analysis (GSEA) and signature scoring informed our annotation of the states as stressed CD56bright (group 1), typical CD56bright (group 2), effector CD56dim (group 3), adaptive CD56dim (group 4), activated CD56dim (group 5) and typical CD56dim (group 6) (Fig. 5e–n and Extended Data Fig. 7c–i). Comparing the stressed with the typical CD56bright state (group 1 versus group 2) identified increased expression of the cellular stress response ATF3 regulon, the hypoxia-induced MAFF regulon and numerous heat shock proteins (Fig. 5e,g and Extended Data Fig. 7f). The stressed CD56bright cell state scored highly for immunosuppressive pathways (transforming growth factor (TGF)-β signaling, hypoxia, reactive oxygen species (ROS)) and exhibited increased metabolic activation (glycolysis, cholesterol homeostasis, fatty acid metabolism and mTORC1 (mammalian target of rapamycin complex 1)) (Fig. 5g,j–l). Furthermore, a low NK cell cytotoxicity score, exemplified by reduced effector and activating signaling molecules, was suggestive of reduced functionality in this stressed CD56bright cellular state, which was uniquely enriched across all seven tumor types (Fig. 5i,m,o). In line with increased infiltration of CD56bright cells in the TME, the typical CD56bright cellular state was also enriched in five of seven tumor types compared with healthy tissue, with both CD56bright groups exhibiting higher expression of immunomodulatory molecules, including XCL1, XCL2 and IFNG (Fig. 5n–o).

Of the CD56dim states, the effector state was most frequently enriched across tumor types (SARC, PAAD), characterized by an enrichment for apical junction, actin and cytoskeleton-related genes as well as effector molecules (Fig. 5h and Extended Data Fig. 7g). This state, phenotypically enriched for intermediate and late CD56dim NK cell subsets, scored highly for NK cytotoxicity and oxidative phosphorylation and, importantly, low for immune suppression (Fig. 5i,k–m). The adaptive CD56dim state was uniquely enriched for adaptive NK cells, in line with adaptive-associated genes (CD52, IL32, GZMH, CD3E) being upregulated in this state (Fig. 5c and Extended Data Fig. 7c). The activated CD56dim state was distinguished by increased hypoxia, upregulated nutrient transporters and the mTORC1–Myc axis (Fig. 5i,k and Extended Data Fig. 7d,h). Last, the PB-enriched typical CD56dim state exhibited a low stress score and a high cytotoxicity score and was associated with IFN, tumor necrosis factor (TNF) and JAK/STAT signaling (Fig. 5i–j,m and Extended Data Fig. 7e,i). Notably, although we observed enrichment of individual cellular states in the TME, including the two CD56bright and the effector CD56dim states, all states were represented in healthy blood and tissue samples, albeit at different frequencies.

State-specific signaling in the TME links to functionality

To elucidate any TME-based influence on the six functional states identified, we employed CellChat48 to infer intercellular communication, focusing on commonly enriched signaling pathways across all seven tumor types. Group 1 and 2 NK cell states were enriched for incoming signaling across tumor type from four dominant communication pathways (Fig. 6a). Increased expression of CD44, CXCR4 and CD74 on group 1 and 2 NK cells, on which numerous signals from fibroblasts, endothelial cells, tumor cells and macrophages converged (COLLAGEN, MIF, LAMININ), facilitated the augmented incoming signaling in NSCLC (Fig. 6b,c). Notably, the fibroblasts, endothelial cells, tumor cells and cancer-associated fibroblasts (CAFs) also exhibited the strongest outgoing interaction strength across tumor types (Extended Data Fig. 8a–g). Furthermore, group 1 and 2 NK cells preferentially received inhibitory input via the major histocompatibility complex I (MHC-I) (HLA-E/KLRC1) pathway owing to high KLRC1 expression in these cellular states (Fig. 6a,d). Hence, group 1 and 2 cellular states were more receptive to TME-induced immunosuppressive signals via upregulated expression of CD44, CXCR4, CD74 and KLRC1.

Fig. 6. Intercellular communication of distinct cellular states in the TME.

Fig. 6

a, Selected predicted incoming signaling pathways involving TiNK cells common across tumor type, identified by CellChat. b, Violin plots showing expression of receptors for the MIF, COLLAGEN and LAMININ communication pathway in NSCLC. c, Circle plot depicting predicted incoming signaling via CD74, CXCR4 and CD44 expression (NSCLC). d, Violin plots showing expression of receptor and ligand for the MHC-I communication pathway in NSCLC. e, Selected predicted outgoing signaling pathways involving TiNK cells across tumor type. f, Heatmap depicting interaction role of individual cell populations in CCL, PARs and IFN-II signaling pathways in NSCLC based on network centrality analysis. g, Dot plot depicting GZMA expression (ligand for PARs) and IFNG expression (ligand for IFN-II) in NK cell states across tumor type. h, Frequency of granzyme A+CD56dim NK cells in healthy blood donors (n = 6) and patients with NSCLC (n = 11) from 9 independent experiments. i, Geometric mean fluorescence intensity (gMFI) of granzyme A+CD56dim NK cells in healthy blood donors and patients with NSCLC (n = 7) from six independent experiments. j,k, Degranulation (CD107a) (j) and granzyme B release (k) of CD56bright and CD56dim NK cells against A549 target cells pre-treated with (black box) and without (white box) IFN-γ (24 h) in the presence (black box) or absence (white box) of α-NKG2A antibody (E:T 1:1, 4 h, n = 5 biologically independent replicates from one experiment). Data were analyzed using two-way analysis of variance followed by Šidák’s multiple-comparison test (j and k), two-tailed Mann–Whitney U-test (h) or two-tailed Wilcoxon’s test (i). All bar graphs represent the mean ± s.d, with the actual P values indicated.

Source data

Extended Data Fig. 8. Intercellular communication in TME.

Extended Data Fig. 8

(a-g) Scatterplot depicting incoming and outgoing interaction strength of individual cell types in BRAC (a), PAAD (b), PRAD (c), NSCLC (d), SARC (e), SKCM (f), GBM (g) as identified by CellChat. (h) Violin plots showing expression of ligands for the CCL (CCL3, CCL5) communication pathway in NSCLC.

To understand how NK cells contribute to shaping the TME via an immunomodulatory role, we focused our analysis on outgoing signaling largely restricted to NK cells. We identified three signaling pathways (CC chemokine ligand (CCL), protease-activated inhibitors (PARs), IFN-II) through which NK cells predominantly communicated with dendritic cells, macrophages, fibroblasts and endothelial cells (Fig. 6e,f). CCL3 and CCL5, expressed across all states, can lead to the recruitment of cells expressing ACKR1, CCR1 and CCR4 (Extended Data Fig. 6h). Release of granzyme A, highly expressed at the transcriptional level by the effector NK cell state (group 3), can induce apoptosis of F2R-expressing cells in the TME, such as fibroblasts (Fig. 6g). Granzyme A expression was reduced in both frequency and intensity in CD56dim NK cells from central tumor samples from patients with NSCLC compared with healthy blood controls, hinting at a release of granzyme A by NK cells within the tumor (Fig. 6h,i). Release of IFN-γ, predominantly by the stressed CD56bright (group 1) state, can induce surrounding cells to upregulate MHC-I expression, including HLA-E (Fig. 6g and Extended Data Fig. 9a–d). Inhibitory signaling via the HLA-E axis significantly inhibits degranulation and granzyme B release of both CD56bright and CD56dim NK cells, as demonstrated by co-culturing NK cells with A549 (NSCLC) targets cells pre-stimulated with IFN-γ to upregulate HLA-E expression (Fig. 6j,k and Extended Data Fig. 9a–e). Blockade of the NKG2A–HLA-E axis, using an anti-NKG2A antibody, resulted in significant recovery of function, both degranulation and granzyme B release (Fig. 6j,k and Extended Data Fig. 9e). CD56bright cellular states exhibited increased inhibitory signaling (MHC-I) and augmented susceptibility to TME-induced suppression (MIF, COLLAGEN, LAMININ) whereas CD56dim states, particularly the effector state, exhibited high GZMA signaling, which was confirmed in samples of CD56dim from patients with NSCLC.

Extended Data Fig. 9. In vitro validation of IFNG-HLA-E-KLRC1 axis in NSCLC.

Extended Data Fig. 9

(a) Representative histogram of HLA-E expression of A549 cells pre-treated (24 h) with and without IFNγ. (b-d) Viability (b), frequency (c) and geometric MFI (d) of HLA-E + A549 cells (n = 4, biological replicates from two independent experiments). (e) Gating strategy and representative contour plots for functional readout of CD56bright and CD56dim NK cells against A549 target cells pre-treated with and without IFNγ (24 h) in presence and absence of α-NKG2A antibody (E:T 1:1, 4 h). Data were analyzed using two-tailed Mann-Whitney test (b-d). All bar graphs represent the mean ± s.d. Actual p values are indicated.

Source data

Ratio of cellular states is predictive of patient outcome

Having identified 6 functionally distinct cellular states of NK cells within our pan-cancer NK cell atlas comprising 89,850 scRNA-seq transcriptomes, we validated our findings in spatial RNA-seq datasets (Supplementary Table 5). Spatial RNA-seq tissue sections from SKCM, NSCLC and GBM were deconvoluted using Tangram49 combined with our established scRNA-seq references for the tumor types being analyzed to identify the cell types in these datasets (Fig. 7a). Compositional analysis of the main immune subtypes in SKCM, NSCLC and GBM varied greatly across tumor type, but was highly consistent across sequencing technique (scRNA-seq versus spatial-seq) (Fig. 7b). Focusing on SKCM, harboring the highest proportion of NK cells (Fig. 7b), we could further stratify the annotated NK cells into CD56bright and CD56dim subsets (Fig. 7c) and cellular states (Fig. 7d). Importantly, confirming previous results (Fig. 5i,m), the effector (group 3) and typical (group 6) CD56dim states scored highly for genes associated with NK cell cytotoxicity. Similarly, stress response-related genes, as well as immunosuppressive-related genes (ROS, hypoxia) scored highest in the stressed CD56bright (group 1) state (Fig. 7f,g), in line with results in the scRNA-seq data (Fig. 5i–k).

Fig. 7. Distinct cellular states in spatial RNA-seq and association with patient outcome.

Fig. 7

a, Deconvoluted spatial RNA-seq images from SKCM, NSCLC and GBM at the level of immune populations. b, Pie charts depicting compositional analysis of major immune populations from scRNA-seq datasets and spatial-seq datasets for SKCM, NSCLC and GBM samples. c,d, Annotation of CD56bright and CD56dim NK cell subsets (c) and the six cellular states of NK cells (d) in SKCM. eg, Dot plots depicting selected genes belonging to NK cytotoxicity (e), stress response (f) and immunosuppression (g) scored across NK cell states in spatial-seq data from SKCM. h, Kaplan–Meier survival curves showing association of high/low group 1/3 gene signatures with patient outcome across tumor types. Survival analysis was performed using Cox’s proportional hazards model; P values were computed using the log(rank) test.

The clinical benefit of NK cell infiltration in solid tumors has previously been assessed through a general NK cell signature score50,51. Having identified six functional states of NK cells in blood, tissue and solid tumors, in both scRNA-seq and spatial-seq datasets, we proceeded to test clinical relevance of these cellular states by using BayesPrism52 to deconvoluted TCGA (The Cancer Genome Atlas) RNA-seq data where we also had survival data53,54 (Extended Data Fig. 10). A higher ratio of effector CD56dim:stressed CD56bright NK state signatures was predictive of improved survival in SARC and SKCM (Fig. 7h). We hereby confirm that the six functional states identified in our pan-cancer NK cell atlas, and confirmed in spatial RNA-seq datasets, are also predictive of outcome in patients with osteosarcoma and melanoma.

Extended Data Fig. 10. Deconvolution of TCGA datasets.

Extended Data Fig. 10

Distribution of CD56bright and CD56dim NK cells in deconvoluted TCGA datasets. The boxplots indicates the median with the interquartile range (IQR), whiskers extend to the farthest point within 1.5 times the IQR from the box. For each plot and each subset n is the number of patients for each tumor type: SARC, n = 88; PAAD, n = 183; NSCLC, n = 600; BRAC, n = 1231; SKCM, n = 473; PRAD, n = 554; GBM, n = 175.

Discussion

In the present study, we report a compact description of the transcriptional diversification encompassing human NK cell differentiation at the single-cell level. By enriching for less frequent, but phenotypically well-defined, functionally distinct NK cell subsets, we could first train a model to correctly annotate five transcriptional subsets from bulk NK cell populations. By applying probabilistic models implemented in scvi-tools, we created a transcriptional reference map of human blood and TrNK cells from normal tissues, including blood, pancreas, lung, breast, skin, prostate and brain. Transfer learning using scArches facilitated integration of query datasets comprising a total of 2,176,214 transcriptomes from 427 patients spanning 7 solid tumor types. By extracting, annotating and mapping the TiNK cells on to our reference map of healthy donors, we could systematically interrogate TME-induced perturbations of GRNs and functional states of TiNK cells (Supplementary Fig. 1). Our pan-cancer atlas revealed six functionally distinct NK cell states with varying abundance across blood, tissues and tumor types, which we could confirm in spatial RNA-seq datasets (SKCM, NSCLC, GBM). Two states commonly enriched for across tumor types included a dysfunctional CD56bright cellular state susceptible to TME-induced immunosuppression and a cytotoxic TME-resistant CD56dim state, the ratio of which was predictive of patient outcome.

The view that NK cells, like T cells and other immune cells, undergo a continuous process of NK cell differentiation is relatively recent and was originally based on phenotypic and functional classification of discrete subsets7,55. There is abundant evidence to suggest that the CD56bright NK cell subset is the most naive, giving rise to the more differentiated CD56dim NK cells which can further differentiate toward terminal stages, a process accelerated by CMV infection8,56,57. Instead of forcing individual NK cells into arbitrary clusters representing a snapshot of a given time point of differentiation, we clustered TFs and their target genes into five distinct gene expression trends as a function of pseudotime, reflecting continuous differentiation. The dominant TF regulons within these five gene trends correlated with functional traits of NK cells along the differentiation axis, such as cytokine responsiveness, as well as proliferative and cytotoxic capacity. By retaining fate-specific expression profiles, conventional versus adaptive fate in donors with CMV-induced clonal NK cell expansions, we could observe clear divergence of regulon expression (for example, BATF, MAF) during terminal differentiation. BATF belongs to the AP-1 TF family which have been identified as potential drivers in shaping adaptive NK cell chromatin accessibility and thus dictating the unique functional features of this subset, including enhanced IFN-γ response to receptor stimulation15. Establishing dominant regulons defining NK cell differentiation in PB provided a vital reference for downstream interrogation of both TrNK and solid TiNK cells.

Utilizing CellTypist, we harmonized annotations of individual cell subtypes across multiple datasets from six different healthy tissues, extracting and integrating CD56bright and CD56dim NK cells using scVI19 to expand our transcriptional reference map. Importantly, tissue-, as well as tumor-annotated, NK cells, did not express human ILC signature genes (IL7R), instead expressing both EOMES and TBX21. Literature-derived tissue-residency genes (for example, CD69, ITGAE, ITGA1, CXCR6, ZNF683 and IKZF3), originally extrapolated from tissue-resident T cell signatures5861, were more highly expressed in tissue-derived NK cells, particularly in CD56bright NK cells62. Using our extensive pan-cancer NK cell atlas, we were able to generate a solely NK cell-derived, tissue-residency signature (atlas-TR: PSMA2, SLC5A3, CCL4L2, CLN3, SCGB1A1, AREG), which outperformed the conventional literature-derived TR signature across tissue and tumor type. CD56bright and CD56dim NK cells from healthy brain tissue exhibited a low TR- score, indicative of potential blood contamination in this specific dataset. Importantly, GBM-derived CD56bright and CD56dim NK cells scored highly for tissue residency, supporting their infiltration into the tumor. Expression of CCL4L2, encoding a chemokine that induces chemotaxis of CCR5- and CCR1-expressing cells, such as T cells, dendritic cells and macrophages, has previously been described in NK cells isolated from melanoma samples63. This represents an independent verification, because this dataset was not included in our study. These melanoma-infiltrating NK cells also exhibited high AREG expression, an epidermal growth factor (EGF) receptor ligand. Notably, upregulation of AREG has also been described in the setting of healthy and cirrhotic liver-resident NK cells64, a tissue type not included in our pan-cancer atlas. Intriguingly, SCGB1A1, a member of the secretoglobin family, functions as a potent inhibitor of phospholipase A2 (ref. 65), a well-described immunosuppressive molecule contributing to the development of the TME. Hence, it is tempting to speculate that secretion of the SCGB1A1-encoded protein could be another effector mechanism through which TiNK cells can positively contribute to remodeling of the TME.

The presence and abundance of NK cells that reside in the tumor bed vary across tumor types and treatments and between patients, and appears to be associated with the chemokine profiles in the different tissues/TMEs6669. In agreement with previous studies45,67,70, we observed a predominance of CD56bright NK cells in tumors compared with the corresponding normal tissue. TrNK cells are probably a mixed population including naturally residing TrNK cells and TiNK cells. Compositional differences between normal and tumor tissues suggests some degree of active recruitment, particularly in SKCM where NK cell frequencies starkly increased, albeit expansion from tissue-resident pools cannot be excluded. Migration into the TME is regulated by a broad family of integrins, selectins and chemokine receptors that are differentially expressed during NK cell differentiation. CXCR3, primarily expressed on CD56bright NK cells, has been implicated in homing to several solid tumors based on CXCL10 gradients71,72, and thus may contribute to the predominance of this subset in tumors. CCL2, CCL3, CCL5, CXCL8, CXCL9, CXCL10 and CXCL12 have similarly been implicated in mediating predominantly CD56bright NK cell trafficking into solid tumors based on chemokine receptor expression69. Release of CCL3 and CCL5 by NK cells can also recruit CCR1-expressing immune cells, such as macrophages. We observed increased CXCR4 expression in group 1 and 2 cellular states, corresponding to CD56bright TrNK and TiNK cells. Previous reports73,74 have demonstrated CD44-induced CXCR4 upregulation resulting in increased migration and invasiveness of malignant cells. Notably, CD44 was highly expressed on the tumor-enriched stressed CD56bright state, alongside CXCR4 and CD74, possibly sensitizing this population to TME-mediated immunosuppression from CAFs, fibroblasts, endothelial and tumor cells, as noted by high scores for TGF-β signaling, hypoxia and ROS. High immunosuppression of this state is in line with the increased stressed response noted, as exemplified by high expression of the cellular stress response-associated TF ATF3, the HSP70 co-chaperone BAG3, the stressful growth arrest gene GADD45B and DUSP1, which is associated with cellular response to environmental stress.

Transcriptional stress response programs, including heat shock proteins, have previously been reported as a potential artefact downstream of digestion of tissues75. We therefore took several measures to rule out digestion artefacts when compiling the present resource. In addition to implementing upstream data-processing steps, including removal of ambient RNA using decontX76, we found no evidence for systematic artefactual stress signal coming from a particular study or tumor type. Perhaps most importantly, the stress signature defining the group 1 NK cell state was also found in spatial transcriptomics data directly on tissue sample sections that have not undergone any upstream tissue dissociation/digestion.

We also found high KLRC1 expression on the group 1 and 2 states, which, alongside high IFNG expression, can induce an inhibitory feedback loop, whereby local IFN-γ secretion leads to HLA-E upregulation resulting in inhibitory input through CD94/NKG2A. Conversely, the effector CD56dim state, associated with improved patient outcome, lacked CD44 expression and highly expressed GZMA. Notably, this state exhibited high expression of the KLF2, PRDM1, BATF, TBX21 and IKZF1 regulons, indicative of high effector function, regulation of homeostatic proliferation and survival, but also cell migration and tissue residency. Unique TiNK cell-specific regulons in this state consisted of NFYC, CTBP1, POLE4 and CEBPA, which are involved in DNA repair, monitoring of proliferation, regulating MHC expression and maintaining structural homeostasis in the Golgi complex7780. Conversely, TiNK cell-specific regulons in the stressed CD56bright state included hypoxia-induced MAFF, cellular stress response regulon ATF3 and EGR3 (ref. 81) which induce negative regulators in response to activation. Metabolically, the effector CD56dim state scored highly for oxidative phosphorylation, compared with the stressed CD56bright state which favored glycolysis, mTORC1 activation and exhibited upregulated nutrient transporters and genes associated with cholesterol homeostasis.

Contrary to Tang et al.16, increased gene signature scoring of the tumor-enriched states stressed that the CD56bright state did not consistently associate with reduced survival across tumor types. Instead, we observed increased survival in patients exhibiting a high effector CD56dim state, which was further augmented with a low signature for the stressed CD56bright state. Of the four CD56dim states, the effector CD56dim state was enriched across two tumor types, painting a promising picture for the role of solid TiNK cells.

This resource provides a transcriptional reference map of human NK cells across healthy blood and tissues with harmonized annotations of transcriptional NK cell subsets. Uncovering the dominant gene-regulatory circuits during NK cell differentiation enabled identification of TME-induced perturbations in solid TiNK cells across tumor type. We identified functionally distinct NK cell states across healthy and malignant tissues, including tumor-enriched states predictive of patient outcome. Modeling of the intercellular communication pathways of outcome predicting NK cell states with the surrounding TME identified potential pathways of TME-induced NK cell suppression. Thus, our analysis has the potential to design more potent NK cell therapy products able to resist suppressive factors operating within the TME of solid tumors. Ultimately, this resource can be extended endlessly through transfer learning to interrogate new datasets from experimental perturbations or different tumor types.

Methods

Cell processing

Peripheral mononuclear cells (PBMCs) were isolated using density gradient centrifugation from anonymized healthy blood donors (Oslo University Hospital; Karolinska University Hospital) with informed consent. The study was approved by the regional ethics committee in Norway (Regional etisk komité (REK): protocol no. 2018/2482) and Sweden (Regionala etikprövningsnämnden i Stockholm: protocol no. 2016/1415-32; Etikprövningsmyndigheten: protocol no. 2020-05289). Donor-derived PBMCs were screened for KIR education and adaptive status using flow cytometry. NK cells were purified using an AutoMACS (DepleteS program, Miltenyi Biotec) and before overnight resting in complete Roswell Park Memorial Institute (RPMI) 1640 (Cytiva) (10% fetal bovine serum (FBS; GE Healthcare), 2 mM l-glutamine (GE Healthcare)) at 37 °C and 5% CO2.

Flow cytometry screening

PBMCs were stained for surface antigens and viability in a 96 V-bottomed plate, followed by fixation/permeabilization and intracellular staining at 4 °C. The following antibodies were used in the screening panel: CD3-V500 (clone UCHT1), CD14-V500 (clone MφP9), CD19-V500 (clone HIB19) and Granzyme B-AF700 (clone GB11) from Beckton Dickinson; CD57-FITC (clone HNK-1), CD38-BV650 (clone HB-7) and CD158e1-BV421 (clone DX9) from BioLegend; CD158a-APC-Vio770 (clone REA284) and CD158a/h-PE-Vio770 (clone 11PB6) from Miltenyi Biotec; and CD158b1/b2,j-PE-Cy5.5 (clone GL183), CD159a-APC (clone Z199) and CD56-ECD (clone N901) from Beckman Coulter. LIVE/DEAD Fixable Aqua Dead Stain kit for 405-nM excitation (Life Technologies) was used to determine viability. Samples were acquired on an LSR-Fortessa equipped with a blue, red and violet laser and analyzed in FlowJo v.9 (TreeStar, Inc.).

FACS sorting

Cells were harvested and surface stained with the following antibodies: CD57-FITC (HNK-1) from BioLegend; CD158e1/e2-APC (clone Z27.3.7), CD56-ECD (clone N901) and CD158b1/b2,j-PE-Cy5.5 (clone GL183) from Beckman Coulter; and CD158a-APC-Vio770 (clone REA284), CD159c-PE (clone REA205) and CD159a-PE Vio770 (clone REA110) from Miltenyi Biotec. Cells, 12,000, were directly sorted into Eppendorf tubes at 4 °C for each sample using a FACSAriaII (Beckton Dickinson). Sorting strategies for scRNA-seq for the donor with and without an adaptive NK cell expansion are depicted in Extended Data Fig. 1c,d.

ScRNA-seq

After sorting, cells were kept on ice during the washing (phosphate-buffered saline (PBS) + 0.05% bovine serum albumin (BSA)) and counting steps. Cells, 10,000, were resuspended in 35 μl of PBS + 0.05% BSA and immediately processed at the Genomics Core Facility (Oslo University Hospital) using the Chromium Single Cell 3′ Library & Gel Bead Kit v.2 (Chromium Controller System, 10x Genomics). The recommended 10x Genomics protocol was used to generate the sequencing libraries, which was performed on a NextSeq500 (Illumina) with ~5% PhiX as spike-in. Sequencing raw data were converted into fastq files by running Illumina’s bcl2fastq v.2.

ScRNA-seq data collection and processing

Previously published scRNA-seq data were collected mostly in the form of count matrices already aligned to GRCh38; the rest were collected as fastq files. For the datasets where we collected fastq files, the data were aligned to GRCh38 using Cell Ranger (10x Genomics Cell Ranger 7.0.0).

Quality control and normalization of scRNA-seq data

Data-cleaning steps were first carried out whereby cells not expressing a minimum of 1,000 molecules and genes expressed by <10 cells were filtered out. Doublets were removed using the SOLO algorithm82. The count matrices for all the tumor and tissue types were corrected for ambient RNA using decontX76. The data were normalized using log(transformation) for some of the downstream analysis as well as for visualization of gene expression-like dot plots. Quality control, transformation and most of the visualization of the gene expression data were performed using Scanpy83. For analysis using scVI and scANVI, the raw count data were used.

Integration of scRNA-seq data

The probabilistic models scVI and scANVI, as implemented in scvi-tools19, were used for integration of scRNA-seq data. These methods have been shown to perform well for integration of scRNA-seq data, especially when dealing with complex batch effects and integrating atlas-level data84. For cell-type and -subset annotations and prediction, scANVI was used to capture annotation of single-cell profiles. For the analysis of PB-NK subsets, the sorted subsets provided labels for training the scANVI model. The subset prediction provided by the model was tested on a held-out set of cells (15%) from the sorted subset data, giving us a confusion matrix summarizing the performance of the prediction.

Dimensionality reduction, clustering and visualization of scRNA-seq data

We computed the Uniform Manifold Approximation and Projection (UMAP) embeddings for visualization using the embedding learned from scVI and scANVI. Unsupervised clustering was also carried out using this learned embedding with Phenograph and the Leiden algorithm as implemented in Scanpy. PAGA29 was used to quantify the connectivity of different groups of cells, thereby providing a representation of the data as a simpler graph. The various plots were mostly generated using the plotting functions in Scanpy.

Cell-type annotations and harmonization

For many of the publicly available datasets, cell-type annotations were readily available and used as seed labels when training the scANVI model for that particular tissue/tumor type to annotate the nonimmune cells. The scANVI model allowed us to harmonize annotations that were needed for analysis across datasets. All immune cells for all tissue types were integrated using scVI and annotated using CellTypist42. The same was done for all immune cells across all tumor types. The CD16 and CD16+ NK cells identified by CellTypist were annotated as CD56bright and CD56dim, respectively. Where CITE-seq data were available, the surface expression of key markers also helped validate the cell-type annotations. For the identified NK cells, the cells were also scored using NK1/NK2 (CD56bright/CD56dim) signatures to validate the annotation of CD56bright and CD56dim NK cells. We also performed our own unsupervised Leiden clustering, which identified two dominating clusters corresponding to CD56bright and CD56dim NK cells.

Calculation of signature scores

Signature scores were computed using AUCell21, allowing for exploration of the relative expression of the signatures of interest in the datasets. Various gene sets were taken from the MSigDB Hallmark gene set collection85.

Pseudotime and RNA velocity analysis

Pseudotime was computed using Palantir26, which captures the continuous nature of differentiation, and cell fate, which allowed us to explore two terminal states and the gene expression changes seen along these trajectories. For this analysis, the starting cell was defined as the cell that was the least CD56dim (the lowest score for the NK1 signature). GAMs fitted on cells ordered by pseudotime were used to calculate gene trends, where the contribution of cells was weighted by their probability to end up in the given terminal state as calculated by Palantir. The gene trends indicate how gene expression levels develop over the differentiation timeline. These trends were clustered using the Leiden clustering algorithm to give us five clusters of gene trends. RNA velocity28 was also used to take advantage of splicing kinetics to identify directed dynamic information. We used velocyto28 and scVelo27 for this analysis, specifically the dynamic model implemented in the scVelo toolkit. The RNA velocity analysis was run on the 2 donors where sorted subsets were sequenced separately, as well as on the integrated data from 12 blood donors.

GRN analysis

SCENIC21 was used to infer TFs and GRNs from the scRNA-seq data. The SCENIC workflow86 was followed and the pySCENIC implementation was used. TF–gene associations were inferred by GRNBoost87 and motif–TF associations were downloaded from Aerts’s lab website and used for pruning the inferred associations. The inferred regulatory networks were also further pruned by removing lowly expressed TFs based on the bulk RNA-seq data. AUCell was used to compute the activity of the final regulons. The regulon activity was visualized using matrix plots, as implemented in Scanpy, to look at the activity across different groups of cells.

Bulk RNA-seq for TF and target validation

For validation of the TF and targets, we checked their expression in bulk RNA-seq data from four sorted NK cell populations (CD56bright, NKG2AKIR-CD56dim, NKG2AKIR+CD56dim and NKG2AKIR+NKG2C+CD56dim). Sequencing was performed using single-cell tagged reverse transcription88.

Reference mapping

The TiNK cells were added after the model for a healthy NK cell reference was trained. Then, scArches44 as implemented in scvi-tools19 was used to map these new data on to the established reference.

Cell–cell communication inference using CellChat

To infer the communication between the various cell types in the tumor datasets we used CellChat48. Based on gene expression of receptors and ligands in the data and a curated database of pathways, CellChat computes the communication probability between various receptor–ligand pairs. CellChat also provided ways to aggregate this information and for us to visualize the inferred cell–cell communication networks. CellChat was computed separately for each of the tumor types included in the analysis.

Differential gene expression analysis

To perform differential gene expression analysis we used pseudobulk because this has shown good results when analyzing scRNA-seq data in various studies89. This allowed us to aggregate up counts for each sample and consider the samples instead of the cells as replicates. We then used edgeR90 on the pseudobulk data. We could then identify DEGs between healthy reference NK cells and TiNK cells within and across subsets.

Differential abundance analysis using Milo

We used Milo47 to assign cells to neighborhoods on the k-nearest neighbors graph (k-NNG). The scVI representation of the cells was used for building the k-NNG. This allowed us to have a batch-corrected representation of the cells as input to this analysis. The differential abundance of the neighborhoods between the healthy reference and the TiNK cells was then computed. The neighborhoods were grouped into six groups using the groupNhoods function in Milo. These groups were considered as different NK cell states and further characterized using the functions in Milo for identification of DEGs. The differential expression analysis was done using pseudobulk by aggregating gene expression per sample. The single cells were then annotated using these groups for downstream analysis.

GSEA

GSEA was performed using the GSEA software91 and the MSigDB collection of gene sets. Genes were first ordered based on the differential expression analysis based on either the pseudobulk approach or the Milo analysis.

Spatial transcriptomics

Spatial transcriptomics datasets from lung tumor, glioblastoma and melanoma were collected from the 10x Genomics website (https://www.10xgenomics.com/datasets). Squidpy92 was used for preprocessing and segmentation and Tangram49 was used for deconvolution using our annotated scRNA-seq data for each of the tumor types as reference. The deconvolution was performed with the NK cells annotated as CD56bright and CD56dim, as well using the group annotations established in this paper.

Clinical and bulk RNA-seq data from TCGA and TARGET

Bulk RNA-seq data and clinical data were downloaded from TCGA and TARGET using TCGAbiolinks53 and curated survival data were downloaded from Xena54.

Deconvolution of bulk RNA-seq

Deconvolution of the bulk RNA-seq data was performed for each of the tumor types using BayesPrism52. BayesPrism has been shown to work well for deconvolution of data from tumors and especially well in dealing with high cell-type granularity93. The annotated reference datasets for each of the tumor types were used as prior information in the deconvolution. BayesPrism then computed both an expression matrix for each cell type and the cell-type fraction for each sample.

Survival analysis

The NK expression matrix inferred by BayesPrism for the various tumor types was used to score the signature genes for each of the identified NK cell states. The patients were then assigned as high and low for a group/state based on belonging to the highest or lowest half in terms of expression of these signature genes within the group of patients with a specific tumor type. The high and low designations could then be combined in an approach where a patient could be assigned as high or low in multiple groups. Survival analysis was conducted using Cox’s proportional hazards model from the R package survival94, adjusting for confounding clinical factors such as tumor stage, gender and age. Subsequently, survival curves were derived using the Kaplan–Meier method within the same package. For visualization, the ggsurvplot function of the survminer package in R was utilized.

Samples from patients with primary NSCLC

The patient cohort, processing of tissue specimens and flow cytometry staining were collected and performed as previously described45.

Functional assay using A549 cells

A549 cells were cultured in Dulbecco’s modified Eagle’s medium/high glucose with l-glutamine, sodium pyruvate (Cytiva) + 10% heat-inactivated FBS (Sigma-Aldrich) at 37 °C in 5% CO2. A549 cells, 20,000, were seeded per well in a 96-well F-bottom plate and pre-treated with and without 50 ng ml−1 of IFN-γ (PeptroTech) for 24 h before addition of NK cells. HLA-E expression after IFN-γ stimulation was evaluated using HLA-E–PE antibody (BioLegend, clone 3D12). NK cells were isolated using negative selection (NK cell isolation kit, Miltenyi Biotec) from previously cryopreserved PBMCs from healthy individuals. Cells were activated overnight with 5 ng ml−1 of IL-15 (R&D) in RPMI 1640 (Cytiva) + 10% heat-inactivated FBS at 37 °C in 5% CO2. NK cells were washed, resuspended in RPMI 1640 + 10% FBS and pre-incubated with and without α-NKG2A (a monalizumab biosimilar: immunoglobulin (Ig)G1 with PGLALA mutation, Merck) for 20 min prior. Target cells were washed in PBS before the addition of NK cells at a 1:1 effector:target (E:T) ratio in the presence of brefeldin A (GolgiPlug, 1:1,000, BD Biosciences), monensin (GolgiStop, 1:1,500, BD Biosciences) and anti-CD107a-BUV394 (BD Horizon, clone H4A3). After a 4-h incubation, the cells were stained with anti-IgG Fc–PE (Invitrogen), followed by surface, fixation and permeabilization (Cytofix/Cytoperm, BD) and finally intracellular staining using the following antibodies: CD159a-VioBright FITC (Miltenyi Biotec, clone REA110), Granzyme B-AF700 (BD, clone GB11), CD16-Pacific Blue (BD, clone 3G8), CD3-V500 (BD, clone UCHT1), TNF-α-BV650 (BioLegend, clone Mab11), IFN-γ-BV785 (BioLegend, clone 4S.B3), CD56-ECD (Beckman Coulter, clone N901) and perforin–PE-Cy7 (eBioscience, clone dG9), LIVE/DEAD Fixable Aqua Dead Cell Stain kit (Thermo Fisher Scientific).

Reagents and antibodies

A full list containing company information, catalog nos and antibody clones for all reagents can be found in Supplementary Data.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Online content

Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at 10.1038/s41590-024-01884-z.

Supplementary information

Supplementary Information (1MB, pdf)

Supplementary Fig. 1 and Tables 1–5 including references.

Reporting Summary (5.6MB, pdf)
Peer Review File (576KB, pdf)
Supplementary Data 1 (11.8KB, xlsx)

Reagent and antibody information, including catalog nos and titrations.

Source data

Source Data Figs. 1, 4 and 6 and Extended Data Figs. 6 and 9 (21.9KB, xlsx)

Statistical source data.

Acknowledgements

Large parts of the analyses were run using the Machine learning infrastructure (ML Nodes), University Centre for Information Technology, University of Oslo, Norway. This publication is part of the Human Cell Atlas (www.humancellatlas.org/publications), HCA-106. We thank the flow cytometry and genetics core facilities at Oslo University Hospital. We thank Merck KGaA for providing tool reagents. This work was supported by the Swedish Research Council (grant nos 223310 to K.-J.M. and 2021-03069 and 2021-01039 to N.M.), the Swedish Children’s Cancer Society (grant no. PR2020-1059 to K.-J.M.), the Swedish Cancer Society (grant nos 21-1793Pj to K.-J.M., 22-2319Pj to N.M. and 23-2946Pj to J.M.), Sweden’s Innovation Agency (K.-J.M.), the Center for Innovative Medicine (CIMED, grant no. 20200680 to N.M.), the Tornspiran Foundation (N.M.), the Karolinska Institutet (K.-J.M.), the Research Council of Norway (grant nos 275469 and 237579 to K.-J.M.), Center of Excellence: Precision Immunotherapy Alliance (grant no. 332727 to K.-J.M.), the Norwegian Cancer Society (grant nos 190386 and 223310 to K.-J.M.), the South-Eastern Norway Regional Health Authority (grant nos 2021-073 and 2024-053 to K.-J.M.), EU H2020-MSCA Research and Innovation program (grant no. 801133 to K.-J.M.), the Knut and Alice Wallenberg Foundation (grant no. 2018.0106 to K.-J.M.), the Swedish Foundation for Strategic Research (K.-J.M.) and the US National Cancer Institute (grant nos P01 CA111412 and P009500901 to K.-J.M. and R21AI130760 to A.H.).

Extended data

Author contributions

J.P.G., A.H. and A.P. performed the scRNA-seq experiments. A.P. performed the in vitro experiments and analysis. N.M., D.B. and J.M. provided the data on patients with NSCLC (flow cytometry). H.N. performed the computations. H.N. and A.P. performed the bioinformatic analysis. E.S., T.C., O.D., S.A.T., A.H. and K.-J.M. provided scientific input. A.P. wrote the manuscript with support from H.N., A.H., and K.-J.M. All authors edited the manuscript.

Peer review

Peer review information

Nature Immunology thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available. Primary Handling Editor: Jamie D. K. Wilson, in collaboration with the Nature Immunology team.

Funding

Open access funding provided by Karolinska Institute.

Data availability

The gene expression data generated for this paper are available at the National Center for Biotechnology Information’s Gene Expression Omnibus with accession no. GSE245690 and raw sequencing data are available at the European Genome–Phenome Archive with accession no. EGAS50000000014. The details about the publicly available data included in the analysis are available in Supplementary Tables 1, 2 and 3. For GSEA the Molecular Signature Database (v.2023.2.Hs), available at https://www.gsea-msigdb.org/gsea/msigdb, was used. Relevant gene sets for scoring were also retrieved from this database. Bulk RNA-seq data were downloaded from TCGA and TARGET. Curated survival data were downloaded from Xena. Processed data and models have also been made available via Zenodo at 10.5281/zenodo.8434223 (ref. 95) and as an online resource at http://nk-scrna.malmberglab.com. Source data are provided with this paper.

Code availability

The code generated for our analysis is available on GitHub at https://github.com/hernet/transcriptional-map-nk.

Competing interests

J.P.G. is an employee at Fate Therapeutics. T.C. is an employee at NEC OncoImmunity AS. K.-J.M. is a consultant at Fate Therapeutics and Vycellix and has research support from Fate Therapeutics, Oncopeptides for studies unrelated to this work. O.D. has received research funding from Gilead Sciences and Incyte, unrelated to this work, and personal fees from Sanofi, unrelated to this work. S.A.T. is a scientific advisory board member of ForeSite Labs, QIAGEN and Element Biosciences, and a co-founder and equity holder of TransitionBio and EnsoCell Therapeutics, and a part-time employee of GlaxoSmithKline. A.H. has received funding from Astra Zeneca/MedImmune, unrelated to this work. A.H. is a consultant at Purple BioTech. The other authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Herman Netskar, Aline Pfefferle.

These authors jointly supervised this work: Amir Horowitz, Karl-Johan Malmberg.

Contributor Information

Aline Pfefferle, Email: aline.pfefferle@ki.se.

Karl-Johan Malmberg, Email: k.j.malmberg@medisin.uio.no.

Extended data

is available for this paper at 10.1038/s41590-024-01884-z.

Supplementary information

The online version contains supplementary material available at 10.1038/s41590-024-01884-z.

References

  • 1.Moretta, A., Bottino, C., Mingari, M. C., Biassoni, R. & Moretta, L. What is a natural killer cell? Nat. Immunol.3, 6–8 (2002). [DOI] [PubMed] [Google Scholar]
  • 2.Crinier, A. et al. High-dimensional single-cell analysis identifies organ-specific signatures and conserved NK cell subsets in humans and mice. Immunity49, 971–986.e5 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Cooper, M. A., Fehniger, T. A. & Caligiuri, M. A. The biology of human natural killer-cell subsets. Trends Immunol.22, 633–640 (2001). [DOI] [PubMed] [Google Scholar]
  • 4.Horowitz, A. et al. Genetic and environmental determinants of human NK cell diversity revealed by mass cytometry. Sci. Transl. Med.5, 208ra145 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Horowitz, A. et al. Class I HLA haplotypes form two schools that educate NK cells in different ways. Sci. Immunol.1, eaag1672 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Goodridge, J. P., Önfelt, B. & Malmberg, K.-J. Newtonian cell interactions shape natural killer cell education. Immunol. Rev.267, 197–213 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Björkström, N. K. et al. Expression patterns of NKG2A, KIR, and CD57 define a process of CD56dim NK-cell differentiation uncoupled from NK-cell education. Blood116, 3853–3864 (2010). [DOI] [PubMed] [Google Scholar]
  • 8.Schlums, H. et al. Cytomegalovirus infection drives adaptive epigenetic diversification of NK cells with altered signaling and effector function. Immunity42, 443–456 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Lopez-Vergès, S. et al. CD57 defines a functionally distinct population of mature NK cells in the human CD56dimCD16+ NK-cell subset. Blood116, 3865–3874 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Juelke, K. et al. CD62L expression identifies a unique subset of polyfunctional CD56dim NK cells. Blood116, 1299–1307 (2010). [DOI] [PubMed] [Google Scholar]
  • 11.Collins, P. L. et al. Gene regulatory programs conferring phenotypic identities to human NK cells. Cell176, 348–360.e12 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Smith, S. L. et al. Diversity of peripheral blood human NK cells identified by single-cell RNA sequencing. Blood Adv.4, 1388–1406 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Melsen, J. E. et al. Single-cell transcriptomics in bone marrow delineates CD56dim Granzyme K+ subset as intermediate stage in NK cell differentiation. Front. Immunol.13, 1044398 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Holmes, T. D. et al. The transcription factor Bcl11b promotes both canonical and adaptive NK cell differentiation. Sci. Immunol.6, eabc9801 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 15.Rückert, T., Lareau, C. A., Mashreghi, M.-F., Ludwig, L. S. & Romagnani, C. Clonal expansion and epigenetic inheritance of long-lasting NK cell memory. Nat. Immunol.23, 1551–1563 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Tang, F. et al. A pan-cancer single-cell panorama of human natural killer cells. Cell186, 4235–4251.e20 (2023). [DOI] [PubMed] [Google Scholar]
  • 17.Rood, J. E., Maartens, A., Hupalowska, A., Teichmann, S. A. & Regev, A. Impact of the Human Cell Atlas on medicine. Nat. Med.28, 2486–2496 (2022). [DOI] [PubMed] [Google Scholar]
  • 18.Yang, C. et al. Heterogeneity of human bone marrow and blood natural killer cells defined by single-cell transcriptome. Nat. Commun.10, 3931 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Gayoso, A. et al. A Python library for probabilistic analysis of single-cell omics data. Nat. Biotechnol.40, 163–166 (2022). [DOI] [PubMed] [Google Scholar]
  • 20.Haghverdi, L., Buettner, F. & Theis, F. J. Diffusion maps for high-dimensional single-cell analysis of differentiation data. Bioinformatics31, 2989–2998 (2015). [DOI] [PubMed] [Google Scholar]
  • 21.Aibar, S. et al. SCENIC: single-cell regulatory network inference and clustering. Nat. Methods14, 1083–1086 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Scheiter, M. et al. Proteome analysis of distinct developmental stages of human natural killer (NK) cells. Mol. Cell. Proteom.12, 1099–1114 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Goodridge, J. P. et al. Remodeling of secretory lysosomes during education tunes functional potential in NK cells. Nat. Commun.10, 514 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Xu, C. et al. Probabilistic harmonization and annotation of single-cell transcriptomics data with deep generative models. Mol. Syst. Biol.17, e9620 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Vivier, E. et al. High-dimensional single-cell analysis of natural killer cell heterogeneity in human blood. Preprint at Research Square10.21203/rs.3.rs-3870228/v1 (2024).
  • 26.Setty, M. et al. Characterization of cell fate probabilities in single-cell data with Palantir. Nat. Biotechnol.37, 451–460 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 27.Bergen, V., Lange, M., Peidli, S., Wolf, F. A. & Theis, F. J. Generalizing RNA velocity to transient cell states through dynamical modeling. Nat. Biotechnol.38, 1408–1414 (2020). [DOI] [PubMed] [Google Scholar]
  • 28.Manno, G. L. et al. RNA velocity of single cells. Nature560, 494–498 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Wolf, F. A. et al. PAGA: graph abstraction reconciles clustering with trajectory inference through a topology preserving map of single cells. Genome Biol.20, 59 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Chaves, P. et al. Loss of canonical notch signaling affects multiple steps in NK cell development in mice. J. Immunol.201, 3307–3319 (2018). [DOI] [PubMed] [Google Scholar]
  • 31.Nagel, S. et al. Polycomb repressor complex 2 regulates HOXA9 and HOXA10, activating ID2 in NK/T-cell lines. Mol. Cancer9, 151 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Balzarolo, M., Watzl, C., Medema, J. P. & Wolkers, M. C. NAB2 and EGR-1 exert opposite roles in regulating TRAIL expression in human natural killer cells. Immunol. Lett.151, 61–67 (2013). [DOI] [PubMed] [Google Scholar]
  • 33.Wiencke, J. K. et al. The DNA methylation profile of activated human natural killer cells. Epigenetics11, 363–380 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Cho, Y. et al. The basic helix-loop-helix proteins differentiated embryo chondrocyte (DEC) 1 and DEC2 function as corepressors of retinoid X receptors. Mol. Pharmacol.76, 1360–1369 (2009). [DOI] [PubMed] [Google Scholar]
  • 35.Adams, N. M. et al. Transcription factor IRF8 orchestrates the adaptive natural killer cell response. Immunity48, 1172–1182.e6 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Mace, E. M. et al. Biallelic mutations in IRF8 impair human NK cell maturation and function. J. Clin. Invest.127, 306–320 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.Goh, W. et al. IKAROS and AIOLOS directly regulate AP-1 transcriptional complexes and are essential for NK cell development. Nat. Immunol.25, 240–255 (2024). [DOI] [PubMed] [Google Scholar]
  • 38.Wang, Y. et al. The IL-15-AKT-XBP1s signaling pathway contributes to effector functions and survival in human NK cells. Nat. Immunol.20, 10–17 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Rabacal, W. et al. Transcription factor KLF2 regulates homeostatic NK cell proliferation and survival. Proc. Natl Acad. Sci. USA113, 5370–5375 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Li, M. et al. DDIT3 directs a dual mechanism to balance glycolysis and oxidative phosphorylation during glutamine deprivation. Adv. Sci.8, e2003732 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 41.Kallies, A. et al. A role for Blimp1 in the transcriptional network controlling natural killer cell maturation. Blood117, 1869–1879 (2011). [DOI] [PubMed] [Google Scholar]
  • 42.Domínguez Conde, C. et al. Cross-tissue immune cell analysis reveals tissue-specific features in humans. Science376, eabl5197 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.Mazzurana, L. et al. Tissue-specific transcriptional imprinting and heterogeneity in human innate lymphoid cells revealed by full-length single-cell RNA-sequencing. Cell Res.31, 554–568 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 44.Lotfollahi, M. et al. Mapping single-cell data to reference atlases by transfer learning. Nat. Biotechnol. 40, 121–130 (2021). [DOI] [PMC free article] [PubMed]
  • 45.Brownlie, D. et al. Accumulation of tissue-resident natural killer cells, innate lymphoid cells, and CD8+ T cells towards the center of human lung tumors. Oncoimmunology12, 2233402 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Combes, A. J., Samad, B. & Krummel, M. F. Defining and using immune archetypes to classify and treat cancer. Nat. Rev. Cancer23, 491–505 (2023). [DOI] [PubMed] [Google Scholar]
  • 47.Dann, E., Henderson, N. C., Teichmann, S. A., Morgan, M. D. & Marioni, J. C. Differential abundance testing on single-cell data using k-nearest neighbor graphs. Nat. Biotechnol.40, 245–253 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Jin, S. et al. Inference and analysis of cell-cell communication using CellChat. Nat. Commun.12, 1088 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 49.Biancalani, T. et al. Deep learning and alignment of spatially resolved single-cell transcriptomes with Tangram. Nat. Methods18, 1352–1362 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Nersesian, S. et al. NK cell infiltration is associated with improved overall survival in solid cancers: a systematic review and meta-analysis. Transl. Oncol.14, 100930 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Cursons, J. et al. A gene signature predicting natural killer cell infiltration and improved survival in melanoma patients. Cancer Immunol. Res.7, 1162–1174 (2019). [DOI] [PubMed] [Google Scholar]
  • 52.Chu, T., Wang, Z., Pe’er, D. & Danko, C. G. Cell type and gene expression deconvolution with BayesPrism enables Bayesian integrative analysis across bulk and single-cell RNA sequencing in oncology. Nat. Cancer3, 505–517 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 53.Colaprico, A. et al. TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Res.44, e71 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 54.Goldman, M. J. et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat. Biotechnol.38, 675–678 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 55.Béziat, V., Descours, B., Parizot, C., Debré, P. & Vieillard, V. NK cell terminal differentiation: correlated stepwise decrease of NKG2A and acquisition of KIRs. PLoS ONE5, e11966 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 56.Béziat, V. et al. NK cell responses to cytomegalovirus infection lead to stable imprints in the human KIR repertoire and involve activating KIRs. Blood121, 2678–2688 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 57.Lee, J. et al. Epigenetic modification and antibody-dependent expansion of memory-like NK cells in human cytomegalovirus-infected individuals. Immunity42, 431–442 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 58.Dogra, P. et al. Tissue determinants of human NK cell development, function, and residence. Cell180, 749–763.e13 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 59.Poon, M. M. L. et al. Tissue adaptation and clonal segregation of human memory T cells in barrier sites. Nat. Immunol.24, 309–319 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 60.Szabo, P. A., Miron, M. & Farber, D. L. Location, location, location: tissue resident memory T cells in mice and humans. Sci. Immunol.4, eaas9673 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 61.Kumar, B. V. et al. Human tissue-resident memory T cells are defined by core transcriptional and functional signatures in lymphoid and mucosal sites. Cell Rep.20, 2921–2934 (2017). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 62.Melsen, J. E. et al. Human bone marrow-resident natural killer cells have a unique transcriptional profile and resemble resident memory CD8+ T cells. Front. Immunol.9, 1829 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 63.de Andrade, L. F. et al. Discovery of specialized NK cell populations infiltrating human melanoma metastases. JCI Insight4, e133103 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 64.Jameson, G. & Robinson, M. W. Insights Into human intrahepatic NK cell function from single cell RNA sequencing datasets. Front. Immunol.12, 649311 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 65.Vecchi, L. et al. Phospholipase A2 drives tumorigenesis and cancer aggressiveness through its interaction with annexin A1. Cells10, 1472 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 66.Cantoni, C. et al. NK cells, tumor cell transition, and tumor progression in solid malignancies: new hints for NK-based immunotherapy? J. Immunol. Res.2016, 4684268 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 67.Platonova, S. et al. Profound coordinated alterations of intratumoral NK cell phenotype and function in lung carcinoma. Cancer Res.71, 5412–5422 (2011). [DOI] [PubMed] [Google Scholar]
  • 68.Carrega, P. et al. CD56brightperforinlow noncytotoxic human NK cells are abundant in both healthy and neoplastic solid tissues and recirculate to secondary lymphoid organs via afferent lymph. J. Immunol.192, 3805–3815 (2014). [DOI] [PubMed] [Google Scholar]
  • 69.Lachota, M. et al. Mapping the chemotactic landscape in NK cells reveals subset-specific synergistic migratory responses to dual chemokine receptor ligation. eBioMedicine96, 104811 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 70.Carrega, P. et al. Natural killer cells infiltrating human nonsmall-cell lung cancer are enriched in CD56 bright CD16 cells and display an impaired capability to kill tumor cells. Cancer112, 863–875 (2008). [DOI] [PubMed] [Google Scholar]
  • 71.Rezaeifard, S., Talei, A., Shariat, M. & Erfani, N. Tumor infiltrating NK cell (TINK) subsets and functional molecules in patients with breast cancer. Mol. Immunol.136, 161–167 (2021). [DOI] [PubMed] [Google Scholar]
  • 72.Wendel, M., Galani, I. E., Suri-Payer, E. & Cerwenka, A. Natural killer cell accumulation in tumors is dependent on IFN-gamma and CXCR3 ligands. Cancer Res.68, 8437–8445 (2008). [DOI] [PubMed] [Google Scholar]
  • 73.Bao, W. et al. HER2 interacts with CD44 to up-regulate CXCR4 via epigenetic silencing of microRNA-139 in gastric cancer cells. Gastroenterology141, 2076–2087.e6 (2011). [DOI] [PubMed] [Google Scholar]
  • 74.Xie, P. et al. CD44 potentiates hepatocellular carcinoma migration and extrahepatic metastases via the AKT/ERK signaling CXCR4 axis. Ann. Transl. Med.10, 689 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 75.O’Flanagan, C. H. et al. Dissociation of solid tumor tissues with cold active protease for single-cell RNA-seq minimizes conserved collagenase-associated stress responses. Genome Biol.20, 210 (2019). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 76.Yang, S. et al. Decontamination of ambient RNA in single-cell RNA-seq with DecontX. Genome Biol.21, 57 (2020). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 77.Zhu, X. S. et al. Transcriptional scaffold: CIITA interacts with NF-Y, RFX, and CREB to cause stereospecific regulation of the class II major histocompatibility complex promoter. Mol. Cell. Biol.20, 6051–6061 (2000). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 78.Porse, B. T. et al. Loss of C/EBP alpha cell cycle control increases myeloid progenitor proliferation and transforms the neutrophil granulocyte lineage. J. Exp. Med.202, 85–96 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 79.Colanzi, A. et al. Molecular mechanism and functional role of brefeldin A-mediated ADP-ribosylation of CtBP1/BARS. Proc. Natl Acad. Sci. USA110, 9794–9799 (2013). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 80.Bellelli, R. et al. POLE3-POLE4 Is a Histone H3-H4 Chaperone that Maintains Chromatin Integrity during DNA Replication. Mol. Cell72, 112–126.e5 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 81.Li, S. et al. The transcription factors Egr2 and Egr3 are essential for the control of inflammation and antigen-induced proliferation of B and T cells. Immunity37, 685–696 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 82.Bernstein, N. J. et al. Solo: doublet identification in single-cell RNA-seq via semi-supervised deep learning. Cell Systems11, 95–101.e5 (2020). [DOI] [PubMed] [Google Scholar]
  • 83.Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol.19, 15 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.Luecken, M. D. et al. Benchmarking atlas-level data integration in single-cell genomics. Nat. Methods19, 41–50 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Liberzon, A. et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Systems1, 417–425 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 86.Van de Sande, B. et al. A scalable SCENIC workflow for single-cell gene regulatory network analysis. Nat. Protoc.15, 2247–2276 (2020). [DOI] [PubMed] [Google Scholar]
  • 87.Moerman, T. et al. GRNBoost2 and Arboreto: efficient and scalable inference of gene regulatory networks. Bioinformatics35, 2159–2161 (2019). [DOI] [PubMed] [Google Scholar]
  • 88.Islam, S. et al. Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. Genome Res.21, 1160–1167 (2011). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 89.Murphy, A. E. & Skene, N. G. A balanced measure shows superior performance of pseudobulk methods in single-cell RNA-sequencing analysis. Nat. Commun.13, 7851 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 90.Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics26, 139–140 (2010). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 91.Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA102, 15545–15550 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 92.Palla, G. et al. Squidpy: a scalable framework for spatial omics analysis. Nat. Methods19, 171–178 (2022). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 93.Tran, K. A. et al. Performance of tumour microenvironment deconvolution methods in breast cancer using single-cell simulated bulk mixtures. Nat. Commun.14, 5758 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 94.Therneau, T. A Package for Survival Analysis in R. R package version 3.5-7. CRANhttps://CRAN.R-project.org/package=survival (2023).
  • 95.Netskar, H., Pfefferle, A. & Malmberg, K.-J. Pan-cancer profiling of tumor-infiltrating natural killer cells through transcriptional reference mapping. Zenodo10.5281/zenodo.8434223 (2024). [DOI] [PMC free article] [PubMed]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Information (1MB, pdf)

Supplementary Fig. 1 and Tables 1–5 including references.

Reporting Summary (5.6MB, pdf)
Peer Review File (576KB, pdf)
Supplementary Data 1 (11.8KB, xlsx)

Reagent and antibody information, including catalog nos and titrations.

Source Data Figs. 1, 4 and 6 and Extended Data Figs. 6 and 9 (21.9KB, xlsx)

Statistical source data.

Data Availability Statement

The gene expression data generated for this paper are available at the National Center for Biotechnology Information’s Gene Expression Omnibus with accession no. GSE245690 and raw sequencing data are available at the European Genome–Phenome Archive with accession no. EGAS50000000014. The details about the publicly available data included in the analysis are available in Supplementary Tables 1, 2 and 3. For GSEA the Molecular Signature Database (v.2023.2.Hs), available at https://www.gsea-msigdb.org/gsea/msigdb, was used. Relevant gene sets for scoring were also retrieved from this database. Bulk RNA-seq data were downloaded from TCGA and TARGET. Curated survival data were downloaded from Xena. Processed data and models have also been made available via Zenodo at 10.5281/zenodo.8434223 (ref. 95) and as an online resource at http://nk-scrna.malmberglab.com. Source data are provided with this paper.

The code generated for our analysis is available on GitHub at https://github.com/hernet/transcriptional-map-nk.


Articles from Nature Immunology are provided here courtesy of Nature Publishing Group

RESOURCES