Author manuscript; available in PMC: 2026 Mar 31.
Published in final edited form as: Nat Immunol. 2025 Oct 21;26(11):2100–2111. doi: 10.1038/s41590-025-02297-2

Single-cell RNA profiling of blood CD4+ T cells identifies distinct helper and dysfunctional regulatory clusters in children with SLE

Preetha Balasubramanian 1,2,9, Uthra Balaji 1,2,9, Marina Silva Santos 1,2, Jeanine Baisch 1,2, Cynthia Smitherman 1,2, Lynnette Walters 3, Paola Sparagana 3, Lorien Nassi 3,4,5, Katie Stewart 3,4,5, Julie Fuller 3,4,5, Terry Means 6, Virginia Savova 7, Jacques F Banchereau 8, Tracey Wright 3,4,5, Virginia Pascual 1,2, Jinghua Gu 1,2, Simone Caielli 1,2
PMCID: PMC13034613  NIHMSID: NIHMS2151078  PMID: 41120754

Abstract

To characterize the complexity of the CD4+ T cell compartment in patients with systemic lupus erythematosus (SLE), we performed single-cell RNA sequencing of sorted blood CD4+ T cells from pediatric patients and healthy donors. We identified naive, memory, regulatory T (Treg) cell, proliferative and interferon-stimulated gene-high (ISG-high) clusters. Within the memory compartment, both follicular and peripheral helper cells were expanded in patients with lupus nephritis and/or high disease activity. Cytotoxic signatures were enriched in effector memory T cells re-expressing CD45RA (TEMRA), as well as in two memory subclusters, one of which overlapped with T helper 10-like cells (TH10). Notably, we observed an expansion of dysfunctional Treg cells in patients with lupus nephritis, along with upregulation of TLR5 and FCRL3 in naive Treg cells from patients with SLE, suggesting a potential link with mucosal microbial dysbiosis. These findings highlight distinct CD4+ T cell subsets that may contribute to aberrant antibody responses and impaired immune regulation in SLE.


SLE is a systemic autoimmune disease characterized by clinical and molecular heterogeneity1. Pathogenic hallmarks include nucleic-acid-specific autoantibodies and increased interferon (IFN) activity2. Childhood-onset SLE accounts for up to 20% of SLE cases, and a higher proportion of pediatric patients (up to 85%) develop lupus nephritis (LN) compared to adults3.

Follicular helper T (TFH) cells contribute to autoimmunity in both mice and humans4, but extrafollicular helper T cells also drive the expansion of autoreactive B cells5. Thus, CXCR5−PD-1hi peripheral helper T (TPH) cells were identified in both the synovium of patients with rheumatoid arthritis5 and blood of adults with SLE6. Like TFH cells, these cells recruit and activate B cells through CXCL13 and IL-21 production. In addition, CXCR5−CXCR3+PD-1hi cells expressing IFNγ and IL-10 (TH10) are expanded in the blood of children with SLE and the kidney of patients with LN7. Importantly, unlike T regulatory 1 (TR-1) cells8, TH10 cells lack regulatory properties but support differentiation of naive and memory B cells into plasmablasts in vitro7.

Dysfunctional effector CD4+ T cells could therefore represent a therapeutic opportunity in SLE. To dissect their heterogeneity, we performed single-cell RNA sequencing (scRNA-seq) of sorted blood CD4+ T cells from patients with SLE and healthy donors (HD). These analyses mapped TH10 and TPH cells within distinct subclusters (SCs), defined unique CD4+ cytotoxic programs, and uncovered phenotypic and functional alterations in SLE Treg cells, especially those from patients with LN.

Results

scRNA-seq of blood CD4+ T cells from HD and patients with SLE

CD4+ T cells were sorted from peripheral blood mononuclear cells (PBMCs) of 16 children (age 9–17 years, mean = 14 ± 2.5) with SLE, 11 of whom had biopsy-proven LN, along with 10 matched HD (Fig. 1a). Disease activity (DA) was measured using the SLE Disease Activity Index (SLEDAI), and patients were categorized into moderate/high (SLEDAI > 4; n = 6) and low (SLEDAI ≤ 4; n = 10) groups (Supplementary Table 1). Sorted CD4+ T cells were processed using the 10x Genomics Chromium Single Cell 3’ library, followed by whole-transcriptome sequencing (Supplementary Table 2 and Extended Data Fig. 1a–c). After data preprocessing, including cell quality filtering9 and batch integration10, 239,473 CD4+ T cells were used for further clustering and phenotyping.
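The DA grouping described above reduces to a single cutoff on the SLEDAI score. A minimal Python sketch of that rule follows; the function name and the example scores are hypothetical and are not taken from the study data.

```python
def disease_activity_group(sledai: int) -> str:
    """Return the DA group for one SLEDAI score, using the cutoff of 4."""
    if sledai < 0:
        raise ValueError("SLEDAI cannot be negative")
    # SLEDAI > 4 is moderate/high; SLEDAI <= 4 is low.
    return "moderate/high" if sledai > 4 else "low"


# Hypothetical scores for illustration only, not patient data from the study.
example_scores = [0, 2, 4, 5, 8, 12]
groups = [disease_activity_group(score) for score in example_scores]
```

A score of exactly 4 falls in the low group, which is the boundary case the cutoff has to settle.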

Fig. 1 | scRNA-seq reveals altered cellular composition of CD4+ T cells in SLE.


a, Workflow of scRNA-seq experiment starting from sorted CD4+ T cells. Guided clustering analysis was performed on total CD4+ T cells, followed by subclustering analysis. b, UMAP of total CD4+ T cells clustered into five major populations: naive and memory T cells, Treg cells, proliferating cells and ISG-high cells. c, Violin plot of expression of selected markers for the five major CD4+ T cell clusters. d, Box plots of cell proportions for the five major clusters within total CD4+ T cells by disease group (top panel), LN (middle panel) and DA (bottom panel). The boundaries and center line of the box plot represent upper and lower quartiles and the median of the data, respectively. The whiskers extend to the maximum and minimum data values within the 1.5× interquartile range (IQR) from the upper and lower quartiles. All data points are overlaid on the box plot. Two-sided t-tests were performed to test differences in cell proportions between groups. e, Heatmap showing the average expression of ISGs and IFN-related genes across the five clusters of CD4+ T cells in HD and patients with SLE. The colors of the heatmap indicate mean gene expression. Colors in the top row represent modular annotation of ISGs and IFN-related genes. f, Scatter plot showing correlations between per-sample average expression of ISGs and IFN-related genes in total CD4+ T cells and SLEDAI. Pearson’s correlation coefficient (r = 0.648) and its corresponding two-sided P value (P = 0.00034) were calculated. The error band represents the 95% confidence interval for the fitted regression line. Single-cell analysis included SLE LN (n = 11), SLE NoLN (without nephritis; n = 5) and HD (n = 10) samples from the sorted CD4+ scRNA-seq dataset. *P < 0.05, **P < 0.01, ***P < 0.001.
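The box plot convention in the legend (box at the quartiles, whiskers at the most extreme data values within 1.5× IQR of the box) can be sketched as below. This is an illustration only: the plotting software used for the figures is not stated here, and quartile conventions differ between tools; the sketch assumes the linear-interpolation ("inclusive") method from the Python standard library.

```python
from statistics import quantiles


def box_stats(values):
    """Quartiles and 1.5x IQR whisker ends for one group of values."""
    # method="inclusive" is an assumption; other tools may interpolate differently.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observed values inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }


# Hypothetical values; 50 lies beyond the upper fence and is drawn as a point only.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 50])
```

Because all data points are overlaid on the plots, values beyond the whiskers remain visible even though they do not set the whisker length.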

A multistep subclustering framework was used to perform an initial clustering of total CD4+ T cells, followed by subclustering of each main cellular compartment (Fig. 1a)11. The initial total CD4+ T cell analysis yielded 15 clusters (Extended Data Fig. 1d), which were merged according to known CD4+ T cell markers into five main clusters (Fig. 1b,c and Extended Data Fig. 1e). Their cell identity assignment was independent of 10x batches, disease groups or samples (Extended Data Fig. 1f–h). High CCR7, LEF1 and TCF7 and low S100A4 expression characterized the naive cluster, whereas the opposite pattern identified the memory cluster12,13 (Fig. 1c). FOXP3, IKZF2 and IL2RA (CD25) expression defined the Treg cell cluster14. An additional cluster was defined based on the highest levels of IFN-stimulated gene (ISG) expression (ISG-high)15. Finally, a small cluster expressing cell cycle genes, such as MKI67 and PCNA, was annotated as ‘proliferating’.

Analysis of cell proportions revealed differences between patients with SLE and HD across the five major clusters, including a reduction in naive cell clusters and an expansion of ISG-high and Treg cell clusters in SLE. Naive and Treg cell changes distinguished LN from both HD and non-LN patients (Fig. 1d). Memory and ISG-high clusters separated LN from HD. The frequency of proliferating CD4+ T cells was similar in patients with SLE and HD.
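The proportion comparison can be illustrated with a short Python sketch: per-sample cluster proportions are computed from cell labels, and two groups of proportions are compared with a t statistic. The figure legends state only that two-sided t-tests were used; the pooled-variance (Student) form below, the function names and the toy inputs are assumptions for illustration.

```python
import math
from collections import Counter
from statistics import mean, variance


def cluster_proportions(labels):
    """Fraction of one sample's cells assigned to each cluster."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cluster: n / total for cluster, n in counts.items()}


def student_t(group_a, group_b):
    """Pooled-variance t statistic for two groups of per-sample proportions."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled * (1 / na + 1 / nb))


# Toy sample with four cells; not data from the study.
props = cluster_proportions(["naive", "naive", "naive", "memory"])
t_stat = student_t([1, 2, 3], [2, 3, 4])
```

Each sample contributes one proportion per cluster, so the test compares donors rather than individual cells.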

We assessed ISG expression using modules of coexpressed transcripts (M1.2, M3.4, M5.12)1 and IFN-related non-IFN-induced transcripts15. All SLE versus HD clusters showed increased ISG expression, ranging from mild (naive) to high (ISG-high, proliferating) expression levels (Fig. 1e and Extended Data Fig. 2a,b). The proliferating cluster uniquely showed upregulation of type II IFN genes (for example, genes encoding GBPs and IFNG) and downregulation of SOCS1 (Fig. 1e)16. ISG overexpression fluctuated across individual SLE samples (Extended Data Fig. 2a), and participant-specific ISG activity was correlated across the five main CD4+ clusters (Extended Data Fig. 2c) and with SLEDAI (Pearson’s correlation coefficient r = 0.648, P < 0.001; Fig. 1f and Extended Data Fig. 2d). A trend towards an increase in ISG expression was seen in LN, but this did not reach statistical significance (Extended Data Fig. 2e).
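As a consistency check on the reported correlation, a Pearson r can be converted to a t statistic with n − 2 degrees of freedom and then to a two-sided P value. The sketch below assumes n = 26 samples (16 SLE plus 10 HD); the text does not state the n used for this correlation, so that value is an assumption. The P value is obtained by numerically integrating the t density with Simpson's rule, using only the standard library.

```python
import math


def t_from_r(r, n):
    """t statistic for a Pearson correlation r computed from n samples."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=20000):
    """Two-sided P value: 1 minus the central mass between -|t| and |t|."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3
    return max(0.0, 1.0 - 2.0 * central_half)


n_samples = 26  # assumption: 16 SLE + 10 HD
t_stat = t_from_r(0.648, n_samples)
p_value = two_sided_p(t_stat, n_samples - 2)
```

Under that assumption the computed P value is on the order of 3 × 10^−4, in line with the reported P < 0.001.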

Thus, patients with SLE showed substantial shifts in major blood CD4+ T cell population frequencies according to DA and/or LN status. In addition, they showed overexpression of ISGs in correlation with DA rather than with LN.

Subclustering analysis of naive CD4+ T cells highlights five distinct cell states

Naive CD4+ T cells are more diverse and plastic than previously thought17. We uncovered six naive (CCR7hiS100A4low) SCs (SC0–SC5) (Extended Data Fig. 3a). SC0 to SC4 expressed bona fide naive CD4+ T cell transcripts, whereas cells within SC5 expressed higher levels of S100A4, KLRB1, ITGB1 and ANXA1 transcripts, characteristic of central memory T (TCM) cells18 (Fig. 2a,b).

Fig. 2 | Subclustering analysis of naive CD4+ T cells reveals five distinct transcriptional states.


a, Matrix plot showing top markers for each SC in naive CD4+ T cells. b, Heatmap of selected markers across the six naive and TCM SCs. c, Heatmap of the naive markers in b, plotted for naive clusters from PBMC validation dataset 1. The colors of the heatmap indicate the average expression of the markers. The row colors give the cluster annotation. NAI, naive. d, Box plots of relative cell proportions within total naive CD4+ T cells from the sorted dataset for each SC by disease group (top panel), LN (middle panel) and DA (bottom panel). The boundaries and center line of the box plot represent the upper and lower quartiles and the median of the data, respectively. The whiskers extend to the maximum and minimum data values within the 1.5× IQR from the upper and lower quartiles. All data points are overlaid on the box plot. Two-sided t-tests were performed to test differences in cell proportions between groups. Single-cell analysis included SLE LN (n = 11), SLE NoLN (n = 5) and HD (n = 10) samples from the sorted CD4+ scRNA-seq dataset and additional SLE (n = 10) samples from PBMC validation dataset 1. *P < 0.05, **P < 0.01, ***P < 0.001.

SC0 expressed cytoskeletal and motility genes (ACTG1, ACTB, CORO1A)19 and CD7, linked to adhesion20 (Fig. 2a,b). SC1 showed upregulation of transcriptional and cytokine signaling genes (MALAT1, ATM, IL6ST). SC2 was enriched for AP-1 complex transcripts, which are key to chromatin opening during T cell activation21. SC3 showed upregulation of cell cycle, apoptosis and JAK–STAT transcripts (BCL2, CDK6, PIM1, CISH, SOCS2). SC4 uniquely expressed SOX4, an early-life naive CD4+ T cell transcription factor22 also linked to CXCL13 production during inflammation23.

Subcellular analysis showed that naive CD4+ T cell markers were mainly in the nucleoplasm and linked to RNA splicing and transcription (Extended Data Fig. 3b). Naive SCs lacked CD transcripts, except for CD7 and CD69, which were upregulated in SC0 and SC2, respectively (Extended Data Fig. 3c). Naive marker fold changes (FCs) were modest, but their variability was not due to noise or randomness (P < 2.2 × 10^−16), as confirmed by in silico simulations (Extended Data Fig. 3d).

The naive T cell SC classification was confirmed in an independent PBMC dataset (validation dataset 1; Supplementary Table 3). Within this dataset, five naive states expressed the markers described above (Fig. 2c and Extended Data Fig. 3e). TCM CD4+ T cell markers did not define a unique SC but were highly enriched in the SC expressing the highest levels of FOS and JUN (Fig. 2c), which was mapped adjacent to the TCM SC in the original sorted CD4+ dataset (Extended Data Fig. 3a).

Within total CD4+ T cells, the frequencies of naive cells and of each SC were decreased in patients with SLE compared to HD. Within only naive CD4+ T cells, however, SC5 and SC3 were relatively expanded in SLE (Fig. 2d). SC3 was expanded in patients with LN and high DA compared to HD, whereas SC5 frequency classified patients with SLE with and without LN (Fig. 2d). These data confirm the presence of unique naive CD4+ T cell transcriptional states and their frequency changes in SLE.

The CD4+ memory compartment includes ten transcriptionally distinct subsets

Phenotypical differences in homing and effector functions, antigen dependence, and capacity to proliferate or self-renew are characteristics of memory T cells. Our main cluster analysis yielded 90,178 memory CD4+ T cells based on high expression of S100A4 and low (although variable) expression of CCR7, which were classified into ten distinct SCs (Fig. 3a,b).

Fig. 3 | Subclustering analysis reveals phenotypically diverse memory CD4+ T cells.


a, UMAP plot representing the ten SCs of memory CD4+ T cells. b, Matrix plot showing the top five markers for each SC in the memory CD4+ T cells. c, Heatmap representing the selective CD4+ T helper cell markers across all memory SCs. d, Expression density plots of selected T helper cell markers across memory SCs. e, Frequency of PD-1+CD38+ cells within circulating CD4+CXCR5−CD45RA− memory T cells from HD (n = 6 independent donors) and patients with SLE (n = 20 independent donors). Data are presented as the mean ± s.d. (P = 0.0009; two-sided Student’s t-test). f, Cytokine expression profile as assessed by quantitative PCR in CD4+CXCR5−CD45RA− memory T cells sorted into PD-1−CD38+, PD-1+CD38+ and PD-1+CD38− subsets following CD3/CD28 stimulation (n = 4 independent donors). Data are presented as the mean ± s.d. (IL10 P = 0.0006; IL21 P = 0.0123; CXCL13 P = 0.0067; two-sided Student’s t-tests). Single-cell analysis included SLE (n = 16) and HD (n = 10) samples from the sorted CD4+ scRNA-seq dataset. *P < 0.05, **P < 0.01, ***P < 0.001.

Blood CXCR5+CD4+ T cells are counterparts of lymphoid tissue TFH cells24. In our dataset, SC0, SC1 and SC5 expressed the highest levels of CXCR5, supporting a TFH-like phenotype (Fig. 3c,d). CCR7 and PD-1 (PDCD1) transcripts segregated CXCR5+ cells into a CCR7lowPD-1hi TFH SC (SC5) and two CCR7hiPD-1low SCs (SC0 and SC1) (Fig. 3c,d). Cells from SC0 transcribed higher levels of TCF7 and lower levels of S100A4. They also expressed SESN3, a marker of memory and effector CD4+ T cells found in rheumatoid arthritis synovium25. Consistent with CCR7lowPD-1hi TFH cells exhibiting an ‘effector’ phenotype26, SC5 cells transcribed IL21 and CXCL13 (Fig. 3c,d). Of the CXCR5+ SCs, only SC5 was significantly expanded in both patients with SLE (Extended Data Fig. 4a,b) and those with LN compared to HD (Extended Data Fig. 4c). Further in silico expression-based gating analysis confirmed the expansion of CXCR5+PD-1+CCR7− cells according to both DA and LN (Extended Data Fig. 4d). In summary, we identified three distinct SCs of blood CXCR5+ TFH-like cells, including a CCR7lowPD-1hi subset that transcribed IL21 and CXCL13 in the steady state and was expanded in patients with SLE with high DA and LN.

Within blood CXCR5− memory T cells, SC6 expressed the highest levels of key T helper 2 (TH2) cell transcripts, including transcription factor GATA3 and its antisense, long noncoding RNA GATA3-AS1, as well as PTGDR2 (CRTH2), CCR3, IL17RB, and transcripts encoding effector cytokines IL-4, IL-5 and IL-13 (Extended Data Fig. 4e). SC2 expressed GATA3 but lacked GATA3-AS1 and TH2 cytokine transcripts. Expression of CCR4, CCR6, CCR10 and IL22 and lack of IL17A and IFNG pointed to a T helper 22 (TH22) cell phenotype27 (Extended Data Fig. 4f). SC3 fitted the description of T helper 17 (TH17) cells (RORC, CTSH, IL17A and IL22)28 (Extended Data Fig. 4g). SC2 and SC3 were expanded in patients with SLE with high DA and those with LN compared to HD and non-LN patients (Extended Data Fig. 4a–c).

MHC class II-restricted CD4+ T cells with cytotoxic potential, referred to as CD4-CTLs, have been widely reported29. In our dataset, SC4, SC7 and SC9 expressed cytotoxic genes and transcription factors (TFs; TBX21, EOMES and RUNX3) that promote cytotoxic programs30 (Fig. 3c,d). SC4 and SC7 expressed IFNG-AS1, GZMK and GZMA but had low levels or absence of GZMB and PRF1. SC4 uniquely transcribed XCL1 (ref. 31). SC9 matched a TEMRA profile (high CX3CR1, GZMA and PRF1) and exclusively expressed GZMB, GZMH, GNLY and ZNF683 (Hobit)29. Only SC7 was enriched in patients with high DA or LN (Extended Data Fig. 4a–c).

Importantly, cells in SC7 expressed IL10 and PDCD1 (Fig. 3d). They also upregulated chemokine receptor (CXCR3, CCR2 and CCR5) and TF (EOMES, RUNX3 and SLAMF7) transcripts, all of which are hallmarks of TH10 cells7 (Fig. 3c,d). In line with a TH10 proliferative nature and nonanergic state, SC7 cells showed upregulation of MKI67 and genes encoding HLA class II molecules (Fig. 3c,d). Moreover, they lacked CXCL13 expression5,6,32 (Fig. 3c,d) and thus fully matched the TH10 phenotype7.

CXCR5−PD-1+ TPH cells facilitate B cell recruitment and activation through CXCL13 and IL-21 (ref. 5). In our dataset, SC8 included cells that expressed PDCD1 and lacked CXCR5, similar to SC7 (Fig. 3c,d). Notably, they expressed IL21 and the highest levels of CXCL13 in the absence of cytotoxic transcripts (Fig. 3c,d). Akin to TPH cells (ref. 5), they upregulated activation markers, including ICOS and HLA class II transcripts (Fig. 3c,d). Similar to TH10 cells, SC8 cells were significantly expanded in SLE and in patients with high DA and LN compared to HD (Extended Data Fig. 4a–c). Further participant-level analysis showed that both TH10 and TPH cells coexisted in patients with SLE, especially those with high DA (Extended Data Fig. 4h).

As coexpression of PDCD1 and CXCR3 encompassed both SC7 (TH10) and SC8 (TPH) (Fig. 3c,d), we sought additional markers to enrich SC7 cells and identified CD38 as a candidate (Fig. 3d). PBMC flow cytometry analysis confirmed expression of CD38 in a fraction of memory CXCR5−PD-1+CD4+ T cells (Extended Data Fig. 5a) and significant expansion of these cells in patients with SLE (Fig. 3e). Consistently, IL10 was highly transcribed, and IL-10 protein was secreted by sorted CD45RA−CXCR5−PD-1+CD38+CD4+ T cells following short-term activation. By contrast, IL-21 and CXCL13 were predominantly found at the transcript and protein levels in activated TPH-like CD45RA−CXCR5−PD-1+CD38−CD4+ T cells (Fig. 3f and Extended Data Fig. 5b). Of note, a fraction of SC7 cells transcribed IL21 in their resting state (Fig. 3d and Extended Data Fig. 5c), but T cell receptor (TCR) stimulation of CD38+ cells barely induced IL21 transcription and IL-21 protein secretion (Fig. 3f and Extended Data Fig. 5b). Furthermore, CD38+CD4+ memory T cells expressed higher levels of SC7-associated protein markers, including CCR2, Ki-67, Tbet and PRF1 (Extended Data Fig. 5d). Across the entire CD4+ T cell pool, CD38 surface protein expression was lower in naive T cells and highest in TH10 cells, in which it followed a bimodal distribution (Extended Data Fig. 5e).

CD96hiCD4+ memory T cells expressing a TH22-related signature were recently found to be decreased in adult patients with SLE33. In our study, the average CD96 expression across memory CD4+ T cells was comparable in HD and pediatric patients with SLE (Extended Data Fig. 5f). Furthermore, a signature of CD96hi cells33 was enriched across four clusters, including SC2 (TH22), SC3 (TH17), SC6 (TH2) and SC8 (TPH) (Extended Data Fig. 5g,h). Cells expressing the highest (top 1%) levels of this signature were mapped to SC2 and SC3 and were significantly expanded in our pediatric SLE cohort. By contrast, the frequency of CD96hi cells within the SC8 (TPH-like) cluster was similar between HD and patients with SLE (Extended Data Fig. 5i). We therefore mapped all known subsets of CD4+ memory T cells, including three subsets expressing cytotoxic programs. We also identified two distinct populations of peripheral helper cells that were expanded in the blood of patients with high DA and LN.

CD4+ memory T cells are defined by canonical chemokine receptors. Accordingly, we performed expression-based in silico gating analysis for T helper 1 (TH1) (CCR6−CXCR3+CCR4−), TH2 (CCR6−CXCR3−CCR4+), TH17 (CCR6+CXCR3−), TH22 (CXCR3−CCR10+) and TFH (CXCR5+) cells34. As described above, CXCR5 was restricted to SC0, SC1 and SC5 (Extended Data Fig. 6a,b), whereas CXCR3+ cells were enriched in SC4 and SC7–SC9 (Extended Data Fig. 6b). CCR6+CXCR3− cells were mapped to SC3 and exclusively expressed RORC and IL17A (Extended Data Fig. 4g). The distribution of CCR6−CXCR3−CCR4+ cells spanned SC2 and SC6, but only SC6 expressed the TH2 marker PTGDR2 (ref. 35) (Extended Data Fig. 4e), along with TH2-specific cytokine transcripts. SC2 expressed the bulk of the TH22-associated CCR4, CCR6 and CCR10, with a smaller number of cells expressing the genes encoding these receptors in SC8 (Extended Data Fig. 4f). Finally, the gene encoding gut-homing chemokine receptor CCR9 was enriched in SLE cells from SC5 and SC8 (Extended Data Fig. 6c). Overall, canonical chemokine receptor expression largely matched the memory SC identities described above, while highlighting CXCR5−CXCR3+ cells as the most heterogeneous among CD4+ memory T cells.
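Expression-based in silico gating of this kind can be sketched as a set of presence/absence rules applied to each cell's transcript counts. The Python sketch below encodes the canonical receptor definitions for the five subsets (TH1: CCR6−CXCR3+CCR4−; TH2: CCR6−CXCR3−CCR4+; TH17: CCR6+CXCR3−; TH22: CXCR3−CCR10+; TFH: CXCR5+). Treating any count above zero as "positive", the function name and the example cells are assumptions for illustration, not the authors' implementation.

```python
# Gate definitions: True means the receptor transcript must be detected,
# False means it must be absent. Receptors not listed are ignored.
GATES = {
    "TH1": {"CCR6": False, "CXCR3": True, "CCR4": False},
    "TH2": {"CCR6": False, "CXCR3": False, "CCR4": True},
    "TH17": {"CCR6": True, "CXCR3": False},
    "TH22": {"CXCR3": False, "CCR10": True},
    "TFH": {"CXCR5": True},
}


def matching_gates(counts, threshold=0):
    """Return every gate whose rule is satisfied by one cell's counts.

    `counts` maps gene names to transcript counts; a gene counts as
    positive when its count exceeds `threshold` (an assumed cutoff).
    """
    matches = []
    for name, rule in GATES.items():
        if all((counts.get(gene, 0) > threshold) == wanted for gene, wanted in rule.items()):
            matches.append(name)
    return matches


# Hypothetical cells for illustration.
th1_like = matching_gates({"CXCR3": 2})
overlap = matching_gates({"CCR6": 1, "CCR10": 3})
```

The function returns all matching gates rather than a single label because the rules are not mutually exclusive; a CCR6+CCR10+CXCR3− cell, for example, satisfies both the TH17 and TH22 definitions.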

Validation of CD4+ memory T cell SCs using PBMC scRNA-seq

To validate the CD4+ T cell memory SC classification, we used an additional dataset (validation dataset 2) of PBMCs from nine patients with SLE and five HD (Supplementary Table 4). After subclustering, 12,296 CD4+ memory T cells were mapped to eight SCs, including two CXCR5+ TFH-like SCs (Extended Data Fig. 7a,b). Consistent with the sorted CD4+ dataset, two CXCR5+ SCs differed in PDCD1 expression. Thus, the CXCR5+PD-1hi SC expressed IL21 and relatively lower levels of CCR7, matching SC5 from the sorted dataset. Conversely, PD-1low cells expressed higher levels of CCR7 in the absence of IL21, resembling SC0 and SC1 in the sorted dataset. Within the CXCR5− compartment, the TH17 (RORC, CTSH), TH2 (GATA3-AS1, PTGDR2), TEMRA (CX3CR1, NKG7, GNLY, GZMs), XCL1+ TH1-like, and CCR10+ TH22 SCs robustly reproduced our original SCs (Extended Data Fig. 7a,b). In the PBMC data, however, we identified one SC of CXCR5−CXCR3+PD-1hi memory cells that included both TPH (CXCL13, IL21) and TH10 (GZMK, CD38, IFNG-AS1, MKI67, CCR5) cells, likely owing to the lower resolution and/or cell frequency within PBMCs (Extended Data Fig. 7a,b). In line with our sorted CD4+ T cell data, expression of CD96 was not decreased in SLE (Extended Data Fig. 7c), and the CD96hi signature33 was detected across the TH22, TH17 and TPH/TH10 SCs (Extended Data Fig. 7d). In addition, TH22 and TH17 cells expressing the top 1% of the CD96hi signature were increased in patients with SLE compared to HD (Extended Data Fig. 7e). In conclusion, the characterization of the CD4+ memory T cell compartment in SLE and HD blood was validated using an independent, nonsorted PBMC dataset containing about 1/7 as many cells as the original sorted dataset.

The ISG-high cluster encompasses naive, memory and Treg cells

Blood CD4+ T cells included cells overexpressing ISGs that were expanded in patients with SLE. In fact, their frequency correlated with the average ISG expression across participants (Fig. 4a). We identified four distinct SCs (Fig. 4b), of which SC0 and SC2 matched naive CD4+ T cells (Fig. 4c). SC0 and SC2 transcriptional programs overlapped with those of SOX4 and IL6ST naive SCs, respectively (Fig. 4c). SC1 exhibited a memory phenotype (S100A4 and KLRB1) with various SC-defining markers, including TFH, TH17, TH2 and cytotoxic markers (Fig. 4c,d). Last, SC3 expressed FOXP3 and IKZF2 along with S100A4, pointing to memory Treg (mTreg) cells (Fig. 4c,d).

Fig. 4 | The ISG-high cluster includes naive, memory and Treg cells.


a, Scatter plot showing the correlation between average ISG and IFN-related gene expression and frequency of ISG-high cells in total CD4+ T cells at the participant level. The error band represents the 95% confidence interval for the fitted regression line. b, UMAP of the four ISG-high SCs. c, Matrix plot showing the top 10 markers for each SC in ISG-high CD4+ T cells. d, Heatmap showing expression of selective CD4+ T helper cell markers across the ISG-high SCs. e, Box plots of cell proportions within total CD4+ T cells for each ISG-high SC by disease group (top), LN (middle) and DA (bottom). The boundaries and center line of the box plot represent the upper and lower quartiles and the median of the data, respectively. The whiskers extend to the maximum and minimum data values within the 1.5× IQR from the upper and lower quartiles. All data points are overlaid on the box plot. Two-sided t-tests were performed to test differences in cell proportions between groups. f, Dot plots reporting top differentially expressed genes between patients with SLE and HD. The size of the dot indicates the percentage expression of each gene in each cluster. Genes with significant P values (P < 0.05) and |FC| > 4 are highlighted by bolded squares. Single-cell analysis included SLE LN (n = 11), SLE NoLN (n = 5) and HD (n = 10) samples from the sorted CD4+ scRNA-seq dataset. *P < 0.05, **P < 0.01, ***P < 0.001.

Although present in HD, all ISG-high SCs were expanded in patients with SLE, especially those with LN (Fig. 4e). Pseudobulk analysis identified differentially expressed transcripts in each of the four SCs compared to HD (Fig. 4f). In addition to ISGs, transcripts associated with Treg differentiation, such as PTGER2 and RARA36, or with Treg function, including FAS, TNFRSF18 (GITR) and TNFRSF4 (OX40)37, were enriched in SLE SC3 (Fig. 4f). FCRL3, encoding a receptor for secretory IgA38, as well as transcripts associated with Toll-like receptors (TLRs) such as TLR5 and LY96, or their downstream signaling (RIPK1), were also overexpressed in SLE SC3 (Fig. 4f).

These findings highlight the heterogeneous composition of the blood ISG-high CD4+ T cell compartment in the steady state and its expansion in SLE.

Treg cells are expanded in SLE and LN and exhibit decreased suppression

The frequency and role of Treg cells in SLE remain controversial. Our analyses confirmed increased Treg frequency in patients with SLE39, especially those with LN. This was further supported by flow cytometry according to either CD25hiCD127low or CD25hiFoxP3hi protein expression within CD4+ T cells (Extended Data Fig. 8a).

To better elucidate this compartment, we reclustered it into two SCs. Treg SC0 transcribed CCR7 and lower levels of S100A4 (Fig. 5a), whereas Treg SC1 transcribed higher levels of S100A4, along with genes encoding activation markers such as HLA class II (Fig. 5a). Both SCs were expanded in patients with SLE, and in those with LN (Extended Data Fig. 8b). These SCs matched two functionally distinct human blood Treg subpopulations: a CD25lowCD45RA+ fraction (fraction I) consisting of ‘naive’ Treg (nTreg) cells; and a CD25hiCD45RA− fraction (fraction II) including ‘memory’ Treg (mTreg) cells39 (Extended Data Fig. 8c). Using flow cytometry, we observed expansion of nTreg cells but not of mTreg cells in patients with SLE compared to HD (Extended Data Fig. 8c). This discrepancy with the scRNA-seq data (Extended Data Fig. 8b) may have arisen from the limited set of surface markers used in flow cytometry compared to the comprehensive scRNA-seq profiling. In addition, whereas nTreg cells could be reliably identified by flow cytometry using CD45RA and CD25 (ref. 39), the gating of mTreg cells (fraction II) and the adjacent non-Treg cells (fraction III) was less well defined (Extended Data Fig. 8c). Analysis of the transcriptomes of sorted fraction I and fraction II populations using published data40 confirmed the identities of Treg SC0 and SC1 (Extended Data Fig. 8d).

Fig. 5 | TLR5+ nTreg cells are expanded in SLE, and TLR5 ligation increases nTreg dysfunction.


a, UMAP of Treg SCs (left) and expression density plots of Treg-related markers (right). b, Heatmap showing expression of the four Treg-specific differentially expressed genes across CD4+ T cell clusters separated by disease group. c, TLR5 expression density plot in total CD4+ T cells. d, Box plots of relative expression of TLR5 across all five major CD4+ T cell groups between patients with SLE and HD. The boundaries and center line of the box plot represent the upper and lower quartiles and the median of the data, respectively. The whiskers extend to the maximum and minimum data values within the 1.5× IQR from the upper and lower quartiles. All data points are overlaid on the box plot. Differential gene expression analysis was performed using edgeR, and adjusted P values were calculated. The asterisk denotes an adjusted P value less than 0.05. Exact adjusted P values for all cell groups were as follows: 0.178 (ISG-high), 0.283 (memory), 0.205 (naive), 0.869 (proliferating), 0.024 (Treg_SC0) and 0.637 (Treg_SC1). e, Percentages of TLR5+ cells within naive T (CD4+CD25−CD127+CD45RA+), memory Teff (CD4+CD25−CD127+CD45RA−), nTreg (CD4+CD25lowCD127−CD45RA+) and mTreg (CD4+CD25++CD127−CD45RA−) cells in HD (n = 9 independent donors) and patients with SLE (n = 19 independent donors). Data are presented as the mean ± s.d. (naive T P = 0.0803 (nonsignificant (NS)); memory Teff P = 0.0635 (NS); nTreg P = 0.017; mTreg P = 0.0967 (NS); two-sided Student’s t-test). f, Percentage of suppression of responder cell proliferation by nTreg or mTreg cells, with or without flagellin stimulation, in samples from patients with SLE (n = 11 independent donors) and HD (n = 6 independent donors). Data are presented as the mean ± s.d. (HD fraction I (Fr. I) P = 0.8613 (NS); HD fraction II (Fr. II) P = 0.2993 (NS); SLE fraction I P = 0.0001; SLE fraction II P = 0.9339 (NS); two-sided paired Student’s t-test). Single-cell analysis included SLE (n = 16) and HD (n = 10) samples from the sorted CD4+ scRNA-seq dataset. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001.

To explore transcriptional differences in SLE and HD Treg cells, we performed pseudobulk analysis and identified 289 and 46 differentially expressed genes (P < 0.05 and FC > 1.5) for the naive (SC0) and memory (SC1) Treg SCs, respectively. Selecting transcripts with log2FC ≥ 0.25, and a five-fold increased expression in Treg cells compared to the remaining CD4+ T cells yielded four ‘Treg-specific differentially expressed genes’ for SC0, including TIGIT, HDAC9, FCRL3 and TLR5 (Fig. 5b). Notably, TLR5, whose activation has been reported to increase total Treg suppression41, was predominantly upregulated in SLE nTreg cells (Fig. 5c,d and Extended Data Fig. 8e). In our validation PBMC dataset 2 comprising 2,375 Treg cells, TLR5 and FCRL3 were also enriched in nTreg cells, and both nTreg and mTreg cells were expanded in SLE (Extended Data Fig. 9a,b). Assessing TLR5 protein expression across CD4+ T cell subsets revealed increased TLR5 in all SLE populations, with nTreg cells displaying the highest levels and the most significant difference compared to HD (Fig. 5e and Extended Data Fig. 9c). These findings identify TLR5 as the main TLR in human Treg cells and emphasize its upregulation in SLE.
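The two-stage selection described above (differential expression first, then Treg specificity) can be sketched as a simple filter. The text does not say which comparison each fold-change threshold refers to; the sketch below assumes both apply to the same SLE-versus-HD fold change and treats the five-fold criterion as a separate Treg-versus-other-CD4+ ratio. Field names, the function name and the example rows are hypothetical.

```python
import math


def treg_specific_degs(rows, p_max=0.05, fc_min=1.5, log2fc_min=0.25, enrichment_min=5.0):
    """Return gene names passing the DEG filter and the Treg-specificity filter.

    Each row is a dict with keys:
      gene          gene name
      p             P value, SLE versus HD
      fc            fold change, SLE versus HD (assumed linear scale)
      treg_vs_rest  expression ratio, Treg cells versus remaining CD4+ T cells
    """
    selected = []
    for row in rows:
        # Stage 1: differentially expressed between SLE and HD.
        if not (row["p"] < p_max and row["fc"] > fc_min):
            continue
        # Stage 2: log2 fold change threshold and Treg enrichment.
        if math.log2(row["fc"]) >= log2fc_min and row["treg_vs_rest"] >= enrichment_min:
            selected.append(row["gene"])
    return selected


# Hypothetical rows; gene names and values are placeholders, not study results.
example_rows = [
    {"gene": "GENE_A", "p": 0.01, "fc": 2.0, "treg_vs_rest": 6.0},
    {"gene": "GENE_B", "p": 0.20, "fc": 3.0, "treg_vs_rest": 8.0},
    {"gene": "GENE_C", "p": 0.01, "fc": 1.2, "treg_vs_rest": 9.0},
    {"gene": "GENE_D", "p": 0.01, "fc": 2.5, "treg_vs_rest": 2.0},
]
hits = treg_specific_degs(example_rows)
```

In this toy input only GENE_A passes every criterion; the others fail on P value, fold change or Treg enrichment, respectively.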

We next tested the baseline Treg suppression function and the effect of flagellin, the main ligand for TLR5 (ref. 42), on nTreg and mTreg cells from HD and patients with SLE. At baseline, nTreg and mTreg suppression was compromised in ~40% of patients with SLE39 (Extended Data Fig. 9d,e). Addition of flagellin did not have an effect on HD or SLE mTreg cells. It did, however, impair the suppressive function of SLE nTreg cells and, in some cases, even promoted effector T (Teff) cell proliferation (Fig. 5f and Extended Data Fig. 9f). These results point to a potential microbial trigger of nTreg dysfunction in SLE.
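The suppression readout can be made concrete with a small sketch. The formula used to compute percentage suppression is not given in this section; the version below is a commonly used definition (one minus the ratio of responder proliferation with Treg cells to proliferation without them) and is an assumption, as are the function name and the example numbers.

```python
def percent_suppression(prolif_with_treg, prolif_alone):
    """Percentage suppression of responder proliferation (assumed formula).

    Positive values mean Treg cells reduced proliferation; negative values
    mean responders proliferated more in the presence of Treg cells.
    """
    if prolif_alone <= 0:
        raise ValueError("responder-only proliferation must be positive")
    return 100.0 * (1.0 - prolif_with_treg / prolif_alone)


# Hypothetical proliferation readouts (for example, % divided responder cells).
suppressed = percent_suppression(30.0, 60.0)
enhanced = percent_suppression(72.0, 60.0)
```

Under this definition, a negative value corresponds to the situation described above in which Treg cells promoted rather than suppressed Teff proliferation.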

An integrated blood CD4+ T cell landscape in SLE

Using a recently described integration algorithm from the Ragas package11, we mapped 23 transcriptionally distinct CD4+ T cell SCs and reintegrated them to generate a high-resolution view of the CD4+ T cell compartment (Extended Data Fig. 10). We classified naive, memory, regulatory, proliferating and ISG-high T cells and correlated their frequencies with both SLE DA and LN. Distinct naive CD4+ T cell transcriptional states, as well as a heterogeneous ISG-high compartment, were identified in the steady state of HD and patients with SLE. Importantly, we mapped TFH, TPH and TH10 cells to adjacent CD4+ memory SCs and identified three SCs expressing cytotoxic programs. Finally, we uncovered SLE-specific Treg cell transcriptional and functional alterations predominantly affecting patients with LN.

Discussion

CD4+ T cells contribute to the pathogenesis of SLE by helping autoreactive B cells, secreting proinflammatory cytokines, differentiating into cytotoxic effectors and maintaining pathogenic memory responses. Here, we profiled the transcriptome of CD4+ T cells sorted from the blood of pediatric patients with SLE and HD at the single-cell level, mapped them to 23 transcriptionally distinct CD4+ T cell SCs and studied their association with SLE DA and LN.

Recent studies have highlighted the contribution of extrafollicular reactions to autoimmune diseases and the expansion of blood CD4+CXCR5−PD-1hi memory T cells, including TH10 cells and TPH cells, in SLE57. Our analyses mapped these two effector helper populations to neighboring SCs and detected additional surface markers, including CD38, to enable their identification. Notably, combining CD38 with additional TH10 markers permitted us to identify cells that transcribed either IL10 or IL21 in the steady state but primarily secreted IL-10 upon TCR stimulation, a finding that warrants further study.

CD4-CTLs29 mapped in our study to three different memory TH1-like SCs, including TEMRA (SC9), TH10 (SC7), and TH1-like XCL1+ (SC4) cells. All three SCs expressed GZMA, but GZMK was restricted to SC7 and SC4. Within the CD8+ T cell compartment, GZMK mediates complement activation and contributes to tissue inflammation43, but its role in CD4+ T cells remains to be addressed. GZMK also marks a population of clonally expanded cytotoxic CD4+ T cells found in inflamed tissues of patients with IgG4-related disease44, where they associate with activated (including extrafollicular) B cells. In these patients, GZMK+ cytotoxic T cells express amphiregulin and TGFβ, which contribute to tissue fibrosis44, as well as SLAMF7, a marker of cytotoxic cells45. In our dataset, SLAMF7 was expressed in SC9 and (to a lesser extent) in SC7 but was absent from SC4.

TH1-related transcripts, such as CXCR3 and IFNG, were detected in all three cytotoxic SCs, as well as in TPH-like SC8. The cytokine and cytotoxic transcriptional profiles of cells in SC7 overlap with both TH10 and TR-1 cells7,8. Functionally, however, we reported that SLE blood TH10 cells are not anergic but proliferate upon activation and provide B cell help in an IL-10-dependent manner7. We now show that, in addition to CXCR3 and PD-1, cells within SC7 express activation (HLA-DR, CD38), proliferation (MKI67) and costimulation (ICOS) markers in the steady state, and we confirm the expression of unique chemokine receptors (CCR2 and CCR5)7. Importantly, we show that PD-1 and CD38 coexpression identifies blood CD4+ memory cells enriched in IL-10-secreting but not IL-21-secreting precursors, fitting the description of TH10 cells7. CD38 expression has been linked to CD4+ and CD8+ recent thymic emigrant T cells that express SOX4 and decline with age46. A recent study also reported CD38 as a marker of IL-2 immunotherapy-induced proliferative HLA-DR+CD38+CXCR3+ Treg cells that could be recapitulated in vitro through TCR plus IL-2 signals and might be primed for migration to inflammatory sites in SLE47. Finally, an induced T follicular regulatory subset was also recently identified in human tonsils as a distinct CXCR5+CD38+, TFH-descended subset that gains suppressive function while retaining the capacity to help B cells. Notably, these cells express IL-21 and IL-10 together with cytotoxic markers such as KLRB148. The potential relationship between these cells and cells within blood SC7 deserves further study.

Key findings of our study are the expansion of Treg cells in LN, the overexpression of TLR5 in SLE nTreg cells and the potentiation of their dysfunction by flagellin. Treg cells express a variety of TLRs49, and TLR1/2 ligation promotes nTreg and mTreg differentiation into pathogenic TH17-like cells in multiple sclerosis50. TLR5 expression in HD Treg cells was previously reported to enhance suppression upon flagellin binding in vitro. However, CD127low Teff cells were not excluded from the Treg pool in that study41. In addition to TLR5, we found that SLE nTreg cells overexpressed FCRL3, which has been shown to inhibit Treg function and to promote the skewing of these cells toward a proinflammatory phenotype38. Microbial orthologs of SLE autoantigens have recently been described to trigger autoimmunity in mice51, and antibodies against a restricted pool of microbiome strain-specific cell wall lipoglycans correlated with DA, particularly in patients with active LN52. Our studies further support the need to elucidate the complex interactions between Treg cells and the microbiome, as well as their connection with specific organ involvement in patients with SLE.

Online content

Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41590-025-02297-2.

Methods

Human samples

Children and adolescents with SLE were enrolled through the pediatric rheumatology clinics at Texas Scottish Rite Hospital for Children and Children’s Medical Center in Dallas, Texas. All study procedures were conducted in accordance with protocols approved by the Institutional Review Boards (IRBs) at Weill Cornell Medicine (IRB no. 22–03024648, 17–11018757 and 0604008488–07), the University of Texas Southwestern Medical Center (IRB no. STU 092010–167) and the Hospital for Special Surgery (IRB no. 2019–2338). Consent or assent was obtained from patients between 0 and 25 years of age. All enrollees consented to have coded information from their samples shared with external entities via publications and/or other data sharing mechanisms. Blood was collected in ACD tubes (BD Biosciences), and laboratory measurements were recorded. Clinical DA was assessed using the SLEDAI 2000.

Flow cytometry and cell sorting

Cryopreserved PBMCs were thawed at 37 °C, washed once with phosphate-buffered saline (PBS), and resuspended in Flow buffer (PBS containing 2% fetal bovine serum (FBS) and 0.5 mM EDTA). Cells were treated with Fc receptor block (BD Pharmingen) for 5 min at 37 °C, followed by surface staining for 30 min on ice. For CD4 T cell staining, cells were incubated with anti-CD3 BUV737 (UCHT1; BD Pharmingen) and anti-CD4 BV510 (OKT4; BioLegend), washed and stained with 7-AAD (BioLegend) to exclude dead cells. For isolation of Treg and Teff cells, PBMCs were stained with the following antibody panel: anti-CD3 BUV737 (UCHT1; BD Pharmingen), anti-CD4 BV510 (OKT4; BioLegend), anti-CD25 allophycocyanin (BC96; BioLegend), anti-CD127 BV421 (A01905; BioLegend) and anti-CD45RA BV785 (H100; BioLegend). For identification and sorting of CD3+CD4+CXCR5−CD45RA−PD-1+CD38+ cells, the following antibodies were used: anti-CD3 BUV737 (UCHT1; BD Pharmingen), anti-CD4 BV510 (OKT4; BioLegend), anti-CD45RA BV785 (H100; BioLegend), anti-CD38 APC-Fire 810 (HIT2; BioLegend), anti-CXCR5 AF647 (RF8B2; BD Pharmingen) and anti-PD-1 BV421 (EH12.2H7; BioLegend). Additional surface markers included anti-CCR2 FITC (K036C2; BioLegend). Samples were acquired on an Aurora flow cytometer (Cytek) or sorted using a Melody cell sorter (BD Pharmingen; 100-μm nozzle) or a Cytek Aurora Cell Sorter (70-μm nozzle). Data were analyzed using FlowJo v.10.8.1 (BD Biosciences). For intracellular staining, surface-stained cells were labeled with a fixable viability dye (for example, 7-AAD or Fixable Viability Dye eFluor 780; eBioscience), followed by fixation and permeabilization using a FoxP3/Transcription Factor Staining Buffer Set (Thermo Fisher). Cells were then stained with antibodies against intracellular markers, including anti-Ki-67 PE (Ki-67; BioLegend), anti-PRF1 PE (B-D48; BioLegend), anti-Tbet PE (eBio4B10; eBioscience) and anti-FoxP3 PE (259D; BioLegend).
For quantification of nTreg and mTreg cells and TLR5 expression, PBMCs were stained with anti-CD3 BUV737, anti-CD4 BV510, anti-CD45RA BV785, anti-CD127 BV421, anti-CD25 allophycocyanin and anti-TLR5 Alexa Fluor 488 (S16021; BioLegend). Viability was assessed using 7-AAD.

Quantitative real-time PCR and cytokine assay

Sorted T cells (1 × 10^5 cells) were stimulated overnight with Immuno-Cult (StemCell) following the manufacturer’s protocol. Supernatants were collected, and cytokines were measured using a CBA Flex Set (BD; for IL-10 and IL-21) or Human BLC/CXCL13 ELISA Kit (Thermo Fisher; for CXCL13). For quantitative PCR, a total of 1 × 10^5 cells were lysed in 50 μl Cell-to-Ct lysis buffer (Thermo Fisher). Complementary DNA (cDNA) was synthesized directly from cell lysates using a Cell-to-Ct Kit (Thermo Fisher). Quantitative real-time PCR was performed with TaqMan Gene Expression Assays (Applied Biosystems) according to the manufacturer’s instructions.

Suppression assay

nTreg cells (CD4+CD25hiCD127lowCD45RA+), mTreg cells (CD4+CD25hiCD127lowCD45RA−) and naive T cells (CD4+CD25−CD127+CD45RA+) were sorted from PBMCs of pediatric patients with SLE or HD by fluorescence-activated cell sorting. Antigen-presenting cells (APCs) from an allogeneic HD were isolated by depleting CD3+ cells using CD3 microbeads (Miltenyi) and then treated with mitomycin C (50 μg ml−1). One million APCs were incubated with mitomycin C at 37 °C for 15 min and washed three times. Suppression assays were performed by coculturing nTreg cells (20,000 cells) or mTreg cells (20,000 cells) with CFSE-labeled naive T cells (50,000 cells per well in 200 μl). Naive T cells were stimulated with soluble anti-CD3 monoclonal antibody (OKT3, 1 μg ml−1) in the presence of mitomycin-treated APCs (50,000 cells). Control wells contained naive T cells with anti-CD3 and APCs but without Treg cells. Assays were conducted with or without flagellin (Salmonella typhimurium, 100 ng ml−1; Invivogen). After 5 days, cells were harvested, stained with 7-AAD and anti-human CD4 antibody to exclude dead cells and APCs, and analyzed by flow cytometry. Proliferation was assessed by CFSE dilution. Suppression was calculated as follows: percentage of suppression = (difference in the percentage of proliferation in the presence and absence of Treg cells / percentage of proliferation of naive T cells alone) × 100.
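The suppression formula above can be written out as a short calculation. The following is a minimal sketch, not the authors' analysis code; the function and argument names are hypothetical, and the sign convention (proliferation without Treg cells minus proliferation with Treg cells) is an assumption chosen so that effective suppression gives a positive value.

```python
def percent_suppression(prolif_without_treg: float, prolif_with_treg: float) -> float:
    """Percentage of suppression from CFSE-based proliferation percentages.

    prolif_without_treg: % proliferating naive T cells in the control well (no Treg cells).
    prolif_with_treg:    % proliferating naive T cells in the Treg coculture well.
    Assumed sign convention: (without - with) / without * 100.
    """
    if prolif_without_treg <= 0:
        raise ValueError("control proliferation must be positive")
    return (prolif_without_treg - prolif_with_treg) / prolif_without_treg * 100.0


# Illustrative numbers only (not data from the study).
print(percent_suppression(80.0, 40.0))   # 50.0
print(percent_suppression(80.0, 100.0))  # -25.0
```

Under this convention, a negative value corresponds to a well in which naive T cells proliferated more in the presence of Treg cells than in their absence.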

scRNA-seq data processing and analysis

Sample preparation.

For the sorted CD4+ T cell dataset, sorted live CD4+ T cells were resuspended in PBS supplemented with 0.2% nonacetylated bovine serum albumin (BSA; Miltenyi), and viability was determined using trypan blue staining and measured on a Countess FLII (Thermo Fisher). Then, 50,000 cells were loaded for capture on a 10x Chromium system using a v.3 Single Cell 3′ Reagent Kit for RNA sequencing (10x Genomics). After capture and lysis, cDNAs from the cells were amplified as per the recommendations for the 10x Chromium. The amplified cDNAs were loaded on a single lane of NovaSeq 6000 for a depth of 50,000 reads per cell.

For SLE PBMC validation dataset 1, PBMCs in 10% DMSO + 90% FBS were thawed quickly at 37 °C and placed in DMEM supplemented with 10% FBS. Cells exhibiting a viability rate less than 70% were excluded. Cells were quickly spun down at 400g for 10 min, washed once with 1× PBS supplemented with 0.04% BSA and finally resuspended in 1× PBS with 0.04% BSA. Viability was determined using trypan blue staining and measured on a Countess FLII. Then, 12,000 single cells were loaded into one lane of a 10x Genomics Chromium [Controller, X], targeting 6,000 barcoded single cells. Single-cell capture, barcoding and library preparation were performed using the 10× 3′ Chromium platform (v.3.1) chemistry. cDNA and libraries were checked for quality on an Agilent 4200 TapeStation, quantified by KAPA quantitative PCR, and sequenced on an Illumina NovaSeq 6000 targeting 100,000 raw read pairs per cell.

For the 5′ PBMC validation dataset 2, cryopreserved PBMCs were thawed and resuspended in RPMI containing 10% FBS for cell number and viability assessment. Live PBMCs were enriched using a Dead Cell Removal Kit (Miltenyi) and resuspended in PBS with 1% BSA for scRNA-seq processing using Chromium Single Cell 5’ V2 gene expression (10x Genomics). The samples were divided into four batches and loaded into different lanes of a 10x chip, aiming for a cell recovery of 10,000 per sample, following the manufacturer’s instructions. Paired-end sequencing was performed on an Illumina NovaSeq 6000.

Cell quality filtering, normalization and main cluster analysis for the sorted CD4+ T cell dataset.

Count data of 16 patients with SLE and 10 healthy controls processed by CellRanger53 were merged using Seurat18 from R. Single-cell sequencing generated libraries of 9,398 ± 2,143 and 8,911 ± 2,561 cells (mean ± s.d.) per sample from patients with SLE and HD, respectively. On average, >1,600 genes and/or features per cell were detected for both groups. Following the Seurat guided clustering workflow (https://satijalab.org/seurat/articles/pbmc3k_tutorial.html), cell quality filtering was performed to remove cells with an abnormal number of features (<500 or >4,000) and/or high mitochondrial gene expression (>20%); this yielded 257,229 filtered cells. Subsequently, we performed normalization, variable feature selection, scaling (regressing out percentage mitochondrial expression and total RNA count per cell) and dimension reduction by principal component analysis (PCA). To remove batch effects from single-cell data, Harmony10 was applied to the principal components using sample as the grouping variable. The Harmony-corrected cell embeddings were further used for UMAP54 and clustering analysis (resolution = 0.5), which generated an initial version of 16 clusters. Further investigation using immune cell markers in combination with automated single-cell annotation by Azimuth18 revealed several additional doublet clusters, including monocyte doublets and plasmacytoid dendritic cell doublets, and an extra cluster with abnormal mitochondrial activities; these were removed from the data. We also removed a small T cell cluster with high CD79A expression but low CD79B expression. Finally, we reperformed UMAP and clustering analysis on the remaining 239,473 cells and obtained 15 clusters for the total CD4 data.
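The cell quality filter described above amounts to a simple per-cell predicate. The sketch below illustrates that rule in plain Python; the analysis itself was performed with Seurat in R, and the class, field and function names here are hypothetical. Threshold boundaries are assumed to be inclusive on the retained side (cells with exactly 500 or 4,000 features, or exactly 20% mitochondrial expression, are kept).

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class CellQC:
    """Per-cell quality metrics (hypothetical container for illustration)."""
    barcode: str
    n_features: int   # number of detected genes/features
    pct_mito: float   # percentage of counts from mitochondrial genes


def passes_qc(cell: CellQC, min_features: int = 500,
              max_features: int = 4000, max_pct_mito: float = 20.0) -> bool:
    """Keep cells with 500-4,000 features and at most 20% mitochondrial expression."""
    return (min_features <= cell.n_features <= max_features
            and cell.pct_mito <= max_pct_mito)


def filter_cells(cells: Iterable[CellQC], **thresholds) -> List[CellQC]:
    """Return only the cells that pass the quality thresholds."""
    return [c for c in cells if passes_qc(c, **thresholds)]


# Illustrative cells only (not data from the study).
cells = [
    CellQC("A", 1600, 5.0),   # kept
    CellQC("B", 450, 3.0),    # too few features
    CellQC("C", 4200, 2.0),   # too many features
    CellQC("D", 2000, 25.0),  # high mitochondrial fraction
]
print([c.barcode for c in filter_cells(cells)])  # ['A']
```

Keeping the thresholds as parameters allows the same predicate to express the different cut-offs reported for the validation datasets, for example `filter_cells(cells, max_features=5000, max_pct_mito=15.0)`.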

Single-cell subclustering analysis for the sorted CD4+ T cell dataset.

Single-cell subclustering analysis was performed to improve the identification of subpopulations of CD4+ T cells. The 15 initial clusters from total CD4+ T cells were further grouped into naive T cells (clusters 0, 1, 5, 11, 12 and 13), memory T cells (clusters 2, 4, 6, 7 and 9), Treg cells (clusters 3 and 10) and ISG-high CD4+ T cells (cluster 8), and subclustering analysis was performed for each subset separately. The subclustering analysis also followed the Seurat guided clustering workflow with some modifications for the Treg cells and ISG-high T cells. Briefly, we reperformed variable feature selection, scaling, dimension reduction, Harmony batch integration, UMAP and clustering and removed any additional low-quality clusters with higher mitochondrial expression and low RNA content. For the ISG-high cell subset, to prevent formation of additional high-ISG SCs, the following steps were performed: (1) we calculated the average ISG expression using seven representative ISGs (ISG15, ISG20, IFI6, IFI44L, IFITM3, LY6E, MT2A), which were regressed out during the scaling step; and (2) we removed all ISGs and IFN-related genes15 from the variable feature list when rerunning PCA. For Treg cells, we directly performed reclustering and UMAP analysis on the original Harmony embeddings from the total CD4 analysis without rerunning variable feature selection, PCA or batch integration to avoid overfitting to noise in the data. Seurat and Ragas11 were used for downstream analysis, including marker analysis, differential state analysis and cell proportion analysis. The Nebulosa package55 was used to construct expression density plots.
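The two ISG-specific steps above (a per-cell average over seven representative ISGs, and exclusion of ISGs and IFN-related genes from the variable feature list) can be illustrated as follows. This is a minimal sketch under stated assumptions, not the authors' R code: the function names are hypothetical, a gene missing from a cell is assumed to contribute zero, and the regression of the score during scaling is not shown. The full IFN-related gene list from ref. 15 is not reproduced here, so the example passes a placeholder list.

```python
from typing import Dict, Iterable, List

# The seven representative ISGs named in the text.
ISG_PANEL = ("ISG15", "ISG20", "IFI6", "IFI44L", "IFITM3", "LY6E", "MT2A")


def isg_score(expression: Dict[str, float]) -> float:
    """Average expression of the seven-gene ISG panel in one cell.

    Assumption: genes absent from the cell's expression dict count as 0.
    """
    return sum(expression.get(gene, 0.0) for gene in ISG_PANEL) / len(ISG_PANEL)


def drop_ifn_genes(variable_features: Iterable[str],
                   ifn_genes: Iterable[str]) -> List[str]:
    """Remove ISGs and IFN-related genes from a variable feature list, keeping order."""
    blocked = set(ifn_genes)
    return [gene for gene in variable_features if gene not in blocked]


# Illustrative values only (not data from the study).
cell = {"ISG15": 3.5, "IFI6": 3.5, "CCR7": 1.2}
print(isg_score(cell))  # 1.0
print(drop_ifn_genes(["CCR7", "ISG15", "SELL"], ISG_PANEL))  # ['CCR7', 'SELL']
```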

Analysis of naive CD4+ T cells from PBMC validation dataset 1.

Analysis of scRNA-seq data from SLE PBMCs was performed following a similar guided clustering procedure to that previously described. First, cells with nFeature_RNA smaller than 500 or greater than 5,000 or with a percentage mitochondrial expression greater than 15% were filtered. In addition, we applied Scrublet56 to systematically identify doublets from the PBMC data, and cells with a Scrublet score >0.28 were deemed to be technical doublets and removed from further analysis. Normalization, variable feature selection, scaling by regressing out ‘percent.mt’ and ‘nCount_RNA’, PCA, Harmony batch integration by sample, UMAP and clustering were subsequently performed, yielding an initial version of 18 clusters. After removal of clusters with low nCount_RNA and/or high mitochondrial gene expression and a small low-diversity cluster with cells predominantly from a single patient with SLE, we obtained the final PBMC data with 39,133 cells that formed 15 clusters. To obtain accurate cell identities for naive CD4+ T cells, we performed a series of subclustering analyses starting from the total T cells. We extracted CD3+ T cells from PBMC clusters 0, 3, 4, 5 and 9 and performed subclustering analysis. After removing an additional doublet cluster and a mito-high/RNA-low cluster, we acquired nine final clusters for 18,940 total T cells. Among them, clusters 0, 3, 4, 7, and 8 were positive for CD4 expression and were extracted for a second round of subclustering analysis. Initial clustering of the CD4+ T cells yielded ten clusters, including one small cluster with residual CD8A and/or CD8B expression, which was removed from downstream analysis. Among the final nine clusters from 11,024 CD4+ T cells, clusters 0, 1, and 3 were naive CD4+ T cells (high CCR7 and low S100A4 expression). A third round of subclustering analysis was then performed to delineate the heterogeneity within the naive CD4+ T cell compartment, which yielded five final clusters.

Analysis of memory and Treg SCs from PBMC validation dataset 2.

As validation dataset 1 only contained SLE samples, to validate memory and Treg SCs and their cell proportion changes between SLE and HD from the sorted CD4+ dataset, we generated a second validation dataset using 10× 5′ chemistry on PBMCs from nine additional patients with SLE and five HD. Cell quality filtering (nFeature_RNA ≤500 or ≥3,000, percentage of mitochondrial gene expression ≥10%) and doublet removal by Scrublet were first performed; this generated 139,482 total PBMCs, which were analyzed using the aforementioned Seurat guided clustering workflow. After cleaning up additional doublet clusters and clusters with high mitochondrial expression, we grouped and annotated the remaining clusters. From the total PBMCs, we further subsetted T cells, proliferating T cells and NK cells (CD56bright and CD56dim) and performed reclustering analysis. This led to 11 refined T/NK clusters spanning CD4+ T, CD8+ T, mucosal-associated invariant T/γδT and NK cell types. A total CD4+ T cell object with 32,850 cells containing naive, memory, regulatory and ISG-high subsets was constructed, replicating the main cell populations from the sorted CD4+ dataset. Of note, this total CD4+ T cell object did not contain a proliferating cluster, because proliferative CD4+ T cells formed a separate ‘proliferating’ cluster of their own with other proliferating NK and CD8+ T cells. To validate the CD4+ memory SCs, we combined 12,138 CD4+ memory T cells with 385 Azimuth-predicted CTL/TCM/TEM cells from the ‘CD8_TEM/CD4_TEMRA’ cluster. After reclustering the 12,523 combined CD4+ memory and cytotoxic T cells, we further identified and removed a smaller cluster enriched for innate lymphoid cell markers (SOX4 and TRDC) and obtained 12,296 final CD4+ memory T cells that were grouped into eight subsets. Moreover, we also subsetted and reclustered 2,375 Treg cells from the total CD4+ T cell object and divided them into nTreg and mTreg cells based on CCR7, S100A4, CD74 and HLA-DR expression.

Miscellaneous bioinformatics analysis.

To evaluate whether the naive SCs identified here represented real transcriptional diversity in the naive CD4+ T cell compartment or were simply driven by noise, we downsampled 50,000 cells from the total naive CD4+ T cells and permuted their raw expression counts to build ‘null’ data of single-cell expression. After data normalization, variable feature selection, PCA and clustering, we obtained five final SCs from the null data. Seurat function FindAllMarkers was used to identify markers for the null data and calculate their FCs. Subcellular location information was downloaded using R package HPAanalyze57. We used the newly developed Ragas package to analyze scRNA-seq data at both main and SC levels. To identify SC-specific markers, the RunFindAllMarkers function was used, and the top markers were visualized with the RunMatrixPlot function. To identify differentially expressed genes between disease groups, differential state analysis58 was performed using the RunPseudobulkAnalysis function with the edgeR method59, and we corrected for batch effects between subsets of data. Differential analysis of cell frequencies was performed using the RunProportionPlot function, and the parent object was assigned based on the chosen cell compartment of interest. R package ggplot260 was used to produce basic R plots for scRNA-seq data analysis. To further characterize Treg SC0 and SC1, we reanalyzed the bulk RNA-seq data from ref. 40, which profiled the expression of conventional CD4+ T cells and regulatory T cells, including CD25lowCD45RA+ nTreg cells and CD25hiCD45RA− mTreg cells (also referred to as eTreg cells by Cuadrado et al.). We performed hierarchical clustering followed by filtering of non-Treg markers and identified 193 and 161 genes as markers for nTreg and mTreg cells, respectively. After calculating their expression FCs between SC0 and SC1 Treg cells in our CD4+ T cell dataset and removing genes with a log2FC smaller than 0.1, we obtained a final set of 39 Treg marker genes. Pearson’s correlation was calculated between the log2FC of the 39 genes for Treg SC0 versus SC1 from our CD4+ scRNA-seq data and for nTreg versus eTreg from the bulk RNA-seq data.
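Two of the computations described above lend themselves to a short illustration: building permuted ‘null’ counts and correlating log2FC values between two datasets. The sketch below is not the authors' code. The function names are hypothetical, and the permutation scheme is an assumption (each gene's counts are shuffled independently across cells, which preserves per-gene count distributions while breaking gene-gene structure); the text does not specify how the raw counts were permuted.

```python
import math
import random
from typing import Dict, List, Sequence


def permute_counts(counts: Dict[str, List[int]], seed: int = 0) -> Dict[str, List[int]]:
    """Build 'null' data by shuffling each gene's raw counts across cells.

    counts maps gene name -> per-cell raw counts. The input is left unchanged.
    Assumed scheme: independent shuffle per gene.
    """
    rng = random.Random(seed)
    null: Dict[str, List[int]] = {}
    for gene, values in counts.items():
        shuffled = list(values)
        rng.shuffle(shuffled)
        null[gene] = shuffled
    return null


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Illustrative log2FC values only (not data from the study).
sc_log2fc = [1.0, 2.0, 3.0]    # e.g. Treg SC0 versus SC1
bulk_log2fc = [2.0, 4.0, 6.0]  # e.g. nTreg versus eTreg in bulk RNA-seq
print(pearson_r(sc_log2fc, bulk_log2fc))  # 1.0
```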

Statistical analysis

All statistical analyses were performed based on the distribution of data and study design. All results from spectral flow cytometry data and cell-based assays are presented as mean ± s.e.m. Comparisons between two groups were performed using two-tailed Welch’s t-test. For cell-based suppression assays, paired t-tests were used to assess differences between control and treatment groups. Statistical analyses for flow cytometry data and cell-based assays were conducted using GraphPad Prism (v.9; GraphPad Software). Differential cell proportion analysis for scRNA-seq data was performed using a two-sample t-test implemented in the Ragas package11, and two-sided raw P values were reported. For pseudobulk-based differential state analysis, two-sided adjusted P values were calculated for each gene.
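For reference, the Welch’s t-test used for two-group comparisons does not assume equal variances; its test statistic and approximate degrees of freedom (Welch–Satterthwaite) can be computed as sketched below. The analyses in this study were run in GraphPad Prism and Ragas; this block only illustrates the underlying formula, the function name is hypothetical, and the conversion to a two-tailed P value (which requires the t distribution) is not shown.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Uses sample variances (n - 1 denominator); each group needs >= 2 values.
    """
    n_a, n_b = len(a), len(b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    se2_a = variance(a) / n_a  # squared standard error of group a
    se2_b = variance(b) / n_b  # squared standard error of group b
    if se2_a + se2_b == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Illustrative values only (not data from the study).
t_stat, dof = welch_t([1.0, 2.0, 3.0], [3.0, 4.0, 5.0])
print(round(t_stat, 3), round(dof, 3))  # -2.449 4.0
```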

Extended Data

Extended Data Fig. 1 | Quality summary for 10X sequencing, clustering and batch integration.

(a-b) Bar plot and violin plot showing the distribution of number of cells per individual sample in HD (n = 10, in coral) and SLE (n = 16, in blue). (c) Histogram showing the distributions of number of detected genes per sample, grouped by disease. (d) UMAP of total CD4+ T cells grouped into 15 initial clusters. (e) Dot plot illustrating further re-grouping of the 15 initial clusters into five major CD4+ T cell clusters based on the selected markers. (f-h) Bar plots showing the distributions of cells in the five major CD4+ T cell populations after Harmony batch correction based on (f) 10X batch, (g) disease group and (h) individual sample.

Extended Data Fig. 2 | ISG expression across CD4+ T cell clusters and its association with LN and DA.

(a) Heatmap showing the average expression of ISGs and IFN-related genes from Nehar-Belaid et al., 2020 across the five major CD4+ T cell populations at the sample level. Row colors indicate modular annotation of ISGs and IFN-related genes. (b) Dot plot reporting the logFC and expression frequency from differential gene expression analysis between SLE patients and HD for the five major CD4+ T cell clusters across all ISGs and IFN-related genes. The size of the dot indicates percentage expression for each gene in each cluster. Genes with expression frequency > 0.1 are highlighted by bolded square. (c) Pairwise scatter plots and correlations comparing per sample average expression of ISGs and IFN-related genes among the five major CD4+ T cell clusters. (d-e) Box plots showing average gene expression of ISGs and IFN-related genes in total CD4+ T cells grouped by DA or LN status. The boundaries and center line of the box plot represent upper/lower quartiles and the median of the data, respectively. The whiskers extend to the maximum and minimum data values within the 1.5 IQR from the upper/lower quartiles. All data points were overlaid on the box plot. Two-sided t-tests were performed to test differences in cell proportions between groups. Exact p-values for all pairwise comparisons are: 0.00077 (SLEDAI high vs. none), 0.008 (SLEDAI low vs. none), 0.0267 (SLEDAI high vs. low), 0.00026 (LN vs. HD), 0.08865 (NoLN vs. HD), 0.245 (LN vs. NoLN). Single-cell analysis included SLE LN (n = 11), SLE NoLN (n = 5) and HD (n = 10) samples from the sorted CD4+ scRNA-Seq dataset. LN/NoLN: with/without nephritis; High: SLEDAI > 4; Low: SLEDAI ≤ 4. Significance levels: * (p < 0.05), ** (p < 0.01), *** (p < 0.001).

Extended Data Fig. 3 | Analysis of naïve CD4+ T cell SCs from the sorted CD4 dataset and PBMC validation dataset 1.

(a) UMAP showing six naïve/CM SCs obtained by subclustering the naïve cluster from the sorted CD4+ T cell dataset. (b) Bar plot showing the subcellular locations of the naïve/CM SC markers with FC > 1.5 from the sorted CD4+ T cell dataset. (c) Heatmap showing the average expression of CD69 across all naïve/CM SCs from the sorted CD4+ T cell dataset. (d) Bar plot showing the distribution of log2FC of subcluster markers from the sorted CD4+ T cell dataset (pink) and the permuted dataset (cyan). (e) Subcluster analysis of SLE PBMCs from the validation dataset 1. PBMC panel: 15 clusters were identified from total PBMCs, including two B cell clusters (1, 7), CD14+ (2) and CD16+ monocytes (8), five T cell clusters (0, 3, 4, 5, 9), one NK cluster (6), one DC (13) and one pDC (14) cluster, one plasma cell cluster (10), platelets (12), and one cluster of proliferating cells (11); CD3+ T cell panel: the 1st round of subcluster analysis was performed on T cells from PBMC clusters 0, 3, 4, 5 and 9, which yielded nine T cell SCs, among them the CD4+ T cells (0, 3, 4, 7, 8); CD4+ T cell panel: CD4+ T cells were further analyzed and re-clustered, which led to nine final SCs. SCs with high CCR7 and low S100A4 (0, 1, 3) were annotated as naïve CD4+ T cells. SC7 was also naïve-like, but was high for ISG expression. Finally, five SCs were identified from the naïve CD4+ T cells. Single-cell analysis included SLE (n = 16) and HD (n = 10) samples from the sorted CD4+ scRNA-Seq dataset, and additional SLE (n = 10) samples from the PBMC validation dataset 1.

Extended Data Fig. 4 | Cell proportion analysis for memory CD4+ T cells from the sorted dataset.

(a-c) Box plots of relative cell proportions within total CD4+ T cells for each memory SC by disease group (a), severity (b) and LN (c). The boundaries and center line of the box plot represent upper/lower quartiles and the median of the data, respectively. The whiskers extend to the maximum and minimum data values within the 1.5 IQR from the upper/lower quartiles. All data points were overlaid on the box plot. Two-sided t-tests were performed to test differences in cell proportions between groups. (d) Expression-based in silico gating analysis for TFH subsets. Two CXCR5+ sub-populations, PD-1+ CCR7− (yellow) and PD-1− CCR7+ (blue), were defined based on single cell gene expression as shown in the UMAP (upper left panel). Box plots (upper right and lower panel) show the relative cell proportions of the two CXCR5+ subsets within total CD4+ T cells by disease, LN, or DA. Box plots represent the same summary statistics as previously described. Two-sided t-tests were performed to test differences in cell proportions between groups. (e-g) Expression density plots of selective markers specific for TH2 (e), TH22 (f), and TH17 (g). (h) Proportion of TH10 (SC7) and TPH (SC8) cells in total CD4+ T cells per subject. Single-cell analysis included SLE LN (n = 11), SLE NoLN (n = 5) and HD (n = 10) samples from the sorted CD4+ scRNA-Seq dataset. LN/NoLN: with/without nephritis; High: SLEDAI > 4; Low: SLEDAI ≤ 4. Significance levels: * (p < 0.05), ** (p < 0.01), *** (p < 0.001).

Extended Data Fig. 5 | Flow cytometry characterization of the PD-1+ CD38+ memory subset.

(a) Gating strategy and representative flow cytometry dot plots showing the percentages of PD-1+ CD38+ in SLE patients. (b) Levels of IL10, IL21 and CXCL13 in the supernatants of PD-1+ CD38+ and PD-1+ CD38 cells upon activation (n = 3 independent donors). Data are represented as the means ± s.d. (IL-10 *p = 0.0235; IL-21 **p = 0.0089; CXCL13 **p = 0.004. Two-sided Student’s t-tests). (c) Bar plot showing number of cells within SC7 that express either IL10, IL21 or both. (d) Representative flow cytometry means fluorescence index of TH10 specific markers on CD4+ CXCR5 CD45RA PD-1 CD38+, CD4+ CXCR5 CD45RA PD-1+ CD38+ and CD4+ CXCR5 CD45RA PD-1+ CD38 cells. (e) Bar graphs showing relative MFI (median fluorescence index) of CD38 expression on naïve CD4 (CD4+ CD25 CD127+ CD45RA+ CCR7+ cells, TH10 (CD4+ CD25 CD127+ CD45RA CXCR5 CCR6 CXCR3+ PD1+) cells, TFH (CD4+ CD25 CD127+ CD45RA CXCR5+), nTreg cells (CD4+ CD25+ CD127 CD45RA+) and mTreg cells (CD4+ CD25+ CD127 CD45RA+) (n = 39 independent donors). Data are represented as the means ± s.d. (TH10-Naïve CD4 **p = 0.0024; TH10-TFH **p = 0.0039; TH10-nTreg **p = 0.0024; TH10-mTreg **p = 0.0026. Two-sided Student’s t-tests). (f) Average expression of CD96 in memory CD4+ T cells. The boundaries and center line of the box plot represent upper/lower quartiles and the median of the data, respectively. The whiskers extend to the maximum and minimum data values within the 1.5 IQR from the upper/lower quartiles. All data points were overlaid on the box plot. (g-h) Module score of CD96hi signatures in memory CD4+ T cells and across SCs. (i) Cell proportion changes for the CD96 signaturehi cells between SLE and HD. Box plots represent the same summary statistics as previously described. Two-sided t-tests were performed to test differences in cell proportions between groups. Single-cell analysis included SLE (n = 16) and HD (n = 10) samples from the sorted CD4+ scRNA-Seq dataset. 
Significance levels: * (p < 0.05), ** (p < 0.01), *** (p < 0.001).
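The box-plot convention used in these legends (box at the lower and upper quartiles, center line at the median, whiskers at the most extreme data values within 1.5 IQR of the quartiles) can be illustrated with a minimal Python sketch. The function name `box_plot_summary`, the quartile interpolation method and the example values are assumptions made for illustration; they are not taken from the study's code or data.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Return quartiles, whisker ends and outlying points for one group."""
    data = sorted(values)
    # Assumption: linear interpolation between data points ("inclusive").
    # Other plotting libraries may compute quartiles slightly differently.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observed values inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outside = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outside,
    }


# Hypothetical per-donor values, not data from the study.
example = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 30.0]
summary = box_plot_summary(example)
```

In this hypothetical group the value 30.0 lies beyond the upper fence, so the upper whisker stops at 8.0 and the point is still drawn, in line with all data points being overlaid on the box plot.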

Extended Data Fig. 6 | In silico gating analysis of CD4+ T helper subsets based on chemokine receptor expression.

(a) UMAPs showing the presence (yes) or absence (no) of different CD4+ Th subsets based on gating of chemokine receptor gene expression from single cell data. (b) Bar plot showing the proportion of cells of CD4+ Th subsets inferred by expression-based gating across the ten memory CD4+ T cell SCs. (c) Expression density plot of CCR9 on memory SCs (left) and heatmap showing the average expression of CCR9 across memory SCs and disease groups (right). Single-cell analysis included SLE (n = 16) and HD (n = 10) samples from the sorted CD4+ scRNA-Seq dataset.
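Expression-based gating of this kind can be sketched in a few lines of Python. The gates below follow commonly used chemokine receptor definitions of Th1, Th2, Th17 and Th1/17 cells and are an assumption for illustration; the receptor combinations and thresholds actually used for this figure are not given in the legend. The names `GATES`, `gate_cell` and `subset_proportions`, and the rule that any expression above zero counts as positive, are likewise hypothetical.

```python
# Illustrative gates (assumed): receptor gene -> required state,
# where True means expressed and False means not expressed.
GATES = {
    "Th1": {"CXCR3": True, "CCR4": False, "CCR6": False},
    "Th2": {"CXCR3": False, "CCR4": True, "CCR6": False},
    "Th17": {"CXCR3": False, "CCR4": True, "CCR6": True},
    "Th1/17": {"CXCR3": True, "CCR4": False, "CCR6": True},
}


def is_expressed(cell, gene, threshold=0.0):
    """Call a gene positive when its expression exceeds the threshold."""
    return cell.get(gene, 0.0) > threshold


def gate_cell(cell, gates=GATES):
    """Return a yes/no call per subset for one cell (dict of gene -> expression)."""
    calls = {}
    for subset, rule in gates.items():
        match = all(is_expressed(cell, gene) == state for gene, state in rule.items())
        calls[subset] = "yes" if match else "no"
    return calls


def subset_proportions(cells, gates=GATES):
    """Fraction of cells called 'yes' for each subset."""
    if not cells:
        raise ValueError("no cells to gate")
    counts = {subset: 0 for subset in gates}
    for cell in cells:
        for subset, call in gate_cell(cell, gates).items():
            if call == "yes":
                counts[subset] += 1
    return {subset: n / len(cells) for subset, n in counts.items()}


# Hypothetical cells, not data from the study.
cells = [{"CXCR3": 2.1}, {"CCR4": 1.0, "CCR6": 0.5}, {"CCR4": 3.0}, {}]
proportions = subset_proportions(cells)
```

Because transcripts can go undetected in single-cell data, a "no" call from such a gate reflects absence of detected expression rather than confirmed absence of the receptor.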

Extended Data Fig. 7 | Analysis of the PBMC validation dataset 2.

(a) Workflow of subclustering analysis of the PBMC validation dataset 2. After T cells and NK cells were subsetted and reclustered, CD4+ memory T cells and Treg cells were further reclustered, yielding eight CD4+ memory T cell clusters and two Treg clusters. (b) Summarized heatmap showing expression of selected T helper cell markers across memory subsets. (c) Average expression of CD96 in CD4+ memory T cells. The boundaries and center line of the box plot represent upper/lower quartiles and the median of the data, respectively. The whiskers extend to the maximum and minimum data values within the 1.5 IQR from the upper/lower quartiles. All data points were overlaid on the box plot. (d) Module score of CD96hi signatures across CD4+ memory T cell subsets. (e) Cell proportion changes for the CD96 signaturehi cells between SLE and HD (* indicates p < 0.05). Box plots represent the same summary statistics as previously described. Two-sided t-tests were performed to test differences in cell proportions between groups. Single-cell analysis included SLE (n = 9) and HD (n = 5) samples from the PBMC validation dataset 2.
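A comparison of cell proportions between groups, as in panel (e), can be outlined as follows: compute the fraction of signature-high cells per donor, then compare the per-donor fractions between SLE and HD. The sketch below computes a pooled-variance (Student's) t statistic; whether the published tests assumed equal variances is not stated in the legend, so that choice is an assumption, as are the function names and example values. The two-sided p-value would be read from the t distribution with the returned degrees of freedom, for example through a statistics library.

```python
from math import sqrt
from statistics import mean, variance


def proportion_per_sample(cell_flags):
    """cell_flags: iterable of (sample_id, is_high) pairs -> {sample_id: fraction high}."""
    totals, hits = {}, {}
    for sample_id, is_high in cell_flags:
        totals[sample_id] = totals.get(sample_id, 0) + 1
        hits[sample_id] = hits.get(sample_id, 0) + (1 if is_high else 0)
    return {s: hits[s] / totals[s] for s in totals}


def students_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical cells and per-donor proportions, not data from the study.
flags = [("SLE1", True), ("SLE1", False), ("HD1", False), ("HD1", False)]
per_sample = proportion_per_sample(flags)

sle = [0.04, 0.05, 0.06]
hd = [0.01, 0.02, 0.03]
t_stat, dof = students_t(sle, hd)
```

Testing per-donor proportions rather than pooled cells keeps the donor as the unit of replication, which matches the sample sizes reported in the legend.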

Extended Data Fig. 8 | Additional data for Treg subsets definition, validation, and their associated cell frequency changes.

(a) Representative flow cytometry gating of Treg cells (left), with corresponding percentages in the blood of SLE patients and HD (middle). Data are represented as the means ± s.d. (For Foxp3+ SLE n = 19; HD n = 10 ****p = 0.0001; for CD127 SLE n = 16; HD n = 7 **p = 0.0029. Two-sided Student’s t-tests). Right: Frequencies of Foxp3+ cells in HD (n = 10), SLE patients without LN (n = 7), and those with LN (n = 12). Data are represented as the means ± s.d. (HD vs SLE NoLN ****p = 0.0001; HD vs SLE LN ***p = 0.0003. Ordinary one-way ANOVA). (b) Box plots showing relative proportions of each Treg SC within total CD4+ T cells by disease group (top), LN (middle) and DA (bottom). The boundaries and center line of the box plot represent upper/lower quartiles and the median of the data, respectively. The whiskers extend to the maximum and minimum data values within the 1.5 IQR from the upper/lower quartiles. All data points were overlaid on the box plot. Two-sided t-tests were performed to test differences in cell proportions between groups. (c) Gating strategy for nTreg and mTreg cells (left) and their percentages in total CD4+ T cells in the blood of HD (n = 10 independent donors) and SLE (n = 21 independent donors) (right). Data are represented as the means ± s.d. (Fr.I ***p = 0.0004; Fr.II n.s. p = 0.2044. Two-sided Student’s t-tests). (d) Scatter plot of log2 fold change (log2FC) of 39 nTreg/mTreg markers between Treg SCs (SC0 vs. SC1) and sorted nTreg and mTreg fractions. Pearson’s correlation (0.67) and its corresponding two-sided p-value were calculated. (e) Box plot showing the mean gene expression of all TLRs across the Treg SCs. Single-cell analysis in (b), (d) and (e) included SLE LN (n = 11), SLE NoLN (n = 5) and HD (n = 10) samples from the sorted CD4+ scRNA-Seq dataset. LN/NoLN: with/without nephritis; High: SLEDAI > 4; Low: SLEDAI ≤ 4. Significance levels: * (p < 0.05), ** (p < 0.01), *** (p < 0.001), **** (p < 0.0001), n.s. (not significant).
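The comparison in panel (d) amounts to computing a log2 fold change per marker in each of two contrasts and correlating the two vectors. The following Python sketch shows that calculation under stated assumptions: the pseudocount of 1, the function names and all numeric values are illustrative and do not come from the study.

```python
from math import log2, sqrt


def log2_fold_change(mean_a, mean_b, pseudocount=1.0):
    """log2 ratio of two mean expression values; the pseudocount is an assumption."""
    return log2((mean_a + pseudocount) / (mean_b + pseudocount))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical log2FC values for four markers, not data from the study:
# one vector from the single-cell contrast, one from the sorted fractions.
single_cell_fc = [1.0, -0.5, 2.0, 0.0]
sorted_fraction_fc = [0.8, -0.2, 1.5, 0.3]
r = pearson_r(single_cell_fc, sorted_fraction_fc)
```

A positive coefficient indicates that markers higher in one subcluster tend also to be higher in the corresponding sorted fraction, which is the sense in which the reported correlation supports the subcluster annotation.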

Extended Data Fig. 9 | Additional data for subcluster analysis of Treg cells from the PBMC validation dataset 2 and their frequency changes due to the addition of flagellin.

(a) UMAP showing total Treg cells are clustered into two subsets with the expression density plots of Treg-related genes. (b) Box plots showing relative proportions of Treg subsets within total CD4+ T cells by disease group. The boundaries and center line of the box plot represent upper/lower quartiles and the median of the data, respectively. The whiskers extend to the maximum and minimum data values within the 1.5 IQR from the upper/lower quartiles. All data points were overlaid on the box plot. Two-sided t-tests were performed to test differences in cell proportions between groups. (c) Representative flow cytometry density plots showing the expression of TLR5 on nTreg and mTreg cells gated on CD4+ CD25+ CD127low cells from the blood of HD and SLE patients. (d) Percentage suppression of responder cells by nTreg or mTreg cells in the absence of flagellin in SLE patients (n = 10 independent donors) and HD (n = 10 independent donors). Data are represented as the means ± s.d. (Fr.I **p = 0.0027; Fr.II *p = 0.0199. Two-sided Student’s t-tests). (e) Percentage suppression of responder cells by nTreg or mTreg cells at varying responder cell:Treg ratios. (f) Representative histogram plots showing CellTrace dilution of responder cells cultured with HD or SLE nTreg cells, with or without flagellin. Single-cell analysis included SLE (n = 9) and HD (n = 5) samples from the PBMC validation dataset 2. Significance levels: * (p < 0.05), ** (p < 0.01), *** (p < 0.001).
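Percentage suppression in assays of this type is often derived from responder proliferation with and without Treg cells. The sketch below uses one common convention, 100 × (1 − proliferation with Treg cells / proliferation of responders alone); the formula applied in this study is not stated in the legend, so both the formula and the example values are assumptions for illustration.

```python
def percent_suppression(prolif_with_treg, prolif_alone):
    """Suppression relative to responders cultured alone, in percent.

    Both inputs are the percentage of responder cells that proliferated,
    for example as read from dilution of a proliferation dye.
    """
    if prolif_alone <= 0:
        raise ValueError("responders cultured alone must show proliferation")
    return 100.0 * (1.0 - prolif_with_treg / prolif_alone)


# Hypothetical proliferation values at three responder:Treg ratios,
# not data from the study.
responders_alone = 80.0
with_treg = {"1:1": 20.0, "2:1": 40.0, "4:1": 60.0}
suppression = {
    ratio: percent_suppression(value, responders_alone)
    for ratio, value in with_treg.items()
}
```

With these hypothetical numbers, suppression falls as Treg cells become scarcer relative to responders, which is the kind of titration shown across responder cell:Treg ratios.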

Extended Data Fig. 10 | Integrated CD4+ peripheral T cell landscape in SLE.

Key conclusions from the study are highlighted and illustrated. All subclusters from the naïve/CM (NCM), memory (MEM), regulatory (TREG) and ISG-high (ISGH) CD4+ T cell groups were integrated and re-projected to a new UMAP using the R package Ragas (sorted CD4+ T cell dataset, n = 26). Each subcluster was numerically labeled based on its original subcluster identity before integration. Graphics created using BioRender.com.

Supplementary Material

Supplementary Table 1
Supplementary Table 2
Supplementary Table 3
Supplementary Table 4

Supplementary information. The online version contains supplementary material available at https://doi.org/10.1038/s41590-025-02297-2.

Acknowledgements

We thank our patients and their families and the healthy individuals who participated in this study, and the members of the Pediatric Rheumatology Clinics at Texas Scottish Rite Hospital for Children and the Children’s Medical Center in Dallas. We thank T. Miller (WCM, New York) for help with fluorescence-activated cell sorting, as well as the Epigenomics Core at Weill Cornell Medicine (New York, NY). This work was supported by U19AI144301-06 (NIH/NIAID Autoimmunity Centers of Excellence), P50AR070594-06 (NIH/NIAMS Center for Lupus Research), an iAward from Sanofi to V.P. and funds from the Drukier Institute for Children’s Health at Weill Cornell Medicine.

Footnotes

Competing interests

V.P. has received consulting honoraria from Sanofi, AstraZeneca, Merck, Regeneron and Moderna. She is a Scientific Advisor for Moderna and Metis Therapeutics and has received a research grant from Sanofi and a contract from AstraZeneca. T.M. is an employee at Sanofi Pasteur. V.S. is an employee at AstraZeneca. In the early stage of the study, J.B. was a SAB member of Cue Biopharma. Currently, he is the founder of Immunoledge LLC, an entity designed to advise biotech start-ups. In this role, J.B. serves as CIO of Georgiamune, SAB member of Metis Therapeutics and advisor to the Jackson Laboratory. The other authors declare no competing interests.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Extended data is available for this paper at https://doi.org/10.1038/s41590-025-02297-2.

Data availability

The sorted CD4+ T cell dataset (GSE285773) and two PBMC validation cohorts (GSE285774 and GSE298578) are available via GEO. The processed Seurat objects are available via Zenodo at https://doi.org/10.5281/zenodo.16385794 (ref. 61).

Code availability

The code used to generate all scRNA-seq analysis results presented in Figs. 1–5 is available via GitHub at https://github.com/jig4003/CD4-SLE.

References

  • 1. Banchereau R et al. Personalized immunomonitoring uncovers molecular networks that stratify lupus patients. Cell 165, 1548–1550 (2016).
  • 2. Caielli S, Wan Z & Pascual V. Systemic lupus erythematosus pathogenesis: interferon and beyond. Annu. Rev. Immunol. 41, 533–560 (2023).
  • 3. Brunner HI, Silverman ED, To T, Bombardier C & Feldman BM. Risk factors for damage in childhood-onset systemic lupus erythematosus: cumulative disease activity and medication use predict disease damage. Arthritis Rheum. 46, 436–444 (2002).
  • 4. Morita R et al. Human blood CXCR5+CD4+ T cells are counterparts of T follicular cells and contain specific subsets that differentially support antibody secretion. Immunity 34, 108–121 (2011).
  • 5. Rao DA et al. Pathologically expanded peripheral T helper cell subset drives B cells in rheumatoid arthritis. Nature 542, 110–114 (2017).
  • 6. Bocharnikov AV et al. PD-1hiCXCR5 T peripheral helper cells promote B cell responses in lupus via MAF and IL-21. JCI Insight 4, e130062 (2019).
  • 7. Caielli S et al. A CD4+ T cell population expanded in lupus blood provides B cell help through interleukin-10 and succinate. Nat. Med. 25, 75–81 (2019).
  • 8. Roncarolo MG, Gregori S, Bacchetta R & Battaglia M. Tr1 cells and the counter-regulation of immunity: natural mechanisms and therapeutic applications. Curr. Top. Microbiol. Immunol. 380, 39–68 (2014).
  • 9. Stuart T et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902 (2019).
  • 10. Korsunsky I et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).
  • 11. Balaji U et al. Ragas: integration and enhanced visualization for single cell subcluster analysis. Bioinformatics 40, btae366 (2024).
  • 12. Escobar G, Mangani D & Anderson AC. T cell factor 1: a master regulator of the T cell response in disease. Sci. Immunol. 5, eabb9726 (2020).
  • 13. Weng NP, Araki Y & Subedi K. The molecular basis of the memory T cell response: differential gene expression and its epigenetic regulation. Nat. Rev. Immunol. 12, 306–315 (2012).
  • 14. Miragaia RJ et al. Single-cell transcriptomics of regulatory T cells reveals trajectories of tissue adaptation. Immunity 50, 493–504 (2019).
  • 15. Nehar-Belaid D et al. Mapping systemic lupus erythematosus heterogeneity at the single-cell level. Nat. Immunol. 21, 1094–1106 (2020).
  • 16. Liau NPD et al. The molecular basis of JAK/STAT inhibition by SOCS1. Nat. Commun. 9, 1558 (2018).
  • 17. van den Broek T, Borghans JAM & van Wijk F. The full spectrum of human naive T cells. Nat. Rev. Immunol. 18, 363–373 (2018).
  • 18. Hao Y et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587 (2021).
  • 19. Riviere JB et al. De novo mutations in the actin genes ACTB and ACTG1 cause Baraitser-Winter syndrome. Nat. Genet. 44, 440–444 (2012).
  • 20. Shimizu Y et al. Crosslinking of the T cell-specific accessory molecules CD7 and CD28 modulates T cell adhesion. J. Exp. Med. 175, 577–582 (1992).
  • 21. Yukawa M et al. AP-1 activity induced by co-stimulation is required for chromatin opening during T cell activation. J. Exp. Med. 217, e20182009 (2020).
  • 22. Thomson Z et al. Trimodal single-cell profiling reveals a novel pediatric CD8αα+ T cell subset and broad age-related molecular reprogramming across the T cell compartment. Nat. Immunol. 24, 1947–1959 (2023).
  • 23. Yoshitomi H et al. Human Sox4 facilitates the development of CXCL13-producing helper T cells in inflammatory environments. Nat. Commun. 9, 3762 (2018).
  • 24. Ueno H, Banchereau J & Vinuesa CG. Pathophysiology of T follicular helper cells in humans and mice. Nat. Immunol. 16, 142–152 (2015).
  • 25. Argyriou A et al. Single cell sequencing identifies clonally expanded synovial CD4+ TPH cells expressing GPR56 in rheumatoid arthritis. Nat. Commun. 13, 4046 (2022).
  • 26. He J et al. Circulating precursor CCR7loPD-1hi CXCR5+ CD4+ T cells indicate Tfh cell activity and promote antibody responses upon antigen reexposure. Immunity 39, 770–781 (2013).
  • 27. Jiang Q et al. Role of Th22 cells in the pathogenesis of autoimmune diseases. Front. Immunol. 12, 688066 (2021).
  • 28. Liang SC et al. Interleukin (IL)-22 and IL-17 are coexpressed by Th17 cells and cooperatively enhance expression of antimicrobial peptides. J. Exp. Med. 203, 2271–2279 (2006).
  • 29. Patil VS et al. Precursors of human CD4+ cytotoxic T lymphocytes identified by single-cell transcriptome analysis. Sci. Immunol. 3, eaan8664 (2018).
  • 30. Cruz-Guilloty F et al. Runx3 and T-box proteins cooperate to establish the transcriptional program of effector CTLs. J. Exp. Med. 206, 51–59 (2009).
  • 31. Dorner BG et al. MIP-1α, MIP-1β, RANTES, and ATAC/lymphotactin function together with IFN-γ as type 1 cytokines. Proc. Natl Acad. Sci. USA 99, 6181–6186 (2002).
  • 32. Yoshitomi H & Ueno H. Shared and distinct roles of T peripheral helper and T follicular helper cells in human diseases. Cell Mol. Immunol. 18, 523–527 (2021).
  • 33. Law C et al. Interferon subverts an AHR-JUN axis to promote CXCL13+ T cells in lupus. Nature 631, 857–866 (2024).
  • 34. Sallusto F. Heterogeneity of human CD4+ T cells against microbes. Annu. Rev. Immunol. 34, 317–334 (2016).
  • 35. Cosmi L et al. CRTH2 is the most reliable marker for the detection of circulating human type 2 Th and type 2 T cytotoxic cells in health and disease. Eur. J. Immunol. 30, 2972–2979 (2000).
  • 36. Mucida D et al. Reciprocal TH17 and regulatory T cell differentiation mediated by retinoic acid. Science 317, 256–260 (2007).
  • 37. Bour-Jordan H & Bluestone JA. Regulating the regulators: costimulatory signals control the homeostasis and function of regulatory T cells. Immunol. Rev. 229, 41–66 (2009).
  • 38. Agarwal S et al. Human Fc receptor-like 3 inhibits regulatory T cell function and binds secretory IgA. Cell Rep. 30, 1292–1299 (2020).
  • 39. Miyara M et al. Functional delineation and differentiation dynamics of human CD4+ T cells expressing the FoxP3 transcription factor. Immunity 30, 899–911 (2009).
  • 40. Cuadrado E et al. Proteomic analyses of human regulatory T cells reveal adaptations in signaling pathways that protect cellular identity. Immunity 48, 1046–1059 (2018).
  • 41. Crellin NK et al. Human CD4+ T cells express TLR5 and its ligand flagellin enhances the suppressive capacity and expression of FOXP3 in CD4+CD25+ T regulatory cells. J. Immunol. 175, 8051–8059 (2005).
  • 42. Yoon SI et al. Structural basis of TLR5-flagellin recognition and signaling. Science 335, 859–864 (2012).
  • 43. Donado CA et al. Granzyme K activates the entire complement cascade. Nature 641, 211–221 (2025).
  • 44. Koga R et al. Granzyme K- and amphiregulin-expressing cytotoxic T cells and activated extrafollicular B cells are potential drivers of IgG4-related disease. J. Allergy Clin. Immunol. 153, 1095–1112 (2024).
  • 45. Fox DA et al. Lymphocyte subset abnormalities in early diffuse cutaneous systemic sclerosis. Arthritis Res. Ther. 23, 10 (2021).
  • 46. Bohacova P et al. Multidimensional profiling of human T cells reveals high CD38 expression, marking recent thymic emigrants and age-related naive T cell remodeling. Immunity 57, 2362–2379.e10 (2024).
  • 47. Raeber ME et al. Interleukin-2 immunotherapy reveals human regulatory T cell subsets with distinct functional and tissue-homing characteristics. Immunity 57, 2232–2250.e10 (2024).
  • 48. Le Coz C et al. Human T follicular helper clones seed the germinal center-resident regulatory pool. Sci. Immunol. 8, eade8162 (2023).
  • 49. Dai J, Liu B & Li Z. Regulatory T cells and Toll-like receptors: what is the missing link? Int. Immunopharmacol. 9, 528–533 (2009).
  • 50. Nyirenda MH et al. TLR2 stimulation regulates the balance between regulatory T cell and Th17 function: a novel mechanism of reduced regulatory T cell function in multiple sclerosis. J. Immunol. 194, 5761–5774 (2015).
  • 51. Greiling TM et al. Commensal orthologs of the human autoantigen Ro60 as triggers of autoimmunity in lupus. Sci. Transl. Med. 10, eaan2306 (2018).
  • 52. Azzouz D et al. Lupus nephritis is linked to disease-activity associated expansions and immunity to a gut commensal. Ann. Rheum. Dis. 78, 947–956 (2019).
  • 53. Zheng GX et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).
  • 54. McInnes L, Healy J, Saul N & Grossberger L. UMAP: uniform manifold approximation and projection. J. Open Source Softw. 3, 861 (2018).
  • 55. Alquicira-Hernandez J & Powell JE. Nebulosa recovers single-cell gene expression signals by kernel density estimation. Bioinformatics 37, 2485–2487 (2021).
  • 56. Wolock SL, Lopez R & Klein AM. Scrublet: computational identification of cell doublets in single-cell transcriptomic data. Cell Syst. 8, 281–291 (2019).
  • 57. Tran AN, Dussaq AM, Kennell T Jr, Willey CD & Hjelmeland AB. HPAanalyze: an R package that facilitates the retrieval and analysis of the Human Protein Atlas data. BMC Bioinformatics 20, 463 (2019).
  • 58. Crowell HL et al. muscat detects subpopulation-specific state transitions from multi-sample multi-condition single-cell transcriptomics data. Nat. Commun. 11, 6077 (2020).
  • 59. Robinson MD, McCarthy DJ & Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
  • 60. Wickham H. ggplot2: Elegant Graphics for Data Analysis (Springer, 2016).
  • 61. Balasubramanian P et al. Single cell RNA profiling of blood CD4+ T cells identifies distinct helper and dysfunctional regulatory clusters in children with SLE. Zenodo 10.5281/zenodo.16385794 (2025).
