Author manuscript; available in PMC: 2025 Mar 1.
Published in final edited form as: Nat Immunol. 2024 Aug 20;25(9):1731–1741. doi: 10.1038/s41590-024-01888-9

CD4+ T cells exhibit distinct transcriptional phenotypes in the lymph node and blood following mRNA vaccination in humans

Nicholas Borcherding 1, Wooseob Kim 1,2, Michael Quinn 3, Fangjie Han 4, Julian Q Zhou 1, Alexandria J Sturtz 1, Aaron J Schmitz 1, Tingting Lei 1, Stefan A Schattgen 5, Michael K Klebert 6, Teresa Suessen 7, William D Middleton 7, Charles W Goss 8, Chang Liu 1, Jeremy Chase Crawford 5, Paul G Thomas 5, Sharlene A Teefey 7, Rachel M Presti 6,9,10,11, Jane A O’Halloran 9, Jackson S Turner 1, Ali H Ellebedy 1,10,11,*, Philip A Mudd 4,10,*
PMCID: PMC11627549  NIHMSID: NIHMS2037041  PMID: 39164479

Abstract

SARS-CoV-2 infection and mRNA vaccination induce robust CD4+ T cell responses. Using single-cell transcriptomics, here we evaluated CD4+ T cells specific for the SARS-CoV-2 spike protein in the blood and draining lymph node (dLN) of human subjects 3 months and 6 months post-vaccination with the BNT162b2 mRNA vaccine. We analyzed 1,277 spike-specific CD4+ T cells, including 238 defined using Trex, a deep learning-based reverse epitope mapping method to predict antigen-specificity. Human dLN spike-specific CD4+ T follicular helper cells (TFH cells) exhibited heterogeneous phenotypes, including germinal center CD4+ TFH cells and CD4+ IL-10+ TFH cells. Analysis of an independent cohort of SARS-CoV-2-infected individuals 3 months and 6 months post-infection found spike-specific CD4+ T cell profiles in blood that were distinct from those detected in blood 3 months and 6 months post-BNT162b2 vaccination. Our findings provide an atlas of human spike-specific CD4+ T cell transcriptional phenotypes in the dLN and blood following SARS-CoV-2 vaccination or infection.


The SARS-CoV-2 pandemic provided a unique opportunity to study primary human immune responses to a new pathogen and the immunodominant SARS-CoV-2 spike antigen incorporated into various vaccine platforms. Messenger RNA (mRNA) vaccines engender strong immune responses to the SARS-CoV-2 spike antigen, including high-frequency circulating spike-specific CD4+ T cells1,2 and spike-specific CD4+ T follicular helper cells (CD4+ TFH cells) in the draining lymph node (dLN)3. CD4+ TFH cells support the development and maintenance of germinal center (GC) B cells in secondary lymphoid organs by providing appropriate co-stimulation and cytokine survival signals throughout antibody class switch, affinity maturation, long-lived plasma cell development and memory B cell development4-6. In mouse models, functional CD4+ TFH cells are absolutely required for productive GC and the development of memory B cells and long-lived plasma cells7-11.

Due to limitations in sampling human secondary lymphoid organs, GC and CD4+ TFH cell responses have been studied in easily accessible tissue compartments, including blood and discarded clinical tonsillectomy tissue12-14. Biopsies or autopsy have yielded insights into the phenotype of human CD4+ TFH cells in the lymph nodes15,16, but have been limited to the exploration of phenotypes at steady state. The evaluation of human antigen-specific CD4+ TFH cells in the secondary lymphoid organs after acute infection or vaccination is even more limited3,13,17 and rarely includes analysis of antigen-specific responses at the single-cell level.

Serial fine needle aspiration (FNA) of ultrasound-localized dLN has been used to probe human GC responses in the axillary dLN after deltoid intramuscular vaccination in a cohort of infection-naïve human subjects vaccinated with the SARS-CoV-2 mRNA vaccines3,18. These studies detected strong induction of spike-specific CD4+ TFH cell responses in the dLN, including CD4+ T cells that recognize the HLA-DPB1*04-restricted immunodominant epitope S167-1803 and persist for 6 months post-vaccination3. Here, we performed single-cell RNA sequencing (RNA-seq) to obtain matched transcriptome and T cell receptor (TCR) sequencing from >200,000 T cells from the blood and dLN of six SARS-CoV-2-naïve, HLA-DPB1*04+ human subjects from day 28 to day 201 post-vaccination with the first dose of a primary two-dose BNT162b2 mRNA vaccine series, with the second dose delivered exactly 21 days after dose one. Using a reverse epitope discovery technique developed to integrate biochemical properties of TCR complementarity determining region 3 (CDR3) amino acids and transcriptional profiles in single cells to predict antigen-specificity, we expanded the number of known spike-specific TCR in our dataset, confirmed these paired TCRs were spike-specific, and analyzed the transcriptional dynamics of multiple lineages of spike-specific CD4+ T cells restricted by multiple HLA class II alleles in the blood and dLN at day 110 and day 201 post-vaccination. We also incorporated the analysis of spike-specific CD4+ T cells from the blood of a cohort of six HLA-DPB1*04+ human subjects post-primary infection with SARS-CoV-2 and compared these responses to the spike-specific memory CD4+ T cells found after vaccination. Our data provide an atlas of total and spike-specific transcriptional phenotypes in CD4+ T cells found in blood and the dLN following initial exposure to the SARS-CoV-2 spike antigen.

Results

mRNA vaccination induces diverse T cell phenotypes

We performed single-cell RNA-seq and paired TCR sequencing on total dLN cells from FNA samples obtained on day 28, day 60, day 110 and day 201 after the first BNT162b2 vaccine dose from six HLA-DPB1*04+ subjects (34 to 48 years old; 2 female and 4 male) and on magnetically enriched total CD4+ cells from temporally-matched blood samples from four of the six subjects obtained at day 110 and day 201 post-dose one of the two-dose BNT162b2 vaccination3,19 (Fig. 1a, Extended Data Table 1, Extended Data Table 2). At all timepoints, all subjects had detectable spike-specific GC B cells in the evaluated dLN19. A total of 219,283 individual T cells passed all transcriptional quality metrics and contained a paired TCR sequence (Fig. 1b, https://cellpilot.emed.wustl.edu). Because we did not select for CD4+ T cells during dLN sample preparation and because of a low-level contamination with CD8+ T cells during the magnetic separation step in the blood samples (Extended Data Fig. 1), CD8+ T cells were included in the data set. Based on uniform manifold approximation and projection (UMAP) analysis, we identified 19 transcriptional T cell clusters (Fig. 1b). Following annotation using granular cell types with canonical markers and reference atlases (Extended Data Fig. 2), we identified two CD4+ TFH cell clusters (C10 and C15) and one CD4+ T follicular memory (TFM) cell cluster (C1) that co-localized in the same region of the UMAP (Fig. 1b). Common T cell markers such as CD4, CD8A, CXCR5, ICOS and PDCD1 separated clearly in the UMAP projection (Fig. 1c). All 19 T cell clusters were present at all time points (Fig. 1d) and in both the blood and dLN (Fig. 1e). Throughout the UMAP projection, we found individual T cells with published SARS-CoV-2-specific TCR CDR3 sequences3,20,21, including the immunodominant HLA-DPB1*04-restricted CD4+ T cell epitope S167-1803 (Fig. 1f). S167-180-specific TCRs were primarily localized in the two CD4+ TFH cell clusters (C10 and C15, Fig. 1b,f), consistent with an enrichment of dLN tissue from HLA-DPB1*04+ individuals during an ongoing GC response in our dataset. Alignment of previously published spike-specific TCR alpha chain (TRA) and TCR beta chain (TRB) CDR3 sequences3,20,21 found in our dataset identified dominant polar amino acid signatures at positions 11, 12 and 13 of the TRA CDR3 and positions 10 and 13 of the TRB CDR3 and similarities in amino acids found at other key CDR3 residues (Fig. 1g). As such, our analysis of more than 200,000 T cells found in the blood and dLN in the first 6 months following primary BNT162b2 mRNA vaccination revealed diverse T cell transcriptional profiles enriched for spike-specific CD4+ TFH cells.

Figure 1. BNT162b2 mRNA vaccination induces spike-specific CD4+ T cells with diverse transcriptional phenotypes in the blood and dLN.

a. Schematic showing sample collection time points for dLN FNA and peripheral blood collection from six donors 368-01a, 368-04, 368-13, 368-16, 368-20 and 368-22 at day 28, day 60, day 110 and day 201 post-first dose of the BNT162b2 vaccine and graph showing the number of cells isolated from each donor from dLN or blood. For each sample collection, a technical replicate was performed and sequenced. b. UMAP of 219,283 dLN and blood CD3+ T cells with paired TRA-TRB sequences that passed quality control filtering. Cluster annotation based on canonical subtype markers and automated annotation using SingleR and ProjecTIL. c. Gene-weighted density estimates overlaid on the UMAP coordinates for the T cell markers CD4, CD8A, CCR7, SELL, CXCR5, ICOS, FOXP3, IL2RA, CTLA4 and PDCD1. d,e. Relative cellular density at day 28, 60, 110 and 201 (d) and in blood and dLN (e) in CD3+ T cells as in b. f. Localization of spike-specific TCRs identified in refs. 3,21 along the UMAP. S167-180-specific TCRs are highlighted in blue; other spike-specific TCRs are in red. g. Alignment of TRA and TRB CDR3 motifs for S167-180-specific3 and other spike-specific21 TCRs.

The dLN contains transcriptionally diverse CD4+ TFH

To analyze the phenotypic dynamics of human CD4+ TFH cells in the dLN post-vaccination, we generated a new UMAP of all dLN CD4+ TFH cells and CD4+ TFM cells found in clusters C1, C10 and C15, which identified 12 distinct phenotypic subclusters (denoted c0-c11) (Fig. 2a-c, https://cellpilot.emed.wustl.edu). These were principally classified as the well-described4,22 CXCL13+CXCR5+BCL6+ CD4+ GC TFH cells (c3), the previously described15,23 CD4+ IL-10+ TFH cells (c8), cytotoxic GZMA+GZMK+ CD4+ TFH C cells (c9), effector IRF4+ CD4+ TFH EFF cells (c6), proliferating MKI67+ CD4+ TFH Pro cells (c11), regulatory FOXP3+ CD4+ TFR cells (c4) and memory KLF2+ CD4+ TFM cells (c0, c1, c2, c7, and c10). A single distinct cluster (c5) represented CD8+ T cells and was not considered further. CD4+ GC TFH and CD4+ IL-10+ TFH cells clustered together in hierarchical clustering analysis of gene sets (Fig. 2d) and shared many characteristics, including the highest expression of canonical CD4+ GC TFH markers such as CXCR5, PDCD1 and BCL6 (Fig. 2b,c); the highest expression of genes related to TCR signaling, T helper pathways, activation pathways, cell adhesion signaling and antigen presentation in gene set enrichment analysis (Fig. 2d); and high expression of genes related to increased metabolic activity, with elevated expression levels of genes involved in oxidative phosphorylation, glycolysis and PI3K/AKT signaling (Fig. 2d). CD4+ GC TFH and CD4+ IL-10+ TFH cells maintained a relatively consistent expression of the distinguishing gene sets throughout the GC response from day 28 to day 201 (Fig. 2e) and shared the largest number of identical paired TCR clonotypes among all 12 dLN TFH clusters (Fig. 2f), suggesting significant overlap in the clonal populations recruited to these effector CD4+ TFH cell subsets. Despite the close relationship between CD4+ GC TFH and CD4+ IL-10+ TFH cells, they exhibited differences in cytokine gene expression, with exclusive expression of IL10 and much higher expression of IL21 in CD4+ IL-10+ TFH cells (c8) and much higher expression of IL4 in CD4+ GC TFH cells (c3) (Fig. 2b).

Figure 2. Diverse CD4+ TFH cell and CD4+ TFM cell transcriptional phenotypes are detected in the dLN after BNT162b2 mRNA vaccination.

a. UMAP of the subset of CD4+ TFH cells and CD4+ TFM cells from Fig. 1b found in the dLN of 368-01a, 368-04, 368-13, 368-16, 368-20 and 368-22 on day 28, 60, 110 and 201 post-first dose of the BNT162b2 vaccine. b. Gene-weighted density estimates of the indicated transcripts overlaid on the UMAP as in a. c. Top 8 or fewer differentially expressed genes in clusters c0-c11 based on UMAP as in a. Dot size represents the percentage of cells expressing the gene, and color is assigned based on scaled expression value. d. Heatmap of median gene set enrichment for the significantly altered gene sets in clusters c0-c4 and c6-c11 as in a. K1-K4 represent k-means clustering of gene sets, with general summaries of groupings listed to the right of each group. Significance defined as adjusted p-value < 0.05 via two-way ANOVA. e. Normalized gene set enrichment values in clusters c0-c4 and c6-c11 at day 28, 60, 110 and 201 post-first dose of the BNT162b2 vaccine. Colors indicate individual gene sets. f. Circos plot showing overlap of unique individual TCR clonotypes at day 28, 60, 110 and 201 post-first dose of the BNT162b2 vaccine, with ribbons between clusters representing overlapping clonotypes. g. Relative CD4+ TFH and CD4+ TFM cell density at day 28, 60, 110 and 201 post-first dose of the BNT162b2 vaccine. S167-180-specific TCR represented by white dots. Shown is the percentage of S167-180-specific CD4+ TFH and CD4+ TFM cells among total CD4+ TFH and CD4+ TFM cells at each time point.

The CD4+ TFH EFF cells and CD4+ TFH C cells expressed much less CXCR5, PDCD1 and BCL6 compared to CD4+ GC TFH cells and CD4+ IL-10+ TFH cells, but segregated with CD4+ GC TFH and CD4+ IL-10+ TFH in hierarchical clustering analysis (Fig. 2d), although the functional significance of these two TFH populations was not clear. CD4+ TFH EFF cells had high expression of the microRNA MIR155HG (Fig. 2c), a transcript associated with increased inflammation through inhibition of SOCS1 and many other genes24, which was also shown to encode a short functional peptide, miPEP155, which modulates class II antigen presentation25; had the highest expression of the T cell transcriptional regulators IRF4 and NFKBID (Fig. 2c), suggesting a transitional phenotype, and that they may ultimately develop into other CD4+ T cell subsets; and had the highest frequency of clonotypic overlap with CD4+ GC TFH (Fig. 2f), suggesting a relationship between these two subsets. CD4+ TFH C cells exhibited clonal overlap with CD4+ TFM cells (c1) at day 28, day 60, day 110 and day 201 and had minimal clonal overlap with the other clusters (Fig. 2f), suggesting a unique lineage distinct from CD4+ GC TFH cells that may be related to previously identified “cytotoxic” TFH13.

CD4+ TFH Pro cells (c11) expressed the proliferation marker MKI67 as well as PDCD1 and CXCR5 (Fig. 2c). While a spatiotemporal relationship between CD4+ TFH Pro and the other effector TFH could not be established from these data alone, unique paired TCR clonotypes were shared specifically between CD4+ TFH Pro and CD4+ GC TFH and CD4+ IL-10+ TFH cells at day 28, day 60, day 110 and day 201 (Fig. 2f), suggesting that CD4+ GC TFH and CD4+ IL-10+ TFH clones entered proliferation cycles throughout the 6-month timeline, rather than undergoing an early burst of proliferation followed by maintenance of the cluster size over time.

dLN CD4+ TFM cells (c0, c1, c2, c7 and c10) had relatively low expression of genes involved in oxidative and glycolytic metabolism pathways, TCR signaling and cell adhesion signaling compared to effector TFH populations (Fig. 2d) and expressed transcription factors involved in maintaining long-term T cell responsiveness and homeostasis, such as KLF226,27, JUN28, JUNB29 and KLF627 (Fig. 2c). Cluster c7 and cluster c1 CD4+ TFM cells exhibited clonal overlap with CD4+ GC TFH cells (Fig. 2f), suggesting a close relationship between CD4+ GC TFH and the CD4+ TFM populations found in c7 and c1. TCR sequences specific for the S167-180-epitope3 were found primarily in the CD4+ GC TFH cell cluster at every evaluated time point (Fig. 2g) and were found in the CD4+ IL-10+ TFH, CD4+ TFH Pro and CD4+ TFM cell clusters in small numbers in at least one time point (Fig. 2g). Thus, CD4+ TFH cell populations found in the dLN following BNT162b2 mRNA vaccination were transcriptionally diverse and included effector, memory, proliferating and regulatory populations and spike-specific CD4+ GC TFH cells.

Trex can identify antigen-specific CD4+ TFH

To expand the number of spike-specific TCR sequences in the present dataset, we developed a method (referred to as Trex, T cell receptor and expression) that identifies antigen-specific CD4+ T cells by co-embedding the RNA transcriptome with the latent dimensional embeddings of both the TRA and TRB CDR3 sequences for each clonotype, thereby integrating the biochemical properties of the TCR amino acids and the transcriptional signatures of specific cells (Fig. 3a). Model hyperparameters were chosen empirically using a bootstrap approach (Extended Data Fig. 3a,b). Each model in Trex showed high fidelity in the return of unique latent dimensional embeddings across sequences (Extended Data Fig. 3c) and runtimes less than 20 seconds for 50,000 unique TCR sequences (Extended Data Fig. 3d). The latent dimensional embeddings were based on the output of neural network-based models called variational autoencoders, which transform the amino acid sequence of each clonotype into a matrix based on Kidera factors before encoding (Fig. 3a). For a given clonotype, a centroid-like approach was used to select a best representative cell to use for RNA expression, based on the minimal Euclidean distance across the calculated principal components (Fig. 3a), similar to clonotype neighbor graph analysis (CoNGA)30. For a given clonotype, the TRA, TRB and RNA vectors were then co-embedded, and a nonlinear dimensional reduction was calculated to represent an immune response at both the transcriptional and repertoire levels simultaneously (Fig. 3a).
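The centroid-like selection of a representative cell per clonotype can be illustrated with a minimal sketch. This is one plausible reading of the description above, not the Trex implementation: the function name `representative_cells`, the dictionary inputs and the toy principal-component scores are all hypothetical, and the centroid is assumed to be the per-component mean of a clonotype's cells.

```python
from collections import defaultdict
from math import dist
from statistics import fmean


def representative_cells(pcs, clonotype_of):
    """Pick, per clonotype, the cell closest to the clonotype's PC centroid.

    pcs: dict mapping cell ID -> sequence of principal-component scores.
    clonotype_of: dict mapping cell ID -> clonotype label.
    (Both input layouts are assumptions made for this illustration.)
    """
    members = defaultdict(list)
    for cell, clone in clonotype_of.items():
        members[clone].append(cell)

    chosen = {}
    for clone, cells in members.items():
        # Centroid = per-component mean over the clonotype's cells.
        centroid = [fmean(component) for component in zip(*(pcs[c] for c in cells))]
        # Representative = cell with minimal Euclidean distance to the centroid.
        chosen[clone] = min(cells, key=lambda c: dist(pcs[c], centroid))
    return chosen


# Hypothetical toy data: three cells in clonotype "A", one in clonotype "B".
toy_pcs = {
    "cell1": (0.0, 0.0),
    "cell2": (1.0, 1.0),
    "cell3": (5.0, 5.0),
    "cell4": (-2.0, 3.0),
}
toy_clonotypes = {"cell1": "A", "cell2": "A", "cell3": "A", "cell4": "B"}
print(representative_cells(toy_pcs, toy_clonotypes))
```

In this toy case the centroid of clonotype "A" is (2, 2), so "cell2" is selected; a singleton clonotype trivially returns its only cell.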

Figure 3. Coembedding of single-cell TCR and RNA values from dLN CD4+ TFH and CD4+ TFM using Trex identifies spike-specific responses.

a. Graphical representation showing the computational embedding of TCR CDR3 amino acid sequences and single-cell RNA from CD4+ TFH cells to generate a heat-diffusion-based manifold of CD4+ TFH cells. Matrices are rescaled based on nearest neighbors and the corrected values are then used for dimensional reduction. b. PHATE projection of the tri-modal (RNA, TRA and TRB) embedding of dLN CD4+ TFH cells by clonotype. Total number of each unique clonotype represented by dot size. c. Representative RNA expression of CD4+ TFH cell marker genes (BCL6, CXCL13, CXCR5, ICOS, MAF, PDCD1, EMP3 and KLF2) overlaid onto the PHATE projection as in b. d. The location of spike-specific3,21 TRA (upper) and TRB (lower) CDR3 in the PHATE projection. e. Alignment of TRA and TRB motifs in spike-specific TCRs3,21 from PHATE-defined clusters Trex-C0, Trex-C1 and Trex-C3. f. The location within the PHATE projection (left) and sequence (right) of five candidate spike-specific TCR clonotypes derived from PHATE-defined clusters Trex-C0 and Trex-C1 that have one TCR chain appearing in > 1 donor and have not been previously described as specific for SARS-CoV-2 spike.

We used Trex to examine all dLN CD4+ TFH cells (clusters c0-c11) and generated a PHATE-based manifold of the resulting data that contained six independent clusters denoted Trex-C0-C5 (Fig. 3b, https://cellpilot.emed.wustl.edu). Transcripts of various CD4+ TFH cell genes partitioned throughout the manifold (Fig. 3c), consistent with the inclusion of both transcriptional and TRA-TRB properties in the model. We found that previously known spike-specific TCR clonotypes co-localized into unique and very focal areas within clusters Trex-C0, Trex-C1 and Trex-C3 of the PHATE-based manifold (Fig. 3d). TRA and TRB CDR3 in these three clusters shared related amino acid biochemical properties (Fig. 3e) that were similar to those observed in the published spike-specific clonotypes. Comparison of the overlap of the nearest neighbors between the Trex- and CoNGA-derived TCR vectors (Extended Data Fig. 4a,b) indicated distinct vectors but an overlap in the nearest neighbor clusters that contained spike-specific CD4+ TFH cell clones (Extended Data Fig. 4c). CoNGA TCR-based clustering centralized spike-specific CD4+ TFH cell clones into a single cluster, whereas Trex-based clusters exhibited multiple small spike-specific-predominant clusters (Extended Data Fig. 4d,e).

To test whether clonally-expanded dLN CD4+ TFH cells with at least one public TRA or TRB shared in two or more donors located within clusters Trex-C0, Trex-C1 and Trex-C3 and found in proximity to other known spike-specific CD4+ T cell clonotypes in the Trex PHATE-based manifold had a high probability of being spike-specific, we chose five TCR candidates that fitted these criteria (Fig. 3f) and were distributed uniquely into multiple CoNGA TCR-based clusters and Trex-based clusters (Extended Data Fig. 4f). We synthesized the five TCRs, cloned them into a retroviral transduction system31 and transduced them into primary human CD4+ T cells or Jurkat cells expressing a NFAT-GFP reporter. To determine epitope specificity, we mapped the responsiveness of each transduced TCR (TCR1-TCR5) to overlapping spike peptides in vitro. All five candidate TCRs were spike-specific (Extended Data Fig. 5). TCR2 mapped to S167-180 and bound the HLA-DPB1*04:01-S167-180 tetramer (Extended Data Fig. 5), but did not share the TRA CDR3 motif previously characterized as S167-180-specific3. The spike-specific epitopes for TCR3, TCR4 and TCR5 were restricted by HLA-DRB5*02:02, DRB1*07:01 and DPB1*02:01, respectively (Extended Data Fig. 6). Next, we selected all members of the TRA-TRB families with highly-related TCR to the five index TCR candidates that were experimentally determined to be spike-specific. This expanded the total number of analyzed spike-specific dLN CD4+ TFH cells from 164 to 238 (Supplementary Table 3). Thus, Trex, along with in vitro validation of potential antigen-specific TCRs, expanded the total number of analyzed spike-specific dLN CD4+ TFH cells in our single-cell dataset by approximately 45%.

Spike-specific dLN TFH gene expression varies over time

We next explored the phenotypic dynamics of the expanded dataset of 238 spike-specific dLN CD4+ TFH cells on day 28, day 60 and day 201 following mRNA vaccination (Fig. 4). dLN CD4+ TFH cells from day 110 post-vaccination included significantly fewer spike-specific cells (10) compared to day 28 (94), day 60 (64) and day 201 (70), and were therefore excluded from the analysis. Gene set enrichment analysis revealed elevated T cell activation, interleukin signaling, cytokine signaling, infection response, IL-12 signaling, GATA3 signaling, NKT pathway genes, P38-MAPK signaling and TGFβ signaling pathways at day 60, the peak of the GC response (Fig. 4a). Gene sets representing CXCR4 signaling and cell cycle progression were significantly enriched in spike-specific CD4+ TFH cells at day 201, at the end of the GC response (Fig. 4a).

Figure 4. Spike-specific CD4+ TFH cell transcriptional phenotypes detected in the dLN change over time.

a. Median gene set enrichment heatmap showing immune-related gene sets in dLN CD4+ TFH cells at day 28, 60 and 201 post-first dose of the BNT162b2 vaccine with 5 distinct clusters defined by k-means (K1-K5). b. Volcano plot of differential gene expression in dLN spike-specific CD4+ TFH cells at day 28 (n=94) and day 201 (n=70) post-first dose of the BNT162b2 vaccine. Size is based on the change in the percentage of cells expressing the gene at day 28 compared to day 201. Statistical testing performed using two-sided MAST testing without adjustment for multiple comparisons. c. Clonal proportion of the spike-specific dLN CD4+ TFH cell repertoire at day 28, 60 and 201 post-first dose of the BNT162b2 vaccine. Top number on the bar plot indicates the number of clones at the specific time point shared across more than one time point, whereas the number on the bottom indicates the number of spike-specific clones unique to the time point.

We detected several genes that were differentially expressed between day 28 and day 201, the beginning and end of the GC reaction (Fig. 4b, Supplementary Table 4). Early spike-specific dLN CD4+ TFH cells (day 28) had higher expression of ICAM1 (Fig. 4b), suggesting enhanced activation and clustering of CD4+ TFH cells during the early GC; and higher expression of ZBTB14 (Fig. 4b), a poorly characterized member of the zinc finger and BTB domain family of transcription factors, which also includes Bcl-6 (ZBTB27)32. Genes involved in cholesterol metabolism (RELCH), ubiquitination (GID4), and intracellular signaling (MAP4K4, ANXA1) were upregulated in spike-specific CD4+ TFH cells at day 201 (Fig. 4b). To evaluate the paired TCR clonotypes found in the dLN spike-specific CD4+ TFH cells at various time points, we tracked 21 identical paired TCR clonotypes observed at more than one time point during the ongoing GC (Fig. 4c). These TCR clonotypes accounted for between 5% and 28% of the total number of sequenced cells (Fig. 4c), indicating persistence or proliferation of clonally-identical spike-specific CD4+ TFH within the GC over time. These observations indicated that spike-specific CD4+ TFH in the dLN had distinct transcriptional signatures early (day 28), at peak (day 60) and late (day 201) in the GC response and that clonally identical spike-specific CD4+ TFH cell populations persisted throughout the course of the human GC.
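The clonotype tracking described above can be sketched as follows: find paired TCR clonotypes seen at more than one time point, then compute the fraction of cells at each time point that belong to such clonotypes. This is a hedged illustration only; the function name `shared_clonotype_summary`, the `(time_point, clonotype)` input layout and the toy labels are assumptions, not the authors' pipeline or data.

```python
from collections import Counter, defaultdict


def shared_clonotype_summary(cells):
    """Summarize clonotypes observed at more than one time point.

    cells: iterable of (time_point, clonotype) pairs, one per sequenced cell.
    Returns the set of shared clonotypes and, per time point, the fraction
    of cells that belong to a shared clonotype.
    """
    counts = defaultdict(Counter)  # time_point -> clonotype -> cell count
    seen_at = defaultdict(set)     # clonotype -> time points where it occurs
    for time_point, clonotype in cells:
        counts[time_point][clonotype] += 1
        seen_at[clonotype].add(time_point)

    shared = {clone for clone, days in seen_at.items() if len(days) > 1}
    fractions = {}
    for time_point, per_clone in counts.items():
        total = sum(per_clone.values())
        shared_cells = sum(n for clone, n in per_clone.items() if clone in shared)
        fractions[time_point] = shared_cells / total
    return shared, fractions


# Hypothetical toy input: clonotype labels stand in for paired TRA-TRB CDR3s.
toy_cells = [
    (28, "c1"), (28, "c1"), (28, "c2"), (28, "c3"),
    (60, "c1"), (60, "c4"),
    (201, "c2"), (201, "c5"), (201, "c5"), (201, "c6"),
]
shared, fractions = shared_clonotype_summary(toy_cells)
print(sorted(shared), fractions)
```

In practice a clonotype key would be the paired TRA and TRB CDR3 sequences (optionally with V and J genes); the toy labels keep the example short.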

Circulating and dLN T cells show minimal overlap

To determine if clonally-identical populations of spike-specific CD4+ T cells could be found in the blood during the ongoing GC reaction, we assessed our dataset for identical paired TRA-TRB sequences in all the sequenced dLN and blood T cells from three of the six individuals vaccinated with BNT162b2 that had matched blood and dLN samples at day 110 (368-01a, 368-13 and 368-22) and a paired blood and dLN sample from donor 368-01a at day 201 post-vaccination, for which the analysis included all CD4+ T cells and CD8+ T cells. Despite 415 spike-specific CD4+ TCR clonotypes identified in the three donors in both dLN (91 spike-specific clonotypes) and blood (324 spike-specific clonotypes) at day 110 and day 201, we found no clonally-identical paired TCRs in the blood and dLN in these four matched samples (data not shown). Expansion of the analysis found minimal overlap between the paired TCR repertoire in the blood and dLN when all blood and dLN samples from these three donors (blood samples from all three donors on days 110 and 201, dLN samples from 368-01a on days 28, 60, 110 and 201, dLN samples from 368-13 on days 60 and 110, and dLN samples from 368-22 on days 60 and 110) were analyzed together (Fig. 5a). We found 6 overlapping TCR (out of 47,560 sequenced) in donor 368-01a, no overlapping TCR (out of 39,280 sequenced) in donor 368-13, and 58 overlapping TCR (out of 44,817 sequenced) in donor 368-22 (Fig. 5a). Rarefaction analysis suggested adequate sampling depth to fully represent the diversity of the TCR repertoire in both the dLN and blood compartments in these three donors (Fig. 5b), suggesting that these two compartments represented distinct populations of clonally diverse T cells 3 to 6 months after vaccination.
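As a rough illustration of the overlap comparison, the sketch below counts clonotypes shared between two compartments and computes a Jaccard index over sets of paired TRA-TRB clonotypes. The function name `clonotype_overlap` and the placeholder clonotype tuples are hypothetical; the example does not use the study's data and does not claim to reproduce how the published index was calculated.

```python
def clonotype_overlap(blood, dln):
    """Return shared clonotypes and the Jaccard index of two repertoires.

    blood, dln: iterables of hashable clonotype keys, for example
    (TRA CDR3, TRB CDR3) tuples. Duplicates are collapsed, so the index
    reflects unique clonotypes rather than cell counts.
    """
    blood, dln = set(blood), set(dln)
    shared = blood & dln
    union = blood | dln
    # Jaccard index = |intersection| / |union|; defined as 0 for empty input.
    jaccard = len(shared) / len(union) if union else 0.0
    return shared, jaccard


# Hypothetical placeholder clonotypes (not real CDR3 sequences).
toy_blood = [("TRA_1", "TRB_1"), ("TRA_2", "TRB_2"), ("TRA_3", "TRB_3")]
toy_dln = [("TRA_3", "TRB_3"), ("TRA_4", "TRB_4")]
shared, jaccard = clonotype_overlap(toy_blood, toy_dln)
print(shared, jaccard)
```

Working on unique clonotypes rather than cells means a highly expanded clone counts once, which matches a question about whether a clone is present in both compartments rather than how abundant it is.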

Figure 5. TCR sequencing reveals limited overlap in clonal TCR repertoire between dLN and blood 3-6 months after mRNA vaccination.

a. Representation of total T cell clonal overlap between blood and dLN in donors 368-01a, 368-13 and 368-22 (blood samples from all three donors on days 110 and 201, dLN samples from 368-01a on days 28, 60, 110 and 201, dLN samples from 368-13 on days 60 and 110, and dLN samples from 368-22 on days 60 and 110 post-first dose of the BNT162b2 vaccine). Numbers indicate unique TCR clones and J is the calculated Jaccard similarity index. b. Rarefaction and extrapolation of all TCR clones included in a, for donors 368-01a, 368-13 and 368-22. The dotted line indicates the point of extrapolation, and the ribbon is the 95% confidence interval. c. Scatter plot showing the dLN or blood location and proportion of total TCR repertoire for each TCR clonotype found in donors 368-01a and 368-22. Overlapping clonotypes are indicated in yellow.

The majority of shared clonotypes between blood and dLN identified in donors 368-01a and 368-22 represented relatively infrequent paired TCR clonotype populations found in only one or two T cells in either the blood or the dLN, rather than clonally expanded populations (Fig. 5c, Supplementary Tables 5 and 6). The four notable exceptions were all CD8+ T cell populations found in 13-19 CD8+ T cells (Fig. 5c, Supplementary Tables 5 and 6). Based on transcriptional profile, 48% of the populations with overlapping TCR clonotypes were CD8+ T cells (Supplementary Tables 5 and 6), despite the magnetic enrichment of the blood samples for CD4+ T cells (> 97% purity prior to sequencing), suggesting that the frequency of overlapping blood and dLN clonal CD4+ T cell populations was substantially less than that observed for CD8+ T cells. None of the 58 overlapping TCR clonotypes in donor 368-22 contained CDR3 sequences of known SARS-CoV-2 spike-specific CD4+ T cells, including those determined to be spike-specific here (Supplementary Table 6). Three of the six overlapping clonotypes in donor 368-01a were SARS-CoV-2 spike-specific CD4+ T cells: two were S167-180-specific and a third was S120-136-specific (TCR5) (Supplementary Table 5). The three overlapping spike-specific CD4+ T cell clonotypes were found at day 28 and day 60 in the dLN and at day 110 and day 201 in the blood (Supplementary Table 5), perhaps indicating the emergence of memory TFH from the dLN to the blood late in the course of the GC. In summary, we found that three to six months following mRNA vaccination, the blood and dLN contained distinct clonotypic populations of spike-specific CD4+ T cells.

Spike-specific CD4+ T cells are detected in the blood and dLN

We next explored the transcriptional signatures of SARS-CoV-2 spike-specific CD4+ T cells found in the blood and dLN samples collected at day 110 and day 201 and included both CD4+ TFH cells and non-CD4+ TFH cells from all sequenced dLN and blood samples at these time points, including TCR that were S167-180-specific, previously published spike-specific TCR and the five new clonotype families identified using Trex. A broad evaluation of the transcriptional differences between the blood and dLN compartments identified the upregulation of PDCD1 and CXCL13 in dLN spike-specific CD4+ T cells compared to blood spike-specific CD4+ T cells at both day 110 and day 201 (Fig. 6a and Supplementary Table 7). REL and RELB, which are involved in canonical and non-canonical NF-κB signaling respectively, and CST7, which encodes a cysteine protease inhibitor, were significantly upregulated in blood spike-specific CD4+ T cells compared with dLN spike-specific CD4+ T cells (Fig. 6a and Supplementary Table 7).

Figure 6. Total spike-specific blood CD4+ T cells are transcriptionally distinct from the total spike-specific CD4+ T cell population in the dLN.

a. Volcano plot of differential gene expression between all spike-specific CD4+ T cells in dLN (n=533) and blood (n=938). Size of points is based on the difference in the percentage of cells expressing each gene in dLN compared to blood. Statistical testing was performed using two-sided MAST testing with adjustment for multiple comparisons. b. Z-scaled median gene set enrichment heatmap for immune-related gene sets found in dLN and blood spike-specific CD4+ T cells at day 110 and day 201 post-first dose of the BNT162b2 vaccine.

Gene set enrichment analysis of these data indicated substantial similarity between the spike-specific CD4+ T cells in the peripheral blood and some differences between the dLN spike-specific CD4+ T cell transcriptional profiles at day 110 and day 201 (Fig. 6b). Spike-specific CD4+ T cells in the blood had less DNA repair and glycolipid metabolism signaling than the dLN samples at day 110 and day 201 (Fig. 6b). We observed enrichment of TCR signaling, T cell activation and cytokine signaling pathways in the dLN samples at day 201, while the dLN samples at day 110 had significantly elevated amino acid metabolism and Notch signaling when compared with the dLN samples at day 201 (Fig. 6b). Thus, evaluation of all spike-specific CD4+ T cells in both the blood and the dLN, agnostic to TFH markers, found signatures of the ongoing CD4+ GC TFH cell response in the dLN and evidence of upregulated NF-κB signaling in spike-specific CD4+ T cells in the blood.

Infection induces distinct spike-specific CD4+ T cell phenotype

We next leveraged our ability to detect large numbers of spike-specific CD4+ T cells in HLA-DPB1*04:01+ individuals to compare the transcriptional phenotype of peripheral blood spike-specific CD4+ T cells in BNT162b2-vaccinated individuals to the phenotype of peripheral blood spike-specific CD4+ T cells from individuals post-acute symptomatic primary infection with SARS-CoV-2. We generated a new dataset that included single-cell RNA-seq and TCR sequencing of PBMC collected at day 110 (3 months) and day 201 (6 months) post-first dose from four BNT162b2-vaccinated individuals (3 male, 1 female, age range 34-38) and magnetically enriched (>97% purity) CD3+ T cells from PBMC collected 1 month, 3 months and 6 months post-infection from six HLA-DPB1*04:01+ individuals hospitalized with moderate (n=3, all male, age range 53-75, WHO severity scale 3-4), or severe (n=3, all male, age range 53-73, WHO severity scale range 6-8, one of whom died) COVID-19 during the first wave of the pandemic (April to August of 2020), prior to the introduction of vaccines33 (Fig. 7a and Extended Data Tables 1 and 2). All 10 donors were exposed to the SARS-CoV-2 spike antigen for the first time, either through the two-dose BNT162b2 vaccine or through natural infection. UMAP projection of all spike-specific CD4+ T cells identified nine clusters: CD4+ central memory (TCM) cells (0), central memory CD4+ T cells expressing SLC2A3 that encodes the glucose transporter 3 protein GLUT3 (CD4+ GLUT3+ TCM cells; 1), CD4+ effector memory (TEM) cells (2), CD4+ GLUT3+ TEM cells (3), a population of CD4+ TCM cells expressing the anti-apoptotic transcripts BCL2 and GIMAP5 (CD4+ Bcl-2+ TCM cells) (4), two clusters of cytotoxic effector memory CD4+ T cells (TEM C) (5 and 8), FOXP3+ CD4+ Treg cells (6) and a population of regulatory CD4+ CD52+ T cells34 (7) (Fig. 7b, https://cellpilot.emed.wustl.edu). All clusters were found in both infected and vaccinated individuals at the matched 3-6 month time points (Fig. 7c). CD4+ Bcl-2+ TCM cells, CD4+ GLUT3+ TEM cells and CD4+ CD52+ T cells represented a significantly higher proportion of the spike-specific CD4+ T cells at month 3 and month 6 after antigen exposure in vaccinated compared to infected individuals (Fig. 7c). CD4+ TCM cells and CD4+ TEM cells were a significantly higher proportion of spike-specific CD4+ T cells at month 3-6 after antigen exposure in infected donors (Fig. 7c). Relatively small increases in the total proportion of CD4+ TCM cells and CD4+ TEM cells and a slight decrease in the total proportion of CD4+ TEM C cells were detected between infected individuals with moderate or severe COVID-19 at the acute (day 18-36) time point (Extended Data Fig. 7). The two populations of CD4+ TEM C cells (5 and 8) were characterized by high expression of cytotoxic cytokines (CCL4 and CCL5) and granzymes (GZMA, GZMK, and GZMH) (Fig. 7d). We also observed distinct populations of circulating regulatory FOXP3+ CD4+ Treg cells (6) and FOXP3loCD52hiCD4+ T cells (7) (Fig. 7d), which were reported to suppress antigen-specific T cell responses through soluble CD52 ligation of Siglec-10 on target cells34.

Figure 7. Circulating blood spike-specific CD4+ T cells induced by infection are transcriptionally distinct from those induced by mRNA vaccination.

a. Schematic showing the timeline for blood sample collection 3-months and 6-months post-first dose of the BNT162b2 vaccine in donors (368-01a, 368-04, 368-13 and 368-22, n=4) or following SARS-CoV-2 infection in individuals who developed moderate (350-041, 350-117 and 350-400, n=3) or severe (350-065, 350-084 and 350-397, n=3) disease. b. UMAP projection of all circulating blood spike-specific CD4+ T cells in all donors as in a., including 693 from SARS-CoV-2-infected and 804 cells from BNT162b2-vaccinated donors (1,497 total cells). c. UMAP projection (left) and proportion breakdown (right) of cluster composition for circulating blood spike-specific CD4+ T cells in donors 368-01a, 368-04, 368-13, 368-22, 350-041, 350-117 and 350-400 at 3 months and 6 months in SARS-CoV-2-infected (infected, n=289) and BNT162b2-vaccinated (vaccinated, n= 804) donors. Statistical significance was based on bootstrapping 1,000 times to form a null distribution. * corrected p-value < 0.05; ** corrected p-value < 0.01. d. Top 8 or fewer cluster-defining differentially expressed genes in clusters 0-8 as in b. e. TCR cluster assignments based on normalized Levenshtein distance of the CDR3 sequence across donors. Only cluster assignments with more than two clonotypes were retained. f. UMAP and proportion breakdown of circulating blood spike-specific CD4+ T cells in infected donors 350-041, 350-117 and 350-400 at day 18 to 36 (Early) and at 3-6 months (Late).

Expanded clonotypes of spike-specific TCR with highly related TCR suggestive of clonal groups were found in all clusters except the small CD4+ TEM C cluster 8 and in every individual donor (Fig. 7e). Most of the spike-specific TCR clonal groups were found in the CD4+ TCM, CD4+ GLUT3+ TCM and CD4+ Bcl-2+ TCM cell clusters (Fig. 7e). CD4+ TCM cells from infected individuals composed a larger proportion of total spike-specific CD4+ T cells at day 18-36 post-infection compared to month 3-6 post-infection (Fig. 7f). CD4+ TEM C cells composed a higher proportion of circulating spike-specific CD4+ T cells in infected individuals at month 3-6 post-infection compared to day 18-36 post-infection (Fig. 7f). Thus, circulating spike-specific CD4+ T cells exhibited distinct phenotypes following primary induction by either infection or vaccination.

Discussion

Using previously published spike-specific TCR sequences and Trex as a tool to assist with additional reverse epitope discovery30,35, here we longitudinally tracked the evolution of large numbers of SARS-CoV-2 spike-specific CD4+ T cells in the human blood and dLN in the first 6 months after SARS-CoV-2 mRNA vaccination or infection. An interactive online data portal, CellPilot (https://cellpilot.emed.wustl.edu), allows rapid and detailed interrogation of our dataset.

Recent supervised and unsupervised informatic tools for analyzing TCRs and antigen specificity have been developed36-41. Trex is a TCR analysis platform built to combine deep variational autoencoders with gene expression data at the single-cell level. Although several methods on the combination of TCR data and gene expression have been published30,40, Trex offers up to 8 variational autoencoding models (four per TCR chain) and a generative artificial intelligence approach to encode TCR amino acid sequences into latent dimensional space. In addition, the latent dimensional space of the TCRs can be used adaptively to filter, cluster, or as a layer input for multimodal dimensionality reduction. In the future, the use of this technique to combine single-cell RNA, protein and chromatin accessibility quantification with the vectorized TCRs could allow for an even more comprehensive analysis of antigen-specific immune responses.

Despite identifying eleven CD4+ TFH and CD4+ TFM cell transcriptional phenotypes in the dLN following vaccination, the majority of spike-specific CD4+ TFH exhibited the classical CD4+ GC TFH4,22 and CD4+ IL-10+ TFH15,23 phenotypes. We identified a large number of overlapping paired TCR clonotypes between CD4+ GC TFH and CD4+ IL-10+ TFH between day 28 and day 201 post-vaccination, suggesting a common origin of these two effector TFH, despite significant transcriptional differences between the two subsets, which implied very different functional roles. Differential IL21 and IL4 expression in CD4+ GC TFH and CD4+ IL-10+ TFH was reminiscent of the segregation of these important functional cytokines in time and space within the mouse GC after infection42,43.

We did not observe substantial overlap between paired human spike-specific CD4+ TCR sequences found in the dLN and those found in the matched blood samples at day 110 and day 201, during the late GC response, in line with a reported minimal clonal overlap between paired TCRs from CD4+ T cells, but detectable overlap between CD8+ T cells in the blood and lymph node compartments sampled from deceased organ donors44. Indeed, many of the overlapping clonotypes we identified in the dLN and blood were contaminating CD8+ T cells. The three overlapping spike-specific CD4+ TCRs we discovered in one donor were found at day 28 and day 60 in the dLN and at day 110 and day 201 in the blood. These cells may represent the first emergence of circulating CD4+ TFM from the dLN.

Our tracking and transcriptional phenotyping of large numbers of spike-specific CD4+ T cells allowed us to gain significant insights into the execution of the spike-specific CD4+ TFH cell response in the dLN after mRNA vaccination. Peak T cell activation, interleukin signaling and cytokine signaling transcriptional activity were observed at day 60. We found upregulation of genes associated with CXCR4 signaling at day 201, near the end of the GC response, raising the possibility that this pathway may play a role in the termination of the GC in humans. Notably, CXCR4 signaling is critical in localizing CD4+ T cells found within the GC to the dark zone45. We speculate that the localization of antigen-specific CD4+ GC TFH in the dark zone towards the end of the GC response may facilitate the termination and collapse of the ongoing GC.

Finally, our comparison of the mRNA vaccine-induced spike-specific CD4+ T cells in the blood with the spike-specific CD4+ T cells from the blood of SARS-CoV-2-infected individuals identified a significantly higher proportion of spike-specific Bcl-2+CD4+ TCM cells with higher expression of a pro-survival/anti-apoptotic transcriptional program in the vaccinated individuals. Therefore, unique long-term transcriptional profiles were induced in spike-specific memory CD4+ T cells depending upon the context of initial antigen exposure – infection or mRNA vaccination.

Our work has limitations. The present study evaluated CD4+ T cell responses from six mRNA-vaccinated and six SARS-CoV-2-infected donors. While our results are reproducible across this cohort, their broad applicability across larger populations of individuals cannot be adjudicated at this time. Our focus on HLA-DPB1*04+ individuals - while necessary to obtain sufficient numbers of antigen-specific cells for the unique analyses we performed - may have introduced unrecognized bias into our results and further validation of our findings would be required to ensure these findings apply to individuals without this HLA allele. In summary, we developed a broad single-cell transcriptional atlas of human spike-specific CD4+ T cells in the blood and dLN in the first 6 months following primary exposure to SARS-CoV-2 spike through BNT162b2 mRNA vaccination or infection.

Methods

Human subjects

We included samples from two prospective observational human cohorts. Demographics and HLA-typing of all included subjects are reported in Extended Data Tables 1 and 2. In cohort 1, human subjects who received the primary two-dose BNT162b2 mRNA vaccine series were prospectively enrolled into observational study WU-368, approved by the Washington University in St. Louis Institutional Review Board (approval # 2020-12-081). Complete details of the study cohort have been previously published3,18,19. Briefly, the cohort included 43 vaccinated human subjects who provided blood samples with an age range from 28-73, 21 were female; 15 of the same cohort participants (age range 28-52, 7 were female) also provided one or more dLN FNA samples. Written informed consent was obtained from each subject.

Human subjects in cohort 2 were infected with SARS-CoV-2 during the first wave of the COVID-19 pandemic (April to August of 2020). Subjects with acute symptomatic viral respiratory illness evaluated at Barnes Jewish Hospital, Saint Louis Children’s Hospital, Christian Hospital or affiliated Barnes Jewish Hospital testing sites, all located in Saint Louis, Missouri, USA were enrolled into a prospective observational cohort study, WU-350. The WU-350 study was approved by the Washington University in St. Louis Institutional Review Board (approval # 2020-03-085). Full details of the cohort and inclusion criteria have been previously published33. The 6 donors (age range 53-75, all male, WHO severity scores 3-8) (Extended Data Table 1) included in the present manuscript tested positive for SARS-CoV-2 with a clinical PCR test. Informed consent was obtained from each subject or their legally authorized representative.

Sample preparation

Vaccinated subjects underwent dLN FNA sampling as previously described46. Briefly, draining dominant lateral axillary lymph nodes ipsilateral to the deltoid muscle mRNA vaccination site were localized with ultrasound and sampled 28, 60, 110 and/or 201 days after the first vaccine dose with multiple passes of 6 separate 25-gauge needles using real-time ultrasound guidance. Each needle was flushed with 3 mL of R10 (RPMI 1640 media containing L-glutamine supplemented with 10% FBS, 100 U/mL penicillin-streptomycin) followed by three 1-mL rinses with R10. Any contaminating RBCs were lysed with ACK hypotonic lysis buffer, dLN FNA cells were washed twice with P2 (1x PBS supplemented with 2% FBS and 2 mM EDTA), and cells were then counted and cryopreserved in 90% FBS with 10% DMSO before storage in liquid nitrogen until analysis. Matched blood samples from vaccinated individuals were collected 110 or 201 days after the first vaccine dose into EDTA-anticoagulated tubes and processed to PBMC using Ficoll density gradient centrifugation. Contaminating RBCs were removed from PBMC via hypotonic lysis, PBMC were washed, counted and cryopreserved in 90% FBS / 10% DMSO and kept in liquid nitrogen until analysis. Blood samples from infected participants were collected 18-36 days after the onset of viral respiratory illness symptoms and 3 months or 6 months after the onset of viral respiratory illness symptoms into EDTA-anticoagulated tubes and processed to PBMC using Ficoll density gradient centrifugation. Contaminating RBCs were removed from PBMC via hypotonic lysis, PBMC were washed, counted and cryopreserved in 90% FBS / 10% DMSO and kept in liquid nitrogen until analysis.

HLA-typing

Vaccinated individuals were HLA-typed by nanopore sequencing47. Genomic DNA was purified using the AllPrep DNA/RNA kit (Qiagen). Target HLA genes were amplified by long-range PCR (NGS LR kit, One Lambda) and sequenced following the SQK-LSK109 protocol on the R10.3 MinION flow cells (Oxford Nanopore Technologies). High-resolution HLA typing was assigned using the Athlon2 program.

For HLA-typing of infected individuals, we extracted DNA from PBMCs using Zymo Quick-DNA Plus kits for use in the AllType NGS 11-Loci Amplification Kit (One Lambda, Lot 014). HLA libraries were sequenced at 150x150 bp (MiSeq, Illumina), and the data were analyzed with TypeStream Visual (v3.0; One Lambda).

dLN single-cell RNA-seq library preparation and sequencing

dLN FNA samples were thawed, washed with P2, and resuspended in P2. Chromium Single Cell 5’ Gene Expression Dual Index libraries and Chromium Single Cell V(D)J Dual Index libraries (10x Genomics) were prepared according to the manufacturer’s instructions without modifications. Both gene expression and V(D)J libraries were sequenced on a Novaseq S4 (Illumina) instrument, targeting a sequencing depth of 50,000 and 5,000 read pairs per cell, respectively.

T cell enrichment of PBMC populations for single-cell RNA-seq

Frozen PBMC samples were thawed, washed once with R10, and then washed with P2. PBMC were counted on a Cellometer Auto 2000 (Nexcelom) and resuspended to a final concentration of approximately 10^8 cells/mL in P2. Total untouched CD3+ or positively selected CD4+ T cells were enriched using either the EasySep Human T Cell Isolation Kit or the EasySep Human CD4 Positive Selection Kit II, respectively, with the EasyEights magnet (STEMCELL Technologies), all per the manufacturer’s instructions. Following enrichment, T cell populations were washed with P2, re-counted and resuspended in PBS supplemented with 0.05% BSA. Chromium single cell 5’ v2 gene expression and Chromium single cell V(D)J libraries (10x Genomics) were prepared according to the manufacturer’s instructions without modifications. Gene expression and V(D)J libraries were sequenced on a Novaseq S4 (Illumina) instrument.

The remaining T cells were stained for flow cytometry to verify the T cell enrichment. Enriched T cells were added to a round-bottom 96-well plate and washed twice in P2. A master mix was added to the cells with the following reagents for 20 minutes at 4°C: CD3 APC Fire 810 (HIT3a, Biolegend); CD4 PerCP (OKT4, Biolegend, to avoid blocking from positive selection); CD8 BV421 (RPA-T8, Biolegend); CD16 BV570 (3G8, Biolegend); CD14 APC (M5E2, Biolegend); CD19 BV750 (HIB19, Biolegend); Zombie NIR (Biolegend) diluted in Brilliant Staining buffer (50μL per test, BD Horizon) and P2. Following staining, cells were washed three times in P2 and then fixed with 1% paraformaldehyde (Electron Microscopy Sciences) for 20 minutes at 4°C. Cells were washed once in P2, then resuspended in P2, and stored at 4°C until analysis within 24-48 hours. Flow cytometry samples were run on an Aurora spectral flow cytometer using SpectroFlo v.2.2 software (Cytek). Flow cytometry data were analyzed using FlowJo v.10 (Treestar).

Single-cell RNAseq processing and analysis

Filtered outputs of the 10x Cell Ranger count and V(D)J pipelines were imported into R (v4.1) using the Seurat (v4.1.0) R package48. Filtering was applied on a sequencing run basis to remove cells with fewer than 100 features, more than 2.5-fold the standard deviation of feature numbers, and greater than 15% mitochondrial gene percentage. Doublets were estimated using the scDblFinder (v1.6.0) R package49. Individual cells were annotated using the ProjecTILs (v2.0.3) R package50,51 and the SingleR (v1.6.1) R package52 with the DICE annotation data set53. Clonotypes were added to the integrated Seurat object using the scRepertoire (v1.7.0) R package54. T cells were isolated based on the assignment of CD4/CD8 T cell annotation from ProjecTILs and the presence of a productive clonotype. Overall T cell dimensional reduction utilized 2,000 variable genes with the TCR genes removed to prevent bias in the manifold by clonality. The harmony (v0.1.0) R package55 was used to integrate multiple sequencing runs and to generate the UMAP (dimensions = 1:15, epochs = 500) and clusters (resolution = 0.8, dimensions = 1:15, algorithm = 3). T follicular UMAP embedding and clustering utilized dimensions = 1:20 and a resolution of 0.5. CD8+ T cell designations were based on the examination of the distribution of CD8 expression, and a cut-off was set for CD8A ≥ 0.4. Spike-specific cells from vaccinated and infected donor peripheral blood were integrated using the harmony R package with the individual sequencing run as the variable and dimensions = 1:30, followed by calculation of the UMAP (dimensions = 1:25, epochs = 500) and clusters (resolution = 0.5 and algorithm = 3). Gene expression UMAP overlays utilized the Nebulosa (v1.6.0) R package56. Gene set enrichment analysis was performed using the escape (v1.4.2) R package57 with the UCell approach58 and the Hallmark, KEGG, and BioCarta gene set libraries from GSEA59.
TCR rarefaction and extrapolation were performed using the iNEXT (v3.0.0) R package60 using the abundance of combined TRA and TRB clonotypes by patient and tissue and default settings in terms of bootstraps, knots and Hill numbers. TCR clustering was performed using the scRepertoire package and the clusterTCR function with the normalized edit distance threshold set to 0.85.
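
The cell-level quality filtering described above can be illustrated with a short sketch. This is a plain Python illustration and not the Seurat-based R code used for the analysis. The dictionary keys `n_features` and `pct_mito` are hypothetical names, and reading the upper bound as the mean plus 2.5 standard deviations of the per-run feature counts is an assumption, since the text does not spell out the exact formula.

```python
from statistics import mean, stdev


def filter_cells(cells, min_features=100, sd_fold=2.5, max_pct_mito=15.0):
    """Return the cells from one sequencing run that pass the QC thresholds.

    `cells` is a list of dicts with the hypothetical keys `n_features`
    (number of detected genes) and `pct_mito` (percent mitochondrial reads).
    """
    counts = [c["n_features"] for c in cells]
    # Assumed interpretation: upper bound = mean + sd_fold * SD of feature counts.
    upper = mean(counts) + sd_fold * stdev(counts)
    return [
        c
        for c in cells
        if c["n_features"] >= min_features      # drop near-empty droplets
        and c["n_features"] <= upper            # drop feature-count outliers
        and c["pct_mito"] <= max_pct_mito       # drop likely dying cells
    ]
```

Applying the function once per sequencing run, rather than to the pooled data, mirrors the run-by-run filtering stated in the text.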

TCR sequencing analysis and visualization

Spike-specific clonotype annotations were assigned for both TRA and TRB and derived from previously published data3,20 and the VDJdb database21. TCR sequencing motifs were created with the msa (v1.28.0) R package set to protein alignment with the ClustalW algorithm and max iteration = 30. The resulting aligned sequences were converted into seqinr format and plotted with the ggseqlogo (v0.1) R package. Single clonotype representation for single-cell analysis was performed similarly to the previously described CoNGA30. For a given combined TRA and TRB, a single transcriptome was selected based on the minimal Euclidean distance across all cells in the individual clonotype. Vectors for the TRA and TRB were calculated using the TCR autoencoder Trex (v0.99.7) R package translating the CDR3 amino acid sequence into a matrix based on the Kidera factors61. For the resulting RNA principal components and embedded TCR values, the first 15 dimensions were selected and rescaled using the mutual nearest neighbor approach with k=100 with the mumosa (v1.4.0) R package. The resulting values were then subjected to the PHATE algorithm with default settings with the phateR (v1.0.7) R package62. Clustering was performed by generating a k-nearest neighbor igraph with the bluster (v1.6.0) R package and clusters were calculated using the Leiden algorithm from the leidenAlg (v1.0.3) R package with a resolution = 0.7 and the number of iterations = 5. Putative spike-specific TCRs were derived from clusters where previously identified spike-specific TCRs were present. In addition, the putative TCRs were selected because they were clonally expanded, expressed either an alpha or beta chain that appeared in 2 or more donors, and had not been previously shown to bind the spike epitope. Related putative spike-specific clones were called by identifying TRA or TRB CDR3 sequences within a Levenshtein distance of two and with shared V genes.
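
The rule used to call related clones (CDR3 sequences within a Levenshtein distance of two that also share a V gene) can be sketched as follows. This is an illustrative Python version and not the code used in the study. The tuple layout `(v_gene, cdr3)` and the function names are assumptions, and any sequences passed to it in examples are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete a character from a
                    curr[j - 1] + 1,     # insert a character into a
                    prev[j - 1] + cost,  # substitute (or match)
                )
            )
        prev = curr
    return prev[-1]


def is_related(clone_a, clone_b, max_distance=2):
    """Each clone is a hypothetical (v_gene, cdr3) tuple for one chain.

    Two clones are called related when they share the V gene and their
    CDR3 amino acid sequences differ by at most `max_distance` edits.
    """
    same_v = clone_a[0] == clone_b[0]
    return same_v and levenshtein(clone_a[1], clone_b[1]) <= max_distance
```

Requiring a shared V gene in addition to a small edit distance keeps short CDR3 sequences from being grouped purely by chance similarity.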

Development of Trex autoencoding models

TCR embedding was performed by training variational autoencoders on TRA and TRB CDR3 amino acid sequences, taking numeric matrices derived from Atchley factors (AF), Kidera factors (KF), or both, with zero padding to a set CDR3 length of 60. The matrices were transformed into a 1-dimensional array, and values were normalized across all sequences. Values with no variation were transformed into 0s. Alternatively, a one-hot encoding (OHE) approach was also trained by converting the amino acid sequence to a matrix based on the individual amino acid along the sequence. A stacked autoencoder approach was utilized, similar to the previously described method40 with a 128-64-30-64-128 neuron structure. The bottleneck layer consists of a 30-neuron/vector embedding. Each autoencoder model was trained using the keras (v2.4.0) R package across 288,043 unique CDR3 AA TRA and 453,111 unique CDR3 AA TRB sequences across 15 single-cell data sets and 4 curated TCR databases (McPAS-TCR64, VDJdb65, IEDB66, and PIRD67), resulting in 8 models: TRA-AF, TRA-KF, TRA-both, TRA-OHE, TRB-AF, TRB-KF, TRB-both, TRB-OHE. The models were trained using an 80:20 data split and hyperparameters were selected based on minimal Kullback-Leibler divergence value with a batch size of 128, learning rate of 0.001, and optimization using root mean square propagation. The TCR models and corresponding R package to run the embeddings with a Seurat or SingleCellExperiment object are available at https://github.com/ncborcherding/Trex.
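
The one-hot input described above (a per-position amino acid matrix, zero-padded to a CDR3 length of 60 and flattened to a 1-dimensional array) can be sketched in a few lines. This is a simplified Python illustration, not the Trex implementation. The residue ordering, the restriction to the 20 standard amino acids and the function name are assumptions, and the normalization step is omitted.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues; ordering is an assumption
MAX_LENGTH = 60                       # padded CDR3 length stated in the text


def one_hot_cdr3(cdr3, max_length=MAX_LENGTH):
    """Encode a CDR3 amino acid sequence as a flat 0/1 list.

    The result has max_length * 20 entries. Positions beyond the end of
    the sequence are left as all-zero rows (zero padding).
    """
    if len(cdr3) > max_length:
        raise ValueError("CDR3 is longer than the padded length")
    flat = []
    for pos in range(max_length):
        row = [0] * len(AMINO_ACIDS)
        if pos < len(cdr3):
            row[AMINO_ACIDS.index(cdr3[pos])] = 1
        flat.extend(row)  # flatten the position-by-residue matrix row by row
    return flat
```

Under these assumptions every sequence maps to a vector of 1,200 values regardless of its length, which is what allows a fixed-size input layer ahead of the 128-64-30-64-128 network.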

Putative spike-specific TCR transductants

Putative spike-specific TRA and TRB variable regions were combined in silico with murine constant regions (murine TRAC and murine TRBC2) modified to include additional cysteine residues in place of serine at position 57 in murine TRBC2 and threonine at position 47 in murine TRAC. Using murine constant regions prevents pairing with endogenous human TCR following retroviral transduction of primary human T cells. The additional cysteine residues enhance alpha/beta constant region binding affinity, increasing chimeric human variable/mouse constant TCR surface expression. Constructs containing the modified TRA and TRB were separated by a T2A sequence and synthesized to include a NotI and EcoRI restriction site at the 5’ and 3’ ends of the region of interest, respectively (GenScript). Synthesized constructs from GenScript were double-digested with NotI and EcoRI and cloned into the pMP71 retroviral vector31 and ligation was confirmed via sequencing of the recombinant plasmid. Recombinant pMP71 was used to transfect the 293Vec-RD114 retroviral packaging cell line (provided by BioVec Pharma) with the TransIT-LT1 (Mirus Bio) transfection reagent using the manufacturer’s protocol and recommended conditions. Transfection media was removed after 24 hours, replaced with fresh media, and retrovirus-containing supernatants were harvested 24 hours later. Retroviral supernatants were stored at −80°C until used.

Human CD4+ T cells were enriched from cryopreserved PBMC using the EasySep Human CD4 Positive Selection Kit II (STEMCELL Technologies). Isolated T cells were cultured in R10-500 (R10 supplemented with 500 U/mL recombinant human IL-2 [BioLegend]) at 37°C with 5% CO2 and activated with the Miltenyi Biotec human T Cell Activation/Expansion kit according to the manufacturer's instructions. Two days after activation/expansion, activated T cells were purified from dead cell debris and activation beads with a Ficoll gradient. Cells were washed in R10, resuspended at 2 x 10^6 per mL in R10-500, and plated on 24-well flat-bottom tissue culture plates.

TCR RD114 retroviral supernatants were thawed, layered on top of a 20% sucrose (w/v) gradient, and centrifuged in a microcentrifuge at 20,000 x g at 4°C for 1 hour. The supernatant was discarded and residual volume, including the retroviral pellet, was incubated with ViroMag beads (OZ Biosciences) for 15 minutes at room temperature. Retrovirus/beads were then added to the activated T cells in the 24-well plate and the plate was briefly centrifuged at 1600 x g for 1 minute before being placed on a pre-warmed magnet (OZ Biosciences) and incubated at 37°C with 5% CO2 for 15 minutes. Transduced T cells were cultured for at least 1 week prior to analysis with changes of R10-500 media, as needed.

Intracellular cytokine staining mapping of human TCR transductants

250,000 to 500,000 transduced CD4+ T cells, a portion of which were confirmed to express the recombinant chimeric TCR using a murine TCR beta chain-specific monoclonal antibody (BV510, clone H57-597, BioLegend), were co-cultured with 100,000 EBV-transformed B cells from the experimental subject who expressed the index paired putative spike-specific TCR in the presence of various mapping pools of SARS-CoV-2 spike overlapping 17-mer peptides (NR-52402, BEI Resources). Each peptide was incubated at a final concentration of 1 μg/mL. Separate unstimulated control wells with equivalent concentrations of DMSO to the final concentration of DMSO found in the peptide-stimulated condition were included. Positive control phorbol 12-myristate 13-acetate (PMA, InvivoGen) and Ionomycin (InvivoGen) were added to separate wells. Cells in all conditions were co-cultured in R10 media supplemented with co-stimulatory antibodies against CD28 and CD49d (BD Biosciences). Samples with the appropriate stimulus were incubated for 1.5 hours before the addition of Brefeldin A and monensin (both from BD Biosciences) and then incubated for an additional 12-16 hours. Surface staining was performed followed by fixation in 1% paraformaldehyde, permeabilization with washing buffer supplemented with 0.1% w/v saponin (Sigma) and intracellular staining using fluorescently labeled antibodies directed against cytokine antigens. We used the following antibodies: CD3 PE-Cy7 (clone UCHT1, BioLegend), CD4 APC-Cy7 (clone SK3, BioLegend), murine TCR beta chain BV510 (clone H57-597, BioLegend), CD69 BV711 (clone FN50, BioLegend), IFN-gamma PE (clone B27, BioLegend), TNF-alpha PerCP-Cy5.5 (clone MAb11, BioLegend) and IL-2 APC (clone 5344.111, BD Biosciences). The panel included Zombie NIR viability stain (BioLegend). All antibodies were used at pre-titrated optimal staining concentrations. 
In a separate experiment performed on unstimulated TCR2 transductants, we performed the surface stain portion of the panel following incubation with an S167-180 HLA-DPB1*04:01 PE-labeled tetramer reagent (Washington University in Saint Louis tetramer core facility) for 15 minutes to confirm ICS results. All samples were acquired on a Cytek Aurora spectral flow cytometer and unmixed files were analyzed using FlowJo software (version 10, BD Biosciences). Final analysis was gated on live CD4+ T cells positive for murine TCR beta chain.

HLA-restriction determination

Human Jurkat clone E6-1 T cells were obtained from ATCC and transduced with an NFAT eGFP reporter lentivirus (BPS Bioscience) according to the manufacturer’s instructions and then puromycin selected at a final concentration of 1 μg/mL for 7 days. NFAT-GFP reporter Jurkat T cells were then transduced with each candidate TCR retrovirus following sucrose purification, as described above. TCR-transduced NFAT-GFP reporter Jurkat T cell lines were FACS sort purified after staining with CD3 PE-Cy7 (clone UCHT1, BioLegend), CD4 APC-Cy7 (clone SK3, BioLegend) and murine TCR beta chain BV510 (clone H57-597, BioLegend) on a Bigfoot Spectral Cell Sorter (ThermoFisher).

Single HLA-class II allele expressing antigen-presenting cells were developed by gene synthesis of select HLA class II alleles found in the subjects from which the candidate TCR were selected. We synthesized HLA-class II alpha and beta chains separated by a T2A sequence and included a NotI and EcoRI restriction site at the 5’ and 3’ ends (GenScript). Synthesized constructs were cloned into the pMP71 retroviral transduction system as described above, transfected into the 293Vec-RD114 retroviral packaging cell line, and resulting retroviruses were used to transduce a K562-based artificial APC (aAPC) cell line expressing exogenous human CD64, CD80, CD83, CD74, and HLA-DM.

NFAT-GFP reporter Jurkat TCR cell lines were co-cultured in R10 with individual K562 aAPC cell lines, either expressing unique HLA-class II alleles or HLA-class II devoid (not HLA-virus transduced), in the presence or absence of 10 μg/mL of the unique SARS-CoV-2 spike peptide previously mapped to each responding TCR for a total of 16 hours at 37°C, 5% CO2. Following co-culture, cells were surface stained with CD3 PE-Cy7 (clone UCHT1, BioLegend), murine TCR beta chain BV510 (clone H57-597, BioLegend), and Zombie NIR viability stain (BioLegend) before analysis on a Cytek Aurora spectral flow cytometer. Unmixed files were analyzed using FlowJo software (version 10, BD Biosciences).

Statistics & Reproducibility

Heatmaps of gene sets were derived from the intersection of significant enrichment comparison (Bonferroni-adjusted p-value < 0.05) by ANOVA and Kruskal-Wallis H test for multiple comparisons and t-test and Wilcoxon rank-sum test for binarized comparisons. For non-rank-based significance testing, distributions were evaluated before applying testing. Differential gene expression utilized MAST6876 using the donor as a latent variable and a pseudocount of 0.1. Cluster proportion comparisons between antigen-specific T cells used the scProportionTest (v0.0.0.9) R package with 1,000 permutations. Code for the entire analysis is available at https://github.com/ncborcherding/COVID_TCR. Sample size of the cohort was based on the voluntary enrollment of participants who consented to the respective procedures; randomization and blinding are not applicable to this study.
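
The idea behind a permutation-based comparison of cluster proportions can be sketched as below. This is a generic Python illustration of a label-permutation test and makes no claim to reproduce the internals of the scProportionTest package. The function name, arguments and the add-one correction of the p-value are assumptions chosen for this example.

```python
import random


def permutation_test(labels_a, labels_b, cluster, n_perm=1000, seed=0):
    """Two-sided permutation p-value for a difference in cluster proportion.

    `labels_a` and `labels_b` hold the cluster label of every cell in the
    two groups (for example vaccinated and infected). Group membership is
    shuffled `n_perm` times to build the null distribution.
    """
    def prop(labels):
        return sum(1 for x in labels if x == cluster) / len(labels)

    observed = prop(labels_a) - prop(labels_b)
    pooled = list(labels_a) + list(labels_b)
    n_a = len(labels_a)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = prop(pooled[:n_a]) - prop(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (extreme + 1) / (n_perm + 1)
```

With 1,000 permutations the smallest attainable p-value under this correction is 1/1,001, so p-values from such a test would still need multiple-comparison adjustment across clusters.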

Extended Data

Extended Data Figure 1. Sample preparation diagram and representative flow cytometry plots of cell purity following magnetic cell enrichment.

a. Blood and dLN samples from BNT162b2 mRNA vaccinated cohort. b. Peripheral blood mononuclear cell enrichment strategy for BNT162b2 mRNA vaccinated or SARS-CoV-2 infected donors.

Extended Data Figure 2. Reference-based T cell annotations for UMAP in Figure 1b.

a. Density plots showing the relative distribution of ProjecTILs-based CD4+ T cell labels. b. Density plots showing the relative distribution of ProjecTILs-based CD8+ T cell labels. Individual gray dots indicate individual cells matching the label.

Extended Data Figure 3. Performance metrics for Trex autoencoder models by approach and chain.

For the given hyperparameter, models were trained on 2e5 random sequences for 10 epochs for minimal Kullback-Leibler divergence value. a. Mean squared error of models after training, varying the latent dimensions (left panel) and batch size (right panel) with different learning rates. b. Kullback-Leibler divergence values of models after training, varying the latent dimensions (left panel) and batch size (right panel) with different learning rates. c. Evaluations of the fidelity of models to return unique values using novel sequences for TRA and TRB chains across all models in Trex. Novel sequences were randomly sampled and bootstrapped a total of 10 times. d. Distribution of computational time for model application across the models, chains, and bootstraps.
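The two metrics reported in panels a and b can be sketched as follows. This is an illustration only: it assumes a variational autoencoder whose encoder outputs the mean and log-variance of a diagonal Gaussian latent distribution that is compared against a standard normal prior. The caption does not state this, and the function names are hypothetical rather than part of Trex.

```python
import math


def mean_squared_error(x, x_hat):
    """Mean squared reconstruction error between an input vector and its reconstruction."""
    if len(x) != len(x_hat):
        raise ValueError("input and reconstruction must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for one encoded sequence.

    Closed form: -0.5 * sum(1 + log_var - mu^2 - exp(log_var)).
    """
    return -0.5 * sum(1 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))


if __name__ == "__main__":
    # Hypothetical values for a two-dimensional input and latent space.
    print(mean_squared_error([1.0, 2.0], [1.0, 4.0]))
    print(kl_to_standard_normal([1.0, 0.0], [0.0, 0.0]))
```

A latent code that matches the prior exactly (zero mean, unit variance) has a divergence of zero, which is why lower values in panel b indicate latent distributions closer to the prior.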

Extended Data Figure 4. Comparison of Trex co-embedding approach with clonotype neighbor graph analysis (CoNGA).

a. Schematic representation of the CoNGA pipeline that generates nearest neighbors of clones using both edit-distance-based TCR networks and gene expression (GEX) networks. b. Resulting UMAPs for CoNGA-based dimensional reduction using gene expression or edit-distance-based TCR with denoted locations of previously identified spike-specific clones. c. Nearest-neighbor overlap using the Dice (left) and Jaccard (right) index of the 10 nearest neighbors defined by CoNGA and by the co-embedding with Trex. d. Breakdown and distribution of TCR-based clusters using CoNGA TCR output or Trex latent dimensions. Blue colored data indicate the relative proportion of clusters with spike-specific clones with a summary of the graphed values to the right of each bar chart. e. Trex-based latent dimensional clusters with proportion filled by the respective CoNGA TCR-based clusters. f. Distribution and relative size of the candidate TCRs and related sequences (edit distance ≤ 2) selected in Figure 3 for both the CoNGA-based TCR clusters (upper panel) and Trex-based clusters (lower panel).
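The nearest-neighbor overlap in panel c relies on two standard set-similarity measures. The sketch below shows how the Jaccard and Dice indices could be computed for the neighbor sets of each clone under two methods; the dictionary layout, the function names, and the clone identifiers are assumptions for illustration and are not taken from CoNGA or Trex.

```python
def jaccard(a, b):
    """Jaccard index: |A intersect B| / |A union B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def dice(a, b):
    """Dice index: 2 * |A intersect B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 1.0


def mean_neighbor_overlap(neighbors_x, neighbors_y, index=jaccard):
    """Average overlap of neighbor sets over clones present in both mappings.

    neighbors_x, neighbors_y: dicts mapping a clone ID to its nearest
    neighbors (for example, the 10 nearest) under each method.
    """
    shared = neighbors_x.keys() & neighbors_y.keys()
    if not shared:
        raise ValueError("no clones in common between the two mappings")
    return sum(index(neighbors_x[c], neighbors_y[c]) for c in shared) / len(shared)


if __name__ == "__main__":
    # Hypothetical neighbor lists for two clones under two methods.
    method_x = {"clone1": ["c2", "c3", "c4"], "clone2": ["c1", "c5", "c6"]}
    method_y = {"clone1": ["c3", "c4", "c7"], "clone2": ["c1", "c5", "c6"]}
    print(mean_neighbor_overlap(method_x, method_y, index=jaccard))
    print(mean_neighbor_overlap(method_x, method_y, index=dice))
```

For the same pair of sets the Dice index is never smaller than the Jaccard index, so the two panels are expected to show the same ordering on different scales.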

Extended Data Figure 5. Confirmation of TCR candidates’ specificity for SARS-CoV-2 spike.

Each TCR candidate’s variable gene regions were cloned with murine T cell receptor (mTCR) constant regions into a retroviral transduction vector and resultant retroviruses were used to transduce primary human CD4+ T cells. Positive results from intracellular cytokine stain mapping of the spike protein with overlapping peptides are shown. Gating was first performed on total live single cells, then on CD3+CD4+ T cells, and finally on mTCR beta chain (mTCRb) positive candidate TCR-transduced cells. Unstimulated background cytokine expression, positive control phorbol 12-myristate 13-acetate (PMA) and Ionomycin cytokine expression, and top cytokine expression to individual 17-mer peptides used for total spike proteome mapping are shown for each TCR candidate (a-e). Representative surface stain of unstimulated TCR2-transduced CD4+ T cells with the S167-180 DPB1*04:01 HLA-class II tetramer is shown (right panel in b). Each experiment shown is representative of two independent TCR transduction and mapping experiments.

Extended Data Figure 6. Confirmation of TCR candidate HLA restriction.

NFAT-GFP reporter Jurkat T cells transduced with candidate TCR-expressing retrovirus were sort purified and maintained as clonal cell lines. Reporter Jurkat lines (a) or transduced primary human CD4+ T cells (b) were co-cultured with spike peptides identified in Extended Data Figure 5 presented in the context of various K562-based aAPC cell lines expressing single HLA class II alleles. Cells were gated on total live single cells, then on CD3+ cells. In b, the top panels show the frequency of retrovirally transduced (murine TCR beta constant region expressing, mTCR+) primary human CD4+ T cells that were gated on prior to evaluation of intracellular cytokine staining in the bottom panels. Red asterisks denote positive responses for each TCR line.

Extended Data Figure 7. Circulating blood spike-specific CD4+ T cells induced early after primary SARS-CoV-2 infection were similar regardless of illness severity.

Comparison of circulating blood spike-specific CD4+ T cells during acute (day 18 to 36 post-onset of disease symptoms) infection between donors with moderate (350-041, 350-117 and 350-400, n=3) versus severe (350-065, 350-084 and 350-397, n=3) infection. Statistical significance was based on bootstrapping 1,000 times to form a null distribution. * adjusted two-tailed permutation test p-value < 0.05.

Extended Data Table 1. Demographics of clinical cohorts

Subject   Age   Sex      Exposure     Severity           WHO severity score (0-8)   Tissue(s)
368-01a   34    male     BNT162b2     -                  -                          blood and lymph node
368-04    38    female   BNT162b2     -                  -                          blood and lymph node
368-13    34    male     BNT162b2     -                  -                          blood and lymph node
368-16    37    male     BNT162b2     -                  -                          lymph node only
368-20    48    female   BNT162b2     -                  -                          lymph node only
368-22    36    male     BNT162b2     -                  -                          blood and lymph node
350-065   73    male     SARS-CoV-2   ventilated/alive   6                          blood only
350-084   67    male     SARS-CoV-2   ventilated/alive   7                          blood only
350-397   53    male     SARS-CoV-2   ventilated/died    8                          blood only
350-041   75    male     SARS-CoV-2   hospitalized       4                          blood only
350-117   53    male     SARS-CoV-2   hospitalized       3                          blood only
350-400   65    male     SARS-CoV-2   hospitalized       4                          blood only

Extended Data Table 2. HLA-typing results of human subjects

Subject   HLA-A         HLA-B         HLA-C         HLA-DPA1      HLA-DPB1      HLA-DQA1      HLA-DQB1      HLA-DRB1      HLA-DRB3,4,5
368-01a   02:01/68:01   39:01/44:03   04:01/07:02   01:03/01:03   02:01/04:01   01:02/02:01   02:02/06:02   07:01/15:01   B4*01:01/B5*01:01
368-04    01:01/03:01   08:01/15:01   03:03/07:01   01:03/01:03   04:01/04:01   03:03/05:01   02:01/03:01   03:01/04:08   B3*01:01/B4*01:03
368-13    01:01/29:02   13:02/14:02   06:02/08:02   01:03/01:03   04:01/15:01   02:01/02:01   02:02/02:02   07:01/07:01   B4*01:01/B4*01:03
368-16    02:01/03:01   35:03/51:01   04:01/15:02   01:03/01:03   02:01/04:01   01:02/01:04   05:02/05:03   14:54/16:01   B3*02:02/B5*02:02
368-20    02:01/11:01   44:02/51:01   07:04/15:02   01:03/02:01   04:01/17:01   01:01/04:01   04:02/05:01   01:01/08:01   -
368-22    02:01/11:248  40:01/40:01   03:04/03:04   01:03/01:03   04:01/04:02   01:01/03:01   03:02/05:01   01:01/04:04   B4*01:03
350-065   68:01/68:01   51:01/58:02   03:04/06:02   01:03/01:03   04:01/18:01   01:05/05:05   03:01/05:01   12:01/16:02   B3*02:02/B5*02:02
350-084   02:01/02:01   27:05/45:01   02:02/16:01   01:03/03:01   04:01/105:01  03:01/05:01   02:01/03:02   03:01/04:04   B3*02:02/B4*01:03
350-397   01:01/11:01   08:01/38:01   07:01/12:03   01:03/01:03   03:01/04:01   01:02/05:01   02:01/05:02   03:01/16:01   B3*01:01/B5*02:02
350-041   02:01/03:01   07:02/56:01   01:02/07:02   01:03/01:03   03:01/04:01   03:01/05:05   03:01/03:02   04:01/12:01   B3*02:02/B4*01:03
350-117   02:01/03:01   07:02/56:01   01:02/07:02   01:03/01:03   03:01/04:01   03:01/05:05   03:01/03:02   04:01/12:01   B3*02:02/B4*01:03
350-400   02:01/03:01   15:01/44:02   03:03/05:01   01:03/01:03   04:01/04:02   03:01/03:01   03:02/03:02   04:01/04:04   B4*01:03

Supplementary Material

Supplementary Tables 3-7

Supplementary Table 1. Demographics of clinical cohorts.

Supplementary Table 2. HLA-typing results of human donors

Supplementary Table 3. Spike-specific dLN CD4+ TFH clonotype sequences included in analysis

Supplementary Table 4. Differential gene expression results for analysis performed in Figure 4b

Supplementary Table 5. Overlapping TCR clonotypes between blood and dLN in donor 368-01a

Supplementary Table 6. Overlapping TCR clonotypes between blood and dLN in donor 368-22

Supplementary Table 7. Differential gene expression results for analysis performed in Figure 6a

Acknowledgments

The authors thank the research participants for their dedication to the study protocol. The authors thank A. Haile and other staff members of the Washington University Infectious Disease Clinical Research Unit for their assistance with participant enrollment and follow-up. The authors also thank members of the Washington University Emergency Care and Research Core for their enrollment of COVID-19-infected participants. In addition, the authors thank A. P. Voigt of Northwestern University for providing support in the development of the interactive single-cell data portal, CellPilot. This study utilized samples obtained from the Washington University School of Medicine’s COVID-19 biorepository, which is supported by: the Barnes-Jewish Hospital Foundation; the Siteman Cancer Center grant P30 CA091842 from the National Cancer Institute of the National Institutes of Health; and the Washington University Institute of Clinical and Translational Sciences grant UL1TR002345 from the National Center for Advancing Translational Sciences of the National Institutes of Health. The Mudd laboratory was supported by US National Institutes of Health (NIH) grant R01AI173203 and by a grant from the Barnes-Jewish Hospital Foundation. The Ellebedy laboratory was supported by NIH grants U01AI141990 and U01AI150747, NIAID Centers of Excellence for Influenza Research and Surveillance contracts HHSN272201400006C and HHSN272201400008C, and by NIAID Collaborative Influenza Vaccine Innovation Centers contract 75N93019C00051. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript. The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of the NIAID or NIH.

Footnotes

Data and code availability

Single-cell data presented in the manuscript can be accessed at the publicly available GEO database under accession numbers GSE195673 and GSE249313. Datasets are also available on the interactive online tool CellPilot (https://cellpilot.emed.wustl.edu). CellPilot was adapted from the cellcuratoR (v0.1.0) R package63. Single-cell data and analysis code are also available at the public Zenodo repository under the DOI 10.5281/zenodo.11395445.

Competing interests

The Ellebedy laboratory received funding under sponsored research agreements unrelated to the data presented in the current study from Emergent BioSolutions, Moderna and AbbVie. A.H.E. has received consulting and speaking fees from InBios International, Inc., Fimbrion Therapeutics, RGAX, Mubadala Investment Company, Moderna, Pfizer, GSK, Danaher, Third Rock Ventures, Goldman Sachs, and Morgan Stanley. A.H.E. is the founder of ImmuneBio Consulting LLC. N.B. is the head of computational biology at Omniscope, Inc. and has consulted for Santa Ana Bio, LLC. The remaining authors have no competing interests related to this work.

References

  • 1. Anderson EJ et al. Safety and Immunogenicity of SARS-CoV-2 mRNA-1273 Vaccine in Older Adults. N Engl J Med 383, 2427–2438 (2020).
  • 2. Painter MM et al. Rapid induction of antigen-specific CD4+ T cells is associated with coordinated humoral and cellular immunity to SARS-CoV-2 mRNA vaccination. Immunity S1074761321003083 (2021) doi: 10.1016/j.immuni.2021.08.001.
  • 3. Mudd PA et al. SARS-CoV-2 mRNA vaccination elicits a robust and persistent T follicular helper cell response in humans. Cell 185, 603–613.e15 (2022).
  • 4. Crotty S. Follicular Helper CD4 T Cells (TFH). Annu Rev Immunol 29, 621–663 (2011).
  • 5. Vinuesa CG, Linterman MA, Yu D & MacLennan ICM Follicular Helper T Cells. Annu Rev Immunol 34, 335–368 (2016).
  • 6. Crotty S. T Follicular Helper Cell Biology: A Decade of Discovery and Diseases. Immunity 50, 1132–1148 (2019).
  • 7. Nurieva RI et al. Bcl6 Mediates the Development of T Follicular Helper Cells. Science 325, 1001–1005 (2009).
  • 8. Johnston RJ et al. Bcl6 and Blimp-1 are reciprocal and antagonistic regulators of T follicular helper cell differentiation. Science 325, 1006–1010 (2009).
  • 9. Yu D. et al. The transcriptional repressor Bcl-6 directs T follicular helper cell lineage commitment. Immunity 31, 457–468 (2009).
  • 10. Akiba H. et al. The Role of ICOS in the CXCR5+ Follicular B Helper T Cell Maintenance In Vivo. J Immunol 175, 2340–2348 (2005).
  • 11. Crotty S, Kersh EN, Cannons J, Schwartzberg PL & Ahmed R SAP is required for generating long-term humoral immunity. Nature 421, 282–287 (2003).
  • 12. Brenna E. et al. CD4(+) T Follicular Helper Cells in Human Tonsils and Blood Are Clonally Convergent but Divergent from Non-Tfh CD4(+) Cells. Cell Rep 30, 137–152.e5 (2020).
  • 13. Dan JM et al. Recurrent group A Streptococcus tonsillitis is an immunosusceptibility disease involving antibody deficiency and aberrant TFH cells. Sci Transl Med 11, eaau3776 (2019).
  • 14. Heit A. et al. Vaccination establishes clonal relatives of germinal center T cells in the blood of humans. Journal of Experimental Medicine 214, 2139–2152 (2017).
  • 15. Cañete PF et al. Regulatory roles of IL-10–producing human follicular T cells. Journal of Experimental Medicine 216, 1843–1856 (2019).
  • 16. Del Alcazar D. et al. Mapping the Lineage Relationship between CXCR5+ and CXCR5− CD4+ T Cells in HIV-Infected Human Lymph Nodes. Cell Reports 28, 3047–3060.e7 (2019).
  • 17. Poon MML et al. SARS-CoV-2 infection generates tissue-localized immunological memory in humans. Sci Immunol eabl9105 (2021) doi: 10.1126/sciimmunol.abl9105.
  • 18. Turner JS et al. SARS-CoV-2 mRNA vaccines induce persistent human germinal centre responses. Nature 596, 109–113 (2021).
  • 19. Kim W. et al. Germinal centre-driven maturation of B cell response to mRNA vaccination. Nature 604, 141–145 (2022).
  • 20. Dykema AG et al. Functional characterization of CD4+ T cell receptors crossreactive for SARS-CoV-2 and endemic coronaviruses. Journal of Clinical Investigation 131, e146922 (2021).
  • 21. Goncharov M. et al. VDJdb in the pandemic era: a compendium of T cell receptors specific for SARS-CoV-2. Nat Methods 19, 1017–1019 (2022).
  • 22. Ramiscal RR & Vinuesa CG T-cell subsets in the germinal center. Immunol Rev 252, 146–155 (2013).
  • 23. Kumar S. et al. Developmental bifurcation of human T follicular regulatory cells. Sci Immunol 6, eabd8411 (2021).
  • 24. Chen M, Wang F, Xia H & Yao S MicroRNA-155: Regulation of Immune Cells in Sepsis. Mediators of Inflammation 2021, 1–10 (2021).
  • 25. Niu L. et al. A micropeptide encoded by lncRNA MIR155HG suppresses autoimmune inflammation via modulating antigen presentation. Sci Adv 6, eaaz2059 (2020).
  • 26. Carlson CM et al. Kruppel-like factor 2 regulates thymocyte and T-cell migration. Nature 442, 299–302 (2006).
  • 27. Cao Z, Sun X, Icli B, Wara AK & Feinberg MW Role of Krüppel-like factors in leukocyte development, function, and disease. Blood 116, 4404–4414 (2010).
  • 28. Lynn RC et al. c-Jun overexpression in CAR T cells induces exhaustion resistance. Nature 576, 293–300 (2019).
  • 29. Koizumi S et al. JunB regulates homeostasis and suppressive functions of effector regulatory T cells. Nat Commun 9, 5344 (2018).
  • 30. Schattgen SA et al. Integrating T cell receptor sequences and transcriptional profiles by clonotype neighbor graph analysis (CoNGA). Nat Biotechnol 40, 54–63 (2022).
  • 31. Abdulhaqq S. et al. Identification and Characterization of Antigen-Specific CD8+ T Cells Using Surface-Trapped TNF-α and Single-Cell Sequencing. J Immunol 207, 2913–2921 (2021).
  • 32. Cheng Z-Y, He T-T, Gao X-M, Zhao Y & Wang J ZBTB Transcription Factors: Key Regulators of the Development, Differentiation and Effector Function of T Cells. Front Immunol 12, 713294 (2021).
  • 33. Mudd PA et al. Distinct inflammatory profiles distinguish COVID-19 from influenza with limited contributions from cytokine storm. Sci Adv 6, (2020).
  • 34. Bandala-Sanchez E. et al. T cell regulation mediated by interaction of soluble CD52 with the inhibitory receptor Siglec-10. Nat Immunol 14, 741–748 (2013).
  • 35. Huang H, Wang C, Rubelt F, Scriba TJ & Davis MM Analyzing the Mycobacterium tuberculosis immune response by T-cell receptor clustering with GLIPH2 and genome-wide antigen screening. Nat Biotechnol 38, 1194–1202 (2020).
  • 36. Glanville J et al. Identifying specificity groups in the T cell receptor repertoire. Nature 547, 94–98 (2017).
  • 37. Mayer-Blackwell K. et al. TCR meta-clonotypes for biomarker discovery with tcrdist3 enabled identification of public, HLA-restricted clusters of SARS-CoV-2 TCRs. eLife 10, e68605 (2021).
  • 38. Zhang H, Zhan X & Li B GIANA allows computationally-efficient TCR clustering and multi-disease repertoire classification by isometric transformation. Nat Commun 12, 4699 (2021).
  • 39. Zhao Y. et al. DeepAIR: A deep learning framework for effective integration of sequence and 3D structure to enable adaptive immune receptor analysis. Sci Adv 9, eabo5128 (2023).
  • 40. Zhang Z, Xiong D, Wang X, Liu H & Wang T Mapping the functional landscape of T cell receptor repertoires by single-T cell transcriptomics. Nat Methods 18, 92–99 (2021).
  • 41. Sidhom J-W, Larman HB, Pardoll DM & Baras AS DeepTCR is a deep learning framework for revealing sequence concepts within T-cell repertoires. Nat Commun 12, 1605 (2021).
  • 42. Weinstein JS et al. TFH cells progressively differentiate to regulate the germinal center response. Nat Immunol 17, 1197–1205 (2016).
  • 43. Shulman Z. et al. Dynamic signaling by T follicular helper cells during germinal center B cell selection. Science 345, 1058–1062 (2014).
  • 44. Poon MML et al. Tissue adaptation and clonal segregation of human memory T cells in barrier sites. Nat Immunol 24, 309–319 (2023).
  • 45. Silva-Cayetano A. et al. Spatial dysregulation of T follicular helper cells impairs vaccine responses in aging. Nat Immunol 24, 1124–1137 (2023).

Methods-only references

  • 46. Turner JS et al. Human germinal centres engage memory and naive B cells after influenza vaccination. Nature 586, 127–132 (2020).
  • 47. Liu C. et al. High-resolution HLA typing by long reads from the R10.3 Oxford nanopore flow cells. Human Immunology 82, 288–295 (2021).
  • 48. Hao Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587.e29 (2021).
  • 49. Germain P-L, Lun A, Garcia Meixide C, Macnair W & Robinson MD Doublet identification in single-cell sequencing data using scDblFinder. F1000Res 10, 979 (2022).
  • 50. Andreatta M. et al. Interpretation of T cell states from single-cell transcriptomics data using reference atlases. Nat Commun 12, 2965 (2021).
  • 51. Andreatta M. et al. A CD4+ T cell reference map delineates subtype-specific adaptation during acute and chronic viral infections. eLife 11, e76339 (2022).
  • 52. Aran D. et al. Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat Immunol 20, 163–172 (2019).
  • 53. Schmiedel BJ et al. Impact of Genetic Polymorphisms on Human Immune Cell Gene Expression. Cell 175, 1701–1715.e16 (2018).
  • 54. Borcherding N, Bormann NL & Kraus G scRepertoire: An R-based toolkit for single-cell immune receptor analysis. F1000Res 9, 47 (2020).
  • 55. Korsunsky I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat Methods 16, 1289–1296 (2019).
  • 56. Alquicira-Hernandez J & Powell JE Nebulosa recovers single-cell gene expression signals by kernel density estimation. Bioinformatics 37, 2485–2487 (2021).
  • 57. Borcherding N. et al. Mapping the immune environment in clear cell renal carcinoma by single-cell genomics. Commun Biol 4, 122 (2021).
  • 58. Andreatta M & Carmona SJ UCell: Robust and scalable single-cell gene signature scoring. Computational and Structural Biotechnology Journal 19, 3796–3798 (2021).
  • 59. Subramanian A. et al. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA 102, 15545–15550 (2005).
  • 60. Hsieh TC, Ma KH & Chao A iNEXT: an R package for rarefaction and extrapolation of species diversity (Hill numbers). Methods Ecol Evol 7, 1451–1456 (2016).
  • 61. Kidera A, Konishi Y, Oka M, Ooi T & Scheraga HA Statistical analysis of the physical properties of the 20 naturally occurring amino acids. J Protein Chem 4, 23–55 (1985).
  • 62. Moon KR et al. Visualizing structure and transitions in high-dimensional biological data. Nat Biotechnol 37, 1482–1492 (2019).
  • 63. Voigt AP et al. Spectacle: An interactive resource for ocular single-cell RNA sequencing data analysis. Exp Eye Res 200, 108204 (2020).
  • 64. Tickotsky N, Sagiv T, Prilusky J, Shifrut E & Friedman N McPAS-TCR: a manually curated catalogue of pathology-associated T cell receptor sequences. Bioinformatics 33, 2924–2929 (2017).
  • 65. Shugay M. et al. VDJdb: a curated database of T-cell receptor sequences with known antigen specificity. Nucleic Acids Research 46, D419–D427 (2018).
  • 66. Vita R. et al. The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Research 47, D339–D343 (2019).
  • 67. Zhang W. et al. PIRD: Pan Immune Repertoire Database. Bioinformatics 36, 897–903 (2020).
  • 68. Finak G. et al. MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data. Genome Biol 16, 278 (2015).
