Abstract
The network of thymic stromal cells provides essential niches with unique molecular cues controlling T cell development and selection. Recent single-cell RNA sequencing studies have uncovered previously unappreciated transcriptional heterogeneity among thymic epithelial cells (TEC). However, there are only very few cell markers that allow a comparable phenotypic identification of TEC. Here, using massively parallel flow cytometry and machine learning, we deconvoluted known TEC phenotypes into novel subpopulations. Using CITEseq, these phenotypes were related to corresponding TEC subtypes defined by the cells’ RNA profiles. This approach allowed the phenotypic identification of perinatal cTEC and their physical localisation within the cortical stromal scaffold. In addition, we demonstrate the dynamic change in the frequency of perinatal cTEC in response to developing thymocytes and reveal their exceptional efficiency in positive selection. Collectively, our study identifies markers that allow for an unprecedented dissection of the thymus stromal complexity, as well as physical isolation of TEC populations and assignment of specific functions to individual TEC subtypes.
Subject terms: T cells, Thymus
T cell development requires functionally diverse thymic epithelial cell (TEC) populations performing specific functions. Here, using massively parallel flow cytometry and machine learning, the authors examine in mice the TEC compartment from the perinatal period to adulthood, identify novel phenotypic markers and characterize the function of perinatal cortical TEC.
Introduction
The thymus is essential for the formation and maintenance of the adaptive immune system. Its stroma provides a unique microenvironment promoting the generation and selection of T lymphocytes tolerant to an individual’s own tissue antigens yet responsive to an unlimited range of pathogens or malignantly transformed cells. Thymic epithelial cells (TEC) constitute the major cellular element of the stromal scaffold1–3. Other cellular components of the stroma are different mesenchymal cell types and endothelial cells2,4. TEC attract blood-borne lymphoid progenitors, commit them to a T cell fate, provide the molecular cues essential for expansion and differentiation, and shape the T cell antigen receptor (TCR) repertoire via stringent processes of positive and negative selection based on the cells’ antigen specificity5–7.
The TEC compartment is composed of separate cortical (c) and medullary (m) lineages which have typically been defined by the cells’ anatomical location, a limited number of phenotypic markers and several functional characteristics8–10. The surface marker Ly51 and reactivity to UEA1 are used to distinguish between cTEC (Ly51+UEA1−) and mTEC (Ly51−UEA1+). Markers such as CD80 and MHCII identify subsets of mTEC such as immature (CD80loMHCIIlo; mTEClo) and mature epithelia (CD80hiMHCIIhi; mTEChi). The latter cells are further differentiated based on the cells’ capacity to express the Autoimmune Regulator (Aire)11,12. Recent single-cell RNA-sequencing (scRNAseq) uncovered a remarkable TEC heterogeneity which could previously not be appreciated using the few cell surface markers available for flow cytometry13–15. For instance, a scRNAseq analysis of the TEC compartment of 4-week-old mice demonstrated a single cortical TEC type but 4 separate mTEC subtypes, namely immature and a mature mTEC, post-Aire mTEC and tuft-like mTEC15. Investigations of TEC heterogeneity across the life trajectory (1–52-weeks of age) identified 9 different TEC subtypes whose relative frequencies vary with age13. For example, the cTEC compartment is composed of at least two main subtypes, designated perinatal and mature cTEC13. Perinatal cTEC represent a major subpopulation early in life (~40% of all TEC in the first week after birth) but their relative frequency rapidly decreases thereafter with only a small fraction of these cells being detected in adult animals. Conversely, mature cTEC increase in frequency and represent the majority of cortical epithelia from 4 weeks of life onwards13. Intertypical TEC are characterised by a gene expression profile that includes signatures typical for both cortical and medullary thymic epithelial lineages. They also express genes including Pdpn, Ccl21a, Ly6a, and Plet1 that have previously been associated with mTEC, thought to have a progenitor potential, and localised at the cortico-medullary junction (CMJ)16–20. However, only few of these transcriptionally defined TEC subsets can currently be assigned to any cytometrically characterised TEC subpopulation, (for clarity, we refer to transcriptionally defined TEC clusters as subtypes and cytometrically specified TEC as subpopulations). Moreover, the hitherto absence of suitable and informative cell surface markers to physically isolate most of the TEC subtypes for in vitro analyses and in vivo transfer studies has disallowed to date further functional characterizations of these cells and the physical establishment of direct precursor-progeny relationships at single-cell resolution.
To address this limitation, we sought to screen mouse TEC for the expression of 260 cell surface markers employing massively parallel flow cytometry and the Infinity Flow computational pipeline to infer a co-expression pattern for any of the tested epitopes21. This approach identified several novel TEC surface markers that when suitably combined identified perinatal cTEC, intertypical TEC and tuft-like mTEC which had previously only been classified either by their distinct RNA expression profiles or a combination of cell surface and intracellular markers22,23. The identity of these phenotypically defined TEC subpopulations was verified by simultaneous measurements of mRNA and surface protein expression using CITEseq and, in the case of perinatal cTEC, further characterised functionally, spatially, and developmentally.
Results
Establishment of a cell surface expression atlas across thymic stromal cell subsets
To resolve thymic stroma heterogeneity at a phenotypic level, we sought to identify new cell surface markers that reliably and accurately identify TEC subsets hitherto only defined by the cells’ individual gene expression profiles. For this purpose, we used massively parallel flow cytometry for 260 individual cell surface markers followed by an analysis employing Infinity Flow, a computational machine learning algorithm21. Thymic stomal cells were isolated as single cells from 1-, 4-, and 16-week-old mice, physically enriched and subsequently stained for 12 backbone markers that either alone or in combination reliably identified haematopoietic (CD45), different epithelial (EpCAM1, Ly51, UEA1, MHCII, CD40, CD80, CD86, Sca1, AIRE, Podoplanin), endothelial (CD31) and some mesenchymal cells (Sca1, Ly51, Podoplanin; Fig. 1a). As a next step, cells were split into aliquots and stained individually for 260 exploratory markers (Supplementary Table 1). Following sample acquisition, Infinity Flow was used to impute the expression level of each of the exploratory markers tested at single-cell resolution21. The resultant predictions of expression are based on non-linear functions of the recorded backbone markers. The observed heterogeneity and co-expression patterns were further analysed and visualised by the single-cell analysis pipeline Seurat24. Hierarchical clustering of the data resulted in 7 clusters for data drawn from 1-week-old mice and in 10 clusters for that of older animals, as illustrated in two dimensions by a Uniform Manifold Approximation and Projection (UMAP) (Supplementary Fig. 1a–c).
At each of the three separate timepoints, the major thymic stromal cell types, epithelia, fibroblasts, pericytes, and endothelial cells, could reliably be identified based on the expression of key markers including EpCAM1 (CD326) identifying TEC, CD140a marking fibroblasts, Ly51 and CD146 singling out pericytes, and CD31 staining endothelial cells. Additional markers identified subpopulations within these cell clusters (Supplementary Fig. 1a–c). Several of the antibody specificities to detect backbone epitopes were also included among the selected 260 exploratory markers (e.g., EpCAM1, CD31, Ly51, and Sca1) which allowed direct comparisons between exploratory and identical backbone markers, thus verifying the utility of the Infinity Flow algorithm. For these markers, we noted highly similar expression profiles, therefore demonstrating the reliability of the computational approach taken (Supplementary Fig. 1d).
The initial expression analysis not only confirmed by flow cytometry the heterogeneity among thymic stromal cell types, but also revealed a dynamic change over time in the relative representation of individual TEC subpopulations (Supplementary Fig. 1a–c). In a second analysis, we focused exclusively on EpCAM1+ cells and disclosed in 1-week-old but not older mice three separate cTEC subclusters as defined by the cells’ differential expression of Ly51 and UEA1 thus illustrating a greater heterogeneity of the cTEC population early in life (Fig. 1b–d and Supplementary Fig. 2a–d). In mice 4 weeks of age and older, mTEC with a low surface expression of MHCII (designated mTEClo) segregated into 4 separate subclusters based on the differential expression of the surface markers analysed (Fig. 1b–d).
cTEC heterogeneity identified by differential cell surface marker expression
We next queried whether the expression of CD83, CD40, HVEM (CD270), and Ly51 unequivocally classified individual cTEC subpopulations, since their intensity profile differed across cTEC clusters identified in 1-week-old mice (Fig. 2a). The expression of CD40 and HVEM were exclusively restricted to a subcluster designated cTEC I (see below) whereas the two other markers were detected across all cTEC subclusters, but with a stronger signal on the cTEC I (Fig. 2a).
Analysing a previously published scRNAseq dataset of TEC13, transcripts for Cd83, Cd40, and Enpep (encoding Ly51) were detected in perinatal cTEC, albeit at various levels (Fig. 2b, c). In contrast, transcripts for Tnfrsf14, the gene encoding HVEM, were detected in only a few TEC but across several clusters, thus failing to unequivocally identify perinatal cTEC. This finding highlighted the limitations of gene expression studies to identify surface markers that matched the cells’ RNA profile. To further assess the relationship of the cTEC I subcluster to TEC subtypes identified by scRNAseq, we generated a score of similarity using SingleR which related the RNA expression profile of individual cells to the computed cell surface expression pattern of cTEC I. This analysis demonstrated the highest similarity score to the pairing of cTEC I with perinatal cTEC (Fig. 2d).
We then aimed to define surface markers that allow the isolation of perinatal cTEC by flow cytometry. We used the presence of markers highly expressed in cluster cTEC I of 1-week-old mice while excluding markers detected on the majority of mature cTEC isolated from 4- to 16-week-old animals (Figs. 1b–d; 2a), as the population of cortical epithelia in older animals only includes perinatal cTEC at a very low frequency. Within the population of cTEC, we identified a subpopulation of cells that concomitantly expressed CD83 and CD40 but were Sca1 negative early in postnatal life (Fig. 2e; Supplementary Fig. 3a). As early as 4 weeks postnatally, TEC with a Sca1+ phenotype appeared among CD83+CD40+ cTEC. These cells were electronically excluded from further analysis as they represent mature cTEC that accumulate with age. The frequency of CD83+CD40+Sca1− epithelia changed substantially during the life course as the cells’ relative representation progressively increased throughout organogenesis, plateaued in 1-week-old mice (4.6 × 104 ± 1.4 × 104 cells) and subsequently decreased, displaying the lowest representation in 8-week-old animals (2.4 × 103 ± 2.3 × 103 cells; Fig. 2e, f; Supplementary Fig. 3b).
Perinatal cTEC displayed in contrast to other cTEC subpopulations, comprising cTEC clusters II and III, higher cell surface levels for HVEM, Ly51 and MHCII which allowed the identification of these cells in combination with a high cell surface expression of CD83 and CD40, in the absence of Sca1. Perinatal cTEC also showed a higher Foxn1 promoter activity as demonstrated in reporter mice where the expression of GFP is under the transcriptional control of the Foxn1 locus (Fig. 2g).
Identification of intertypical and tuft-like TEC
The Infinity Flow analysis of adult mice revealed 4 distinct mTEClo clusters. Clusters I and II were observed in 4- and 16-week-old mice and displayed a similar expression profile for most of the 260 exploratory markers (Fig. 1b–d) including a shared expression of Sca1 and CD146 (Fig. 3a, b). To match these two phenotypically defined subpopulations to their corresponding transcriptome-determined TEC subtypes, we probed the RNA profiles of single TEC for the expression of Ly6a/Ly6e (encoding Sca1) and Mcam (encoding CD146). While transcripts for Ly6a/Ly6e were detected especially among intertypical TEC13, Mcam-specific RNA was only detected at low levels and in different TEC, but mostly within the intertypical TEC subtype (Fig. 3c). This subtype is characterised by transcriptional features characteristic of both cTEC and mTEC —as phenotypically defined by the conventional surface marker Ly51 and UEA-reactivity— contribute to this unique TEC subtype13. Using again the SingleR package, the similarity scores for both 4- and 16-week-old mTEClo I and II were calculated to be the highest when matched to the intertypical TEC subtype (Fig. 3d; Supplementary Fig. 4a). The mTEClo subpopulation contained cells that co-expressed Sca1 and CD146 and cells with this phenotype increased with postnatal age (Fig. 3e, f; Supplementary Fig. 4b). Although initially only detected among mTEClo this subpopulation was increasingly also observed within the cTEC compartment of mice older than 4 weeks of age (Fig. 3e, f).
A unique set of surface markers that specifically recognised the mTEClo cluster III could not be found. However, the simultaneous expression of CD66a and CD117 in the absence of Sca1 and CD63 positivity identified mTEClo cluster IV. This cluster was only detected in adult mice (Fig. 4a; Supplementary Fig. 5a), although the defining 4 cell surface markers could also be detected in the mTEClo cluster II of 1-week-old animals (Supplementary Fig. 5a). It is therefore possible that cluster mTEClo II of 1-week-old mice represents epithelia that form the separate cluster mTEClo IV in older animals. The single-cell transcriptomic analysis only partially matched the phenotypic analysis of mTEClo clusters as Ly6a/Ly6e- and Cd63-specific RNA could indeed be detected in the vast majority of TEC (Figs. 3c and 4b) whereas transcripts for Ceacam1 (encoding CD66a) and Kit (encoding CD117) were largely absent in these cells (Fig. 4b). The similarity score revealed the best match between mTEClo cluster IV and post-AIRE and tuft-like mTEC (Fig. 4c; Supplementary Fig. 5b).
We next sought to define a phenotypic profile of cells belonging to cluster mTEClo IV that would allow their physical isolation by flow cytometry. For this purpose, we identified within the Sca1−CD63− mTEClo a subpopulation of cells that stained positively for both CD66a and CD117, thus mirroring the features identified for mTEClo cluster IV (Fig. 4d). Most of the Sca1−CD63−CD66a+CD117+ mTEClo cells (~70%) also expressed the serine/threonine-protein kinase Dclk1 (Fig. 4e) which was previously identified as a typical intracellular marker for tuft-like mTEC15, 23. While staining for the tuft-like mTEC marker L1CAM was not successful in our hands, both the Dclk1 negative and positive fractions shared the CD104lo tuft-like mTEC phenotype as previously described (Supplementary Fig. 5c)15. Furthermore, the presence of Dclk1 negative cells within tuft-like mTEC is supported by previous scRNAseq results13,22. We also observed that an absence of Dclk1 in Sca1−CD63−CD66a+CD117+ mTEClo correlated with a lower surface expression of CD66a and CD117, indicating that these cells are not yet fully differentiated into tuft-like mTEC (Supplementary Fig. 5d). Conversely, the vast majority of Dclk1-positive TEC were detected among Sca1−CD63−CD66a+CD117+ mTEClo (Supplementary Fig. 5e). In the absence of the transcription factor Pou2f3 Dclk1 expression was absent and Sca1−CD63−CD66a+CD117+ mTEClo cells were not detected (Fig. 4f, g) thus confirming their identity as tuft-like thymic epithelia22,23. Using these phenotypic features, we noted the presence of tuft-like mTEC to change over time with the highest frequency and cellularity in 4-week-old mice (Fig. 4h, i). As CD66a−CD117− cells (“non-tuft-like”) made up a substantial portion (~40% at 4 weeks) of the relatively few Sca1−CD63− mTEClo, we sought to further determine their identity via bulk RNAseq. Specifically, we investigated whether these cells shared any transcriptomic features characteristic of tuft-like mTEC. This analysis revealed major transcriptional differences between the non-tuft-like and tuft-like cells. Genes associated with tuft-like cells were enriched in the CD66a+CD117+ fraction of cells and included Dclk1, Il25, Ceacam1, and Kit (Supplementary Fig. 5f). Similarly, the top genes identified in a previous scRNAseq experiment13 defining tuft-like mTEC were expressed at a very high level in these tuft-like cells whilst their transcripts were not present in non-tuft-like cells (Supplementary Fig. 5g). Correlation of the entire non-tuft-like transcriptome to the annotated TEC subsets confirmed this delineation from cells identified as tuft-like mTEC (Supplementary Fig. 5h). Because the CD66a−CD117− non-tuft-like mTEC displayed a gene expression profile that did not match any of the known mTEC subpopulations it is likely that these rare cells (~4% of total TEC) represent a mixture of mTEC subpopulations.
Tuft-like mTEC originate from mTEC that have once expressed the tissue restricted antigen Csnb15. We utilised CsnbCre::Rosa26LSL-YFP reporter mice15 that allow in vivo fate mapping within the mTEC linage to test whether Sca1−CD63−CD66a+CD117+ mTEClo originate from a Csnb-expressing precursor. In keeping with the previous study, we identified 70–80% of the Sca1−CD63−CD66a+CD117+ mTEClo mTEC to be YFP labelled (Supplementary Fig. 5i), suggesting that they represent bona fide tuft-like mTEC. The mTEClo compartment is composed of medullary epithelia that either have not yet expressed the transcriptional facilitator AIRE or, alternatively, belong to a group of cells that have differentiated from AIRE-positive, mature mTEC. Tuft-like mTEC have previously been shown to derive, at least in part, from AIRE-positive precursors22 and be enriched for expression of the surface glycoprotein Tspan8, an AIRE-enhanced tissue-restricted antigen14,25. We therefore tested whether Sca1−CD63−CD66a+CD117+ mTEClo (i.e. tuft-like) mTEC are positive for the expression of Tspan8. As many as 60% of tuft-like mTEC expressed Tspan8, further validating their identity and identifying these cells to contain post-AIRE mTEC (Supplementary Fig. 5j). The combined expression of YFP and Tspan8 was only detected in a small fraction of Sca1+CD146+ mTEClo intertypical TEC (Supplementary Fig. 5i,j), suggesting they belong to a TEC developmental stage that does not yet promiscuously express tissue-specific antigens.
CITEseq validates novel TEC markers
We next concurrently determined the abundance of surface proteins and mRNA expression in individual thymic stromal cells using Cellular Indexing of Transcriptomes and Epitopes by sequencing (CITEseq)26. This approach was also taken to confirm unequivocally that the combined use of the newly described cell surface markers identifies specific and previously defined TEC subtypes. For this purpose, CD45−Ter119− cells were isolated from thymi of 1- and 16-week-old mice and stained with oligonucleotide-coupled antibodies each directed against EpCAM1, MHCII, UEA1, CD80, CD86, CD40, CD83, HVEM, CD73, Sca1, CD63, CD117, CD200, CD54, CD49a, CD274, CD9, Ly6G/Ly6C, Podoplanin, or CD31, respectively. Labelled cells were further subjected to single-cell sequencing along with estimating the abundance of antibody-derived tags (ADTs)26.
Cell clustering resulted in the identification of 12 clusters with data computed either from single-cell gene expression profiling or, alternatively, from ADT, and displayed by means of t-distributed stochastic neighbour embedding (tSNE) (Supplementary Fig. 6a–c). The cell-type annotation performed was based on the gene expression profiles derived from the Immunological Genome Project (ImmGen)27 and confirmed the identity of individual clusters as endothelial cells, fibroblasts, stromal cells, and epithelial cells, respectively. Hence, the chosen combination of selected surface markers was sufficient to identify individual stromal cell types (Supplementary Fig. 6d, e).
Analysing the captured gene expression profiles of only TEC identified 8 separate clusters (A-H), whereas examining the ADT data recognised 9 (1–9) clusters (Fig. 5a, b). Comparing the two approaches demonstrated a nearly pairwise relationship (Fig. 5c). The few exceptions observed concerned on one hand the clusters D and E which represented a mixture of clusters 4 and 5 and on the other hand cluster G which split into clusters 7 and 8 as a result of the clusters’ differential expression of Ly6C/Ly6G (Fig. 5d). The limited number of antibodies used in the CITEseq analysis identified three cTEC cluster (defined as UEA1 non-reactive cells:1 corresponding to cluster A [1/A], 2/B, and 3/C), three mTEClo (MHCIIloCD80lo: 4/D, 5/E, and 9/H) and three mTEChi subpopulations (MHCIIhiCD80hi: 6/F, 7/G, and 8) (Fig. 5d–f and Supplementary Fig. 6f).
We next assigned CITEseq-defined TEC cluster identities to those subtypes we have previously classified13. Clusters 1/A and, to a lesser extent, 2/B corresponded to perinatal cTEC. Cluster 2/B was further related to mature cTEC and cluster 3/C to mature cTEC and intertypical TEC (Fig. 5d, f). Clusters 4/D and 5/E related to intertypical TEC while cluster 6/F was most similar to both mature mTEC and proliferating TEC. Clusters 7/G and 8/G displayed a high similarity to mature mTEC (Supplementary Fig. 6f) whereas cluster 9/H linked to tuft-like mTEC. Notably, the transcriptional signatures characteristic of post-Aire mTEC and neural (n) TEC could not be detected. Furthermore, we analysed the data for the expression of gene transcripts characterising the recently described mimetic mTEC subsets (designated tuft, microfold, enterocyte/hepatocyte, neuroendocrine, ciliated, ionocyte, keratinocyte, and muscle cells)28. Transcripts identifying “tuft cells” were detected in cluster 9/H (Supplementary Fig. 7) whereas mRNA signatures characterizing microfold-, enterocyte/hepatocyte-, and keratinocyte-like TEC were detected in clusters G/7-8 within the mature mTEC population. Genes associated with neuroendocrine, ciliated, ionocyte, and muscle cells among specified TEC subsets could not be robustly identified.
Replicating the changes of specific TEC subtypes with age, clusters 1/A and 2/B were more abundant in 1-week-old mice whereas the frequencies of clusters 3/C, 4/D, 5/E, 7/G, and 9/H were increased in 16-week-old animals (Fig. 5d; Supplementary Fig. 6g).
TEC in cluster 1/A displayed the highest CD83, CD40, and HVEM protein expression among CITEseq defined clusters, thus confirming the cells’ identity as perinatal cTEC. The expression of these markers was reduced in cluster 2/B and completely absent in cluster 3/C, suggesting the former to represent a developmentally intermediate cell state between perinatal and mature/intertypical-like cTEC (Fig. 5d, g). Furthermore, cTEC maturation was paralleled by a decrease in Foxn1 transcription and an increase in CD73, CD49a, and Sca1 protein expression (Supplementary Fig. 6h).
The differential CD117, CD63, and Sca1 protein expression (as measured by ADT) identified cluster 9/H as tuft-like mTEC (CD117+CD63−Sca1−; see above and Fig. 5d, h) and thus confirmed the flow cytometric definition and gating strategy used to identify these cells as both accurate and practical (Fig. 4g–i). This conclusion was further corroborated by the detection of Dclk1 and Ceacam1 transcripts in cluster 9/H (Fig. 5h) and a high similarity score with the tuft-like mTEC subtype (Fig. 5d, f)13.
ADT-based detection of Sca1 protein expression matched to cells with a transcriptional signature of intertypical TEC within the cTEC (cluster 3/C) and mTEClo subpopulations (clusters 4 + 5/D +E; Fig. 5d, f, h). Hence, intertypical TEC could unequivocally be identified by Sca1 expression alone. As the transcriptional signature identifying intertypical TEC was spread across three CITEseq-defined clusters (3/C, 4/D and 5/E; Fig. 5d, f), the detection of CD146 expression appeared to deconvolute TEC heterogeneity further since fractions of Sca1+ cTEC and Sca1+ mTEClo stained positively for CD146+ (Fig. 3e).
The ADT-based documentation of surface markers identified individual TEC subtypes. However, the corresponding gene expression profiles were on their own insufficient to recognize these cells, not least because of the occasional discrepancy between surface protein and RNA expression (Supplementary Fig. 6h, i). CITEseq could therefore validate the utility of the selected, novel surface markers and the gating strategies chosen. Together they identified four TEC subpopulations that correspond to a specific transcriptionally defined cluster and two subpopulations that represent a mixture of two related clusters, namely UEA1−CD83+CD40+Sca1− perinatal cTEC (cluster 1/A), UEA1−CD83−CD40−Sca1− mature cTEC (2/B), UEA1−CD83−CD40−Sca1+ intertypical-like cTEC (3/C), UEA1+MHCIIloCD80loSca1+ intertypical mTEC (4 + 5/D + E), UEA1+MHCIIhiCD80hi mature mTEC (6 + 7 + 8/F + G), and UEA1+MHCIIloCD80loSca1−CD63−CD66a+CD117+ tuft-like mTEC (9/H) (Supplementary Fig. 8).
Perinatal cTEC present an enhanced potential for positive selection
We next sought to localize perinatal cTEC within the thymus stromal architecture. Because Ly51 abundance was higher on perinatal cTEC in comparison to other cortical epithelial populations (Fig. 2g), we used this differential to localize perinatal cTEC on thymus tissue sections (Fig. 6a). Quantification of the Ly51 signal intensity in immunohistology detected these cells in close proximity to the medulla with gradual increase of the Ly51 signal but invariable cytokeratin 8 (K8) staining across the cortex from subcapsular region to the inner and eventually deep cortex (Fig. 6a, b).
The newly assigned surface markers enabled the isolation and functional testing of individual cTEC populations ex vivo. We therefore investigated the capacity of perinatal (CD83+CD40+Sca1−) and non-perinatal (CD83−CD40−) cTEC to effect positive thymocyte selection. We co-cultured these cells in thymic epithelial cell cultures (TECx) together with CD69−CD4+CD8+ (i.e., pre-selection double-positive) thymocytes for two days (for gating see Supplementary Fig. 9). Thymocytes were then analysed for phenotypic features associated with positive thymic selection, i.e., the upregulation of TCR and CD5 (Fig. 6c). The number of total thymocytes and those with a CD5hiTCRβhi phenotype were significantly increased in TECx composed of perinatal cTEC when compared to aggregates composed of other cortical epithelia (Fig. 6d). Taken together, these results identified perinatal cTEC to be juxtaposed to the medulla and particularly efficient in positively selecting thymocytes.
The number of perinatal cTEC significantly decreases with age13. We therefore explored whether this variation was paralleled by a change in the efficiency to impose positive thymocyte selection. We thus monitored and compared thymocyte maturation in 4- and 16-week-old thymi and classified their sequential maturational stages according to TCRβ and CD69 expression (i.e., stage 0: TCRβ−CD69− → stage 1: TCRβ+CD69+/− → stage 2: TCRβ+CD69+ → stage 3: TCRβ+CD69−). The frequency of pre-selection thymocytes (i.e., stage 0) was increased, whereas the relative abundance of cells with a post-selection phenotype (stage 2 and 3) was significantly reduced in older animals (Fig. 6e, f). These in vivo results indicated a compromised capacity of older mice to positively select thymocytes, which correlated with a decrease in the availability of perinatal cTEC (Fig. 2e, f).
Crosstalk with thymocytes induces maturation of perinatal cTEC
We finally investigated whether thymic crosstalk8,29 could explain the inverse correlation between the decreased frequency of perinatal cTEC with age and the expansion of thymocytes after birth. We therefore first determined the frequency of perinatal cTEC in Rag2-deficient (Rag2−/−) mice, which have a hypoplastic thymus secondary to a thymocyte developmental arrest at the DN3a stage. We found a high fraction of perinatal cTEC in these mice that was not influenced by age (Fig. 7a). Hence, thymocytes at developmental stages up to the beta-checkpoint did not influence the age-related changes in perinatal cTEC frequencies.
To probe whether thymocytes at later developmental stages, especially unselected CD4+CD8+ (double positive, DP) thymocytes, controlled the frequency of perinatal TEC, Rag2−/− mice were injected with antibodies directed against CD3ε. This treatment results in a substantial increase in pre-selection DP thymocytes30–32. Four weeks after antibody or control injections, the thymus of actively treated Rag2−/− mice contained an abundance of DP thymocytes which correlated with numerical and phenotypic changes in the cTEC compartment (Fig. 7b–e). The latter were marked by a reduction in perinatal cTEC, parallel to an increase in non-perinatal cTEC, specifically Sca1+ cells (Fig. 7d, e) which corresponded to intertypical TEC according to our CITEseq data (Fig. 5d, f, h). Taken together, these results identified the abundance of and/or signalling by pre-selection DP thymocytes as the mechanism by which the frequency of perinatal cTEC was controlled.
Discussion
Single-cell transcriptomic analyses have uncovered an unexpected heterogeneity within many cell populations of a seemingly identical phenotype. Cells of the thymic stromal compartment constitute no exception to this observation4, 13–15,22. The apparent lack of suitable cell surface markers identifying unequivocally TEC subpopulations identical to individual TEC subtypes precludes the isolation of live TEC and their ex vivo functional analysis. Here we report that this limitation has been substantially overcome. We describe novel cell surface markers that identify the comparable subtype of scRNAseq-defined perinatal, mature and intertypical cTEC, and mature, intertypical and tuft-like mTEC.
Predicting the cell surface phenotypes from corresponding scRNAseq profiles is challenging as technical limitations detecting low transcript copy numbers and the acknowledged disparity between transcript detection and protein expression render this attempt difficult. For example, a comparison of 7 scRNAseq methods uncovered that high-throughput methods, including the widely used 10x Chromium, have lower sensitivities in comparison to the low-throughput methods Smart-seq2 and CEL-Seq2 when capturing rare transcripts33,34. We therefore opted for an alternative method and stained TEC for the expression of hundreds of cell surface markers. This screening approach of massive parallel flow cytometry combined with Infinity Flow analysis discovered surface markers previously not inferred to be expressed by TEC. CITEseq which combines the detection of promising candidate markers and single-cell transcriptomic profiles finally established the accuracy of the cell surface markers chosen to identify TEC subtypes.
By applying the newly identified surface markers we show that perinatal cTEC are enriched within the TEC scaffold towards the cortico-medullary junction (CMJ) and that these cells are particularly efficient in positively selecting maturing thymocytes. As expected from their role in shaping the TCR repertoire, we find perinatal cTEC typically juxtaposed to thymocytes with an activated phenotype (i.e., CD69+). The age-dependent decline in the frequency of perinatal cTEC is noted both when using flow cytometry and scRNAseq to classify these cells. The actual pace by which this regression is observed differs, however, between these two methods, demonstrating a seemingly faster kinetic for perinatal cTEC identified by their RNA expression profile13. This may be explained by differences in the half-lives of specific transcripts and their corresponding proteins. Nonetheless, a post-natal decrease of these cells to an almost complete absence early in adulthood is expected to compromise the robustness of thymopoiesis and possibly the efficiency by which thymocytes are positively selected. The observed decrease in perinatal cTEC correlates with other compositional changes within the epithelial scaffold and may constitute an intrinsic driver for thymus senescence. This understanding is consistent with scRNAseq data that shows the quiescence of a population of medullary precursor cells and correlates this alteration with an impaired maintenance of the medullary TEC compartment13. In parallel, these age-related changes link to less efficient T cell selection, a decreased self-antigen representation, a decreased TCR repertoire diversity, and a reduced frequency of thymus-resident naïve T cells (this report and ref. 13).
Intertypical TEC express the glycosyl phosphatidylinositol-anchored cell surface protein Sca1 independently whether they are positive for the cortical marker Ly51+ or reactive with the lectin UEA-1, a general feature of medullary TEC. Here, we now show that intertypical TEC can be further split into two sizable subpopulations based on their expression of the cell adhesion molecule CD146. Because oligonucleotide labelled anti-CD146 antibodies were not available for the CITEseq analysis and transcripts for this marker are typically lowly expressed among intertypical TEC, a distinction between CD146 negative and positive cells was not possible when analysing scRNAseq data. Intertypical TEC may contain progenitors with a developmental bias towards the mTEC lineage13. Indeed, a recent report provides further support of this contention since the gene expression profile of intertypical TEC largely overlaps with a heterogeneous progenitor population which has been claimed to act as mTEC biased postnatal TEC progenitor20. Our profiling of Tspan8 expression and lineage tracing furthermore suggest that the majority of Sca1+CD146+ intertypical TEC relate to immature mTEC that have not yet fully differentiated to express collectively a broad range of tissue restricted antigens.
We further specify the cell surface phenotype for tuft-like mTEC (L1CAM+CD104+15). These cells share transcriptional (e.g. expression of IL25, Trmp5, Dclk1, and IL17RB) and morphological characteristics with gut epithelial tuft cells13,15,22, play a function in central T cell tolerance induction22 and control both the homeostasis of type 2 innate lymphoid cells and the generation of type 2 natural killer T cells15,22,23. The lineage tracing of these cells further shows that the majority of tuft-like mTEC derive from AIRE expressing TEC, a finding in keeping with previously reported observations15,22. However, whether all tuft-like mTEC differentiate from mature mTEC remains an open question because the labelling method needed to draw this conclusion (i.e., inducible, Aire-dependent tracing) may not be completely effective22. Interestingly, the Csnb lineage tracing identified an increased frequency of labelled tuft-like cells in comparison to mature mTEC where labelling is initiated. While we have no unequivocal explanation for this increase, we nevertheless conclude that the majority of tuft-like mTEC (at least 60%) are the progeny of mature medullary epithelia. There is, however, room to speculate that all tuft-like mTEC may be derived in this way, since any contributions from another Csnb-negative (i.e., non-labelled) precursor would dilute the frequency of labelled tuft-like cells, a result that we did not observe.
To probe the utility of the new set of cell surface markers to phenotype altered TEC scaffolds, we next analysed the composition of the thymic epithelia in FOXN1Δ505/WT mice which express a dominant negative mutation of FOXN1 and consequently show substantial defects in TEC differentiation35. A previous scRNAseq-based analysis of these animals revealed a relative enrichment of perinatal and mature cTEC against a reduction of tuft-like mTEC whereas the frequency of intertypical TEC remained unchanged. The flow cytometric analysis of the TEC scaffold in FOXN1Δ505/WT mice identifies the same variations and thus maps accurately to the transcriptional analysis of these cells, thus demonstrating that the phenotypic and gene expression-based analyses draw comparable conclusions (Supplementary Fig. 10).
With the approach taken, five of the previously defined nine TEC subtypes13 can now be unequivocally identified using cell surface markers. The still small number of discriminatory cell surface markers so far identified likely accounts for this minor limitation. Implementing more markers in the screening process may identify additional cell surface markers that will identify the remaining TEC subtypes for which we have not yet identified an unambiguous cell surface marker profile. Alternatively, the use of intracellular markers may be informative in identifying the remaining TEC subtypes, namely proliferating TEC, post-AIRE mTEC, and nTEC. However, an obvious drawback for this approach will be that TEC identified in this fashion will be non-viable and can therefore not be used for functional studies in vitro or after transfer in vivo.
It is important to establish precursor-progeny relationships for specific TEC subtypes now that we can identify and purify specific subpopulations. For example, the potential can be tested whether CD146+ intertypical TEC give rise to mature mTEC, competent to effect negative selection. Another effort could be directed in dissecting the molecular requirements of tuft-like mTEC controlling the development of type 2 lymphoid cells employing ex vivo functional assays. Finally, the screening workflow described here will also be valuable in identifying novel biomarkers apt to monitor changes in cell subpopulations and their functions resulting from spontaneous or engineered changes in gene function.
Taken together, we have identified novel surface markers that enable the isolation and functional assessment of different TEC subpopulations that correspond to previously identified subtypes so far only defined by their transcriptome. This is accomplished by combining a high throughput screening workflow with a computational expression projection followed by unsupervised clustering.
Methods
Mice
Animals were maintained under specific pathogen–free conditions at the University of Oxford Biomedical Science facilities under local and United Kingdom Home Office regulations and permissions. They were housed in cages grouped to a maximum of five animals per cage on wood litter. Environmental enrichment such as tunnel, gnawing and nesting material was provided. The animal house was maintained under artificial lighting (12 h) between 7:00 am and 7:00 pm, in a controlled ambient temperature of 22 °C ± 2 °C, and relative humidity between 30% and 70%. Experiments performed were approved by the United Kingdom Home Office regulations and co-housed age- and gender-matched wild-type C57BL/6 mice were used in all experiments as a reference for genetically modified animals.
CsnbCre mice15 were crossed to the Rosa26YFP mouse line36 to induce lineage tracing in the mature mTEC compartment.
Mice heterozygous for a Foxn1 allele with a single nucleotide loss at position 1470 (designated FOXN1Δ505/WT) were generated at the Genome Engineering Facility of the MRC Weatherall Institute of Molecular Medicine, University of Oxford as previously described35.
Rag2−/− mice were bred and maintained in the mouse facility of the Department of Biomedicine at the University of Basel in accordance with permissions and regulations of the Cantonal Veterinary Office of Basel-Stadt.
For timed pregnancies 7- to 14-week-old mice were mated over-night and separated early next morning. For pregnant females the mating was considered E0.5 that morning.
TEC isolation
Isolated thymi were cleaned from adipose tissue, separated into the two lobes, and subsequently subjected to three rounds of enzymatic digestion with Liberase (2.5 mg/ml, Roche, Cat no: 5401127001) and DNaseI (10 mg/ml, Roche, Cat no: 10104159001) diluted in PBS (Gibco, Cat no: 70011044) at 37 °C. After filtration through a 100-μm cell strainer and resuspension in FACS buffer (PBS supplemented with 2% FBS), cell number was determined using a CASY cell counter (Innovatis). For most analyses CD45+ hematopoietic cells were depleted by incubation with anti-CD45 beads (Miltenyi) as per manufacturer’s recommendations and subsequently subjected to the AutoMACS separator (Miltenyi) “depleteS” program.
Flow cytometry
Cells were counted and stained in FACS buffer containing antibodies of interest for 30 min at 4 °C in the dark. For the identification of dead cells an additional staining with propidium iodide (PI, Sigma, Cat no: P4864) or Zombie red (Biolegend, Cat no: 423110) was used. For intra-cellular staining, cells were fixed and permeabilised after cell-surface staining using the Cytofix/Cytoperm (BD Biosciences, Cat no: 554714) or the Transcription Factor Staining Buffer Set (Invitrogen, Cat no: 00-5523-00) according to the manufacturer’s protocol. Cells were analysed and sorted on a BD FACSAria III instrument (BD Biosciences). Cells were sorted into FACS buffer. Cell purities of at least 95% were confirmed by post-sort analysis. The following antibodies were used: CD4-APCCy7 (1:400, GK1.5, Biolegend), CD4-FITC (1:1000, GK1.5, Biolegend), CD5-PerCPCy5.5 (1:400, 53-7.3, Biolegend), CD8α-AF700 (1:800, 53-6.7, Biolegend), CD40-FITC (1:200, 3/23, Biolegend), CD45-AF700 (1:400, 30-F11, Biolegend), CD63-PE (1:400, NVG-2, Biolegend), CD66a-APC (1:400, Mab-CC1, Biolegend), CD66a-FITC (1:400, Mab-CC1, Biolegend), CD69-PECy5 (1:800, H1.2F3, Biolegend), CD69-PECy7 (1:200, H1.2F3, Biolegend), CD73-BV421 (1:400, TY/11.8, Biolegend), CD80-PECy5 (1:2000, 16-10A1, Biolegend), CD80-BV605 (1:400, 16-10A1, Biolegend), CD83-Bio (1:200, Michel-19, Biolegend), CD83-PE (1:400, Michel-19, Biolegend), CD104-FITC (1:400, 346-11A, Biolegend), CD117-BV421 (1:200, 2B8, BD Biosciences), CD146-APC (1:800, ME-9F1, Biolegend), EpCMA1-PerCPCy5.5 (1:800, G8.8, Biolegend), Dclk1 (1:1000, ab31704, Abcam), HVEM-APC (1:400, HMHV-1B18, Biolegend), Ly51-PECy7 (1:400, 6C3, Biolegend), MHCII-APC/Fire750 (1:2000, M5, Biolegend), UEA1-Cy5 (1:400, Vector Laboratories, in-house labelling), Sca1-BV510 (1:800, D7, Biolegend), TCRβ-FITC (1:400, H57-597, Biolegend), TCRβ-PE (1:1000, H57-597, Biolegend), Tspan8-APC (1:400, FAB6524A, R&D Systems). Biotinylated antibodies were detected using Streptavidin-BV785 (1:500, Biolegend) and unlabelled Dclk1 using anti-rabbit IgG AF647 (1:4000, Invitrogen). Data was analysed using FlowJo (version 10).
Massively parallel flow cytometry
Cells were isolated and CD45 depletion plus backbone staining were performed as described. The surface backbone panel included antibodies directed against CD45 (1:400, 30-F11, AF700, Biolegend), EpCAM1 (1:800, G8.8, PerCPCy5.5, Biolegend), Ly51 (1:400, 6C3, PECy7, Biolegend), MHCII (1:2000, M5, BV510, Biolegend), CD40 (1:200, 3/23, PECy5, Biolegend), CD80 (1:400, 16-10A1, BV605, Biolegend)), CD86 (1:800, GL-1, BV650, Biolegend), Sca1 (1:1000, D7, BV785, Biolegend), Podoplanin (1:200, 8.1.1, APC, Biolegend), CD31 (1:1000, 390, FITC, Biolegend), the Ulex europaeus agglutinin I (UEA1) lectin labelled with biotin (1:1000, Vector Laboratories), followed by secondary streptavidin-BV421 (1:1000, Biolegend) staining and Zombie red (1:1000, Biolegend) staining. Subsequently, the stained cells were distributed across the three 96-well plates provided with the LEGENDScreen kit (Biolegend, Cat no: 700009), each well containing a unique PE-labelled exploratory antibody as well as isotype controls and blanks. PE-labelled antibodies targeting GP2, Tspan8, CD177 and F3 were used as additional exploratory surface antibodies. Due to the low cell numbers obtained after CD45 depletion only ¼ of the recommended quantity of exploratory antibodies was used. Plates were incubated at 4 °C for 30 min in the dark. Thereafter, fixation was performed using the Cytofix buffer (BD Biosciences, Cat no: 554714) for 1 h at 4 °C in the dark. As an additional backbone marker, cells were stained intracellularly for anti-AIRE (1:400, 5H12, AF750, Invitrogen) in Cytoperm buffer (BD Biosciences, Cat no: 554714) and one well stained with anti-FOXN137 (1:1600, 2/41, PE, kind gift from Hans-Reimer Rodewald) as an additional exploratory marker, over-night at 4 °C in the dark. The next day cells were resuspended in 100 μl FACS buffer before analysis.
Infinity Flow and single-cell clustering and expression analysis
For the Infinity Flow computational analysis of the LEGENDScreen datasets, the acquired fcs files were gated on CD45 negative cells or specifically on EpCAM1+ TEC using the FlowJo software. The newly exported fcs files were then used as the dataset for the Infinity Flow pipeline as recently published21. The augmented data matrices generated during this process were then further analysed using the Seurat package for hierarchical clustering of the cells and differential expression analysis24, mostly following the workflow presented in https://satijalab.org/seurat/articles/pbmc3k_tutorial.html. We used default parameters, except for the data normalization method, normalization.method = “CLR”. Values below zero were set to zero to allow for log normalization. Markers were filtered by hand to exclude T-cell related and focus on stromal cell related genes (Supplementary Table 1). The top 20 PCA dimensions were used for clustering and UMAP projections and a clustering resolution of 0.275 was used.
We compared the Infinity Flow data matrices with the scRNAseq dataset of reference 13 by identifying the most closely related genes for each Infinity Flow protein. However, some of the antibodies bind to protein complexes, and here we chose the most abundant transcript related to such a complex—for example, we chose H2-Ab1 RNA transcripts for MHC class II protein detection. Furthermore, UEA1 detection was identified via Fut1 RNA expression, since FUT1 synthesises the glycan target of UEA1; for the complete assignment see Supplementary Table 2. We compared the Infinity Flow fluorescence values with the scRNAseq normalised log counts. Clusters from each dataset were then compared using the SingleR package in R38, with the Wilcox ranked sum test (using the SingleR option de.method = “wilcox”).
Histological analyses
Frozen thymus tissue sections (7 µm) were fixed in acetone and stained using antibodies specific for CD69 (1:100, H1.2F3, Biolegend), Ly51 (1:200, 6C3, Biolegend), K8 (1:500, TROMA-1, NICHD supported Hybridoma Bank), K14 (1:500, Poly19053, Biolegend). Images were acquired using a Leica DMi8 microscope.
Thymic epithelial cell cultures (TECx)
Perinatal cTEC (CD45−EpCAM1+MHCII+Ly51+CD83+CD40+) and non-perinatal cTEC (CD45−EpCAM1+MHCII+Ly51+CD83−CD40−) were sorted from the thymi of 2-week-old C57BL/6 mice and put in co-cultures with CD69− DP thymocytes sorted from the same thymi, respectively. Cells were transferred in a 1:1 TEC to DP ratio into 1.5 mL tubes containing 1 mL Iscove’s modified Dulbecco’s medium (IMDM, Gibco, Cat no: 12440053) supplemented with 10% FBS, 100 units/mL penicillin and 100 μg/mL streptomycin (Sigma, Cat no: P4333-100ML) and 1× GlutaMAX supplement (Gibco, Cat no: 35050061). Co-cultures were maintained at 37 °C in a humidified atmosphere containing 10% CO2 for 48 h and then analysed by FACS. As a control DP cells were also cultured without the addition of TEC.
Anti-CD3 injections
6- to 7-week-old Rag2−/− animals were injected intraperitoneally with 50ug of anti-CD3ε (clone KT3, in-house produced) or HBSS. Four weeks post injection thymi were analysed for the appearance of DP thymocytes and for changes within their cTEC compartment.
Cellular indexing of transcriptomes and epitopes by sequencing (CITEseq)
Cells were isolated from six thymi of 1-week- and three thymi of 16-week-old C57BL/6 mice and depleted of CD45+ cells by AutoMACS. Subsequently cells were stained for CD45 (1:400, 30-F11, AF700, Biolegend), EpCAM1 (1:800, G8.8, PerCPCy5.5, Biolegend), Ly51 (1:400, 6C3, PECy7, Biolegend), Ter119-FITC (1:400, TER119, Biolegend) and with PI. In addition cells were stained with antibodies coupled to oligonucleotides directed against CD9 (MZ3, Totalseq-A, Biolegend), CD40 (3/23, Totalseq-A, Biolegend), CD49a (HMα1, Totalseq-A, Biolegend), CD54 (YN1/1.7.4, Totalseq-A, Biolegend), CD63 NVG-2, Totalseq-A, Biolegend), CD73 TY/11.8, Totalseq-A, Biolegend), CD83 (Michel-19, Totalseq-A, Biolegend), CD117 (2B8, Totalseq-A, Biolegend), CD146 (human with cross reactivity to mouse) (P1H12, Totalseq-A, Biolegend), CD200 (OX-90, Totalseq-A, Biolegend), CD274 MIH6, Totalseq-A, Biolegend), HVEM (HMHV-1B18, Totalseq-A, Biolegend), Ly6D (49-H4, Totalseq-A, Biolegend), Ly6C/Ly6G (Gr1) (RB6-8C5, Totalseq-A, Biolegend), MadCAM1 (MECA-367, Totalseq-A, Biolegend), Podoplanin (8.1.1, Totalseq-A, Biolegend), CD80 (16-10A1, Totalseq-A, Biolegend), CD86 (GL-1, Totalseq-A, Biolegend), MHCII (M5, Totalseq-A, Biolegend), Sca1 (D7, Totalseq-A, Biolegend), CD31 (390, Totalseq-A, Biolegend), EpCAM1 (G8.8, Totalseq-A, Biolegend), CD36 (HM36, Totalseq-A, Biolegend), CD133 (15-2C11, Totalseq-A, Biolegend), CD157 BP-3, Totalseq-A, Biolegend), CD300LG (ZAQ5, Totalseq-A, Biolegend), and the Ulex europaeus agglutinin I (UEA1) lectin labelled with biotin, followed by secondary staining with streptavidin-PE (Totalseq-A, Biolegend) coupled to an oligonucleotide. CD45−Ter119−EpCAM1+ and CD45−Ter119−EpCAM1− cells were sorted in a 70–30% ratio into a 1.5 mL tube containing FACS buffer for the 1-week-old and 16-week-old samples, respectively. For both timepoints an estimate of 28,000 total cells were loaded on two wells of a 10x Genomics Chromium Single Cell Controller. After single-cell capture cDNA and library preparation were performed according to the manufacturer’s instructions using a Single-Cell 3’ v3 Reagent Kit (10x Genomics) with the changes as described in26 to capture cDNA and produce libraries from antibody-derived oligos (ADT). Sequencing was performed on one lane of the Illumina NovaSeq 6000 system with a mix of 90% cDNA library and 10% ADT library resulting in 151nt-long paired-end reads.
The dataset was analysed by the Bioinformatics Core Facility, Department of Biomedicine, University of Basel. cDNA reads were aligned to ‘mm10’ genome using Ensembl 102 gene models with the STARsolo tool (v2.7.9a) with default parameter values except the following parameters: soloUMIlen=12, soloBarcodeReadLength=0, clipAdapterType=CellRanger4, outFilterType=BySJout, outFilterMultimapNmax=10, outSAMmultNmax=1, soloType=CB_UMI_Simple, outFilterScoreMin=30, soloCBmatchWLtype=1MM_multi_Nbase_pseudocounts, soloUMIfiltering=MultiGeneUMI_CR, soloUMIdedup=1MM_CR, soloCellFilter=None. ADT libraries were also processed using the STARsolo tool with default parameters except soloCBmatchWLtype=1MM_multi_Nbase_pseudocounts, soloUMIfiltering= MultiGeneUMI_CR, soloUMIdedup=1MM_CR, soloCellFilter=None, clipAdapterType=False, soloType=CB_UMI_Simple, soloBarcodeReadLength=0, soloUMIlen=12, clip3pNbases = 136.
Further analysis steps were performed using R (v4.1.2). Note that cell filtering was done based only on the analysis of the gene expression, not ADT abundance. Cells were considered as high-quality cells if they had at least 2000 UMI counts, which is the threshold derived from the distribution of UMI counts across cells, forming a data set of 9953 cells.
Multiple Bioconductor (v3.14) packages including DropletUtils (v1.14.2), scDblFinder (v1.8.0), scran (v1.22.1), scater (v1.22.0), scuttle (1.4.0) and batchelor (v1.10.0) were applied for the further analysis of the dataset mostly following the steps of the workflow presented at https://bioconductor.org/books/release/OSCA/. Normalised39 log-count values for the gene expression were used to construct a shared nearest-neighbour graph40, which nodes, i.e., cells, were clustered by ‘cluster_louvain’ method from the R igraph package41. Counts reflecting the ADT abundance in cells were also log-normalised and clustered in a similar manner.
The data set was subjected to the cell-type annotation using the Bioconductor package SingleR (v1.8.1) and samples from the Immunological Genome Project (ImmGen) provided by the Bioconducter package celldex (v1.4.0) as the reference. Clusters of cells mostly assigned to ‘Epithelial cells’ (5834 cells) were filtered (Supplementary Fig. 6d). Note that one of the clusters (cluster A, Supplementary Fig. 6a) was excluded at this step, because it was mostly composed of cells with elevated percentage of reads mapping to mitochondrial and ribosomal genes and lower number of counts.
The gene expression of filtered cells was re-analysed by removing the batch effect formed by the combination of the sample of origin and the number of counts per cell (cells with >12,000 counts and cells with 12,000 counts) and re-clustered (Fig. 5a, b). Cells were also subjected to the cell-type annotation using scRNAseq transcriptional profiles of single TEC as the reference data set13 (Fig. 5d, f). The scoreMarkers function of the scran package was applied to find marker genes of clusters 1–3. The standardised log-fold change across all pairwise comparisons ‘mean.logFC.cohen’>1 was used as the significance threshold defining the set of marker genes.
A t-SNE dimensionality reduction was used for visualizing single cells on two dimensions. T-SNE coordinates were calculated using the runTSNE function from the scater package and default parameters. For the visualization of cells based on the gene expression, coordinates of principal components and 2000 most variable genes with excluded mitochondrial and ribosomal genes were used as the input. For the visualization of cells based on the ADT abundance, coordinates of principal components and all ADTs were used as the input.
Bulk RNA sequencing
Triplicates of Sca1−CD63−CD66a+CD117+ tuft-like and Sca1−CD63−CD66a−CD117− non-tuft-like mTEC cells were sorted from thymi of 6-week-old C57BL/6 mice into trizol (Invitrogen, 15596026). Subsequently samples were submitted to ultra-low input bulk RNAseq. Reads were trimmed using Trimmomatic (version 0.36) to remove adapter sequences and aligned to the mouse genome (mm10) using STAR (version 2.7.3a)42,43. HTSeq (version 0.12.4) was used to assign reads to genes with the option “intersection-nonempty”44. Differentially expressed genes were identified using edgeR (FDR < 0.05)45. Spearman correlation coefficients were calculated between Sca1−CD63−CD66a−CD117− non-tuft-like mTEC bulk RNAseq samples and scRNAseq data from reference 13 for all differentially expressed genes with log2 fold change ≥1.
Statistical analysis
GraphPad Prism (version 9) was used to perform all statistical analyses, except for the bulk RNAseq and CITEseq datasets. The statistical tests used are described in the figure legends, and exact p-values are shown within each figure. Non-significant differences are not specified.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
Supplementary information
Acknowledgements
We thank Emilie Cosway and Sonia Parnell for their technical assistance and Lilly von Muenchow for critical reading of the manuscript. Further we would like to acknowledge Grozdan Cvijetic for his help to set up the Infinity Flow computational analysis pipeline. We also thank Jakub Abramson for sharing the CsnbCre mice. This research was supported in part by the generosity of an anonymous donor to Stanford University establishing the DiGeorge Syndrome Research Fund (G.A.H.), Swiss National Science Foundation grant IZLJZ3_171050; 310030_184672 (G.A.H.), Medical Research Council grant MR/S036407/1 (G.A.H.), Wellcome Trust grant 105045/Z/14/Z and 211944/Z/18/Z (G.A.H.); Swiss National Science Foundation Early Postdoc.Mobility Fellowship (P2BSP3_188183) and Postdoc.Mobility Fellowship (P500PB_206823) to F.K.; NIHR Clinical Lectureship to F.D.; work in the G.A. lab is supported by an MRC Programme Grant (MR/T029765/1).
Source data
Author contributions
F.K., C.V.-V., S.M., F.D., S.Z., L.M., I.C.-A., M.E.D., A.J.W., and B.L. performed experiments; F.K., C.V.-V., S.P., A.B., F.D., A.E.H., S.Z., and L.M. analysed data; F.D. provided CsnbCre mice; G.A. provided the thymi from Pou2f3-/- mice; F.K. and G.A.H. conceived the project and wrote the manuscript.
Data availability
The CITEseq and the bulk RNAseq datasets have been deposited in the Gene Expression Omnibus database under accession numbers GSE215418 and GSE226128, respectively. Source data are provided with this paper.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
The online version contains supplementary material available at 10.1038/s41467-023-39722-9.
References
- 1.Anderson G, Takahama Y. Thymic epithelial cells: working class heroes for T cell development and repertoire selection. Trends Immunol. 2012;33:256–263. doi: 10.1016/j.it.2012.03.005. [DOI] [PubMed] [Google Scholar]
- 2.James KD, Jenkinson WE, Anderson G. Non-epithelial stromal cells in thymus development and function. Front. Immunol. 2021;12:634367. doi: 10.3389/fimmu.2021.634367. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Anderson G, Jenkinson EJ, Moore NC, Owen JJ. MHC class II-positive epithelium and mesenchyme cells are both required for T-cell development in the thymus. Nature. 1993;362:70–73. doi: 10.1038/362070a0. [DOI] [PubMed] [Google Scholar]
- 4.Handel AE, et al. Developmental dynamics of the neural crest-mesenchymal axis in creating the thymic microenvironment. Sci. Adv. 2022;8:eabm9844. doi: 10.1126/sciadv.abm9844. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5.Klein L, Kyewski B, Allen PM, Hogquist KA. Positive and negative selection of the T cell repertoire: what thymocytes see (and don’t see) Nat. Rev. Immunol. 2014;14:377–391. doi: 10.1038/nri3667. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 6.Kadouri N, Nevo S, Goldfarb Y, Abramson J. Thymic epithelial cell heterogeneity: TEC by TEC. Nat. Rev. Immunol. 2020;20:239–253. doi: 10.1038/s41577-019-0238-0. [DOI] [PubMed] [Google Scholar]
- 7.Zlotoff DA, Bhandoola A. Hematopoietic progenitor migration to the adult thymus. Ann. N. Y Acad. Sci. 2011;1217:122–138. doi: 10.1111/j.1749-6632.2010.05881.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Abramson J, Anderson G. Thymic epithelial cells. Annu Rev. Immunol. 2017;35:85–118. doi: 10.1146/annurev-immunol-051116-052320. [DOI] [PubMed] [Google Scholar]
- 9.Laufer TM, DeKoning J, Markowitz JS, Lo D, Glimcher LH. Unopposed positive selection and autoreactivity in mice expressing class II MHC only on thymic cortex. Nature. 1996;383:81–85. doi: 10.1038/383081a0. [DOI] [PubMed] [Google Scholar]
- 10.Derbinski J, Schulte A, Kyewski B, Klein L. Promiscuous gene expression in medullary thymic epithelial cells mirrors the peripheral self. Nat. Immunol. 2001;2:1032–1039. doi: 10.1038/ni723. [DOI] [PubMed] [Google Scholar]
- 11.Herzig Y, et al. Transcriptional programs that control expression of the autoimmune regulator gene Aire. Nat. Immunol. 2017;18:161–172. doi: 10.1038/ni.3638. [DOI] [PubMed] [Google Scholar]
- 12.Gray DH, et al. Developmental kinetics, turnover, and stimulatory capacity of thymic epithelial cells. Blood. 2006;108:3777–3785. doi: 10.1182/blood-2006-02-004531. [DOI] [PubMed] [Google Scholar]
- 13.Baran-Gale J, et al. Ageing compromises mouse thymus function and remodels epithelial cell differentiation. Elife. 2020;9:e56221. doi: 10.7554/eLife.56221. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 14.Dhalla F, et al. Biologically indeterminate yet ordered promiscuous gene expression in single medullary thymic epithelial cells. EMBO J. 2020;39:e101828. doi: 10.15252/embj.2019101828. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 15.Bornstein C, et al. Single-cell mapping of the thymic stroma identifies IL-25-producing tuft epithelial cells. Nature. 2018;559:622–626. doi: 10.1038/s41586-018-0346-1. [DOI] [PubMed] [Google Scholar]
- 16.Mayer CE, et al. Dynamic spatio-temporal contribution of single beta5t+ cortical epithelial precursors to the thymus medulla. Eur. J. Immunol. 2016;46:846–856. doi: 10.1002/eji.201545995. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17.Michel C, et al. Revisiting the road map of medullary thymic epithelial cell differentiation. J. Immunol. 2017;199:3488–3503. doi: 10.4049/jimmunol.1700203. [DOI] [PubMed] [Google Scholar]
- 18.Onder L, et al. Alternative NF-kappaB signaling regulates mTEC differentiation from podoplanin-expressing precursors in the cortico-medullary junction. Eur. J. Immunol. 2015;45:2218–2231. doi: 10.1002/eji.201545677. [DOI] [PubMed] [Google Scholar]
- 19.Ulyanchenko S, et al. Identification of a bipotent epithelial progenitor population in the adult thymus. Cell Rep. 2016;14:2819–2832. doi: 10.1016/j.celrep.2016.02.080. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Nusser A, et al. Developmental dynamics of two bipotent thymic epithelial progenitor types. Nature. 2022;606:165–171. doi: 10.1038/s41586-022-04752-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 21.Becht E, et al. High-throughput single-cell quantification of hundreds of proteins using conventional flow cytometry and machine learning. Sci. Adv. 2021;7:eabg0505. doi: 10.1126/sciadv.abg0505. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22.Miller CN, et al. Thymic tuft cells promote an IL-4-enriched medulla and shape thymocyte development. Nature. 2018;559:627–631. doi: 10.1038/s41586-018-0345-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23.Lucas B, et al. Diversity in medullary thymic epithelial cells controls the activity and availability of iNKT cells. Nat. Commun. 2020;11:2198. doi: 10.1038/s41467-020-16041-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 24.Hao Y, et al. Integrated analysis of multimodal single-cell data. Cell. 2021;184:3573–3587.e3529. doi: 10.1016/j.cell.2021.04.048. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25.Rattay K, Meyer HV, Herrmann C, Brors B, Kyewski B. Evolutionary conserved gene co-expression drives generation of self-antigen diversity in medullary thymic epithelial cells. J. Autoimmun. 2016;67:65–75. doi: 10.1016/j.jaut.2015.10.001. [DOI] [PubMed] [Google Scholar]
- 26.Stoeckius M, et al. Simultaneous epitope and transcriptome measurement in single cells. Nat. Methods. 2017;14:865–868. doi: 10.1038/nmeth.4380. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27.Heng TS, Painter MW, Immunological Genome Project C. The Immunological Genome Project: networks of gene expression in immune cells. Nat. Immunol. 2008;9:1091–1094. doi: 10.1038/ni1008-1091. [DOI] [PubMed] [Google Scholar]
- 28.Michelson DA, Hase K, Kaisho T, Benoist C, Mathis D. Thymic epithelial cells co-opt lineage-defining transcription factors to eliminate autoreactive T cells. Cell. 2022;185:2542–2558.e2518. doi: 10.1016/j.cell.2022.05.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29.Hollander GA, et al. Developmental control point in induction of thymic cortex regulated by a subpopulation of prothymocytes. Nature. 1995;373:350–353. doi: 10.1038/373350a0. [DOI] [PubMed] [Google Scholar]
- 30.Jacobs H, et al. CD3 components at the surface of pro-T cells can mediate pre-T cell development in vivo. Eur. J. Immunol. 1994;24:934–939. doi: 10.1002/eji.1830240423. [DOI] [PubMed] [Google Scholar]
- 31.Shinkai Y, Alt FW. CD3 epsilon-mediated signals rescue the development of CD4+CD8+ thymocytes in RAG-2-/- mice in the absence of TCR beta chain expression. Int Immunol. 1994;6:995–1001. doi: 10.1093/intimm/6.7.995. [DOI] [PubMed] [Google Scholar]
- 32.Levelt CN, Wang B, Ehrfeld A, Terhorst C, Eichmann K. Regulation of T cell receptor (TCR)-beta locus allelic exclusion and initiation of TCR-alpha locus rearrangement in immature thymocytes by signaling through the CD3 complex. Eur. J. Immunol. 1995;25:1257–1261. doi: 10.1002/eji.1830250519. [DOI] [PubMed] [Google Scholar]
- 33.Zheng GX, et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 2017;8:14049. doi: 10.1038/ncomms14049. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 34.Ding J, et al. Systematic comparison of single-cell and single-nucleus RNA-sequencing methods. Nat. Biotechnol. 2020;38:737–746. doi: 10.1038/s41587-020-0465-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 35.Rota IA, et al. FOXN1 forms higher-order nuclear condensates displaced by mutations causing immunodeficiency. Sci. Adv. 2021;7:eabj9247. doi: 10.1126/sciadv.abj9247. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 36.Srinivas S, et al. Cre reporter strains produced by targeted insertion of EYFP and ECFP into the ROSA26 locus. BMC Dev. Biol. 2001;1:4. doi: 10.1186/1471-213X-1-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37.Rode I, et al. Foxn1 protein expression in the developing, aging, and regenerating thymus. J. Immunol. 2015;195:5678–5687. doi: 10.4049/jimmunol.1502010. [DOI] [PubMed] [Google Scholar]
- 38.Aran D, et al. Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat. Immunol. 2019;20:163–172. doi: 10.1038/s41590-018-0276-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Lun AT, Bach K, Marioni JC. Pooling across cells to normalize single-cell RNA sequencing data with many zero counts. Genome Biol. 2016;17:75. doi: 10.1186/s13059-016-0947-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40.Xu C, Su Z. Identification of cell types from single-cell transcriptomes using a novel clustering method. Bioinformatics. 2015;31:1974–1980. doi: 10.1093/bioinformatics/btv088. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Blondel, V. D., Guillaume, J.-L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment2008, P10008 (2008).
- 42.Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–2120. doi: 10.1093/bioinformatics/btu170. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43.Dobin A, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15–21. doi: 10.1093/bioinformatics/bts635. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Anders S, Pyl PT, Huber W. HTSeq–a Python framework to work with high-throughput sequencing data. Bioinformatics. 2015;31:166–169. doi: 10.1093/bioinformatics/btu638. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139–140. doi: 10.1093/bioinformatics/btp616. [DOI] [PMC free article] [PubMed] [Google Scholar]
Associated Data
This section collects any data citations, data availability statements, or supplementary materials included in this article.
Supplementary Materials
Data Availability Statement
The CITEseq and the bulk RNAseq datasets have been deposited in the Gene Expression Omnibus database under accession numbers GSE215418 and GSE226128, respectively. Source data are provided with this paper.