Author manuscript; available in PMC: 2023 Jul 1.
Published in final edited form as: Nature. 2022 Nov 21;613(7944):565–574. doi: 10.1038/s41586-022-05555-7

Pre-T cell receptor Self-MHC Sampling Restricts Thymocyte Dedifferentiation

Jonathan S Duke-Cohan 1,2,3,*, Aoi Akitsu 1,2,3, Robert J Mallis 1,2,4, Cameron M Messier 5, Patrick H Lizotte 5, Jon C Aster 6, Wonmuk Hwang 7,8,9, Matthew J Lang 10,11, Ellis L Reinherz 1,2,3,*
PMCID: PMC9851994  NIHMSID: NIHMS1855620  PMID: 36410718

Summary

Programming T lymphocytes to distinguish self from non-self is a vital, multi-step process arising in the thymus1–4. Signalling through the pre-T cell receptor (preTCR), a CD3-associated heterodimer comprising an invariant pTα chain and a clone-specific β chain, constitutes a critical early checkpoint in thymocyte development within the αβ T-cell lineage5,6. PreTCRs arrayed on double negative (DN) thymocytes, like αβ TCRs appearing on double positive (DP) thymocytes, ligate peptides bound to MHC molecules (pMHC) on thymic stroma but via a different molecular docking strategy7–10. Here we show the consequences of those distinctive interactions for thymocyte progression, using synchronized fetal thymic progenitor cultures differing in the presence or absence of pMHC on support stroma, determining single cell transcriptomes at key thymocyte developmental transitions. Although MHC negative stroma fosters αβ T lymphocyte differentiation, the absence of pMHC-preTCR interplay leads to deviant thymocyte transcriptional programming associated with dedifferentiation. Highly proliferative DN and DP subsets with antecedent characteristics of T cell lymphoblastic and myeloid malignancies emerge. Compensatory upregulation of diverse MHC class Ib proteins in B2m/H2-Ab1 MHC knockout mice partially safeguards in vivo thymocyte progression although, with ageing, disseminated DP thymic tumours may develop. Thus, beyond fostering β chain repertoire broadening for subsequent αβ TCR utilization, preTCR-pMHC interaction limits cellular plasticity to facilitate normal thymocyte differentiation and proliferation that, if absent, introduces developmental vulnerabilities.


The αβ T cell repertoire consists of many millions to billions of T lymphocytes, each expressing unique surface TCRs in a clonal manner11–13. These lymphocytes mediate precise recognition and elimination of aberrant host cells displaying “foreign” surface pMHC ligands consequent to infection or cellular transformation. In the thymus of jawed vertebrates during foetal, neonatal and juvenile life, the repertoire of clonotypic αβTCRs and their predecessor preTCRs is generated6. Thymic progenitors originating from the bone marrow (and foetal liver in utero) proliferate during the early CD4−CD8− double negative (DN1, DN2) stages and, under the influence of Notch at DN2, commit to the T cell lineage (Fig. 1a)1. Progression to the DN3a compartment (CD44−CD25+CD28lo) leads to further αβT lineage commitment with recombination activating genes 1 and 2 (Rag-1 and Rag-2) fostering TCRβ locus rearrangements that produce a recombined β chain expressed as a disulphide-linked heterodimer with the invariant pTα subunit14. In turn, pTα-β associates with the CD3 signalling subunits. Upon preTCR signalling at the β selection checkpoint, the DN3b population (CD44−CD25+CD28hi) undergoes a critical program change to suppress Notch signalling, downregulate transcription of Rag-1,-2 and Ptcra genes, increase cell cycling, and mediate allelic exclusion at the TCRβ locus enforcing expression of only one TCRβ chain per cell6. In turn, those thymocytes transition into the DN4 (CD44−CD25−) and then immature CD8 single positive (ISP) compartments15. Upon further progression to the double positive (CD4+CD8+; DP) stage, Rag genes are upregulated for a second time, permitting recombination and transcription at the TCRα locus and thereafter expression of the TCRαβ heterodimer16–18.
To refine the αβ T cell repertoire, both positive and negative selection events ensue at this DP stage in the thymic cortex and continue into the maturing SP (CD4+CD8− and CD4−CD8+) medullary compartment followed by their later export as peripheral T cells19.

Fig. 1. Developmental trajectories for thymocyte-like development on MHC+ or MHC− supporting stroma.

a. Schematic depicting representative gene transcript levels during key thymocyte developmental transitions (based on array data from the Immunological Genome Project25). Upper panel: Early DN1-DN3 proliferation is driven by thymocyte Notch signalling, represented here by the Erg and Hes1 transcripts. Myeloid development is suppressed by downregulation of Spi1 (coding for PU.1) during the DN2a to DN2b transition and T lineage commitment following the Bcl11a to Bcl11b switch. Lower panel: following entry into the DN3 stage, the preTCR with invariant pTα (pTCRα) is expressed. PreTCR signalling downregulates Notch-driven proliferation, inhibits TCR β locus recombination, downregulates Ptcra and upregulates the indicated transcripts.

b. UMAP projection of k-means clustering (k = 10) for DN3a, DN3b, DN4, and DP libraries for cells developing on either MHC+ or MHC− stroma. All libraries are projected into the same space to permit direct comparison. The process for assigning labels to each cluster is defined in the text.

c. Cluster developmental trajectories of individual libraries and relationship of individual clusters to phenotypically characterized thymocyte subsets. Projection of the individual FACS-sorted libraries (labeled in large font) into the primary space allows initial assignment of clusters expressing distinct transcriptomes (labelled in small font).

d. MHC− thymocyte-like cells progress developmentally by phenotype from DN3a to the DP stage but with altered distribution on comparison with MHC+ cells. To focus only on the αβTCR lineage, ILC-γ/δ-like T cells and pre-apoptotic/apoptotic cells are excluded. For each library, the proportion of each defined developmental cluster is depicted. P < 2.5 × 10^−7 for difference between MHC+ and MHC− cells/stage distributions (Chi-square statistic).
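
As a rough illustration of the kind of chi-square comparison quoted in panel d, the sketch below computes Pearson's chi-square statistic for a condition-by-cluster contingency table. The counts are invented placeholders rather than data from this study, the function name is hypothetical, and the authors' actual statistical software is not specified here.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of condition and cluster.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical cell counts per cluster (columns) for two conditions (rows).
counts = [
    [780, 150, 70],   # placeholder "MHC+" library
    [640, 220, 140],  # placeholder "MHC-" library
]
stat, dof = chi_square_statistic(counts)
print(f"chi-square = {stat:.2f}, dof = {dof}")
```

The P value would then be read from the chi-square distribution with the reported degrees of freedom, a step left out here to keep the sketch free of third-party dependencies.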

PreTCR signalling was judged independent of ligand recognition at the DN3 stage consequent to several lines of prior investigation20–23. First, ablation of the TCR β chain variable domain that forms part of the interaction surface with pMHC in the αβTCR did not impact development through the DN3a to DN3b checkpoint. Second, in further support of ligand binding dispensability, a preTCR missing the extracellular domains of both the β chain and the pTα chain could drive development to the DP compartment. Third, in MHCI−MHCII− double knockout mice, thymocyte progression was unimpaired through the DN3 stage to the DP stage with respect to both cell numbers and phenotypes.

Recent structural and biophysical data, however, reveal direct interactions between preTCRs and pMHC ligands that utilize a horizontal binding mode compatible with facile mechanosensing8,10,24. Functional assays demonstrate both restricted proliferation and repertoire development in the absence of stromal MHCI and MHCII molecules7,8. Together these findings necessitate re-examination of the earlier results.

Early T-lineage differentiation

Utilizing an in vitro model of thymocyte differentiation, we seeded haematopoietic stem cells (HSC) from foetal liver of wild-type C57Bl/6 mice onto OP9-DL4 MHCI (MHC+) stromal support cells or the same cells rendered MHCI-negative by CRISPR/Cas9 targeting of B2m and Tap2 genes (MHC−)7. Both stromata lack endogenous MHCII expression. Extensive use of this model demonstrates synchronized expansion and development through to the immature single-positive (ISP) and DP stage within the d8 to d13 window, thus recapitulating embryonic development25.

To examine the TCR β chain selection checkpoint at the DN3a to DN3b transition, 3.2 × 10^4 foetal liver-derived HSC were seeded onto MHC+ or MHC− stroma and developing thymocyte-like cells analysed at d9 (6.225 × 10^7 on MHC+; 3.375 × 10^7 on MHC−). For brevity, we refer to cells generated on MHC+ and MHC− stroma with prefix MHC+ or MHC−, respectively. Cells were sorted by FACS into DN3a, DN3b, DN4 and DP populations (Extended Data Fig. 1) and processed for scRNA-Seq using the 10X Genomics Chromium system, simultaneously preparing from each cell a library enriched for TCR α and β chain clonotype transcripts. To reduce dimensionality of the transcriptome information, all libraries were aggregated and projected into a single Uniform Manifold Approximation and Projection (UMAP) plane allowing direct comparison of clusters and inferred trajectory analysis incident to the FACS sorting by phenotype (Fig. 1b; Supplemental Information Files 2, 3). To objectively delineate the relation of each cluster to thymocyte developmental stage, the dominant markers of normal transition from the DN3a to DPsm stages were extracted as reference arrays (Extended Data Fig. 2) from the Immunological Genome Project (IGP) α/β T lineage database26 and applied to each cluster yielding a transcriptome reference trajectory that matched with relative cluster representation in each stage-specific library (Fig. 1c). Within the DN4 libraries, a population with a γ/δ T cell-like and innate lymphoid cell-like (γ/δ-ILC) transcriptome signature partitions due to lack of expression of CD44 and CD25 (Extended Data Fig. 3a) pointing to the developmental fidelity of this in vitro system.
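
To make the clustering step more concrete, the following is a minimal sketch of k-means (Lloyd's algorithm) on a toy two-dimensional data set. It is not the authors' pipeline: the points, the choice of k = 2 (the figure legend reports k = 10), the deterministic initialisation and the function name `kmeans` are all assumptions made for illustration, and no UMAP projection is performed.

```python
import math


def kmeans(points, k, iterations=50):
    """Lloyd's algorithm with a deterministic start (first k points as centroids)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its member points.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = [sum(axis) / len(members) for axis in zip(*members)]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return labels, centroids


# Toy embedding coordinates (placeholders, not scRNA-Seq data).
points = [(0.0, 0.0), (10.0, 10.0), (0.5, 0.2), (9.5, 10.3), (0.1, 0.6), (10.2, 9.8)]
labels, centroids = kmeans(points, k=2)
print(labels, centroids)
```

In practice, cluster labels obtained this way still need biological annotation, which the text describes as being done against reference marker arrays.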

The pro-apoptotic (Extended Data Fig. 3b) and apoptotic populations (transcripts dominantly of mitochondrial origin) are retained not only as topological markers but to highlight the possibility, given the absence of thymic reticuloendothelial cells removing damaged cells, that in this assay apoptosis may be a significant process even before negative selection events occurring at DP stages and beyond. The DN3a/3b cluster (Fig. 1c) shows early upregulation of Ikzf3 and Cd28, markers of preTCR signalling (Fig. 1a, Extended Data Fig.2a) and bridges the DN3a and DN3b libraries. Likewise, the DN3b/4 cluster is represented in the DN3b, DN4, and DP libraries indicating the increased resolution over phenotype provided by the transcriptional signature (Fig. 1c, Extended Data Fig. 2b). The DPbl population segregates away from the mature DPsm population based on strong representation of cell cycling-related transcripts (Extended Data Fig. 2d).

Lack of MHC impacts preTCR signalling

Having established the cluster signature trajectory in normal developmental progression, development in the MHC− state was examined (Fig. 1b, 1c). The MHC+ trajectory for the DN3a cluster shows a clear diminution with progression from the DN3a to the DN4 libraries (Fig. 1c, top row and Fig. 1d). For the MHC− state, this progression is significantly reduced: >36% of the DN3b cells by phenotype, and >22% of the phenotypically DN4 cells, retain a DN3a-like transcriptome, contrasted to 22% and <10%, respectively, in the MHC+ condition (Fig. 1c, d). Nonetheless, there is phenotypic developmental progression in the absence of potential pMHC ligand binding to the preTCR. The DN3a to DN3b transition is marked by a new transcriptional program characterised by upregulation of Ikzf3, Rorc, Cd2, Cd28 and downregulation of Hes1, Erg and Ptcra (Fig. 1a, Extended Data Fig. 2b). We applied this gene panel to a subset of the DN3b/4 cluster more strongly represented in the MHC− DN4 library than in the control condition, highlighted as a “tail” moving back into the DN3a/3b cluster (Fig. 1b, Fig. 2a). Splitting the MHC− DN4 cluster into 2 subclusters, one representing the main region overlapping in position with the MHC+ DN3b/DN4 cluster and the other, the tail, showed clear differences. The latter, despite being phenotypically DN4, had neither upregulated Ikzf3, Rorc or Cd2 nor downregulated Hes1 and Erg as observed in the MHC+ cluster and, further, had not robustly upregulated Trbv transcription (Fig. 2b, Extended Data Fig. 2b). Collectively, these observations are consistent with a differentiation trajectory that bypasses the β selection checkpoint. The main MHC− DN3b/4 cluster shows an intermediate expression between the MHC+ DN3b/4 cluster and the tail, suggesting that elements of the aberrant transcriptional regulation observed in the tail subcluster extend to the main subcluster.

Fig. 2. Uncoupling of the transcriptome and repertoire from phenotype in thymocyte-like cells developing on MHC− stroma.

a. The DN3b/4 cluster in the MHC− DN4 library harbours a population with characteristics of cells not having passed through the preTCR signalling checkpoint. The MHC− DN3b/4 cluster in the DN4 library is split into two subclusters, one corresponding to the DN3b/4 cluster in the MHC+ DN4 library (“main”, brown) and one corresponding to a set poorly represented in the MHC+ DN4 library (“tail”, orange).

b. Transcript expression in MHC+ and MHC− DN3b/4 cells. Transcripts well-expressed and marking the transition to DN3b/4 cells (Ikzf3, Rorc, Cd2; Suppl. Fig. 3) remain low in the MHC− “tail” subcluster, transcripts expected to be downregulated remain high (Fig. 1a), and robust TCR β chain upregulation is not observed (Extended Data Fig. 2b).

c–f. Stage-specific analysis of β chain clonotype representation per 10,000 cells in d9 MHC+, MHC−, and scH-2Kb OP9-DL4 thymocyte-like development cultures. Representative of 6 experiments examining MHC+ (n = 5), MHC− (n = 6), scH-2Kb (n = 3).

Reduced DN4 β clonotypic diversity

Appropriate developmental regulation of Trbv transcription and repertoire diversity at the DN4 stage was examined further by β chain clonotype analysis of the developing MHC+ and MHC− subpopulations using targeted RNA-Seq. Wild-type HSC were seeded onto MHC+ OP9-DL4 stromal cells, onto MHC− OP9-DL4 stromal cells or onto the same MHC− cells transfected to re-express MHC class I as a single chain VSV8 peptide/β2m/H-2Kb (scH-2Kb). The scH-2Kb derivative expresses multiple copies of a single pMHC thus maintaining the potential for the horizontal binding mode to the preTCR but presenting a homogenous peptide, RGYVYQGL, derived from amino acids 52–59 of vesicular stomatitis virus nucleoprotein7. After 9 days, cell proliferation was uniformly better on the MHC+ stromal cells than on either the MHC− or scH-2Kb support stroma (Extended Data Fig. 4a, b). Cells from each support stroma culture were sorted into phenotypically defined DN3, DN4, DPbl and DPsm populations (Supplemental Information File 3) and Trbv clonotypes of 10^4 cells for each stage and condition identified by targeted RNA-Seq.

TCR β clonotype diversity is high at the DN3 stage for cells developing on all variants of the OP9-DL4 support stroma used here (Fig. 2c). The DN4 compartment reveals a consistently contracted repertoire diversity only on the MHC− support stroma (Fig. 2d, Extended Data Fig. 4c). Up to 70% of the clonotypes developing in the MHC− DN4 population were found at <7.5% levels in the MHC+ and scH-2Kb populations (Extended Data Fig. 4d) suggesting these clonotypes may represent a restricted population of clonotypes responding to non-classical MHC or MHC-unrelated structures on the stromal surface. Conversely, ~92% of the clonotypes expressed on cells developing on the MHC+ and scH-2Kb stroma, permitting preTCR-pMHC interaction, are absent in the MHC− cultures. The limited MHC− clonotype repertoire is not a consequence of restricted cell proliferation since clonotype diversity of DN4 cells developing on scH-2Kb stroma is as rich as that of the cells developing in the MHC+ condition (Fig. 2d, Extended Data Fig. 4c) despite similar cell representation of all 3 DN4 cell populations (10^4 cells analysed/sample). The characteristics of cells developing on MHC− stroma or scH-2Kb stroma both diverge from those on the MHC+ stroma during the DP stage (Fig. 2e, f; Extended Data Fig. 4c). Cells developing on scH-2Kb stroma reveal a contraction of β repertoire diversity, likely linked to limited positive selection afforded by a single peptide (i.e., VSV8) on scH-2Kb stroma. Of note, the N15β clonotype with known specificity for VSV8 peptide presented by H-2Kb appears in the top 20 DPsm clonotypes developing on the scH-2Kb stroma (Extended Data Table 1). On the other hand, the MHC− developing cells recover diversity at the DPbl and DPsm stages, often overshooting that of cells on the MHC+ stroma (Fig. 2, Extended Data Fig. 4c) and indicating aberrant β chain transcriptional regulation if MHC-dependent preTCR signalling is circumvented.
Continued Notch stimulation in the absence of preTCR signalling has already been demonstrated to permit differentiation through to the DP stages27.
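
The repertoire comparisons above rest on two simple quantities: how many distinct β clonotypes a sample contains, and what fraction of one sample's clonotypes also occur in another. A minimal sketch of both is given below; the CDR3 strings are invented placeholders, the helper names are hypothetical, and the authors' actual clonotype-calling software is not assumed.

```python
from collections import Counter


def clonotype_counts(cdr3_per_cell):
    """Count cells per β chain clonotype (one CDR3 string per cell)."""
    return Counter(cdr3_per_cell)


def fraction_shared(query, reference):
    """Fraction of distinct clonotypes in `query` that also occur in `reference`."""
    query_set = set(query)
    if not query_set:
        return 0.0
    return len(query_set & set(reference)) / len(query_set)


# Placeholder CDR3 sequences, invented for illustration only.
mhc_pos = ["CASSLGQNTLYF", "CASSDAGYEQYF", "CASSPGTGGYEQYF", "CASSLGQNTLYF"]
mhc_neg = ["CASSLGQNTLYF", "CASSRDWGNTLYF", "CASSRDWGNTLYF", "CASSRDWGNTLYF"]

pos_counts = clonotype_counts(mhc_pos)
neg_counts = clonotype_counts(mhc_neg)
print("unique clonotypes:", len(pos_counts), "vs", len(neg_counts))
print("fraction of MHC- clonotypes also in MHC+:", fraction_shared(neg_counts, pos_counts))
```

Because every sample in the text is analysed at a fixed depth of 10^4 cells, unique-clonotype counts can be compared directly; with unequal depths, subsampling to a common cell number would be needed first.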

Origin of β diversity in MHCIa− system

The development of TCR clonotypes in the MHC− condition implies that thymocytes can develop and bypass the preTCR checkpoint in the absence of MHC, either via a ligandless mode or utilizing non-classical MHCI and MHCII molecules or additional ligands. A panel of non-classical MHCI (MHCIb) was compiled (Extended Data Table 2) and, following full transcriptome analysis of the OP9 MHC+ and OP9 MHC− stromal cells (Extended Data Fig. 5a–c), expression of non-β2m dependent MHC was examined. Loss of CD1d surface expression, dependent upon β2m, was used as a functional validation marker of the CRISPR/Cas9 knockout in addition to loss of MHCI (Extended Data Fig. 5d) thus supporting our focus upon non-β2m dependent MHC. Transcriptome analysis identified only Raet-1d and Raet-1e as being expressed at the transcriptome level with detectable surface protein expression but with no difference between MHC+ OP9-DL4 and the MHC− OP9-DL4 variant (Extended Data Fig. 5e). Consequently, the origin of the “background” clonotypes comprising the repertoire at the DN4 and subsequent stages in the MHC− condition, also found as a minor fraction of the total repertoires in the MHC+ and scH-2Kb conditions (Extended Data Fig. 4d), is uncertain but may involve non-MHC ligands or non-classical MHCIb ligands independent of β2m or the peptide-loading complex for cell surface expression.

Without MHC, unusual DN4 cells develop

To further address the diminution in β chain representation at DN4 in the MHC− condition, examination of the scRNA-Seq clustering is informative. Although the partitioning of the ILC-like and γ/δ T-like cells within the DN4 represents a β chain-low population (Fig. 1c; Extended Data Fig. 3a), this is not the source of the difference as the representation of this cluster is similar between the MHC+ and MHC− conditions. Apart from the ILC-like and γ/δ T-like cells, the DN3b/4 cluster is the only other significant representation in the thymocyte developmental path within the MHC+ DN4 library. These cells exhibit a robust upregulation of β chain transcript (264.6±74.3 fold increase, median = 88.4; P < 0.0001) on transitioning from the DN3a/3b cluster (Extended Data Fig. 2b). In contrast, in the MHC− DN4 library in addition to the DN3b/4 population, there remains a high representation of phenotypically defined DN4 cells with a DN3a-like transcriptome as well as an unusual population barely observed in the MHC+ condition (“unusual”; Fig. 1c, d). As described above, on comparison with the MHC+ DN4 library Trbv transcript expression, within the MHC− DN4 library the DN3b/4 main population trends toward suppression (Fig. 2b), the DN3b/4 tail exhibits a significant suppression (7.65-fold down against MHC+ DN3b/4, P<0.0002; 4.48-fold down against the MHC− DN3b/4 main cluster, P<0.0025), as do the DN3a-like cells (5.38-fold down against MHC− DN3b/4; P<0.0001), and the MHC− unusual DN4 cluster (Fig. 3a; Extended Data Fig. 5f). The aggregated effect of all these phenomena may contribute to the low DN4 Trbv clonotype representation in the MHC− condition (Fig. 2d, Extended Data Fig. 4c).
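
Since the figures report differences on a log2 scale while the text quotes "n-fold down" values, the small sketch below shows the conversion between the two. The function name is hypothetical, and the comparison labels assume that each quoted reference cluster is the one named in the sentence above.

```python
import math


def signed_log2_fold(fold, direction):
    """Convert an 'n-fold up/down' statement into a signed log2 fold change."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    if direction not in ("up", "down"):
        raise ValueError("direction must be 'up' or 'down'")
    value = math.log2(fold)
    return value if direction == "up" else -value


# Fold changes quoted in the text for Trbv transcript expression.
quoted = [
    ("DN3b/4 tail vs MHC+ DN3b/4", 7.65),
    ("DN3b/4 tail vs MHC- DN3b/4 main", 4.48),
    ("DN3a-like vs MHC- DN3b/4", 5.38),
]
for label, fold in quoted:
    print(f"{label}: {signed_log2_fold(fold, 'down'):.2f} log2 units")
```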

Fig. 3. Single cell transcriptomics of the MHC− DN4 unusual cluster reveal complex proliferative and lineage abnormalities.

a. Volcano plot of significant transcript differences (P<0.05) between the DN4 unusual population and the DN3b/4 cluster in the same MHC− DN4 library. Upregulated DN4 unusual transcripts are shown to the right of zero on the x-axis, downregulated to the left. Functional significance of highlighted transcripts is listed in the inset. 14 other Trbv transcripts were downregulated but did not meet the P<0.05 threshold.

b. Violin plots depicting selected log-normalised transcript levels for the MHC+ DN4 library DN3b/4 cluster (2021 cells), MHC− DN4 library DN3b/4 cluster (2493 cells), and MHC− DN4 unusual cluster (1776 cells). For box plots, the box bounds the 1st to 3rd quartiles; where visible, the dotted line within represents mean, and the solid line represents median. Whiskers above and below (maximum and minimum) are defined as (quartile 3 + 1.5 * interquartile range) and (quartile 1 − 1.5 * interquartile range), respectively. Cluster colours are as depicted in Figs. 1 and 2. P values as in panel a.

c. Cluster distribution of top 20 DN4 clonotypes developing on MHC+ stroma (left panel) with expected developmental trajectory into DN3b/4 cluster (right panel). Each colour represents a unique clonotype and circle diameter is proportional to the cell number expressing that specific β chain (see cell count scale common to Fig. 3c and Fig. 3d). Colour/clonotype specification is unique to the panel and bears no relation to colour use in Fig. 3d or Fig. 4c, d. Right panel depicts clonotype tracking from cluster to cluster with cluster/track colouring concordant with left panel.

d. Distribution of top 20 DN4 clonotypes developing on MHC− stroma similar to depiction in Fig. 3c with right panel depicting clonotype tracking from cluster to cluster.

e. ssGSEA scores for MHC+ DN3b/4 (green), MHC− DN3b/4 (light orange), and MHC− DN4 unusual (red) clusters compared with gene subset modules exhibiting co-ordinated up- or down-regulation in the indicated cancers (MSigDb C4).

f. Comparison of the DN4 unusual cluster developing on MHC− stroma with the DN3b/4 cluster developing on DN4 MHC+ stroma (positive control) for transcripts reported as dysregulated in human T-ALL. Shaded boxes identify regulatory transcripts over-expressed in the indicated forms of human T-ALL (columns 1-3). Heatmap in column 4 depicts log2-fold difference for the DN4 unusual cluster compared with the MHC DN3b/4 cluster (scale to right) and significance (P value).

The MHC− unusual cluster (1776 cells; 14.3% of all DN4 cells), minimal in the MHC+ DN4 library (205 cells; 2.79% of all DN4 cells), displays a complex transcriptome. Unlike the DN3b/4 cells expected in the DN4 library, the unusual cluster cells have not consolidated the robust expression of β chains (Fig. 3a; Extended Data Fig. 5f). Nevertheless, 83.6% of the cells in the DN4 unusual cluster express Trbc1/2 transcripts and of these 70.7% express Lck and/or Ptcra confirming the T lineage origin of a large fraction of the cells (Supplemental Information File 2). Moreover, there is maintained expression of progenitor drivers (Kit, Lyl1, Ezh1 and Id2) as well as Spi1 coding for PU.1 that operates at the critical decision checkpoint determining myeloid or T cell lineage specification. These observations are consistent with not having passed through the preTCR checkpoint as is the maintained expression of early lineage and γδ T cell-linked developmental transcripts such as Fcer1g, Icos, Il18rap, etc. (Extended Data Fig. 3a). The high representation of these unusual cells is not part of the normal ILC or γ/δ T cell development, else they would also appear in the MHC+ DN4 library that harbours a similar ILC-γ/δ T cell cluster. Furthermore, the MHC− unusual cells are in a cycling state with high histone transcript expression, and high expression of AY036118 (Fig. 3a, 3b), a lncRNA (XR_877120.4) on Chr17 implicated in regulation of thymocyte proliferation possibly mediated by telomeric association28,29. The volcano plot identifies transcripts of high fold-change and probability averaged across the whole cluster; hence significance can be driven by a well-represented subset of cells rather than the complete cluster population.
Examining select transcripts that are regulated in the same direction in most cells within a cluster, in addition to AY036118, the histones represented here by Hist1h1d as well as Lars2 coding for mitochondrial leucyl-tRNA synthetase 2, a marker of high metabolic activity, stand out30 (Fig. 3b).

Unexpectedly, this analysis led to identification of irregularities in Rag1 and Rag2 transcription in the DN4 unusual population where both Rag transcripts are minimal (Fig. 3b). Rag1 is well expressed in MHC+ and MHC− DN3b/4 clusters. Rag2 is expressed well in the MHC+ DN3b/4 cluster while expression in the MHC− DN3b/4 is comparable to that of the DN4 unusual cluster. These findings not only illuminate possible differential regulation of the Rag1 and Rag2 transcripts but also show that the MHC− DN3b/4 cells are already experiencing transcriptional aberrations despite appearing phenotypically identical with MHC+ DN3b/4 cells. Reduction of Rag1/Rag2 heterodimeric protein activity in the DN4 unusual population due to regulation of Rag2 transcripts might contribute to the loss of diversity in the β chain repertoire at this stage (Fig. 2d).

Dysregulated transcriptome of MHC− DN4

Further refinement of the properties of the MHC− DN4 unusual cluster is revealed by single cell β clonotype analysis. Examination of the MHC+ DN4 library for the top 20 clonotypes based on cellular representation (Extended Data Table 3a) shows that the majority localise to the DN3b/4 cluster as expected, given appropriate preTCR signalling with minimal tracking to other clusters (Fig. 3c). Similar analysis of the MHC− DN4 library exposes a starkly different distribution where the majority of highly represented β clonotypes map to the DN4 unusual cluster (Fig. 3d). Of the 17 clonotypes represented in the DN4 unusual cluster, for 14 we can identify related cells bearing the same clonotype in the DN3a/3b and DN3b/4 clusters (Fig. 3d right panel). Consequently, we propose that in the absence of pMHC, some cells may differentiate from DN3a through to DN4 but deviate from the normal transcriptome trajectory to map to the unusual cluster. Cell representation of the top 20 clonotypes in the DN4 libraries, normalizing for differences in initial library size, shows 3.25 ± 0.55 cells for each MHC+ clonotype (only 2/20 found in the unusual cluster) compared with 6.56 ± 2.07 cells for each MHC− clonotype (17/20 in the DN4 unusual cluster, P < 0.0001). This confirms the increased proliferation implied by the transcriptome signature of the MHC− developing cells in this unusual cluster.

Eight of the top 20 clonotypes are found in the γ/δ T/ILC cluster and five of these are shared with the MHC− DN4 unusual cluster implying T lineage developmental options may remain open without delivery of appropriate preTCR-pMHC-dependent regulatory signals. Of interest is the observation that cells expressing the same unique β clonotype, particularly those MHC− developing cells, tend to group closely together within the UMAP cluster implying conservation of the transcriptional signature, even for occasional clonotypes split between clusters (Extended Data Fig. 6).

The transcript signature of the DN4 unusual population, with upregulation of early progenitor proliferative genes and of Spi1 controlling the myeloid/T lineage decision point at the DN2a/DN2b transition, connotes a de-differentiation of the DN4 cells in the absence of appropriate preTCR signalling. To examine the possibility that this uncommon transition may generate a transcriptional landscape consistent with aberrant transformation potential, we performed single sample gene set enrichment analysis (ssGSEA) against cancer modules followed by more refined comparisons with clinically defined T-ALL gene sets. By ssGSEA analysis, the DN4 “unusual” cluster shows a strong score (>1000) against 9 of the top 10 modules defined by maximal score difference from the DN3b/4 clusters of both the MHC+ and MHC− DN4 libraries (Fig. 3e). Leukaemia/lymphoma transcriptomes show significant co-ordinated regulation with all 9 of these gene set modules. In contrast, the DN3b/4 clusters of both MHC+ and MHC− DN4 libraries tracked together and showed weaker association or even inverse correlation. Further refinement of this analysis compared expression in the DN4 unusual population with published transcript panels defining T-cell acute lymphoblastic leukaemia (T-ALL) focussing upon Early T-cell Precursor ALL (ETP-ALL), a subset of T-ALL with poor prognosis in humans and believed to develop from early thymic progenitors immigrating from the bone marrow31,32. The selected transcripts were grouped as being common to T-ALL generally, or alternatively, representing DN1/2a ETP-ALL prior to committing to the T lineage (“early”), or DN2b ETP-ALL after commitment to the T lineage (“late”). Transcript representation within the MHC− DN4 unusual cluster subsequently was compared with that in MHC+ DN3b/4 cells following the expected developmental trajectory (Fig. 3f). Seven of 10 transcripts representing the common panel trended to upregulation, while three showed no change.
Except for Spib that is weakly upregulated, none of the transcripts in the DN2b-ALL “late” panel were upregulated. In contrast, 5 of the 8 selected genes in the DN1/2a-ALL “early” panel were significantly upregulated, and the remainder all trended upwards, compatible with the cells dedifferentiating from a DN4 state back towards the early progenitor state. As no significant differences were noted in CDR3 length or hydropathy between DN4 Vβ clonotypes developing on MHC+ versus MHC− stroma (ref. 8 and Supplemental Information File 3), the abnormal transcriptome likely emanates from lack of preTCR ligation by MHC and not aberrant preTCR sequences per se.

Abnormal DP subset and dedifferentiation

The DN4 unusual cluster forms one section of a bipartite UMAP cluster that also includes a unique population found only in the MHC− DP thymocyte-like library leading to its classification here as abnormal (Fig. 1b, c). This DP population projects away from the DN4 component due to the expression of Cd4, Cd8a and Cd8b1 (Extended Data Fig. 5g) but maps to the same DN4 cluster projection due to the strong expression of AY036118, histones, early progenitor-related transcripts, Spi1 driving non-T lineage commitment in the early DN stages and markers not strongly expressed in the MHC+ developing DPbl or DPsm clusters (Fig. 4a). Remarkably, the most significantly upregulated transcripts in this DP abnormal population are transcripts that define the myeloid lineage: Mpo (myeloperoxidase), Prtn3 (proteinase 3), Ctsg (cathepsin G), and Elane (neutrophil elastase), where expression is specific to this cluster without expression in any of the DN3a to DPsm clusters representing the expected developmental trajectory or in the MHC− DN4 unusual cluster (Extended Data Fig. 5h). Selecting transcripts that are upregulated throughout the DP abnormal cluster confirms the signature AY036118 profile, cell cycling and DNA packaging using Hist1h1d as representative of a broad spectrum of histones, Plac8 as an oncogenic driver as well as the key myeloid markers, Prtn3 and Mpo (Fig. 4b). SPI1, LYL1, LMO2 and MEF2C are dominant components of a panel defining human ETP-ALL33 and the mouse homologues are upregulated in the DP abnormal cluster, where Lyl1 and Spi1 are upregulated above that already seen in the DN4 unusual cluster (Fig. 3a, 4a). Supporting origin from αβTCR T cell lineage, the Spi1+ cells in the MHC− DP cluster express β variable region transcripts (Fig. 4g) with fully recombined clonotypic TCR β chains in more than 37% of those cells (Extended Data Fig. 5i).

Fig. 4. Single cell transcriptomics of the MHC− DP abnormal cluster disclose dedifferentiation and reprogramming to include a myeloid programme.

a. Volcano plot of significant transcript differences (P<0.05) between the DP abnormal population and the DPbl cluster in the same MHC− DP library.

b. Violin plots depicting indicated log-normalised transcript levels for selected transcripts well-expressed in the MHC+ DP library DPbl cluster (1212 cells), MHC− DP library DPbl cluster (3044 cells), and MHC− DP abnormal cluster (2512 cells). Cluster colours are as depicted in Figs. 1 and 2. P values as in panel a.

c. Cluster distribution of top 20 DP clonotypes developing on MHC+ stroma (left panel) together with clonotype tracking (right panel). Colour/clonotype specification is unique to the panel and bears no relation to colour use in Fig. 4d or Fig. 3c, d. Right panel maintains colour concordance with left panel.

d. DP cluster distribution of top 20 clonotypes developing on MHC− stroma with right panel depicting clonotype tracking.

e. ssGSEA scores for MHC+ DPbl (green), MHC− DPbl (light orange), and MHC− DP abnormal (red) cells analysing correlation with module gene sets with co-ordinated gene regulation in the indicated tumours (MSigDb C4).

f. Comparison of MHC− DP abnormal cluster with the DP cluster developing on MHC+ stroma for transcripts reported as dysregulated in human AML (T-ALL/AML common) or as overexpressed in CD34+ leukaemic stem cells (LSC) from AML patients (LSC17).

g. Co-expression of Spi1 and Trbv transcripts in Spi1+ MHC− DP abnormal cells. Cell numbers for MHC+ DPbl, MHC− DPbl, and MHC− DP abnormal as in panel b, and 221 DP abnormal Spi1+ cells.

h. Schematic of proposed dedifferentiation for thymocytes developing in the MHC− condition (pink area).

For box plots, the box bounds the 1st to 3rd quartiles; where visible, the dotted line within represents mean, and the solid line represents median. Whiskers above and below (maximum and minimum) are defined as (quartile 3 + 1.5 * interquartile range) and (quartile 1 − 1.5 * interquartile range), respectively.
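
The whisker definition in this legend can be sketched in a few lines of Python. This is an illustration only: the quartile interpolation method (`inclusive`) and the example values are assumptions, since the legend does not state how quartiles were computed.

```python
import statistics


def whisker_limits(values):
    """Return (lower, upper) whisker limits as defined in the legend:
    quartile 1 - 1.5 * IQR and quartile 3 + 1.5 * IQR."""
    # Assumption: "inclusive" quartile interpolation; the legend names no method.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Hypothetical log-normalised transcript levels, for illustration only.
example = [1, 2, 3, 4, 5, 6, 7, 8, 9]
lower, upper = whisker_limits(example)
print(lower, upper)  # -3.0 13.0
```

Values falling outside these limits would be the points conventionally drawn as outliers.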

To investigate the possibility that the DP abnormal cluster arose from aberrant expansion of one HSC in the progenitor pool, the fraction of Chr:Y+ cells in each cluster was determined. These results do not support stochastic growth independent of stromal cell MHC expression (Supplemental Information File 6). Further strengthening the proposal that the DP abnormal cells are following a path deviating from the wild-type pathway, 80% of the Mpo+ cells and 84% of the Mpo+Spi1+ cells are co-expressing Lck and/or Cd3e (Extended Data Fig. 5j). Confirmation that the expression of myeloid transcripts occurred in T lineage-committed cells and was not due to contaminating myeloid progenitors was independently derived via a separate set of experiments examining DP cells developing from purified DN3a and DN4 cells in vitro (Supplemental Information File 4 and Extended Data Fig. 5k–m). Significantly higher levels of Mpo and Spi1 transcripts were detected in the FACS-sorted MHC− DP cells by quantitative RT-PCR (qRT-PCR; Extended Data Fig. 5l, m).

Both NKT cells and Mucosal-associated invariant T cells (MAIT) develop from the DP population but there is no evidence that these cells are developing as an alternative path to canonical αβ TCR cells in the absence of pMHC ligation based on two orthogonal findings in our data. First, their respective transcriptional signatures do not map to the DP abnormal population (Rorc, Tbx21 and Gata3 for NKT; Zbtb16, Drosha, and Il18 for MAIT32,34, although Mr1 is 2-fold upregulated). Second, all 3 TCR β chains restricted to mouse NKT cells (Trbv1, Trbv13 alleles and Trbv29) are downregulated in the DP abnormal population while those β chains restricted to mouse MAIT cells are downregulated (Trbv13 alleles) or unchanged (Trbv19) (Fig. 4a)34.

As observed for the DN4 libraries, the distribution of the top 20 clonotypes by cell representation was markedly different between the MHC+ DP library and the MHC− DP library (Extended Data Table 3b). Average cell representation of each clonotype was higher in the MHC− condition (Fig. 4c, d), and tracking showed that this difference was retained in the DPsm and pre-apoptotic clusters. Representatives of both the DP abnormal and of the DPsm are found in the DPbl population, but there is minimal overlap between the DPsm and DP abnormal cells implying cluster destiny is specified at the DPbl stage. Given the proliferative transcript signature, the early progenitor profile, and the presence of the myeloid markers, the DP abnormal population was examined by ssGSEA for evidence of entering into a state conducive to future myeloid dysplasia or leukaemia development (Fig. 4e). MSigDb C4 cancer module 489 generated the highest differential score, a cell profile that is strongly associated with leukaemias including T-ALL and AML. The signature panel for regulatory gene abnormalities in CD34+ leukaemic stem cells isolated from acute myeloid leukaemia (AML) patients overlaps completely with that for T-ALL (Fig. 3f, 4f)35. A further AML CD34+ leukaemic stem cell panel (LSC17)36 was used to assess any potential equivalence of the DP abnormal cells with transformed AML leukaemic stem cells (Fig. 4f). The DP abnormal cells expressed 7/10 signature transcripts in the T-ALL/AML common panel at levels significantly higher than developing DPbl cells. The minimal change in Gata3 and Runx1 may indicate the ongoing T cell lineage programme in both subpopulations. Comparison with representatives of the human LSC17 panel found significant upregulation of 4/9 markers with a further 3 trending upward. Cd34, the canonical haematopoietic stem cell marker, was the most profoundly upregulated (293-fold). The high expression of Cd34 coupled with persistence of Erg (Fig. 1a) in the MHC− DP abnormal population, absent in the MHC+ libraries, points to an earlier progenitor environment in the absence of pMHC-driven preTCR signalling.

B2m and H2-Ab1 dKO preTCR signalling

We next examined gene expression in the thymus of MHC+ B6 mice and mice on the same background carrying double knockout (dKO) mutations for both B2m and for H2-Ab1 previously created to abrogate expression of MHCI and MHCII21 (MHC−). We ascertained whether the phenomena observed in vitro were recapitulated in vivo. Cell recoveries indicated a significant increase in DN3a cells in the MHC− thymi, a differential that extended less significantly through the DN3b to DN4 stages (Extended Data Fig. 7a). Examining gene expression for cells transitioning from the DN3a to immature CD8 single positive (ISP) stage (Supplemental Information File 4), strong downregulation of B2m and moderate downregulation of H2-Ab1 was observed for H-2 negative thymocytes (Extended Data Fig. 7b). For the transcript changes occurring in the DN3a to ISP transition depicted in Fig. 1a, however, we found no difference between MHC+ and the MHC− thymocytes (Extended Data Fig. 7c). Moreover, no reduction in the Trbv transcription at the DN4 stage was observed in the MHC− thymocytes (Extended Data Fig. 7d). In contrast, a defined hallmark of preTCR signalling, the upregulation of anti-apoptotic Bcl2a1 family transcripts, was clearly observed in the MHC+ but not detected in MHC− DN4 thymocytes (Extended Data Fig. 7e), while canonical Bcl2 pathway transcripts were similar in both37. Upregulation of Trav transcripts dependent upon preTCR signalling was significantly stronger in the MHC+ than in MHC− mice (P = 2 × 10⁻⁸; Extended Data Fig. 7f)38. Of interest, the Pim1 proto-oncogene associated with foetal haematopoiesis and overexpressed in myeloid and lymphoid leukaemias39 is one of the strongest expressed transcripts detected in the MHC− libraries but barely detected in the MHC+ libraries (Extended Data Fig. 7e). 
Analysis of complete β chain repertoires for the entire thymus representation of DN3a to ISP cells was uninformative; for each library more than 98.9% of the clonotypes were represented by 3 or fewer UMI leading to such high repertoire diversity scores that no significant differences were observed between libraries.

MHCIb upregulation in dKO mice

Remarkably, in all dKO MHC− libraries, both H2-T3 (TL) and H2-T22 were dramatically upregulated over those in B6 MHC+ libraries (Extended Data Fig. 7b). In contrast, OP9-DL4 H2-T22 expression was similar between the MHC+ and MHC− cells and H2-T3 (TL) was undetectable in either of the isogenic stroma (Extended Data Table 2). Noteworthy in MHC− libraries was upregulation of H2-Q10 and H2-T-ps, the latter now believed to be protein coding (NCBI Gene ID: 667803).

The enhanced transcription of H2-T22, H2-T3 (TL), H2-Q10 and H2-T-ps genes implies that adaptation in vivo maintains functional β selection by upregulating non-classical minor MHCIb products, thereby compensating for loss of classical MHCIa alleles. This phenomenon is not operative in the OP9 cultures. Our animal studies are not only consistent with the apparently normal phenotypic thymocyte development of B2m and H2-Ab1 double knockout mice observed previously21 but also underscore the complexity of vital in vivo biological signalling including mechanisms to override pathway blockade via compensatory adaptation. Nonetheless, preTCR signalling is not entirely normal as evidenced by the failure to observe upregulation of Trav and Bcl2a1, in agreement with the suggestion that the narrow width of the MHCIb α1α2 presenting platform relative to that of MHCIa might attenuate preTCR signalling8.

Malignancy despite MHCIb compensation

Given the leukaemia/lymphoma-like transcriptome signatures of the DN4 unusual and DP abnormal clusters, we monitored the health of a cohort of animals for signs of cancer development as they aged. Of seven animals studied at 15 months of age, approximately half the lifespan of a B6 mouse, one developed significant weight loss, scruffy coat appearance, failure to thrive and was sacrificed. The gross pathology depicted in Extended Data Fig. 7g reveals a massive thymus, spleen, and lymph node as well as hepatomegaly. FACS staining of organ-derived cell suspensions of this animal revealed aberrant cell populations including a CD4dullCD8dull DP subset and a CD8dull SP subset in thymus as well as a DP population in the spleen, presumably of thymic origin (Extended Data Fig. 7h). Haematoxylin and eosin staining of fixed tissue sections showed the thymus and spleen to be effaced by tumour, obliterating normal landmarks (Extended Data Fig. 7i). The liver disclosed tumour cells within distended sinusoids consistent with haematogenous spread. The bone marrow was also extensively replaced by the acute lymphoid blast-like cells with a large nuclear to cytoplasmic ratio. By immunohistochemistry (IHC) analysis, the tumour cells were positive for CD8 as well as for TdT, a marker of early T lineage cells (Extended Data Fig. 7j), that together with the distribution of tumour cells is consistent with a thymic origin. The myeloid marker elastase was not detected but the tumour was positive for activated Notch1 (NICD1; Extended Data Fig. 7j).

Pre-TCR-pMHC safeguards development

The current in vitro study reveals that preTCR-pMHC interactions sculpt the transcriptome of DN3 and later stage thymocytes, in addition to fostering β clonotype diversity in the αβ T cell lineage. Three irregular UMAP clusters were uncovered on the MHC− stroma. The first, an aberrant DN3b-DN4 transitional population, lacked evidence of preTCR signalling but maintained Notch signalling and manifested a broad decrease in Trbv transcripts. The second, a DN4 unusual population, minimally present in the MHC+ population, abnormally upregulated genes involved in earlier stages of thymic renewal (Lyl1, Kit, Id2, Dtx1 and Bcl11a), T cell co-stimulatory function (Icos), adhesion function (Itgb3) and cytokine receptors involved in inflammation (Il18r, Il23r). The third, an entirely anomalous cluster, DP abnormal, expressed Cd4, Cd8a and Cd8b1, with a subset simultaneously expressing multiple myeloid genes (Mpo, Prtn3, Ctsg, Elane, Hdc and Cst7). Both DN4 unusual highly proliferating cells and DP abnormal cells expressed the AY036118 gene implicated in control of thymocyte proliferation29.

Such aberrations of developmental programs at DN and DP thymocyte stages are noteworthy given that human T-ALL represent aggressive malignancies of these same phenotypic subpopulations including a subset of DN early T cell precursors31,40. Key human genetic abnormalities include instabilities fostering rearrangements and/or deletions of TCRB, TCRA and TCRD loci, genes linked to cell cycle growth control (Cdkn2a or Cdkn2b) and mutations associated with hyperactive Notch signalling41,42. The latter are present in ≥50% of cases, often with additional mutations of transcription factors and signalling pathways43.

In vivo mouse over-expression of transcription factors (Tal1 and Tlx1) also results in T-ALL44,45 with acceleration of disease mediated by additional mutations such as those involving Bcl11b or Notch146,47. Thus the activated Notch1 found in the MHC− dKO-derived tumour described herein (Extended Data Fig. 7j) is not unexpected. Strikingly, disruption of competition between “new” bone marrow-derived immigrants and “existing” DN3a thymic resident progenitors leads to aberrant self-renewal of the latter culminating in murine T-ALL reminiscent of human T-ALL in virtually all respects, replete with their development of activating Notch1 mutations48. These DN3a thymic self-renewal progenitors give rise to TCRβ-deficient DP thymocytes with a high frequency of non-productive β gene rearrangements expressing Notch1 and Ptcra transcripts consistent with ongoing Notch signalling49.

The DP blast abnormal cluster’s expression of myeloid genes and transcriptional signatures shared with AML and ETP-ALL stem cells35,36 suggests that a subset of myeloid malignancies may arise from the DP compartment after further transformation, particularly in light of the clinical entity of mixed phenotype acute leukaemia (MPAL) expressing both lymphoid and myeloid malignant markers simultaneously33,50. Thus, rather than singularly arising from ETP, a thymic genesis of certain haematopoietic malignancies could involve de-differentiation from later stages of development including DP thymocytes or even involve transdifferentiation to other lineages. Dedifferentiation is a normal process whereby cells progress in a retrograde manner from a more to a lesser differentiated state as a safeguard against progenitor loss51. Although such phenomena have been induced by chemical or genetic means in the haematopoietic system52–54, here we demonstrate that lack of appropriate signalling during development leads to reprogramming.

The UMAP projection localizes the abnormal cluster cells between the expected developmental path and the apoptotic cluster. While detected within a synchronised window of differentiation in vitro, rapid in vivo removal of apoptotic cells by phagocytes in the thymus would obscure the destiny of the unusual and abnormal cluster cells. Their elimination also might be one factor contributing to the discordance between in vitro and in vivo results in addition to compensatory MHC class Ib expression in vivo that maintains orderly developmental progression. Nevertheless, evolution of a thymic leukaemia despite upregulation of MHCIb in dKO mice underscores vulnerability that arises during thymic development in the absence of fully normal preTCR triggering. In the small cohort of ageing dKO mice studied here the frequency of the early T cell malignancy is 14%. In mice lacking the Vβ domain of the preTCR, the singular pMHC binding site on the receptor, tumour penetrance is 80% over a comparable time frame55. While additional dKO cohorts must be studied to generate robust statistics and tumour genetic features, this differential might be consequent to DN3 thymocytes in dKO mice interacting with MHC class Ib ligands via a normal preTCR in contrast to Vβ domain minus preTCR expressing mice whose preTCRs are incapable of binding to any MHC molecules.

We postulate that self-pMHC reactivity triggers preTCRs on thymocytes during β-selection, attendant downregulation of Notch signalling, modulation of cell-cell adhesion, migration, and metabolism. Recent studies demonstrated formation of an immunological synapse between DN3a thymocytes and stroma, thereby creating a preTCR platform around β-selection to integrate cues involving Notch ligand, the CXCR4 ligand as well as pMHC on thymic stroma likely involving asymmetric cell division to foster further differentiation56,57. A cellular niche of this type could serve as a pivotal nexus within the developmental circuit to terminate cellular plasticity and foster orderly downstream development. This circuit can go awry, however, if preTCR-pMHC ligation falters due to absent functional ligands, should there be disruption of the appropriate signalling pathways, or if there is dysregulated entrance of progenitors into and/or exit from their developmental niche. Likewise, generation of intra-thymic AML can be understood as a possible consequence of early developmental plasticity and thymic niche anomalies.

TCR gene rearrangement processes necessary for T-lineage repertoire formation bracket the β-selection that fosters clonal expansion and repertoire diversification, thereby creating a further vulnerability for tumourigenesis. Somatic TCR repertoire formation affording protective adaptive immunity incurs this potential cost. Our findings emphasize that while thymocyte progression per se can occur in the absence of classical MHC ligand-dependent preTCR function, those self-pMHC interactions are essential for normal development and to mitigate aberrant dedifferentiation. The observed upregulation in vivo of non-classical MHCIb in the dKO mice preserves some but not all features of ligand-dependent preTCR function, underscoring this biology but at the same time reveals that a risk of thymic tumour evolution in the absence of classical MHC molecules persists.

Methods

Mice

Six-week-old C57Bl/6 (B6) and B6.129-H2-Ab1tm1Gru B2mtm1Jae N1721 (MHC−) mice were purchased from Taconic Farms Inc. and housed at the DFCI Animal Facility, accredited by the Association for Assessment and Accreditation of Laboratory Animal Care (AAALAC). All maintenance, breeding, and experimental procedures were approved under Dana-Farber Cancer Institute Institutional Animal Care and Use Committee (IACUC) protocols 03-138 and 04-113. Euthanasia was by CO2 inhalation followed by cervical dislocation. Following removal from the uterus, E14.5 fetuses were euthanized by decapitation with surgical scissors. Where appropriate, no gender preference was expressed for experimental animal use.

Reagents

The OP9-DL4 parental (MHC+) cell line, and the MHC and scH-2Kb variants, were developed and used as described previously (all mycoplasma negative)7,57. Anti-mouse CD44-APC/Cy7 (clone IM7) and anti-mouse CD117-APC (c-Kit; clone 2B8) were obtained from BD Biosciences. Anti-mouse CD24 and anti-mouse CD24-FITC (clone M1/69), anti-mouse CD3e-BV605 (clone 145-2C11), anti-mouse CD4-Pacific Blue and CD4-BV711 (clone RM4-5), anti-mouse CD8a-PerCP/Cy5.5 (clone 53-6.7), anti-mouse CD8b.2-PE (clone 53-5.8), anti-mouse CD8b-PerCP/Cy5.5 (clone YTS156.7.7), anti-mouse CD11b-biotin (clone M1/70), anti-mouse CD11c-biotin (clone N418), anti-mouse CD19-biotin (clone 6D5), anti-mouse CD28-PE (clone E18), anti-mouse CD45-BV605 and anti-mouse CD45-APC (clone 30-F11), anti-mouse CD45R/B220-BV421 (clone RA3-6B2), anti-mouse NK1.1-biotin (clone PK136), anti-mouse Gr-1-biotin (clone RB6-8C5), anti-mouse Ter-119-biotin (clone TER-119), anti-mouse TCRγ/δ-biotin (clone GL3), streptavidin-BV421, and Zombie Aqua were obtained from Biolegend. Anti-mouse Ly-6A/E (Sca1)-FITC (clone D7) and anti-mouse CD25-PE/Cy7 (clone PC61.5) were obtained from eBioscience.

The following antibodies from Cell Signaling Technology were used for immunohistochemical detection: anti-mouse CD8 (clone D4W2Z), anti-mouse neutrophil elastase (clone E8U3X) and anti-mouse Notch1 intracellular domain (clone D3B8). Anti-mouse TdT (clone EPR2976Y) was obtained from Abcam.

Analysis of B6 thymocyte-like development in vitro

Isolation of wild-type haematopoietic stem cells (HSC) followed the procedure described previously7. Briefly, fetal liver cells from 30 E14.5 B6 embryos from 3 dams were depleted of B cells using anti-CD24 and complement lysis (Cedarlane) followed by staining with anti-CD4-Pacific Blue, anti-CD8-PE, anti-Sca1-FITC and anti-CD117(c-Kit)-APC. The CD4−CD8− (lineage negative, lin−) Sca1+ c-Kit+ cells (HSC) were isolated by a Becton-Dickinson FACS Aria II cell sorter. For the T cell repertoire analysis from pooled FACS-sorted thymocyte-like cells, 2,000 HSC were seeded onto 70 to 90% confluent layers of wild-type OP9-DL4 cells (MHC+), or onto MHC-negative OP9-DL4 cells (MHC−), or onto the MHC− cells re-expressing a single chain H-2Kb presenting VSV8 peptide (scH-2Kb), in six-well plates (i.e. six independent cultures) in α-MEM without nucleosides + 15% FCS (OP9 media), HEPES (10mM), and gentamycin supplemented with Flt3 (5 ng/ml; R&D) and IL-7 (1 ng/ml; Peprotech). For the scRNA-seq experiments, 30,000 similarly prepared HSC were seeded onto MHC+ or MHC− stromal cells under the same conditions, increasing the replicates to ten 10 cm dishes/OP9 variant. After growth for 9 days, cells were isolated from the cultures and counted prior to FACS separation to enrich by surface antigen phenotype for cells at different stages of thymocyte-like differentiation.

Cell sorting, library preparation and data processing for scRNA-seq and TCR V(D)J repertoire characterization.

For scRNA-seq analysis, cells were stained with a cocktail consisting of Zombie Aqua for gating of non-viable cells, anti-CD45-APC for gating of haematopoietic cells, with biotinylated anti-CD11b, anti-CD11c, anti-NK1.1, anti-mouse TCRγ/δ, anti-Gr-1, anti-Ter119, and anti-CD19 followed by streptavidin-BV421 for gating of non-T lineage cells, and of anti-CD4-BV711, anti-CD8α-PerCP/Cy5.5, anti-CD44-APC/Cy7, anti-CD25-PE/Cy7 and anti-CD28-PE for gating and collection of DN3a, DN3b, DN4 and DP thymocyte-like cells on a FACS Aria II cell sorter (Extended Data Fig. 1). Note that residual ILC-γ/δ-like T cells in the DN4 subset represent cells with an ILC precursor (Id2, Zbtb16), ILC2 (Gata3, Rora), γ/δ T cell-like transcriptome but with no or low surface TCR expression. For each condition (MHC+ or MHC−), 50,000 DN3a, DN3b, DN4 and DP cells were collected by FACS for application to a 10X Chromium controller (10X Genomics) and recovery of 8,932 ± 920 (mean ± s.e.m.; n = 8) processed cells for gene expression (5’ GEX) and TCR V(D)J sequence library construction. Recovery for each MHC+ library was as follows: DN3a: 6,970 cells, DN3b: 7,711 cells, DN4: 7,337 cells, and DP: 5,747 cells. Likewise, recovery for each MHC− library was as follows: DN3a: 9,453 cells, DN3b: 8,776 cells, DN4: 12,454 cells, and DP: 13,011 cells. Bar coding and 5’ library construction using v1.0 chemistry was performed following the manufacturer’s protocol. Targeted mouse TCR recovery utilised the Chromium Single Cell V(D)J Enrichment Kit for mouse T cells. All libraries were single i7-indexed using the Chromium i7 Multiplex kit. Following isolation and clean-up of library DNA, integrity was assessed using an Agilent Bioanalyzer and quantification by Qubit analysis (Invitrogen). 
All libraries were adjusted to ~50 ng/µL, where peak fragment size (including Illumina adapters) for the gene expression (5’ GEX) libraries averaged 473 bp and ranged from 300 – 740 bp for the 5’ TCR libraries representing ongoing recombination products in the developing thymocyte libraries. Sequencing (150 PE) was performed on HiSeq 3000 utilizing 4 lanes where two 5’ GEX libraries (2 x 40% of reads) and two TCR libraries (2 x 10% of reads) were sequenced per lane.
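
As a worked check of the recovery summary quoted above (8,932 ± 920, mean ± s.e.m., n = 8), the eight per-library cell counts can be summarised as follows. The sketch assumes the s.e.m. is the sample standard deviation (n − 1 denominator) divided by √n, which the text does not state explicitly; the dictionary keys are illustrative labels.

```python
import math
import statistics

# Cells recovered per library, as listed in the text (labels are illustrative).
recovered_cells = {
    "MHC+ DN3a": 6970, "MHC+ DN3b": 7711, "MHC+ DN4": 7337, "MHC+ DP": 5747,
    "MHC- DN3a": 9453, "MHC- DN3b": 8776, "MHC- DN4": 12454, "MHC- DP": 13011,
}

values = list(recovered_cells.values())
mean = statistics.mean(values)
# Assumption: s.e.m. = sample standard deviation / sqrt(n).
sem = statistics.stdev(values) / math.sqrt(len(values))

print(f"{mean:.0f} ± {sem:.0f} (n = {len(values)})")  # 8932 ± 920 (n = 8)
```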

Following conversion of the bcl2 sequencing files to fastq format, the 5’ GEX sequencing results were pipelined to Cellranger 3.1.0 using the GRCm38.p6/mm10 mouse genome as reference and the TCR files were pipelined to Cellranger V(D)J 3.1.0 using vdj_GRCm38_alts_ensembl-3.1.0.gz-3.1.0 as reference, all using default parameters (https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/algorithms/overview#secondary-analysis ) as described in the following overview. Gene expression data from all libraries were aggregated by Cellranger to generate a UMAP of all libraries projected into the same 2D space. For aggregation, the count output files for each Chromium controller well were processed using the “aggr” command to produce a single feature-barcode matrix containing all the data. Since barcodes may overlap between libraries, a well suffix is added to each barcode-nucleotide sequence to hardcode well origin. Before merging, depth normalization is performed to subsample reads for each library to equalize the number of reads confidently mapped to the transcriptome. Prior to Principal Component Analysis, the UMI counts were normalized towards the median across all cells by multiplying each cell’s UMI count by a scaling factor of the median UMI count across all the cells divided by the UMI count for the cell. The matrix is log-transformed then centered and scaled per-gene such that the mean is 0 and the standard deviation is 1 prior to clustering. Consequently, all data used for differential expression is log-normalized and a pseudocount of 1 was added to both the numerator and denominator of the mean expression. For differential expression analysis and associated P values, Cellranger utilizes an implementation of the exact negative binomial test (https://bioconductor.org/packages/release/bioc/html/sSeq.html ). 
For a cluster or selected cell subset within a cluster, log2-fold change was either tested against the mean expression for all other cells (global analysis) or against a selected cluster or subset (local analysis) using the Loupe browser 4.2.0 together with the Loupe V(D)J browser 3.0.0 (10X Genomics) for integration of TCR clonotype parameters. The log normalized data for specific clusters was piped into ssGSEA (https://www.genepattern.org/modules/docs/ssGSEAProjection/4# ) for analysis of co-ordinated gene regulation in gene set modules associated with up- or down-regulation in defined cancers (MSigDb C4; http://www.gsea-msigdb.org/gsea/msigdb/collections.jsp ). ssGSEA scores greater than 1000 are considered as strongly associated, equal to zero as showing no correlation with module genes, and < 0 as inversely correlated.
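
The normalisation steps described above (scaling each cell towards the median UMI count, log transformation, then per-gene centring and scaling) can be sketched in plain Python. This is a simplified illustration, not the Cellranger implementation: the log base, the pseudocount of 1 inside the logarithm, the use of the population standard deviation, and the toy matrix are all assumptions.

```python
import math
import statistics


def normalise_counts(matrix):
    """matrix: list of cells, each a list of per-gene UMI counts.
    Returns (scaled, z): median-scaled counts and per-gene z-scores."""
    totals = [sum(cell) for cell in matrix]  # assumes every cell has counts > 0
    median_total = statistics.median(totals)
    # Multiply each cell by (median total UMI count / that cell's total UMI count).
    scaled = [[c * median_total / t for c in cell]
              for cell, t in zip(matrix, totals)]
    # Assumption: log2(1 + x); the text does not give the base or pseudocount here.
    logged = [[math.log2(1 + c) for c in cell] for cell in scaled]
    n_genes = len(matrix[0])
    z = [[0.0] * n_genes for _ in matrix]
    for g in range(n_genes):
        column = [cell[g] for cell in logged]
        mu = statistics.fmean(column)
        sd = statistics.pstdev(column)  # assumption: population SD
        for i, value in enumerate(column):
            # Genes with zero variance are left at 0 to avoid division by zero.
            z[i][g] = (value - mu) / sd if sd > 0 else 0.0
    return scaled, z


# Hypothetical 3-cell x 3-gene UMI matrix, for illustration only.
toy = [[10, 0, 5], [2, 2, 4], [30, 10, 20]]
scaled, z = normalise_counts(toy)
```

After this step every cell carries the same total (the median of the original totals), and each gene has mean 0 and standard deviation 1 across cells, which is the form described for the input to clustering.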

Bulk population TCR repertoire protocol and data processing for cells developing in vitro

For the bulk population β repertoire analyses of thymocyte-like cells developing on the MHC+ and MHC− stromata, respectively, cells were stained with anti-CD45-APC, anti-CD4-Pacific Blue, anti-CD8-PE, anti-CD25-PE/Cy7 and anti-CD44-APC/Cy7 for simultaneous collection of DN3, DN4, DPbl and DPsm thymocytes on a FACS Aria II cell sorter (Supplemental Information File 4). Contaminating OP9 cells expressed GFP permitting their exclusion while selection for CD45 expression ensured only haematopoietic cells were used for subset delineation. Cells were gated as CD4−CD8− (double negative, DN) and CD4+CD8+ (double positive) from which 10,000 cells each of DN3 (CD25+CD44−) cells, DN4 (CD25−CD44−) cells, DPbl (blast cells, in cell cycle; CD4+CD8+ high forward scatter) and DPsm (small, more mature cells; CD4+CD8+ low forward scatter), were collected. For each population, the cells were immediately deposited into TCL x2 lysis buffer (Qiagen) supplemented with 2-mercaptoethanol (1%) on ice, snap-frozen by immersion in dry-ice-methanol and stored at −80°C until processed for RNA extraction and β chain repertoire analysis.

Total RNA was extracted from each sample of 10⁴ cells using the PicoPure column purification system (Applied Biosystems). Subsequently, the procedure followed precisely that described by Mamedov et al.58. Briefly, using a 3′ Trbc (TCRβ constant region) universal primer, 1st strand cDNA was synthesized from the starting RNA and a universal “Switch” primer ligated to the 5′ ends. Nested/extended PCR amplification through the universal ends yielded unbiased amplification of transcripts containing the complete V(D)J region and a 5′ segment of the Trbc. In the second PCR, pentanucleotide bar codes were introduced to tag each library with unique barcodes at both 5′ and 3′ ends. Following quality control using the Agilent 2100 Bioanalyzer and Illumina adapter addition, samples were sequenced (150 PE) on the MiSeq platform. Library sequences were deconvoluted from the fastx sequence output files using the barcode splitter module of the FASTX toolkit (http://hannonlab.cshl.edu/fastx_toolkit/index.html ). The deconvoluted library sequences were aligned to Vβ regions in the GRCm38.p6/mm10 mouse genome followed by clone assembly and CDR3 extraction using the MiXCR suite running under Java59. Output provided V, D, J, and Cβ usage, CDR3 nucleotide and amino acid sequence, sequence quality, and relative representation by read count. The VDJtools analytical package was used to track and compare clonotypes within the libraries60.

Comparative transcriptomes of MHC+ and MHC− stromal cells

Total RNA was extracted from OP9-DL4 MHCI+ parental cells and from OP9-DL4 MHC− cells and processed commercially for standard RNA-Seq and data processing (Novogene). Briefly, following alignment to the GRCm38.p6/mm10 mouse genome using STAR 2.7.3a (https://github.com/alexdobin/STAR/releases), gene expression quantification was determined as Fragments per kilobase of transcript sequence per million base pairs sequenced (FPKM) followed by differential gene expression determined by DESeq2 (https://bioconductor.org/packages/release/bioc/html/DESeq2.html ) yielding expression level, log2 fold-difference between the two OP9-DL4 variants, a P value and a Padj utilising the Benjamini-Hochberg correction to control the false discovery rate.
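
Two quantities named in this paragraph, FPKM and the Benjamini-Hochberg adjusted P value, follow standard formulas that can be sketched as below. This is a plain re-implementation for illustration, not the code run by the commercial pipeline or by DESeq2; the function names and example numbers are hypothetical.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity of the result.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical gene: 500 fragments, 2 kb transcript, 25 million mapped fragments.
example_fpkm = fpkm(500, 2000, 25_000_000)
# Hypothetical raw P values for four genes.
example_padj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Calling a gene significant when its adjusted value falls below a chosen threshold controls the expected proportion of false discoveries among the genes called, rather than the per-gene error rate.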

Detection of myeloid transcripts in DP cells developing in vitro

Isolation of wild-type HSC followed the procedure described previously. Thirty thousand CD4−CD8− lin− Sca1+ c-Kit+ cells (HSC) were isolated by FACS and placed onto 3 × 10⁵ MHC+ OP9-DL4 cells seeded 1 d previously in OP9 media. After growth for 7 d, cells were isolated from the culture, counted, and then stained with a cocktail of biotinylated anti-CD11b, anti-CD11c, anti-NK1.1, anti-Gr-1, anti-Ter-119, anti-CD19, and anti-γδ TCR. Cells were subsequently stained with streptavidin-BV421, anti-CD45-BV605, anti-CD4-BV711, anti-CD8β-PerCP/Cy5.5, anti-CD25-PE/Cy7, anti-CD44-APC/Cy7, anti-CD28-PE, and Zombie Aqua. Viable DN3a and DN4 thymocytes were collected by gating initially for Zombie Aqua− GFP− CD45+ cells, with subsequent gating for the CD4−CD8− DN and sorting of 1,000 DN3a (CD25+CD44−CD28−) cells, and 1,000 DN4 (CD25−CD44−CD28+) cells. Those cells were placed on 50,000 MHC+ or MHC− OP9-DL4 cells seeded 1 d previously. After culture for 12 days, cells were isolated from the culture, counted, and stained with anti-B220-BV421, anti-CD45-BV605, anti-CD4-BV711, anti-CD8β-PerCP/Cy5.5 and Zombie Aqua. CD4+CD8+ DP cells were sorted into TCL x2 lysis buffer (Qiagen) supplemented with 2-mercaptoethanol (1%) on ice and stored at −80°C. Total RNA was prepared (RNeasy Micro; Qiagen) and quantitative RT-PCR was performed on an Applied Biosystems 7500 Fast using a SYBR green Superscript/Taq master mix (Power SYBR Green RNA-to-CT 1-Step Kit; ThermoFisher) and the following PrimerBank-validated primers61: mouse Mpo-F: AGTTGTGCTGAGCTGTATGGA; mouse Mpo-R: CGGCTGCTTGAAGTAAAACAGG; mouse Spi1-F: ATGTTACAGGCGTGCAAAATGG; mouse Spi1-R: TGATCGCTATGGCTTTCTCCA; mouse Actb-F: GGCTGTATTCCCCTCCATCG; mouse Actb-R: CCAGTTGGTAACAATGCCATGT. Data were processed using the LinRegPCR package62.

Gene expression and total β clonotype analysis of thymus DN3 to ISP cells

The thymus from each of 3 B6 and 3 MHC− mice (all males aged 3 weeks) was isolated and the cells dispersed into RPMI-1640 medium, treating each thymus as an individual sample. The cells were incubated for 10 min on ice with anti-CD4 (clone L3T4) covalently linked to microbeads (Miltenyi Biotec) used at a ratio of 100 µL beads/10⁸ cells. The thymocyte/microbead mixtures were applied to replicate LS MACS columns in a MidiMACS separator and unbound cells collected as CD4-depleted populations, removing DP thymocytes and CD4SP thymocytes. The CD4-depleted populations were then sorted to remove the non-T lineage cells as described above (viable, non-T lin−, <0.05% CD4+). Following gating on DN cells (CD4−CD8−), cells were gated further on the DN3/4 population (CD44−), and then into three further gates of CD25hiCD28lo/int (DN3a), CD25intCD28hi (DN3b), and CD25loCD28hi (DN4) as outlined in Supplemental Information File 4. Following gating on the CD8+ cells in the viable, non-T lin− population, the cells were further gated on the CD24hiCD3− cells (immature single positive; ISP) and isolated populations collected into TCL lysis buffer as described above. For each mouse this procedure yielded the complete representation of all phenotypically defined DN3a, DN3b, DN4, and ISP thymocytes. Total RNA for each population was prepared using the RNAqueous-4PCR protocol (Applied Biosystems/Life Technologies). From the isolated total RNA, 200 ng was removed for NGS library preparation (SMART-Seq v4 Ultra Low Input RNA, Takara), Illumina adapter addition, and sequencing (PE150, NovaSeq platform, ~40 × 10⁶ reads/sample) for gene expression analysis (Medgenome). Read count data were normalized using DESeq2. The aligned reads were used for estimating expression of the genes using cufflinks v2.2.1. The expression values are reported in FPKM (fragments per kilobase per million) units for each gene.
The remaining RNA was used for total population Trbv repertoire determination following the protocol of Mamedov et al.58 with minor differences from the procedure described above. To reduce errors introduced by PCR amplification and to estimate the number of individual RNA molecules contributing to a particular clonotype, the “Switch” primer incorporated a unique molecular identifier (UMI) motif of 12 nucleotides, within which were interspersed several deoxyuridine nucleotides; after cDNA synthesis these were treated with uracil-DNA glycosylase to prevent participation of the Switch primer in the downstream PCR reactions. The individual barcoded DN3a, DN3b, DN4 and ISP libraries for each animal were pooled and Illumina adapters added to generate one total thymus library of these stages for each animal. Following sequencing (PE150, NovaSeq platform, Medgenome), library deconvolution, assembly, alignment and UMI processing were handled by the MIGEC package63 to determine β clonotype repertoire based on UMI rather than total reads. The output was then pipelined directly to the VDJtools package60 as described above. Repertoire diversity was assessed using the CalcDiversityStats module of the VDJtools package based on the D50 and the Diversity Index (DI)64, which yields a value in the 0–1 range where 1 = maximal diversity.
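The principle of counting clonotypes by UMI rather than by total reads, and one commonly used definition of D50, can be sketched in Python as follows. This is not the MIGEC or VDJtools code: it omits UMI error correction and consensus assembly, it assumes D50 to be the smallest number of top-ranked clonotypes that together account for at least 50% of the total count, and the function names and example reads are hypothetical.

```python
from collections import defaultdict


def clonotype_umi_counts(reads):
    """Collapse reads to unique UMIs per clonotype.

    reads -- iterable of (umi, cdr3) pairs; PCR duplicates share a UMI
    Returns {cdr3: number of distinct UMIs}.
    """
    umis = defaultdict(set)
    for umi, cdr3 in reads:
        umis[cdr3].add(umi)
    return {cdr3: len(seen) for cdr3, seen in umis.items()}


def d50(counts):
    """Smallest number of top-ranked clonotypes covering >= 50% of counts."""
    total = sum(counts.values())
    running = 0
    ranked = sorted(counts.values(), reverse=True)
    for n, count in enumerate(ranked, start=1):
        running += count
        if 2 * running >= total:
            return n
    return 0  # empty repertoire


if __name__ == "__main__":
    # Hypothetical reads: the first clonotype is seen three times, but two
    # of those reads carry the same UMI and count as one molecule.
    reads = [("AAA", "cassA"), ("AAA", "cassA"), ("AAC", "cassA"),
             ("AAG", "cassB"), ("AAT", "cassC"), ("AAT", "cassC")]
    counts = clonotype_umi_counts(reads)
    print(counts, d50(counts))
```

Counting molecules rather than reads prevents a clonotype that was preferentially amplified during PCR from appearing more abundant than it was in the starting RNA.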

Histological and immunohistochemical analyses

Tissue samples were fixed in formalin and then paraffin-embedded using standard procedures. Sections of 5 μm were prepared and attached to slides; sections including bone were demineralized. The samples were then processed for haematoxylin and eosin staining on the automated Histocore Spectra ST platform (Leica).

Immunohistochemistry was performed on the Bond III automated staining platform (Leica) using the Biosystems Refine Detection Kit (Leica). Staining utilized the following antibodies at the indicated dilution: anti-CD8 (1:200) with citrate antigen retrieval; anti-TdT (1:100) with EDTA antigen retrieval; anti-neutrophil elastase (1:200) with EDTA antigen retrieval; and anti-Notch1 intracellular domain (NICD1; 1:50) with EDTA antigen retrieval.

Y chromosome fractional analysis of transcriptionally defined clusters

To address the possibility that the well-represented DN4 unusual population (1776 cells, 14.3% of DN4 library) and DP abnormal population (2512 cells, 19.3% of DP library) in the MHC− condition represent the clonal development of a single or limited number of aberrant progenitor cell(s) during the 9-day culture, skewing of the initial HSC male-to-female cell ratio was assessed by examining the XY cell fraction in each cluster. Given that the initial seeding of 30,000 foetal liver progenitors/initial culture plate originated from a common pool, the ratio of male (XY) cells to female (XX) cells should be maintained across all MHC+ and MHC− cultures through to the isolation of phenotypically defined subsets and transcriptionally defined subsets (clusters) within. Accordingly, a skewed XY/XX ratio within a cluster may indicate non-uniform clonal expansion.

Chromosome Y transcripts likely to be expressed were determined as described in the Supplemental Information File 6 (and refs. 65,66) and the list screened against all libraries to generate a panel of 4 transcripts (Ddx3y, Eif2s3y, Kdm5d, Uty) found to be consistently expressed and detectable across all libraries and clusters. The representation of these transcripts was then used to define the presence of a Y chromosome. The reverse procedure utilizing transcripts with increased representation in XX cells was not feasible, as none of the identified skewed transcripts were detected at levels high enough or specifically enough to characterize a cell as definitively XX. Consequently, results were expressed as the fraction of cells within a cluster characterized as XY. To exclude observed skewing being a result of apoptotic or other processes occurring in a specific cluster independent of supporting stroma, a panel of genes matched to expression of the Y transcript panel was determined. This latter panel of autosomal gene transcripts (Cdk8, Slc25a5, Pank1, Dffb) provided an internal control for cluster-specific skewing within the stroma-specific libraries unrelated to aberrant clonal development (Supplemental Information File 6).
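The per-cluster XY fraction described above can be sketched in Python as follows. The calling rule used here (a cell is scored as XY when at least one transcript of the Y panel is detected) is an assumption made for illustration; the criteria actually applied are those of Supplemental Information File 6. The function names, cluster labels and counts are hypothetical.

```python
Y_PANEL = ("Ddx3y", "Eif2s3y", "Kdm5d", "Uty")


def is_xy(cell_counts, panel=Y_PANEL, min_count=1):
    """Score a cell as XY if any panel transcript reaches min_count.

    This detection rule is an assumption, not the published criterion.
    """
    return any(cell_counts.get(gene, 0) >= min_count for gene in panel)


def xy_fraction_by_cluster(cells):
    """cells -- iterable of (cluster, {gene: count}) pairs.

    Returns {cluster: fraction of cells scored as XY}.
    """
    totals = {}
    xy = {}
    for cluster, counts in cells:
        totals[cluster] = totals.get(cluster, 0) + 1
        xy[cluster] = xy.get(cluster, 0) + (1 if is_xy(counts) else 0)
    return {cluster: xy[cluster] / totals[cluster] for cluster in totals}


if __name__ == "__main__":
    cells = [
        ("DN3b/4", {"Uty": 2}),
        ("DN3b/4", {"Cdk8": 5}),
        ("DN4 unusual", {"Ddx3y": 1, "Kdm5d": 1}),
        ("DN4 unusual", {"Eif2s3y": 0}),
        ("DN4 unusual", {"Uty": 1}),
    ]
    print(xy_fraction_by_cluster(cells))
```

Applying the same function with the autosomal control panel in place of the Y panel gives the matched detection rate against which any cluster-specific skewing can be judged.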

Statistics

For all gene expression results, P represents the adjusted p value (Padj) where P<0.05 was the threshold for significance. Standard parametric statistics followed by Student's t-test and 2-tailed probabilities were used for all group comparisons. For comparison of small lists of transcripts representing gene expression levels (e.g. Trbv alleles), paired t-test or the Chi-square test (utilizing MHC+ as “expected”) with two-tailed probability were used. No sample size calculations were undertaken, no randomization of samples was performed and there was no blinding of samples.
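The Chi-square comparison with MHC+ values taken as “expected” can be illustrated by the following Python sketch of the test statistic. It is an illustration under stated assumptions rather than the authors' procedure: the expected values are used as given without rescaling to the observed total, the degrees of freedom are taken as the number of categories minus one, and the function name and input values are hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic and degrees of freedom.

    observed -- values from the test condition (e.g. MHC-)
    expected -- matched values from the reference condition (e.g. MHC+)
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have equal length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected values must be positive")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    degrees_of_freedom = len(observed) - 1
    return statistic, degrees_of_freedom


if __name__ == "__main__":
    # Hypothetical values for three transcripts.
    print(chi_square_statistic([12, 8, 10], [10, 10, 10]))
```

The P value is then read from the Chi-square distribution with the returned degrees of freedom.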

Extended Data

Extended Data Fig. 1 |. Schematic for FACS isolation of thymocyte subsets (DN3a, DN3b, DN4, DP) for 10X scRNA-Seq and single cell TCR α and β chain clonotype sequencing.

Sorted cells were isolated as DN3a cells (CD25+CD44−CD28−), DN3b cells (CD25+CD44−CD28+), DN4 cells (CD25−CD44−CD28+), and DP cells (CD4+CD8+).

Extended Data Fig. 2 |. Cluster delineation of DN3a to DPsm cell transitions.

For each transition, microarray and RNA-Seq data from the Immunological Genome Project (IGP) were used to construct a panel representing genes with the highest fold-change between phenotypically defined stages of thymocyte differentiation. The gene panel was then used to query the MHC+ thymocyte clusters identified by UMAP projection. Combination of library phenotype together with good fit to the interrogating gene panel permitted identification of cluster relationships and developmental trajectories. a. Delineation of early post-β selection checkpoint DN3a/3b thymocytes from pre-β selection checkpoint DN3a thymocytes by differential gene expression. The left-hand heatmap depicts a panel selected by comparison of DN3b thymocyte gene expression from the IGP with DN3a cell expression. The same genes were examined for expression in the clusters defined as DN3a and DN3a/3b in Fig. 1b (right-hand heatmap). The volcano plot depicts the log2-fold increase of expression in the DN3a/3b population over DN3a for the expected normal developmental trajectory (x-axis). Note that for all volcano plots reported here, only the significantly changed transcripts are depicted (Padj < 0.05; y-axis). b. Delineation of late post-β selection checkpoint DN3b/4 thymocytes from early pre-β selection checkpoint DN3a/3b thymocytes by differential gene expression. The heatmap on the far left depicts a panel selected by comparison of DN4 thymocyte gene expression from the IGP with DN3b cell expression (neither DN3a/3b nor DN3b/4 transitional states are explicitly defined in the IGP database). Transcripts in red were predicted from IGP data to be upregulated in the DN3b to DN4 transition but are downregulated for the conditions reported here. c. Delineation of late post-β selection checkpoint DN3b/4 thymocytes from DPbl thymocytes by differential gene expression. The DPbl cluster was extracted from the DP library and delineated from the more mature DPsm population by transcriptome signature as described below. d. Delineation of mature DPsm thymocytes from cycling DPbl thymocytes by differential gene expression. The heatmap on the far left depicts a panel selected by comparison of DPsm thymocyte gene expression from the IGP with DPbl cell expression. Note that during the DPbl to DPsm transition, significant cell cycling transcripts were downregulated; thus, significantly upregulated transcripts in the volcano plot represent the DPbl cells.

Extended Data Fig. 3 |. Delineating the ILC-γ/δ TCR thymocyte cluster and pro-apoptotic cluster from the main α/β TCR lineage pathway.

a. Distinguishing ILC-γ/δ-like cells from DN3b/4 in the DN4 libraries by gene expression. The heatmap on the left shows a manually curated panel of gene transcripts selected by likely high representation in either DN3b/4 or ILC-γ/δ-like cells. Log2 fold-change (L2FC) and Padj in the DN4 libraries for differential expression between the DN3b/4 clusters and ILC-γ/δ-like clusters are shown in the volcano plot to the right, with transcripts associated with ILC development highlighted in light purple (Id2, Zbtb16, Gata3, Rora). TCR γ and δ transcripts are highlighted in green, and Trbv transcripts highlighted in blue. b. Gene expression profile of the pro-apoptotic cluster. The dominant pro-apoptotic cluster upregulated gene expression changes are similar between all the MHC+ libraries on comparison with the 2 dominant clusters within each of these libraries. All log2-fold changes (L2FC) are relative only to the 3 clusters listed in each heatmap (i.e. local) and not to the average across all clusters in that library.

Extended Data Fig. 4 |. Development and TCR repertoire analyses for cells growing on MHC+, MHC− and scH-2Kb stromal support cells.

a. Total cell recoveries after 9 d of development from 2,000 seeded HSC (representative of 6 experiments examining MHC+ (n = 5), MHC− (n = 6), and scH-2Kb (n = 3)). For all box plots, the box bounds the 1st to 3rd quartiles; where visible, the dotted line within represents the mean, and the solid line represents the median. Whiskers above and below (maximum and minimum) are defined as (quartile 3 + 1.5 × interquartile range) and (quartile 1 − 1.5 × interquartile range), respectively. P = 0.0204, determined by two-tailed t test. b. Apparent thymocyte developmental stage representation as fraction of total cells for cultures represented in panel a. c. Stage-specific analysis of β chain clonotype representation/10,000 cells in d9 MHC+, MHC−, and scH-2Kb OP9-DL4 development cultures. Data are from replicate experiments of those shown in Fig. 2c–f. d. TCR β chain clonotype diversity at DN4 on MHC+, MHC−, and scH-2Kb stroma. The total number of TCR β chain clonotypes (black) recovered from 10⁴ cells of each DN4 population isolated after growth for 9 d on the varying OP9-DL4 stroma is represented by an ellipse of area in direct proportion to unique clonotype count (5 independent experiments). Percentage shared clonotypes of the total for each condition (MHC+ in blue, MHC− in pink, and scH-2Kb in green) is depicted. Note that the area of overlap only approximates degree of sharing to maintain consistent orientation of the ellipses for presentation. The overlap of MHC+ and scH-2Kb for experiments 4 and 5 is <1% and too small to represent in this format. Statistics and P calculated from two-tailed t test presented on left.

Extended Data Fig. 5 |. Transcriptome and selected phenotype comparison of MHC+ and MHC− OP9-DL4 cells and select gene expression profiles for the DN4 unusual and DPbl abnormal populations.

a. Comparison of MHC+ and MHC− OP9-DL4 stromal cells for transcriptome and phenotypic differences. 93.6% of transcripts detected are shared by MHC+ and MHC− stroma. b. Correlation between cell transcriptomes. Square of two-tailed Pearson correlation coefficient (R² = 0.958), ideally greater than 0.92 under optimal experimental conditions. c. Differential gene expression is <4% of all transcripts detected. d. Loss of CD1d surface expression in B2m/Tap2 KO MHC− OP9-DL4 and confirmation of lack of MHC Class II expression in MHC+ and MHC− OP9-DL4. e. Raet expression in MHC+ and MHC− OP9-DL4. f. Select transcripts significantly differentially expressed between the MHC− DN3b/4 cluster and the DN4 “unusual” cluster. Heatmap depicts log2-fold change (L2FC) of the DN4 “unusual” cluster relative to the DN3b/4 cluster. Actual L2FC values are listed within the heatmap. g. Co-expression of Cd4 transcript with Cd8a and/or Cd8b1 transcripts in an overlay of the MHC− libraries focussed on the DN4 unusual, DPbl, and DP abnormal clusters. h. Characteristic myeloid gene transcript expression maps to the MHC− DP abnormal cluster. i. Full-length clonotypic TCR β chain transcript expression in 82 of 221 Spi1+ cells (37.1%) in the MHC− DP abnormal cluster. j. Mpo-expressing cells in the DP abnormal cluster and the Mpo+Spi1+ subset co-express T lineage Lck and/or Cd3e. k. DP cell yields after 12 d for DN3a and DN4 cells seeded onto MHC+ or MHC− stromal cells. l. Relative expression by qRT-PCR (normalised to Actb = 1000) of Mpo and Spi1 in DP cells developing from DN3a cells seeded 12 d earlier onto MHC+ or MHC− stromal cells (cells pooled from 3 separate cultures; n = 7 qRT-PCR replicates; Mpo: P < 0.00001, Spi1: P = 0.000655). m. Relative expression (normalised to Actb = 1000) of Mpo and Spi1 in DP cells developing from DN4 cells seeded 12 d earlier onto MHC+ or MHC− stromal cells (cells pooled from 3 separate cultures; Mpo: n = 8 qRT-PCR replicates; Spi1: n = 4 qRT-PCR replicates; Mpo: P = 0.000013, Spi1: P = 0.000233). For l, m: mean ± s.d.; P from two-tailed t-test; representative of 2 independent experiments.

Extended Data Fig. 6 |. Highly proliferating clonotypic progeny cluster together by transcriptional signature.

a. MHC+ DN4 20 most highly represented clonotypes by cell number. b. MHC− DN4 20 most highly represented clonotypes by cell number. The identical MHC+ and MHC− DN4 clonotypic cells to those presented in Fig. 3d and Extended Data Table 3a are shown in their mapped positions in the UMAP projection. Each clonotype is represented for each panel in a unique colour with cell number indicated in key. Note that colours are not directly related to those used in Fig. 3d.

Extended Data Fig. 7 |. Transcriptome comparison of DN and ISP thymocyte subsets from MHC+ and MHC− mice.

a. Thymocyte subset cell recoveries from thymi of MHC+ and MHC− mice. Mean ± s.d. shown; 3 mice/group; ** p = 0.0169; *** p < 0.0039; **** p < 0.0003 determined by 2-tailed t-test. b. Log2-fold change in expression from global population mean for the MHC knocked out genes (B2m, H2-Ab1), classical and minor MHCI genes, and MHCII genes. Note that for each thymocyte subset there are 3 replicates except for the MHC− DN4 cells, for which there are duplicates. Asterisks highlight transcripts that are upregulated across all MHC− libraries on comparison with MHC+: Q10 (p = 7 × 10⁻⁵), H2-T3 (TL) (p = 3 × 10⁻⁷), H2-T22 (p = 1 × 10⁻⁷) and H2-T-ps (p = 4 × 10⁻⁵). P calculated using two-tailed Chi-square test. c. Log2-fold change in expression of all development stage marker genes depicted in Fig. 1a. d. Log2-fold change in TCR Vβ chain segment (Trbv) expression. Mean depicted of triplicates for all libraries except for duplicates for MHC− DN4 samples. e. Log2-fold change in Bcl2a1 family transcripts (upper panel), canonical Bcl2 transcripts (middle panel), and Pim1 protooncogene (lower panel). Mean values presented. f. Log2-fold change in TCR Vα chain segment (Trav) expression. Mean depicted of triplicates for all libraries except for duplicates for MHC− DN4 samples. g. Display of haematopoietic/immune organs from an MHC dKO mouse with massive thymic growth at 15 months and from age-matched MHC+ control. h. FACS analysis of single cell thymic and splenocyte suspensions stained for CD4 and CD8. Numbers next to gates indicate % of cells in that gate. i. Haematoxylin and eosin staining of representative organs from an age-matched MHC+ wt B6 mouse and an MHC dKO mouse with leukaemic growth. Thymic cortex indicated by ‘c’, and thymic medulla by ‘m’. Cancellous bone indicated by ‘ca’. Arrow indicates leukaemic cell accumulation adjacent and around a hepatic vein. j. Immunohistochemistry of tumour cells in dKO thymus for TdT (immature thymocytes), CD8 (T lineage) and neutrophil elastase (myeloid lineage) and in dKO spleen metastatic focus for the intracellular domain of Notch 1 (NICD1). i, j: For each tissue and condition, the complete section was examined down to the cellular level and the image presented (~1% of each total section) is representative of that complete section. White bar in all images represents 100 μm.

Extended Data Table 1. |. Top 20 DPsm TCR β clonotypes developing on scH-2Kb stroma at d9.

For 3 pairs of CDR3, the pair members are separate clonotypes developing from unique nucleotide sequences. The Count parameter represents the number of high quality, validated reads for a unique clonotype. Clone I.D.’s 14 and 15 encode a sequence already established as the CDR3 of the N15 TCR β chain known to recognize VSV8 peptide presented by H-2Kb both as a mature TCR including the N15 α chain and as a component of the preTCR10.

I.D. Count Trbv Trbd Trbj Trbc sequence notes
01. 12 Trbv15 Trbd1/Trbd2 Trbj2-3 Trbc2 cassfgttsaetlyf
02. 9 Trbv3 Trbd1 Trbj2-3 Trbc2 cassrdrgaetlyf L12 encoded by ctg
03. 1 Trbv3 Trbd1 Trbj2-3 Trbc2 cassrdrgaetlyf L12 encoded by ttg
04. 7 Trbv13-2 Trbd2 Trbj2-7 Trbc2 casgvglggeqyf
05. 6 Trbv1 Trbd2 Trbj2-1 Trbc2 ctcsagtgnyaeqff
06. 6 Trbv13-1 Trbd1 Trbj1-4 Trbc2 cassdgtgnerlff
07. 6 Trbv13-2 Trbd1 Trbj2-3 Trbc2 casgdrttsaetlyf
08. 6 Trbv17 Trbd2 Trbj2-3 Trbc2 casrpglggaetlyf
09. 6 Trbv29 Trbd2 Trbj2-3 Trbc2 casslgwgaetlyf
10. 6 Trbv3 Trbd1 Trbj1-3 Trbc2 casswtnsgntlyf
11. 6 Trbv4 Trbd1 Trbj1-1 Trbc2 cassrqgaevff A8 encoded by gca
12. 1 Trbv4 Trbd1 Trbj1-1 Trbc2 cassrqgaevff A8 encoded by gcg
13. 5 Trbv1 Trbd1/Trbd2 Trbj2-3 Trbc2 ctcsadwggaetlyf
14. 5 Trbv12-1 Trbd2 Trbj2-7 Trbc2 casslrwgdeqyf S3 encoded by agc
15. 1 Trbv12-1 Trbd2 Trbj2-7 Trbc2 casslrwgdeqyf S3 encoded by agt
16. 5 Trbv13-3 Trbd1 Trbj1-4 Trbc2 casrpgqgerlff
17. 5 Trbv15 Trbd1 Trbj1-1 Trbc2 casslrgtevff
18. 5 Trbv15 Trbd1 Trbj1-5 Trbc2 casstgaplf
19. 5 Trbv19 Trbd2 Trbj2-3 Trbc2 cassiwggsaetlyf
20. 5 Trbv2 Trbd1 Trbj2-7 Trbc2 cassqgqgeqyf

Extended Data Table 2. |. Non-classical MHC class I expression in MHC+ and MHC- OP9-DL4.

Sixty-one non-classical MHC genes are listed67. Dependence on β2m indicated by ‘+’ or ‘no’. Genes labeled as pseudogene (Ps) may result in transcripts, initially classified as non-coding but subsequently found to be protein coding, as in the instances of H2-Q5, H2-Q10, H2-T1, H2-T4, H2-T12, H2-T13, H2-T14, H2-M10.4, H2-M10.6. H2-T-Ps has been provisionally redefined as protein coding. Genes in bold font are not β2m-dependent and have transcript levels above zero measured as transcripts per million (tpm). Small panel at bottom right presents expression data for the CRISPR/Cas9 targets B2m and Tap2 deleted in the MHC− variant, Delta-like ligand 4, and major MHCI alleles.

OP9-DL4 (tpm)

Gene β2m Ps MHC+ MHC−

H2-Q1 + 33 32
H2-Q2 + 0 0
H2-Q3 + 0 0
H2-Q4 + 86 2
H2-Q5 + 0 0
H2-Q6 + 0 0
H2-Q7 + 1 0
H2-Q8 + 0 0
H2-Q9 + 0 0
H2-Q10 + 5 4
H2-T1 + 0 0
H2-T2 + 0 0
H2-T3 (TL) + 0 0
H2-T4 + 0 0
H2-T5 + 0 0
H2-T6 + 0 0
H2-T7 + 0 0
H2-T8 + 0 0
H2-T9 + 0 0
H2-T10 + 51 13
H2-T-ps (+) (+) 0 0
H2-T12 + 0 0
H2-T13 + 0 0
H2-T14 + 0 0
H2-T15 + 0 0
H2-T16 + 0 0
H2-T17 + 0 0
H2-T18 + 0 0
H2-T19 + 0 0
H2-T20 + 0 0
H2-T21 + 0 0
H2-T22 + 1161 753
H2-T23 (Qa-1b) + 404 315
H2-T24 + 21 47
H2-Bl + 0 0
H2-M1 + 0 0
H2-M2 + 0 0
H2-M3 + 90 74
H2-M4-ps + + 0 0
H2-M5 + 41 46
H2-M6-ps + + 4 2
H2-M7-ps + + 0 41
H2-M8-ps + + 0 0
H2-M9 + 0 0
H2-M10.1 + 0 0
H2-M10.2 + 0 0
H2-M10.3 + 0 0
H2-M10.4 + 0 0
H2-M10.5 + 0 0
H2-M10.6 + 0 0
Cd1d1 + 112 67
Cd1d2 + 0 0
Raet1a no 0 0
Raet1b no 0 0
Raet1c no 0 0
Raet1d no 109 309
Raet1e no 1128 1100
Fcgrt + 1867 1093
Hfe + 954 1560
Azgp1 no 0 0
Mr1 + 798 571
Other significant transcripts

B2m mutated 9589 2003
Tap2 mutated 1196 630
Dll4 transduced 33378 55021
H2-K1 + 327 95
H2-D1 + 573 481

Extended Data Table 3. |. Well-represented clonotypes in MHC+ and MHC− libraries.

a. Top 20 clonotypes proliferating in the DN4 libraries of cells developing on MHC+ and MHC− stroma. Frequency refers to cell number expressing the identical clonotype. Clonotypes present in the DN4 unusual cluster are highlighted in bold. Note that the Trbv12-2/13-2 transcript is not a mix of Trbv12-2 and Trbv13-2 but rather the result of an independent recombination event between the 5’ end of Trbv12-2 and the 3’ end of Trbv13-2, an event recently and frequently detected in 10X TCR single cell repertoire analyses. b. Top 20 clonotypes proliferating in the DP libraries of cells developing on MHC+ and MHC− stroma. Clonotypes present in the DP abnormal cluster are highlighted in bold.


a MHC+ DN4
MHC− DN4
frequency proportion CDR3 amino acid CDR3 length Trbv frequency proportion CDR3 amino acid CDR3 length Trbv


5 0.00197 cassqdrantevff 14 Trbv5 21 0.00745 ctcsadwgganqdtqyf 17 Trbv1
4 0.00158 casswdnyaeqff 13 Trbv10 17 0.00603 ctcsfglggeqyf 13 Trbv1
4 0.00158 cassqgqgantevff 15 Trbv2 15 0.00532 casrdsqntlyf 12 Trbv19
4 0.00158 cgargaevff 10 Trbv20 14 0.00497 casgdetgvsyeqyf 15 Trbv12-2
3 0.00118 casgdagqgggaetlyf 17 Trbv12-2/13-2 13 0.00461 casgdstggdqdtqyf 16 Trbv12-2
3 0.00118 casrdrssyeqyf 13 Trbv13-1 13 0.00461 cassfpasqntlyf 14 Trbv14
3 0.00118 cassdrgvsnerlff 15 Trbv13-3 13 0.00461 cassqdwlnqdtqyf 15 Trbv5
3 0.00118 cassfglggreqyf 14 Trbv15 12 0.00426 casslsrgeqyf 12 Trbv29
3 0.00118 cassihsgntlyf 13 Trbv19 11 0.00390 casrrtggagaeqff 15 Trbv26
3 0.00118 casspgqgadtgqlyf 16 Trbv19 10 0.00354 cassynsgntlyf 13 Trbv10
3 0.00118 cassqdraetlyf 13 Trbv2 10 0.00354 casslednyaeqff 14 Trbv29
3 0.00118 cassqghqntlyf 13 Trbv2 9 0.00319 cassegatevff 12 Trbv13-3
3 0.00118 cassqdrgeqyf 12 Trbv2 9 0.00319 casslrentlyf 12 Trbv12-2
3 0.00118 cgardtntevff 12 Trbv20 9 0.00319 casgegrdfqdtqyf 15 Trbv12-2
3 0.00118 cgardtnsdytf 12 Trbv20 9 0.00319 casgeqyf 8 Trbv12-2
3 0.00118 cgartggyeqyf 12 Trbv20 9 0.00319 cassqewgvqdtqyf 15 Trbv5
3 0.00118 cgardrgreqyf 12 Trbv20 8 0.00283 cassdgtgasaetlyf 16 Trbv13-1
3 0.00118 cassgtyeqyf 11 Trbv26 8 0.00283 cassgtyeqyf 11 Trbv13-1
3 0.00118 casslgqganerlff 15 Trbv3 8 0.00283 cawssgtggyeqyf 14 Trbv31
3 0.00118 cassladwgdtqyf 14 Trbv3 8 0.00283 casspdrgpevff 13 Trbv5

b MHC+ DP
MHC− DP
frequency proportion CDR3 amino acid CDR3 length Trbv frequency proportion CDR3 amino acid CDR3 length Trbv


5 0.00197 casslqgansdytf 14 Trbv12-2 14 0.00152 casgdaggtgqlyf 14 Trbv12-2/13-2
4 0.00158 ctcsaagtgfnerlff 16 Trbv1 14 0.00152 cassrdrgqdtqyf 14 Trbv17
4 0.00158 cassrqgantevff 14 Trbv10 13 0.00141 cassqqgantevff 14 Trbv5
4 0.00158 cgardtntevff 12 Trbv20 12 0.00131 cassrrantevff 13 Trbv12-2
3 0.00118 casgedtnsdytf 13 Trbv12-2/13-2 11 0.00120 casslgtggeqyf 13 Trbv15
3 0.00118 casgddsqntlyf 13 Trbv12-2/13-2 10 0.00109 casgdnsplyf 11 Trbv12-2/13-2
3 0.00118 casslrggggaetlyf 16 Trbv15 10 0.00109 cgarghtevff 11 Trbv20
3 0.00118 cassqdwggyeqyf 14 Trbv2 10 0.00109 cassrdtntevff 13 Trbv3
3 0.00118 casssgtgdnqaflf 15 Trbv29 9 0.00098 cassldrgrqntlyf 15 Trbv12-2
3 0.00118 casslgdsnerlff 14 Trbv3 9 0.00098 cassdgtantevff 14 Trbv13-3
3 0.00118 cawsltgqlyf 11 Trbv31 9 0.00098 cgardrantevff 13 Trbv20
3 0.00118 cassqdntevff 12 Trbv5 8 0.00087 ctcsaqantevff 13 Trbv1
3 0.00118 cassldssyeqyf 13 Trbv12-2 8 0.00087 ctcsadqgaetlyf 14 Trbv1
3 0.00118 casslgtgedtqyf 14 Trbv16 8 0.00087 cassldrdrgaeqff 15 Trbv16
3 0.00118 cassqedrdtevff 12 Trbv2 8 0.00087 cassldwggaeqff 14 Trbv16
3 0.00118 cassqdeqyf 10 Trbv2 8 0.00087 casrqgagqlyf 12 Trbv19
3 0.00118 cgardtgsdytf 12 Trbv20 8 0.00087 cassqdtnsdytf 13 Trbv2
3 0.00118 cassrdwgyeqyf 13 Trbv3 8 0.00087 cassqqgaevff 12 Trbv5
3 0.00118 cassqdrgqntlyf 14 Trbv5 7 0.00076 ctcsadqdtqyf 12 Trbv1
3 0.00118 cassqdssyeqyf 13 Trbv5 7 0.00076 cassldsqntlyf 13 Trbv14

Acknowledgements

This research was supported by NIH NIAID grant AI136301. CMM and PHL were supported additionally by the Expect Miracles Foundation and the Robert and Renée Belfer Foundation. We gratefully acknowledge Dr. Jia-huai Wang for scientific discussion and insight, Drs. David A. Barbie and Cloud P. Paweletz (Robert and Renée Belfer Center for Applied Cancer Research, Dana-Farber Cancer Institute) for facilitating the scRNA-Seq analyses, and the Dana-Farber/Harvard Cancer Center Specialized Histopathology Core (NIH NCI grant P30 CA006516-57). We thank Steve Moskovitz for graphic design of figures.

Footnotes

Competing interests

The authors declare no competing interests.

Additional information

Supplementary Information is available for this paper. Correspondence and requests for materials should be addressed to: jonathan_duke-cohan@dfci.harvard.edu, ellis_reinherz@dfci.harvard.edu

Data availability

All sequence files are deposited in the NCBI Gene Expression Omnibus (GEO) under accession GSE186049. Data from the Immunological Genome Project are available at https://www.immgen.org/ and from MSigDB at https://www.gsea-msigdb.org/gsea/msigdb/.

References

  • 1.Hosokawa H & Rothenberg EV How transcription factors drive choice of the T cell fate. Nat Rev Immunol 21, 162–176, doi: 10.1038/s41577-020-00426-6 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Koch U et al. Delta-like 4 is the essential, nonredundant ligand for Notch1 during thymic T cell lineage commitment. J Exp Med 205, 2515–2523, doi: 10.1084/jem.20080829 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Rodewald HR, Ogawa M, Haller C, Waskow C & DiSanto JP Pro-thymocyte expansion by c-kit and the common cytokine receptor gamma chain is essential for repertoire formation. Immunity 6, 265–272, doi: 10.1016/s1074-7613(00)80329-5 (1997). [DOI] [PubMed] [Google Scholar]
  • 4.Shortman K, Egerton M, Spangrude GJ & Scollay R The generation and fate of thymocytes. Semin Immunol 2, 3–12 (1990). [PubMed] [Google Scholar]
  • 5.Kreslavsky T et al. beta-Selection-induced proliferation is required for alphabeta T cell differentiation. Immunity 37, 840–853, doi: 10.1016/j.immuni.2012.08.020 (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.von Boehmer H The thymus in immunity and in malignancy. Cancer Immunol Res 2, 592–597, doi: 10.1158/2326-6066.CIR-14-0070 (2014). [DOI] [PubMed] [Google Scholar]
  • 7.Das DK et al. Pre-T Cell Receptors (Pre-TCRs) Leverage Vbeta Complementarity Determining Regions (CDRs) and Hydrophobic Patch in Mechanosensing Thymic Self-ligands. J Biol Chem 291, 25292–25305, doi: 10.1074/jbc.M116.752865 (2016). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Li X et al. Pre-T cell receptors topologically sample self-ligands during thymocyte beta-selection. Science 371, 181–185, doi: 10.1126/science.abe0918 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Mallis RJ, Arthanari H, Lang MJ, Reinherz EL & Wagner G NMR-directed design of pre-TCRbeta and pMHC molecules implies a distinct geometry for pre-TCR relative to alphabetaTCR recognition of pMHC. J Biol Chem 293, 754–766, doi: 10.1074/jbc.M117.813493 (2018). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Mallis RJ et al. Pre-TCR ligand binding impacts thymocyte development before alphabetaTCR expression. Proc Natl Acad Sci U S A 112, 8373–8378, doi: 10.1073/pnas.1504971112 (2015). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Davis MM & Bjorkman PJ T-cell antigen receptor genes and T-cell recognition. Nature 334, 395–402, doi: 10.1038/334395a0 (1988). [DOI] [PubMed] [Google Scholar]
  • 12.Rudolph MG, Stanfield RL & Wilson IA How TCRs bind MHCs, peptides, and coreceptors. Annu Rev Immunol 24, 419–466, doi: 10.1146/annurev.immunol.23.021704.115658 (2006). [DOI] [PubMed] [Google Scholar]
  • 13.Wang JH & Reinherz EL The structural basis of alphabeta T-lineage immune recognition: TCR docking topologies, mechanotransduction, and co-receptor function. Immunol Rev 250, 102–119, doi: 10.1111/j.1600-065X.2012.01161.x (2012). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Saint-Ruf C et al. Analysis and expression of a cloned pre-T cell receptor gene. Science 266, 1208–1212, doi: 10.1126/science.7973703 (1994). [DOI] [PubMed] [Google Scholar]
  • 15. Xiong J, Armato MA & Yankee TM Immature single-positive CD8+ thymocytes represent the transition from Notch-dependent to Notch-independent T-cell development. Int Immunol 23, 55–64, doi: 10.1093/intimm/dxq457 (2011).
  • 16. Petrie HT et al. Multiple rearrangements in T cell receptor alpha chain genes maximize the production of useful thymocytes. J Exp Med 178, 615–622, doi: 10.1084/jem.178.2.615 (1993).
  • 17. Shinkai Y et al. Restoration of T cell development in RAG-2-deficient mice by functional TCR transgenes. Science 259, 822–825, doi: 10.1126/science.8430336 (1993).
  • 18. Wilson A, Held W & MacDonald HR Two waves of recombinase gene expression in developing thymocytes. J Exp Med 179, 1355–1360, doi: 10.1084/jem.179.4.1355 (1994).
  • 19. Klein L, Kyewski B, Allen PM & Hogquist KA Positive and negative selection of the T cell repertoire: what thymocytes see (and don’t see). Nat Rev Immunol 14, 377–391, doi: 10.1038/nri3667 (2014).
  • 20. Fehling HJ, Krotkova A, Saint-Ruf C & von Boehmer H Crucial role of the pre-T-cell receptor alpha gene in development of alpha beta but not gamma delta T cells. Nature 375, 795–798, doi: 10.1038/375795a0 (1995).
  • 21. Grusby MJ et al. Mice lacking major histocompatibility complex class I and class II molecules. Proc Natl Acad Sci U S A 90, 3913–3917, doi: 10.1073/pnas.90.9.3913 (1993).
  • 22. Irving BA, Alt FW & Killeen N Thymocyte development in the absence of pre-T cell receptor extracellular immunoglobulin domains. Science 280, 905–908, doi: 10.1126/science.280.5365.905 (1998).
  • 23. Koller BH, Marrack P, Kappler JW & Smithies O Normal development of mice deficient in beta 2M, MHC class I proteins, and CD8+ T cells. Science 248, 1227–1230, doi: 10.1126/science.2112266 (1990).
  • 24. Mizsei R et al. A general chemical crosslinking strategy for structural analyses of weakly interacting proteins applied to preTCR-pMHC complexes. J Biol Chem 296, 100255, doi: 10.1016/j.jbc.2021.100255 (2021).
  • 25. Xiao SY, Li Y & Chen WF Kinetics of thymocyte developmental process in fetal and neonatal mice. Cell Res 13, 265–273, doi: 10.1038/sj.cr.7290171 (2003).
  • 26. Mingueneau M et al. The transcriptional landscape of alphabeta T cell differentiation. Nat Immunol 14, 619–632, doi: 10.1038/ni.2590 (2013).
  • 27. Allman D et al. Separation of Notch1 promoted lineage commitment and expansion/transformation in developing T cells. J Exp Med 194, 99–106, doi: 10.1084/jem.194.1.99 (2001).
  • 28. Fujita T, Yuno M, Okuzaki D, Ohki R & Fujii H Identification of non-coding RNAs associated with telomeres using a combination of enChIP and RNA sequencing. PLoS One 10, e0123387, doi: 10.1371/journal.pone.0123387 (2015).
  • 29. Lin YW & Aplan PD Gene expression profiling of precursor T-cell lymphoblastic leukemia/lymphoma identifies oncogenic pathways that are potential therapeutic targets. Leukemia 21, 1276–1284, doi: 10.1038/sj.leu.2404685 (2007).
  • 30. Li R & Guan MX Human mitochondrial leucyl-tRNA synthetase corrects mitochondrial dysfunctions due to the tRNALeu(UUR) A3243G mutation, associated with mitochondrial encephalomyopathy, lactic acidosis, and stroke-like symptoms and diabetes. Mol Cell Biol 30, 2147–2154, doi: 10.1128/MCB.01614-09 (2010).
  • 31. Coustan-Smith E et al. Early T-cell precursor leukaemia: a subtype of very high-risk acute lymphoblastic leukaemia. Lancet Oncol 10, 147–156, doi: 10.1016/S1470-2045(08)70314-0 (2009).
  • 32. Vadillo E, Dorantes-Acosta E, Pelayo R & Schnoor M T cell acute lymphoblastic leukemia (T-ALL): New insights into the cellular origins and infiltration mechanisms common and unique among hematologic malignancies. Blood Rev 32, 36–51, doi: 10.1016/j.blre.2017.08.006 (2018).
  • 33. Dai Y-T et al. Transcriptome-wide subtyping of pediatric and adult T cell acute lymphoblastic leukemia in an international study of 707 cases. Proc Natl Acad Sci U S A 119, e2120787119, doi: 10.1073/pnas.2120787119 (2022).
  • 34. Pellicci DG, Koay HF & Berzins SP Thymic development of unconventional T cells: how NKT cells, MAIT cells and gammadelta T cells emerge. Nat Rev Immunol 20, 756–770, doi: 10.1038/s41577-020-0345-y (2020).
  • 35. Thoms JAI et al. Disruption of a GATA2, TAL1, ERG regulatory circuit promotes erythroid transition in healthy and leukemic stem cells. Blood, doi: 10.1182/blood.2020009707 (2021).
  • 36. Ng SW et al. A 17-gene stemness score for rapid determination of risk in acute leukaemia. Nature 540, 433–437, doi: 10.1038/nature20598 (2016).
  • 37. Mandal M et al. The BCL2A1 gene as a pre-T cell receptor-induced regulator of thymocyte survival. J Exp Med 201, 603–614, doi: 10.1084/jem.20041924 (2005).
  • 38. Koyasu S et al. Pre-TCR signaling components trigger transcriptional activation of a rearranged TCR alpha gene locus and silencing of the pre-TCR alpha locus: implications for intrathymic differentiation. Int Immunol 9, 1475–1480, doi: 10.1093/intimm/9.10.1475 (1997).
  • 39. Amson R et al. The human protooncogene product p33pim is expressed during fetal hematopoiesis and in diverse leukemias. Proc Natl Acad Sci U S A 86, 8857–8861, doi: 10.1073/pnas.86.22.8857 (1989).
  • 40. Reinherz EL, Kung PC, Goldstein G, Levey RH & Schlossman SF Discrete stages of human intrathymic differentiation: analysis of normal thymocytes and leukemic lymphoblasts of T-cell lineage. Proc Natl Acad Sci U S A 77, 1588–1592, doi: 10.1073/pnas.77.3.1588 (1980).
  • 41. Van Vlierberghe P & Ferrando A The molecular basis of T cell acute lymphoblastic leukemia. J Clin Invest 122, 3398–3406, doi: 10.1172/JCI61269 (2012).
  • 42. Girardi T, Vicente C, Cools J & De Keersmaecker K The genetics and molecular biology of T-ALL. Blood 129, 1113–1123, doi: 10.1182/blood-2016-10-706465 (2017).
  • 43. Zhang J et al. The genetic basis of early T-cell precursor acute lymphoblastic leukaemia. Nature 481, 157–163, doi: 10.1038/nature10725 (2012).
  • 44. Condorelli GL et al. T-cell-directed TAL-1 expression induces T-cell malignancies in transgenic mice. Cancer Res 56, 5113–5119 (1996).
  • 45. Kelliher MA, Seldin DC & Leder P Tal-1 induces T cell acute lymphoblastic leukemia accelerated by casein kinase IIalpha. EMBO J 15, 5160–5166 (1996).
  • 46. De Keersmaecker K et al. The TLX1 oncogene drives aneuploidy in T cell transformation. Nat Med 16, 1321–1327, doi: 10.1038/nm.2246 (2010).
  • 47. Rakowski LA, Lehotzky EA & Chiang MY Transient responses to NOTCH and TLX1/HOX11 inhibition in T-cell acute lymphoblastic leukemia/lymphoma. PLoS One 6, e16761, doi: 10.1371/journal.pone.0016761 (2011).
  • 48. Martins VC et al. Cell competition is a tumour suppressor mechanism in the thymus. Nature 509, 465–470, doi: 10.1038/nature13317 (2014).
  • 49. Paiva RA et al. Self-renewal of double-negative 3 early thymocytes enables thymus autonomy but compromises the beta-selection checkpoint. Cell Rep 35, 108967, doi: 10.1016/j.celrep.2021.108967 (2021).
  • 50. Khan M, Siddiqi R & Naqvi K An update on classification, genetics, and clinical approach to mixed phenotype acute leukemia (MPAL). Ann Hematol 97, 945–953, doi: 10.1007/s00277-018-3297-6 (2018).
  • 51. Kai T & Spradling A Differentiating germ cells can revert into functional stem cells in Drosophila melanogaster ovaries. Nature 428, 564–569, doi: 10.1038/nature02436 (2004).
  • 52. Cobaleda C, Jochum W & Busslinger M Conversion of mature B cells into T cells by dedifferentiation to uncommitted progenitors. Nature 449, 473–477, doi: 10.1038/nature06159 (2007).
  • 53. Laiosa CV, Stadtfeld M, Xie H, de Andres-Aguayo L & Graf T Reprogramming of committed T cell progenitors to macrophages and dendritic cells by C/EBP alpha and PU.1 transcription factors. Immunity 25, 731–744, doi: 10.1016/j.immuni.2006.09.011 (2006).
  • 54. Riddell J et al. Reprogramming committed murine blood cells to induced hematopoietic stem cells with defined factors. Cell 157, 549–564, doi: 10.1016/j.cell.2014.04.006 (2014).
  • 55. Jacobs H et al. Oncogenic potential of a pre-T cell receptor lacking the TCR beta variable domain. Oncogene 12, 2089–2099 (1996).
  • 56. Charnley M, Ludford-Menting M, Pham K & Russell SM A new role for Notch in the control of polarity and asymmetric cell division of developing T cells. J Cell Sci 133, doi: 10.1242/jcs.235358 (2019).
  • 57. Mohtashami M et al. Direct comparison of Dll1- and Dll4-mediated Notch activation levels shows differential lymphomyeloid lineage commitment outcomes. J Immunol 185, 867–876, doi: 10.4049/jimmunol.1000782 (2010).
  • 58. Mamedov IZ et al. Preparing unbiased T-cell receptor and antibody cDNA libraries for the deep next generation sequencing profiling. Front Immunol 4, 456, doi: 10.3389/fimmu.2013.00456 (2013).
  • 59. Bolotin DA et al. MiXCR: software for comprehensive adaptive immunity profiling. Nat Methods 12, 380–381, doi: 10.1038/nmeth.3364 (2015).
  • 60. Shugay M et al. VDJtools: Unifying Post-analysis of T Cell Receptor Repertoires. PLoS Comput Biol 11, e1004503, doi: 10.1371/journal.pcbi.1004503 (2015).
  • 61. Wang X, Spandidos A, Wang H & Seed B PrimerBank: a PCR primer database for quantitative gene expression analysis, 2012 update. Nucleic Acids Res 40, D1144–D1149, doi: 10.1093/nar/gkr1013 (2012).
  • 62. Ruijter JM et al. Evaluation of qPCR curve analysis methods for reliable biomarker discovery: bias, resolution, precision, and implications. Methods 59, 32–46, doi: 10.1016/j.ymeth.2012.08.011 (2013).
  • 63. Shugay M et al. Towards error-free profiling of immune repertoires. Nat Methods 11, 653–655, doi: 10.1038/nmeth.2960 (2014).
  • 64. Han FF et al. Profiling the pattern of human TRB/IGH-CDR3 repertoire in liver transplantation patients via high-throughput sequencing analysis. Scand J Immunol 92, e12912, doi: 10.1111/sji.12912 (2020).
  • 65. Stevant I et al. Dissecting Cell Lineage Specification and Sex Fate Determination in Gonadal Somatic Cells Using Single Cell Transcriptomics. Cell Rep 26, 3272–3283.e3, doi: 10.1016/j.celrep.2019.02.069 (2019).
  • 66. Godfrey AK et al. Quantitative analysis of Y-chromosome gene expression across 36 human tissues. Genome Res 30, 860–873, doi: 10.1101/gr.261248.120 (2020).
  • 67. Forman J & Fischer Lindahl K Listing, Location, Binding Motifs, and Expression of Nonclassical Class I and Related Genes and Molecules. Curr Protoc Immunol 49, A.1M.1–A.1M.13 (2002).

Associated Data

Supplementary Materials

1855620_SI_Dataset_6
1855620_SI_Guide
1855620_SI_Dataset_5
1855620_SI_Dataset_1
1855620_SI_Dataset_4
1855620_SI_Dataset_3
1855620_SI_Dataset_2
1855620_SD_Fig_4e
1855620_SD_Fig_3e
1855620_SD_ED_Fig_7
1855620_SD_ED_Fig_5
1855620_SD_ED_Fig_4

Data Availability Statement

All sequence files are deposited in the NCBI Gene Expression Omnibus (GEO) under accession GSE186049. Data from the Immunological Genome Project are available at https://www.immgen.org/ and data from MSigDB are available at https://www.gsea-msigdb.org/gsea/msigdb/.
