Immunology. 2020 Oct 7;161(4):354–363. doi: 10.1111/imm.13256

Preselection TCR repertoire predicts CD4+ and CD8+ T‐cell differentiation state

Xianliang Hou 1,2, Wenbiao Chen 1, Xujun Zhang 1, Guangyu Wang 2, Jianing Chen 1, Ping Zeng 1, Xuyan Fu 1, Qiong Zhang 1, Xiangdong Liu 3, Hongyan Diao 1
PMCID: PMC7692249  PMID: 32875554

Four T‐cell subsets were distinguished from one another in both preselection and post‐selection repertoires.

Keywords: cell subsets, deep sequencing, T‐cell receptor

Summary

T cells must display diversity regarding both the cell state and T‐cell receptor (TCR) repertoire to provide effective immunity against pathogens; however, the generation and evolution of cellular T‐cell heterogeneity in the adaptive immune system remains unclear. In the present study, a combination of multiplex PCR and immune repertoire sequencing (IR‐seq) was used for a standardized analysis of the TCR β‐chain repertoire of CD4+ naive, CD4+ memory, CD8+ naive and CD8+ memory T cells. We showed that the T‐cell subsets could be distinguished from one another with regard to the TCR β‐chain (TCR‐β) diversity, CDR3 length distribution and TRBV usage, and that these distinctions could be observed in both the preselection and the post‐selection repertoire. Moreover, the Dβ‐Jβ and Vβ‐Dβ combination patterns at the initial recombination step, template‐independent insertion of nucleotides and inter‐subset overlap were consistent between the pre‐ and post‐selection repertoires, with a remarkably positive correlation. Taken together, these results support differentiation of the CD4+ and CD8+ T‐cell subsets prior to thymic selection, with the differences surviving both positive and negative selection. In conclusion, these findings provide deeper insight into the generation and evolution of the TCR repertoire.


Abbreviations

CDR3: complementarity‐determining region 3
IR‐seq: immune repertoire sequencing
PBMC: peripheral blood mononuclear cells
RAG: recombination‐activating gene
TCR: T‐cell receptor
TCR‐β: T‐cell receptor β chain
TRBV: TCR beta chain variable gene

Introduction

The power of adaptive immunity in humans is achieved through hypervariable molecules, including T‐cell receptors (TCRs), which have the potential to recognize a wide variety of pathogens and drive a specific immune response. The T‐cell receptor is a heterodimer of two chains (alpha and beta), each of which is encoded by germline variable (V), joining (J) and constant (C) gene segments (the beta chain includes an additional short diversity (D) segment). 1 Recombination of the gene segments, together with random deletion and insertion of nucleotides in complementarity‐determining region 3 (CDR3), creates extreme diversity in the antigen recognition regions of the TCR. Much of this diversity is concentrated in the CDR3 region, which is responsible for the interaction with antigenic peptides. Diversity in the CDR3 sequences therefore provides a measure of T‐cell diversity. 2 However, many rearrangements are non‐functional. The usefulness of any given rearrangement depends on both the sequence and the number of DNA bases located between the initiation and termination codons. Consequently, thymic TCR‐β rearrangement follows one of three possible courses: (1) TCR‐β VDJ genes successfully rearrange on one of the two chromosomes, leading to mRNA that translates into a productive TCR‐β chain, which binds to the surrogate pre‐TCR‐α chain (pre‐Tα). This gives rise to the expression of a pre‐Tα/TCR‐β complex (pre‐TCR) on the cell surface. Pre‐TCR‐mediated signalling inhibits further rearrangement of the TCR‐β chain on the other chromosome via recombination‐activating gene (RAG) phosphorylation, a process termed allelic exclusion. Subsequently, cells with a functional pre‐TCR proliferate; once proliferation stops, rearrangement of the TCR‐α chain is initiated, followed by positive and negative selection 3 ; (2) rearrangement on the first chromosome leads to an out‐of‐frame (OOF) mRNA. In this case, the T cell attempts to rearrange the second chromosome and, if successful (in‐frame), TCR‐β formation occurs, and the cell expresses a pre‐TCR and undergoes proliferation, TCR‐α rearrangement, and positive and negative selection as described above. This T cell carries both functional and non‐functional TCR genes. 3 , 4 Non‐functional TCR‐β (OOF) sequences are not affected by any type of selection (either in the thymus or in the periphery) and are thus used to characterize the initial repertoire formed by recombination itself; 3 , 4 , 5 , 6 , 7 , 8 and (3) if rearrangement on both chromosomes leads to OOF mRNA, the cell will not signal through the pre‐TCR and will die. 3
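The distinction between in‐frame and out‐of‐frame rearrangements can be illustrated with a minimal Python sketch. It is not the classification performed by MiTCR in this study; it assumes that each CDR3 nucleotide sequence starts in the reading frame of the V segment, and the function names `classify_cdr3` and `split_repertoire` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def classify_cdr3(nt_seq):
    """Label a CDR3 nucleotide sequence.

    Assumption: the sequence begins in the V-segment reading frame, so a
    length that is not a multiple of 3 puts the J segment out of frame.
    """
    seq = nt_seq.upper()
    if len(seq) % 3 != 0:
        return "out_of_frame"
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if any(codon in STOP_CODONS for codon in codons):
        return "stop_codon"
    return "productive"

def split_repertoire(sequences):
    """Group sequences by label; OOF sequences proxy the preselection repertoire."""
    groups = {"productive": [], "out_of_frame": [], "stop_codon": []}
    for seq in sequences:
        groups[classify_cdr3(seq)].append(seq)
    return groups

example = ["TGTGCCAGCAGTTTC", "TGTGCCAGCAGTTT", "TGTTAAAGC"]
groups = split_repertoire(example)
```

In this toy input the first sequence (15 nt, no stop codon) is labelled productive, the second (14 nt) out of frame, and the third contains an in‐frame stop codon.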

During their development in the thymus, positively selected CD4+CD8+ double‐positive thymocytes, which successfully and appropriately bind to self‐major histocompatibility complex (MHC) antigens, receive critical survival signals and differentiate into CD4 and CD8 single‐positive T cells according to their affinity to MHC class II and class I, respectively. The developing thymocytes subsequently undergo negative selection, a process that kills thymocytes exhibiting strong TCR reactivity towards self‐antigens. Finally, CD4 and CD8 single‐positive T cells enter the circulation and become naive T cells. In the periphery, upon antigen encounter and recognition by the TCR, naive T cells become activated and antigen‐specific T cells clonally expand and differentiate into effector and memory T cells, thereby shaping the TCR repertoire.

Although TCR signalling impacts the fate of T cells, including recruitment, expansion, differentiation, trafficking and survival, it remains unclear how differences in the TCR contribute to heterogeneity in the T‐cell state. In the present study, we applied high‐throughput sequencing to study the human TCR‐β CDR3 repertoires of CD4+ naive, CD4+ memory, CD8+ naive and CD8+ memory T cells, with the aim of evaluating the differences and correlation of the TCR‐β CDR3 repertoire characteristics of the various cell subsets. To gain a better understanding of the early events in thymic T‐cell development and repertoire generation, and compare the TCR‐β CDR3 repertoires before and after thymus selection, we focused on the non‐functional and functional TCR‐β sequences. Non‐functional TCRs were used to study the preselection TCR repertoire, as they are not subject to functional selection (positive and negative selection), whereas functional TCRs were used to study the post‐selection TCR repertoire. Our goal was to produce comprehensive, unrestricted profiles of the TCR‐β repertoire for key T‐cell subsets, which will be essential for quantitative understanding of the generation and evolution of cellular heterogeneity in the adaptive immune system.

Materials and methods

Subjects

Six healthy volunteers (mean age 54·67 years, range 45–60, 5 female) were recruited into this study. The study was approved by the Ethics Committee of the First Affiliated Hospital, College of Medicine, Zhejiang University, China (Ref No 2015‐313), and was performed according to the tenets of the Declaration of Helsinki. In addition, written informed consent was provided by all participating individuals.

T‐cell isolation and extraction of RNA

A 20‐ml blood sample was taken from each volunteer, and the peripheral blood mononuclear cells (PBMC) were isolated over a Ficoll gradient. T cells were sorted on a FACSAria (BD Biosciences, San Jose, CA, USA) using the following antibodies (all from BD Biosciences): anti‐CD45RO PE (UCHL‐1), anti‐CD45RA APC (HI100), anti‐CD8 FITC (RPA‐T8) and anti‐CD4 PerCP‐Cy5.5 (OKT4). CD4+ naive and CD4+ memory cells were defined as CD4+ CD8− CD45RA+ CD45RO− and CD4+ CD8− CD45RA− CD45RO+, respectively. CD8+ naive and CD8+ memory cells were defined as CD4− CD8+ CD45RA+ CD45RO− and CD4− CD8+ CD45RA− CD45RO+, respectively. 9 The number of each T‐cell subset isolated from each donor is shown in Table S1. Flow cytometry (FACS) analysis confirmed that all sorted populations contained more than 500 000 cells and were > 95% pure. Total RNA was extracted from the sorted cells using TRIzol reagent according to the manufacturer's instructions. cDNA was synthesized using the High Capacity cDNA Reverse Transcription Kit (Applied Biosystems, Foster City, CA, USA).

Sequencing of TCR‐β repertoires and bioinformatic analyses

To amplify the rearranged CDR3 regions of the TCR‐β, a multiplex PCR system was designed using previously described methods. Briefly, the cDNA for each sample was amplified using a Qiagen Multiplex PCR Kit (QIAGEN) with 32 forward primers specific to the FR3 region and 13 reverse primers specific to the joining (J) region of the TCR, as published in our previous study. 10 , 11 The reaction mixture was as follows: 0·5 × Q solution, 0·2 μM Jβ R pool, 0·2 μM Vβ F pool and 1× Qiagen Multiplex PCR Master Mix. The cycling conditions were as follows: 95°C for 15 min, followed by 30 cycles of 94°C for 30 s, 60°C for 90 s and 72°C for 30 s, plus a final extension of 5 min at 72°C. The amplification product was loaded on a 2% agarose gel, and the 100‐ to 200‐bp fractions were excised and purified. The amplified region was then sequenced and analysed on a high‐throughput sequencing platform (HiSeq 2000; Illumina). MiTCR software was used to correct the sequencing errors and PCR amplification bias. 12 In addition, algorithms to eliminate PCR and sequencing errors for the Illumina platform were executed as previously described. 13 We filtered the sequencing background and aligned the raw data to the V‐ and J‐gene segments using the MiTCR program (developed by MiLaboratory; http://mitcr.milaboratory.com/downloads/) by selecting the best V/D/J alignment. Finally, multiple statistics of the TCR data were calculated, including the CDR3 length distribution, CDR3 frequency distribution, V‐J pairing and V/J usage. Diversity of the TCR repertoire was assessed based on earlier published work. 14 , 15

Statistical analysis

Unless otherwise stated, results are presented as percentages (%) or as mean ± SD values, and a paired t‐test, Pearson test, Mann–Whitney U‐test or two‐way ANOVA was applied to compare values where appropriate. All tests were two‐tailed. A P value of less than 0·05 was considered statistically significant. Statistical analyses were performed using SPSS 20.

Results

We used high‐throughput sequencing technology to investigate the TCR repertoire features of the different T‐cell subsets (CD4+ CD45RA+, 4RA; CD4+ CD45RO+, 4RO; CD8+ CD45RA+, 8RA; and CD8+ CD45RO+, 8RO) at a sequence‐level resolution by obtaining an average of 20·08 million total raw reads per sample. We obtained an average of 7·12, 5·09, 9·00 and 5·50 million pairs of raw reads from the 4RA, 4RO, 8RA and 8RO repertoires, respectively. After filtering and processing the data with the MiTCR program, we identified 390766 (23339), 195276 (15502), 155256 (9609) and 53008 (5304) distinct productive (out‐of‐frame) CDR3 nucleotide sequences from the 4RA, 4RO, 8RA and 8RO repertoires, respectively. The TCR‐β CDR3 repertoire was almost identical in the two samples obtained from the same person (Fig. S1), which supports the reproducibility of the results. In addition, the sequencing data were normalized based on the sequencing depth. In the following analysis, out‐of‐frame CDR3 nucleotide sequences were used to study the preselection TCR repertoire, as they are not subject to functional selection (positive and negative selection). Productive CDR3 nucleotide sequences were used to study the post‐selection TCR repertoire.

T‐cell subsets are distinct in TCR‐β diversity in both the pre‐ and post‐selection repertoires

We investigated the distribution of clonal abundance and the TCR‐β diversity of the four T‐cell subsets using several measures, including the Shannon index, Gini index and D50 diversity index. For the Shannon and D50 indices, the greater the index value, the greater the CDR3 diversity; for the Gini index, the smaller the index value, the greater the CDR3 diversity. As shown in Fig. 1a,b and Fig. S2a, diversity was highest in 4RA, followed by 4RO, 8RA and 8RO. The degree of expansion of each individual clone was based on the unique CDR3 sequence frequency (abundance) within the sample. First, the TCR‐βs were assigned to three groups based on the number of (corrected) reads detected per distinct TCR‐β: low (1–10 reads), medium (11–100 reads) and high (> 100 reads), each expressed as a percentage of the total distinct TCR‐β sequences (Fig. 1c). We found that the mean percentage of low‐ and medium‐abundance TCR‐βs was the highest in 4RA, followed by 4RO and 8RA, with the lowest in 8RO. The mean percentage of high‐abundance TCR‐βs was the highest in 8RO, followed by 8RA and 4RO, and the lowest in 4RA. These results indicate that the TCR‐β diversity of 4RA was the highest and its degree of expansion was the lowest. Subsequently, the mean TCR‐β abundance was calculated in each of the samples. This value was 12·48 ± 3·58, 24·58 ± 7·29, 36·00 ± 15·64 and 119·98 ± 75·82 in the post‐selection repertoires of the 4RA, 4RO, 8RA and 8RO groups, respectively (Fig. S3a). In addition, the clonal expansion was further assessed by calculating the cumulative percentage of the repertoire constituted by the top 1000 TCR‐β nucleotide clonotypes (Fig. S3b). The results showed that the rank of the degree of expansion (from low to high) was 4RA, 4RO, 8RA and 8RO. It is interesting to note that these differences in the clonal abundance distribution and TCR‐β diversity in the post‐selection repertoire were mirrored by similar findings in the preselection repertoire (Fig. 1d–f, Figs S2b and S3c,d). In summary, the memory repertoire was less diverse than that of the naive T cells, and the CD8+ T‐cell repertoire was less diverse than that of the CD4+ T cells in both the preselection and post‐selection repertoires. Moreover, we found that the preselection TCR‐β repertoire diversity was markedly higher than that of the post‐selection repertoire (Fig. S4a). Notably, individuals with high diversity in the preselection repertoires also exhibited high diversity in the post‐selection repertoires, presenting a positive correlation (Fig. S4b). To avoid biases due to differences in the sequence frequencies, most of our following analyses (including CDR3 length distribution, VDJ‐gene usage, indel analysis, overlap calculation and correlations) were performed on unique sequences.
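As an illustration of the three diversity measures named above, the following Python sketch computes them from a list of per‐clonotype read counts. The exact formulas used in this study follow earlier published work and are not restated in the text, so the definitions below are assumptions: the Shannon index uses the natural logarithm, the Gini index is the standard inequality coefficient, and D50 is taken as the percentage of unique clonotypes (ranked from most to least abundant) needed to account for 50% of all reads.

```python
import math

def shannon_index(counts):
    """Shannon entropy of clonotype frequencies (natural log); higher = more diverse."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def gini_index(counts):
    """Gini coefficient of clonotype counts; 0 = perfectly even, higher = more skewed."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

def d50_index(counts):
    """Percentage of unique clonotypes that make up the top 50% of reads (assumed definition)."""
    xs = sorted(counts, reverse=True)
    half = sum(xs) / 2
    running = 0
    for k, x in enumerate(xs, start=1):
        running += x
        if running >= half:
            return 100 * k / len(xs)

even_repertoire = [10, 10, 10, 10]    # hypothetical naive-like sample
skewed_repertoire = [97, 1, 1, 1]     # hypothetical expanded memory-like sample
```

With these toy inputs the even repertoire has a higher Shannon index, a lower Gini index and a higher D50 than the skewed one, matching the direction of interpretation given in the text.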

Figure 1.

Degree of expansion and diversity analysis of the TCR‐β repertoire. (a, b) The TCR‐β CDR3 diversity of the four T‐cell subsets was estimated by the Shannon diversity index (a) and Gini diversity index (b). (c) Distribution of the clonal abundance of the different T‐cell subsets in healthy individuals. The TCR‐βs were assigned to three groups based on the number of reads detected per distinct TCR‐β: low (1–10 reads), medium (11–100 reads) and high (> 100 reads), each expressed as a percentage of the total distinct TCR‐β sequences in the post‐selection repertoires. Data are presented as the mean ± SD values and were compared using a paired t‐test. *P < 0·05, **P < 0·01; ***P < 0·001; ****P < 0·0001 (two‐tailed). (d–f) The same analysis was performed for OOF nucleotide clonotypes (preselection repertoire) from each sample of the different T‐cell subsets. 4RA, CD4+ CD45RA+ cells; 4RO, CD4+ CD45RO+ cells; 8RA, CD8+ CD45RA+ cells; 8RO, CD8+ CD45RO+ cells.
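The abundance grouping and the top‐clonotype share described for the expansion analysis can be sketched in Python as follows. The thresholds (1–10, 11–100 and > 100 reads) are those stated in the text; the function names and the example counts are hypothetical.

```python
def abundance_classes(read_counts):
    """Percentage of distinct clonotypes in the low, medium and high read-count groups."""
    bins = {"low": 0, "medium": 0, "high": 0}
    for reads in read_counts:
        if reads <= 10:
            bins["low"] += 1
        elif reads <= 100:
            bins["medium"] += 1
        else:
            bins["high"] += 1
    n = len(read_counts)
    return {name: 100 * k / n for name, k in bins.items()}

def top_n_share(read_counts, n=1000):
    """Cumulative percentage of all reads contributed by the n most abundant clonotypes."""
    ranked = sorted(read_counts, reverse=True)
    return 100 * sum(ranked[:n]) / sum(ranked)

counts = [1, 5, 10, 11, 100, 101, 500, 2]  # hypothetical reads per distinct clonotype
classes = abundance_classes(counts)
share_top2 = top_n_share(counts, n=2)
```

Here half of the distinct clonotypes fall in the low group and a quarter each in the medium and high groups, while the two largest clonotypes account for 601 of the 730 reads.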

T‐cell subsets are distinct in the CDR3 length distribution in both the pre‐ and post‐selection repertoires

Different rearrangements may lead to variable CDR3 lengths, and the distribution in the CDR3 sequence lengths is another feature that provides an overall view of the repertoire composition. We found that the average CDR3 length of memory T cells was significantly shorter than that of the naive T cells, at both the CD4+ and CD8+ T‐cell levels. In addition, the average CDR3 length of the CD4+ T cells was significantly shorter than that of the CD8+ T cells, at both the naive and memory T‐cell levels (Fig. 2a). Remarkably, in line with our findings for the productive TCR‐β CDR3s, the CDR3 length distribution of the preselection repertoire displayed similar findings (Fig. 2b). In addition, we also assessed the CDR3 length across the overall TCR‐β repertoires (including the abundance of each clonotype). We found that the difference in CDR3 length between cell subsets still existed (Fig. S5). Moreover, there was a strong positive correlation between the pre‐ and post‐selection TCR‐β CDR3 lengths, for the shorter clonotypes (16–36 nt), medium clonotypes (37–51 nt) and longer clonotypes (52–106 nt) (Fig. S6). Therefore, the cell subsets with a higher percentage of short post‐selection clonotypes also showed a higher percentage of short clonotypes in the preselection repertoire. We next aimed to validate these findings using single‐cell TCR repertoires in previously published thymus TCR studies. 16 The thymus is a critical organ for T‐cell development and TCR repertoire formation, which shapes the adaptive immune landscape. We found that the mean CDR3 length of the double‐positive (DP) T cells was shorter than that of the single‐positive (SP) T cells, in both productive TCR‐β CDR3s and non‐productive repertoire CDR3s (Fig. S7).
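A minimal sketch of the two ways of summarizing CDR3 length mentioned above (on unique sequences, and across the overall repertoire including clonotype abundance), together with the three length classes, is given below. The input layout, a dictionary mapping each CDR3 nucleotide sequence to its read count, and the function names are assumptions for illustration.

```python
# Length classes in nucleotides, as given in the text
LENGTH_CLASSES = {"short": (16, 36), "medium": (37, 51), "long": (52, 106)}

def mean_cdr3_length(clonotypes, weighted=False):
    """Mean CDR3 length over unique sequences, or weighted by read count."""
    if weighted:
        total_reads = sum(clonotypes.values())
        return sum(len(seq) * n for seq, n in clonotypes.items()) / total_reads
    return sum(len(seq) for seq in clonotypes) / len(clonotypes)

def length_class_fractions(clonotypes):
    """Fraction of unique clonotypes falling in each length class."""
    n = len(clonotypes)
    fractions = {}
    for name, (low, high) in LENGTH_CLASSES.items():
        fractions[name] = sum(1 for seq in clonotypes if low <= len(seq) <= high) / n
    return fractions

# Hypothetical repertoire: sequence -> read count
repertoire = {"A" * 30: 8, "C" * 45: 1, "G" * 60: 1}
unique_mean = mean_cdr3_length(repertoire)
weighted_mean = mean_cdr3_length(repertoire, weighted=True)
fractions = length_class_fractions(repertoire)
```

In this example the unique‐sequence mean is 45 nt, whereas the abundance‐weighted mean is 34·5 nt because the expanded clonotype is short; this is why the two summaries are reported separately.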

Figure 2.

Differences in the T‐cell subsets regarding the CDR3 length arise from germline sequences. (a, b) Mean length of unique OOF/productive TCR‐β CDR3 nucleotide sequences (a, productive; b, OOF) from CD4+ naive (4RA), CD4+ memory (4RO), CD8+ naive (8RA) and CD8+ memory (8RO) T cells. (c) Comparison of the mean inserted length among the four T‐cell subsets in pre‐ and post‐selection repertoires. Data are presented as the mean ± SD values and were compared using a paired t‐test. (d, e) There was no obvious association between the mean inserted length and mean length of CDR3 among the four T‐cell subsets. The correlation between the mean inserted length and mean length of CDR3 was calculated for the preselection repertoires (d) and post‐selection repertoires (e). (f, g) There was a significant correlation between the mean length of the germline sequences and mean length of CDR3 in the preselection repertoires (f) and post‐selection repertoires (g). (h) A positive correlation was detected between the preselection repertoires and post‐selection repertoires in the mean length of the germline sequences.

Cell subsets can be distinguished based on the CDR3 length distribution stemming from the original germline sequences

In summary, we found that there was a statistically significant difference in the CDR3 length distribution among the four T‐cell subsets (4RA, 4RO, 8RA and 8RO). To understand the molecular basis of these features, we analysed the recombination events (nucleotides inserted), as a higher frequency of long TCR‐β CDR3s in naive cells (and in CD8+ cells) may arise from increased insertions during the TCR‐β rearrangement process. We compared the mean length of the nucleotides inserted among the four T‐cell subsets. To our surprise, in both the pre‐ and post‐selection repertoires, the number of inserted nucleotides was significantly increased in the memory pool compared with the naive pool (Fig. 2c), even though the memory pool had significantly shorter CDR3 regions. In addition, the results showed that the mean length of the CDR3 nucleotide sequences did not significantly correlate with the mean length of the inserted nucleotides in either the preselection (Fig. 2d) or post‐selection repertoires (Fig. 2e). Therefore, other factors must play a determinative role in the CDR3 lengths of the different cell subsets. Indeed, a positive correlation was detected between the length of the original germline sequences and CDR3 length in both the preselection (Fig. 2f) and post‐selection repertoires (Fig. 2g). Moreover, it is interesting to note that there was a significant positive correlation between the length of the original germline sequences in the preselection and post‐selection repertoires (Fig. 2h). These findings indicate that the biases established during recombination survived the T‐cell development and selection processes. Taken together, these data suggest that early events in thymic T‐cell development and repertoire generation differ among the four T‐cell subsets (4RA, 4RO, 8RA and 8RO).
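The reasoning above can be made concrete with a small Python sketch. It assumes, for illustration only, that the germline‐encoded length of a CDR3 is its total length minus the nucleotides inserted at the Vβ‐Dβ and Dβ‐Jβ junctions, and it computes a Pearson correlation coefficient by hand; the per‐sample values are invented and are not data from this study.

```python
import math

def germline_length(cdr3_len, vd_insert, dj_insert):
    """Germline-encoded CDR3 length (assumed: total minus junctional insertions)."""
    return cdr3_len - vd_insert - dj_insert

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical per-sample means: (CDR3 length, VD insertions, DJ insertions)
samples = [(44.1, 3.0, 3.1), (43.2, 3.4, 3.3), (45.0, 2.9, 3.0), (43.8, 3.5, 3.4)]
cdr3_lengths = [s[0] for s in samples]
germline_lengths = [germline_length(*s) for s in samples]
r_germline = pearson_r(germline_lengths, cdr3_lengths)
```

Under this decomposition, a subset can carry more inserted nucleotides yet still have a shorter CDR3 if its germline‐encoded portion is shorter, which is the pattern reported for the memory pool.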

T‐cell subsets are distinct in TRBV segment usage in both the pre‐ and post‐selection repertoires

Our analysis also showed that the usage frequency of the TRBV segments differed significantly among the four T‐cell subsets in both the pre‐ and post‐selection TCR‐β repertoires (Fig. 3). In the preselection TCR‐β repertoire, the usage frequency of TRBV20‐1, TRBV6‐4, TRBV11‐3, TRBV28, TRBV6‐6, TRBV6‐7, TRBV13 and TRBV11‐1 displayed an obvious difference between the naive CD4+ and naive CD8+ T cells. In the post‐selection TCR‐β repertoire, the usage frequency of TRBV20‐1, TRBV5‐1, TRBV28, TRBV6‐4, TRBV19, TRBV11‐3, TRBV13 and TRBV11‐1 significantly differed between the naive CD4+ and CD8+ T cells. It is important to note that the different usage frequencies of TRBV20‐1, TRBV6‐4, TRBV11‐3, TRBV28, TRBV13 and TRBV11‐1 between naive CD4+ and CD8+ T cells could be observed in both the pre‐ and post‐selection TCR‐β repertoires. Remarkably, the discrepant TRBV usage of the T‐cell subsets showed a consistent variation trend in different donors (Fig. S8). Furthermore, the four T‐cell subsets could be distinguished from each other by a principal co‐ordinate analysis (PCA) based on TRBV segment usage in the preselection (Fig. 4a) or post‐selection TCR‐β repertoire (Fig. 4b).
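A minimal sketch of how TRBV usage frequencies on unique clonotypes might be computed, and how two usage profiles can be compared, is shown below. The input layout (pairs of V‐gene name and CDR3 nucleotide sequence), the function names and the choice of Euclidean distance are assumptions; a matrix of such pairwise distances is the kind of input an ordination method would take, but the ordination itself is not reproduced here.

```python
import math
from collections import Counter

def trbv_usage(clonotypes):
    """Usage frequency of each TRBV gene, counting every unique (V, CDR3) pair once."""
    unique = set(clonotypes)
    counts = Counter(v_gene for v_gene, _ in unique)
    total = sum(counts.values())
    return {v_gene: n / total for v_gene, n in counts.items()}

def usage_distance(usage_a, usage_b):
    """Euclidean distance between two usage profiles (absent genes count as 0)."""
    genes = set(usage_a) | set(usage_b)
    return math.sqrt(sum((usage_a.get(g, 0.0) - usage_b.get(g, 0.0)) ** 2 for g in genes))

# Hypothetical reads: the duplicated first entry collapses to one unique clonotype
sample = [
    ("TRBV20-1", "AAA"), ("TRBV20-1", "AAA"), ("TRBV20-1", "CCC"),
    ("TRBV28", "GGG"), ("TRBV5-1", "TTT"),
]
usage = trbv_usage(sample)
other = {"TRBV20-1": 1.0}
distance = usage_distance(usage, other)
```

Counting each unique clonotype once, rather than each read, keeps a single expanded clone from dominating the usage profile, in line with the decision stated earlier to run these analyses on unique sequences.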

Figure 3.

Usage frequency of the TRBV genes differed among the four T‐cell subsets in both the pre‐ and post‐selection repertoires. Differences in usage frequency of the TRBV genes among the four T‐cell subsets in the preselection (left panel) and post‐selection (right panel) repertoires.

Figure 4.

Four T‐cell subsets were distinguished from one another in both preselection and post‐selection repertoires. (a, b) Principal co‐ordinate analysis (PCA) based on TRBV segment usage showed profound differences among the four T‐cell subsets in the preselection (a) and post‐selection (b) repertoires. (c) The pattern of Dβ‐Jβ combinations differed among the four T‐cell subsets in the preselection TCR repertoire (left panel) and post‐selection TCR repertoire (right panel), but was similar between the pre‐ and post‐selection TCR repertoires. (d) Overlap indices for 4RA∩4RO, 4RA∩8RA, 4RA∩8RO, 4RO∩8RA, 4RO∩8RO and 8RA∩8RO cell subsets within the same donor were calculated. '∩' represents 'intersection'. The overlap indices of the six donors are presented, showing that differences in the degree of overlap between the cell subsets could be observed in both the preselection repertoire (upper panel) and the post‐selection repertoire (lower panel).

The pattern of VDJ recombination, nucleotide insertion and inter‐subset overlap are consistent between the pre‐ and post‐selection repertoires

We analysed the pattern of D‐gene use conditioned on the J‐gene choice and the pattern of V‐gene use conditioned on the D‐gene choice at the initial recombination step, and found that the patterns were similar between the pre‐ and post‐selection TCR‐β repertoires in each of the four T‐cell subsets, with a remarkable positive correlation (Fig. 4c, Figs S9 and S10). Furthermore, much of the TCR‐β CDR3 diversity was created by template‐independent insertion of nucleotides at the Vβ‐Dβ and Dβ‐Jβ junctions by terminal deoxynucleotidyl transferase (Tdt). 10 The frequency at which Tdt inserts each of the four nucleotides (G, C, A and T) was estimated, and the insertion frequencies of the four nucleotides exhibited a significantly positive relationship between the pre‐ and post‐selection TCR‐β repertoires, at both the Vβ‐Dβ (Fig. S11a–d) and Dβ‐Jβ junctions (Fig. S11e–h). These results suggest that the biases established during recombination are largely maintained in mature T cells. To further explore whether the preselection TCR‐β repertoire already displayed the features of the four T‐cell subsets that were observed in the post‐selection TCR repertoire, we next investigated the degree of overlap of the TCR‐β repertoires among the four T‐cell subsets (4RA, 4RO, 8RA and 8RO). The degree of overlap reflects the degree of similarity among the T‐cell subsets. The results showed that the degree of overlap differed between pairs of T‐cell subsets. Remarkably, these differences between the cell subsets were observed consistently in both the pre‐ and post‐selection TCR‐β repertoires (Fig. 4d). We also performed an analysis of interindividual sharing of identical TCR‐β clonotypes. The results were consistent with the above findings for the inter‐subset overlap. The degree of overlap differed among the four T‐cell subsets; however, it was similar between the pre‐ and post‐selection TCR‐β repertoires (Fig. S12).
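The pairwise overlap comparison among the four subsets can be sketched as follows. The text does not state the formula for the overlap index, so this example assumes a Jaccard index (shared clonotypes divided by the union of clonotypes) purely for illustration; the function names and the toy clonotype sets are hypothetical.

```python
from itertools import combinations

def overlap_index(clonotypes_a, clonotypes_b):
    """Jaccard overlap between two clonotype collections (assumed definition)."""
    a, b = set(clonotypes_a), set(clonotypes_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def pairwise_overlap(subsets):
    """Overlap index for every pair of subsets, keyed by the sorted pair of names."""
    return {
        (x, y): overlap_index(subsets[x], subsets[y])
        for x, y in combinations(sorted(subsets), 2)
    }

# Hypothetical clonotype sets for one donor
donor = {
    "4RA": {"s1", "s2", "s3"},
    "4RO": {"s2", "s3", "s4"},
    "8RA": {"s5"},
    "8RO": {"s5", "s1"},
}
overlaps = pairwise_overlap(donor)
```

Four subsets give six pairs, matching the six intersections listed in Fig. 4d. Running the same function once on out‐of‐frame clonotypes and once on productive clonotypes would allow the pre‐ and post‐selection overlap patterns to be compared.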

Discussion

It is essential for T lymphocytes to recognize an enormous number of pathogens, which is primarily achieved by the hypervariable CDR3 regions of the TCR. T cells can be divided into several subsets according to their functions and characteristics; however, it remains unclear how differences in the TCR contribute to heterogeneity in the T‐cell state. In the present study, we applied high‐throughput sequencing technology to analyse the TCR‐β CDR3 repertoire of four different T‐cell subsets (CD4+ naive, CD4+ memory, CD8+ naive and CD8+ memory T cells). We found that there are obvious differences among the four T‐cell subsets, including TCR‐β diversity, CDR3 length distributions, TRBV segment usage and repertoire overlap. The reason for the differences in CDR3 between T‐cell subsets is not yet clear. Differences between T‐cell subsets may be related to other external factors, including the potential of certain TCRs to support increased homeostatic proliferation. Most striking was our observation that the four T‐cell subsets could be distinguished from one another in both the pre‐ and post‐selection repertoires.

Our results showed that the distinctions seen in the post‐selection repertoire were mirrored by similar findings in the preselection repertoire. This conclusion rests on a series of observations: (1) individuals with a high diversity in the preselection repertoires also have high diversity in the post‐selection repertoires; (2) there was a strong positive correlation between the CDR3 length distributions of the pre‐ and post‐selection repertoires; (3) insertion frequencies of the four nucleotides exhibit a significant positive relationship between the pre‐ and post‐selection repertoires at both the Vβ‐Dβ and Dβ‐Jβ junctions; and (4) there was a strong positive correlation between the pre‐ and post‐selection repertoires regarding the patterns of Dβ‐Jβ and Vβ‐Dβ combinations at the initial recombination step. Taken together, these data suggest that early events in thymic T‐cell development and repertoire generation differ among the four T‐cell subsets. Moreover, these findings were supported by single‐cell TCR repertoires in previously published thymus TCR studies. 16 We found that the mean CDR3 length of the double‐positive (DP) T cells was shorter than that of the single‐positive (SP) T cells, in both productive and non‐productive TCR‐β CDR3 repertoires. For post‐selection TCR‐βs, several differences were maintained, suggesting that these differences survive through both positive and negative selection. Relevant to this investigation, Lu et al. 17 reported that thymic selection of MHC‐independent TCR is largely unconstrained, but the selection of MHC‐specific TCR is restricted by both CDR3 length and specific amino acid usage. In addition, Dupic et al. 18 used high‐throughput sequencing and computational chain‐pairing experiments on human TCR repertoires to quantitatively characterize the αβ generation process.

Genetic factors may play an important role in shaping the structure of the TCR‐β repertoire of the four T‐cell subsets. Using 'next‐generation' sequencing, a recent study in three pairs of healthy, monozygotic twins suggested that TRBV segment usage in out‐of‐frame sequences was more similar in twins than in non‐twins, which indicated a genetically determined bias prior to selection. 4 A study by Melenhorst et al. 19 reported that TRBV usage in mice was already skewed at the earliest stages of protein expression (even before cell surface expression), suggesting that genetic, rather than thymic, selection pressure is critical for determining the TRBV repertoire. The protein product of the ataxia telangiectasia‐mutated (ATM) gene is a kinase critical for V(D)J recombination. In a recent study by Schön et al., 20 sequencing of TCR‐β CDR3 demonstrated that ATM‐deficient CD4(+)CD8(+) double‐positive thymocytes and peripheral T cells have altered processing of coding ends for both in‐frame and out‐of‐frame TCR‐β rearrangements, providing the unique demonstration that ATM deficiency alters the expressed TCR‐β repertoire by a selection‐independent mechanism. In addition, a previous report by Melenhorst et al. 19 showed that 42%–47% of the mature TRBV repertoire in both adult and neonatal donors is determined by genetic TCR‐locus factors. The TCR locus is rich in SNPs, which are dispersed over both coding and non‐coding regions, including promoters, introns and recombination signal sequences (RSSs). Historically, it has been considered that minor dissimilarities in the RSS can have profound effects on targeting RAG1 and RAG2 (i.e. there is a correlation between how well the RSS targets recombinase activity and the frequency of lymphocytes expressing that gene segment), 21 which has been confirmed for both the murine TCR locus 22 and Ig locus. 21 Remarkably, the biases established during recombination are largely maintained in mature T cells. 19 Mallis et al. 23 reported that the pre‐TCR (preTCR), a pTα‐β heterodimer appearing before αβTCR expression, displays robust ligand binding behaviour to initiate the first stage of repertoire selection. The 'accessibility hypothesis' 24 may account for differences in the composition of the preselection repertoire among the four T‐cell subsets: for recombination to occur, gene segments must first be made accessible to the recombination machinery. This in turn depends on the subnuclear relocation of the rearranging TCR loci, DNA methylation status, recruitment of chromatin remodelling enzymes, histone modification and germline transcription. 25 Therefore, the complexity of the TCR repertoire is not achieved at random. Rather, diversity in T‐cell subsets is tightly regulated and the composition of the repertoire, even prior to thymic selection, is highly structured.

In conclusion, we used high‐throughput sequencing to study the pre‐ and post‐selection TCR‐β repertoires of CD4+ naive, CD4+ memory, CD8+ naive and CD8+ memory T cells. These analyses provide information that distinguishes the different T‐cell subsets and give a better understanding of the generation and evolution of TCR‐β CDR3 repertoires in the adaptive immune system. This study raises important questions for future analyses, most notably which factors direct the maturation of the phenotype of the various T‐cell subsets and determine the size of the clonotype cell subsets.

Funding

This work was supported by funds received from the Major National S&T Projects for Infectious Diseases (2018ZX10301401), Key Research & Development Plan of Zhejiang Province (2019C04005), National Key Science and Technology Project (2018YFC2000500), the Open Funds of the Guangxi Key Laboratory of Tumor Immunology and Microenvironmental Regulation (2019KF004), Guilin Science Research and Technology Development Project (20190218‐5‐5) and the Natural Science Foundation of Guangxi (2019GXNSFBA245032).

Author contributions

HYD and XLH conceived, designed and performed most of the experiments, with significant contributions from WBC, XJZ and JNC; PZ, GYW and XYF contributed to sample collection; QZ and XDL provided critical revision of the manuscript for important intellectual content; and HYD and XLH wrote the paper.

Disclosures

The authors declare that they have no competing interests.

Supporting information

Figure S1. Assessment of the stability and reliability of sequencing.

Figure S2. The diversity of the TCRβ CDR3 repertoire was shown based on the D50 index.

Figure S3. Expansion degree of the TCRβ repertoire.

Figure S4. Diversity analysis of the TCRβ repertoire.

Figure S5. T cell subsets are distinct in the CDR3 length distribution in the total TCRβ nucleotide repertoires (including the abundance of each clonotype).

Figure S6. TCRβ CDR3 length correlated between the pre‐selection and post‐selection repertoires.

Figure S7. The CDR3 length of double positive (DP) and single positive (SP) T cells in the thymus. OOF, out of frame.

Figure S8. The discrepant TRBV usage between CD4+ naive and CD4+ memory T cell subsets showed a consistent variation trend in different donors.

Figure S9. Discrepancies in the pattern of Dβ–Jβ combinations among the four T cell subsets, but similarity between the pre‐ and post‐selection TCR repertoires.

Figure S10. Discrepancies in the pattern of Vβ–Dβ combinations among the four T cell subsets, but similarity between the pre‐ and post‐selection TCR repertoires.

Figure S11. Spearman correlation analysis of the insertion frequencies of the four nucleotides (G, C, A, and T) between the pre‐ and post‐selection repertoire, in the Vβ‐Dβ junctions (A–D) or Dβ‐Jβ junctions (E–H).

Figure S12. Interindividual sharing across all six donors of the TCRβ nucleotide clonotypes in all of the four T cell subsets.

Table S1. The number of each T cell subset isolated for each donor.

Data Availability Statement

Data sharing is not applicable to this article as all data generated or analysed during this study are included in this article.

References

  • 1. Hou XL, Wang L, Ding YL, Xie Q, Diao HY. Current status and recent advances of next generation sequencing techniques in immunological repertoire. Genes Immun 2016; 17:153–64.
  • 2. Hou X, Lu C, Chen S, Xie Q, Cui G, Chen J, et al. High Throughput Sequencing of T Cell Antigen Receptors Reveals a Conserved TCR Repertoire. Medicine 2016; 95:e2839.
  • 3. Gomez‐Tourino I, Kamra Y, Baptista R, Lorenc A, Peakman M. T cell receptor beta‐chains display abnormal shortening and repertoire sharing in type 1 diabetes. Nat Commun 2017; 8:1792.
  • 4. Zvyagin IV, Pogorelyy MV, Ivanova ME, Komech EA, Shugay M, Bolotin DA, et al. Distinctive properties of identical twins' TCR repertoires revealed by high‐throughput sequencing. Proc Natl Acad Sci USA 2014; 111:5980–5.
  • 5. Putintseva EV, Britanova OV, Staroverov DB, Merzlyak EM, Turchaninova MA, Shugay M, et al. Mother and child T cell receptor repertoires: deep profiling study. Front Immunol 2013; 4:463.
  • 6. Murugan A, Mora T, Walczak AM, Callan CJ. Statistical inference of the generation probability of T‐cell receptors from sequence repertoires. Proc Natl Acad Sci USA 2012; 109:16161–6.
  • 7. Robins HS, Srivastava SK, Campregher PV, Turtle CJ, Andriesen J, Riddell SR, et al. Overlap and effective size of the human CD8+ T cell receptor repertoire. Sci Transl Med 2010; 2:47ra64.
  • 8. Larimore K, McCormick MW, Robins HS, Greenberg PD. Shaping of human germline IgH repertoires revealed by deep sequencing. J Immunol 2012; 189:3221–30.
  • 9. Hou XL, Yang YD, Chen JN, Jia HY, Zeng P, Lv LX, et al. TCRβ repertoire of memory T cell reveals potential role for Escherichia coli in the pathogenesis of primary biliary cholangitis. Liver Int 2019; 39:956–66.
  • 10. Hou XL, Zeng P, Zhang XJ, Chen JN, Liang Y, Yang JZ, et al. Shorter TCR β‐chains are highly enriched during thymic selection and antigen‐driven selection. Front Immunol 2019; 10:299.
  • 11. Sui W, Hou X, Zou G, Che W, Yang M, Zheng C, et al. Composition and variation analysis of the TCR beta‐chain CDR3 repertoire in systemic lupus erythematosus using high‐throughput sequencing. Mol Immunol 2015; 67:455–64.
  • 12. Bolotin DA, Shugay M, Mamedov IZ, Putintseva EV, Turchaninova MA, Zvyagin IV, et al. MiTCR: software for T‐cell receptor sequencing data analysis. Nat Methods 2013; 10:813–4.
  • 13. Bolotin DA, Mamedov IZ, Britanova OV, Zvyagin IV, Shagin D, Ustyugova SV, et al. Next generation sequencing for TCR repertoire profiling: platform‐specific features and correction algorithms. Eur J Immunol 2012; 42:3073–83.
  • 14. Liaskou E, Klemsdal HE, Holm K, Kaveh F, Hamm D, Fear J, et al. High‐throughput T‐cell receptor sequencing across chronic liver diseases reveals distinct disease‐associated repertoires. Hepatology 2016; 63:1608–19.
  • 15. Keylock C. Simpson diversity and the Shannon‐Wiener index as special cases of a generalized entropy. Oikos 2005; 109:203–7.
  • 16. Park JE, Botting RA, Conde CD, Popescu DM, Lavaert M, Kunz DJ, et al. A cell atlas of human thymic development defines T cell repertoire formation. Science 2020; 367:eaay3224.
  • 17. Lu JH, Laethem FV, Bhattacharya A, Craveiro M, Saba I, Chu J, et al. Molecular constraints on CDR3 for thymic selection of MHC‐restricted TCRs from a random pre‐selection repertoire. Nat Commun 2019; 10:1019.
  • 18. Dupic T, Marcou Q, Walczak AM, Mora T. Genesis of the αβ T‐cell receptor. PLoS Comput Biol 2019; 15:e1006874.
  • 19. Melenhorst JJ, Lay MD, Price DA, Adams SD, Zeilah J, Sosa E, et al. Contribution of TCR‐beta locus and HLA to the shape of the mature human Vbeta repertoire. J Immunol 2008; 180:6484–9.
  • 20. Hathcock KS, Bowen S, Livak F, Hodes RJ. ATM influences the efficiency of TCRβ rearrangement, subsequent TCRβ‐dependent T cell development, and generation of the Pre‐selection TCRβ CDR3 repertoire. PLoS One 2013; 8:e62188.
  • 21. Livak F, Petrie HT. Somatic generation of antigen‐receptor diversity: a reprise. Trends Immunol 2001; 22:608–12.
  • 22. Livak F, Burtrum DB, Rowen L, Schatz DG, Petrie HT. Genetic modulation of T cell receptor gene segment usage during somatic recombination. J Exp Med 2000; 192:1191–6.
  • 23. Mallis RJ, Bai K, Arthanari H, Hussey RE, Handley M, Li Z, et al. Pre‐TCR ligand binding impacts thymocyte development before alphabetaTCR expression. Proc Natl Acad Sci USA 2015; 112:8373–8.
  • 24. Krangel MS. Beyond hypothesis: direct evidence that V(D)J recombination is regulated by the accessibility of chromatin substrates. J Immunol 2015; 195:5103–5.
  • 25. Attaf M, Huseby E, Sewell AK. alphabeta T cell receptors as predictors of health and disease. Cell Mol Immunol 2015; 12:391–9.

Articles from Immunology are provided here courtesy of British Society for Immunology
