Significance
T cells play a key role in the adaptive immune system. The broad repertoire of unique receptors expressed by T cells is in principle able to recognize a huge diversity of pathogens, but how to extract that information from blood samples remains unclear. By sequencing and analyzing the statistics of T cell receptors of subjects vaccinated against yellow fever, we identified vaccine-specific receptors that expanded following vaccination. We show that each individual has a unique response, which is nevertheless similar across subjects in its sequence composition, with a slightly higher similarity between twins. Our method can be used in the clinic to track disease-specific T cell clones expanding or contracting after infection, vaccination, or therapy.
Keywords: vaccination, high-throughput sequencing, twins, T cell receptor, RepSeq
Abstract
T cell receptor (TCR) repertoire data contain information about infections that could be used in disease diagnostics and vaccine development, but extracting that information remains a major challenge. Here we developed a statistical framework to detect TCR clone proliferation and contraction from longitudinal repertoire data. We applied this framework to data from three pairs of identical twins immunized with the yellow fever vaccine. We identified 600 to 1,700 responding TCRs in each donor and validated them using three independent assays. While the responding TCRs were mostly private, albeit with higher overlap between twins, they could be well-predicted using a classifier based on sequence similarity. Our method can also be applied to samples obtained postinfection, making it suitable for systematic discovery of new infection-specific TCRs in the clinic.
The extremely diverse repertoire of T cell receptor (TCR) sequences allows the immune system to develop a specific response to almost any possible pathogen. In recent years, huge progress has been made in the deep profiling of TCR repertoires by high-throughput sequencing (RepSeq), allowing for the identification of millions of TCR sequences in a single experiment (1). TCR sequences, phenotypes, and the relative abundances of the corresponding T cell clones encode both the history of previous infections and the protection against yet unseen pathogens. However, despite recent large-scale efforts (2, 3), it is still impossible to predict systematically which antigen is recognized by a TCR with a given sequence. The most reliable method to identify antigen-specific TCR—MHC-multimer staining (4)—is restricted by the choice of HLA alleles (which are the most polymorphic genes in the human population) (5) and by the knowledge of immunodominant peptides. HLA-independent methods for quantitative monitoring of challenge-specific TCRs in vivo are still lacking.
Here we developed a methodology for identifying responding TCR clonotypes from time-dependent RepSeq data and applied it to yellow fever immunization: a classical model of acute viral infection in humans. The yellow fever vaccine (YFV strain 17D) is one of the most efficient and safe vaccines ever made (6). Because YFV 17D is a live attenuated virus, vaccination leads to viremia and very intense T cell and humoral responses (7). Tracking of activated T cell subsets such as CD8+CD38+HLA–DR+ (7) and fluorescent MHC-tetramer staining (8) make it possible to quantitatively describe the kinetics of the T cell response to primary YFV 17D immunization. The response peaks around day 14 after immunization, when activated T cells responding to vaccination occupy 2% to 13% of the CD8+ subpopulation (7–9) and 3% to 4% of the CD4+ subpopulation (10–12). Several immunodominant peptides were identified, and the corresponding pMHC-multimers have made it possible to track YFV-specific T cells years and even decades after immunization (8, 12, 13). The only RepSeq study of yellow fever immunization available to our knowledge (14) reports thousands of CD8+ T cell clones expanding after yellow fever immunization and preferential recruitment of highly expanded CD8+ T cell clones to the memory subpopulation. However, the clonal structure of the T cell immune response, how personalized this response is, and the impact of genetic factors on the response all remain poorly understood. Studying monozygous twins allows us to quantify the impact of genetic factors using a small cohort of donors (15–17).
In this study, we identified nearly 5,000 YFV 17D-responding TCRs in three pairs of identical twin donors, using RepSeq profiling of T cell repertoires at different timepoints analyzed with advanced statistical modeling. We validated the yellow fever specificity of expanded clones using three independent functional tests. The detailed analysis of the TCR sequences of the expanded clones showed both a highly personalized response and high sequence similarity across individuals, especially between twins. This convergence allowed us to develop a supervised classifier that predicted YFV-reactive TCR sequences with high specificity. Remarkably, the dynamics of clonal contraction in the month after the peak response specifically predicts the YFV-reactive TCR clonotypes. Thus, our methodology can be used during the postinfection period in the clinic to identify TCR sequences that recognized and responded to an acute infection of interest, even without prior knowledge of donor MHC alleles or pathogen epitopes.
Results
Detection of Significantly Expanded TCR Following YFV 17D Vaccination.
Blood samples were obtained from three pairs of identical twin volunteers aged 20 to 23. We collected peripheral blood samples at five different timepoints (two before and three after immunization) with the live attenuated YFV 17D vaccine (see Fig. 1A). At each timepoint, we collected two biological replicates (two independent tubes of blood) to isolate bulk peripheral blood mononuclear cells (PBMCs) and another tube of blood to isolate CD4+ and CD8+ T cell subpopulations. Additional portions of blood were used for isolation of another subpopulation (CD45RO+) and for functional tests at several timepoints (see Materials and Methods for details). From each sample, cDNA libraries of TCR chains were prepared as previously described (18) and sequenced on the Illumina HiSeq platform.
Fig. 1.
(A) Yellow fever vaccination (YFV 17D) study design. (Top) Experimental timeline with the list of samples collected at each timepoint. Unsorted PBMC samples were collected in two biological replicates at each timepoint. (Bottom) Method overview. Peripheral blood samples were subjected to PBMC isolation, synthesis of TCR cDNA libraries, Illumina sequencing, and reconstruction of TCR repertoires. (B) Number of significantly expanded clonotypes in comparison with day 0. The number peaks at day 15 for all donors. The comparison with day −7 corresponds to a contraction reflecting the normal dynamics of a healthy repertoire in absence of vaccination. (C) Activated CD8+CD38+HLA–DR+ subpopulation is enriched with clonotypes expanded between days 0 and 15. Relative abundance of a TCR sequence in the CD8+CD38+HLA–DR+ activated subpopulation (x axis) versus its abundance in the bulk CD8+ population isolated at the same timepoint (y axis). Yellow dots indicate clonotypes that strongly expanded between day 0 and day 15. Brighter shades of blue and yellow indicate clonotypes significantly enriched in the activated subpopulation. Black line shows identity. Red circles indicate sequences found in the A02-NS4b214−222-dextramer-positive fraction 2 y later.
We developed a Bayesian statistical framework that identifies clonotypes both significantly () and strongly (fold-change ) expanded between different timepoints compared with the expected variability between replicates (see Materials and Methods). We reproduced the results with edgeR (19), a widely used method to analyze differential gene expression in RNA-sequencing experiments (see SI Appendix, Fig. S1). Since differential expression analysis software such as edgeR was developed for RNA-seq data, its applicability to RepSeq data cannot be assumed a priori. By explicitly modeling the difference between true clone size, sampled clone size, and read count through a two-step process, our model better captures the low-number pair-count statistics than single-step models used by existing methods (see SI Appendix, Fig. S2). Another advantage is that our Bayesian approach allows us to transparently assess the uncertainty of the estimated fold-change for each clone, by providing its complete posterior distribution (see SI Appendix, Fig. S3). In Fig. 1B, we show the number of TCR clonotypes identified as expanded with respect to day 0. In all donors, we observe many more expanded clonotypes between day 0 and day 15 than for any other pair of timepoints, despite variations between donors. In the following, we use the term “expanded clonotypes” for TCR sequences that significantly increased in fraction between days 0 and 15. Clonotypes with a significantly higher frequency at the prevaccination timepoint relative to the vaccination timepoint (day 0 vs. day −7, so actually corresponding to a contraction) were relatively few for all donors except for donor P1, who we speculate was undergoing another transient immune response.
Most expanded clones were not detected before immunization, and often not even on day 7, due to their low frequency. We report expansion rates of up to 2,000- to 3,000-fold in 7 d, although this estimate is only a lower bound due to lack of detection before day 15 (see SI Appendix, Fig. S4).
The Majority of Expanded TCRs Are YFV 17D-Specific.
We hypothesized that most of the expanded clonotypes proliferated specifically in response to the YFV 17D vaccine. To check this hypothesis, three independent functional tests with subsequent TCR repertoire sequencing were performed on donor S1: (i) interferon (IFN)-gamma secretion assay after stimulation with YFV 17D vaccine; (ii) fluorescent sorting of the activated CD8+CD38+HLA–DR+ subset, which was reported to be largely YFV 17D-specific at the peak of the response (7); (iii) staining with an MHC-dextramer loaded with an immunodominant epitope. For i and ii, we considered the clonotype validated if it was enriched in the IFN-gamma positive (i) or CD8+CD38+HLA–DR+ (ii) fractions, by comparison with the bulk PBMC or CD8+ populations, respectively (using a one-sided exact Fisher test; see SI Appendix). We found that out of 773 clonotypes expanded in donor S1 (see Fig. 1B), 331 were enriched in the CD8+CD38+HLA–DR+ fraction (see Fig. 1C) and 64 were enriched in the IFN-gamma secretion assay (see SI Appendix, Fig. S5). For iii, we isolated T cells positive for the HLA-A*02 dextramer loaded with the immunodominant YFV epitope NS4b214−222 (LLWNGPMAV) from a sample collected 2 y after immunization of donor S1 and sequenced their TCR. We found 68 clonotypes in the dextramer-positive fraction that were labeled as expanded for this donor by our statistical method. Their cumulative frequency accounted for at least 22% of the CD8+ expanded fraction at day 15. We also sequenced unsorted PBMCs for the 2-y timepoint of this donor and found that the fraction of the repertoire occupied by the YFV-responding clones declined substantially 2 y after immunization, from 2.1% (on day 45) to 0.07%, but still exceeded prevaccination levels by nearly 2 orders of magnitude.
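The enrichment criterion used for tests i and ii can be illustrated with a minimal sketch of a one-sided Fisher exact test. The function name, the 2×2 layout (clone counts versus all counts in the sorted fraction and in the bulk sample), and the toy numbers are assumptions made for illustration; the exact procedure is described in SI Appendix.

```python
from math import comb

def fisher_one_sided(k_sorted, n_sorted, k_bulk, n_bulk):
    """One-sided Fisher exact test for enrichment of a clone in a sorted fraction.

    k_sorted / n_sorted: counts of the clone / of all clones in the sorted fraction.
    k_bulk / n_bulk:     counts of the clone / of all clones in the bulk sample.
    Returns P(X >= k_sorted) under the hypergeometric null (no enrichment).
    Exact big-integer arithmetic: fine for a toy example, slow for real depths.
    """
    clone_total = k_sorted + k_bulk
    grand_total = n_sorted + n_bulk
    upper = min(clone_total, n_sorted)
    numerator = sum(
        comb(clone_total, x) * comb(grand_total - clone_total, n_sorted - x)
        for x in range(k_sorted, upper + 1)
    )
    return numerator / comb(grand_total, n_sorted)

# Toy example (made-up counts): 8 of 10 counts in the sorted fraction vs 2 of 10 in bulk.
p_enriched = fisher_one_sided(8, 10, 2, 10)
# A clone absent from the sorted fraction cannot be enriched: the P value is 1.
p_absent = fisher_one_sided(0, 10, 5, 10)
```

A clone would then be called validated when the P value falls below a chosen significance level; that threshold is not restated here.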
The total number of clonotypes validated by any of the three methods was 395 out of 773. Notably, the remaining unvalidated clonotypes had significantly lower average frequencies than the validated ones on day 15 (, t test; SI Appendix, Fig. S6). To check whether our method did not simply pick up large clones at day 15 (YFV-specific or not), we repeatedly subsampled 773 random TCRs according to their abundance at day 15 and asked whether they were found in any of the validation datasets. On average, only of those were found in the CD8+CD38+HLA–DR+ subset (vs. 331 of the expanded clonotypes), in the IFN-gamma assay (vs. 64), and in the dextramer assay (vs. 68).
To further validate the expanded clones for YFV 17D specificity, we used previously published TCR sequences from the VDJdb database (20) (https://vdjdb.cdr3.net; see Dataset S1) specific for the NS4b214−222 YFV 17D immunodominant epitope (n = 264) and an unrelated epitope as a control (n = 370): the cytomegalovirus (CMV) immunodominant HLA-A*02–restricted pp65495−503 peptide (NLVPMVATV). All of our donors are HLA-A*02–positive (complete HLA genotypes can be found in SI Appendix, Table S1). We checked if published A02-NS4b214−222–specific TCR sequences could be found in sets of expanded clonotypes identified in each donor by our model (SI Appendix, Table S2). We found multiple exact matches for published A02-NS4b214−222–specific sequences and no exact matches for CMV-specific A02-pp65495−503 sequences. We also found an increase of cumulative frequency of published YFV-specific (but not CMV-specific) TCRs in CD8+ repertoires of our donors after immunization (see SI Appendix and SI Appendix, Fig. S7).
In summary, the expanded clonotypes were validated for their YFV specificity and were also consistent with published YFV-specific TCRs.
The CD8+ T Cell Response Is Sustained Longer than the CD4+ Response.
To track the fractions of the CD4+ and CD8+ repertoires involved in the response over time, we calculated the cumulative frequency of expanded clonotypes in the CD4+ and CD8+ compartments at each timepoint. We found similar dynamics in all donors, with both CD8+ and CD4+ responses peaking on day 15. In all donors, the CD4+ response fell off more quickly than the CD8+ response from day 15 to day 45, with a mean -fold decrease for CD4+ versus for CD8+ (, paired one-sided t test). We checked the significance of this difference in decay for each donor separately using our statistical framework (see SI Appendix). The responding clonotypes at the peak of the response occupied up to 8% of the repertoire (cumulative frequency) for CD8+ and up to 5% for CD4+ T cells. Almost all these clonotypes were undetected before immunization.
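As an illustration of the quantity tracked here, the sketch below computes the cumulative frequency of expanded clonotypes at two timepoints and the resulting fold decrease. The clone names and counts are toy values, and the function name is hypothetical.

```python
def cumulative_frequency(counts, expanded):
    """Fraction of all counts in a sample that belong to expanded clonotypes."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(n for clone, n in counts.items() if clone in expanded) / total

# Toy repertoires (made-up counts) for one compartment at two timepoints.
expanded = {"clone_A", "clone_B"}
day15 = {"clone_A": 50, "clone_B": 30, "clone_C": 920}
day45 = {"clone_A": 10, "clone_B": 10, "clone_C": 980}

peak = cumulative_frequency(day15, expanded)   # share of repertoire at the peak
late = cumulative_frequency(day45, expanded)   # share one month later
fold_decrease = peak / late                    # contraction between day 15 and day 45
```

Computing this ratio separately for the CD4+ and CD8+ compartments gives the per-compartment fold decrease compared in the text.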
The previous analysis described the dynamics of the CD4+ and CD8+ responses separately. What is the relative importance and diversity of the CD4+ and CD8+ responses within the overall T cell population? To answer this question, we associated expanded clonotypes from the unsorted PBMC sample to either the CD4+ or CD8+ subsets in the following way: We labeled an expanded clonotype from the unsorted PBMC sample CD4+ if it had a larger concentration in the CD4+ subpopulation sequenced than in the CD8+ subpopulation and vice versa. This procedure gave unambiguous in silico phenotypes for each expanded clonotype sequence from the PBMC at day 15: The relative abundance of sequences in the CD4+ versus CD8+ compartments was strongly bimodal, with two peaks close to 100% CD4+ and 100% CD8+ (see SI Appendix, Fig. S8). This analysis revealed that both CD4+ and CD8+ clones strongly expanded in response to the vaccination, with no strong preference for CD4+ or CD8+ expanded clonotypes in terms of the cumulative frequency or diversity (see SI Appendix, Fig. S9). We next asked if the presence of expanded clonotypes in the memory (CD45RO+) compartment 45 d after immunization depends on their CD4+ or CD8+ phenotype. We detected on average 49% (%) of CD4+ expanded clonotypes in memory repertoires 45 d after immunization versus only 21% (%) of CD8+ expanded clonotypes. This may be explained by different levels of CD45RO+ expression in CD4+ and CD8+ memory T cells reactive to yellow fever.
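The in silico phenotyping rule described above can be sketched as follows. The function name, the handling of ties and of clones absent from both sorted samples, and the toy frequencies are assumptions made for illustration.

```python
def assign_phenotype(expanded, freq_cd4, freq_cd8):
    """Label each expanded clonotype by the compartment where it is more concentrated.

    freq_cd4 / freq_cd8 map clonotype -> frequency in the sorted CD4+ / CD8+ sample.
    Clones with equal frequency in both (including absent from both) are left
    "undetermined"; this tie rule is an assumption, not taken from the paper.
    """
    labels = {}
    for clone in expanded:
        f4 = freq_cd4.get(clone, 0.0)
        f8 = freq_cd8.get(clone, 0.0)
        if f4 > f8:
            labels[clone] = "CD4"
        elif f8 > f4:
            labels[clone] = "CD8"
        else:
            labels[clone] = "undetermined"
    return labels

# Toy frequencies (made up) in the sorted subpopulations at day 15.
freq_cd4 = {"clone_A": 1e-3, "clone_B": 1e-6}
freq_cd8 = {"clone_B": 2e-3, "clone_C": 5e-4}
labels = assign_phenotype(["clone_A", "clone_B", "clone_C", "clone_D"], freq_cd4, freq_cd8)
```

Because the observed CD4+/CD8+ abundance ratio is strongly bimodal, such a simple comparison is expected to give the same label as a more elaborate rule for nearly all clones.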
TCR Response Is Highly Personalized Even Among Twins.
Our method not only reconstructs the dynamics of immune response but also enables the analysis of TCR sequences of the responding T cell clones. For each donor, we found 600 to 1,700 YFV 17D-reactive clonotypes expanded between day 0 and day 15. We compared the pairwise sharing of responding amino acid TCR sequences between different individuals. We found more overlap in twins than in nonrelated individuals (see Fig. 2A). However, absolute numbers of identical expanded TCR clonotypes were low even in twin pairs (up to 21 out of 1,685 amino acid sequences), indicating that each individual developed an almost unique response. We also observed that the same nucleotide variants encoded some of the amino acid TCR sequences shared between the twins (see SI Appendix, Table S3). This identity of nucleotide variants was more frequent in twins than expected from convergent recombination alone (; see SI Appendix).
Fig. 2.
Sharing of YFV-responding TCRs across donors. (A) Number of expanded TCR amino acid clonotypes shared between pairs of donors, divided by the product of the numbers of expanded clonotypes in each donor. All three twin pairs (S, P, and Q) show higher numbers of shared expanded clonotype TCRs than unrelated individuals and are clustered together by hierarchical clustering (dendrogram on top). (B) The normalized sharing of expanded clonotype TCRs (Left) is much higher than the normalized sharing in the whole TCR repertoire (Right). Sharing in twins (red) always exceeds sharing in unrelated individuals (blue).
Overall, shared sequences (across at least two donors) accounted for a small fraction of the response in each donor. Only 2.5% to 4.4% of unique responding TCR sequences were public (present in more than one donor). These sequences correspond to 2.7% to 8.2% of YF-specific cells in all donors except Q1, which shows an outlying value of 33% due to the large expansion of a few public TCR sequences (see SI Appendix, Table S4). Despite the low numbers of shared TCRs, normalized sharing of YFV 17D-reactive amino acid clonotype sequences was more than 2 orders of magnitude higher than in the total TCR repertoire for both twins and unrelated individuals (Fig. 2B). This may result from convergent selection of the same TCR variants recognizing the same epitopes in different donors.
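The normalized sharing used in Fig. 2 (number of shared clonotypes divided by the product of the numbers of clonotypes in the two donors) can be written in a few lines. The function name and the toy sequence sets are illustrative only.

```python
def normalized_sharing(clonotypes_a, clonotypes_b):
    """Shared amino acid clonotypes divided by the product of the two set sizes."""
    a, b = set(clonotypes_a), set(clonotypes_b)
    if not a or not b:
        return 0.0
    return len(a & b) / (len(a) * len(b))

# Toy example with made-up CDR3 strings: 2 shared sequences out of 3 and 4.
donor_1 = {"CASSLG", "CASSPT", "CASRRA"}
donor_2 = {"CASSLG", "CASSPT", "CAWSVQ", "CSARDG"}
sharing = normalized_sharing(donor_1, donor_2)
```

Dividing by the product of set sizes makes values comparable between pairs of donors with different numbers of expanded clonotypes, and between the expanded subset and the whole repertoire.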
Fig. 3.
Convergence of amino acid sequences in the YFV-responding TCR repertoire. (A) Number of pairs of similar clones (Left: exact same CDR3 amino acid sequence; Middle: up to one mismatch; Right: up to two mismatches) normalized by the number of possible pairings in each individual (see SI Appendix). The number of similar clones in the data (yellow dots) is larger than the number of similar clones in randomly drawn samples (dark-blue dots) of the same size ( for zero, one, and two mismatches; one-sided paired t test). As a reference, the red dots show an example of a restricted and specific repertoire for the yellow fever virus immunodominant epitope NS4b214−222 [data from VDJdb (20); see Dataset S1]. (B) One thousand most abundant TCRs from donor S1 at day 15. Each vertex corresponds to a TCR amino acid sequence; edges connect sequences differing by two or fewer amino acids in their CDR3. Only vertices with neighbors are plotted. Yellow clonotypes indicate expanded clonotypes, while blue clonotypes were present before immunization at similar frequencies as on day 15. The vast majority of edges (95 out of 103) are formed between TCR clonotypes of the same status (expanded or not expanded).
TCR Sequence Analysis Reveals a Mixture of Convergent and Private Response.
It was previously shown that in many cases TCRs recognizing the same antigens have restricted sequence diversity (2, 21). To analyze the sequence diversity of responding TCRs, we performed a pairwise comparison of all expanded TCR sequences within each donor at the amino acid level. In each individual, we identified many more pairs of expanded TCRs with the same or highly similar CDR3 amino acid sequences (zero, up to one, or up to two amino acid mismatches) on day 15 than in a random subset of equal size (see Fig. 3A). Interestingly, our expanded TCRs were still more diverse than published TCRs selected for their specificity to a single immunodominant YFV epitope NS4b214−222 [red dots; data from VDJdb (20); see Dataset S1], suggesting that the response is directed against multiple epitopes in each donor. Expanded TCRs form multiple dense clusters of highly similar sequences (Fig. 3B, yellow circles), suggesting they are responding to multiple epitopes. Further experiments are needed to test this conjecture. By contrast, prevaccination abundant TCRs form fewer, sparser, and smaller clusters (blue circles).
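A minimal sketch of the pairwise comparison follows. It counts pairs of clonotypes whose CDR3 amino acid sequences differ by at most a given number of mismatches and normalizes by the number of possible pairs. Comparing only sequences of equal length (Hamming distance) and the toy sequences are assumptions; the exact normalization is given in SI Appendix.

```python
from itertools import combinations

def mismatches(a, b):
    """Number of differing positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def similar_pair_fraction(cdr3s, max_mismatches):
    """Fraction of clonotype pairs whose CDR3s differ by <= max_mismatches.

    cdr3s is a list with one entry per clonotype, so two clonotypes with the
    same CDR3 amino acid sequence count as a zero-mismatch pair.
    Sequences of different lengths are never considered similar (assumption).
    """
    n_pairs = len(cdr3s) * (len(cdr3s) - 1) // 2
    if n_pairs == 0:
        return 0.0
    hits = sum(
        1
        for a, b in combinations(cdr3s, 2)
        if len(a) == len(b) and mismatches(a, b) <= max_mismatches
    )
    return hits / n_pairs

# Toy clonotype list (made-up CDR3 strings); the first two share one CDR3.
toy = ["CASSLG", "CASSLG", "CASSLA", "CASRRA", "CAWT"]
fractions = [similar_pair_fraction(toy, d) for d in (0, 1, 2)]
```

Applying the same function to random subsets of equal size drawn from the bulk repertoire would give the baseline against which the expanded set is compared.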
Sequence Score Based on Distance to Expanded TCR Predicts YFV 17D Specificity.
The similarity of sequences in the expanded repertoire makes it possible to build a simple classifier to identify novel YFV 17D-specific TCR sequences. For each TCR of interest, we defined a YFV specificity score as the Hamming distance to the closest amino acid sequence neighbor among the expanded clonotypes from all six donors. To test how informative this score is about YFV specificity, we first applied it to published TCRs [from VDJdb (20); Dataset S1] specific to the NS4b214−222 epitope from YFV 17D and to the pp65495−503 epitope from CMV as a negative control. We found that TCRs of published YFV-specific clonotypes were much closer to our set of expanded sequences than CMV-specific clonotypes (SI Appendix, Fig. S10A), with some exact matches for YFV-specific but none for CMV-specific sequences (SI Appendix, Table S2). Accordingly, using the score to discriminate YFV-specific from CMV-specific published sequences yields good specificity and sensitivity (SI Appendix, Fig. S10B).
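The score defined above can be sketched as a nearest-neighbor search. Restricting the comparison to sequences of the same length, returning `None` when no such neighbor exists, the threshold `max_distance`, and the toy sequences are all assumptions made for illustration.

```python
def hamming(a, b):
    """Hamming distance between two equal-length amino acid sequences."""
    return sum(x != y for x, y in zip(a, b))

def yfv_score(query, expanded):
    """Distance from query to its closest neighbor among expanded clonotypes.

    Only expanded sequences with the same length as the query are compared
    (assumption); returns None if there are none.
    """
    distances = [hamming(query, s) for s in expanded if len(s) == len(query)]
    return min(distances) if distances else None

# Toy reference set and queries (made-up CDR3 strings).
expanded = ["CASSLAPGATNEKLFF", "CASSQDRGYEQYF"]
score_near = yfv_score("CASSLAPGTTNEKLFF", expanded)   # one substitution away
score_none = yfv_score("CAWSV", expanded)              # no same-length neighbor

# Hypothetical decision rule: call a TCR YFV-like if the score is small enough.
max_distance = 1
is_candidate = score_near is not None and score_near <= max_distance
```

Lower scores indicate closer similarity to the expanded set, so sweeping `max_distance` traces out the trade-off between specificity and sensitivity.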
We then asked if the score could be used to identify reactive clonotypes in the repertoire of an individual after immunization. To test this capability, we used a leave-one-out approach. Information about expanded clonotypes in five individuals was used to build a score as described earlier. The score was then used to predict expanded clonotypes among the 1,000 most abundant ones at day 15 in the sixth individual. The score performed with similar accuracy as on published epitope-specific clonotypes (SI Appendix, Fig. S10C).
Retrospective Detection of YFV 17D-Reactive TCRs Using Postvaccination Data.
So far we have identified responding TCRs as those that significantly expanded between days 0 and 15. While prechallenge timepoints are easy to collect in vaccination studies, this is not the case for acute infections, where the first samples can usually be obtained only after the onset of symptoms, when it is too late to detect TCR expansion. However, the clonal contraction dynamics (Fig. 4, day 15 to day 45) can in principle be used to identify responding clonotypes, by comparing a timepoint taken at the peak of the response to a timepoint taken several weeks or months after the infection.
Fig. 4.
Dynamics of YFV-responding T cells in the CD4+ and CD8+ compartments. Total fraction of (A) CD4+ and (B) CD8+ repertoires occupied by clonotypes significantly expanded from day 0 to day 15 for different timepoints. CD4+ and CD8+ T cell subpopulations show similar dynamics, although the CD8+ response degrades more slowly. Error bars are smaller than one line width.
To demonstrate the feasibility of this detection on our data, we identified significantly (, fold-change ; see Materials and Methods) contracted clonotypes between day 15 and day 45 using our model. We computed the overlap between this set of candidate TCR clonotypes with the subset of expanded clonotypes obtained before (SI Appendix, Fig. S11). Strikingly, 74% to 97% of the significantly contracted TCRs were also present in the expanded subset, showing that the contraction dynamics can help to identify YFV-reactive clonotypes with high specificity. This method is also sensitive: 45% to 81% of expanded TCRs could be identified by contraction. This shows that contraction dynamics alone is sufficient to identify a large fraction of the responding TCRs. Thus, our method could be used to identify clonotypes responding to infections in the clinic, when preinfection timepoints are not available.
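The two overlap fractions reported in this paragraph can be computed as sketched below. The function and variable names are hypothetical, and the clone identifiers are toy values.

```python
def contraction_overlap(contracted, expanded):
    """Overlap between contracted (day 15 -> 45) and expanded (day 0 -> 15) sets.

    Returns a pair:
      - fraction of contracted clonotypes that are also in the expanded set
        (the quantity reported as 74% to 97% in the text),
      - fraction of expanded clonotypes recovered by contraction
        (the quantity reported as 45% to 81%).
    """
    contracted, expanded = set(contracted), set(expanded)
    shared = contracted & expanded
    frac_contracted_in_expanded = len(shared) / len(contracted) if contracted else 0.0
    frac_expanded_recovered = len(shared) / len(expanded) if expanded else 0.0
    return frac_contracted_in_expanded, frac_expanded_recovered

# Toy sets (made-up identifiers): 3 of 4 contracted clones are among 6 expanded ones.
contracted = {"c1", "c2", "c3", "x9"}
expanded = {"c1", "c2", "c3", "c4", "c5", "c6"}
specific, sensitive = contraction_overlap(contracted, expanded)
```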
Discussion
In this study, we used high-throughput TCR repertoire profiling to identify major changes occurring in the repertoire after immunization with the live attenuated YFV 17D vaccine. We found several hundreds of unique TCR clones in each donor expanded in response to vaccination. A strong clonal expansion of up to -fold occurred between days 7 and 15 following vaccination. This proliferation corresponds to at least 11 divisions in 7 d, with an average of 15 h per cell cycle. Similar division rates (doubling times of 8 to 15 and 11 to 17 h for CD4+ and CD8+ T cells, respectively) were observed for lymphocytic choriomeningitis virus (LCMV) infection (22) and also in adoptive transfer experiments in mice (18 h) (23). However, in our case, the actual expansion rate could be much higher, because when a clonotype is not found on day 7, its initial concentration is unknown, providing only a lower bound estimate of the fold-change. Higher sequencing depth or shorter intervals between timepoints are needed to refine this estimate of expansion rates. This extension would allow us to study the impact of the T cell clone phenotype and the TCR sequence on the clonal expansion rate. Initial low concentrations of YFV-reactive clonotypes suggest their naive phenotype.
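The division-rate estimate can be checked with simple arithmetic, assuming that every division doubles the clone and neglecting cell death. Eleven doublings give a fold-change consistent with the 2,000- to 3,000-fold lower bound reported in Results, and spreading them over 7 d gives the cell cycle time:

```latex
\[
  2^{11} = 2048, \qquad
  \log_2 2000 \approx 10.97, \qquad
  \log_2 3000 \approx 11.55,
\]
\[
  \frac{7\ \mathrm{d} \times 24\ \mathrm{h/d}}{11\ \text{divisions}}
  \approx 15.3\ \mathrm{h\ per\ division}.
\]
```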
Our longitudinal approach can be applied to identify antigen-specific clonotypes for poorly characterized pathogens with unknown epitopes, which may be useful to study the immune response to emerging viral infections. One could also use this approach to track the response to infection and vaccination of unconventional T cell subsets for which antigens are still unknown (i.e., T cells).
While analyzing expansion between a prevaccination timepoint and the peak of the T cell response (day 15 in our case) is the most natural choice, for real-world, nonexperimental acute infections, acquiring a preinfection timepoint is often impossible or impractical. In this case, we showed how to use the dynamics of clonal contraction to identify reactive clonotypes, by comparing their abundance at the peak of the response to a timepoint taken a few weeks later. Implementing such a protocol in the clinic for acute primary viral infections with predictable kinetics [e.g., tick-borne encephalitis (24), hantavirus infection (25), and dengue fever (26) in which the T cell response peak occurs 1 wk after symptom onset] could lead to the rapid growth of the number of annotated TCR sequences specific to a range of infectious diseases, facilitating diagnostics and vaccine design. For secondary immune responses and chronic infections, further studies are required to determine the set of timepoints and sequencing depth needed to identify responding clonotypes.
Simultaneous sequencing of bulk PBMC, CD4+, and CD8+ subsets allowed us to determine the phenotype of the responding clones and to describe the kinetics of the response inside each compartment. While Blom et al. (10) found that YFV-specific CD4+ T cell concentration peaked slightly earlier than CD8+ T cells, we did not detect any difference in the expansion kinetics with our limited temporal resolution. More timepoints between days 7 and 15 could have helped to detect such differences. However, we did find differences in the contraction kinetics, with CD4+ cells contracting faster than CD8+ cells. We also found differential recruitment to the CD45RO+ memory compartment, with many more CD4+ than CD8+ TCRs detected in the CD45RO+ fraction on day 45. CD8+ memory formed in response to YFV immunization was shown to be largely CD45RA+ (8, 10, 13), whereas James et al. (27) found that CD4+ memory cells were mainly CD45RA–. Our results suggest that CD4+ and CD8+ YFV-reactive memory populations also differ in their CD45RO expression.
Our study describes the reaction of the T cell repertoire in a model of an acute infection in genetically identical individuals. It was previously shown that the T cell repertoires of twins have more sequence overlap in abundant clonotypes (15, 16). Part of this extensive overlap may be explained by in utero sharing of T cells (18). The authors of ref. 17 also found more overlap in varicella zoster virus (VZV)-specific CD4+ repertoires (isolated by in vitro cultivation with the antigen) in twins. Consistent with these findings, we report more TCR amino acid sequence overlap in the YFV-reactive repertoires of twins than in those of unrelated individuals. The maximum normalized number of shared YFV-reactive clonotypes that we observe between twins () is very close to the values measured by Qi et al. (17) for VZV-specific clonotypes (). Twin YFV-specific TCR clonotypes also show higher nucleotide sequence overlap than those of unrelated individuals, even relative to their higher amino acid sequence overlap (see SI Appendix, Table S4). To assess the significance of this observation, we used a generative model (28) to ask how likely it is to produce such an amount of shared nucleotide sequences by convergent recombination (see SI Appendix). In the vast majority of simulations (), the model underestimates the number of shared sequences between twin donors. We speculate that some of these shared clonotypes were exchanged in utero. Yet two-thirds of YFV-reactive TCR amino acid sequences shared between twins have different nucleotide sequences and can only be explained by convergent recombination and selection. Our results also suggest a mechanism for the previously reported extensive sharing of abundant TCRs in twins (15). Under this mechanism, twins would share more TCR sequences expanded in response to the same infection.
We showed that the response to the vaccine is very diverse, with few TCRs shared even between donors with identical genetic and environmental backgrounds. Nevertheless, using a simple similarity measure, it is possible to identify YFV-specific clonotypes from yet unseen repertoires with high specificity, using datasets of TCRs with known YFV specificity. The sensitivity of this classifier could be improved by collecting more examples of YFV-specific TCRs from more donors. We also show how published antigen-specific sequences can be used for functional repertoire annotation. On day 15 after immunization, we found a much higher cumulative frequency of published A02-NS4b214−222–specific sequences compared with prevaccination levels. Yet only a few significantly expanded clones matched those published sequences. This could be explained in two possible ways. First, our significantly expanded clones may be specific to epitopes other than NS4b214−222. Second, the A02-NS4b214−222–specific TCR repertoire may be so diverse that little overlap is to be expected between random subsamples of it. Further accumulation of antigen-specific TCR sequence data—acquired by sequencing of multimer-specific cells, longitudinal studies as done here, and disease-associated studies with large cohorts (29, 30)—will provide the means for disease diagnostics and extraction of clinically relevant information from T cell repertoire data.
Materials and Methods
Donors and Samples.
A detailed description of all experimental procedures can be found in SI Appendix. Three pairs of healthy identical twins participated in this study. The blood was collected with informed consent in a certified diagnostics laboratory. The study was approved by the institutional review board (IRB) of Pirogov Russian National Research Medical University. PBMCs were isolated with the Ficoll–Paque method, and CD4+, CD8+, and CD45RO+ subpopulations were isolated from PBMCs by positive immunomagnetic selection. Staining with the HLA-A*02 dextramer loaded with the NS4b214−222 peptide (LLWNGPMAV) from YFV 17D (Immudex) was performed according to the manufacturer’s protocol. The IFN-gamma secretion assay was performed using the IFN-gamma Secretion Assay-Cell Enrichment and Detection Kit (Miltenyi Biotec) according to the manufacturer’s protocol. To stimulate IFN-gamma production, whole blood was incubated with viral particles and anti-CD28 antibody for 5 h. All isolated cells were immediately lysed with TRIzol reagent (Invitrogen).
TCR cDNA Library Preparation and Sequencing.
TCR cDNA library preparation, sequencing, and data analysis were performed as described in ref. 18. Briefly, total RNA was isolated from cells using TRIzol reagent. 5′ RACE cDNA synthesis, primed from the TCR constant segments, was followed by two-step PCR amplification. All libraries were sequenced on the Illumina HiSeq 2500 platform. Raw data were demultiplexed and clustered by unique molecular identifiers (UMIs) using the MIGEC software (31); alignment of V, D, and J templates was then performed by MiXCR (32).
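The reason for clustering reads by UMI can be illustrated with a short sketch: reads that carry the same UMI derive from the same cDNA molecule, so counting distinct UMIs per clonotype estimates the number of sequenced molecules rather than the number of PCR copies. The code below is a deliberately simplified illustration and is not the MIGEC algorithm, which additionally assembles consensus sequences and corrects errors; the function name and the input format are assumptions.

```python
from collections import defaultdict


def count_clonotypes_by_umi(reads):
    """Count distinct UMIs per clonotype.

    reads: iterable of (umi, clonotype) pairs, one pair per sequencing read.
    Returns a dict mapping each clonotype to its number of distinct UMIs.
    """
    umis_per_clonotype = defaultdict(set)
    for umi, clonotype in reads:
        # Repeated reads of the same molecule collapse into one set entry.
        umis_per_clonotype[clonotype].add(umi)
    return {clonotype: len(umis) for clonotype, umis in umis_per_clonotype.items()}


# Illustrative (made-up) reads: the first two are PCR copies of one molecule.
example_reads = [
    ("AAAC", "CASSLAGGYEQYF"),
    ("AAAC", "CASSLAGGYEQYF"),
    ("GGTA", "CASSLAGGYEQYF"),
    ("TTCG", "CASSQDRGNTIYF"),
]
umi_counts = count_clonotypes_by_umi(example_reads)
```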
Statistical Analysis.
We identified expanded clones probabilistically using Bayesian statistics. We first assumed a generative model with a power-law clone size distribution (33), followed by selective expansion of a subset of clones after vaccination, and inferred its parameters from the data by maximum likelihood. The number of cells of each clone in each sample was modeled by a negative binomial distribution, and the number of sequenced mRNAs for each cell by a Poisson distribution. We then applied Bayes' rule to compute the posterior probability distribution of the fold-change in concentration for each clone. Clones whose fold-change exceeded a set threshold with sufficiently high posterior probability were selected for further analysis. Mathematical details are given in SI Appendix.
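The logic of this posterior computation can be sketched under strong simplifying assumptions. The code below is not the model deposited with the paper: it replaces the negative binomial and Poisson layers with a single Poisson sampling step, fixes the power-law exponent by hand instead of inferring it, and uses a flat prior over a grid of log2 fold-changes. All function names, parameter values, and the threshold in the example are illustrative assumptions.

```python
import math


def log_poisson(n, lam):
    """Log-probability of observing n counts under a Poisson with mean lam."""
    return n * math.log(lam) - lam - math.lgamma(n + 1)


def fold_change_posterior(n_pre, n_post, total_pre, total_post, log2_folds,
                          alpha=2.0, n_freq=200, f_min=1e-7, f_max=1e-1):
    """Posterior over a grid of log2 fold-changes for one clone.

    Assumed model: pre-vaccination frequency f has a power-law prior
    f**(-alpha); counts are Poisson with means total_pre * f and
    total_post * f * 2**s; the prior over s is flat on the grid.
    """
    # Log-spaced grid of candidate pre-vaccination frequencies.
    step = (math.log(f_max) - math.log(f_min)) / (n_freq - 1)
    freqs = [math.exp(math.log(f_min) + i * step) for i in range(n_freq)]
    # Density f**(-alpha) times the cell width, which scales as f on a log grid.
    log_prior = [(1.0 - alpha) * math.log(f) for f in freqs]

    log_post = []
    for s in log2_folds:
        terms = []
        for f, lp in zip(freqs, log_prior):
            f_post = min(f * 2.0 ** s, 1.0)  # a frequency cannot exceed 1
            terms.append(lp
                         + log_poisson(n_pre, total_pre * f)
                         + log_poisson(n_post, total_post * f_post))
        # Marginalize over f with a numerically stable log-sum-exp.
        m = max(terms)
        log_post.append(m + math.log(sum(math.exp(t - m) for t in terms)))

    # Normalize over the fold-change grid.
    m = max(log_post)
    weights = [math.exp(lp - m) for lp in log_post]
    z = sum(weights)
    return [w / z for w in weights]


def prob_expanded(posterior, log2_folds, log2_threshold):
    """Posterior probability that the log2 fold-change is at least the threshold."""
    return sum(p for p, s in zip(posterior, log2_folds) if s >= log2_threshold)


folds = list(range(-3, 11))
post = fold_change_posterior(1, 200, 100_000, 100_000, folds)
p_exp = prob_expanded(post, folds, log2_threshold=3)  # threshold is illustrative
```

A clone would be called expanded when `p_exp` exceeds a chosen probability cutoff. Marginalizing over the unknown pre-vaccination frequency, rather than taking the ratio of raw counts, is what keeps clones with very low counts from producing spuriously confident fold-change estimates.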
Data and Code Availability.
The code for the clonal expansion model is available on GitHub: https://github.com/mptouzel/pogorelyy_et_al_2018. Raw data are available in the NCBI Sequence Read Archive (accession no. PRJNA493983).
Acknowledgments
We thank Dr. I. V. Zvyagin for his help in finding appropriate donors for this study and Drs. L. V. Gmyl and M. F. Vorovitch for YFV 17D purification. TCR libraries sequencing, raw sequencing data processing, and reconstruction of TCR repertoires were supported by Russian Science Foundation Grant 15-15-00178. M.V.P. and E.S.E. are supported by Skoltech Systems biology fellowships. This work was partially supported by European Research Council Consolidator Grant 724208.
Footnotes
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Data deposition: The sequences reported in this paper have been deposited in the NCBI Sequence Read Archive (accession no. PRJNA493983). The code for the clonal expansion model has been deposited in GitHub, https://github.com/mptouzel/pogorelyy_et_al_2018.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1809642115/-/DCSupplemental.
References
- 1. Benichou J, Ben-Hamo R, Louzoun Y, Efroni S. Rep-seq: Uncovering the immunological repertoire through next-generation sequencing. Immunology. 2012;135:183–191. doi: 10.1111/j.1365-2567.2011.03527.x.
- 2. Dash P, et al. Quantifiable predictive features define epitope-specific T cell receptor repertoires. Nature. 2017;547:89–93. doi: 10.1038/nature22383.
- 3. Glanville J, et al. Identifying specificity groups in the T cell receptor repertoire. Nature. 2017;547:94–98. doi: 10.1038/nature22976.
- 4. Davis MM, Altman JD, Newell EW. Interrogating the repertoire: Broadening the scope of peptide–MHC multimer analysis. Nat Rev Immunol. 2011;11:551–558. doi: 10.1038/nri3020.
- 5. Robinson J, et al. The IMGT/HLA database. Nucleic Acids Res. 2013;41:D1222–D1227. doi: 10.1093/nar/gks949.
- 6. Monath TP, Vasconcelos PF. Yellow fever. J Clin Virol. 2015;64:160–173. doi: 10.1016/j.jcv.2014.08.030.
- 7. Miller JD, et al. Human effector and memory CD8+ T cell responses to smallpox and yellow fever vaccines. Immunity. 2008;28:710–722. doi: 10.1016/j.immuni.2008.02.020.
- 8. Akondy RS, et al. The yellow fever virus vaccine induces a broad and polyfunctional human memory CD8+ T cell response. J Immunol. 2009;183:7919–7930. doi: 10.4049/jimmunol.0803903.
- 9. Akondy RS, et al. Initial viral load determines the magnitude of the human CD8 T cell response to yellow fever vaccination. Proc Natl Acad Sci USA. 2015;112:3050–3055. doi: 10.1073/pnas.1500475112.
- 10. Blom K, et al. Temporal dynamics of the primary human T cell response to yellow fever virus 17D as it matures from an effector- to a memory-type response. J Immunol. 2013;190:2150–2158. doi: 10.4049/jimmunol.1202234.
- 11. Kohler S, et al. The early cellular signatures of protective immunity induced by live viral vaccination. Eur J Immunol. 2012;42:2363–2373. doi: 10.1002/eji.201142306.
- 12. Kongsgaard M, et al. Adaptive immune responses to booster vaccination against yellow fever virus are much reduced compared to those after primary vaccination. Sci Rep. 2017;7:1–14. doi: 10.1038/s41598-017-00798-1.
- 13. Fuertes Marraco SA, et al. Long-lasting stem cell-like memory CD8+ T cells with a naïve-like profile upon yellow fever vaccination. Sci Transl Med. 2015;7:282ra48. doi: 10.1126/scitranslmed.aaa3700.
- 14. DeWitt WS, et al. Dynamics of the cytotoxic T cell response to a model of acute viral infection. J Virol. 2015;89:4517–4526. doi: 10.1128/JVI.03474-14.
- 15. Zvyagin IV, et al. Distinctive properties of identical twins’ TCR repertoires revealed by high-throughput sequencing. Proc Natl Acad Sci USA. 2014;111:5980–5985. doi: 10.1073/pnas.1319389111.
- 16. Rubelt F, et al. Individual heritable differences result in unique cell lymphocyte receptor repertoires of naïve and antigen-experienced cells. Nat Commun. 2016;7:11112. doi: 10.1038/ncomms11112.
- 17. Qi Q, et al. Diversification of the antigen-specific T cell receptor repertoire after varicella zoster vaccination. Sci Transl Med. 2016;8:332ra46. doi: 10.1126/scitranslmed.aaf1725.
- 18. Pogorelyy MV, et al. Persisting fetal clonotypes influence the structure and overlap of adult human T cell receptor repertoires. PLoS Comput Biol. 2017;13:e1005572. doi: 10.1371/journal.pcbi.1005572.
- 19. Robinson MD, McCarthy DJ, Smyth GK. edgeR: A bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2009;26:139–140. doi: 10.1093/bioinformatics/btp616.
- 20. Shugay M, et al. VDJdb: A curated database of T-cell receptor sequences with known antigen specificity. Nucleic Acids Res. 2017;46:419–427. doi: 10.1093/nar/gkx760.
- 21. Miles JJ, Douek DC, Price DA. Bias in the T-cell repertoire: Implications for disease pathogenesis and vaccination. Immunol Cell Biol. 2011;89:375–387. doi: 10.1038/icb.2010.139.
- 22. De Boer RJ, Homann D, Perelson AS. Different dynamics of CD4+ and CD8+ T cell responses during and after acute lymphocytic choriomeningitis virus infection. J Immunol. 2003;171:3928–3935. doi: 10.4049/jimmunol.171.8.3928.
- 23. Buchholz VR, et al. Disparate individual fates compose robust CD8+ T cell immunity. Science. 2013;340:630–635. doi: 10.1126/science.1235454.
- 24. Blom K, et al. Specificity and dynamics of effector and memory CD8 T cell responses in human tick-borne encephalitis virus infection. PLoS Pathog. 2015;11:e1004622. doi: 10.1371/journal.ppat.1004622.
- 25. Lindgren T, et al. Longitudinal analysis of the human T cell response during acute hantavirus infection. J Virol. 2011;85:10252–10260. doi: 10.1128/JVI.05548-11.
- 26. Rivino L. Understanding the human T cell response to dengue virus. In: Hilgenfeld R, Vasudevan SG, editors. Dengue and Zika: Control and Antiviral Treatment Strategies. Springer Singapore; Singapore: 2018. pp. 241–250.
- 27. James EA, et al. Yellow fever vaccination elicits broad functional CD4+ T cell responses that recognize structural and nonstructural proteins. J Virol. 2013;87:12794–12804. doi: 10.1128/JVI.01160-13.
- 28. Murugan A, Mora T, Walczak AM, Callan CG. Statistical inference of the generation probability of T-cell receptors from sequence repertoires. Proc Natl Acad Sci USA. 2012;109:16161–16166. doi: 10.1073/pnas.1212755109.
- 29. Emerson RO, et al. Immunosequencing identifies signatures of cytomegalovirus exposure history and HLA-mediated effects on the T cell repertoire. Nat Genet. 2017;49:659–665. doi: 10.1038/ng.3822.
- 30. Pogorelyy MV, et al. Method for identification of condition-associated public antigen receptor sequences. eLife. 2018;7:e33050. doi: 10.7554/eLife.33050.
- 31. Shugay M, et al. Towards error-free profiling of immune repertoires. Nat Methods. 2014;11:653–655. doi: 10.1038/nmeth.2960.
- 32. Bolotin DA, et al. MiXCR: Software for comprehensive adaptive immunity profiling. Nat Methods. 2015;12:380–381. doi: 10.1038/nmeth.3364.
- 33. Mora T, Walczak A. Quantifying lymphocyte receptor diversity. In: Das JD, Jayaprakash C, editors. System Immunology. CRC Press; Boca Raton, FL: 2018. pp. 185–199.