Can we predict T cell specificity with digital biology and machine learning?

Dan Hudson; Ricardo A Fernandes; Mark Basham; Graham Ogg; Hashem Koohy

doi:10.1038/s41577-023-00835-3

2023 Feb 8:1–11. Online ahead of print. doi: 10.1038/s41577-023-00835-3

PMCID: PMC9908307 PMID: 36755161

Abstract

Recent advances in machine learning and experimental biology have offered breakthrough solutions to problems such as protein structure prediction that were long thought to be intractable. However, despite the pivotal role of the T cell receptor (TCR) in orchestrating cellular immunity in health and disease, computational reconstruction of a reliable map from a TCR to its cognate antigens remains a holy grail of systems immunology. Current data sets are limited to a negligible fraction of the universe of possible TCR–ligand pairs, and performance of state-of-the-art predictive models wanes when applied beyond these known binders. In this Perspective article, we make the case for renewed and coordinated interdisciplinary effort to tackle the problem of predicting TCR–antigen specificity. We set out the general requirements of predictive models of antigen binding, highlight critical challenges and discuss how recent advances in digital biology such as single-cell technology and machine learning may provide possible solutions. Finally, we describe how predicting TCR specificity might contribute to our understanding of the broader puzzle of antigen immunogenicity.

Subject terms: Immunological memory, Machine learning, Autoimmune diseases

Koohy and co-workers discuss how we must turn to machine-learning approaches to define the antigen specificity of the many millions of possible T cell receptors. They review the models and methods currently being used to predict cognate antigens for orphan T cell receptors.

Introduction

T cells typically recognize antigens presented on members of the MHC protein family via highly diverse heterodimeric T cell receptors (TCRs) expressed at their surface (Fig. 1). These antigens are commonly short peptide fragments of eight or more residues, the presentation of which is dictated in large part by the structural preferences of the MHC allele¹. Lipid, metabolite and oligosaccharide T cell antigens have also been reported^2–4. TCRs typically engage antigen–MHC complexes via one or more of their six complementarity-determining region (CDR) loops, three contributed by each chain of the TCR dimer.

Fig. 1 — a, Cartoon illustrating cancer cell antigen presentation to a naive T cell; T cell activation and expansion and effector T cell engagement of the cancer cell. b, Antigen recognition by conventional T cells through the interaction of the αβ T cell receptor (TCR) heterodimer with peptide antigen presented by an MHC class I molecule. c, Crystal structure of the affinity-enhanced A3A TCR engaging with melanoma-associated antigen 3 (MAGE-A3)-derived peptide presented by HLA-A*01 (ref. ¹⁰¹) (generated with data from ref. ¹⁰¹ and visualized with PyMOL (see Related links)).

The pivotal role of the TCR in surveillance and response to disease, and in the development of new vaccines and therapies, has driven concerted efforts to decode the rules by which T cells recognize cognate antigen–MHC complexes. However, cost and experimental limitations have restricted the available databases to just a minute fraction of the possible sample space of TCR–antigen binding pairs (Box 1). As we discuss later, these data sets^5–8 are also poorly representative of the universe of self and pathogenic epitopes and of the varied MHC contexts in which they may be presented (Fig. 2).

Fig. 2 — a, Number of T cell receptors (TCRs) containing α-chains, β-chains or paired chains, showing variation in numbers according to the data set (manually curated catalogue of pathology-associated TCR sequences (McPas-TCR), VDJ database (VDJdb), Immune Epitope Database (IEDB) and multiplex identification of TCR antigen specificity (MIRA)). b, Number of TCRs per antigen species of origin, showing that the majority of all antigens reported as binding a TCR are of viral origin. c, Cumulative frequency of antigens, showing that a group of 100 antigens makes up 70% of TCR–antigen pairs. d, Number of TCRs by HLA-A type, showing that known antigens are reported in complex with only a few common HLA alleles. e, Frequency histogram showing that most antigens have only one known cognate TCR in the combined data set^5–8.

The research community has therefore turned to machine learning models as a means of predicting the antigen specificity of the so-called orphan TCRs having no known experimentally validated cognate antigen. Accurate prediction of TCR–antigen specificity can be described as deriving computational solutions to two related problems: first, given a TCR of unknown antigen specificity, which antigen–MHC complexes is it most likely to bind; and second, given an antigen–MHC complex, which are the most likely cognate TCRs?

A critical requirement of models attempting to answer these questions is that they should be able to make accurate predictions for any combination of TCR and antigen–MHC complex. These should cover both ‘seen’ pairs included in the data on which the model was trained and novel or ‘unseen’ TCR–epitope pairs to which the model has not been exposed⁹. Impressive advances have been made for specificity inference of seen epitopes in particular disease contexts. For example, clusters of TCRs having common antigen specificity have been identified for Mycobacterium tuberculosis¹⁰ and SARS-CoV-2 (ref. ¹¹), providing possible avenues for new vaccine and pharmaceutical development. However, as discussed later, performance for seen epitopes wanes beyond a small number of immunodominant viral epitopes and is generally poor for unseen epitopes^9,12. This matters because many epitopes encountered in nature will not have an experimentally validated cognate TCR, particularly those of human or non-viral origin (Fig. 2). In the text to follow, we refer to the case for generalizable TCR–antigen specificity inference, meaning prediction of binding for both seen and unseen antigens in any MHC context.

We must also make an important distinction between the related tasks of predicting TCR specificity and antigen immunogenicity. The former, and the focus of this article, is the prediction of binding between sets of TCRs and antigen–MHC complexes. The latter can be described as predicting whether a given antigen will induce a functional T cell immune response: a complex chain of events spanning antigen expression, processing and presentation, TCR binding, T cell activation, expansion and effector differentiation. Although great strides have been made in improving prediction of antigen processing and presentation for common HLA alleles, the nature and extent to which presented peptides trigger a T cell response are yet to be elucidated¹³. A significant gap also remains for the prediction of T cell activation for a given peptide^14,15, and the parameters that influence pathological peptide or neoantigen immunogenicity remain under intense investigation¹⁶. We believe that only by integrating knowledge of antigen presentation, TCR recognition, context-dependent activation and effector function at the cell and tissue level will we fully realize the benefits to fundamental and translational science (Box 2).

Box 1 The extraordinary diversity of TCR–antigen pairs.

At a conservative estimate of 5 million unique T cell receptors (TCRs) per individual at a given time¹⁰², a global population of 8 billion sharing 11% of their TCRs¹⁰² would represent a unique TCR pool of 3.6 × 10¹⁵. This figure excludes recognition of antigens from over 1,400 pathogens known to be capable of infecting humans¹⁰³, binding to self and neoantigens and presentation of antigens in over 34,000 HLA contexts¹⁰⁴. The universe of feasible TCR–antigen–MHC combinations is, therefore, likely to be orders of magnitude higher, especially when accounting for degeneracy in TCR–antigen recognition.

Box 2 Implications of accurate TCR specificity prediction.

The ability to accurately predict the cognate ligand of a given T cell receptor (TCR) or antigen–MHC complex has important implications for the design of new therapies and vaccines and our understanding of the biological role of T cells in health and disease³⁸.

In oncology, T cell antigen recognition has become the focus of new drug development efforts, including checkpoint inhibitors, chimeric antigen receptor (CAR) T cells, endogenous or affinity-enhanced TCRs, and cancer vaccines¹⁰⁵. Cross-reactivity in TCR-based T cell therapies has presented a major roadblock to the development of safe interventions, and gaps in preclinical screening have led to tragedies in the clinic¹⁰⁶. Accurate and generalizable specificity inference could provide an additional safety net to robust experimental screens, predicting likely autoreactivity for a given patient population in oncology and beyond^107,108.

Beyond the implications for new medicines development, there is significant potential to use predictive tools to dissect the fundamental role of T cells in the surveillance of malignancy. For example, there are reports of the accumulation of clones with driver mutations in sun-exposed skin¹⁰⁹, but the extent to which mutational burden is reflected in TCR repertoires is not well understood¹¹⁰. Exhausted cytotoxic CD8⁺ T cells have long been known to be a hallmark of an inefficient antitumour immune response^111–113. However, although early data are emerging^114,115, we do not yet fully know whether T cells with particular antigen specificity are more likely to be exhausted.

For infectious diseases such as SARS-CoV-2, predictors of T cell specificity could be of great use in understanding the magnitude and dynamics of antigen-specific T cell responses to the disease¹¹⁶ and vaccination¹¹⁷. However, there remains a significant opportunity to improve open-source systems immunology tools for confident linkage of T cell antigen specificity to differential vaccine-induced response.

Linkage of expanded effector T cell populations to their cognate self-antigen will also provide vital diagnostic clues as to disease aetiology of autoimmune conditions. This is exemplified by a recent longitudinal study that demonstrated an association between Epstein–Barr virus infection and the incidence of multiple sclerosis, supportive of new vaccine development¹¹⁸.

State of the art

From deepening our mechanistic understanding of disease to providing routes for accelerated development of safer, personalized vaccines and therapies, the case for constructing a complete map of TCR–antigen interactions is compelling. We now explore some of the experimental and computational progress made to date, highlighting possible explanations for why generalizable prediction of TCR binding specificity remains a daunting task.

Experimental methods

The development of recombinant antigen–MHC multimer assays¹⁷ has proved transformative in the analysis of TCR–antigen specificity, enabling researchers to track and study T cell populations under various conditions and disease settings^18–20. Nonetheless, critical limitations remain that hamper high-throughput determination of TCR–antigen specificity. We direct the interested reader to a recent review²¹ for a thorough comparison of these technologies and summarize some of the principal issues below.

Antigen–MHC multimers may be used to determine TCR specificity using bulk (pooled) T cell populations, or newer single-cell methods. Bulk methods are widely used and relatively inexpensive, but do not provide information on αβ TCR chain pairing or function. As a result, single chain TCR sequences predominate in public data sets (Fig. 2). However, both α-chains and β-chains contribute to antigen recognition and specificity^22,23. We shall discuss the implications of this for modelling approaches later. Multimodal single-cell technologies provide insight into chain pairing and transcriptomic and phenotypic profiles at cellular resolution, but remain prohibitively expensive, return fewer TCR sequences per run than bulk experiments and show significant bias towards TCRs with high specificity^24–26. The appropriate experimental protocol for the reduction of nonspecific multimer binding, validation of correct folding and computational improvement of signal-to-noise ratios remain active fields of debate^25,26. Indeed, concerns over nonspecific binding have led recent computational studies to exclude data derived from a 10× study of four healthy donors²⁷.

Although bulk and single-cell methods are limited to a modest number of antigen–MHC complexes per run, the advent of technologies such as lentiviral transfection assays^28,29 provides scalability to up to 96 antigen–MHC complexes through library-on-library screens. However, previous knowledge of the antigen–MHC complexes of interest is still required. This precludes epitope discovery in unknown, rare, sequestered, non-canonical and/or non-protein antigens³⁰.

The advent of synthetic peptide display libraries (Fig. 3a) permits the extension of binding analysis to hundreds of thousands of peptides per TCR^30–33. Using transgenic yeast expressing synthetic peptide–MHC constructs from a library of 2 × 10⁸ peptides, Birnbaum et al.³¹ dissected the binding preferences of autoreactive mouse and human TCRs, providing clues as to the mechanisms underlying autoimmune targeting in multiple sclerosis. High-throughput library screens such as these provide opportunities for improved screening of the antigen–MHC space, but limit analysis to individual TCRs and rely on TCR–MHC binding instead of function. There remains a need for high-throughput linkage of antigen specificity and T cell function, for example, through mammalian or bead display^34–37.

As a result of these barriers to scalability, only a minuscule fraction of the total possible sample space of TCR–antigen pairs (Box 1) has been validated experimentally. At the time of writing, fewer than 1 million unique TCR–epitope pairs are available from VDJdb, McPas-TCR, the Immune Epitope Database and the MIRA data set^5–8 (Fig. 2). Just 4% of these instances contain complete chain pairing information (Fig. 2a). About 97% of all antigens reported as binding a TCR are of viral origin, and a group of just 100 antigens makes up 70% of TCR–antigen pairs (Fig. 2b,c). Where the HLA context of a given antigen is known, the training data are dominated by antigens presented by a handful of common alleles (Fig. 2d). Many antigens have only one known cognate TCR (Fig. 2e). These limitations have simultaneously provided the motivation for and the greatest barrier to computational methods for the prediction of TCR–antigen specificity.
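The concentration figures quoted above (for example, 100 antigens accounting for 70% of pairs) are cumulative-coverage statistics. The following is a minimal sketch of how such a statistic could be computed from a list of TCR–epitope records. The function name, the record layout and the toy records are illustrative assumptions and are not taken from VDJdb, McPas-TCR, IEDB or MIRA.

```python
from collections import Counter


def antigens_needed_for_coverage(pairs, target=0.70):
    """Return how many of the most frequent antigens cover `target` of all pairs.

    `pairs` is an iterable of (tcr_id, epitope) tuples (hypothetical layout).
    """
    counts = Counter(epitope for _tcr, epitope in pairs)
    total = sum(counts.values())
    covered = 0
    # Walk antigens from most to least frequent, accumulating their share.
    for n, (_epitope, count) in enumerate(counts.most_common(), start=1):
        covered += count
        if covered / total >= target:
            return n
    return len(counts)


# Toy, made-up records: one dominant antigen and a tail of rare ones.
toy_pairs = (
    [("tcr_a%d" % i, "EPITOPE_A") for i in range(6)]
    + [("tcr_b%d" % i, "EPITOPE_B") for i in range(2)]
    + [("tcr_c0", "EPITOPE_C"), ("tcr_d0", "EPITOPE_D")]
)

print(antigens_needed_for_coverage(toy_pairs, target=0.70))
```

In this toy set the two most frequent antigens already account for 8 of the 10 pairs, which mirrors, at a very small scale, the skew described for the public data sets.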

Computational methods

A comprehensive survey of computational models for TCR specificity inference is beyond the scope intended here but can be found in the following helpful reviews^15,38–42. Broadly speaking, current models can be divided into two categories, which we dub supervised predictive models (SPMs) (Fig. 3b) and unsupervised clustering models (UCMs) (Fig. 3c) on account of their respective use of supervised learning and unsupervised learning. A non-exhaustive summary of recent open-source SPMs and UCMs can be found in Table 1.

Table 1.

A non-exhaustive list of supervised and unsupervised models for inference of TCR epitope specificity published since 2020

Model	Date	TCR chain input	Training data	Method	Availability
Supervised predictive models
ATM-TCR⁸¹	07/2022	Single	IEDB⁷ McPas-TCR⁶ VDJdb⁵	DNN-SPM	https://github.com/Lee-CBG/ATM-TCR
ImmuneML⁸²	11/2021	Paired	Heikkila⁸³ VDJdb⁵	DNN-SPM	https://immuneml.uio.no/
NetTCR2 (ref. ⁴⁴)	10/09/2021	Single or paired	IEDB⁷ VDJdb⁵ 10×⁸⁴	DNN-SPM	https://services.healthtech.dtu.dk/service.php?NetTCR-2.0
SwarmTCR⁸⁵	07/09/2021	Single or paired	IEDB⁷ VDJdb⁵ Private	SPM	https://github.com/thecodingdoc/SwarmTCR
ImRex⁹	07/2021	Single	VDJdb⁵ Dean⁸⁶	DNN-SPM	https://github.com/pmoris/ImRex
Luu et al.⁴³	04/2021	Single	IEDB⁷ McPas-TCR⁶ PIRD⁸⁷ VDJdb⁵	DNN-SPM	https://github.com/jssong-lab/TCR-Epitope-Binding
TCRGP⁸⁸	03/2021	Single or paired	Dash et al.⁵⁴ VDJdb⁵	SPM	https://github.com/emmijokinen/TCRGP
TcellMatch⁴⁸	08/2020	Paired	IEDB⁷ VDJdb⁵ 10×⁸⁴	DNN-SPM	https://github.com/theislab/tcellmatch
SETE⁸⁹	06/2020	Single	Dash et al.⁵⁴ VDJdb⁵	SPM	https://github.com/wonanut/SETE
Unsupervised clustering models^a
ClusTCR⁵⁵	12/2021	Single or paired	VDJdb⁵ Emerson²³	UCM	https://github.com/svalkiers/clusTCR
TCRdist3 (ref. ¹¹)	11/2021	Single or paired	Dash et al.⁵⁴ Nolan⁸ Snyder⁹⁰ VDJdb⁵	UCM	https://github.com/kmayerb/tcrdist3/
GIANA⁵¹	08/2021	Single	Dash et al.⁵⁴ Glanville et al.¹⁹ IEDB⁷ VDJdb⁵ Zhang et al.⁹¹	UCM	https://github.com/s175573/GIANA
GLIPH2 (ref. ¹⁰)	04/2020	Single or paired	Private VDJdb⁵	UCM	http://50.255.35.37:8080/
iSMART⁹²	03/2020	Single	Emerson²³ VDJdb⁵ TCGA	UCM	https://github.com/s175573/iSMART
Other
TCRDock⁶⁶	08/2022	Paired	BFD⁷⁰ PDB⁹³	Pre-trained DNN and DNN-SPM	https://github.com/phbradley/TCRdock
TCR-BERT⁴⁹	11/2021	Single	PIRD⁸⁷ VDJdb⁵ TCRdb⁹⁴	Pre-trained DNN and DNN-SPM	https://huggingface.co/wukevin/tcr-bert
TITAN¹²	07/2021	Single	BindingDB⁹⁵ VDJdb⁵ ImmuneCODE⁹⁶	Pre-trained DNN and DNN-SPM	https://github.com/PaccMann/TITAN
pMTNet⁴⁷	09/2021	Single	Chen et al.⁹⁷ Huth et al.⁹⁸ Joglekar et al.³⁷ PIRD⁸⁷ McPas-TCR⁶ VDJdb⁵ Zhang et al.⁹¹ 10×⁸⁴	Pre-trained DNN and DNN-SPM	https://github.com/tianshilu/pMTnet
ERGO-II⁹⁹	04/2021	Single or paired	Kanakry et al.¹⁰⁰ McPas-TCR⁶ VDJdb⁵ Zhang et al.⁹¹	Pre-trained DNN-UCM and DNN-SPM	https://github.com/IdoSpringer/ERGO-II
ICON-TCRAI²⁵	05/2021	Single or paired	McPas-TCR⁶ VDJdb⁵ 10×⁸⁴	Pre-trained DNN and DNN-SPM	https://github.com/regeneron-mpds/ICON
TCRMatch⁵²	03/2021	Single	IEDB⁷,^a 10×⁸⁴	Unsupervised: nearest neighbour	https://www.github.com/IEDB/TCRMatch
DeepTCR⁵³	03/2021	Single or paired	Dash et al.⁵⁴ Glanville et al.¹⁹ Sidhom et al.⁵³ 10×⁸⁴ Multiple public sources: see DeepTCR GitHub repository	DNN-UCM or DNN-SPM	https://github.com/sidhomj/DeepTCR

ATM-TCR, multi-head self-attention model-TCR; BFD, Big Fantastic Database; DNN, deep neural network; ERGO, peptide TCR matching prediction; GIANA, geometric isometry-based TCR alignment algorithm; GLIPH2, grouping of lymphocytes by paratope hotspots 2; ICON-TCRAI, integrative context-specific normalisation TCR-artificial intelligence; IEDB, Immune Epitope Database; ImRex, interaction map recognition; McPas-TCR, manually curated catalogue of pathology-associated TCR sequences; PIRD, pan-immune repertoire database; pMTNet, pMHC-TCR binding prediction network; SETE, sequence-based ensemble learning; SPM, supervised predictive model; TCGA, The Cancer Genome Atlas; TCR, T cell receptor; TCR-BERT, TCR-bidirectional encoder representations from transformers; TCRGP, TCR Gaussian process; TITAN, TCR epitope bimodal attention networks; UCM, unsupervised clustering model; VDJdb, VDJ database. ^aNot all UCMs are explicitly trained, and so data sets reported for non-DNN UCMs are those on which the models were evaluated.

Supervised predictive models

SPMs are those which attempt to learn a function that will correctly predict the cognate epitope for a given input TCR of unknown specificity, given some training data set of known TCR–peptide pairs. The past 2 years have seen an acceleration of publications aiming to address this challenge with deep neural networks (DNNs). Although there are many possible approaches to comparing SPM performance, among the most consistently used is the area under the receiver-operating characteristic curve (ROC-AUC). One would expect to observe 50% ROC-AUC from a random guess in a binary (binding or non-binding) task, assuming a balanced proportion of negative and positive pairs.
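To make the ROC-AUC baseline concrete, the sketch below computes ROC-AUC as the probability that a randomly chosen binding pair is scored above a randomly chosen non-binding pair, and applies it to an uninformative and an idealised predictor. The function name and the simulated scores are illustrative assumptions; no real TCR model or data set is involved.

```python
import random


def roc_auc(labels, scores):
    """ROC-AUC via pairwise comparison of positives and negatives.

    Equals the fraction of (positive, negative) pairs in which the positive
    receives the higher score; ties contribute one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
labels = [i % 2 for i in range(1000)]            # balanced binders / non-binders
random_scores = [rng.random() for _ in labels]   # uninformative predictor
perfect_scores = [float(y) for y in labels]      # idealised predictor

print("random guess:", round(roc_auc(labels, random_scores), 3))
print("perfect:", roc_auc(labels, perfect_scores))
```

The uninformative predictor lands close to 0.5 and the idealised one at 1.0, which brackets the 47% to 97% range of values discussed in this section.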

Performance by this measure surpasses 80% ROC-AUC for a handful of ‘seen’ immunodominant viral epitopes presented by MHC class I^9,43. However, representation is not a guarantee of performance: 60% ROC-AUC has been reported for HLA-A2*01–CMV-NLVPMVATV⁴⁴, possibly owing to the recognition of this immunodominant antigen by diverse TCRs. Critically, few models explicitly evaluate the performance of trained predictors on unseen epitopes using comparable data sets. Weber et al.¹² achieved an average of 62 ± 6% ROC-AUC for TITAN, compared with 50% for ImRex on a reference data set of unseen epitopes from VDJdb and COVID-19 data sets. Values of 56 ± 5% and 55 ± 3% were reported for TITAN and ImRex, respectively, in a subsequent paper from the Meysman group⁴⁵. Other groups have published unseen epitope ROC-AUC values ranging from 47% to 97%; however, many of these values are reported on different data sets (Table 1), lack confidence estimates following validation^46–49 and have not been consistently reproducible in independent evaluations⁵⁰.

Together, these results highlight a critical need for a thorough, independent benchmarking study conducted across models on data sets prepared and analysed in a consistent manner^27,50. Until then, newer models may be applied with reasonable confidence to the prediction of binding to immunodominant viral epitopes by common HLA alleles. However, SPMs should be used with caution when generalizing to prediction of any epitope, as performance is likely to drop the further the epitope is in sequence from those in the training set⁹.
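One way to evaluate performance on unseen epitopes is to split the data by epitope rather than by individual pair, so that no test epitope appears in training. The following is a minimal sketch of such a split under assumed inputs; the function name, record layout and toy records are hypothetical, and this is not the protocol of any specific benchmark cited here.

```python
import random


def split_by_epitope(pairs, test_fraction=0.2, seed=0):
    """Hold out whole epitopes so that every test epitope is unseen in training.

    `pairs` is a list of (tcr_id, epitope) tuples (hypothetical layout).
    """
    epitopes = sorted({epitope for _tcr, epitope in pairs})
    rng = random.Random(seed)
    rng.shuffle(epitopes)
    # Always hold out at least one epitope.
    n_test = max(1, round(len(epitopes) * test_fraction))
    test_epitopes = set(epitopes[:n_test])
    train = [p for p in pairs if p[1] not in test_epitopes]
    test = [p for p in pairs if p[1] in test_epitopes]
    return train, test


# Toy, made-up records: 20 TCRs spread evenly over 5 epitopes.
toy_pairs = [("tcr_%d" % i, "epitope_%d" % (i % 5)) for i in range(20)]
train, test = split_by_epitope(toy_pairs)

print(len(train), "training pairs;", len(test), "held-out pairs")
```

A random split over pairs would instead place the same epitope in both sets and measure only seen-epitope performance.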

Unsupervised clustering models

Unlike SPMs, UCMs do not depend on the availability of labelled data, learning instead to produce groupings of the TCR, antigen or HLA input that reflect the underlying statistical variations of the data^19,51 (Fig. 3c). Applied to TCR repertoires, UCMs take as their input single or paired TCR CDR3 amino acid sequences, with or without gene usage information, and return a mapping of sequences to unique clusters. Clustering is achieved by determining the similarity between input sequences, using either ‘hand-crafted’ features such as sequence distance or enrichment of short sub-sequences, or by comparing abstract features learnt by DNNs (Table 1).
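As an illustration of clustering on a hand-crafted sequence distance, the sketch below groups CDR3 strings by single-linkage clustering on Levenshtein (edit) distance. It is a deliberately simple stand-in and is not the algorithm used by GLIPH2, TCRdist3, GIANA or any other model in Table 1; the distance threshold and the example strings are illustrative assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cluster_sequences(seqs, max_distance=1):
    """Single-linkage clustering: sequences linked by a chain of close pairs share a label."""
    parent = list(range(len(seqs)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if edit_distance(seqs[i], seqs[j]) <= max_distance:
                parent[find(i)] = find(j)

    # Relabel roots as consecutive integers in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(seqs))]


# Illustrative strings only; not drawn from any database.
toy_cdr3 = ["CASSLGQAYEQYF", "CASSLGQSYEQYF", "CASSLGQSYEQFF", "CAWSVGDTQYF"]
print(cluster_sequences(toy_cdr3, max_distance=1))
```

The all-against-all comparison is quadratic in the number of sequences, which is one reason simple distance-based schemes can scale poorly to large repertoires.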

Clustering provides multiple paths to specificity inference for orphan TCRs^39–41. Epitope specificity can be predicted by assuming that if an unlabelled TCR is similar to a receptor of known specificity, it will bind the same epitope⁵². One may also co-cluster unlabelled and labelled TCRs and assign the modal or most enriched epitope to all sequences that cluster together⁵¹. Finally, DNNs can be used to generate ‘protein fingerprints’, simple fixed-length numerical representations of complex variable input sequences that may serve as a direct input for a second supervised model^25,53.
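The co-clustering route can be sketched as a simple label-transfer step: given cluster assignments from any clustering method, each orphan TCR inherits the most frequent (modal) epitope among the labelled TCRs in its cluster. The function name and the toy inputs below are hypothetical, and the sketch ignores enrichment statistics that a real tool might apply.

```python
from collections import Counter, defaultdict


def assign_modal_epitope(cluster_ids, known_epitopes):
    """Assign each unlabelled TCR the modal epitope of its cluster.

    `cluster_ids[i]` is the cluster of TCR i; `known_epitopes[i]` is its
    epitope label, or None for an orphan TCR. Orphans in clusters with no
    labelled member stay None.
    """
    by_cluster = defaultdict(Counter)
    for cid, epitope in zip(cluster_ids, known_epitopes):
        if epitope is not None:
            by_cluster[cid][epitope] += 1

    predictions = []
    for cid, epitope in zip(cluster_ids, known_epitopes):
        if epitope is not None:
            predictions.append(epitope)          # keep experimental labels
        elif by_cluster[cid]:
            predictions.append(by_cluster[cid].most_common(1)[0][0])
        else:
            predictions.append(None)             # nothing to transfer
    return predictions


# Toy, made-up inputs: three clusters, three orphan TCRs.
toy_clusters = [0, 0, 0, 0, 1, 2, 2]
toy_labels = ["E1", "E1", "E2", None, None, "E3", None]
print(assign_modal_epitope(toy_clusters, toy_labels))
```

Note that this scheme assigns exactly one epitope per orphan TCR and so cannot represent cross-reactive receptors.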

As for SPMs, quantitative assessment of the relative merits of hand-crafted and neural network-based UCMs for TCR specificity inference remains limited to comparisons made by the proponents of each new model. Although some DNN-UCMs allow for the integration of paired chain sequences and even transcriptomic profiles⁴⁸, they are susceptible to the same training biases as SPMs and are notably less easy to implement than established clustering models such as GLIPH and TCRdist^19,54. However, these established clustering models scale relatively poorly to large data sets compared with newer releases^51,55. Recent analyses^27,53 suggest that there is little to differentiate commonly used UCMs from simple sequence distance measures. Here again, independent benchmarking analyses would be valuable, and our group is dedicating significant time and effort to this work.

Key challenges

Despite the exponential growth of unlabelled immune repertoire data and the recent unprecedented breakthroughs in the fields of data science and artificial intelligence, quantitative immunology still lacks a framework for the systematic and generalizable inference of T cell antigen specificity of orphan TCRs. Among the most plausible explanations for these failures are limitations in the data, methodological gaps and incomplete modelling of the underlying immunology.

Data

As we have set out earlier, the single most significant limitation to model development is the availability of high-quality TCR and antigen–MHC pairs. The need is most acute for under-represented antigens, for those presented by less frequent HLA alleles, and for linkage of epitope specificity and T cell function. Meanwhile, single-cell multimodal technologies have given rise to hundreds of millions of unlabelled TCR sequences^8,56, linked to transcriptomics, phenotypic and functional information. However, these unlabelled data are not without significant limitations. Notably, biological factors such as age, sex, ethnicity and disease setting vary between studies and are likely to influence immune repertoires. Differences in experimental protocol, sequence pre-processing, total variation filtering (denoising) and normalization between laboratory groups are also likely to have an impact: batch correction may well need to be applied⁵⁷. Therefore, thoughtful approaches to data consolidation, noise correction, processing and annotation are likely to be crucial in advancing state-of-the-art predictive models.

Modelling

The exponential growth of orphan TCR data from single-cell technologies, and cutting-edge advances in artificial intelligence and machine learning, has firmly placed TCR–antigen specificity inference in the spotlight. However, we believe that several critical gaps must be addressed before a solution to generalized epitope specificity inference can be realized.

First, models whose TCR sequence input is limited to the use of β-chain CDR3 loops and VDJ gene codes are only ever likely to tell part of the story of antigen recognition, and the extent to which single-chain information is sufficient to describe TCR–antigen specificity remains an open question. Structural⁵⁸ and statistical⁵⁹ analyses suggest that α-chains and β-chains contribute equally to specificity, and incorporating both chains has improved predictive performance⁴⁴. However, chain pairing information is largely absent (Fig. 2a), and many state-of-the-art SPMs and UCMs rely on single chain information alone (Table 1). Although CDR3 loops may be primarily responsible for antigen recognition, residues from CDR1, CDR2 and even the framework region of both α-chains and β-chains may be involved⁵⁸. Subtle compensatory changes in interaction networks between peptide–MHC and TCR, altered binding modes and conformational flexibility in both TCR and MHC may underpin TCR cross-reactivity^60,61. Explicit encoding of structural information for specificity inference has until recently been restricted to studies of a small set of crystal structures^19,62. However, the advent of automated protein structure prediction with software programs such as RoseTTAFold, ESMFold and AlphaFold-Multimer provides potential opportunities for large-scale sequence and structure interpretations of TCR epitope specificity^63–65. This has been illustrated in a recent preprint in which a modified version of AlphaFold-Multimer was used to identify the most likely binder to a given TCR, achieving a mean ROC-AUC of 82% on a small pool of eight seen epitopes⁶⁶.

To train models, balanced sets of negative and positive samples are required. In the absence of experimental negatives, negative instances may be produced by shuffling or drawing randomly from healthy donor repertoires⁹. However, these approaches assume, on the one hand, that TCRs do not cross-react and, on the other hand, that the healthy donor repertoires do not include sequences reactive to the epitopes of interest. A recent study from Jiang et al.⁶⁷ provides interesting strategies to address this challenge.
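The shuffling strategy can be sketched as follows: each TCR from a known positive pair is re-paired with an epitope it has not been reported to bind. The function name and toy records are hypothetical. The key assumption, flagged in the code, is that every unobserved TCR–epitope combination is a true non-binder, which is exactly the no-cross-reactivity assumption questioned above.

```python
import random


def shuffled_negatives(positive_pairs, n_per_positive=1, seed=0):
    """Create presumed-negative pairs by mismatching TCRs and epitopes.

    ASSUMPTION: any (tcr, epitope) combination absent from `positive_pairs`
    is treated as non-binding. Cross-reactive TCRs violate this assumption.
    """
    rng = random.Random(seed)
    positives = set(positive_pairs)
    epitopes = sorted({epitope for _tcr, epitope in positive_pairs})
    negatives = set()
    for tcr, _epitope in positive_pairs:
        # Only epitopes with no reported binding to this TCR are eligible.
        candidates = [e for e in epitopes if (tcr, e) not in positives]
        rng.shuffle(candidates)
        for e in candidates[:n_per_positive]:
            negatives.add((tcr, e))
    return sorted(negatives)


# Toy, made-up positives (tcr_id, epitope).
toy_positives = [("tcr1", "E1"), ("tcr2", "E1"), ("tcr3", "E2"), ("tcr4", "E3")]
toy_negatives = shuffled_negatives(toy_positives)
print(toy_negatives)
```

With one negative per positive, the resulting set is balanced, matching the condition under which a random guess is expected to give 50% ROC-AUC.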

Finally, developers should use the increasing volume of functionally annotated orphan TCR data to boost performance through transfer learning: a technique in which models are trained on a large volume of unlabelled or partially labelled data, and the patterns learnt from those data sets are used to inform a second predictive task. This technique has been widely adopted in computational biology, including in predictive tasks for T and B cell receptors^49,66,68. Indeed, the best-performing configuration of TITAN used a TCR module that had been pretrained on a BindingDB database (see Related links) of 471,017 protein–ligand pairs¹². Incorporating evolutionary and structural information through sequence and structure-aware representations of the TCR and of the antigen–MHC complex^69,70 may yield further benefits.

Immunology

It is now evident that the underlying immunological correlates of T cell interaction with their cognate ligands are highly variable and only partially understood, with critical consequences for model design. Importantly, TCR–antigen specificity inference is just one part of the larger puzzle of antigen immunogenicity prediction^16,18, which we condense into three phases: antigen processing and presentation by MHC, TCR recognition and T cell response.

Antigen processing and presentation pathways have been extensively studied, and computational models for predicting peptide binding affinity to some MHC alleles, especially class I HLAs, have achieved near perfect ROC-AUC^15,71 for common alleles. However, this problem is far from solved, particularly for less-frequent MHC class I alleles and for MHC class II alleles⁷.

A key challenge to generalizable TCR specificity inference is that TCRs are at once specific for antigens bearing particular motifs and capable of considerable promiscuity^72,73. This contradiction might be explained through specific interaction of conserved ‘hotspot’ residues in the TCR CDR loops with corresponding two to three residue clusters in the antigen, balanced by a greater tolerance of variations in amino acids at other positions⁶⁰. TCRs may also bind different antigen–MHC complexes using alternative docking topologies⁵⁸. Despite the known potential for promiscuity in the TCR, the pre-processing stages of many models assume that a given TCR has only one cognate epitope. Another under-explored yet highly relevant factor of T cell recognition is the impact of positive and negative thymic selection and more specifically the effect of self-peptide presentation in formation of the naive immune repertoire⁷⁴.

Many groups have attempted to bypass this complexity by predicting antigen immunogenicity independent of the TCR¹⁴, as a direct mapping from peptide sequence to T cell activation. However, similar limitations have been encountered for those models as we have described for specificity inference. Many predictors are trained using epitopes from the Immune Epitope Database labelled with readouts from single time points⁷. Achar et al.⁷⁵, however, illustrated that integrating cytokine responses over time improved prediction of antigen quality. Antigen load and affinity can also play important roles^74,76. Thus, models capable of predicting functional T cell responses will likely need to bridge from antigen presentation to TCR–antigen recognition, T cell activation and effector differentiation and to integrate complex tissue-specific cytokine, cell phenotype and spatiotemporal data sets. Our view is that, although T cell-independent predictors of immunogenicity have clear translational benefits, only after we can dissect the relative contribution of the three stages described earlier will we understand what determines antigen immunogenicity.

New experimental and computational techniques that permit the integration of sequence, phenotypic, spatial and functional information and the multimodal analyses described earlier provide promising opportunities in this direction^75,77. Integrating TCR sequence and cell-specific covariates from single-cell data has been shown to improve performance in the inference of T cell antigen specificity⁴⁸. By taking a graph-theoretical approach, Schattgen et al.⁷⁸ reported an association between clonotype clustering and the cellular phenotypes derived from gene expression and surface marker expression. We believe that such integrative approaches will be instrumental in unlocking the secrets of T cell antigen recognition.

Conclusions and call to action

Together, the limitations of data availability, methodology and immunological context leave a significant gap in the field of T cell immunology in the era of machine learning and digital biology. We believe that improvements in generalizable TCR–antigen specificity inference are within our collective grasp. Achieving them will require harnessing the massive volume of unlabelled TCR sequences emerging from single-cell data, applying data augmentation techniques to counteract epitope and HLA imbalances in labelled data, incorporating sequence and structure-aware features, and applying cutting-edge computational techniques based on rich functional and binding data. To aid in this effort, we encourage the following actions from the community.

First, a consolidated and validated library of labelled and unlabelled TCR data should be made available to facilitate model pretraining and systematic comparisons. Second, a coordinated effort should be made to improve the coverage of TCR–antigen pairs presented by less common HLA alleles and non-viral epitopes. We encourage the continued publication of negative and positive TCR–epitope binding data to produce balanced data sets. Third, an independent, unbiased and systematic evaluation of model performance across SPMs, UCMs and combinations of the two (Table 1) would be of great use to the community. Such a comparison should account for performance on common and infrequent HLA subtypes, seen and unseen TCRs and epitopes, using consistent evaluation metrics including but not limited to ROC-AUC and area under the precision–recall curve. We encourage validation strategies such as those used in the assessment of ImRex and TITAN^9,12 to substantiate model performance comparisons. In the future, TCR specificity inference data should be extended to include multimodal contextual information as a means of bridging from TCR binding to immunogenicity prediction.
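The stratified comparison recommended above can be illustrated with a minimal Python sketch. It reports ROC-AUC and an average-precision estimate of PR-AUC separately for epitopes that were seen and unseen during training. The function names, the placeholder epitope labels (`EPITOPE_A`, `EPITOPE_B`) and the scores are all hypothetical and chosen only for illustration; they are not taken from any of the models discussed in this article.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive outscores a random negative.

    Tied scores count as half a win. Returns None if only one class is present.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pr_auc(labels, scores):
    """Average precision: mean precision at the rank of each positive.

    This is a step-wise estimate of PR-AUC; tied scores are ordered arbitrarily.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        return None
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


def evaluate_by_stratum(records, train_epitopes):
    """Compute both metrics separately for seen and unseen epitopes."""
    strata = defaultdict(lambda: ([], []))
    for epitope, label, score in records:
        key = "seen" if epitope in train_epitopes else "unseen"
        strata[key][0].append(label)
        strata[key][1].append(score)
    return {
        key: {"roc_auc": roc_auc(y, s), "pr_auc": pr_auc(y, s)}
        for key, (y, s) in strata.items()
    }


# Hypothetical test records: (epitope, true label, predicted binding score).
records = [
    ("EPITOPE_A", 1, 0.9), ("EPITOPE_A", 0, 0.2),
    ("EPITOPE_A", 1, 0.7), ("EPITOPE_A", 0, 0.4),
    ("EPITOPE_B", 1, 0.6), ("EPITOPE_B", 0, 0.8),
    ("EPITOPE_B", 0, 0.3), ("EPITOPE_B", 1, 0.5),
]
results = evaluate_by_stratum(records, train_epitopes={"EPITOPE_A"})
for stratum, metrics in sorted(results.items()):
    print(stratum, metrics)
```

Reporting the two strata side by side makes any drop in performance on unseen epitopes explicit rather than averaging it away. The same grouping could be applied to common versus infrequent HLA subtypes.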

The scale and complexity of this task imply a need for an interdisciplinary consortium approach that systematically incorporates the latest immunological understanding of cellular immunity at the tissue level and cutting-edge developments in artificial intelligence and data science. This consortium should include experimental and computational immunologists, machine-learning experts and translational and industrial partners. Considering the success of the Critical Assessment of protein Structure Prediction (CASP) series⁷⁹, we encourage a similar approach to address the grand challenge of TCR specificity inference in the short term and, ultimately, the prediction of integrated T and B cell immunogenicity. Competing models should be made freely available for research use, following the commendable example set in protein structure prediction^65,70.

Acknowledgements

H.K. is supported by funding from the UK Medical Research Council grant number MC_UU_12010/3. D.H. receives support from the Biotechnology and Biological Sciences Research Council (BBSRC) (grant number BB/T008784/1) and is funded by the Rosalind Franklin Institute. The authors thank A. Simmons, B. McMaster and C. Lee for critical review. H.K. acknowledges A. Antanaviciute, A. Simmons, T. Elliott and P. Klenerman for their encouragement, support and fruitful conversations.

Glossary

Area under the receiver-operating characteristic curve (ROC-AUC): ROC-AUC and the area under the precision–recall curve (PR-AUC) summarize a model's tendency to make different classes of error. The underlying curves are produced for classification tasks by varying the threshold at which a model prediction falling between zero and one is assigned to the positive label class, for example, predicted binding of a given T cell receptor–antigen pair. ROC-AUC is the area under the curve described by a plot of the true positive rate against the false positive rate. ROC-AUC is typically more appropriate for problems where positive and negative labels are proportionally represented in the input data. PR-AUC is the area under the curve described by a plot of model precision against model recall. PR-AUC is typically more appropriate for problems in which the positive label is less frequently observed than the negative label.
Library-on-library screens: Experimental screens that permit analysis of the binding between large libraries of (for example) peptide–MHC complexes and various T cell receptors.
Machine learning models: A broad family of computational and statistical methods that aim to identify statistically conserved patterns within a data set without being explicitly programmed to do so. Machine learning models may broadly be described as supervised or unsupervised based on the manner in which the model is trained. Many recent models make use of both approaches.
Neural networks: A family of machine learning models inspired by the synaptic connections of the brain that are made up of stacked layers of simple interconnected models. Although each component of the network may learn a relatively simple predictive function, the combination of many predictors allows neural networks to perform arbitrarily complex tasks from millions or billions of instances. Neural networks may be trained using supervised or unsupervised learning and may deploy a wide variety of different model architectures. Deep neural networks refer to those with more than one intermediate layer.
Shuffling: In the absence of experimental negative (non-binding) data, shuffling is the act of assigning a given T cell receptor drawn from the set of known T cell receptor–antigen pairs to an epitope other than its cognate ligand, and labelling the randomly generated pair as a negative instance.
Supervised learning: Models that learn a mathematical function mapping from an input to a predicted label, given some data set containing both input data and associated labels. Common supervised tasks include regression, where the label is a continuous variable, and classification, where the label is a discrete variable.
Synthetic peptide display libraries: Experimental systems that make use of large libraries of recombinant synthetic peptide–MHC complexes displayed by yeast³⁰, baculovirus³², bacteriophage³³ or beads³⁵ for profiling the sequence determinants of immune receptor binding. Peptide diversity can reach 10⁹ unique peptides for yeast-based libraries.
Training data: The training data set serves as an input to the model from which it learns some predictive or analytical function.
Unsupervised learning: Models that learn to assign input data to clusters having similar features, or otherwise to learn the underlying statistical patterns of the data. Unlike supervised models, unsupervised models do not require labels. Common unsupervised techniques include clustering algorithms such as K-means, anomaly detection models, and dimensionality reduction techniques such as principal component analysis⁸⁰ and uniform manifold approximation and projection.
Validation: Analysis done using a validation data set to evaluate model performance during and after training. A given set of training data is typically subdivided into training and validation data, for example, in an 80%:20% ratio. Models may then be trained on the training data, and their performance evaluated on the validation data set.
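The glossary entries for shuffling and validation can be made concrete with a minimal Python sketch. It generates shuffled negative pairs from a set of known binders and then splits the combined examples in an 80%:20% ratio. The identifiers (`TCR_1`, `EPITOPE_A` and so on) are hypothetical placeholders rather than real sequences, and the function names are assumptions made for illustration.

```python
import random


def shuffle_negatives(positive_pairs, seed=0):
    """Pair each TCR with a random epitope other than its known cognate epitopes.

    Shuffled pairs are assumed, not experimentally confirmed, non-binders.
    """
    rng = random.Random(seed)
    known = set(positive_pairs)
    epitopes = sorted({epitope for _, epitope in positive_pairs})
    negatives = []
    for tcr, _ in positive_pairs:
        candidates = [e for e in epitopes if (tcr, e) not in known]
        if candidates:  # skip TCRs already paired with every epitope
            negatives.append((tcr, rng.choice(candidates), 0))
    return negatives


def train_validation_split(examples, validation_fraction=0.2, seed=0):
    """Randomly hold out a fraction of the examples for validation."""
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_val = round(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical known binders: (TCR identifier, epitope identifier).
positives = [
    ("TCR_1", "EPITOPE_A"), ("TCR_2", "EPITOPE_A"), ("TCR_3", "EPITOPE_B"),
    ("TCR_4", "EPITOPE_C"), ("TCR_5", "EPITOPE_C"),
]
negatives = shuffle_negatives(positives)
examples = [(tcr, epitope, 1) for tcr, epitope in positives] + negatives
train, validation = train_validation_split(examples)
print(len(train), "training examples;", len(validation), "validation examples")
```

Because TCRs can be promiscuous, some shuffled pairs may in fact be true binders, so labels produced in this way are best treated as noisy. A purely random split also leaves the same epitopes in both subsets and therefore does not measure performance on unseen epitopes.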

Author contributions

H.K. and D.H. researched and wrote the article. R.A.F., M.B. and G.O. reviewed and edited the manuscript before submission.

Peer review

Peer review information

Nature Reviews Immunology thanks M. Birnbaum, P. Holec, E. Newell and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Competing interests

G.O. is a co-founder of T-Cypher Bio. D.H. and R.A.F. provide consultancy services to companies active in T cell antigen discovery and vaccine development. The other authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Immune Epitope Database: https://www.iedb.org/

McPAS-TCR: http://friedmanlab.weizmann.ac.il/McPAS-TCR

MIRA: https://clients.adaptivebiotech.com/pub/covid-2020

PyMOL: https://www.schrodinger.com/products/pymol

VDJdb: https://vdjdb.cdr3.net/

References

1. Nguyen AT, Szeto C, Gras S. The pockets guide to HLA class I molecules. Biochem. Soc. Trans. 2021;49:2319–2331. doi: 10.1042/BST20210410.
2. de Jong A, Ogg G. CD1a function in human skin disease. Mol. Immunol. 2021;130:14–19. doi: 10.1016/j.molimm.2020.12.006.
3. de Libero G, Chancellor A, Mori L. Antigen specificities and functional properties of MR1-restricted T cells. Mol. Immunol. 2021;130:148–153. doi: 10.1016/j.molimm.2020.12.016.
4. Sun L, Middleton DR, Wantuch PL, Ozdilek A, Avci FY. Carbohydrates as T-cell antigens with implications in health and disease. Glycobiology. 2016;26:1029–1040. doi: 10.1093/glycob/cww062.
5. Bagaev DV, et al. VDJdb in 2019: database extension, new analysis infrastructure and a T-cell receptor motif compendium. Nucleic Acids Res. 2020;48:D1057–D1062. doi: 10.1093/nar/gkz874.
6. Tickotsky N, Sagiv T, Prilusky J, Shifrut E, Friedman N. McPAS-TCR: a manually curated catalogue of pathology-associated T cell receptor sequences. Bioinformatics. 2017;33:2924–2929. doi: 10.1093/bioinformatics/btx286.
7. Vita R, et al. The Immune Epitope Database (IEDB): 2018 update. Nucleic Acids Res. 2019;47:D339–D343. doi: 10.1093/nar/gky1006.
8. Nolan, S. et al. A large-scale database of T-cell receptor beta (TCRβ) sequences and binding associations from natural and synthetic exposure to SARS-CoV-2. Preprint at Res. Sq. https://www.researchsquare.com/article/rs-51964/v1 (2020).
9. Moris P, et al. Current challenges for unseen-epitope TCR interaction prediction and a new perspective derived from image classification. Brief. Bioinform. 2021;22:bbaa318. doi: 10.1093/bib/bbaa318.
10. Huang H, Wang C, Rubelt F, Scriba TJ, Davis MM. Analyzing the Mycobacterium tuberculosis immune response by T-cell receptor clustering with GLIPH2 and genome-wide antigen screening. Nat. Biotechnol. 2020;38:1194–1202. doi: 10.1038/s41587-020-0505-4.
11. Mayer-Blackwell K, et al. TCR meta-clonotypes for biomarker discovery with tcrdist3 enabled identification of public, HLA-restricted clusters of SARS-CoV-2 TCRs. eLife. 2021;10:e68605. doi: 10.7554/eLife.68605.
12. Weber A, Born J, Rodriguez Martínez M. TITAN: T cell receptor specificity prediction with bimodal attention networks. Bioinformatics. 2021;37:I237–I244. doi: 10.1093/bioinformatics/btab294.
13. Lee CH, Antanaviciute A, Buckley PR, Simmons A, Koohy H. To what extent does MHC binding translate to immunogenicity in humans? Immunoinformatics. 2021;3–4:100006. doi: 10.1016/j.immuno.2021.100006.
14. Buckley PR, et al. Evaluating performance of existing computational models in predicting CD8+ T cell pathogenic epitopes and cancer neoantigens. Brief. Bioinform. 2022;23:bbac141. doi: 10.1093/bib/bbac141.
15. Mösch A, Raffegerst S, Weis M, Schendel DJ, Frishman D. Machine learning for cancer immunotherapies based on epitope recognition by T cell receptors. Front. Genet. 2019;10:1141. doi: 10.3389/fgene.2019.01141.
16. Wells DK, et al. Key parameters of tumor epitope immunogenicity revealed through a consortium approach improve neoantigen prediction. Cell. 2020;183:818–834.e13. doi: 10.1016/j.cell.2020.09.015.
17. Altman JD, et al. Phenotypic analysis of antigen-specific T lymphocytes. Science. 1996;274:94–96. doi: 10.1126/science.274.5284.94.
18. Yao Y, Wyrożemski Ł, Lundin KEA, Kjetil Sandve G, Qiao S-W. Differential expression profile of gluten-specific T cells identified by single-cell RNA-seq. PLoS ONE. 2021;16:e0258029. doi: 10.1371/journal.pone.0258029.
19. Glanville J, et al. Identifying specificity groups in the T cell receptor repertoire. Nature. 2017;547:94–98. doi: 10.1038/nature22976.
20. Kurtulus S, Hildeman D. Assessment of CD4+ and CD8+ T cell responses using MHC class I and II tetramers. Methods Mol. Biol. 2013;979:71–79. doi: 10.1007/978-1-62703-290-2_8.
21. Joglekar AV, Li G. T cell antigen discovery. Nat. Methods. 2021;18:873–880. doi: 10.1038/s41592-020-0867-z.
22. Bosselut R, et al. Single T cell sequencing demonstrates the functional role of αβ TCR pairing in cell lineage and antigen specificity. Front. Immunol. 2019;10:1516. doi: 10.3389/fimmu.2019.01516.
23. Emerson RO, et al. Immunosequencing identifies signatures of cytomegalovirus exposure history and HLA-mediated effects on the T cell repertoire. Nat. Genet. 2017;49:659–665. doi: 10.1038/ng.3822.
24. Wang X, He Y, Zhang Q, Ren X, Zhang Z. Direct comparative analyses of 10× genomics chromium and Smart-Seq2. Genomics Proteomics Bioinformatics. 2021;19:253–266. doi: 10.1016/j.gpb.2020.02.005.
25. Zhang W, et al. A framework for highly multiplexed dextramer mapping and prediction of T cell receptor sequences to antigen specificity. Sci. Adv. 2021;7:eabf5835. doi: 10.1126/sciadv.abf5835.
26. Gascoigne N, et al. Optimized peptide-MHC multimer protocols for detection and isolation of autoimmune T-cells. Front. Immunol. 2018;9:1378. doi: 10.3389/fimmu.2018.01378.
27. Meysman P, et al. Benchmarking solutions to the T-cell receptor epitope prediction problem: IMMREP22 workshop report. bioRxiv. 2022. doi: 10.1101/2022.10.27.514020.
28. Dobson CS, et al. Antigen identification and high-throughput interaction mapping by reprogramming viral entry. Nat. Methods. 2022;19:449–460. doi: 10.1038/s41592-022-01436-z.
29. Guo XZJ, Elledge SJ. V-CARMA: a tool for the detection and modification of antigen-specific T cells. Proc. Natl Acad. Sci. USA. 2022;119:e2116277119. doi: 10.1073/pnas.2116277119.
30. Brophy SE, Holler PD, Kranz DM. A yeast display system for engineering functional peptide-MHC complexes. J. Immunol. Methods. 2003;272:235–246. doi: 10.1016/S0022-1759(02)00439-8.
31. Birnbaum ME, et al. Deconstructing the peptide-MHC specificity of T cell recognition. Cell. 2014;157:1073–1087. doi: 10.1016/j.cell.2014.03.047.
32. Crawford F, et al. Use of baculovirus MHC/peptide display libraries to characterize T-cell receptor ligands. Immunol. Rev. 2006;210:156–170. doi: 10.1111/j.0105-2896.2006.00365.x.
33. Coles CH, et al. TCRs with distinct specificity profiles use different binding modes to engage an identical peptide–HLA complex. J. Immunol. 2020;204:1943–1953. doi: 10.4049/jimmunol.1900915.
34. Kula T, et al. T-Scan: a genome-wide method for the systematic discovery of T cell epitopes. Cell. 2019;178:1016. doi: 10.1016/j.cell.2019.07.009.
35. Pan X, et al. Combinatorial HLA-peptide bead libraries for high throughput identification of CD8+ T cell specificity. J. Immunol. Methods. 2014;403:72–78. doi: 10.1016/j.jim.2013.11.023.
36. Li G, et al. T cell antigen discovery via trogocytosis. Nat. Methods. 2019;16:183–190. doi: 10.1038/s41592-018-0305-7.
37. Joglekar AV, et al. T cell antigen discovery via signaling and antigen-presenting bifunctional receptors. Nat. Methods. 2019;16:191–198. doi: 10.1038/s41592-018-0304-8.
38. Schaap-Johansen A-L, Vujovic M, Borch A, Hadrup SR, Marcatili P. T cell epitope prediction and its application to immunotherapy. Front. Immunol. 2021;12:712488. doi: 10.3389/fimmu.2021.712488.
39. Valkiers S, et al. Recent advances in T-cell receptor repertoire analysis: bridging the gap with multimodal single-cell RNA sequencing. Immunoinformatics. 2022;5:100009. doi: 10.1016/j.immuno.2022.100009.
40. Lee CH, et al. Predicting cross-reactivity and antigen specificity of T cell receptors. Front. Immunol. 2020;11:2498. doi: 10.3389/fimmu.2020.565096.
41. Vujovic M, et al. T cell receptor sequence clustering and antigen specificity. Comput. Struct. Biotechnol. J. 2020;18:2166–2173. doi: 10.1016/j.csbj.2020.06.041.
42. Katayama Y, Yokota R, Akiyama T, Kobayashi TJ. Machine learning approaches to TCR repertoire analysis. Front. Immunol. 2022;13:858057. doi: 10.3389/fimmu.2022.858057.
43. Luu AM, Leistico JR, Miller T, Kim S, Song JS. Predicting TCR-epitope binding specificity using deep metric learning and multimodal learning. Genes. 2021;12:572. doi: 10.3390/genes12040572.
44. Montemurro A, et al. NetTCR-2.0 enables accurate prediction of TCR-peptide binding by using paired TCRα and β sequence data. Commun. Biol. 2021;4:1060. doi: 10.1038/s42003-021-02610-3.
45. Dens C, Bittremieux W, Affaticati F, Laukens K, Meysman P. Interpretable deep learning to uncover the molecular binding patterns determining TCR–epitope interactions. bioRxiv. 2022. doi: 10.1101/2022.05.02.490264.
46. Springer I, Tickotsky N, Louzoun Y. Contribution of T cell receptor alpha and beta CDR3, MHC typing, V and J genes to peptide binding prediction. Front. Immunol. 2021;12:1436. doi: 10.3389/fimmu.2021.664514.
47. Lu T, et al. Deep learning-based prediction of the T cell receptor–antigen binding specificity. Nat. Mach. Intell. 2021;3:864–875. doi: 10.1038/s42256-021-00383-2.
48. Fischer DS, Wu Y, Schubert B, Theis FJ. Predicting antigen specificity of single T cells based on TCR CDR3 regions. Mol. Syst. Biol. 2020;16:9416. doi: 10.15252/msb.20199416.
49. Wu K, et al. TCR-BERT: learning the grammar of T-cell receptors for flexible antigen-binding analyses. bioRxiv. 2021. doi: 10.1101/2021.11.18.469186.
50. Grazioli F, et al. On TCR binding predictors failing to generalize to unseen peptides. Front. Immunol. 2022;13:1014256. doi: 10.3389/fimmu.2022.1014256.
51. Zhang H, Zhan X, Li B. GIANA allows computationally-efficient TCR clustering and multi-disease repertoire classification by isometric transformation. Nat. Commun. 2021;12:4699. doi: 10.1038/s41467-021-25006-7.
52. Chronister WD, et al. TCRMatch: predicting T-cell receptor specificity based on sequence similarity to previously characterized receptors. Front. Immunol. 2021;12:640725. doi: 10.3389/fimmu.2021.640725.
53. Sidhom JW, Larman HB, Pardoll DM, Baras AS. DeepTCR is a deep learning framework for revealing sequence concepts within T-cell repertoires. Nat. Commun. 2021;12:1605. doi: 10.1038/s41467-021-21879-w.
54. Dash P, et al. Quantifiable predictive features define epitope-specific T cell receptor repertoires. Nature. 2017;547:89–93. doi: 10.1038/nature22383.
55. Valkiers S, van Houcke M, Laukens K, Meysman P. ClusTCR: a python interface for rapid clustering of large sets of CDR3 sequences with unknown antigen specificity. Bioinformatics. 2021;37:4865–4867. doi: 10.1093/bioinformatics/btab446.
56. Corrie BD, et al. iReceptor: a platform for querying and analyzing antibody/B-cell and T-cell receptor repertoire data across federated repositories. Immunol. Rev. 2018;284:24–41. doi: 10.1111/imr.12666.
57. Andreatta M, et al. Interpretation of T cell states from single-cell transcriptomics data using reference atlases. Nat. Commun. 2021;12:2965. doi: 10.1038/s41467-021-23324-4.
58. Leem J, de Oliveira SHP, Krawczyk K, Deane CM. STCRDab: the structural T-cell receptor database. Nucleic Acids Res. 2018;46:D406–D412. doi: 10.1093/nar/gkx971.
59. Mayer A, Callan Jr CG. Measures of epitope binding degeneracy from T cell receptor repertoires. bioRxiv. 2022. doi: 10.1101/2022.07.25.501373.
60. Singh NK, et al. Emerging concepts in TCR specificity: rationalizing and (maybe) predicting outcomes. J. Immunol. 2017;199:2203–2213. doi: 10.4049/jimmunol.1700744.
61. Quaratino S, Thorpe CJ, Travers PJ, Londei M. Similar antigenic surfaces, rather than sequence homology, dictate T-cell epitope molecular mimicry. Proc. Natl Acad. Sci. USA. 1995;92:10398–10402. doi: 10.1073/pnas.92.22.10398.
62. Lanzarotti E, Marcatili P, Nielsen M. T-cell receptor cognate target prediction based on paired α and β chain sequence and structural CDR loop similarities. Front. Immunol. 2019;10:2080. doi: 10.3389/fimmu.2019.02080.
63. Evans R, et al. Protein complex prediction with AlphaFold-Multimer. bioRxiv. 2022. doi: 10.1101/2021.10.04.463034.
64. Koehler Leman J, et al. Macromolecular modeling and design in Rosetta: recent methods and frameworks. Nat. Methods. 2020;17:665–680. doi: 10.1038/s41592-020-0848-2.
65. Rives A, et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc. Natl Acad. Sci. USA. 2021;118:e2016239118. doi: 10.1073/pnas.2016239118.
66. Bradley P. Structure-based prediction of T cell receptor: peptide–MHC interactions. bioRxiv. 2022. doi: 10.1101/2022.08.05.503004.
67. Jiang Y, Huo M, Li SC. TEINet: a deep learning framework for prediction of TCR-epitope binding specificity. bioRxiv. 2022. doi: 10.1101/2022.10.20.513029.
68. Chinery L, Wahome N, Moal I, Deane CM. Paragraph — antibody paratope prediction using Graph Neural Networks with minimal feature vectors. Bioinformatics. 2022;39:btac732. doi: 10.1093/bioinformatics/btac732.
69. Alley EC, Khimulya G, Biswas S. Unified rational protein engineering with sequence-based deep representation learning. Nat. Methods. 2019;16:1312–1322. doi: 10.1038/s41592-019-0598-1.
70. Jumper J, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596:583–589. doi: 10.1038/s41586-021-03819-2.
71. Reynisson B, Alvarez B, Paul S, Peters B, Nielsen M. NetMHCpan-4.1 and NetMHCIIpan-4.0: improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data. Nucleic Acids Res. 2020;48:449–454. doi: 10.1093/nar/gkaa379.
72. Mason D. A very high level of cross-reactivity is an essential feature of the T-cell receptor. Immunol. Today. 1998;19:395–404. doi: 10.1016/S0167-5699(98)01299-7.
73. Sewell AK. Why must T cells be cross-reactive? Nat. Rev. Immunol. 2012;12:669–677. doi: 10.1038/nri3279.
74. Keck S, et al. Antigen affinity and antigen dose exert distinct influences on CD4 T-cell differentiation. Proc. Natl Acad. Sci. USA. 2014;111:14852–14857. doi: 10.1073/pnas.1403271111.
75. Achar SR, et al. Universal antigen encoding of T cell activation from high-dimensional cytokine dynamics. Science. 2022;376:880–884. doi: 10.1126/science.abl5311.
76. van Panhuys N, Klauschen F, Germain RN. T cell receptor-dependent signal intensity dominantly controls CD4+ T cell polarization in vivo. Immunity. 2014;41:63–74. doi: 10.1016/j.immuni.2014.06.003.
77. Liu S, et al. Spatial maps of T cell receptors and transcriptomes reveal distinct immune niches and interactions in the adaptive immune response. Immunity. 2022;55:1940–1952.e5. doi: 10.1016/j.immuni.2022.09.002.
78. Schattgen SA, et al. Integrating T cell receptor sequences and transcriptional profiles by clonotype neighbor graph analysis (CoNGA). Nat. Biotechnol. 2021;40:54–63. doi: 10.1038/s41587-021-00989-2.
79. Kryshtafovych A, Schwede T, Topf M, Fidelis K, Moult J. Critical assessment of methods of protein structure prediction (CASP) — round XIV. Proteins. 2021;89:1607–1617. doi: 10.1002/prot.26237.
80. Pearson K. On lines and planes of closest fit to systems of points in space. Philos. Mag. 1901;2:559–570. doi: 10.1080/14786440109462720.
81. Cai M, Bang S, Zhang P, Lee H. ATM-TCR: TCR–epitope binding affinity prediction using a multi-head self-attention model. Front. Immunol. 2022;13:893247. doi: 10.3389/fimmu.2022.893247.
82. Pavlović M, et al. The immuneML ecosystem for machine learning analysis of adaptive immune receptor repertoires. Nat. Mach. Intell. 2021;3:936–944. doi: 10.1038/s42256-021-00413-z.
83. Heikkilä N, et al. Human thymic T cell repertoire is imprinted with strong convergence to shared sequences. Mol. Immunol. 2020;127:112–123. doi: 10.1016/j.molimm.2020.09.003.
84. 10× Genomics. A new way of exploring immunity: linking highly multiplexed antigen recognition to immune repertoire and phenotype. 10× Genomics https://pages.10xgenomics.com/rs/446-PBO-704/images/10x_AN047_IP_A_New_Way_of_Exploring_Immunity_Digital.pdf (2020).
85. Ehrlich R, et al. SwarmTCR: a computational approach to predict the specificity of T cell receptors. BMC Bioinformatics. 2021;22:422. doi: 10.1186/s12859-021-04335-w.
86. Dean J, et al. Annotation of pseudogenic gene segments by massively parallel sequencing of rearranged lymphocyte receptor loci. Genome Med. 2015;7:123. doi: 10.1186/s13073-015-0238-z.
87. Zhang W, et al. PIRD: pan immune repertoire database. Bioinformatics. 2020;36:897–903. doi: 10.1093/bioinformatics/btz614.
88. Jokinen E, Huuhtanen J, Mustjoki S, Heinonen M, Lähdesmäki H. Predicting recognition between T cell receptors and epitopes with TCRGP. PLoS Comput. Biol. 2021;17:e1008814. doi: 10.1371/journal.pcbi.1008814.
89. Tong Y, et al. SETE: sequence-based ensemble learning approach for TCR epitope binding prediction. Comput. Biol. Chem. 2020;87:107281. doi: 10.1016/j.compbiolchem.2020.107281.
90. Snyder TM, et al. Magnitude and dynamics of the T-cell response to SARS-CoV-2 infection at both individual and population levels. medRxiv. 2020. doi: 10.1101/2020.07.31.20165647.
91. Zhang SQ, et al. High-throughput determination of the antigen specificities of T cell receptors in single cells. Nat. Biotechnol. 2018;36:1156–1159. doi: 10.1038/nbt.4282.
92. Zhang H, et al. Investigation of antigen-specific T-cell receptor clusters in human cancers. Clin. Cancer Res. 2020;26:1359–1371. doi: 10.1158/1078-0432.CCR-19-3249.
93. Berman HM, et al. The protein data bank. Nucleic Acids Res. 2000;28:235–242. doi: 10.1093/nar/28.1.235.
94. Chen SY, Yue T, Lei Q, Guo AY. TCRdb: a comprehensive database for T-cell receptor sequences with powerful search function. Nucleic Acids Res. 2021;49:D468. doi: 10.1093/nar/gkaa796.
95. Gilson MK, et al. BindingDB in 2015: a public database for medicinal chemistry, computational chemistry and systems pharmacology. Nucleic Acids Res. 2015;44:1045–1053. doi: 10.1093/nar/gkv1072.
96. Dines JN, et al. The ImmuneRACE Study: a prospective multicohort study of immune response action to COVID-19 events with the ImmuneCODE™ Open Access Database. medRxiv. 2020. doi: 10.1101/2020.08.17.20175158.
97. Chen G, et al. Sequence and structural analyses reveal distinct and highly diverse human CD8+ TCR repertoires to immunodominant viral antigens. Cell Rep. 2017;19:569. doi: 10.1016/j.celrep.2017.03.072.
98. Huth A, Liang X, Krebs S, Blum H, Moosmann A. Antigen-specific TCR signatures of cytomegalovirus infection. J. Immunol. 2019;202:979–990. doi: 10.4049/jimmunol.1801401.
99. Springer I, Besser H, Tickotsky-Moskovitz N, Dvorkin S, Louzoun Y. Prediction of specific TCR-peptide binding from large dictionaries of TCR–peptide pairs. Front. Immunol. 2020;11:1803. doi: 10.3389/fimmu.2020.01803.
100. Kanakry CG, et al. Origin and evolution of the T cell repertoire after posttransplantation cyclophosphamide. JCI Insight. 2016;1:86252. doi: 10.1172/jci.insight.86252.
101. Raman MCC, et al. Direct molecular mimicry enables off-target cardiovascular toxicity by an enhanced affinity TCR designed for cancer immunotherapy. Sci. Rep. 2016;6:18851. doi: 10.1038/srep18851.
102. Soto C, et al. High frequency of shared clonotypes in human T cell receptor repertoires. Cell Rep. 2020;32:107882. doi: 10.1016/j.celrep.2020.107882.
103. Woolhouse MEJ, Gowtage-Sequeria S. Host range and emerging and reemerging pathogens. Emerg. Infect. Dis. 2005;11:1842–1847. doi: 10.3201/eid1112.050997.
104. Robinson J, Waller MJ, Parham P, Bodmer JG, Marsh SGE. IMGT/HLA Database — a sequence database for the human major histocompatibility complex. Nucleic Acids Res. 2001;29:210–213. doi: 10.1093/nar/29.1.210.
105. Waldman AD, Fritz JM, Lenardo MJ. A guide to cancer immunotherapy: from T cell basic science to clinical practice. Nat. Rev. Immunol. 2020;20:651–668. doi: 10.1038/s41577-020-0306-5.
106. Linette GP, et al. Cardiovascular toxicity and titin cross-reactivity of affinity-enhanced T cells in myeloma and melanoma. Blood. 2013;122:863–871. doi: 10.1182/blood-2013-03-490565.
107. Arellano B, Graber DJ, Sentman CL. Regulatory T cell-based therapies for autoimmunity. Discov. Med. 2016;22:73–80.
108. Raffin C, Vo LT, Bluestone JA. Treg cell-based therapies: challenges and perspectives. Nat. Rev. Immunol. 2020;20:158–172. doi: 10.1038/s41577-019-0232-6.
109. Hernando B, et al. The effect of age on the acquisition and selection of cancer driver mutations in sun-exposed normal skin. Ann. Oncol. 2021;32:412–421. doi: 10.1016/j.annonc.2020.11.023.
110. Sesma A, et al. From tumor mutational burden to blood T cell receptor: looking for the best predictive biomarker in lung cancer treated with immunotherapy. Cancers. 2020;12:1–19. doi: 10.3390/cancers12102974.
111. Scott AC, et al. TOX is a critical regulator of tumour-specific T cell differentiation. Nature. 2019;571:270. doi: 10.1038/s41586-019-1324-y.
112. Yost KE, et al. Clonal replacement of tumor-specific T cells following PD-1 blockade. Nat. Med. 2019;25:1251–1259. doi: 10.1038/s41591-019-0522-3.
113. Wherry EJ, Kurachi M. Molecular and cellular insights into T cell exhaustion. Nat. Rev. Immunol. 2015;15:486–499. doi: 10.1038/nri3862.
114. Daniel B, et al. Divergent clonal differentiation trajectories of T cell exhaustion. Nat. Immunol. 2022;23:1614–1627. doi: 10.1038/s41590-022-01337-5.
115. Shakiba M, et al. TCR signal strength defines distinct mechanisms of T cell dysfunction and cancer evasion. J. Exp. Med. 2022;219:e20201966. doi: 10.1084/jem.20201966.
116.Dan JM, et al. Immunological memory to SARS-CoV-2 assessed for up to 8 months after infection. Science. 2021;371:eabf4063. doi: 10.1126/science.abf4063. [DOI] [PMC free article] [PubMed] [Google Scholar]
117.Swanson PA, et al. AZD1222/ChAdOx1 nCoV-19 vaccination induces a polyfunctional spike protein-specific TH1 response with a diverse TCR repertoire. Sci. Transl Med. 2021;13:7211. doi: 10.1126/scitranslmed.abj7211. [DOI] [PMC free article] [PubMed] [Google Scholar]
118.Bjornevik K, et al. Longitudinal analysis reveals high prevalence of Epstein–Barr virus associated with multiple sclerosis. Science. 2022;375:296–301. doi: 10.1126/science.abj8222. [DOI] [PubMed] [Google Scholar]
