Methods for sequence and structural analysis of B and T cell receptor repertoires

Shunsuke Teraguchi; Dianita S Saputri; Mara Anais Llamas-Covarrubias; Ana Davila; Diego Diez; Sedat Aybars Nazlica; John Rozewicki; Hendra S Ismanto; Jan Wilamowski; Jiaqi Xie; Zichang Xu; Martin de Jesus Loza-Lopez; Floris J van Eerden; Songling Li; Daron M Standley

Comput Struct Biotechnol J. 2020 Jul 17;18:2000–2011. doi: 10.1016/j.csbj.2020.07.008


PMCID: PMC7366105 PMID: 32802272


Keywords: B cell receptor, T cell receptor, Immune system, Adaptive immunology, Antigen, Single cell sequencing, Clustering, Machine-learning

Abstract

B cell receptors (BCRs) and T cell receptors (TCRs) make up an essential network of defense molecules that, collectively, can distinguish self from non-self and facilitate destruction of antigen-bearing cells such as pathogens or tumors. The analysis of BCR and TCR repertoires plays an important role in both basic immunology and biotechnology. Because the repertoires are highly diverse, specialized software methods are needed to extract meaningful information from BCR and TCR sequence data. Here, we review recent developments in bioinformatics tools for analysis of BCR and TCR repertoires, with an emphasis on those that incorporate structural features. After describing recent sequencing technologies for immune receptor repertoires, we survey structural modeling methods for BCRs and TCRs, along with methods for clustering such models. We review downstream analyses, including BCR and TCR epitope prediction, antibody-antigen docking and TCR-peptide-MHC modeling. We also briefly discuss molecular dynamics in this context.

1. Introduction

B cell receptors (BCRs) and T cell receptors (TCRs) are key molecules in the adaptive immune response that provide protection against perturbations, both from the outside (e.g. pathogens) and from within (e.g. mutated or misfolded proteins). Together, BCRs and TCRs constitute a unique class of proteins whose coding sequences are arranged combinatorially in a cell-autonomous manner known as V(D)J recombination. In V(D)J recombination within a given cell, variable (V), diversity (D), and joining (J) segments are selected randomly from among many variants, and joined to make the V (variable) region of a full-length receptor. In addition to V(D)J recombination, BCRs can also undergo subsequent somatic hypermutation (SHM) and clonal selection upon antigen encounter, collectively referred to as “affinity maturation”. On a cell population level, these processes create a functionally diverse and dynamic set (repertoire) of B and T cells. The number of possible different BCR or TCR sequence combinations is extremely high, with theoretical estimates in the 10¹²–10¹⁸ range [1]. However, the observed populations of receptor sequences in a given individual follow a power law, where most sequences appear only at very low frequency and a minority of sequences appear at higher frequencies (see for example [2] for a recent discussion).

For both BCRs and TCRs, V regions consist of two polypeptide chains, referred to as “light” (BCRs) or “alpha” (TCRs) and “heavy” (BCRs) or “beta” (TCRs). TCRs are composed of a single pair of alpha and beta chains while BCRs contain two pairs of light and heavy chains [1]. For simplicity, in this review, we focus on a single pair of (light-heavy or alpha–beta) chains.

Both BCRs and TCRs adopt the immunoglobulin-like fold, in which the canonical antigen binding site is composed of three loops in each receptor chain called “complementarity-determining regions” (CDRs). The V(D)J recombination junction, in which random nucleotides may be inserted during recombination, is located in the third CDR (CDR3). As a result, CDR3 is the most diverse of the three CDRs [1]. Much effort has been spent on CDR3 modeling, in particular for soluble BCRs (antibodies).

BCRs interact directly with antigens, and we refer to interface residues as the “paratope” on the BCR side and the “epitope” on the antigen side (Fig. 1A). TCRs, on the other hand, interact with antigen-derived peptide fragments, which are presented by major histocompatibility complex (MHC) proteins (Fig. 1B). In the TCR context, “epitope” generally refers to the antigen-derived peptide and not to the MHC residues that contact the TCR.

Each human carries up to six class I MHC molecules and up to eight class II molecules. There are thousands of MHC variants (alleles) in the human population, which can differ in their peptide specificity [1]. Peptide-MHC binding affinity shapes the TCR repertoire, and the particular set of MHC alleles carried by an individual becomes a source of TCR repertoire diversity, affecting susceptibility to particular diseases (reviewed in [3]). Since BCR maturation requires co-stimulation from activated helper T cells [4], the BCR and TCR repertoires are not completely independent.

Both BCR and TCR sequences can be captured by current sequencing technologies. Moreover, molecule and cell barcoding technologies are an area of intense research and development. Emerging sequencing and barcoding methods are thus expected to revolutionize our understanding of immune repertoires. As just one example, the number of paired (alpha–beta) TCR sequences for which the peptide-MHC is known has grown by two orders of magnitude in the last two years [5], indicating a need for computational tools that can keep pace with this growth.

In this review, after briefly reviewing recent technologies for repertoire sequencing, we explore tools for interpreting BCR and TCR sequences in terms of their structures and targeted antigens. In this context, we cover structural modeling, epitope prediction, molecular docking, and molecular dynamics. Integration of such tools, along with growth in sequence and associated experimental data, will allow us to more fully describe the immune status of an individual in health and disease.

2. Repertoire sequence analysis

Very early approaches to characterize immune repertoires were limited to estimating the length of the CDR3 loops [6]. Current methods, relying on high-throughput sequencing (HTS) technology, can be used for comprehensive quantification of full-length TCR and BCR V region sequences [7], [8]. Though a comprehensive review on the existing technologies for repertoire sequencing analysis is beyond the scope of this review, HTS is the main source of data for subsequent structural analysis. Therefore, we briefly describe the basic information contained in bulk and single-cell RNA-based repertoire sequencing (Fig. 2).

Fig. 2. Conceptual difference between bulk and single cell repertoire sequencing. In bulk sequencing, receptor pairing information is lost, but higher coverage tends to be achieved. In single cell sequencing, pairing information is preserved, but sample preparation and sequencing costs currently tend to be higher than in bulk sequencing.

2.1. Bulk sequencing

Early development of HTS repertoire analysis was based on bulk sequencing (i.e. sequencing many cells without preserving their identities). In this approach, the information of light/heavy or alpha/beta pairs is lost. Thus, bulk sequence analysis tends to focus on a single (typically the heavy/beta) chain.

Repertoire sequencing typically uses TCR/BCR enrichment followed by PCR amplification to increase sensitivity and reduce sequencing cost. Since a 100 bp fragment is enough to resolve the CDR3 fragment, short read sequencing is often used. The choice of sequencing technology can have an important impact on quality, since the types and rates of errors can be different. Among preferred platforms are Illumina MiSeq (long reads) and HiSeq (short reads targeting CDR3).

One source of low-quality repertoire data is a byproduct of PCR amplification. Without other information, we cannot distinguish between true nucleotide sequence differences and PCR errors. As a result, PCR errors cause the appearance of spurious sequences, in particular those derived from dominant, highly abundant sequences/clonotypes. Use of Unique Molecular Identifier (UMI) sequences enables correction of PCR amplification biases and quantification of the number of receptor transcripts expressed. Thus, technologies that use UMIs have a distinct advantage.
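The idea behind UMI-based correction can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular pipeline: the function names `collapse_by_umi` and `count_molecules` are hypothetical, reads sharing a UMI are assumed to be the same length, and the consensus is a simple per-position majority vote.

```python
from collections import Counter, defaultdict


def collapse_by_umi(reads):
    """Group (umi, sequence) reads by UMI and build one consensus per UMI.

    Assumption for this sketch: reads that share a UMI are copies of one
    original molecule, so disagreements among them are amplification or
    sequencing errors and can be resolved by a per-position majority vote.
    """
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        length = min(len(s) for s in seqs)  # guard against ragged reads
        consensus[umi] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus


def count_molecules(consensus):
    """Count how many distinct UMIs (molecules) support each sequence."""
    return Counter(consensus.values())


# Toy reads: (UMI, CDR3 nucleotide fragment). All values are made up.
reads = [
    ("AAAT", "TGTGCCAGC"),
    ("AAAT", "TGTGCCAGC"),
    ("AAAT", "TGTGCCAGA"),  # one read differs in the last base
    ("CCGT", "TGTGCCAGC"),
    ("GGTA", "TGTGCTAGC"),
]
consensus = collapse_by_umi(reads)
molecule_counts = count_molecules(consensus)
```

In this toy input the discordant read under UMI `AAAT` is outvoted, so it does not give rise to a spurious clonotype, and abundance is counted in molecules (distinct UMIs) rather than in reads.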

To date, several pipelines can be used to extract repertoire information from bulk HTS data. These tools generally map sequencing reads to TCR/BCR reference sequences. Contigs, the continuous sequences assembled from the mapped reads, can then be annotated with V(D)J gene usage and with CDR3 (and, where covered, CDR1 and CDR2) amino acid sequences [9], [10]. IMGT/HighV-QUEST (International Immunogenetics Information System V-Query and Standardization) [11], [12] uses pairwise alignment and sequence comparison to experimental data to align sequencing reads. IgBLAST [13] utilizes the BLAST algorithm [14] for its search engine. MiXCR [15] is an efficient pipeline equipped with a fast aligner. It can be used for reconstructing TCR/BCR sequences from generic RNA-seq data without PCR amplification of TCRs/BCRs [16]. A detailed assessment of these three tools can be found in [17]. The Immcantation framework [18], [19] and TRUST (TCR repertoire utilities for solid tissue) [20] can also be used for the same purpose, among many other available tools not covered here.

Though single chain information alone is usually not enough to explain the binding of the receptor to the target epitope, there are several methods applicable to bulk sequencing data. For example, diversity analysis of the repertoire sequences can be used for estimating the clonal diversity of the immune repertoire of each individual, as well as the overlap among repertoires of several individuals. This can currently be performed using conventional ecology measures [21], [22], [23], or repertoire-designed estimators [24], [25], [26]. Also, by analyzing repertoire data from many individuals together with additional information such as Human Leukocyte Antigen (HLA) allele profiles or disease status, one can associate individual TCRs with particular labels using statistical hypothesis testing [27], [28]. Repertoire data also carry information about the underlying V(D)J recombination process. Thus, generative models of V(D)J recombination have been developed from repertoire sequences and, in turn, used to analyze repertoire sequence data [29], [30], [31], [32], [33], [34]. Some (but not all) of the tools used for these sequence analyses are listed in Table 1.
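As a rough illustration of the conventional ecology measures mentioned above, the following Python sketch computes Shannon entropy and the Simpson index for a clonotype count table, plus a Jaccard overlap between two repertoires. The function names and the toy CDR3 counts are illustrative assumptions; the tools in Table 1 implement more careful, sampling-aware estimators.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (natural log) of clonotype frequencies."""
    total = sum(counts.values())
    return -sum(
        (c / total) * math.log(c / total) for c in counts.values() if c > 0
    )


def simpson_index(counts):
    """Probability that two reads drawn with replacement share a clonotype."""
    total = sum(counts.values())
    return sum((c / total) ** 2 for c in counts.values())


def jaccard_overlap(rep_a, rep_b):
    """Shared clonotypes divided by all clonotypes seen in either repertoire."""
    a, b = set(rep_a), set(rep_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy repertoires: CDR3 amino acid sequence -> read (or UMI) count.
donor_a = {"CASSLAPGATNEKLFF": 8, "CASSIRSSYEQYF": 1, "CASSPGQGNTEAFF": 1}
donor_b = {"CASSIRSSYEQYF": 5, "CASRDRGYGYTF": 5}

entropy_b = shannon_entropy(donor_b)        # two equally frequent clonotypes
simpson_a = simpson_index(donor_a)          # dominated by one expanded clone
overlap = jaccard_overlap(donor_a, donor_b)
```

A repertoire dominated by one expanded clone, such as `donor_a`, has a higher Simpson index (lower diversity) than the evenly split `donor_b`, even though it contains more distinct clonotypes.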

Table 1.

Repertoire sequence analysis tools.

Tools	Purpose	URL	References
IgBLAST	Bulk sequence reconstruction	https://www.ncbi.nlm.nih.gov/igblast/	[13]
IMGT/HighV-QUEST	Bulk sequence reconstruction	http://www.imgt.org/IMGTindex/IMGTHighV-QUEST.php	[11], [12]
MiXCR	Bulk sequence reconstruction	https://mixcr.readthedocs.io/en/master/index.html	[15]
TRUST	Bulk sequence reconstruction	https://bitbucket.org/liulab/trust/src/master/	[20]
TRAPeS	Single cell sequence reconstruction	https://github.com/YosefLab/TRAPeS	[36]
TraCeR	Single cell sequence reconstruction	https://github.com/teichlab/tracer	[37]
VDJPuzzle	Single cell sequence reconstruction	https://github.com/simone-rizzetto/VDJPuzzle	[38]
BASIC	Single cell sequence reconstruction	http://ttic.uchicago.edu/~aakhan/BASIC/	[39]
BraCeR	Single cell sequence reconstruction	https://github.com/teichlab/bracer/	[40]
VDJtools	General repertoire analysis	https://github.com/mikessh/vdjtools	[21]
Immcantation	General repertoire analysis	https://immcantation.readthedocs.io/en/stable	[19]
Vidjil	General repertoire analysis	http://www.vidjil.org; http://bioinfo.lille.inria.fr/vidjil	[22]
ASAP	General repertoire analysis	https://asap.tau.ac.il	[119]
ARGalaxy	General repertoire analysis	https://bioinf-galaxian.erasmusmc.nl/argalaxy/	[120]
bcRep	General repertoire analysis	https://cran.r-project.org/web/packages/bcRep/vignettes/vignette.html	[121]
Immunarch	General repertoire analysis	https://immunarch.com	[23]
Sumrep	General repertoire analysis	https://github.com/matsengrp/sumrep	[122]
DivE	Specialized in diversity analysis	http://cran.r-project.org/web/packages/DivE/index.html	[24]
RDI	Specialized in diversity analysis	https://rdi.readthedocs.io/en/1.0.0/	[25]
RECOLD	Specialized in diversity analysis	https://github.com/Q-bio-at-IIS/RECOLD/tree/master/codes	[26]
OLGA	Generative model of V(D)J recombination	https://github.com/statbiophys/OLGA	[29]
IGoR	Generative model of V(D)J recombination	https://github.com/qmarcou/IGoR	[34]
SONIA	Generative model of V(D)J recombination	https://github.com/statbiophys/SONIA	[30]
vampire	Generative model of V(D)J recombination	https://github.com/matsengrp/vampire/	[26]


2.2. Single cell sequencing

The most important limitation of bulk sequencing approaches is the loss of pairing between receptor chains. This limitation is addressed by single cell repertoire profiling methods. These methods use a number of cell barcoding strategies to add a unique barcode to each cDNA in a given cell. New approaches are dramatically improving the ability to measure full length paired receptors at the single-cell level. For example, RAGE-seq (Repertoire and Gene Expression by Sequencing) combines long reads from Oxford Nanopore sequencing with short reads from Illumina sequencers [35]. When combined with droplet based single cell RNA-seq approaches, we can characterize the full-length paired repertoires of thousands of single cells. In addition, off-the-shelf single cell repertoire sequencing platforms are currently available from various companies including 10x Genomics and Takara Bio.

In the case of single-cell gene expression data, TRAPeS (TCR Reconstruction Algorithm for Paired-End Single Cell) [36], TraCeR (Reconstruction of T cell receptor sequences from single cell RNA-seq data) [37] and VDJPuzzle [38] are often used for analysis of TCRs. Meanwhile, BASIC (BCR assembly from single cells) [39], BraCeR (B-cell-receptor reconstruction and clonality inference from single-cell RNA-seq) [40] and an extension of VDJPuzzle [41] are often used for BCR analysis. These tools differ mainly in the way they assemble the missing information after mapping to reference sequences, and the final results are generally consistent. Since structural modeling has yet to be effectively used for predicting chain pairing, single cell sequencing technologies are critical for TCR or BCR structural modeling. Moreover, expression of TCRs or BCRs requires such pairing, so single cell sequencing is important for most downstream analyses of T or B cells and their cognate antigens.
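The way a cell barcode restores chain pairing can be shown with a small Python sketch. It assumes that reconstruction has already produced one record per contig with a cell barcode, a chain label and a CDR3 sequence. The record layout, the `TRA`/`TRB` labels and the function name `pair_chains` are assumptions for illustration, not the output format of any tool named above.

```python
from collections import defaultdict


def pair_chains(contigs, first="TRA", second="TRB"):
    """Pair receptor chains that share a cell barcode.

    Only cells with exactly one contig of each chain type are returned.
    This is a deliberately strict rule for the sketch: cells with a missing
    chain are dropped, and cells with extra chains (doublets, or genuine
    dual-receptor cells) are left for separate handling.
    """
    by_cell = defaultdict(lambda: defaultdict(list))
    for barcode, chain, cdr3 in contigs:
        by_cell[barcode][chain].append(cdr3)

    paired = {}
    for barcode, chains in by_cell.items():
        if len(chains[first]) == 1 and len(chains[second]) == 1:
            paired[barcode] = (chains[first][0], chains[second][0])
    return paired


# Toy contig records: (cell barcode, chain, CDR3). All values are made up.
contigs = [
    ("cell1", "TRA", "CAVRDSNYQLIW"),
    ("cell1", "TRB", "CASSIRSSYEQYF"),
    ("cell2", "TRB", "CASSLAPGATNEKLFF"),   # alpha chain not recovered
    ("cell3", "TRA", "CAGGGSQGNLIF"),
    ("cell3", "TRB", "CASSPGQGNTEAFF"),
    ("cell3", "TRB", "CASRDRGYGYTF"),       # ambiguous: two beta chains
]
paired = pair_chains(contigs)
```

The same grouping applies to BCRs by passing heavy and light chain labels instead. In bulk data there is no barcode to group by, which is why the pairing cannot be recovered there.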

2.3. Extensions of repertoire sequencing

There have also been exciting developments in the application of HTS technology for experimental discovery of epitopes. In Libra-seq (Linking B cell receptor to antigen specificity through sequencing) [42], the 10x Genomics platform was used to barcode not only BCR sequences but also antigen proteins. By sorting the antigen-bound B cells and then performing single cell sequencing, antigen specific BCRs can be identified from the antigen barcodes. Similarly, by using barcoded peptide-MHC complexes, HTS allows us to generate a large reference dataset of TCR-epitope pairs [43]. Kula et al. [44] developed T-Scan, a high-throughput method that identifies functional antigen targets of CD8 T cells. They started from bulk memory T cells and made antigen libraries such that target cells could present the antigens on MHC molecules. Recognition of target cells by T cells and subsequent next-generation sequencing enabled T-Scan to discover CMV antigens as well as the targets of self-reactive TCRs. Gee et al. [45] used yeast-display libraries of pMHCs and screened for antigens of orphan T cell receptors on tumor-infiltrating lymphocytes. Kobayashi et al. [46] developed a cloning and expression system called hTEC10 (human TCR efficient cloning system within 10 days) that can be used to rapidly determine the antigen specificity of TCRs. They successfully applied their system to determine the peptide specificity and cytotoxic activity of TCRs from EBV infection and cancer.

3. TCR and BCR 3D structural modeling

In spite of advances in experimental determination of receptor-antigen interactions, most high-throughput experiments lack residue-level resolution. X-ray crystallography and single-particle cryo-electron microscopy (cryo-EM), on the other hand, provide such high-resolution information, but are not suitable for high-throughput analysis. Computational modeling of TCRs and BCRs is now routine and can be performed in a high-throughput manner. Building 3D models of receptors is also the first step in structure-based analysis of receptor-antigen interactions. For 3D structural modeling, TCR or BCR V regions are generally divided into “frameworks” and the three CDRs (Fig. 3). Each framework is a double layer of beta sheets that contains the beginning and end of each CDR loop. There are other loops in V regions, but the CDRs are important because of their high sequence diversity and because they form a continuous surface that constitutes the main antigen binding interface. Of the CDRs, CDR3 is the most diverse in terms of both sequence and structure. CDR3 modeling has been tackled by a wide range of approaches [47]. Software for CDR3 modeling (Table 2) spans the range from simple sequence alignment methods [48], to fragment assembly [49], molecular dynamics (MD) [50] and robotics-based loop closure algorithms [51]. In the most recent antibody modeling assessment (AMA-II) [52], the lowest heavy-chain CDR3 (CDRH3) errors were obtained by our own group using a combination of MD, fragment assembly and manual selection [53]. Based on an internal assessment of our AMA-II results, we developed a purely fragment assembly-based tool, Kotai Antibody Builder [54]. We more recently introduced Repertoire Builder, which exceeded Kotai Antibody Builder in terms of accuracy, with a factor of 100 improvement in speed [55]. In the same time frame, several new tools, including ABodyBuilder [56], TCRModel [57], and PigsPro (Prediction of immunoglobulin structure v2) [58] have been introduced, which show advancement over previously published methods. Because of its high accuracy and ability to scale with the number of input sequences, we will briefly outline the Repertoire Builder approach.

Fig. 3. BCR and TCR structure. The locations of the three CDRs in structure and sequence are shown for a representative BCR (A) and TCR (B) using the same PDB entries as in Fig. 1.

Table 2.

BCR or TCR 3D modeling tools.

Tools	BCR	TCR	URL	References
Repertoire Builder	Yes	Yes	https://sysimm.org/rep_builder/	[55]
PigsPro	Yes	No	http://biocomputing.it/pigspro	[58]
Rosetta Antibody	Yes	No	https://rosie.graylab.jhu.edu/snug_dock	[123]
ABodyBuilder	Yes	No	http://frodock.chaconlab.org/	[56]
LYRA	Yes	Yes	http://www.cbs.dtu.dk/services/LYRA/	[82]
TCRpMHCmodels	No	Yes	http://www.cbs.dtu.dk/services/TCRpMHCmodels/	[83]


In order to improve speed and reduce noise, one aim of Repertoire Builder was to remove 3D structure from the key decision-making steps: sampling and scoring. Working in three dimensions is computationally expensive and also messy, as protein structure files can contain a plethora of sources of noise. As an alternative, we derived feature vectors from pairwise query-template alignments and trained a machine learning model to recognize the good alignments. Feature vectors currently consist of BLOSUM62 matrix elements or gaps for each aligned residue pair and cover the entire V region. The inclusion of residues outside of the CDR region was intended to take the environment of the CDR into account in the choice of template. We note that scoring at the alignment level is not unique to Repertoire Builder; the other methods do this as well. What is novel here is the use of alignment-derived feature vectors. Another trick used by Repertoire Builder was to store templates in the form of structure-aware multiple sequence alignments (MSAs), which can be readily computed using our MAFFT-DASH (Multiple Alignment using Fast Fourier Transform-Database of Aligned Structural Homologs) pipeline and which have been shown to be significantly more accurate than sequence-based MSAs [59]. The query sequence can be added to a stored template MSA efficiently using MAFFT’s fragment-adding option, which preserves the relationships between the templates in the stored MSAs [60]. Templates in MSAs are grouped by their CDR lengths. Thus, there is a different template MSA stored for each CDR-length combination. The advantage of using MAFFT-DASH in this manner is primarily a combination of speed and MSA accuracy. We have not assessed whether use of alternative alignment strategies results in a degradation of model quality. The current Repertoire Builder can model 10⁴ paired or unpaired sequences in approximately 30 min, which makes it practically useful for the high-throughput sequencing data discussed above. To our knowledge, Repertoire Builder is the only server that allows multiple BCR or TCR sequences to be input at one time.
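To make the notion of an alignment-derived feature vector concrete, the Python sketch below turns one pairwise query-template alignment into a list with one number per aligned column. It is only a schematic of the idea described above and not the Repertoire Builder implementation: `toy_score` is a stand-in for a BLOSUM62 lookup, and the gap value and function names are assumptions.

```python
def toy_score(a, b):
    """Stand-in for a substitution matrix lookup such as BLOSUM62.

    A real implementation would return the matrix element for (a, b);
    here identical residues score +1 and all others -1.
    """
    return 1 if a == b else -1


def alignment_features(query, template, score=toy_score, gap_value=-2):
    """Build one feature per aligned column of a pairwise alignment.

    Both inputs are rows of the same alignment, so they must have equal
    length and use '-' for gaps. Columns with a gap in either row receive
    `gap_value`; all other columns receive the substitution score.
    """
    if len(query) != len(template):
        raise ValueError("aligned sequences must have the same length")
    features = []
    for q, t in zip(query, template):
        if q == "-" or t == "-":
            features.append(gap_value)
        else:
            features.append(score(q, t))
    return features


# Toy aligned fragment (made up). In practice the rows would span the
# whole V region and come from a stored template alignment.
query_row = "CAR-DYW"
template_row = "CARGDFW"
features = alignment_features(query_row, template_row)
```

Because templates are grouped by CDR length, every alignment against a given template set has the same number of columns, so vectors built this way have a fixed length and could be passed to a standard classifier without touching any 3D coordinates.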

4. TCR and BCR clustering

As genomic data continue to grow, methods for clustering nucleotide or amino acid sequences will play a major role in sequence and structural analysis. Since generic sequence clustering methods (e.g. [61], [62]) are beyond the scope of this review, here we focus on methods specific to immune receptors. A common goal when studying immune repertoires is to understand common features of receptors that are shared by a group of donors of interest (Fig. 4). The implication here is that receptors targeting the same antigen and epitope will be more common in the donors of interest than in a control group. This is a very general notion that can be applied to either BCRs or TCRs and approached in a variety of ways. Given the broad diversity of immune repertoires, their uneven population distributions, and the relatively low overlap of exactly matching sequences among subjects, this task is a significant challenge. To address these issues, several clustering strategies have been developed recently. Below, we review some representative examples, including our own efforts.

Fig. 4. Receptor clustering. B or T cells of interest are acquired from donors of interest, and their receptors are sequenced and clustered based on sequence features, structure features, or both. Clusters that are enriched in receptors from donors of interest are identified.

4.1. TCR clustering

Based on the observation that there are specific positions in TCR CDR3 regions that contact antigen peptides and that the presence of particular sequence motifs can define TCR clusters, Glanville et al. developed the GLIPH (grouping of lymphocyte interactions by paratope hotspots) algorithm [63], [64]. This algorithm clusters TCRs based on local sequence motifs, as well as on other parameters such as global CDR3 similarity, V gene usage, CDR3 length, MHC profile of donor(s) and clone size. GLIPH identifies motifs that are enriched in a given dataset relative to a control group, with the goal of producing groups of TCRs targeting the same peptide-MHC (pMHC). Using this approach, the authors were able to design synthetic antigen-specific TCRs based on these groups and confirm their specificity experimentally.
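The motif-enrichment idea can be illustrated with a short Python sketch that counts how many CDR3 sequences contain each k-mer in a sample and in a control set, and reports k-mers whose frequency ratio exceeds a threshold. This is a simplified stand-in and not the GLIPH procedure: the pseudocount, the thresholds and the function names are assumptions, and no significance testing is performed.

```python
from collections import Counter


def motif_counts(cdr3s, k=3):
    """Count, for each k-mer, the number of sequences that contain it."""
    counts = Counter()
    for seq in cdr3s:
        # A set is used so each sequence contributes at most once per motif.
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts


def enriched_motifs(sample, control, k=3, min_fold=2.0, min_count=2, pseudo=1.0):
    """Return {motif: fold enrichment} for motifs over-represented in sample.

    Frequencies are smoothed with a pseudocount so that motifs absent from
    the control set do not produce a division by zero.
    """
    s_counts = motif_counts(sample, k)
    c_counts = motif_counts(control, k)
    hits = {}
    for motif, n in s_counts.items():
        if n < min_count:
            continue
        s_freq = (n + pseudo) / (len(sample) + pseudo)
        c_freq = (c_counts[motif] + pseudo) / (len(control) + pseudo)
        fold = s_freq / c_freq
        if fold >= min_fold:
            hits[motif] = fold
    return hits


# Toy CDR3 fragments (made up): the sample shares an "RGQ" motif.
sample = ["CASRGQAF", "CAWRGQEF", "CSVRGQYF"]
control = ["CASSLTAF", "CASSPTEF", "CASKLTYF"]
hits = enriched_motifs(sample, control)
```

TCRs that share an enriched motif would then be candidates for the same specificity group. Working from motif counts in this way also avoids comparing every pair of sequences directly.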

In a similar study, Dash et al. [65] developed TCRdist, a tool that estimates the similarity of two TCR sequences by computing a weighted Hamming distance between the concatenated amino acid sequences of the CDR loops of each TCR. TCRdist assigns a higher weight (3x) to the CDR3 regions. Clusters of highly similar antigen-specific TCRs can be built, and new TCRs of unknown specificity can be assigned to an antigen-specific cluster based on similarity, allowing for the prediction of antigen specificity. Additionally, a diversity score (TCRdiv) was developed that robustly calculates the diversity of epitope-specific repertoires by considering both TCR similarity and exact identity in a generalized Simpson’s diversity index. TCRdist has recently been used to identify clonal expansion of M. tuberculosis specific TCRs in a South African cohort, where it was able to accurately classify active tuberculosis patients [66].
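A weighted Hamming distance over CDR loops, with CDR3 weighted three times as heavily as the other loops, can be sketched as follows. This is a simplified illustration under stated assumptions rather than the published TCRdist code: corresponding loops are required to have equal length (so no gap handling is needed), every mismatch costs 1, only beta-chain loops are shown, and the dictionary keys and function names are hypothetical.

```python
CDR3_WEIGHT = 3  # CDR3 mismatches count three times as much as others


def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("loops must have equal length in this sketch")
    return sum(x != y for x, y in zip(a, b))


def weighted_cdr_distance(tcr_a, tcr_b, cdr3_weight=CDR3_WEIGHT):
    """Sum per-loop Hamming distances, up-weighting loops named 'cdr3*'."""
    total = 0
    for loop in tcr_a:
        weight = cdr3_weight if loop.startswith("cdr3") else 1
        total += weight * hamming(tcr_a[loop], tcr_b[loop])
    return total


# Toy beta-chain loops (made up). Alpha-chain loops could be added as
# further keys such as "cdr1a", "cdr2a" and "cdr3a".
tcr_a = {"cdr1b": "MNHEY", "cdr2b": "SVGAGI", "cdr3b": "CASSIRSSYEQYF"}
tcr_b = {"cdr1b": "MNHNY", "cdr2b": "SVGAGI", "cdr3b": "CASSIRSAYEQFF"}

distance = weighted_cdr_distance(tcr_a, tcr_b)  # 1 * 1 + 3 * 2
```

Because the value depends only on the two receptors being compared, it is the same regardless of what else is in the dataset. The cost is that all pairs must be compared, so the number of distance evaluations grows quadratically with the number of TCRs.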

Though they share the same goal, the focus of these two tools is slightly different. The GLIPH algorithm assumes that the input data is enriched in TCRs targeting a restricted set of epitopes, and tries to cluster these enriched TCRs using common motifs in the dataset. With this approach, it also avoids direct comparison of all pairs of sequences, which is computationally expensive. Thus, GLIPH is suitable for large repertoire analyses of particular disease cohorts. On the other hand, TCRdist is based on direct comparison of each pair of TCRs using a “universal” measure of TCR similarity, and it is thus currently difficult to apply the method to datasets larger than approximately 10⁴ sequences. However, an advantage of TCRdist is that the calculated distance between a pair of TCRs is always the same, regardless of other factors. Such a “universal” definition of TCR similarity/difference is of use when assumptions about shared antigen/epitope cannot be made.

4.2. BCR clustering

Structural studies of antibodies targeting antigens specific to HIV [67], influenza [68] and more recently SARS-CoV-2 [69] have demonstrated that antibodies produced in unrelated donors targeting common antigens and epitopes can share sequence and structural features. We note here that, since B cells can undergo affinity-driven maturation, such receptors need not derive from a similar common clone. Recently, the SAAB+ tool was developed to characterize structural properties of CDRs from differentiated B cells [70]. It is likely that more tools trained to identify “convergence” of functionally related antibodies will appear in the future as more sequence data from donors with shared BCR epitopes become available.

To this end, we recently developed InterClone, a method to cluster BCR sequences which are likely to share epitopes [71]. InterClone is based on a comparison of sequence and structural features of pairs of BCRs using a machine learning-based classifier that was trained on known antigen-BCR structures. Like TCRdist, InterClone assigns a “universal” similarity score to each BCR pair. Hierarchical clustering is then used to group sequences of high similarity. As such, InterClone can be used without requiring sequences to be enriched in a particular BCR motif. A sensitivity of 61.9% and specificity of 99.7% were obtained when InterClone was applied to an independent set of anti-HIV antibody sequences [71]. A more robust and computationally efficient version of InterClone that works for both BCRs and TCRs and can perform high-throughput analysis of up to 10⁵ sequences is currently being developed.
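The final grouping step, hierarchical clustering on pairwise similarity scores, can be sketched in plain Python. The sketch below is not InterClone: the classifier-derived similarity is replaced by a toy stand-in (`toy_similarity`, the fraction of identical positions), single linkage is assumed as the merging rule, and the threshold is arbitrary.

```python
def toy_similarity(a, b):
    """Stand-in for a learned pairwise score: fraction of identical positions.

    Sequences of different length get a similarity of 0 in this sketch.
    """
    if len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def single_linkage_clusters(items, similarity, threshold):
    """Agglomerative clustering, merging while any cross-cluster pair
    reaches `threshold` (single linkage)."""
    clusters = [[x] for x in items]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                best = max(
                    similarity(a, b) for a in clusters[i] for b in clusters[j]
                )
                if best >= threshold:
                    clusters[i].extend(clusters[j])
                    del clusters[j]
                    merged = True
                    break
            if merged:
                break  # restart the scan after every merge
    return clusters


# Toy CDRH3-like fragments (made up).
sequences = ["ARDYW", "ARDFW", "ARGFW", "TKLPQ"]
clusters = single_linkage_clusters(sequences, toy_similarity, threshold=0.8)
```

Here `ARDYW` and `ARGFW` end up in one cluster even though their direct similarity is below the threshold, because both are close to `ARDFW`. That chaining behavior is a property of single linkage. Other linkage rules would group more conservatively.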

In addition to the above clustering methods, networks that describe antibody repertoire architecture can be used to compare repertoires. Miho and colleagues [72] developed a platform that builds similarity networks of hundreds of thousands of antibody sequences from both humans and mice. Using this approach, the authors detected global patterns in antibody repertoire architectures that were highly reproducible in different subjects, and tended to converge despite independent VDJ recombination. Furthermore, these repertoire architectures were robust to clonal deletion of private clones.
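A sequence similarity network of this general kind can be sketched as follows: each unique sequence is a node, and two nodes are connected when their edit distance is small. The cutoff of one edit, the all-pairs construction and the function names are assumptions made for illustration. A platform handling hundreds of thousands of sequences would need a far more scalable implementation.

```python
def levenshtein(a, b):
    """Edit distance (insertions, deletions, substitutions) by dynamic
    programming, keeping only the previous row of the table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion from a
                cur[j - 1] + 1,            # insertion into a
                prev[j - 1] + (ca != cb),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def similarity_network(seqs, max_dist=1):
    """Return an adjacency dict linking sequences within `max_dist` edits."""
    nodes = sorted(set(seqs))
    edges = {s: set() for s in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            # Length difference is a lower bound on edit distance.
            if abs(len(a) - len(b)) > max_dist:
                continue
            if levenshtein(a, b) <= max_dist:
                edges[a].add(b)
                edges[b].add(a)
    return edges


# Toy CDR3-like sequences (made up).
network = similarity_network(["CARDYW", "CARDFW", "CARDW", "CTKLPQ"])
degrees = {seq: len(neighbors) for seq, neighbors in network.items()}
```

Summary statistics of such a graph, for example the degree distribution or the sizes of connected components, give repertoire-level descriptors that can be compared between individuals.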

5. Epitope specificity

5.1. Predicting TCR epitopes

TCRs recognize short peptides presented on class I or II MHC complexes. The ability to predict epitope(s) from TCR sequence and MHC allele would be highly valuable in elucidating disease etiology, monitoring the immune system, developing diagnostic assays and designing vaccines. Traditionally, epitopes have been identified experimentally [73], which is both costly and time-consuming. There is therefore great interest in methods that can accelerate this process computationally.

To this end, Fischer et al. [74] developed a deep learning approach on TCR CDR3 regions to predict the antigen-specificity of single T cells. Jokinen et al. [75] developed TCRGP to predict whether TCRs recognize certain epitopes using a novel Gaussian process (GP). Their method uses CDR sequences from TCR alpha and beta chains and learns which CDRs are important for recognizing different epitopes. The tool was applied to identify T cells specific to HBV. NetTCR by Jurtz et al. [43] utilized convolutional networks for sequence-based prediction of TCR-pMHC specificity. NetTCR uses the recent explosion of next-generation sequencing data to train a sequence-based predictor. Ogishi et al. [76] computationally defined immunogenicity scores through sequence-level simulation of interaction between pMHC complexes and public TCR repertoires. Though their focus is more on the immunogenicity of peptides presented by MHC molecules, they also observed a correlation between individual TCR-pMHC affinities and the features important for the immunogenicity score. Gielis et al. [77] applied random forest-based classifiers for epitope specific TCRs to repertoire level analysis. Their models successfully detected the increase of epitope specific TCRs upon vaccination in two Yellow Fever vaccination studies. The works by Chain and co-workers [78], [79] also addressed related questions. In [78], the authors constructed a classifier to distinguish the TCR beta sequences in expanded repertoires of ovalbumin-stimulated mice from those of controls. Their classifier was based on the frequencies of amino acid triplets in CDR3, and their choice of machine learning algorithm, LPBoost (linear programming boosting), allowed them to identify the responsible motifs in CDR3.

5.2. TCR-pMHC 3D modeling

Unlike BCRs, which can be expressed as soluble antibodies, TCRs remain attached to the cell surface. This, along with their weaker binding affinities to pMHC complexes, has made experimental structural analysis more difficult than for BCRs. Nevertheless, from the known crystal structures of TCR-pMHC complexes, we can see that the range of docking modes is highly restricted, as expected from the similarity of MHCs within a given class (Fig. 5). As a result of this restriction, we and others [80] have approached the problem using structural templates for TCR-pMHC docking.

Fig. 5. Restricted docking of TCR-peptide-MHC complexes. A representative set of MHC class I (A) and class II (B) complexes from the PDB were superimposed using conserved residue positions in the MHC. TCR alpha (yellow) and beta (magenta) chains are contained within a narrow ensemble of binding modes. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

There are currently few methods for modeling TCR-pMHC complexes. To our knowledge, there are two public servers for this purpose: our own ImmuneScape [81] and the Lymphocyte Receptor Automated Modeling or LYRA-based [82] TCRpMHCmodels [83]. Both of these approaches are “template-based” in the sense that existing structures instead of stochastic conformational sampling are used as templates for each of the key modeling steps: TCR, pMHC and TCR-pMHC orientation. They are also both “bottom-up” in the sense that models for TCR and pMHC are built and then combined to form the TCR-pMHC complex. One possible conceptual difference is that, in ImmuneScape, CDRs are modeled after the TCR and pMHC templates are combined in order to take the pMHC into account. It will be interesting to compare the two approaches in more detail. TCRpMHCmodels compared favorably to an earlier rigid docking-based approach, TCRFlexDock, which suggests that care must be taken in sampling TCR-pMHC orientations beyond that which is observed in typical crystal structures.

5.3. Predicting BCR epitopes

Several computational methods are available to predict BCR epitopes and paratopes. Of the two problems, paratope prediction is much easier, as paratopes tend to correspond to CDR residues, while epitopes can be anywhere on an antigen. This is illustrated in the case of anti-influenza hemagglutinin (HA) antibodies (Fig. 6); a superimposition of all known anti-HA antibodies leaves very little un-targeted surface area.

Fig. 6 — BCR epitopes on influenza hemagglutinin. A representative set of anti-HA antibodies bound to HA from the PDB were superimposed using conserved residues in HA. HA is a symmetric trimer and antibodies are only shown bound to the chain facing toward the back for simplicity.

Paratope prediction methods include the Paratome algorithm [84], which is based on structural consensus between BCRs and uses features from sequence or structure; Prediction of Antibody Contacts (ProABC) [85], which applies a random forest learning technique and is based on sequence; Parapred [86], which uses a deep learning architecture to extract patterns from the sequences of variable regions; AntibodyInterfacePrediction [87], which uses a support vector machine (SVM) to classify antibody surface patches based on 3D Zernike descriptors; and ProABC-2 [131], which upgrades the original algorithm [85] with convolutional neural networks and is reported to improve performance over existing methods. Additionally, paratope predictors have evolved to be specific to the cognate antigen. The Antibody i-Patch [89] algorithm introduces a likelihood score for residue contact as a constraint on local docking to generate predicted paratope residues, and thus requires the structure of the antigen-antibody complex. AG-Fast-Parapred [88], which is based on deep neural networks, utilizes antigen sequence information to predict the paratope.
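
As a point of reference for the methods above, the observation that paratopes tend to correspond to CDR residues suggests a naive baseline: label every CDR residue as paratope. The following minimal sketch illustrates that baseline only; it is not the algorithm of any cited tool. It assumes the chain has already been numbered in the IMGT scheme by an external tool, uses the standard IMGT CDR ranges (27–38, 56–65, 105–117), ignores insertion codes, and the function name and toy input are hypothetical.

```python
# IMGT CDR position ranges (inclusive). This is a naive baseline for
# illustration, not the method used by any published paratope predictor.
IMGT_CDR_RANGES = {"CDR1": (27, 38), "CDR2": (56, 65), "CDR3": (105, 117)}


def cdr_baseline_paratope(numbered_residues):
    """Return (cdr_name, position, amino_acid) for residues inside a CDR.

    numbered_residues: list of (imgt_position, amino_acid) tuples for one
    chain, assumed to be IMGT-numbered already (insertion codes ignored).
    """
    predicted = []
    for position, amino_acid in numbered_residues:
        for cdr_name, (start, end) in IMGT_CDR_RANGES.items():
            if start <= position <= end:
                predicted.append((cdr_name, position, amino_acid))
                break
    return predicted


# Hypothetical toy input: a handful of numbered positions from one chain.
toy_chain = [
    (26, "S"), (27, "G"), (38, "Y"), (39, "M"), (56, "I"),
    (66, "N"), (105, "A"), (117, "Y"), (118, "W"),
]
baseline = cdr_baseline_paratope(toy_chain)
for cdr_name, position, amino_acid in baseline:
    print(cdr_name, position, amino_acid)
```

A baseline of this kind over-predicts, since not every CDR residue contacts the antigen; the learned methods listed above aim to refine it.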

With regard to epitope prediction, many tools are available. Earlier methods were built to predict linear epitopes, which are contiguous stretches of the polypeptide chain; an example is LBtope (Linear B-Cell epitope prediction server) [90], which discriminates experimentally verified B-cell epitopes from background using an SVM. However, the majority of epitopes consist of non-contiguous surface residues characterized by structure as well as sequence. Several methods are available to treat such conformational epitopes. SEPIa [91] uses a combination of two classifiers (naive Bayesian and random forest) applied to the antigen sequence. BepiPred-2.0 [92] uses a random forest algorithm to predict epitopes from primary sequence only. Glep [93] is a recent method based on subgraph clustering for the prediction of separated and overlapping epitopes.

Recently, there has been a realization that epitope prediction without reference to a particular antibody is an ill-posed problem, and methods for “antibody-specific epitope prediction” have been introduced [94]. There are currently few options for antibody-specific epitope prediction. The PEASE (Predicting Epitopes using Antibody Sequence) [95] method applies machine learning to predict true contacts between antibody-antigen residue pairs, providing candidate epitope patches. EpiPred [96] identifies the epitope region by rescoring antibody-antigen global docking based on geometric matching of antigen–antibody interfaces and asymmetric potentials. MAbTope [97] predicts epitope residues based on consensus epitopes shared by top-ranked docking poses; the success of this approach depends on the quality of the docking. PECAN [132] predicts binding interfaces on both antibodies and antigens by learning context-aware structural representations; it applies a unified deep learning framework that combines graph convolutional networks, attention and transfer learning. Although there is a clear awareness of the importance of antibody information in epitope prediction, the traditional antigen-centric methods cannot easily be extended to include such information. This is partially because of the increase in the number of degrees of freedom when antibody-antigen interactions are considered.
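
The idea of deriving an epitope from residues shared by top-ranked docking poses can be illustrated with a simple voting scheme. The sketch below is an assumption-laden toy and not the MAbTope algorithm: it assumes each pose has already been reduced to a set of antigen residue identifiers at the interface, that poses arrive sorted best-first, and that a residue is kept if it appears in at least a chosen fraction of the top poses. The function name, parameters and data are hypothetical.

```python
from collections import Counter


def consensus_epitope(pose_interfaces, top_n, min_fraction):
    """Return residues present in at least min_fraction of the top_n poses.

    pose_interfaces: list of sets of antigen residue ids, ordered best-first
    by docking score (the scoring itself is assumed to happen elsewhere).
    """
    top_poses = pose_interfaces[:top_n]
    if not top_poses:
        return set()
    # Count in how many of the selected poses each residue occurs.
    counts = Counter(res for interface in top_poses for res in interface)
    threshold = min_fraction * len(top_poses)
    return {res for res, count in counts.items() if count >= threshold}


# Hypothetical interfaces for four ranked poses (antigen residue numbers).
poses = [{10, 11, 12, 40}, {11, 12, 13}, {11, 12, 40, 41}, {90, 91, 92}]
epitope = consensus_epitope(poses, top_n=3, min_fraction=0.5)
print(sorted(epitope))
```

Because the vote is taken only over the retained poses, a poor ranking propagates directly into the predicted epitope, which is consistent with the dependence on docking quality noted above.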

5.4. BCR-antigen docking

The most direct means of tackling antibody-antigen interactions is through protein docking, a technique that requires structural information for both antibody and antigen. This introduces six additional degrees of freedom for rigid docking and a host of other issues due to the complexity and inherent uncertainty of protein structural information. Nevertheless, protein docking is a mature field and steady progress has been made in this area. Generally speaking, docking methods can be classified into four categories: fast Fourier transform (FFT) correlation; Monte Carlo (MC) simulated annealing; geometric hashing; and flexible docking [98]. In Table 3, we give a representative list of molecular docking tools or web servers that can be applied to antibody-antigen docking. Of these, ClusPro [99], PatchDock [100], FRODOCK [101] and SnugDock [102] provide antibody-antigen specific modes and are capable of automatically masking non-CDR regions. Among the four, ClusPro, FRODOCK and PatchDock implement rigid-body or soft docking, which does not consider large conformational changes in the antibody or antigen. Although we are not aware of a fully flexible docking methodology tailored for antibody-antigen interactions [102], SnugDock takes molecular flexibility into account by optimizing the antibody-antigen rigid-body positions, the orientation of the H/L chains and the conformations of the six CDR loops.

Table 3.

Antibody docking methods.

Tools	Docking mode	URL	Algorithm	References
ClusPro	Have Ab specific mode	https://cluspro.bu.edu/login.php	FFT based	[99]
SnugDock/Rosetta	Have Ab specific mode	https://rosie.graylab.jhu.edu/snug_dock	Semi flexible docking with energy minimization	[49], [102], [123]
FRODOCK2.0	Have Ab specific mode	http://frodock.chaconlab.org/	FFT based	[101]
PatchDock/ FireDock	Have Ab specific mode	https://bioinfo3d.cs.tau.ac.il/PatchDock/, http://bioinfo3d.cs.tau.ac.il/FireDock/	Geometric hashing based	[100], [124]
HADDOCK2.2	Not Ab specific mode	https://haddock.science.uu.nl/services/HADDOCK2.2/	MC simulated annealing based	[103]
ZDOCK	Not Ab specific mode	http://zdock.umassmed.edu/	FFT based	[105]
SwarmDock	Not Ab specific mode	https://bmm.crick.ac.uk/~svc-bmm-swarmdock/	Flexible docking with Particle Swarm Optimization (PSO)	[125]
LightDock	Not Ab specific mode	https://lightdock.org/	Flexible docking with Glowworm Swarm Optimization (GSO)	[104]
pyDockWeb/ pyDock	Not Ab specific mode	https://life.bsc.es/pid/pydockweb	FFT based	[126]
HDOCK	Not Ab specific mode	http://hdock.phys.hust.edu.cn/	FFT based	[127]
HexServer	Not Ab specific mode	http://hexserver.loria.fr/	FFT based	[128]
ATTRACT	Not Ab specific mode	http://www.attract.ph.tum.de/services/ATTRACT/	Energy minimization	[129]
GRAMM-X	Not Ab specific mode	http://vakser.compbio.ku.edu/resources/gramm/grammx/	FFT based	[130]
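
The degrees of freedom that rigid docking adds, as mentioned before Table 3, are three rotations and three translations of one partner relative to the other. The sketch below shows one way to parameterize and apply such a pose; it is an illustration under stated assumptions, not the search procedure of any tool in Table 3. The Euler-angle convention (rotation about x, then y, then z), the function names and the toy coordinates are all assumptions made for this example.

```python
import math


def rotation_matrix(alpha, beta, gamma):
    """Rotation R = Rz(gamma) * Ry(beta) * Rx(alpha); angles in radians."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [cg * cb, cg * sb * sa - sg * ca, cg * sb * ca + sg * sa],
        [sg * cb, sg * sb * sa + cg * ca, sg * sb * ca - cg * sa],
        [-sb, cb * sa, cb * ca],
    ]


def apply_pose(coords, pose):
    """Apply a six-parameter rigid-body pose to a list of (x, y, z) points.

    pose = (alpha, beta, gamma, tx, ty, tz): three rotations, three shifts.
    """
    alpha, beta, gamma, tx, ty, tz = pose
    rot = rotation_matrix(alpha, beta, gamma)
    shift = (tx, ty, tz)
    moved = []
    for point in coords:
        moved.append(tuple(
            sum(rot[row][col] * point[col] for col in range(3)) + shift[row]
            for row in range(3)
        ))
    return moved


def distance(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical three-atom "ligand"; a rigid pose leaves its shape unchanged.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
moved = apply_pose(ligand, (0.3, -0.7, 1.2, 5.0, -2.0, 8.0))
print(distance(ligand[0], ligand[1]), distance(moved[0], moved[1]))
```

A rigid docking search scores many such poses; flexible methods add internal degrees of freedom (for example CDR loop conformations) on top of these six.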


Recently, Vreven et al. used a well-established flexible docking program, HADDOCK [103], and three other representative tools (ClusPro, LightDock [104] and ZDOCK [105]) to systematically analyze 16 antibody-antigen complexes from the well-studied ZDOCK protein–protein interaction benchmark (version 5.0) [106]. The results were evaluated using criteria established by the Critical Assessment of PRedicted Interactions (CAPRI) community, where models are classified into four categories: Incorrect, Acceptable, Medium, or High quality [107]. It was demonstrated that information-driven docking, even using noisy predictions of epitope and paratope, could significantly improve the performance of all four algorithms [108]. Notably, HADDOCK was capable of providing high quality models for all 16 entries based on CAPRI criteria in this test. However, this study did not evaluate the tolerance of the docking methods to typical BCR modeling errors.
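
The four CAPRI categories can be made concrete with a small classifier. The sketch below uses the commonly cited CAPRI thresholds on the fraction of native contacts (fnat), ligand RMSD (L-RMSD) and interface RMSD (i-RMSD); these numerical cutoffs are stated here as an assumption and should be checked against [107] before use. The function name is hypothetical, and the three input quantities are assumed to have been computed elsewhere.

```python
def capri_class(fnat, l_rmsd, i_rmsd):
    """Classify a docking model with commonly cited CAPRI-style cutoffs.

    fnat: fraction of native contacts reproduced (0 to 1).
    l_rmsd, i_rmsd: ligand and interface RMSD in angstroms.
    The thresholds below are assumptions to be verified against [107].
    """
    if fnat >= 0.5 and (l_rmsd <= 1.0 or i_rmsd <= 1.0):
        return "High"
    if fnat >= 0.3 and (l_rmsd <= 5.0 or i_rmsd <= 2.0):
        return "Medium"
    if fnat >= 0.1 and (l_rmsd <= 10.0 or i_rmsd <= 4.0):
        return "Acceptable"
    return "Incorrect"


# Hypothetical models: (fnat, L-RMSD, i-RMSD).
examples = [(0.8, 0.9, 0.7), (0.6, 3.0, 1.5), (0.2, 8.0, 3.5), (0.05, 2.0, 1.0)]
for fnat, l_rmsd, i_rmsd in examples:
    print(fnat, l_rmsd, i_rmsd, capri_class(fnat, l_rmsd, i_rmsd))
```

Testing from the strictest category downward means that a model with many native contacts but a larger RMSD falls through to the next category rather than being rejected outright.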

As with all protein docking from homology models, the success of docking antibody models depends heavily on the quality of the starting structures [109]. Structural uncertainties in the binding regions can arise either from flexibility or from modeling errors. Moreover, the regions of greatest uncertainty tend to be the CDRs (especially CDRH3), which are highly likely to form part of the paratope [110]. These issues can be addressed to some extent by use of epitope and paratope predictions. However, few antibody docking methods have been rigorously tested using a large benchmark of realistic models. The bottom line is that structure-based prediction of antibody-antigen interactions from sequence involves a number of interrelated tasks: receptor and antigen model building, initial epitope and paratope prediction, docking, scoring and refinement. The combination of so many critical steps results in complexity, both in terms of software integration and in parameter optimization. Fortunately, the emergence of larger and better BCR sequence datasets will be a motivation to develop well-integrated structure prediction pipelines.

6. Molecular dynamics

In this review, we have focused primarily on high-throughput structure-based methods that can be applied to BCR or TCR repertoires. As is clear from the previous section, combining software methods that work well in isolation introduces complexity. Such complexity arises from conceptual considerations (e.g. parameter optimization) and technical issues (e.g. code interoperability). In this regard, molecular dynamics (MD) is conceptually simple: it applies Newtonian mechanics to molecular systems. The force fields describing the interatomic interactions can be taken as given and generally do not have to be optimized. Therefore, even though MD is not a high-throughput method, it can be used to independently confirm BCR- or TCR-specific calculations.
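
The statement that MD applies Newtonian mechanics can be illustrated with a single integration loop. The sketch below propagates one particle on a harmonic spring with the velocity Verlet scheme, a standard MD integrator, in arbitrary reduced units. It is a toy under stated assumptions: real MD engines evaluate full force fields over many atoms and add thermostats and constraints. All names and parameter values are chosen for illustration only.

```python
def harmonic_force(x, k, x0):
    """Force from a harmonic potential U = 0.5 * k * (x - x0)**2."""
    return -k * (x - x0)


def velocity_verlet(x, v, mass, k, x0, dt, n_steps):
    """Integrate Newton's equation F = m * a for one particle in 1D."""
    trajectory = [(x, v)]
    force = harmonic_force(x, k, x0)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * force / mass   # half-step velocity update
        x = x + dt * v_half                    # full-step position update
        force = harmonic_force(x, k, x0)       # force at the new position
        v = v_half + 0.5 * dt * force / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


def total_energy(x, v, mass, k, x0):
    """Kinetic plus potential energy of the particle."""
    return 0.5 * mass * v * v + 0.5 * k * (x - x0) ** 2


# Reduced units; the particle starts displaced and at rest.
traj = velocity_verlet(x=1.0, v=0.0, mass=1.0, k=1.0, x0=0.0, dt=0.01, n_steps=1000)
e_start = total_energy(*traj[0], mass=1.0, k=1.0, x0=0.0)
e_end = total_energy(*traj[-1], mass=1.0, k=1.0, x0=0.0)
print(e_start, e_end)
```

Near-constant total energy over the run is a basic sanity check for an integrator of this kind; no parameters were fitted to obtain it, which mirrors the point that the physics is taken as given.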

As with all proteins, the dynamics of BCRs and TCRs are intimately tied to their functions. Protein dynamics are governed by interactions at the level of individual atoms. The time and length scales involved are, however, difficult to observe experimentally. Molecular dynamics offers the possibility to observe the behavior of proteins and lipids at atomistic resolution, and can therefore contribute to a better understanding of the immune system. The challenges facing such studies are illustrated by recent work by the Deane group, who used a large number of molecular dynamics simulations to investigate the influence of point mutations on the structure and dynamics of an epitope derived from the Epstein-Barr virus [111]. In their simulations they did not observe a strong relation between the structural and dynamical features of the epitope and its immunogenicity. It is not clear whether this is due to limitations in their modelling or to the complexity of the immune system. Reboul et al. investigated the immunogenicity of a specific epitope when presented by two structurally highly similar MHC complexes, HLA-B*3508 and HLA-B*3501. Only when the epitope is bound to HLA-B*3508 is a strong interaction with the T cell receptor formed. Simulations showed that the epitope exhibits a much higher flexibility in HLA-B*3501, thereby apparently hindering the formation of a strong interaction with the T cell receptor [112].
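
Flexibility comparisons of the kind described above are often summarized per atom as a root-mean-square fluctuation (RMSF) over a trajectory. The sketch below shows that calculation in its simplest form; it is an illustration and not the analysis used in the cited studies. It assumes the frames have already been superimposed on a common reference, and the function name and toy trajectory are hypothetical.

```python
import math


def rmsf_per_atom(frames):
    """Root-mean-square fluctuation of each atom about its mean position.

    frames: list of frames, each a list of (x, y, z) tuples with one entry
    per atom; frames are assumed to be pre-aligned to a common reference.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    result = []
    for i in range(n_atoms):
        mean = [sum(frame[i][d] for frame in frames) / n_frames for d in range(3)]
        mean_sq_dev = sum(
            sum((frame[i][d] - mean[d]) ** 2 for d in range(3))
            for frame in frames
        ) / n_frames
        result.append(math.sqrt(mean_sq_dev))
    return result


# Toy trajectory: atom 0 is rigid, atom 1 oscillates along x.
toy_frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
]
fluctuations = rmsf_per_atom(toy_frames)
print(fluctuations)
```

Comparing such profiles for the same peptide in two MHC contexts gives a simple, quantitative view of which residues become more mobile.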

Most studies of the T cell receptor examine its dynamics only when bound to a pMHC. In contrast, Dominguez and Knapp compared the dynamics of T cell receptors bound to pMHC with those of free T cell receptors. Apart from expected results, such as increased flexibility and an increased solvent-accessible surface of the CDRs in the free T cell receptor, they also found differences in the hydrogen bond network of the CDR3α loop in the free TCR versus the pMHC-bound TCR [113]. A study combining steered molecular dynamics and single-molecule biophysical experiments [114] examined the formation of catch bonds between the pMHC and the TCR. Catch bonds are a special type of bond in which the lifetime increases when more force is applied. This study suggests that catch bond formation is influenced by conformational changes in the pMHC. A downside of molecular dynamics simulations is their high computational cost. Fodor et al. were able to distill conformational data from pMHC class I X-ray structures using ensemble refinement, a refinement technique that yields dynamic data without the need for more computationally intensive molecular dynamics simulations [115]. Another way to reduce the computational requirements is to use coarse-grained simulations, in which atoms are grouped together into beads. Coarse graining allows for the study of much larger systems on longer time scales. Friess et al. modeled the transmembrane domains of the immunoglobulin M (IgM) B cell receptor, which have been unresolved so far, and subsequently used coarse-grained simulations to study their aggregation behavior and association with lipid rafts [116].
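
The grouping of atoms into beads can be sketched as a mapping followed by a center-of-mass calculation. The example below is a minimal illustration under stated assumptions: the atom names, masses, coordinates and the atom-to-bead mapping are hypothetical and do not correspond to any published coarse-grained force field.

```python
def coarse_grain(atoms, mapping):
    """Replace groups of atoms by beads placed at each group's center of mass.

    atoms: dict of atom_name -> (mass, (x, y, z)).
    mapping: dict of bead_name -> list of atom names (hypothetical mapping).
    Returns a dict of bead_name -> (total_mass, (x, y, z)).
    """
    beads = {}
    for bead_name, atom_names in mapping.items():
        total_mass = sum(atoms[name][0] for name in atom_names)
        center = tuple(
            sum(atoms[name][0] * atoms[name][1][d] for name in atom_names) / total_mass
            for d in range(3)
        )
        beads[bead_name] = (total_mass, center)
    return beads


# Hypothetical three-atom fragment reduced to two beads.
toy_atoms = {
    "C1": (12.0, (0.0, 0.0, 0.0)),
    "C2": (12.0, (2.0, 0.0, 0.0)),
    "O1": (16.0, (0.0, 3.0, 0.0)),
}
toy_mapping = {"B1": ["C1", "C2"], "B2": ["O1"]}
toy_beads = coarse_grain(toy_atoms, toy_mapping)
print(toy_beads)
```

Fewer interaction sites mean fewer pairwise force evaluations per step, which is one reason coarse-grained models can reach larger systems and longer time scales.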

7. Conclusions

Recent advances in sequencing technology enable the study of immune responses in unprecedented breadth and depth. As discussed above, the emerging data has spawned the development of a wide range of modeling methods that are applicable to B cells, T cells or both. Current challenges include the integration of data and methodologies. For example, sequence and structural information can, in principle, be combined to yield more accurate descriptions of receptors sharing antigen and epitope specificity. Structural modeling is still not in the mainstream of repertoire analysis; nevertheless, 3D modeling methods present a straightforward direction to encompass “shared features” of functionally related receptors in different donors.

In the context of repertoire analysis, we are often interested in the target antigens and epitopes; however, the scale of publicly available data on targeted antigens and epitopes is currently smaller than that of BCR/TCR sequences, and vastly smaller than the actual BCR-antigen or TCR-peptide-MHC interactome. As barcoding methods evolve to include antigens themselves [42], there may soon be new and valuable data available to train methods for functional classification of BCRs and TCRs.

At the point where we are asking not only what is targeted but also why or why not, structural modeling is likely to play a critical role in our understanding of BCR and TCR molecular recognition. As a case in point, at the time of this writing, we are in the midst of the COVID-19 pandemic. This is an example where the target antigens, along with their structures, are largely known, and understanding host immune responses to these antigens is of vital importance in the development of diagnostics, biomarkers, vaccines and therapeutics [117]. Structural similarities among neutralizing antibodies targeting SARS-CoV-2 [69], or between those targeting SARS-CoV-1 and SARS-CoV-2 [118], have been noted. With such high stakes driving research and development, integration of emerging technologies in the repertoire analysis domain, including structural analysis, is expected. As the saying goes, “necessity is the mother of invention,” and the need for understanding human immune repertoires has never been greater.

Author statement

Shunsuke Teraguchi, Diego Diez, Hendra S. Ismanto and Mara Anais Llamas Covarrubias contributed significantly to writing the sequencing sections. Dianita Saputri, Sedat Aybars Nazlica, Jiaqi Xie and Martin de Jesus Loza Lopez contributed to the experimental epitope determination sections. John Rozewicki, Ana Davila and Jan Wilamowski wrote most of the in-house software sections. Floris J. van Eerden wrote the MD section. Zichang Xu wrote the BCR docking section. Daron M. Standley wrote the overall manuscript and coordinated the efforts of the other members.

Acknowledgements

We would like to thank all members of the Systems Immunology Lab for helpful discussions. This research was supported by JSPS KAKENHI Grant Numbers 18H02430, 20K06610, 20K16286, 20K07538 and by the Platform Project for Supporting Drug Discovery and Life Science Research (Basis for Supporting Innovative Drug Discovery and Life Science Research (BINDS)) from AMED under Grant Number 17am0101108j0001.

References

1. Murphy K. Janeway's immunobiology. New York: Garland Science; 2008.
2. Mora T., Walczak A.M. How many different clonotypes do immune repertoires contain? Curr Opin Syst Biol. 2019;18:104–110.
3. Turner S.J. Structural determinants of T-cell receptor bias in immunity. Nat Rev Immunol. 2006;6(12):883–894. doi: 10.1038/nri1977.
4. Reinhardt R.L., Liang H.E., Locksley R.M. Cytokine-secreting follicular T cells shape the antibody repertoire. Nat Immunol. 2009;10(4):385–393. doi: 10.1038/ni.1715.
5. Bagaev D.V. VDJdb in 2019: database extension, new analysis infrastructure and a T-cell receptor motif compendium. Nucleic Acids Res. 2020;48(D1):D1057–D1062. doi: 10.1093/nar/gkz874.
6. Miqueu P. Statistical analysis of CDR3 length distributions for the assessment of T and B cell repertoire biases. Mol Immunol. 2007;44(6):1057–1064. doi: 10.1016/j.molimm.2006.06.026.
7. Calis J.J., Rosenberg B.R. Characterizing immune repertoires by high throughput sequencing: strategies and applications. Trends Immunol. 2014;35(12):581–590. doi: 10.1016/j.it.2014.09.004.
8. Hou X.L. Current status and recent advances of next generation sequencing techniques in immunological repertoire. Genes Immun. 2016;17(3):153–164. doi: 10.1038/gene.2016.9.
9. Brochet X., Lefranc M.P., Giudicelli V. IMGT/V-QUEST: the highly customized and integrated system for IG and TR standardized V-J and V-D-J sequence analysis. Nucleic Acids Res. 2008;36(Web Server issue):W503–8.
10. Ralph D.K., Matsen F.A. IV. Per-sample immunoglobulin germline inference from B cell receptor deep sequencing data. PLoS Comput Biol. 2019;15(7):e1007133.
11. Alamyar E. IMGT((R)) tools for the nucleotide analysis of immunoglobulin (IG) and T cell receptor (TR) V-(D)-J repertoires, polymorphisms, and IG mutations: IMGT/V-QUEST and IMGT/HighV-QUEST for NGS. Methods Mol Biol. 2012;882:569–604. doi: 10.1007/978-1-61779-842-9_32.
12. Li S. IMGT/HighV QUEST paradigm for T cell receptor IMGT clonotype diversity and next generation repertoire immunoprofiling. Nat Commun. 2013;4:2333. doi: 10.1038/ncomms3333.
13. Ye J., et al. IgBLAST: an immunoglobulin variable domain sequence analysis tool. Nucleic Acids Res. 2013;41(Web Server issue):W34–40.
14. Altschul S.F. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25(17):3389–3402. doi: 10.1093/nar/25.17.3389.
15. Bolotin D.A. MiXCR: software for comprehensive adaptive immunity profiling. Nat Methods. 2015;12(5):380–381. doi: 10.1038/nmeth.3364.
16. Bolotin D.A. Antigen receptor repertoire profiling from RNA-seq data. Nat Biotechnol. 2017;35(10):908–911. doi: 10.1038/nbt.3979.
17. Smakaj E. Benchmarking immunoinformatic tools for the analysis of antibody repertoire sequences. Bioinformatics. 2020;36(6):1731–1739. doi: 10.1093/bioinformatics/btz845.
18. Vander Heiden J.A. pRESTO: a toolkit for processing high-throughput sequencing raw reads of lymphocyte receptor repertoires. Bioinformatics. 2014;30(13):1930–1932. doi: 10.1093/bioinformatics/btu138.
19. Gupta N.T. Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data. Bioinformatics. 2015;31(20):3356–3358. doi: 10.1093/bioinformatics/btv359.
20. Li B. Ultrasensitive detection of TCR hypervariable-region sequences in solid-tissue RNA-seq data. Nat Genet. 2017;49(4):482–483. doi: 10.1038/ng.3820.
21. Shugay M. VDJtools: unifying post-analysis of T cell receptor repertoires. PLoS Comput Biol. 2015;11(11) doi: 10.1371/journal.pcbi.1004503.
22. Duez M. Vidjil: a web platform for analysis of high-throughput repertoire sequencing. PLoS ONE. 2016;11(11) doi: 10.1371/journal.pone.0166126.
23. Nazarov V.I. tcR: an R package for T cell receptor repertoire advanced data analysis. BMC Bioinf. 2015;16:175. doi: 10.1186/s12859-015-0613-1.
24. Laydon D.J. Quantification of HTLV-1 clonality and TCR diversity. PLoS Comput Biol. 2014;10(6) doi: 10.1371/journal.pcbi.1003646.
25. Bolen C.R. The Repertoire Dissimilarity Index as a method to compare lymphocyte receptor repertoires. BMC Bioinf. 2017;18(1):155. doi: 10.1186/s12859-017-1556-5.
26. Yokota R., Kaminaga Y., Kobayashi T.J. Quantification of inter-sample differences in T-cell receptor repertoires using sequence-based information. Front Immunol. 2017;8:1500. doi: 10.3389/fimmu.2017.01500.
27. Emerson R.O. Immunosequencing identifies signatures of cytomegalovirus exposure history and HLA-mediated effects on the T cell repertoire. Nat Genet. 2017;49(5):659–665. doi: 10.1038/ng.3822.
28. DeWitt W.S. 3rd, et al. Human T cell receptor occurrence patterns encode immune history, genetic background, and receptor specificity. Elife. 2018;7.
29. Sethna Z. OLGA: fast computation of generation probabilities of B- and T-cell receptor amino acid sequences and motifs. Bioinformatics. 2019;35(17):2974–2981. doi: 10.1093/bioinformatics/btz035.
30. Sethna Z., et al. Population variability in the generation and thymic selection of T-cell repertoires. bioRxiv. 2020:2020.01.08.899682.
31. Davidsen K. Deep generative models for T cell receptor protein sequences. Elife. 2019;8. doi: 10.7554/eLife.46935.
32. Pogorelyy M.V. Detecting T cell receptors involved in immune responses from single repertoire snapshots. PLoS Biol. 2019;17(6) doi: 10.1371/journal.pbio.3000314.
33. Murugan A. Statistical inference of the generation probability of T-cell receptors from sequence repertoires. Proc Natl Acad Sci U S A. 2012;109(40):16161–16166. doi: 10.1073/pnas.1212755109.
34. Marcou Q., Mora T., Walczak A.M. High-throughput immune repertoire analysis with IGoR. Nat Commun. 2018;9(1):561. doi: 10.1038/s41467-018-02832-w.
35. Singh M. High-throughput targeted long-read single cell sequencing reveals the clonal and transcriptional landscape of lymphocytes. Nat Commun. 2019;10(1):3120. doi: 10.1038/s41467-019-11049-4.
36. Afik S. Targeted reconstruction of T cell receptor sequence from single cell RNA-seq links CDR3 length to T cell differentiation state. Nucleic Acids Res. 2017;45(16) doi: 10.1093/nar/gkx615.
37. Stubbington M.J.T. T cell fate and clonality inference from single-cell transcriptomes. Nat Methods. 2016;13(4):329–332. doi: 10.1038/nmeth.3800.
38. Eltahla A.A. Linking the T cell receptor to the single cell transcriptome in antigen-specific human T cells. Immunol Cell Biol. 2016;94(6):604–611. doi: 10.1038/icb.2016.16.
39. Canzar S. BASIC: BCR assembly from single cells. Bioinformatics. 2017;33(3):425–427. doi: 10.1093/bioinformatics/btw631.
40. Lindeman I. BraCeR: B-cell-receptor reconstruction and clonality inference from single-cell RNA-seq. Nat Methods. 2018;15(8):563–565. doi: 10.1038/s41592-018-0082-3.
41. Rizzetto S. B-cell receptor reconstruction from single-cell RNA-seq with VDJPuzzle. Bioinformatics. 2018;34(16):2846–2847. doi: 10.1093/bioinformatics/bty203.
42. Setliff I., et al. High-throughput mapping of B cell receptor sequences to antigen specificity. Cell. 2019;179(7):1636–1646.e15.
43. Jurtz V.I., et al. NetTCR: sequence-based prediction of TCR binding to peptide-MHC complexes using convolutional neural networks. bioRxiv. 2018:433706.
44. Kula T., et al. T-scan: a genome-wide method for the systematic discovery of T cell epitopes. Cell. 2019;178(4):1016–1028.e13.
45. Gee M.H., et al. Antigen identification for orphan T cell receptors expressed on tumor-infiltrating lymphocytes. Cell. 2018;172(3):549–563.e16.
46. Kobayashi E. A new cloning and expression system yields and validates TCRs from blood lymphocytes of patients with cancer within 10 days. Nat Med. 2013;19(11):1542–1546. doi: 10.1038/nm.3358.
47. Marks C., Deane C.M. Antibody H3 Structure Prediction. Comput Struct Biotechnol J. 2017;15:222–231. doi: 10.1016/j.csbj.2017.01.010.
48. Marcatili P., Rosi A., Tramontano A. PIGS: automatic prediction of antibody structures. Bioinformatics. 2008;24(17):1953–1954. doi: 10.1093/bioinformatics/btn341.
49. Sircar A., Kim E.T., Gray J.J. RosettaAntibody: antibody variable region homology modeling server. Nucleic Acids Res. 2009;37(Web Server issue):W474–9.
50. Nishigami H., Kamiya N., Nakamura H. Revisiting antibody modeling assessment for CDR-H3 loop. Protein Eng Des Sel. 2016;29(11):477–484. doi: 10.1093/protein/gzw028.
51. Mandell D.J., Coutsias E.A., Kortemme T. Sub-angstrom accuracy in protein loop reconstruction by robotics-inspired conformational sampling. Nat Methods. 2009;6(8):551–552. doi: 10.1038/nmeth0809-551.
52. Almagro J.C. Second antibody modeling assessment (AMA-II). Proteins. 2014;82(8):1553–1562. doi: 10.1002/prot.24567.
53. Shirai H. High-resolution modeling of antibody structures by a combination of bioinformatics, expert knowledge, and molecular simulations. Proteins. 2014;82(8):1624–1635. doi: 10.1002/prot.24591.
54. Yamashita K. Kotai Antibody Builder: automated high-resolution structural modeling of antibodies. Bioinformatics. 2014;30(22):3279–3280. doi: 10.1093/bioinformatics/btu510.
55. Schritt D. Repertoire Builder: High-throughput structural modeling of B and T cell receptors. Mol Syst Des Eng. 2019;4:761–768.
56. Leem J. ABodyBuilder: Automated antibody structure prediction with data-driven accuracy estimation. MAbs. 2016;8(7):1259–1268. doi: 10.1080/19420862.2016.1205773.
57. Gowthaman R., Pierce B.G. TCRmodel: high resolution modeling of T cell receptors from sequence. Nucleic Acids Res. 2018;46(W1):W396–W401. doi: 10.1093/nar/gky432.
58. Lepore R. PIGSPro: prediction of immunoGlobulin structures v2. Nucleic Acids Res. 2017;45(W1):W17–W23. doi: 10.1093/nar/gkx334.
59. Rozewicki J. MAFFT-DASH: integrated protein sequence and structural alignment. Nucleic Acids Res. 2019;47(W1):W5–W10. doi: 10.1093/nar/gkz342.
60. Katoh K., Frith M.C. Adding unaligned sequences into an existing alignment using MAFFT and LAST. Bioinformatics. 2012;28(23):3144–3146. doi: 10.1093/bioinformatics/bts578.
61.Edgar R.C. Search and clustering orders of magnitude faster than BLAST. Bioinformatics. 2010;26(19):2460–2461. doi: 10.1093/bioinformatics/btq461. [DOI] [PubMed] [Google Scholar]
62.Li W., Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006;22(13):1658–1659. doi: 10.1093/bioinformatics/btl158. [DOI] [PubMed] [Google Scholar]
63.Glanville J. Identifying specificity groups in the T cell receptor repertoire. Nature. 2017;547(7661):94–98. doi: 10.1038/nature22976. [DOI] [PMC free article] [PubMed] [Google Scholar]
64.Huang H. Analyzing the Mycobacterium tuberculosis immune response by T-cell receptor clustering with GLIPH2 and genome-wide antigen screening. Nat Biotechnol. 2020 doi: 10.1038/s41587-020-0505-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
65.Dash P. Quantifiable predictive features define epitope-specific T cell receptor repertoires. Nature. 2017;547(7661):89–93. doi: 10.1038/nature22383. [DOI] [PMC free article] [PubMed] [Google Scholar]
66.DeWitt W.S. A diverse lipid antigen-specific TCR repertoire is clonally expanded during active tuberculosis. J Immunol. 2018;201(3):888–896. doi: 10.4049/jimmunol.1800186. [DOI] [PMC free article] [PubMed] [Google Scholar]
67.Scheid J.F. Sequence and structural convergence of broad and potent HIV antibodies that mimic CD4 binding. Science. 2011;333(6049):1633–1637. doi: 10.1126/science.1207227. [DOI] [PMC free article] [PubMed] [Google Scholar]
68.Joyce M.G. Vaccine-induced antibodies that Neutralize Group 1 and Group 2 influenza A viruses. Cell. 2016;166(3):609–623. doi: 10.1016/j.cell.2016.06.043. [DOI] [PMC free article] [PubMed] [Google Scholar]
69.Robbiani D.F. Convergent antibody responses to SARS-CoV-2 in convalescent individuals. Nature. 2020 doi: 10.1038/s41586-020-2456-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
70.Kovaltsuk A. Structural diversity of B-cell receptor repertoires along the B-cell differentiation axis in humans and mice. PLoS Comput Biol. 2020;16(2) doi: 10.1371/journal.pcbi.1007636. [DOI] [PMC free article] [PubMed] [Google Scholar]
71.Xu Z, et al., Functional clustering of B cell receptors using sequence and structural features. Mol Syst Des Eng, 2019. in press.
72.Miho E. Large-scale network analysis reveals the sequence space architecture of antibody repertoires. Nat Commun. 2019;10(1):1321. doi: 10.1038/s41467-019-09278-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
73.Joglekar A.V., Li G. T cell antigen discovery. Nat Methods. 2020 doi: 10.1038/s41592-020-0867-z. [DOI] [PubMed] [Google Scholar]
74.Fischer DS, et al., Predicting antigen-specificity of single T-cells based on TCR CDR3 regions. bioRxiv, 2019: p. 734053. [DOI] [PMC free article] [PubMed]
75.Jokinen E, et al., TCRGP: Determining epitope specificity of T cell receptors. bioRxiv, 2019: p. 542332.
76.Ogishi M., Yotsuyanagi H. Quantitative prediction of the landscape of T cell epitope immunogenicity in sequence space. Front Immunol. 2019;10:827. doi: 10.3389/fimmu.2019.00827. [DOI] [PMC free article] [PubMed] [Google Scholar]
77.Gielis S. Detection of enriched T cell epitope specificity in full T cell receptor sequence repertoires. Front Immunol. 2019;10:2820. doi: 10.3389/fimmu.2019.02820. [DOI] [PMC free article] [PubMed] [Google Scholar]
78.Sun Y. Specificity, privacy, and degeneracy in the CD4 T Cell receptor repertoire following immunization. Front Immunol. 2017;8:430. doi: 10.3389/fimmu.2017.00430. [DOI] [PMC free article] [PubMed] [Google Scholar]
79.Thomas N. Tracking global changes induced in the CD4 T-cell receptor repertoire by immunization with a complex antigen using short stretches of CDR3 protein sequence. Bioinformatics. 2014;30(22):3181–3188. doi: 10.1093/bioinformatics/btu523. [DOI] [PMC free article] [PubMed] [Google Scholar]
80.Lanzarotti E., Marcatili P., Nielsen M. T-cell receptor cognate target prediction based on paired alpha and beta chain sequence and structural CDR loop similarities. Front Immunol. 2019;10:2080. doi: 10.3389/fimmu.2019.02080.
81.Li S, et al., Structural modeling of lymphocyte receptors and their antigens. Meth Mol Biol, 2019. in press.
82.Klausen M.S. LYRA, a webserver for lymphocyte receptor structural modeling. Nucleic Acids Res. 2015;43(W1):W349–W355. doi: 10.1093/nar/gkv535.
83.Jensen K.K. TCRpMHCmodels: structural modelling of TCR-pMHC class I complexes. Sci Rep. 2019;9(1):14530. doi: 10.1038/s41598-019-50932-4.
84.Kunik V, Ashkenazi S, Ofran Y, Paratome: an online tool for systematic identification of antigen-binding regions in antibodies based on sequence or structure. Nucleic Acids Res, 2012. 40(Web Server issue): p. W521-4.
85.Olimpieri P.P. Prediction of site-specific interactions in antibody-antigen complexes: the proABC method and server. Bioinformatics. 2013;29(18):2285–2291. doi: 10.1093/bioinformatics/btt369.
86.Liberis E. Parapred: antibody paratope prediction using convolutional and recurrent neural networks. Bioinformatics. 2018;34(17):2944–2950. doi: 10.1093/bioinformatics/bty305.
87.Daberdaku S., Ferrari C. Antibody interface prediction with 3D Zernike descriptors and SVM. Bioinformatics. 2019;35(11):1870–1876. doi: 10.1093/bioinformatics/bty918.
88.Deac A., Veličković P., Sormanni P. Attentive cross-modal paratope prediction. J Comput Biol. 2019;26(6):536–545. doi: 10.1089/cmb.2018.0175.
89.Krawczyk K. Antibody i-Patch prediction of the antibody binding site improves rigid local antibody-antigen docking. Protein Eng Des Sel. 2013;26(10):621–629. doi: 10.1093/protein/gzt043.
90.Singh H., Ansari H.R., Raghava G.P. Improved method for linear B-cell epitope prediction using antigen's primary sequence. PLoS ONE. 2013;8(5) doi: 10.1371/journal.pone.0062216.
91.Dalkas G.A., Rooman M. SEPIa, a knowledge-driven algorithm for predicting conformational B-cell epitopes from the amino acid sequence. BMC Bioinf. 2017;18(1):95. doi: 10.1186/s12859-017-1528-9.
92.Jespersen M.C. BepiPred-2.0: improving sequence-based B-cell epitope prediction using conformational epitopes. Nucleic Acids Res. 2017;45(W1):W24–W29. doi: 10.1093/nar/gkx346.
93.Zhao L. Novel overlapping subgraph clustering for the detection of antigen epitopes. Bioinformatics. 2018;34(12):2061–2068. doi: 10.1093/bioinformatics/bty051.
94.Jespersen M.C. Antibody specific B-cell epitope predictions: leveraging information from antibody-antigen protein complexes. Front Immunol. 2019;10:298. doi: 10.3389/fimmu.2019.00298.
95.Sela-Culang I. PEASE: predicting B-cell epitopes utilizing antibody sequence. Bioinformatics. 2015;31(8):1313–1315. doi: 10.1093/bioinformatics/btu790.
96.Krawczyk K. Improving B-cell epitope prediction and its application to global antibody-antigen docking. Bioinformatics. 2014;30(16):2288–2294. doi: 10.1093/bioinformatics/btu190.
97.Bourquard T. MAbTope: a method for improved epitope mapping. J Immunol. 2018;201(10):3096–3105. doi: 10.4049/jimmunol.1701722.
98.Janin J. Protein-protein docking tested in blind predictions: the CAPRI experiment. Mol Biosyst. 2010;6(12):2351–2362. doi: 10.1039/c005060c.
99.Kozakov D. The ClusPro web server for protein-protein docking. Nat Protoc. 2017;12(2):255–278. doi: 10.1038/nprot.2016.169.
100.Schneidman-Duhovny D, et al., PatchDock and SymmDock: servers for rigid and symmetric docking. Nucleic Acids Res, 2005. 33(Web Server issue): p. W363-7.
101.Ramirez-Aportela E, Lopez-Blanco JR, Chacon P, FRODOCK 2.0: fast protein-protein docking server. Bioinformatics, 2016. 32(15): p. 2386-8.
102.Sircar A, Gray JJ, SnugDock: paratope structural optimization during antibody-antigen docking compensates for errors in antibody homology models. PLoS Comput Biol, 2010. 6(1): p. e1000644.
103.van Zundert G.C.P. The HADDOCK2.2 web server: user-friendly integrative modeling of biomolecular complexes. J Mol Biol. 2016;428(4):720–725. doi: 10.1016/j.jmb.2015.09.014.
104.Roel-Touris J., Bonvin A., Jimenez-Garcia B. LightDock goes information-driven. Bioinformatics. 2020;36(3):950–952. doi: 10.1093/bioinformatics/btz642.
105.Pierce B.G. ZDOCK server: interactive docking prediction of protein-protein complexes and symmetric multimers. Bioinformatics. 2014;30(12):1771–1773. doi: 10.1093/bioinformatics/btu097.
106.Vreven T. Updates to the integrated protein-protein interaction benchmarks: docking benchmark version 5 and affinity benchmark version 2. J Mol Biol. 2015;427(19):3031–3041. doi: 10.1016/j.jmb.2015.07.016.
107.Lensink MF, Velankar S, Wodak SJ, Modeling protein-protein and protein-peptide complexes: CAPRI 6th edition. Proteins, 2017. 85(3): p. 359-377.
108.Ambrosetti F, et al., Modeling antibody-antigen complexes by information-driven docking. Structure 2020. 28(1): p. 119-129 e2.
109.Anishchenko I., Kundrotas P.J., Vakser I.A. Modeling complexes of modeled proteins. Proteins. 2017;85(3):470–478. doi: 10.1002/prot.25183.
110.Norman R.A. Computational approaches to therapeutic antibody design: established methods and emerging trends. Brief Bioinform. 2019 doi: 10.1093/bib/bbz095.
111.Knapp B., Dunbar J., Deane C.M. Large scale characterization of the LC13 TCR and HLA-B8 structural landscape in reaction to 172 altered peptide ligands: a molecular dynamics simulation study. PLoS Comput Biol. 2014;10(8) doi: 10.1371/journal.pcbi.1003748.
112.Reboul C.F. Epitope flexibility and dynamic footprint revealed by molecular dynamics of a pMHC-TCR complex. PLoS Comput Biol. 2012;8(3) doi: 10.1371/journal.pcbi.1002404.
113.Dominguez J.L., Knapp B. How peptide/MHC presence affects the dynamics of the LC13 T-cell receptor. Sci Rep. 2019;9(1):2638. doi: 10.1038/s41598-019-38788-0.
114.Wu P, et al., Mechano-regulation of peptide-MHC Class I conformations determines TCR antigen recognition. Mol Cell 2019;73(5): p. 1015-1027 e7.
115.Fodor J. Previously hidden dynamics at the TCR-peptide-MHC interface revealed. J Immunol. 2018;200(12):4134–4145. doi: 10.4049/jimmunol.1800315.
116.Friess M.D., Pluhackova K., Bockmann R.A. Structural model of the mIgM B-cell receptor transmembrane domain from self-association molecular dynamics simulations. Front Immunol. 2018;9:2947. doi: 10.3389/fimmu.2018.02947.
117.Tay M.Z. The trinity of COVID-19: immunity, inflammation and intervention. Nat Rev Immunol. 2020 doi: 10.1038/s41577-020-0311-8.
118.Cao Y. Potent neutralizing antibodies against SARS-CoV-2 identified by high-throughput single-cell sequencing of convalescent patients' B cells. Cell. 2020 doi: 10.1016/j.cell.2020.05.025.
119.Avram O. ASAP – A webserver for immunoglobulin-sequencing analysis pipeline. Front Immunol. 2018;9:1686. doi: 10.3389/fimmu.2018.01686.
120.IJspeert H, et al., Antigen receptor galaxy: A user-friendly, web-based tool for analysis and visualization of T and B cell receptor repertoire data. J Immunol 2017;198(10): p. 4156–4165.
121.Bischof J., Ibrahim S.M. bcRep: R package for comprehensive analysis of B cell receptor repertoire data. PLoS ONE. 2016;11(8) doi: 10.1371/journal.pone.0161569.
122.Olson B.J. sumrep: a summary statistic framework for immune receptor repertoire comparison and model validation. Front Immunol. 2019;10:2533. doi: 10.3389/fimmu.2019.02533.
123.Weitzner B.D. Modeling and docking of antibody structures with Rosetta. Nat Protoc. 2017;12(2):401–416. doi: 10.1038/nprot.2016.180.
124.Mashiach E, et al., FireDock: a web server for fast interaction refinement in molecular docking. Nucleic Acids Res 2008;36(Web Server issue): p. W229–32.
125.Torchala M. SwarmDock: a server for flexible protein-protein docking. Bioinformatics. 2013;29(6):807–809. doi: 10.1093/bioinformatics/btt038.
126.Jiménez-García B., Pons C., Fernández-Recio J. pyDockWEB: a web server for rigid-body protein-protein docking using electrostatics and desolvation scoring. Bioinformatics. 2013;29(13):1698–1699. doi: 10.1093/bioinformatics/btt262.
127.Yan Y. HDOCK: a web server for protein-protein and protein-DNA/RNA docking based on a hybrid strategy. Nucleic Acids Res. 2017;45(W1):W365–W373. doi: 10.1093/nar/gkx407.
128.Macindoe G, et al., HexServer: an FFT-based protein docking server powered by graphics processors. Nucleic Acids Res 2010;38(Web Server issue): p. W445–9.
129.de Vries S.J. A web interface for easy flexible protein-protein docking with ATTRACT. Biophys J. 2015;108(3):462–465. doi: 10.1016/j.bpj.2014.12.015.
130.Tovchigrechko A, Vakser IA, GRAMM-X public web server for protein-protein docking. Nucleic Acids Res, 2006. 34(Web Server issue): p. W310–4.
131.Ambrosetti F. proABC-2: PRediction Of AntiBody Contacts v2 and its application to information-driven docking. bioRxiv. 2020 doi: 10.1101/2020.03.18.967828.
132.Pittala S., Bailey-Kellogg C. Learning context-aware structural representations to predict antigen and antibody binding interfaces. Bioinformatics. 2020;36(13):3996–4003. doi: 10.1093/bioinformatics/btaa263.

[b0005] 1.Murphy K. Garland Science; New York: 2008. Janeway's immunobiology. [Google Scholar]

[b0010] 2.Mora T., Walczak A.M. How many different clonotypes do immune repertoires contain? Curr Opin Syst Biol. 2019;18:104–110. [Google Scholar]

[b0015] 3.Turner S.J. Structural determinants of T-cell receptor bias in immunity. Nat Rev Immunol. 2006;6(12):883–894. doi: 10.1038/nri1977. [DOI] [PubMed] [Google Scholar]

[b0020] 4.Reinhardt R.L., Liang H.E., Locksley R.M. Cytokine-secreting follicular T cells shape the antibody repertoire. Nat Immunol. 2009;10(4):385–393. doi: 10.1038/ni.1715. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0025] 5.Bagaev D.V. VDJdb in 2019: database extension, new analysis infrastructure and a T-cell receptor motif compendium. Nucleic Acids Res. 2020;48(D1):D1057–D1062. doi: 10.1093/nar/gkz874. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0030] 6.Miqueu P. Statistical analysis of CDR3 length distributions for the assessment of T and B cell repertoire biases. Mol Immunol. 2007;44(6):1057–1064. doi: 10.1016/j.molimm.2006.06.026. [DOI] [PubMed] [Google Scholar]

[b0035] 7.Calis J.J., Rosenberg B.R. Characterizing immune repertoires by high throughput sequencing: strategies and applications. Trends Immunol. 2014;35(12):581–590. doi: 10.1016/j.it.2014.09.004. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0040] 8.Hou X.L. Current status and recent advances of next generation sequencing techniques in immunological repertoire. Genes Immun. 2016;17(3):153–164. doi: 10.1038/gene.2016.9. [DOI] [PubMed] [Google Scholar]

[b0045] 9.Brochet X, Lefranc MP, Giudicelli V, IMGT/V-QUEST: the highly customized and integrated system for IG and TR standardized V-J and V-D-J sequence analysis. Nucleic Acids Res;2008:36(Web Server issue): p. W503–8. [DOI] [PMC free article] [PubMed]

[b0050] 10.Ralph DK, Matsen FAT, Per-sample immunoglobulin germline inference from B cell receptor deep sequencing data. PLoS Comput Biol;2019:15(7):e1007133. [DOI] [PMC free article] [PubMed]

[b0055] 11.Alamyar E. IMGT((R)) tools for the nucleotide analysis of immunoglobulin (IG) and T cell receptor (TR) V-(D)-J repertoires, polymorphisms, and IG mutations: IMGT/V-QUEST and IMGT/HighV-QUEST for NGS. Methods Mol Biol. 2012;882:569–604. doi: 10.1007/978-1-61779-842-9_32. [DOI] [PubMed] [Google Scholar]

[b0060] 12.Li S. IMGT/HighV QUEST paradigm for T cell receptor IMGT clonotype diversity and next generation repertoire immunoprofiling. Nat Commun. 2013;4:2333. doi: 10.1038/ncomms3333. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0065] 13.Ye J, et al., IgBLAST: an immunoglobulin variable domain sequence analysis tool. Nucleic Acids Res;2013: 41(Web Server issue): W34–40. [DOI] [PMC free article] [PubMed]

[b0070] 14.Altschul S.F. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25(17):3389–3402. doi: 10.1093/nar/25.17.3389. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0075] 15.Bolotin D.A. MiXCR: software for comprehensive adaptive immunity profiling. Nat Methods. 2015;12(5):380–381. doi: 10.1038/nmeth.3364. [DOI] [PubMed] [Google Scholar]

[b0080] 16.Bolotin D.A. Antigen receptor repertoire profiling from RNA-seq data. Nat Biotechnol. 2017;35(10):908–911. doi: 10.1038/nbt.3979. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0085] 17.Smakaj E. Benchmarking immunoinformatic tools for the analysis of antibody repertoire sequences. Bioinformatics. 2020;36(6):1731–1739. doi: 10.1093/bioinformatics/btz845. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0090] 18.Vander Heiden J.A. pRESTO: a toolkit for processing high-throughput sequencing raw reads of lymphocyte receptor repertoires. Bioinformatics. 2014;30(13):1930–1932. doi: 10.1093/bioinformatics/btu138. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0095] 19.Gupta N.T. Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data. Bioinformatics. 2015;31(20):3356–3358. doi: 10.1093/bioinformatics/btv359. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0100] 20.Li B. Ultrasensitive detection of TCR hypervariable-region sequences in solid-tissue RNA-seq data. Nat Genet. 2017;49(4):482–483. doi: 10.1038/ng.3820. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0105] 21.Shugay M. VDJtools: unifying post-analysis of T cell receptor repertoires. PLoS Comput Biol. 2015;11(11) doi: 10.1371/journal.pcbi.1004503. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0110] 22.Duez M. Vidjil: a web platform for analysis of high-throughput repertoire sequencing. PLoS ONE. 2016;11(11) doi: 10.1371/journal.pone.0166126. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0115] 23.Nazarov V.I. tcR: an R package for T cell receptor repertoire advanced data analysis. BMC Bioinf. 2015;16:175. doi: 10.1186/s12859-015-0613-1. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0120] 24.Laydon D.J. Quantification of HTLV-1 clonality and TCR diversity. PLoS Comput Biol. 2014;10(6) doi: 10.1371/journal.pcbi.1003646. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0125] 25.Bolen C.R. The Repertoire Dissimilarity Index as a method to compare lymphocyte receptor repertoires. BMC Bioinf. 2017;18(1):155. doi: 10.1186/s12859-017-1556-5. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0130] 26.Yokota R., Kaminaga Y., Kobayashi T.J. Quantification of inter-sample differences in T-cell receptor repertoires using sequence-based information. Front Immunol. 2017;8:1500. doi: 10.3389/fimmu.2017.01500. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0135] 27.Emerson R.O. Immunosequencing identifies signatures of cytomegalovirus exposure history and HLA-mediated effects on the T cell repertoire. Nat Genet. 2017;49(5):659–665. doi: 10.1038/ng.3822. [DOI] [PubMed] [Google Scholar]

[b0140] 28.DeWitt WS, 3rd, et al., Human T cell receptor occurrence patterns encode immune history, genetic background, and receptor specificity. Elife;2018:7. [DOI] [PMC free article] [PubMed]

[b0145] 29.Sethna Z. OLGA: fast computation of generation probabilities of B- and T-cell receptor amino acid sequences and motifs. Bioinformatics. 2019;35(17):2974–2981. doi: 10.1093/bioinformatics/btz035. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0150] 30.Sethna Z, et al., Population variability in the generation and thymic selection of T-cell repertoires. bioRxiv, 2020: p. 2020.01.08.899682. [DOI] [PMC free article] [PubMed]

[b0155] 31.Davidsen K. Deep generative models for T cell receptor protein sequences. Elife. 2019:8. doi: 10.7554/eLife.46935. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0160] 32.Pogorelyy M.V. Detecting T cell receptors involved in immune responses from single repertoire snapshots. PLoS Biol. 2019;17(6) doi: 10.1371/journal.pbio.3000314. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0165] 33.Murugan A. Statistical inference of the generation probability of T-cell receptors from sequence repertoires. Proc Natl Acad Sci U S A. 2012;109(40):16161–16166. doi: 10.1073/pnas.1212755109. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0170] 34.Marcou Q., Mora T., Walczak A.M. High-throughput immune repertoire analysis with IGoR. Nat Commun. 2018;9(1):561. doi: 10.1038/s41467-018-02832-w. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0175] 35.Singh M. High-throughput targeted long-read single cell sequencing reveals the clonal and transcriptional landscape of lymphocytes. Nat Commun. 2019;10(1):3120. doi: 10.1038/s41467-019-11049-4. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0180] 36.Afik S. Targeted reconstruction of T cell receptor sequence from single cell RNA-seq links CDR3 length to T cell differentiation state. Nucleic Acids Res. 2017;45(16) doi: 10.1093/nar/gkx615. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0185] 37.Stubbington M.J.T. T cell fate and clonality inference from single-cell transcriptomes. Nat Methods. 2016;13(4):329–332. doi: 10.1038/nmeth.3800. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0190] 38.Eltahla A.A. Linking the T cell receptor to the single cell transcriptome in antigen-specific human T cells. Immunol Cell Biol. 2016;94(6):604–611. doi: 10.1038/icb.2016.16. [DOI] [PubMed] [Google Scholar]

[b0195] 39.Canzar S. BASIC: BCR assembly from single cells. Bioinformatics. 2017;33(3):425–427. doi: 10.1093/bioinformatics/btw631. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0200] 40.Lindeman I. BraCeR: B-cell-receptor reconstruction and clonality inference from single-cell RNA-seq. Nat Methods. 2018;15(8):563–565. doi: 10.1038/s41592-018-0082-3. [DOI] [PubMed] [Google Scholar]

[b0205] 41.Rizzetto S. B-cell receptor reconstruction from single-cell RNA-seq with VDJPuzzle. Bioinformatics. 2018;34(16):2846–2847. doi: 10.1093/bioinformatics/bty203. [DOI] [PubMed] [Google Scholar]

[b0210] 42.Setliff I, et al., High-throughput mapping of B cell receptor sequences to antigen specificity. Cell;2019:179(7):1636–1646 e15. [DOI] [PMC free article] [PubMed]

[b0215] 43.Jurtz VI, et al., NetTCR: sequence-based prediction of TCR binding to peptide-MHC complexes using convolutional neural networks. bioRxiv;2018:433706.

[b0220] 44.Kula T, et al., T-scan: a genome-wide method for the systematic discovery of T cell epitopes. Cell;2019:178(4):1016–1028 e13. [DOI] [PMC free article] [PubMed]

[b0225] 45.Gee MH, et al., Antigen identification for orphan T cell receptors expressed on tumor-infiltrating lymphocytes. Cell;2018:172(3): p. 549–563 e16. [DOI] [PMC free article] [PubMed]

[b0230] 46.Kobayashi E. A new cloning and expression system yields and validates TCRs from blood lymphocytes of patients with cancer within 10 days. Nat Med. 2013;19(11):1542–1546. doi: 10.1038/nm.3358. [DOI] [PubMed] [Google Scholar]

[b0235] 47.Marks C., Deane C.M. Antibody H3 Structure Prediction. Comput Struct Biotechnol J. 2017;15:222–231. doi: 10.1016/j.csbj.2017.01.010. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0240] 48.Marcatili P., Rosi A., Tramontano A. PIGS: automatic prediction of antibody structures. Bioinformatics. 2008;24(17):1953–1954. doi: 10.1093/bioinformatics/btn341. [DOI] [PubMed] [Google Scholar]

[b0245] 49.Sircar A, Kim ET, Gray JJ, RosettaAntibody: antibody variable region homology modeling server. Nucleic Acids Res, 2009. 37(Web Server issue): p. W474-9. [DOI] [PMC free article] [PubMed]

[b0250] 50.Nishigami H., Kamiya N., Nakamura H. Revisiting antibody modeling assessment for CDR-H3 loop. Protein Eng Des Sel. 2016;29(11):477–484. doi: 10.1093/protein/gzw028. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0255] 51.Mandell D.J., Coutsias E.A., Kortemme T. Sub-angstrom accuracy in protein loop reconstruction by robotics-inspired conformational sampling. Nat Methods. 2009;6(8):551–552. doi: 10.1038/nmeth0809-551. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0260] 52.Almagro J.C. Second antibody modeling assessment (AMA-II) Proteins. 2014;82(8):1553–1562. doi: 10.1002/prot.24567. [DOI] [PubMed] [Google Scholar]

[b0265] 53.Shirai H. High-resolution modeling of antibody structures by a combination of bioinformatics, expert knowledge, and molecular simulations. Proteins. 2014;82(8):1624–1635. doi: 10.1002/prot.24591. [DOI] [PubMed] [Google Scholar]

[b0270] 54.Yamashita K. Kotai Antibody Builder: automated high-resolution structural modeling of antibodies. Bioinformatics. 2014;30(22):3279–3280. doi: 10.1093/bioinformatics/btu510. [DOI] [PubMed] [Google Scholar]

[b0275] 55.Schritt D. Repertoire Builder: High-throughput structural modeling of B and T cell receptors. Mol Syst Des Eng. 2019;4:761–768. [Google Scholar]

[b0280] 56.Leem J. ABodyBuilder: Automated antibody structure prediction with data-driven accuracy estimation. MAbs. 2016;8(7):1259–1268. doi: 10.1080/19420862.2016.1205773. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0285] 57.Gowthaman R., Pierce B.G. TCRmodel: high resolution modeling of T cell receptors from sequence. Nucleic Acids Res. 2018;46(W1):W396–W401. doi: 10.1093/nar/gky432. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0290] 58.Lepore R. PIGSPro: prediction of immunoGlobulin structures v2. Nucleic Acids Res. 2017;45(W1):W17–W23. doi: 10.1093/nar/gkx334. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0295] 59.Rozewicki J. MAFFT-DASH: integrated protein sequence and structural alignment. Nucleic Acids Res. 2019;47(W1):W5–W10. doi: 10.1093/nar/gkz342. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0300] 60.Katoh K., Frith M.C. Adding unaligned sequences into an existing alignment using MAFFT and LAST. Bioinformatics. 2012;28(23):3144–3146. doi: 10.1093/bioinformatics/bts578. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0305] 61.Edgar R.C. Search and clustering orders of magnitude faster than BLAST. Bioinformatics. 2010;26(19):2460–2461. doi: 10.1093/bioinformatics/btq461. [DOI] [PubMed] [Google Scholar]

[b0310] 62.Li W., Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006;22(13):1658–1659. doi: 10.1093/bioinformatics/btl158. [DOI] [PubMed] [Google Scholar]

[b0315] 63.Glanville J. Identifying specificity groups in the T cell receptor repertoire. Nature. 2017;547(7661):94–98. doi: 10.1038/nature22976. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0320] 64.Huang H. Analyzing the Mycobacterium tuberculosis immune response by T-cell receptor clustering with GLIPH2 and genome-wide antigen screening. Nat Biotechnol. 2020 doi: 10.1038/s41587-020-0505-4. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0325] 65.Dash P. Quantifiable predictive features define epitope-specific T cell receptor repertoires. Nature. 2017;547(7661):89–93. doi: 10.1038/nature22383. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0330] 66.DeWitt W.S. A diverse lipid antigen-specific TCR repertoire is clonally expanded during active tuberculosis. J Immunol. 2018;201(3):888–896. doi: 10.4049/jimmunol.1800186. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0335] 67.Scheid J.F. Sequence and structural convergence of broad and potent HIV antibodies that mimic CD4 binding. Science. 2011;333(6049):1633–1637. doi: 10.1126/science.1207227. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0340] 68.Joyce M.G. Vaccine-induced antibodies that Neutralize Group 1 and Group 2 influenza A viruses. Cell. 2016;166(3):609–623. doi: 10.1016/j.cell.2016.06.043. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0345] 69.Robbiani D.F. Convergent antibody responses to SARS-CoV-2 in convalescent individuals. Nature. 2020 doi: 10.1038/s41586-020-2456-9. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0350] 70.Kovaltsuk A. Structural diversity of B-cell receptor repertoires along the B-cell differentiation axis in humans and mice. PLoS Comput Biol. 2020;16(2) doi: 10.1371/journal.pcbi.1007636. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0355] 71.Xu Z, et al., Functional clustering of B cell receptors using sequence and structural features. Mol Syst Des Eng, 2019. in press.

[b0360] 72.Miho E. Large-scale network analysis reveals the sequence space architecture of antibody repertoires. Nat Commun. 2019;10(1):1321. doi: 10.1038/s41467-019-09278-8. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0365] 73.Joglekar A.V., Li G. T cell antigen discovery. Nat Methods. 2020 doi: 10.1038/s41592-020-0867-z. [DOI] [PubMed] [Google Scholar]

[b0370] 74.Fischer DS, et al., Predicting antigen-specificity of single T-cells based on TCR CDR3 regions. bioRxiv, 2019: p. 734053. [DOI] [PMC free article] [PubMed]

[b0375] 75.Jokinen E, et al., TCRGP: Determining epitope specificity of T cell receptors. bioRxiv, 2019: p. 542332.

[b0380] 76.Ogishi M., Yotsuyanagi H. Quantitative prediction of the landscape of T cell epitope immunogenicity in sequence space. Front Immunol. 2019;10:827. doi: 10.3389/fimmu.2019.00827. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0385] 77.Gielis S. Detection of enriched T cell epitope specificity in full T cell receptor sequence repertoires. Front Immunol. 2019;10:2820. doi: 10.3389/fimmu.2019.02820. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0390] 78.Sun Y. Specificity, privacy, and degeneracy in the CD4 T Cell receptor repertoire following immunization. Front Immunol. 2017;8:430. doi: 10.3389/fimmu.2017.00430. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0395] 79.Thomas N. Tracking global changes induced in the CD4 T-cell receptor repertoire by immunization with a complex antigen using short stretches of CDR3 protein sequence. Bioinformatics. 2014;30(22):3181–3188. doi: 10.1093/bioinformatics/btu523. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0400] 80.Lanzarotti E., Marcatili P., Nielsen M. T-cell receptor cognate target prediction based on paired alpha and beta chain sequence and structural CDR loop similarities. Front Immunol. 2019;10:2080. doi: 10.3389/fimmu.2019.02080. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0405] 81.Li S, et al., Structural modeling of lymphocyte receptors and their antigens. Meth Mol Biol, 2019. in press. [DOI] [PubMed]

[b0410] 82.Klausen M.S. LYRA, a webserver for lymphocyte receptor structural modeling. Nucleic Acids Res. 2015;43(W1):W349–W355. doi: 10.1093/nar/gkv535. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0415] 83.Jensen K.K. TCRpMHCmodels: structural modelling of TCR-pMHC class I complexes. Sci Rep. 2019;9(1):14530. doi: 10.1038/s41598-019-50932-4. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0420] 84.Kunik V, Ashkenazi S, Ofran Y, Paratome: an online tool for systematic identification of antigen-binding regions in antibodies based on sequence or structure. Nucleic Acids Res, 2012. 40(Web Server issue): p. W521-4. [DOI] [PMC free article] [PubMed]

[b0425] 85.Olimpieri P.P. Prediction of site-specific interactions in antibody-antigen complexes: the proABC method and server. Bioinformatics. 2013;29(18):2285–2291. doi: 10.1093/bioinformatics/btt369. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0430] 86.Liberis E. Parapred: antibody paratope prediction using convolutional and recurrent neural networks. Bioinformatics. 2018;34(17):2944–2950. doi: 10.1093/bioinformatics/bty305. [DOI] [PubMed] [Google Scholar]

[b0435] 87.Daberdaku S., Ferrari C. Antibody interface prediction with 3D Zernike descriptors and SVM. Bioinformatics. 2019;35(11):1870–1876. doi: 10.1093/bioinformatics/bty918. [DOI] [PubMed] [Google Scholar]

[b0440] 88.Deac A., VeliCkovic P., Sormanni P. Attentive cross-modal paratope prediction. J Comput Biol. 2019;26(6):536–545. doi: 10.1089/cmb.2018.0175. [DOI] [PubMed] [Google Scholar]

[b0445] 89.Krawczyk K. Antibody i-Patch prediction of the antibody binding site improves rigid local antibody-antigen docking. Protein Eng Des Sel. 2013;26(10):621–629. doi: 10.1093/protein/gzt043. [DOI] [PubMed] [Google Scholar]

[b0450] 90.Singh H., Ansari H.R., Raghava G.P. Improved method for linear B-cell epitope prediction using antigen's primary sequence. PLoS ONE. 2013;8(5) doi: 10.1371/journal.pone.0062216. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0455] 91.Dalkas G.A., Rooman M. SEPIa, a knowledge-driven algorithm for predicting conformational B-cell epitopes from the amino acid sequence. BMC Bioinf. 2017;18(1):95. doi: 10.1186/s12859-017-1528-9. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0460] 92.Jespersen M.C. BepiPred- 2.0: improving sequence-based B-cell epitope prediction using conformational epitopes. Nucleic Acids Res. 2017;45(W1):W24–W29. doi: 10.1093/nar/gkx346. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0465] 93.Zhao L. Novel overlapping subgraph clustering for the detection of antigen epitopes. Bioinformatics. 2018;34(12):2061–2068. doi: 10.1093/bioinformatics/bty051. [DOI] [PubMed] [Google Scholar]

[b0470] 94.Jespersen M.C. Antibody specific B-cell epitope predictions: leveraging information from antibody-antigen protein complexes. Front Immunol. 2019;10:298. doi: 10.3389/fimmu.2019.00298. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0475] 95.Sela-Culang I. PEASE: predicting B-cell epitopes utilizing antibody sequence. Bioinformatics. 2015;31(8):1313–1315. doi: 10.1093/bioinformatics/btu790. [DOI] [PubMed] [Google Scholar]

[b0480] 96.Krawczyk K. Improving B-cell epitope prediction and its application to global antibody-antigen docking. Bioinformatics. 2014;30(16):2288–2294. doi: 10.1093/bioinformatics/btu190. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0485] 97.Bourquard T. MAbTope: a method for improved epitope mapping. J Immunol. 2018;201(10):3096–3105. doi: 10.4049/jimmunol.1701722. [DOI] [PubMed] [Google Scholar]

[b0490] 98.Janin J. Protein-protein docking tested in blind predictions: the CAPRI experiment. Mol Biosyst. 2010;6(12):2351–2362. doi: 10.1039/c005060c. [DOI] [PubMed] [Google Scholar]

[b0495] 99.Kozakov D. The ClusPro web server for protein-protein docking. Nat Protoc. 2017;12(2):255–278. doi: 10.1038/nprot.2016.169. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0500] 100.Schneidman-Duhovny D., et al. PatchDock and SymmDock: servers for rigid and symmetric docking. Nucleic Acids Res. 2005;33(Web Server issue):W363–W367. [DOI] [PMC free article] [PubMed]

[b0505] 101.Ramirez-Aportela E., Lopez-Blanco J.R., Chacon P. FRODOCK 2.0: fast protein-protein docking server. Bioinformatics. 2016;32(15):2386–2388. [DOI] [PubMed]

[b0510] 102.Sircar A., Gray J.J. SnugDock: paratope structural optimization during antibody-antigen docking compensates for errors in antibody homology models. PLoS Comput Biol. 2010;6(1):e1000644. [DOI] [PMC free article] [PubMed]

[b0515] 103.van Zundert G.C.P. The HADDOCK2.2 web server: user-friendly integrative modeling of biomolecular complexes. J Mol Biol. 2016;428(4):720–725. doi: 10.1016/j.jmb.2015.09.014. [DOI] [PubMed] [Google Scholar]

[b0520] 104.Roel-Touris J., Bonvin A., Jimenez-Garcia B. LightDock goes information-driven. Bioinformatics. 2020;36(3):950–952. doi: 10.1093/bioinformatics/btz642. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0525] 105.Pierce B.G. ZDOCK server: interactive docking prediction of protein-protein complexes and symmetric multimers. Bioinformatics. 2014;30(12):1771–1773. doi: 10.1093/bioinformatics/btu097. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0530] 106.Vreven T. Updates to the integrated protein-protein interaction benchmarks: docking benchmark version 5 and affinity benchmark version 2. J Mol Biol. 2015;427(19):3031–3041. doi: 10.1016/j.jmb.2015.07.016. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0535] 107.Lensink M.F., Velankar S., Wodak S.J. Modeling protein-protein and protein-peptide complexes: CAPRI 6th edition. Proteins. 2017;85(3):359–377. [DOI] [PubMed]

[b0540] 108.Ambrosetti F., et al. Modeling antibody-antigen complexes by information-driven docking. Structure. 2020;28(1):119–129.e2. [DOI] [PubMed]

[b0545] 109.Anishchenko I., Kundrotas P.J., Vakser I.A. Modeling complexes of modeled proteins. Proteins. 2017;85(3):470–478. doi: 10.1002/prot.25183. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0550] 110.Norman R.A. Computational approaches to therapeutic antibody design: established methods and emerging trends. Brief Bioinform. 2019 doi: 10.1093/bib/bbz095. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0555] 111.Knapp B., Dunbar J., Deane C.M. Large scale characterization of the LC13 TCR and HLA-B8 structural landscape in reaction to 172 altered peptide ligands: a molecular dynamics simulation study. PLoS Comput Biol. 2014;10(8) doi: 10.1371/journal.pcbi.1003748. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0560] 112.Reboul C.F. Epitope flexibility and dynamic footprint revealed by molecular dynamics of a pMHC-TCR complex. PLoS Comput Biol. 2012;8(3) doi: 10.1371/journal.pcbi.1002404. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0565] 113.Dominguez J.L., Knapp B. How peptide/MHC presence affects the dynamics of the LC13 T-cell receptor. Sci Rep. 2019;9(1):2638. doi: 10.1038/s41598-019-38788-0. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0570] 114.Wu P., et al. Mechano-regulation of peptide-MHC Class I conformations determines TCR antigen recognition. Mol Cell. 2019;73(5):1015–1027.e7. [DOI] [PMC free article] [PubMed]

[b0575] 115.Fodor J. Previously hidden dynamics at the TCR-peptide-MHC interface revealed. J Immunol. 2018;200(12):4134–4145. doi: 10.4049/jimmunol.1800315. [DOI] [PubMed] [Google Scholar]

[b0580] 116.Friess M.D., Pluhackova K., Bockmann R.A. Structural model of the mIgM B-cell receptor transmembrane domain from self-association molecular dynamics simulations. Front Immunol. 2018;9:2947. doi: 10.3389/fimmu.2018.02947. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0585] 117.Tay M.Z. The trinity of COVID-19: immunity, inflammation and intervention. Nat Rev Immunol. 2020 doi: 10.1038/s41577-020-0311-8. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0590] 118.Cao Y. Potent neutralizing antibodies against SARS-CoV-2 identified by high-throughput single-cell sequencing of convalescent patients' B cells. Cell. 2020 doi: 10.1016/j.cell.2020.05.025. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0595] 119.Avram O. ASAP – A webserver for immunoglobulin-sequencing analysis pipeline. Front Immunol. 2018;9:1686. doi: 10.3389/fimmu.2018.01686. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0600] 120.IJspeert H., et al. Antigen Receptor Galaxy: a user-friendly, web-based tool for analysis and visualization of T and B cell receptor repertoire data. J Immunol. 2017;198(10):4156–4165. [DOI] [PMC free article] [PubMed]

[b0605] 121.Bischof J., Ibrahim S.M. bcRep: R package for comprehensive analysis of B cell receptor repertoire data. PLoS ONE. 2016;11(8) doi: 10.1371/journal.pone.0161569. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0610] 122.Olson B.J. sumrep: a summary statistic framework for immune receptor repertoire comparison and model validation. Front Immunol. 2019;10:2533. doi: 10.3389/fimmu.2019.02533. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0615] 123.Weitzner B.D. Modeling and docking of antibody structures with Rosetta. Nat Protoc. 2017;12(2):401–416. doi: 10.1038/nprot.2016.180. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0620] 124.Mashiach E., et al. FireDock: a web server for fast interaction refinement in molecular docking. Nucleic Acids Res. 2008;36(Web Server issue):W229–W232. [DOI] [PMC free article] [PubMed]

[b0625] 125.Torchala M. SwarmDock: a server for flexible protein-protein docking. Bioinformatics. 2013;29(6):807–809. doi: 10.1093/bioinformatics/btt038. [DOI] [PubMed] [Google Scholar]

[b0630] 126.Jiménez-García B., Pons C., Fernández-Recio J. pyDockWEB: a web server for rigid-body protein-protein docking using electrostatics and desolvation scoring. Bioinformatics. 2013;29(13):1698–1699. doi: 10.1093/bioinformatics/btt262. [DOI] [PubMed] [Google Scholar]

[b0635] 127.Yan Y. HDOCK: a web server for protein-protein and protein-DNA/RNA docking based on a hybrid strategy. Nucleic Acids Res. 2017;45(W1):W365–W373. doi: 10.1093/nar/gkx407. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0640] 128.Macindoe G., et al. HexServer: an FFT-based protein docking server powered by graphics processors. Nucleic Acids Res. 2010;38(Web Server issue):W445–W449. [DOI] [PMC free article] [PubMed]

[b0645] 129.de Vries S.J. A web interface for easy flexible protein-protein docking with ATTRACT. Biophys J. 2015;108(3):462–465. doi: 10.1016/j.bpj.2014.12.015. [DOI] [PMC free article] [PubMed] [Google Scholar]

[b0650] 130.Tovchigrechko A., Vakser I.A. GRAMM-X public web server for protein-protein docking. Nucleic Acids Res. 2006;34(Web Server issue):W310–W314. [DOI] [PMC free article] [PubMed]

[bib651] 131.Ambrosetti F. proABC-2: PRediction Of AntiBody Contacts v2 and its application to information-driven docking. bioRxiv. 2020 doi: 10.1101/2020.03.18.967828. [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib652] 132.Pittala S., Bailey-Kellogg C. Learning context-aware structural representations to predict antigen and antibody binding interfaces. Bioinformatics. 2020;36(13):3996–4003. doi: 10.1093/bioinformatics/btaa263. [DOI] [PMC free article] [PubMed] [Google Scholar]
