Abstract
Antibody sequence repertoire analysis of plasma cells (PC) isolated before and 1 week after a vaccine provides time-specific snapshots of the antibody response. Comparison of the immunoglobulin (Ig) sequences pre- and post-vaccination allows analysis of maturation over time and identification of antigen-specific Ig. Here we compare the Ig heavy chain (Ig-H) repertoire of circulating PCs isolated from 10⁹ peripheral blood mononuclear cells (PBMC) collected by apheresis one week after a tetanus toxoid vaccine booster with the Ig-H repertoire of PCs collected two and 11 weeks prior to the booster. A total of 21,060 unique Ig nucleotide sequences encoding 14,307 unique heavy chain complementarity determining region 3 (CDR-H3) amino acid sequences, also called clonotypes, were identified. Only 466 clonotypes (3.3%) were present at all 3 time points. In contrast, 90% of the 30 highest frequency CDR-H3 regions at +1w were also identified at another time point and 50% were present at all time points, suggesting the rapid expansion of a memory B cell population. The tetanus toxoid specificity of the CDR-H3 region with the 7th highest frequency at +1w was confirmed using immunoprecipitation and mass spectroscopy, and two public tetanus toxoid-specific CDR-H3 regions were also overrepresented at +1w. In summary, we have used the tetanus vaccine model system to demonstrate that bulk PC Ig repertoire analysis can identify PC populations that expand and mature following antigen exposure. The application of this approach before and after clinical infections should advance our understanding of clinical protection and facilitate vaccine design.
Keywords: Tetanus toxoid, vaccine, antibody repertoire, antibody maturation, Ig sequence analysis
Effective vaccines are critical to current global health strategies and played a key role in the eradication of smallpox and ongoing efforts against polio (Bandyopadhyay and Macklin 2020; Bao et al. 2020; Weiss and Esparza 2015). However, pathogens expressing polymorphic surface antigens with complex and/or dynamic structures, such as HIV and parasites, have been difficult vaccine targets (Heaton 2020). Recent advances in developing human monoclonal antibodies (Kisalu et al. 2018; Tan et al. 2018; Walker and Burton 2018) have allowed the identification of inhibitory mAb and the corresponding target epitopes, but these are still relatively low throughput approaches. It also remains a challenge to develop vaccines that can induce a robust response against these protective epitopes, and this has led to a careful analysis of B cell activation and Ig maturation (Graham et al. 2019; Kwong and Mascola 2018).
The core antigen binding domain of an Ig molecule is made up of three complementarity determining regions (CDR) on the Ig heavy chain and three on the Ig light chain (Murphy and Weaver 2017). On both Ig chains the third CDR is the most diverse and most responsible for conferring antigen specificity. The sequence for CDR3 is defined during B cell development by genomic recombination that quasi-randomly links one V, one D and one J exon together in the Ig heavy chain locus and one V and one J exon in the Ig light chain locus. During this process additional nucleotides are also randomly added to or removed from the inter-exon junctions, resulting in the genome of each naïve B cell encoding an Ig molecule with a custom CDR3 for both the heavy (CDR-H3) and light (CDR-L3) chains. These unique recombined CDR3 regions define the clonotype of an Ig molecule and can be tracked during subsequent B cell activation, proliferation and differentiation into Ig secreting plasma cells or memory B cells to identify the parental naïve B cell. Memory B cells do not secrete Ig, but when they bind antigen they respond rapidly by proliferating and differentiating into plasma cells as well as a new generation of memory B cells. As B cells proliferate following antigen stimulation, point mutations are acquired in the Ig locus, which can improve antigen binding and, following additional rounds of antigen selection, lead to enhanced antigen affinity. Of particular interest is the evaluation of the Ig sequence through the acquisition of protective immunity to understand how to specifically activate B cells with Ig CDR3s that can be molded by further mutations and antigen selection to have high affinity against protective epitopes.
To develop approaches to monitor Ig sequence dynamics we evaluated the plasma cell Ig repertoire before and after a tetanus booster vaccination. Vaccination with purified tetanus toxoid, the formaldehyde-inactivated toxin produced from Clostridium tetani, has been used as a human vaccine since 1924 and has successfully reduced the incidence of tetanus by >96% worldwide (WHO 2018). However, Ig titers decline over time and a booster vaccination is recommended every ten years to maintain immunity. We sequenced cDNA corresponding to Ig RNA isolated from ~10⁶ plasma cells (CD38+) enriched from ~10⁹ peripheral blood mononuclear cells obtained by apheresis 11 and two weeks before and one week after a tetanus booster vaccination to compare changes in the Ig repertoires. We found that in contrast to only 3.5% of the total clonotypes being shared between time points, 90% of the 30 CDR-H3 regions most frequently detected after vaccination were shared with a prior time point. Further comparison of the set of unique nucleotide sequences encoding the overrepresented CDR-H3 allowed the identification of Igs expanded by re-exposure and those continuing to mature by the acquisition of new mutations.
Methods
Ethics Statement
The normal volunteer was informed of the objectives, methods, and potential hazards of the apheresis prior to enrolling and was encouraged to ask questions about any aspect of the study that was unclear to them. They were informed that they could withdraw at any time without penalty. Tetanus inoculation was performed as part of routine patient care. The samples were de-identified, coded and then released to the investigator with the volunteer’s recent vaccination history. The investigator did not have access to the volunteer’s other records. The protocol was approved by the Institutional Review Board of NIAID, NIH.
Plasma cell isolation.
PBMCs were collected from a normal volunteer by a 1 pass apheresis then further enriched by centrifugation over a Ficoll-Paque PLUS (GE Healthcare Life Sciences, Marlborough, MA) cushion. PBMCs were collected from the Ficoll-Paque PLUS interface, washed in phosphate buffered saline (PBS) and then resuspended in PBS supplemented with 2 mM ethylenediaminetetraacetic acid (EDTA) and 0.5% bovine serum albumin (BSA). The Miltenyi plasma cell isolation kit II (Bergisch Gladbach, Germany) was used to isolate plasma cells with anti-CD38 antibodies following the depletion of non-plasma cells with a cocktail of antibodies against CD2, CD3, CD10, CD13, CD14, CD15, CD22, CD34, CD56, CD123, and CD235a. The purified plasma cells were resuspended in RLT buffer (Qiagen, Germantown, MD) containing 143 mM β-mercaptoethanol and stored frozen for later RNA isolation.
Sequencing.
RNA isolated using the AllPrep DNA/RNA prep kit (Qiagen) was converted to cDNA using the QuantiTect Reverse Transcription kit (Qiagen) and amplified using high fidelity AccuPrime™ Taq DNA Polymerase with buffer II (Thermo Fisher Scientific, Waltham, MA). Six different forward primers were used with a single reverse primer to amplify the 6 major IgHV allelic families from framework region 2 into the J region as described by Boyd et al. (2009) (Table 2). The polymerase chain reaction was started with a 10 min 95°C activation step followed by 30 cycles of 30 s at 95°C, 45 s at 58°C and 45 s at 72°C and a final single incubation at 72°C for 10 min. The amplicons from each time point were pooled and purified using a MinElute PCR purification kit (Qiagen). The Quant-iT™ PicoGreen™ dsDNA Assay Kit (Thermo Fisher Scientific) was used for quantification before labeling with distinct barcoded 454 sequencing primers using the GS Rapid library Prep kit and sequencing using 454 GS-FLX Titanium pyrosequencing (Roche, Indianapolis, IN) according to the manufacturer’s instructions. All the dsDNA available at each time point was included for sequencing and then normalized using the total number of full length reads obtained.
Table 2:
Oligonucleotide Primers
| Name | Sequence 5’ to 3’ |
|---|---|
| VH1.S | CTG GGT GCG ACA GGC CCC TGG ACA A |
| VH2.S | TGG ATC CGT CAG CCC CCA GGG AAG G |
| VH3.S | GGT CCG CCA GGC TCC AGG GAA |
| VH4.S | TGG ATC CGC CAG CCC CCA GGG AAG G |
| VH5.S | GGG TGC GCC AGA TGC CCG GGA AAG G |
| VH6.S | TGG ATC AGG CAG TCC CCA TCG AGA G |
| IgVJ.A | CTT ACC TGA GGA GAC GGT GAC C |
Sequence analysis.
The raw sequences were sorted by the time point barcode and then the presence of both Ig primers was used to confirm full length sequence. The frequency of each distinct sequence was determined and then all unique sequences identified 5 times or more were analyzed with IMGT/HighV-Quest (Alamyar et al. 2010; Brochet et al. 2008) as well as SoDA2 (Munshaw and Kepler 2010) and iHMMune-align (Gaeta et al. 2007) to identify the corresponding human alleles for the V, J and, if present, D exons and the presence of additional junctional nucleotides. Problematic sequences with different CDR1 or CDR2 lengths that did not match to the closest identified germline V gene allele were removed from further analysis. These problematic sequences constituted 1.8% of the −11w, 1.5% of the −2w and 5% of the +1w sequences. All were low frequency sequences (<20 copies) in the −11w and −2w samples; however, some of the +1w sequences had >100 copies. The nucleotide sequences obtained with each of the primer sets (v1 - v6) were compared using the MAFFT aligner (Katoh and Standley 2013; Yamada et al. 2016) and a Jukes-Cantor Neighbor-Joining tree without bootstrap resampling and plotted using FigTree (http://tree.bio.ed.ac.uk/software/figtree/). To identify similar CDR-H3 regions, the sequences were aligned from the conserved C terminus and divided into sets of 5 aa. The number of times each 5 aa sequence was found in the Ig sequences was calculated. Sequences with conserved 5 aa sequences were clustered and the rest of the sequence evaluated for similarities and differences. This method allowed the grouping of similar CDR-H3 regions and allowed alignment of the rest of the sequence.
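The 5 aa grouping step can be illustrated with a minimal Python sketch. It assumes that the "sets of 5 aa" are consecutive, non-overlapping blocks counted from the C terminus and that any shorter N-terminal remainder is ignored; the function names are hypothetical and this is not the script used in the study. The example sequences are CDR-H3 regions from Table 1, written without internal spaces.

```python
from collections import Counter, defaultdict


def c_terminal_blocks(cdr3, size=5):
    """Split a CDR-H3 into consecutive blocks of `size` aa, counted from the C terminus.

    Returns (block_index, block) pairs; index 0 is the last `size` residues.
    An N-terminal remainder shorter than `size` is dropped (an assumption).
    """
    blocks = []
    end = len(cdr3)
    index = 0
    while end - size >= 0:
        blocks.append((index, cdr3[end - size:end]))
        end -= size
        index += 1
    return blocks


def count_blocks(cdr3_seqs, size=5):
    """Count each positioned block and record which sequences contain it."""
    counts = Counter()
    members = defaultdict(set)
    for seq in cdr3_seqs:
        for key in c_terminal_blocks(seq, size):
            counts[key] += 1
            members[key].add(seq)
    return counts, members


seqs = [
    "CARDTIMVIGYGKLDHW",
    "CARETSQVIGYGKLDHW",
    "CARETSMVIGYGKLDHW",
    "CARQADNWFDPW",
]
counts, members = count_blocks(seqs)

# Blocks found in more than one sequence define candidate clusters.
shared = {key: sorted(group) for key, group in members.items() if len(group) > 1}
for key, group in sorted(shared.items()):
    print(key, counts[key], group)
```

In this toy input the three related sequences share the two C-terminal blocks and differ only in the N-terminal block, which is the part that would then be inspected for similarities and differences.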
Tetanus toxoid pull down.
Protein A Sepharose (Thermo Fisher Scientific) was used to purify immunoglobulin molecules from serum obtained 1 week after a tetanus booster. Briefly, serum (13.8 ml) was brought to a final concentration of 1× PBS and 0.5 M NaCl, centrifuged at 1,300 × g for 5 min at 4°C and then the supernatant was divided and applied to two 5 ml Protein A Sepharose columns pre-equilibrated in binding buffer [0.5 M NaCl in 100 mM citrate phosphate buffer (pH 9.0)]. The columns were washed with 75 ml of binding buffer and eluted in 2 ml aliquots with 0.1 M glycine pH 3.0. The elution was neutralized using one tenth volume of 1 M Tris (pH 8) and the protein concentration determined using the Bio-Rad protein assay (Bio-Rad, Hercules, CA). Protein-containing elutions were pooled and a 7 ml aliquot was brought to 0.1% Tween-20 then added to an Immobilon-P PVDF membrane (MilliporeSigma, Burlington, MA) (5 cm²) that had been pretreated with tetanus toxoid (1.1 mg/ml) in PBS for an hour at room temperature and then washed with Tris-buffered saline (pH 8) supplemented with 0.1% Tween 20 and 0.5 M NaCl (TBS+). After incubation for an hour at room temperature, the Ig was removed and the membrane was washed with TBS+ three times for 2–5 min each. To harvest the Ig that remained bound to the tetanus toxoid-coated membrane, the membrane was sliced into small pieces with a clean scalpel, placed in a microcentrifuge tube and incubated in SDS-PAGE sample buffer with 5% β-mercaptoethanol (βME) at 95°C for 5 min, then centrifuged. The supernatant was collected and size fractionated on an SDS-PAGE gel. After staining with colloidal blue coomassie, the heavy chain band was excised, washed with HPLC grade H2O and acetonitrile and sent for mass spectrometry. As a control, the original purified Ig was size fractionated using the same SDS-PAGE gel system and the heavy chain was excised and washed as above. The excised gel pieces were reduced, alkylated, and trypsin digested by standard mass spectrometry protocols.
The supernatant and two washes (5% formic acid in 50% acetonitrile) of the gel digests were pooled and concentrated by speed vac (Labconco, Kansas City, MO) to dryness directly in 200 μl polypropylene auto-sampler vials (Sun Sri, Rockwood, TN). The recovered peptides were re-suspended in 10 μl of Solvent A (0.1% formic acid, 2% acetonitrile, and 97.9% water).
Digested peptides were subjected to LC-MS analysis using an LTQ-Orbitrap Velos mass spectrometer (ThermoFisher Scientific, West Palm Beach, FL) connected to an EASY nLC 2 liquid chromatography system and a temperature-controlled Ion Max Nanospray source (ThermoFisher Scientific). Nano-LC was carried out on a 3-μm Magic AQ C18 column from Precision Capillary Columns (15 cm, ID 75 μm) operating at a 500 nL/min flow rate with Solvent A and Solvent B (0.1% formic acid, 2% water, and 97.9% acetonitrile). After 5 μL of the sample was injected and the column was washed, the MS/MS data were acquired with a linear gradient from 3% to 45% B over 60 min. Standard data-dependent acquisition was performed as follows. A full MS spectrum was obtained by the Orbitrap for m/z 400–1800 at a resolution of 60,000, with a lock mass of m/z 445.120025. The precursor ion, if multiply charged, was selected and labeled by MIPS, isolated with a 2 m/z window, fragmented by CID, then scanned by the ion trap. MS/MS scans were repeated for the 10 most intense ions per full MS scan, and dynamic exclusion was enabled for 30 seconds.
Acquisitions were searched against the CDR region TT repertoire database (8/2013) using PEAKS v6 (Bioinformatics Solutions Inc, Ontario, Canada) with a semi-tryptic search strategy, tolerances of 20 ppm for MS and 0.8 Da for MS/MS, carbamidomethylation of cysteine as a fixed modification and oxidation of methionine as a dynamic modification. Peptides were filtered with a 1% FDR using a decoy database approach and a requirement of 2 spectral matches per peptide. Further refinement was performed to reflect a 5 ppm parent mass tolerance.
BLOSUM analysis.
For each Ig CDR-H3 amino acid analyzed (CARDTIMVIGYGKLDHW, CARDYFGSGSVYYFDYW, CARQADNWFDPW), CDR-H3 amino acid sequences for each time point were filtered to only include sequences with matching V and J segments. The filtered CDR-H3 amino acid sequences were then compared to each target Ig CDR-H3 sequence to determine the similarity of the sequences based on complementarity amongst the amino acids using a BLOSUM62 (BLOcks SUbstitution Matrix) table (Altschul 1991; Henikoff and Henikoff 1992; Karlin and Altschul 1990). Briefly, amino acid substitutions were converted to a complementarity score based on how such observed substitutions affected protein structure and function (Henikoff and Henikoff 1992). Ig CDR-H3 amino acid sequences were converted to a sum of the complementarity scores against the target sequences at each amino acid position, displayed as a sum score (Score) or percent similarity (%) in Table S1. Sequences were filtered for the highest complementarity scores to identify Ig CDR-H3 sequences that were most likely to share antigen reactivity and be expanded post-vaccination.
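The position-by-position scoring can be sketched in Python as follows. This is a minimal illustration, not the analysis code used here: it assumes that only sequences of equal length are compared and that the percent similarity is the sum score divided by the score of the target against itself. To keep the block self-contained, a toy match/mismatch table stands in for BLOSUM62; with that stand-in the percentage is simply percent identity. A full BLOSUM62 table keyed by residue pairs would be passed in the same way.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Toy stand-in for a substitution table (assumption): +1 for a match, 0 otherwise.
# A real BLOSUM62 table keyed by (residue, residue) would replace this.
IDENTITY_MATRIX = {(a, b): int(a == b) for a in AMINO_ACIDS for b in AMINO_ACIDS}


def sum_score(target, candidate, matrix):
    """Sum the substitution scores at each position; None if lengths differ."""
    if len(target) != len(candidate):
        return None
    return sum(matrix[(a, b)] for a, b in zip(target, candidate))


def percent_similarity(target, candidate, matrix):
    """Sum score as a percentage of the target's self-score (assumed definition)."""
    score = sum_score(target, candidate, matrix)
    if score is None:
        return None
    return 100.0 * score / sum_score(target, target, matrix)


target = "CARQADNWFDPW"
for candidate in ("CARQTDNWFDPW", "CATQADNWFDPW", "CARDTIMVIGYGKLDHW"):
    score = sum_score(target, candidate, IDENTITY_MATRIX)
    pct = percent_similarity(target, candidate, IDENTITY_MATRIX)
    print(candidate, score, pct)
```

Candidates would first be restricted to sequences with matching V and J segments, as described above, and then ranked by score.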
Results
Plasma cell Ig repertoire sequencing
The plasma cell Ig repertoire was determined from the same individual 11 (−11w) and two (−2w) weeks prior to and one (+1w) week after a routine tetanus toxoid booster (Fig. 1). At each time point, plasma cells were isolated from peripheral blood mononuclear cells (PBMCs) that had been collected by apheresis and used to make Ig heavy chain variable region libraries for 454 sequencing. A total of 21,060 unique nucleotide sequences were obtained from the 413,679 full length sequences represented by at least 5 identical reads (Fig. 2). The nucleotide sequences extended from a conserved region in Ig heavy chain framework region 2 to the 3’ end of the J exon and were mapped to the human Ig heavy locus using the International ImMunoGeneTics Information System (IMGT). A total of 5,399 unique nucleotide sequences were identified at −11w, 9,553 at −2w and 6,108 at +1w (Fig. 2). The differences in the number of sequences likely reflect random variation introduced through the sample preparation and sequencing processes. Unique Ig heavy chain clonotypes were identified using the heavy chain CDR3 (CDR-H3) amino acid sequence and 3,924 unique clonotypes were defined at −11w (73% of the total unique sequences), 6,785 at −2w (71%), and 3,598 at +1w (58%). The majority of the clonotypes were unique to a particular time point, 64% at −11w, 78% at −2w and 62% at +1w, with only 466 clonotypes (3.3%) identified at all time points, demonstrating the breadth and diversity of the plasma cell repertoire at each time point.
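The sharing of clonotypes between time points amounts to set operations on the CDR-H3 amino acid sequences. The sketch below is illustrative only: the function name is hypothetical, and the toy repertoire holds four Table 1 sequences placed at the time points where they were detected, not the full data set. The last line reproduces the reported 3.3% from the counts given above, taking the 14,307 clonotypes as the sum of the per-time-point counts.

```python
def overlap_summary(clonotypes_by_timepoint):
    """Summarize clonotype sharing across time points.

    `clonotypes_by_timepoint` maps a time point label to a set of CDR-H3
    amino acid sequences. Returns the set found at every time point and,
    per time point, (count, percent) of clonotypes found nowhere else.
    """
    shared_all = set.intersection(*clonotypes_by_timepoint.values())
    private = {}
    for label, own in clonotypes_by_timepoint.items():
        others = set()
        for other_label, other in clonotypes_by_timepoint.items():
            if other_label != label:
                others |= other
        only_here = own - others
        private[label] = (len(only_here), 100.0 * len(only_here) / len(own))
    return shared_all, private


# Toy repertoire (four Table 1 clonotypes only).
repertoire = {
    "-11w": {"CARFSTHLYSGSKFDSW", "CARRTSYLYALDVW"},
    "-2w": {"CARFSTHLYSGSKFDSW", "CARGEAGRRGLDVW"},
    "+1w": {"CARFSTHLYSGSKFDSW", "CARRTSYLYALDVW", "CARGEAGRRGLDVW", "CTRGVRATGIDYW"},
}
shared_all, private = overlap_summary(repertoire)

# Reported figures: 466 clonotypes at all time points out of 3,924 + 6,785 + 3,598.
pct_shared_reported = 100.0 * 466 / (3924 + 6785 + 3598)
print(sorted(shared_all), private, round(pct_shared_reported, 1))
```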
Figure 1.
Timeline of sample collection and analysis. PBMCs were harvested by apheresis at eleven (−11w) and 2 weeks (−2w) prior to and 1 week (+1w) after a tetanus booster vaccination. Immediately after apheresis plasma cells were enriched from the PBMC, resuspended in RNA stabilization buffer and stored at −80°C until RNA isolation and 454 sequencing.
Figure 2.
Summary of the IgG sequence analysis: The number of total nucleotide sequences (seq) with ≥5 seq reads, the number of unique nucleotide seq, and the number of unique clonotypes detected at each time point, 11 (−11w) or two (−2w) weeks before or one (+1w) week after the vaccination, as well as the average seq length are indicated in the table. The Venn diagram depicts the number of unique clonotypes that are and are not shared between the indicated time points.
Ig sequences expanded 1 week post tetanus boost
To further evaluate the dynamics of the population before and after the tetanus booster, we first determined the phylogenetic relationship between the unique nucleotide sequences obtained at different time points. This analysis revealed several expanded branches of closely related sequences in the +1w data when compared to −11w or −2w (Fig. 3). To integrate the nucleotide sequence information into the clonotype data, we summed the number of sequence reads for each of the unique full length nucleotide sequences that coded for the same CDR-H3 region and divided by the total number of sequences obtained for that time point to determine the clonotype frequency. This calculation allowed compensation for the different number of total reads obtained at each time point. Clonotype frequency varied dramatically within each time point, ranging from a single CDR-H3 amino acid sequence representing 9.5% of all the sequence reads at time point −11w to a minimum of <0.004%, reflecting the cutoff value used of 1 sequence represented by 5 independent reads (Fig. 4A). Looking across all time points, only a few CDR-H3 sequences had frequencies above 1%: three sequences at −11w, five at −2w and eleven at +1w. The increased breadth of the high frequency sequences at +1w could reflect the response to the tetanus toxoid booster. However, the average frequency for all three time points was similar, 0.02% (−2w) and 0.03% (+1w and −11w).
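The clonotype frequency calculation can be written as a short Python sketch. The function name and the record layout are hypothetical, the read counts are invented for illustration, and the input is assumed to already be restricted to unique nucleotide sequences with at least 5 reads.

```python
from collections import defaultdict


def clonotype_frequencies(records):
    """Return {CDR-H3: percent of total reads} for one time point.

    `records` is an iterable of (nucleotide_seq, cdr3_aa, read_count) tuples,
    one per unique nucleotide sequence.
    """
    records = list(records)
    total_reads = sum(count for _, _, count in records)
    reads_per_clonotype = defaultdict(int)
    for _, cdr3, count in records:
        # Different nucleotide sequences encoding the same CDR-H3 are pooled.
        reads_per_clonotype[cdr3] += count
    return {
        cdr3: 100.0 * reads / total_reads
        for cdr3, reads in reads_per_clonotype.items()
    }


# Invented read counts; "nt1".."nt3" stand in for full nucleotide sequences.
example = [
    ("nt1", "CARQADNWFDPW", 60),
    ("nt2", "CARQADNWFDPW", 15),
    ("nt3", "CARGEAGRRGLDVW", 25),
]
frequencies = clonotype_frequencies(example)
print(frequencies)
```

Dividing by the total for each time point is what makes frequencies comparable across samples with different sequencing depth.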
Figure 3.
Nucleotide sequence expansion one week after a tetanus booster vaccination. The unique nucleotide sequences obtained using the six oligonucleotide primer sets (VH 1–6) at the indicated time points, two (−2w) weeks before and one (+1w) week after vaccination (−2w & +1w) or 11 weeks (−11w) and −2w before vaccination (−11w & −2w) were aligned using the MAFFT aligner and Jukes-Cantor Neighbor-Joining and plotted using FigTree software. The sequences from the different time points were color coded to allow comparison (−11w red, −2w blue, +1w green). The locations of the clonotypes overrepresented at +1w are indicated by their +1w frequency ranking 1–10, 27, 122. Clonotype 7 encodes CARDTIMVIGYGKLDHW, which binds to tetanus toxoid, and clonotypes 9 and 14 encode related aa sequences. Clonotype 27 encodes a public tetanus toxoid-associated CDR-H3 and clonotype 122 is closely related to clonotype 27.
Figure 4.
Clonotype frequency distribution. The frequency of each clonotype was determined for each time point by totaling all the sequence reads obtained for all the unique nucleotide sequences that code for the same CDR-H3 region and dividing by the total number of sequence reads obtained at that time point. A) Clonotypes detected at each time point, 11 (−11w, red) or two (−2w, blue) weeks before or one (+1w, green) week after the vaccination, were ranked according to their frequency (1–24) at that time point and the frequency plotted. B) The clonotypes with the 30 highest frequencies at +1w were ranked 1 – 30 and the frequency of those 30 clonotypes at +1w, −2w and −11w are plotted. The tetanus toxoid-binding clonotype (7) is indicated by a solid red circle and the 2 related clonotypes (9, 14) are indicated by red circles. The public tetanus-associated clonotype (27) is indicated by a solid purple circle.
To further assess the post tetanus toxoid booster response we compared the V exon usage between the three time points and identified those that were expanded at +1w. The number of unique sequences containing Ig heavy chain (IGH) V exons 5–51, 2–26 and 2–5 were all significantly higher at +1w (p<0.00001), while the usage of other exons remained similar, suggesting selective expansion (Fig. 5). IGHV exon 5–51 was included in the most frequent CDR-H3 as well as the tenth, IGHV exon 2–26 was included in the sixth most frequent +1w CDR-H3 and IGHV exon 2–5 was included in the eighth and the eighteenth most frequent +1w CDR-H3s. These increases in the number of unique, yet closely related nucleotide sequences coding for specific clonotypes at +1w are detectable as expanded branches in the +1w and −2w (Fig. 3) or −11w sequence alignments.
Figure 5.
Increased Ig heavy chain variable exon usage 1 week post tetanus vaccination. The number of unique nucleotide sequences obtained that contain the indicated Ig heavy chain V exon at each time point, 11 (−11w) or two (−2w) weeks before or one (+1w) week after the vaccination, is plotted. A binomial probability test was used to compare the exon usage at +1w with the average of exon usage at −11w and −2w. The only V exons with significantly increased usage at +1w are indicated with an orange box. The p value for each of the 3 exons was <0.0001.
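One way such a binomial comparison could be set up is sketched below in Python. The details are assumptions, not a description of the test as performed: the expected proportion is taken as the mean of the −11w and −2w usage fractions, the test is one-sided, and the per-exon counts are invented (only the totals of 5,399, 9,553 and 6,108 unique sequences come from the text). Probabilities are computed in log space so that the calculation stays stable for several thousand sequences.

```python
import math


def binom_log_pmf(k, n, p):
    """Log of P(X = k) for X ~ Binomial(n, p), with 0 < p < 1."""
    return (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )


def binom_upper_tail(k, n, p):
    """One-sided p value P(X >= k) for X ~ Binomial(n, p)."""
    tail = sum(math.exp(binom_log_pmf(i, n, p)) for i in range(k, n + 1))
    return min(1.0, tail)


# Unique-sequence totals per time point (from the text).
total_m11w, total_m2w, total_p1w = 5399, 9553, 6108

# Invented counts of unique sequences using one V exon (illustration only).
exon_m11w, exon_m2w, exon_p1w = 100, 180, 300

# Assumed null: usage at +1w equals the mean of the two pre-booster fractions.
expected_fraction = (exon_m11w / total_m11w + exon_m2w / total_m2w) / 2
p_value = binom_upper_tail(exon_p1w, total_p1w, expected_fraction)
print(expected_fraction, p_value)
```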
Interestingly, in contrast to the predominance of unique clonotypes at each time point, all but 3 of the 30 most frequent CDR-H3 regions at +1w were shared between time points, with 50% found at all time points (Table 1, Fig. 4B). This result is consistent with the rapid expansion of an existing B cell population in response to the tetanus boost. Further support for this is that at least one of the unique nucleotide sequences encoding each of the top 7 expanded CDR-H3 regions was also found at another time point. The presence of these shared nucleotide sequences suggests that plasma cells from the same lineage were circulating 2 and/or 11 weeks before the booster vaccination and had not acquired additional point mutations, even following vaccination. Additionally, the unique nucleotide sequence with the highest number of reads at +1w for each of these 7 clonotypes also had the highest number of sequence reads for that clonotype at the pre-vaccination time points. This finding is consistent with the activation of B cell lineages that predominated during the prior exposure. In contrast, additional distinct nucleotide sequences with fewer sequencing reads, but coding for the same CDR-H3, were present only at +1w, not at −11w or −2w. These new sequences could reflect ongoing somatic mutation during activation or that the population was below the detection threshold at prior time points. It would be interesting to compare the antigen specificity and affinity of these populations.
Table 1:
Unique CDR-H3 regions overrepresented at +1w
| Rank | Expanded sequence (unique AA junction) | +1w: # Uniq seq a | +1w: Read Sum b | +1w: Sum Freq c (%) | −2w: # Uniq seq | −2w: Read Sum | −2w: Sum Freq (%) | −11w: # Uniq seq | −11w: Read Sum | −11w: Sum Freq (%) | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | CARFSTHLYSGSKFDSW | 79 | 8467 | 6.52 | 4 | 68 | 0.03 | 17 | 478 | 0.54 | ||
| 2 | CARRTSYLYALDVW | 102 | 3701 | 2.85 | 0 | 0 | 0.00 | 13 | 170 | 0.19 | ||
| 3 | CARRGRNVVVPDSYVWFDPW | 28 | 3310 | 2.55 | 1 | 8 | 0.00 | 3 | 107 | 0.12 | ||
| 4 | CARGEAGRRGLDVW | 22 | 2952 | 2.27 | 1 | 145 | 0.07 | 0 | 0 | 0.00 | ||
| 5 | CARNIWSNYPIFYYELDVW | 56 | 2795 | 2.15 | 1 | 31 | 0.02 | 11 | 331 | 0.37 | ||
| 6 | CAHHILNSVSARSRFDSW | 37 | 2407 | 1.85 | 1 | 5 | 0.00 | 6 | 183 | 0.21 | ||
| 7 | CARDTIMVIGYGKLDHW d | 95 | 2251 | 1.73 | 3 | 18 | 0.01 | 27 | 350 | 0.39 | ||
| 8 | CARIVLPATFAFDFW | 28 | 1864 | 1.44 | 1 | 8 | 0.00 | 3 | 104 | 0.12 | ||
| 9 | CAR E T SQ VIGYGKLDHW e | 76 | 1630 | 1.26 | 3 | 31 | 0.02 | 19 | 305 | 0.34 | ||
| 10 | CARRIGGSGNGLDVW | 29 | 1310 | 1.01 | 1 | 17 | 0.01 | 2 | 63 | 0.07 | ||
| 11 | CVRRGRLGYMYRYYGADSW | 30 | 1232 | 0.95 | 1 | 11 | 0.01 | 7 | 189 | 0.21 | ||
| 12 | CAKRNTRLGYSYALVYW | 39 | 1143 | 0.88 | 0 | 0 | 0.00 | 9 | 107 | 0.12 | ||
| 13 | CGRGYYSNLAYFYYGLDVW | 20 | 1041 | 0.80 | 2 | 19 | 0.01 | 5 | 216 | 0.24 | ||
| 14 | CAR E T S MVIGYGKLDHW e | 42 | 978 | 0.75 | 1 | 5 | 0.00 | 10 | 198 | 0.22 | ||
| 15 | CAKEHSGWFFDSW | 7 | 897 | 0.69 | 0 | 0 | 0.00 | 3 | 211 | 0.24 | ||
| 16 | CARRTGGSGNGLDVW | 26 | 865 | 0.67 | 0 | 0 | 0.00 | 5 | 34 | 0.04 | ||
| 17 | CAKADPYDTSGFYKDW | 15 | 821 | 0.63 | 0 | 0 | 0.00 | 4 | 61 | 0.07 | ||
| 18 | CAHARPLTAHW | 20 | 800 | 0.62 | 0 | 0 | 0.00 | 4 | 54 | 0.06 | ||
| 19 | CARRYDFWSGFHDYW | 27 | 731 | 0.56 | 0 | 0 | 0.00 | 4 | 21 | 0.02 | ||
| 20 | CTRGVRATGIDYW | 6 | 691 | 0.53 | 0 | 0 | 0.00 | 0 | 0 | 0.00 | ||
| 21 | CARETSYDPTGYYYHYTMDVW | 10 | 689 | 0.53 | 1 | 10 | 0.01 | 5 | 94 | 0.11 | ||
| 22 | CARERPRRGWDSAGRFDYW | 20 | 689 | 0.53 | 0 | 0 | 0.00 | 2 | 30 | 0.03 | ||
| 23 | CARDYYGSGSYSGEWTAVYYLEYW | 18 | 686 | 0.53 | 1 | 5 | 0.00 | 6 | 58 | 0.07 | ||
| 24 | CATSRLNGGLLSIIHFW | 14 | 625 | 0.48 | 0 | 0 | 0.00 | 5 | 34 | 0.04 | ||
| 25 | CARHLYDGDGYRNDFW | 6 | 587 | 0.45 | 1 | 7 | 0.00 | 1 | 12 | 0.01 | ||
| 26 | CARSVTRAFDPW | 4 | 576 | 0.44 | 0 | 0 | 0.00 | 0 | 0 | 0.00 | ||
| 27 | CARQADNWFDPW f | 41 | 575 | 0.44 | 0 | 0 | 0.00 | 0 | 0 | 0.00 | ||
| 28 | CARRRGSYYLLPYYFDSW | 9 | 560 | 0.43 | 0 | 0 | 0.00 | 3 | 29 | 0.03 | ||
| 29 | CATTRPDYLNGHFDHW | 13 | 552 | 0.43 | 1 | 7 | 0.00 | 2 | 58 | 0.07 | ||
| 30 | CTRPDYGHYYSYGMDVW | 10 | 546 | 0.42 | 0 | 0 | 0.00 | 2 | 13 | 0.01 | ||
a. # Uniq seq, number of unique nucleotide sequences encoding the CDR-H3.
b. Read Sum, sum of the number of sequencing reads obtained for all the unique nucleotide sequences encoding the CDR-H3.
c. Sum Freq, the Read Sum for each CDR-H3 divided by the total number of full length sequencing reads obtained at that time point.
d. The CDR-H3 containing the aa sequence of the peptide detected by immunoprecipitation using plasma isolated 1 week after a tetanus booster vaccination.
e. CDR-H3 that has ≥ 75% identity with CARDTIMVIGYGKLDHW.
f. Public tetanus toxoid-associated CDR-H3 sequence.
Tetanus Toxoid specific CDR-H3 Ig sequences.
To more directly evaluate whether the overrepresented +1w CDR-H3 regions could bind to tetanus toxoid, we screened for previously reported public tetanus toxoid-specific sequences. Two sets of CDR-H3 sequences that match reported public anti-tetanus toxoid sequences (Frolich et al. 2010; Poulsen et al. 2011; Truck et al. 2015) were found in the +1w expanded gene set. One, CARQADNWFDPW (Frolich et al. 2010; Truck et al. 2015), was in the top 30 most frequently overrepresented +1w clonotypes and only detected at +1w (frequency 0.44%) (Table 1, Fig. 4B, purple circle). BLOSUM analysis was used to identify two other closely related CDR-H3 sequences (> 75% identity) encoded by the same V and J exons, CARQTDNWFDPW and CATQADNWFDPW (Table S1). All three additional CDR-H3s were also only detected at +1w and had frequencies of 0.1, 0.03 and 0.01, respectively. The second public anti-tetanus toxoid set (Poulsen et al. 2011; Truck et al. 2015) included four closely related sequences. Again, all four sequences used the same V, D and J exons: CARDYFGSGSAYYFEYW, CARDYYGSGAVYYFEYW, CARDYFGSGAVYYFENW, and CARDYFGSGPVYYFENW had frequencies of 0.10, 0.03, 0.11, and 0.01, respectively, at +1w and ≤0.01 at −2w and −11w (Table S1).
Tetanus toxoid-specific antibodies were also identified using precipitation with tetanus toxoid followed by mass spectroscopic analysis. One peptide, DTIMVIGYGK, corresponding to a CDR-H3 region was detected. It was found in the CDR-H3 sequence (CARDTIMVIGYGKLDHW) with the 7th highest frequency at +1w (1.7%) and a frequency of just 0.009% at −2w. This frequency pattern is consistent with expansion following the tetanus booster (Fig 4B). Given the presence of multiple sequences closely related to the public anti-tetanus toxoid CDR-H3s, we used BLOSUM analysis to identify other CDR-H3 regions formed by recombination of the same V and J exons encoding DTIMVIGYGK. We found 13 related sequences of which 10 were expanded at +1w in comparison with −2w and −11w (Table S1). Two of the related CDR-H3 sequences had the 9th and 14th highest frequencies at +1w (Table 1, Fig 4B), again suggesting an increase in response to a tetanus booster vaccine. The +1w expansions in the number of unique sequences encoding DTIMVIGYGK as well as the other related sequences can clearly be seen in the circular graphs of the phylogenies comparing all the unique nucleotide sequences with v1 type IGHV exons (Fig. 3 & 6).
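Mapping a peptide detected by mass spectroscopy back to the sequenced clonotypes is a substring search. The Python sketch below is a simplified illustration with a hypothetical function name; it is not the PEAKS search described in the Methods. It also reports whether each end of the match is consistent with trypsin cleavage (after K or R), ignoring the proline exception and treating the ends of the CDR-H3 string as acceptable boundaries, which is an approximation because the CDR-H3 is only part of the heavy chain.

```python
def locate_peptide(peptide, cdr3):
    """Return (start, n_term_tryptic, c_term_tryptic) for the first match, or None.

    A terminus counts as tryptic if it follows K/R (N-terminal side) or ends
    in K/R (C-terminal side), or if it coincides with an end of `cdr3`.
    """
    start = cdr3.find(peptide)
    if start == -1:
        return None
    end = start + len(peptide)
    n_term_tryptic = start == 0 or cdr3[start - 1] in "KR"
    c_term_tryptic = end == len(cdr3) or peptide[-1] in "KR"
    return start, n_term_tryptic, c_term_tryptic


peptide = "DTIMVIGYGK"
clonotypes = [
    "CARFSTHLYSGSKFDSW",   # rank 1 in Table 1
    "CARDTIMVIGYGKLDHW",   # rank 7
    "CARETSQVIGYGKLDHW",   # rank 9, written without spaces
    "CARETSMVIGYGKLDHW",   # rank 14, written without spaces
]
hits = {}
for cdr3 in clonotypes:
    match = locate_peptide(peptide, cdr3)
    if match is not None:
        hits[cdr3] = match
print(hits)
```

Only the rank 7 clonotype contains the exact peptide; the rank 9 and rank 14 sequences differ within the peptide and are therefore picked up by similarity scoring rather than by exact matching.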
Figure 6.
Increase in the number of unique nucleotide sequences encoding CARDTIMVIGYGKLDHW and 2 related CDR-H3 1 week post tetanus vaccination. Nucleotide sequences encoding the peptide DTIMVIGYGK shown to bind tetanus toxoid are indicated in yellow on the alignment of the unique nucleotide sequences detected at 11 (−11w, red) and two (−2w, blue) weeks prior to the tetanus vaccination, −11w (red) and one (+1w, green) week post vaccination or −2w and +1w. The MAFFT aligner and Jukes-Cantor Neighbor-Joining were used to align the sequences and they were plotted using FigTree software. The locations of the sequences coding for the 2 related CDR-H3 sequences are indicated in the −2w vs +1w plot and the black box indicates the same region in the 3 different comparisons.
More focused analysis of the 95 unique nucleotide sequences encoding CARDTIMVIGYGKLDHW at +1w indicated that 25 of these sequences were also identified in the −11w sample and three at −2w. Again, shared nucleotide sequences are consistent with the same B cell population being present prior to the booster vaccination. Moreover, the 10 unique nucleotide sequences encoding CARDTIMVIGYGKLDHW with the highest number of sequence reads at +1w also ranked highest for read frequency among the −11w sequences encoding CARDTIMVIGYGKLDHW (Fig. 7). The three −2w sequences had too few reads (5–7 per sequence) to rank. In total, the 95 unique nucleotide sequences encoding CARDTIMVIGYGKLDHW detected at +1w encoded 61 distinct peptides (Table S2). Ten of the 61 peptide sequences were encoded by more than one nucleotide sequence at +1w, with three peptides predominating. Each of these 3 peptides was encoded by four or more unique nucleotide sequences and included the nucleotide sequences with the highest number of sequencing reads at both +1w and −11w. These data suggest that the predominant tetanus-specific plasma cell population circulating at −11w continued to predominate one week after the booster injection. However, all ten peptides encoded by multiple nucleotide sequences also contained at least 2 nucleotide sequences that were not found at prior time points, suggestive of ongoing somatic hypermutation. Unique nucleotide sequences for 3 peptides were also detected at −11w but not at +1w, consistent with prior rounds of mutation that might have failed to generate a memory response, possibly due to reduced tetanus binding affinity.
Figure 7.
The frequency rank for the top ten unique sequences encoding CARDTIMVIGYGKLDHW is the same at +1w and −11w. The number of sequence reads obtained for each of the top ten unique nucleotide sequences encoding CARDTIMVIGYGKLDHW at each time point, 11 (−11w, red) or two (−2w, blue) weeks before or one (+1w, green) week after the vaccination is plotted.
Discussion
Comparison of the Ig repertoire of plasma cells isolated before and 7 days after a tetanus toxoid booster allowed the identification of CDR-H3 regions that expanded after the booster vaccination. The CDR-H3 aa sequences identified included 1 public sequence previously associated with tetanus binding (Frolich et al. 2010; Truck et al. 2015), 1 sequence closely related to a public sequence (Poulsen et al. 2011; Truck et al. 2015), and a newly recognized sequence that was identified using tetanus toxoid to precipitate Ig for amino acid sequence determination using mass spectroscopy. Each of these tetanus toxoid-associated CDR-H3 aa sequences was encoded by a number of nucleotide sequences that differed outside of the CDR-H3 region. These data suggest the accumulation of point mutations after the initial activation of the parental naïve B cell, as discussed in detail below. The total number of sequencing reads that encoded these CDR-H3 regions was 4.4 to 881 times higher at +1w than before the booster vaccination, which is consistent with expansion of the plasma cell population after antigen exposure. This finding also suggests that other CDR-H3 regions represented by more sequence reads after than before the booster vaccination are likely to be specific for tetanus toxoid. Predicting antigen reactivity from bulk plasma cell Ig repertoire analysis before and after exposure provides an antigen-agnostic approach that is needed for the analysis of complex microbes, such as parasites, whose antigens are difficult to produce recombinantly for use in identifying antigen-specific B cells.
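The fold increase in reads for a CDR-H3 region can be sketched as follows. This is a minimal illustration that assumes raw read counts are summed over all unique nucleotide sequences encoding the same CDR-H3, with no normalization for sequencing depth; whether the study normalized the counts is not stated here. The counts and function names are hypothetical.

```python
from typing import Dict


def total_reads(reads_by_sequence: Dict[str, int]) -> int:
    """Sum the reads over every unique nucleotide sequence encoding one CDR-H3."""
    return sum(reads_by_sequence.values())


def fold_change(before: Dict[str, int], after: Dict[str, int]) -> float:
    """Ratio of total reads after vs before the booster for one CDR-H3."""
    before_total = total_reads(before)
    if before_total == 0:
        # A CDR-H3 absent before vaccination has no defined fold change.
        raise ValueError("no reads before vaccination; fold change is undefined")
    return total_reads(after) / before_total


# Hypothetical read counts for one CDR-H3 region (not data from the study).
before = {"seq1": 8, "seq2": 2}
after = {"seq1": 300, "seq2": 100, "seq3": 40}

ratio = fold_change(before, after)
print(f"Reads were {ratio:.1f} times higher after the booster")
```

A CDR-H3 that is undetected before vaccination has no defined ratio under this scheme, so such clonotypes would need separate handling, for example by reporting them as newly detected.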
The Ig heavy chain repertoire can also be used to track sequence maturation over time to identify changes that increase with increasing clinical protection (Zhou and Kleinstein 2019). Prior work has focused on analysis of the Ig repertoire of antigen-reactive B cells over time (Andrews et al. 2019; Wec et al. 2020) or on the small number of plasma cells obtained from a standard blood draw, which limits the depth of the analysis (Lavinder et al. 2014; Poulsen et al. 2011). The use of apheresis to obtain ~1–2×10^9 PBMC, allowing the isolation of ~1–2×10^6 plasma cells for Ig heavy chain variable region sequencing, enables this in-depth analysis. Comparison of the Ig heavy chain sequences and their relative abundance after exposure can also be used to inform the selection of Ig molecules to advance for recombinant Ig production and functional characterization from the limited number of monoclonal antibodies or single cells that can realistically be generated or sequenced, respectively. The need to focus Ig selection is especially important for antigen-agnostic discovery studies that cannot rely on cell sorting to isolate antigen-specific B cells or for which there is no reliable in vitro neutralization assay. In this case, selecting candidate Ig that are only enriched following clinical protection could facilitate the identification of protective Ig and their target epitopes.
The high percentage of the high frequency clonotypes at +1w that were shared with −11w and/or −2w is consistent with a booster vaccination activating memory B cells that had been generated during a prior exposure. The higher frequency of these nucleotide sequences at −11w versus −2w could suggest that the individual may have been naturally exposed to Clostridium tetani just prior to −11w and cleared the infection. Unfortunately, this cannot be confirmed retrospectively. The comparison of the frequency of the unique nucleotide sequences encoding the tetanus toxoid-binding CDR-H3, CARDTIMVIGYGKLDHW, demonstrates that the frequency order is similar at +1w and −11w, meaning that the most frequent nucleotide sequences at −11w were also the most frequent at +1w. It is likely that the similar order reflects the relative number of memory cells in the population and provides a tool to monitor changes in these populations over time. This detailed information about the frequency of memory cell populations with specific Ig sequences could be used in clinical studies to guide changes to vaccine formulations that enhance memory cell production and perhaps decrease the need for booster vaccinations.
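One way to put a number on how similar the frequency order is at two time points is a Spearman rank correlation. This statistic is offered only as an illustration and is not a method reported in this study. The sketch assumes the same sequences are measured at both time points and that no two sequences have equal read counts within a time point; the read counts are hypothetical.

```python
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[int]:
    """Rank values from highest (rank 1) to lowest; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation for paired, tie-free observations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally sized samples with at least 2 items")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical reads for the same four sequences at two time points.
reads_plus_1w = [900, 400, 150, 50]
reads_minus_11w = [120, 60, 30, 9]   # same order as +1w
reads_shuffled = [120, 30, 60, 9]    # second and third sequences swapped

rho_same = spearman_rho(reads_plus_1w, reads_minus_11w)
rho_swapped = spearman_rho(reads_plus_1w, reads_shuffled)
print(rho_same, rho_swapped)
```

A value of 1 means the rank order is identical at the two time points, and lower values indicate that the relative abundance of the sequences has shifted.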
The identification of the same ten distinct CARDTIMVIGYGKLDHW-containing peptides represented by the most total sequence reads at +1w and at −11w is again consistent with the primary humoral response being generated from preexisting memory cells. However, there were an additional 46 distinct CARDTIMVIGYGKLDHW-containing peptides identified only at +1w. Each of these peptides was represented by only 1–2 unique nucleotide sequences, most of which had only 5 reads, suggesting that the humoral response is continuing to evolve. Given that all the peptides have the same CDR-H3 region, it is likely that these unique sequences are the result of somatic hypermutation during proliferation of a preexisting memory B cell, not the stimulation of distinct naïve B cells. However, the latter cannot be ruled out. These 46 +1w-specific peptides have a range of distinct sequences, in marked contrast with the top 10 shared peptides, which are all minor variants of the top 3 peptides. These 10 shared peptides differ from one another by only 2–4 aa in the framework region. The representation of such a limited number of aa sequences by multiple unique nucleotide sequences is consistent with selection for high affinity. It would be interesting to re-evaluate the response after a subsequent exposure to see if there is an increase in the representation of one of the novel +1w sequences, which could suggest a further increase in affinity.
Using the tetanus vaccine model system we have demonstrated that bulk plasma cell Ig repertoire analysis can identify plasma cell populations that expand following antigen exposure and provide insight into the maturation of the humoral immune response. The application of similar techniques to track changes in the Ig repertoire before and after clinical infections should advance our understanding of the development of clinical protection and facilitate the design of effective vaccine strategies.
Supplementary Material
Acknowledgements:
The opinions and assertions expressed herein are those of the author(s) and do not necessarily reflect the official policy or position of the Uniformed Services University or the Department of Defense.
Funding:
This investigation received financial support from Public Health Service grants AI069314 and U0110852 from the National Institute of Allergy and Infectious Diseases (NIAID) and National Human Genome Research Institute NIH Intramural program.
Footnotes
Declarations
Availability of data and material: All data is available on request.
Conflicts of interest/competing interests: The authors declare no conflicts of interest or competing interests.
References
- Alamyar E, Giudicelli V, Duroux P, Lefranc M-P (2010) IMGT/HighV-Quest: A high-throughput system and web portal for the analysis of rearranged nucleotide sequences of antigen receptors - High-throughput version of IMGT/V-QUEST. Proceedings of the 11th Journees ouvertes en Biologie, Informatique et Mathematiques (JOBIM), P27–156
- Altschul SF (1991) Amino acid substitution matrices from an information theoretic perspective. J Mol Biol 219:555–65
- Andrews SF, Chambers MJ, Schramm CA, Plyler J, Raab JE, Kanekiyo M, Gillespie RA, Ransier A, Darko S, Hu J, Chen X, Yassine HM, Boyington JC, Crank MC, Chen GL, Coates E, Mascola JR, Douek DC, Graham BS, Ledgerwood JE, McDermott AB (2019) Activation Dynamics and Immunoglobulin Evolution of Pre-existing and Newly Generated Human Memory B cell Responses to Influenza Hemagglutinin. Immunity 51:398–410 e5
- Bandyopadhyay AS, Macklin GR (2020) Final frontiers of the polio eradication endgame. Curr Opin Infect Dis 33:404–410
- Bao J, Thorley B, Isaacs D, Dinsmore N, Elliott EJ, McIntyre P, Britton PN (2020) Polio - The old foe and new challenges: An update for clinicians. J Paediatr Child Health 56:1527–1532
- Boyd SD, Marshall EL, Merker JD, Maniar JM, Zhang LN, Sahaf B, Jones CD, Simen BB, Hanczaruk B, Nguyen KD, Nadeau KC, Egholm M, Miklos DB, Zehnder JL, Fire AZ (2009) Measurement and clinical monitoring of human lymphocyte clonality by massively parallel VDJ pyrosequencing. Sci Transl Med 1:12ra23
- Brochet X, Lefranc MP, Giudicelli V (2008) IMGT/V-QUEST: the highly customized and integrated system for IG and TR standardized V-J and V-D-J sequence analysis. Nucleic Acids Research 36:W503–8
- Frolich D, Giesecke C, Mei HE, Reiter K, Daridon C, Lipsky PE, Dorner T (2010) Secondary immunization generates clonally related antigen-specific plasma cells and memory B cells. J Immunol 185:3103–10
- Gaeta BA, Malming HR, Jackson KJ, Bain ME, Wilson P, Collins AM (2007) iHMMune-align: hidden Markov model-based alignment and identification of germline genes in rearranged immunoglobulin gene sequences. Bioinformatics 23:1580–7
- Graham BS, Gilman MSA, McLellan JS (2019) Structure-Based Vaccine Antigen Design. Annu Rev Med 70:91–104
- Heaton PM (2020) Challenges of Developing Novel Vaccines With Particular Global Health Importance. Front Immunol 11:517290
- Henikoff S, Henikoff JG (1992) Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A 89:10915–9
- Karlin S, Altschul SF (1990) Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. Proc Natl Acad Sci U S A 87:2264–8
- Katoh K, Standley DM (2013) MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol 30:772–80
- Kisalu NK, Idris AH, Weidle C, Flores-Garcia Y, Flynn BJ, Sack BK, Murphy S, Schon A, Freire E, Francica JR, Miller AB, Gregory J, March S, Liao HX, Haynes BF, Wiehe K, Trama AM, Saunders KO, Gladden MA, Monroe A, Bonsignori M, Kanekiyo M, Wheatley AK, McDermott AB, Farney SK, Chuang GY, Zhang B, Kc N, Chakravarty S, Kwong PD, Sinnis P, Bhatia SN, Kappe SHI, Sim BKL, Hoffman SL, Zavala F, Pancera M, Seder RA (2018) A human monoclonal antibody prevents malaria infection by targeting a new site of vulnerability on the parasite. Nat Med 24:408–416
- Kwong PD, Mascola JR (2018) HIV-1 Vaccines Based on Antibody Identification, B Cell Ontogeny, and Epitope Structure. Immunity 48:855–871
- Lavinder JJ, Wine Y, Giesecke C, Ippolito GC, Horton AP, Lungu OI, Hoi KH, DeKosky BJ, Murrin EM, Wirth MM, Ellington AD, Dorner T, Marcotte EM, Boutz DR, Georgiou G (2014) Identification and characterization of the constituent human serum antibodies elicited by vaccination. Proc Natl Acad Sci U S A 111:2259–64
- Munshaw S, Kepler TB (2010) SoDA2: a Hidden Markov Model approach for identification of immunoglobulin rearrangements. Bioinformatics 26:867–72
- Murphy K, Weaver C (2017) Immunobiology. Garland Science, New York, NY
- Poulsen TR, Jensen A, Haurum JS, Andersen PS (2011) Limits for antibody affinity maturation and repertoire diversification in hypervaccinated humans. J Immunol 187:4229–35
- Tan J, Sack BK, Oyen D, Zenklusen I, Piccoli L, Barbieri S, Foglierini M, Fregni CS, Marcandalli J, Jongo S, Abdulla S, Perez L, Corradin G, Varani L, Sallusto F, Sim BKL, Hoffman SL, Kappe SHI, Daubenberger C, Wilson IA, Lanzavecchia A (2018) A public antibody lineage that potently inhibits malaria infection through dual binding to the circumsporozoite protein. Nat Med 24:401–407
- Truck J, Ramasamy MN, Galson JD, Rance R, Parkhill J, Lunter G, Pollard AJ, Kelly DF (2015) Identification of antigen-specific B cell receptor sequences using public repertoire analysis. J Immunol 194:252–261
- Walker LM, Burton DR (2018) Passive immunotherapy of viral infections: ‘super-antibodies’ enter the fray. Nat Rev Immunol 18:297–308
- Wec AZ, Haslwanter D, Abdiche YN, Shehata L, Pedreno-Lopez N, Moyer CL, Bornholdt ZA, Lilov A, Nett JH, Jangra RK, Brown M, Watkins DI, Ahlm C, Forsell MN, Rey FA, Barba-Spaeth G, Chandran K, Walker LM (2020) Longitudinal dynamics of the human B cell response to the yellow fever 17D vaccine. Proc Natl Acad Sci U S A 117:6675–6685
- Weiss RA, Esparza J (2015) The prevention and eradication of smallpox: a commentary on Sloane (1755) ‘An account of inoculation’. Philos Trans R Soc Lond B Biol Sci 370
- WHO (2018) Tetanus: Fact Sheet. World Health Organization
- Yamada KD, Tomii K, Katoh K (2016) Application of the MAFFT sequence alignment program to large data-reexamination of the usefulness of chained guide trees. Bioinformatics 32:3246–3251
- Zhou JQ, Kleinstein SH (2019) Cutting Edge: Ig H Chains Are Sufficient to Determine Most B Cell Clonal Relationships. J Immunol 203:1687–1692