Abstract
Background
Tseek is a method of sequencing T cell receptor (TCR) repertoires with minimal bias. This work aimed to develop methods to characterize the TCR repertoire in dogs, identify influences such as genetic lineage and age, and evaluate the use of repertoires to monitor immune status in dogs.
Methods
Two studies were conducted to develop the techniques and characterize the effect of individual, breed, and age. One study analyzed RNA data from individuals (n = 32), 8 from each of 4 breeds, sampled at 2 time points a year apart. The second, lifestage study, used individuals within a single breed (Labrador Retriever) with ages dispersed across a broad range (0.2 to 12 yr, n = 50). Tseek was used to process samples for sequencing and to identify the V and J segments and annotate the CDR3, which were then analyzed to draw inferences.
Results
The TCR repertoires had signatures of breeds, and of the individual, with stability over at least a year. Across the lifestage study, littermate-specific characteristics were not detected, but an age-related effect was observed: older dogs exhibited reduced diversity characterized by a greater abundance of individual-specific high-frequency clones, while puppies had a more diverse repertoire.
Conclusion
An individual’s TCR repertoire includes stable information, indicative of the individual, breed, and age-related decline. The α and β chain repertoires had distinct properties in the breed-specific signatures, indicating differential influences on their selection, despite their pairing in each T cell. Consistent, age-related changes can be seen in the repertoire, but their impact on the immune system needs to be delineated.
Keywords: CDR3 Complementarity-determining region 3, PB Petit Basset Griffon Vendeen, BE Beagle, NT Norfolk Terrier, LR Labrador Retriever
Introduction
The adaptive immune system, a key feature of jawed vertebrates, helps fight infections. T and B cells are lynchpins of the system, with their receptors helping identify and target invaders. Receptors expressed by T cells (TCRs) allow them to bind antigenic peptides presented by the major histocompatibility complexes (such as Human Leukocyte Antigen) on the surfaces of antigen-presenting cells. TCRs are heterodimers of two transmembrane peptides (α, β) linked by covalent disulfide bonds. Germline TCRα and TCRβ loci undergo DNA rearrangement during the development of each T cell, wherein one of several V, D (only in TCRβ), and J regions are joined,1 which defines the complementarity determining region CDR3 (Fig. S1). The joining mechanisms introduce variations in the CDR3 (insertions, deletions, mutations) that do not exist in the template DNA, leading to a large variety (theoretically ∼10^16) of amino acid sequences at these junctions.2 The CDR3 contributes the most to antigen specificity.3
Considering the diversity of the receptors, Next Generation Sequencing (NGS) seems to be a natural choice to catalog this diversity, provided no biases are introduced in the process. Thus far, most approaches have used multiple primers designed against V segments, but this creates biases that are difficult to correct.4 Tseek,5,6 on the other hand, allows for unbiased, sensitive profiling of TCRα and TCRβ chains (Methods) that does not require a priori knowledge of the V and J segments. Tseek characterizes the populations of the α and β CDR3 separately to identify properties in the bulk, as opposed to single-cell methods. Tseek has already been used in several studies, including (1) the role of T cells in a diet-induced mouse model of colitis,7 (2) the T cell response to vaccines in humans,6 (3) iNKT cells (invariant innate-like T cells) in pigs,8 and (4) immunotherapy in mice and humans.9
Dogs are a unique species, having been domesticated by humans as companion animals and selected for different traits, with subsequent development of breeds with the widest size distribution of any mammalian species. Dog breeds have restricted diversities in their genomes, with differing susceptibilities to diseases10 and live in a home environment exposed to similar environmental and health risks as their caregivers. Dogs, with an immune system more similar to humans than mice,11 exhibit many disorders that humans face (autoimmunity, cognitive health, aging-related disorders, etc.) and respond to similar interventions. Dogs are a good model for some common human pathologies11–13 as well as those diseases that have specific immunological roles, such as psoriasis, where therapies like anti-IL-17 antibodies as well as JAK/STAT inhibitors have been used successfully. JAK/STAT inhibitors (eg, oclacitinib) are used in veterinary medicine to treat pruritus associated with allergic and atopic dermatitis in dogs. So, an improved understanding of the role of breed, age, and life history will not only further our understanding of dogs but may also highlight aspects of biology that likewise affect human health.
The canine TCR locus is evolutionarily conserved and organized similarly to other mammalian systems such as humans and mice.14 The development of the immune system in dogs roughly parallels other mammals.15 A single-cell transcriptomics study established that the lymphatic systems in dogs and humans are similar16 and other studies have shown that regulatory T cells17 and the TCRγ locus18 are also conserved in dogs. To our knowledge, there have been 2 attempts to apply NGS to sequencing the canine repertoire, but they were based on a restricted set of primers and limited to the β chain.19,20
The objectives of the 2 studies described here were to develop laboratory and data analysis methods to characterize the dog TCR. The first study characterized the effect of breed and the stability of the TCR repertoire within an individual, with samples acquired 1 yr apart; the second investigated the TCR repertoire of individuals from a single breed (Labrador Retriever), across a range of age groups.
Materials and methods
Ethics approvals
Dogs from both the Waltham Petcare Science Institute (UK) and the Pet Health Nutrition Center (US) were housed in purpose-built, environmentally enriched housing. In the United Kingdom, all care and procedures were in keeping with the requirements of the Animals (Scientific Procedures) Act 1986 and the study was approved by WALTHAM’s Animal Welfare and Ethical Review Body (Ethics Approval 67369). Dogs from the Pet Health Nutrition Center (US) were cared for in accordance with the Animal Welfare Act and AAALAC guidelines and approved by the IACUC.
Study design
Intra-individual and inter-breed study design
In the first study, we analyzed a 32-dog cohort from 4 breeds (8 per breed) that were sampled twice, a year apart. These samples were taken during biannual health checks at Waltham Petcare Science Institute. The additional blood (0.5 ml) required for this study was added to RNAProtect Animal tubes (Qiagen Inc), incubated for 24 h at 4 °C and then stored at −80 °C. Using this biobank, we selected a set with similar features (balance of sex, number of litters, and age range) sampled 1 year apart. This group consisted of 8 adults (range 4.8 to 7.7 years at first sample collection used in this study) each of 4 breeds (Beagle [BE], Norfolk Terrier [NT], Petit Basset Griffon Vendeen [PB] and Labrador Retriever [LR]; Table 1). These samples were sent on dry ice to Girihlet Inc. for analysis.
Table 1.
Characteristics of dogs selected for the longitudinal study.
| | Beagle (BE) | Norfolk Terrier (NT) | Petit Basset Griffon Vendeen (PB) | Labrador Retriever (LR) |
|---|---|---|---|---|
| Age range (years) | 5.9–7.4 | 4.8–5.3 | 6.4–7.1 | 4.8–7.7 |
| Sex (M/F) | 4/4 | 5/3 | 3/5 | 4/4 |
| No. of litters | 4 | 5 | 3 | 4 |
The distribution of the dogs by age, sex, and litters across the different breeds is listed.
Lifestages study design
To achieve a wide age distribution and capture the age range of Labrador Retrievers, 50 dogs were recruited from 2 cohorts within Mars Petcare, at Waltham Petcare Science Institute (WPSI, UK) and Pet Health Nutrition Center (PHNC, US) (Table 2). Exclusion criteria were a known morbidity (such as cancer or diabetes) and, to minimize skewing of the natural T cell populations, vaccination within 30 d prior to the blood sample. There were no specific housing or dietary requirements and dogs received standard care. The dogs were selected to provide quintiles representing the different life stages of dogs, balancing for gender across the 2 sites (Table 2), and were from 36 litters. Each of the 50 dogs was placed into 1 of 5 categories: Puppy (P), ranging in age from 0 to 1.1 yr; Youth (Y), ranging in age from 1.1 to 5.0 yr; Early midlife (E), ranging in age from 5 to 8 yr; Late midlife (L), ranging in age from 8 to 10 yr; and Senior (S), comprising dogs older than 10 yr of age (Table 2).
Table 2.
Lifestage categories defined by age for the 50-dog cohort.
| Group | Age Range (Years) | No. of dogs (WPSI) | No. of dogs (PHNC) | No. of litters |
|---|---|---|---|---|
| 1 | 0–1.1 (Puppy P) | | 2 Gender; 0F, 2M Litter; 2 | 6 |
| 2 | 1.1–5 (Youth Y) | 5 Gender; 2F, 3M Litter; 2, 1, 1, 1 | 5 Gender; 3F, 2M Litter; 2, 1, 1, 1 | 8 |
| 3 | 5–8 (Early midlife E) | 5 Gender; 4F, 1M Litter; 2, 1, 1, 1 | 5 Gender; 3F, 2M Litter; 2, 1, 1, 1 | 8 |
| 4 | 8–10 (Late midlife L) | 6 Gender; 3F, 3M Litter; 2, 2, 2 | 4 Gender; 3F, 1M Litter; 1, 1, 1, 1 | 7 |
| 5 | 10+ (Senior S) | – | 10 Gender; 5F, 5M Litter; 2, 2, 2, 1, 1, 1, 1 | 7 |
The dogs come from 2 centers, WPSI in the United Kingdom and PHNC in the United States. Each cell in the third and fourth columns shows the number of dogs for each age group by center, followed by the gender mix (F, M) and the partition into litters (numbers in each litter).
Blood sampling and preparation
Each dog had a single jugular blood sample (2 ml) collected in a 2 ml EDTA tube. To ensure consistency in T cell isolation and mRNA extraction from blood sampled at the two locations, blood samples were sent fresh to Girihlet for T cell isolation, mRNA extraction, and selective amplification and sequencing. Samples were stored at 4 °C until sent on ice by courier and processed within 7 d of sampling.
Experimental methods
Blood processing
Two different methods were employed to process blood prior to RNA extraction in the 2 studies. The intra-individual and inter-breed study used biobanked whole-blood samples stored in RNAProtect, while the cross-sectional life stages study used fresh blood samples with RBC lysis to provide PBMCs.
RNA extraction from samples preserved in RNAProtect
RNA was extracted from 0.5 ml of blood stored in RNAProtect reagent from Qiagen using manufacturer’s instructions (RNeasy Protect Animal Blood Kit, Cat. No.: 73224).
RNA extraction using lysis of PBMCs
Total RNA from lymphocytes pelleted from 1 ml of blood was extracted using the AllPrep DNA/RNA Micro Kit from Qiagen (Cat. No.: 80284) following the protocol recommended by the manufacturer.
Tseek
Once the RNA was extracted by either of the methods above, its quality (RIN score) and quantity were checked using the RNA Bioanalyzer Nano (Agilent Technologies). RIN scores greater than 8 (max 10) were considered good quality. Messenger RNA (mRNA) was isolated from total RNA (500 ng), fragmented, and reverse transcribed using random primers. Adapters with 8 bp molecular indices were ligated to all the blunt-ended fragments. Nested polymerase chain reaction (PCR) was then performed using a TCR constant primer and an adapter primer. Libraries were quantified again using a Bioanalyzer and sequenced.
Sequencing
Paired-end sequencing (150 bp PE) was performed on the libraries generated by Tseek using Illumina instruments (NextSeq 500, HiSeq X, MiSeq). The libraries were tested on a MiSeq. For the intra-individual and inter-breed study, the samples were sequenced on a HiSeq X, and samples from the 50-dog lifestage cohort were sequenced on a NextSeq 500.
Analytical methods
Processing Tseek data
The data from Tseek are nucleotide sequences that start in the V segment, span the CDR3 region, and cover the whole J segment in α and β-chains. The sequencing reads were merged for paired-end data and mapped to the V and J segments (based on annotations from Gencode21 as well as those derived from our data). Reads annotated with V and J identities were translated to amino acid sequences (all 3 frames), and frames with stop codons were filtered out. Based on the V and J annotations, the location of the CDR3 region was identified, and the precise boundaries were determined using motifs for the boundaries derived from annotated sets of CDR3 from the reference set in the IMGT database.22 Separate tables were created for the V and J segments, the V-J pairs, and CDR3 counts, across all samples in the study. The rest of the analyses were based on these master tables.
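The translation and stop-codon filtering step described above can be illustrated with a minimal Python sketch. It is not the Tseek implementation: the function names (`translate`, `open_frames`) are hypothetical, only the three forward frames are considered, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nt):
    """Translate a nucleotide string codon by codon; unknown codons become 'X'."""
    nt = nt.upper()
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, len(nt) - 2, 3))


def open_frames(nt):
    """Return {frame_offset: protein} for forward frames that contain no stop codon."""
    frames = {}
    for offset in range(3):
        protein = translate(nt[offset:])
        if protein and "*" not in protein:
            frames[offset] = protein
    return frames


# Toy read (hypothetical): frame 0 contains a stop codon (TGA) and is dropped.
print(open_frames("TGTGCCTGATGG"))
```

In a full pipeline, the surviving frames would then be searched for the CDR3 boundary motifs; that step depends on the reference annotations and is not sketched here.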
Analysis of CDR3 data
The CDR3 data consist of amino-acid sequences up to 15 aa in length, with a million (or more) distinct CDR3 per sample. The depth of sequencing determines the number of CDR3 sequences detected in a sample. The methods described here were also implemented in tools on the companion website to this study, https://katahdin.girihlet.com/shiny/dog/, which allows users to explore the data and generate the figures and tables shown in the paper.
The analysis of CDR3 involves normalization so that the samples can be compared to each other, and features of interest can be identified. The details of this process are as follows.
Normalization of CDR3. Variability in the depth of sequencing impacts measured characteristics (such as entropy) of the repertoires. Normalization of the data allows for fair comparisons between samples and building higher-level characterizations. Key features of the repertoire distribution considered in developing the normalization scheme were (1) the distributions have a small number of highly abundant clones; (2) the distributions have an extremely long tail of low-frequency clones, deeper sequencing fills the tail with more rare clones, and these sometimes dominate the distance measures; and (3) scaling cannot recover sequences that were missed due to lower sequencing depths. This led to the following normalization scheme: (1) generate a frequency for each CDR3 seen in the sample, (2) sample CDR3 from this distribution (with replacement) until 5 million CDR3 are obtained, and (3) use the sampled set of CDR3 for further analysis.
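The three-step resampling scheme can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code used in the study: the function name `normalize_repertoire`, the fixed random seed, and the toy CDR3 sequences are hypothetical, and the example uses a small depth instead of 5 million to keep it fast.

```python
import random
from collections import Counter


def normalize_repertoire(counts, depth=5_000_000, seed=0):
    """Resample a {cdr3: count} table to a fixed depth, with replacement."""
    rng = random.Random(seed)  # fixed seed is an assumption, for reproducibility
    clones = list(counts)
    total = sum(counts.values())
    # Step 1: convert raw counts to frequencies.
    freqs = [counts[c] / total for c in clones]
    # Step 2: draw `depth` CDR3 from that distribution, with replacement.
    draws = rng.choices(clones, weights=freqs, k=depth)
    # Step 3: the resampled counts are what downstream analyses would use.
    return Counter(draws)


# Toy sample (hypothetical sequences and counts).
raw = {"CASSLGQGYEQYF": 900, "CAVRDSNYQLIW": 90, "CASSPTGELFF": 10}
normalized = normalize_repertoire(raw, depth=10_000)
print(sum(normalized.values()))
```

Every sample ends up with the same total count, so rare clones seen only through deeper sequencing are not given extra weight, and no attempt is made to recover clones that a shallow run missed.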
Characterizing the repertoires. Entropy is a metric used to characterize the diversity in a repertoire. The more diverse a repertoire, the higher the entropy, which translates to a more even distribution of clones. A repertoire with a few dominant clones would have a lower entropy. Entropy (H) for a vector with N probability members is given by $H = -\sum_{i=1}^{N} p_i \log p_i$, where $p_i \ge 0$, $\sum_{i=1}^{N} p_i = 1$, and $i = 1, \dots, N$. The maximum entropy possible is $\log(N)$, for the uniform distribution, which is used to normalize the entropy. Entropy is used to compare diversities.
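A minimal Python sketch of the normalized entropy is given here for illustration. The function name `normalized_entropy` is hypothetical, and taking N to be the number of clones with nonzero counts is an assumption, since the text does not state how N is chosen.

```python
import math


def normalized_entropy(counts):
    """Shannon entropy of clone counts divided by log(N), its maximum value."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    n = len(probs)  # assumption: N = number of clones actually observed
    if n < 2:
        return 0.0  # a single clone has no diversity
    h = -sum(p * math.log(p) for p in probs)
    return h / math.log(n)


even = [25, 25, 25, 25]   # evenly distributed clones (toy values)
peaked = [97, 1, 1, 1]    # one dominant clone (toy values)
print(normalized_entropy(even), normalized_entropy(peaked))
```

The even repertoire scores at the maximum of 1, while the repertoire dominated by one clone scores much lower, which is the behavior the age-related analysis relies on.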
Comparisons of samples. Any comparison of samples requires that a distance be defined between the CDR3 distributions. We used the Jensen-Shannon (JS) distance,23 which is calculated by first evaluating the JS divergence between two probability distributions, $P$ and $Q$. The JS divergence is defined as $JS_{\pi}(P, Q) = H(\pi_1 P + \pi_2 Q) - \pi_1 H(P) - \pi_2 H(Q)$, where $H$ is the entropy and $\pi_1, \pi_2 \ge 0$ are relative weights of the two distributions, satisfying $\pi_1 + \pi_2 = 1$. This divergence becomes a true distance upon setting $\pi_1 = \pi_2 = 1/2$ and taking its square root,24 which is the definition used in our calculations.
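The JS distance with equal weights can be sketched in Python as below. The function names and toy repertoires are hypothetical, and the base-2 logarithm is an assumption made so that the distance lies between 0 and 1; the text does not specify the base.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def js_distance(p_counts, q_counts):
    """Square root of the JS divergence with equal weights (1/2, 1/2)."""
    clones = sorted(set(p_counts) | set(q_counts))
    p_total = sum(p_counts.values())
    q_total = sum(q_counts.values())
    p = [p_counts.get(c, 0) / p_total for c in clones]
    q = [q_counts.get(c, 0) / q_total for c in clones]
    mix = [(a + b) / 2 for a, b in zip(p, q)]
    divergence = entropy(mix) - 0.5 * entropy(p) - 0.5 * entropy(q)
    return math.sqrt(max(divergence, 0.0))  # guard against tiny negative rounding


# Toy repertoires (hypothetical CDR3 and counts).
a = {"CASSLGQGYEQYF": 80, "CASSPTGELFF": 20}
b = {"CASSLGQGYEQYF": 70, "CAVRDSNYQLIW": 30}
print(js_distance(a, b))
```

Identical repertoires give a distance of 0 and repertoires with no shared clones give the maximum, so the pairwise values can be assembled directly into a distance matrix for clustering.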
Clustering the data. The distance matrix was used to cluster the data, using hierarchical cluster analysis (HCA) and multidimensional scaling (MDS) (Figs 1 and 2). Specifically, “hclust” was used in R25 to generate the trees and “cmdscale” was used to generate coordinates for MDS.
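To illustrate how a tree is built from the distance matrix, the following pure-Python sketch performs agglomerative clustering. It is a stand-in for illustration only and not the R code used in the study. Complete linkage is assumed because it is the default method of “hclust” in R; the linkage actually used is not stated. The function name, sample labels, and distances are hypothetical.

```python
def complete_linkage(labels, dist):
    """Agglomerative clustering on a symmetric distance matrix.

    Returns the merge history as (members_a, members_b, height) tuples.
    """
    members = {i: (labels[i],) for i in range(len(labels))}
    indices = {i: [i] for i in range(len(labels))}  # original rows per cluster
    merges = []
    next_id = len(labels)
    while len(members) > 1:
        best = None
        ids = sorted(members)
        for x in range(len(ids)):
            for y in range(x + 1, len(ids)):
                a, b = ids[x], ids[y]
                # Complete linkage: cluster distance is the largest pairwise distance.
                d = max(dist[i][j] for i in indices[a] for j in indices[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((members[a], members[b], d))
        members[next_id] = members.pop(a) + members.pop(b)
        indices[next_id] = indices.pop(a) + indices.pop(b)
        next_id += 1
    return merges


# Toy JS distances for two dogs sampled twice (hypothetical values).
labels = ["LR_A_1", "LR_A_2", "BE_A_1", "BE_A_2"]
dist = [
    [0.00, 0.20, 0.70, 0.80],
    [0.20, 0.00, 0.75, 0.70],
    [0.70, 0.75, 0.00, 0.30],
    [0.80, 0.70, 0.30, 0.00],
]
for left, right, height in complete_linkage(labels, dist):
    print(left, right, height)
```

In this toy example the two samples from each dog merge first, which is the pattern interpreted in the Results as stability of an individual's repertoire.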
Uncertainty in clustering. A bootstrap approach, in which the original data were repeatedly resampled and the clusters recalculated, was used to establish that the clusters were stable and reliable.
Scoring breed identity with clones. A set of the most abundant clones with breed-specific signatures (from the top 100,000 clones, out of millions of unique clones in the data set) were each given a breed-score based on log of their average frequency in each breed (with a 1 added to the frequencies). In the test set the breed of the dog was inferred by scoring the top-ranked clones that occurred in the training set and using the highest sum of scores to identify the breed.
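The scoring scheme can be sketched in Python as follows. This is one reading of the description above, not the authors' implementation: the function names, the `top_n` cutoff, the data layout, and the toy CDR3 labels are all hypothetical.

```python
import math
from collections import defaultdict


def train_breed_scores(samples):
    """samples: list of (breed, {cdr3: frequency}). Returns {cdr3: {breed: score}}."""
    by_breed = defaultdict(list)
    for breed, counts in samples:
        by_breed[breed].append(counts)
    clones = {c for _, counts in samples for c in counts}
    scores = {}
    for clone in clones:
        # Score = log(1 + average frequency of the clone within the breed).
        scores[clone] = {
            breed: math.log(1 + sum(s.get(clone, 0) for s in reps) / len(reps))
            for breed, reps in by_breed.items()
        }
    return scores


def predict_breed(counts, scores, top_n=1000):
    """Sum breed scores over the sample's top-ranked clones seen in training."""
    top = sorted(counts, key=counts.get, reverse=True)[:top_n]
    totals = defaultdict(float)
    for clone in top:
        for breed, score in scores.get(clone, {}).items():
            totals[breed] += score
    return max(totals, key=totals.get) if totals else None


# Toy training set (hypothetical clones and frequencies).
training = [
    ("LR", {"CASSA": 50, "CASSB": 2}),
    ("LR", {"CASSA": 40}),
    ("BE", {"CASSC": 60, "CASSB": 3}),
    ("BE", {"CASSC": 30}),
]
scores = train_breed_scores(training)
print(predict_breed({"CASSA": 20, "CASSB": 1}, scores))
```

Adding 1 before taking the log keeps clones that are absent from a breed at a score of zero, so only clones that are actually present in a breed contribute to its total.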
The role of age/litter identity in shaping the TCR repertoire (Table 3, Fig. 3). Each of the 50 dogs in the lifestage study was placed into 1 of 5 categories, Puppy (P), Youth (Y), Early Midlife (E), Late midlife (L), and Senior (S) (Table 2). In the CDR3 expression table (each row labelled with a distinct CDR3, columns labelled by samples, and each entry representing expression of the CDR3 in a sample), rows were sorted by the maximum value in each row. Clones expressed in multiple dogs were ignored for this analysis. A chi-square test was used to investigate the relationship between the distribution of clones and age groups, demonstrating that there was a significant effect of age on the distribution of the more frequent clones.
Figure 1.
Hierarchical clustering (HCA). HCA demonstrates stability of the repertoire within the individual over time and also grouping by breed. In the α-chain CDR3 tree (A), the Beagle (BE) and Labrador Retriever (LR) cluster separately, distinct from the Petit Basset Griffon Vendeen (PB) and Norfolk Terrier (NT) groups, while in the β-chain CDR3 tree (B), all 4 breed groups separate. Year-apart samples from each individual (_1 and _2) cluster mostly with each other, demonstrating stability of the α-chain (A) and β-chain (B) CDR3 repertoires over a year. Samples from the dogs LR_D and LR_B were collected as part of the 50-dog cohort within the same year as the 2 biobanked samples but processed by alternate methods. They were included in this analysis and mostly clustered with their biobank samples (the red stars), except the LR_D β-chain sample, which failed to cluster with its partners.
Figure 2.
MDS and heatmap view of distance between repertoires. Multidimensional scaling (MDS) of α-chain (A) and β-chain (B) CDR3 samples exhibits grouping by breed. Beagle (BE) and Labrador Retriever (LR) samples are well separated from each other and from the Petit Basset Griffon Vendeen (PB) and Norfolk Terrier (NT) samples in both the α and β chain CDR3 data. When the PB and NT samples are analyzed separately, they do not separate in the α-chain CDR3 data, but separation was possible in the β chain (inset, B). The bottom panels show a heatmap of the distances between samples for the α chain (C) and β chain (D). The trivial zeroes along the diagonals have been removed and are depicted with white squares for clarity. The differences between the α and β chains, as well as between the breeds, are visualized more readily in the heatmap. These views use the same data as the trees in Fig. 1, but the distinct visualizations emphasize different features of the data.
Table 3.
High-expressed clones by rank.
| Ranks (chain) | Puppy (P) 0–1.1 yr | Youth (Y) 1.1–5 yr | Early Midlife (E) 5–8 yr | Late Midlife (L) 8–10 yr | Senior (S) 10+ yr |
|---|---|---|---|---|---|
| 0–20 (α) | 0 | 1 | 1 | 2 | 7 |
| 21–50 (α) | 0 | 6 | 3 | 5 | 6 |
| 51–75 (α) | 0 | 2 | 0 | 6 | 7 |
| 75–100 (α) | 2 | 1 | 3 | 4 | 2 |
| 0–20 (β) | 0 | 2 | 1 | 3 | 11 |
| 21–50 (β) | 0 | 5 | 1 | 6 | 13 |
| 51–75 (β) | 0 | 1 | 0 | 3 | 10 |
| 75–100 (β) | 0 | 2 | 2 | 3 | 6 |
This table enumerates high-expression clones (for α, β chains) that are unique to a sample (Fig. 4 shows some examples). The first row shows the number in the top 20 clones sorted by frequency (α chain). The group of oldest dogs (S) tends to have more of these than the youngest dogs (P), suggesting that these clones tend to increase in frequency over time, reducing the diversity in the repertoire. Figure 5 further bolsters this narrative.
Figure 3.
Moderate expression of breed-specific CDR3 clones. Examples of breed-specific clones; the y-axis (log scale) shows expression levels, and the x-axis shows samples. Panel A shows the data for an α-chain CDR3 that is expressed mostly in Labrador Retrievers. Panel B shows the data for a β-chain CDR3 that is expressed mostly in Labrador Retrievers. These clones are indicative of breed-specific expression patterns, with lower average expression across the samples in a breed compared to the highly expressed, individual-specific clones (Fig. 4). These are specific examples picked to demonstrate the abundance levels of such breed-specific CDR3; each breed has such breed-specific clones. Lines were added to help distinguish neighboring bars; they are purely a visual aid with no significance.
Results
The canine TCR repertoire
We achieved the requisite sequencing depth (> 1 million reads) for most samples (110 out of 114) in the 2 studies, as shown in the plots (Fig. S2). The unevenness in the coverage of different samples was handled by the sampling procedure outlined in the methods section. To determine whether the Tseek data were consistent with other approaches, we elected to use qPCR for some segments and demonstrated that the results of Tseek were concordant with the qPCR data (Fig. S3). Since Tseek allows discovery of segments, our data resulted in an authoritative annotation of the V and J segments in the TCR locus for α and β-chains in the dog genome, which was essential for deriving the results described below.
Analysis and interpretation of the canine TCR repertoire
Individual and breed-specific signatures exist in the TCR α and β chain repertoires.
We hypothesized that an individual's TCR repertoire remains distinct over time. To test this, we applied hierarchical clustering analysis (HCA) to the pairwise distance matrix calculated using the JS metric. Beyond revealing repertoire patterns, this clustering served as a critical quality control (QC) step, allowing us to detect potential issues such as low sequencing depth, sample mishandling, or contamination. These were reflected either in much higher than usual similarity between unrelated samples (either between individuals or breeds) or in much lower similarity between year-apart samples, which was often the result of low sequencing depth or low-quality RNA. Based on quality checks (low quality/quantity RNA, sequencing depth, potential mislabeling, and sample contamination), the following samples were removed from the analysis: LR_H_2, NT_B_1, LR_G_2, BE_C_2, and PB_G_1 in the α-chain CDR3 and LR_H_1 and PB_F_2 in the β-chain CDR3.
The dendrogram of HCA results shows that samples in the breed study tended to cluster by breed, suggesting the repertoires encoded breed-associated information (Fig. 1). The LR and BE groups formed breed-specific clusters in both the α and β chain CDR3 data sets, while the PB and NT samples tended to group together. Within the HCA, 21 of the 26 possible individual pairs paired correctly in the α-chain data and 28 of a possible 30 individual pairs paired correctly in the β-chain data. These data indicate a signature associated with the individual that is stable for at least a year. Even though the same underlying distance data are used, both multidimensional scaling (MDS) and heatmaps were used to further investigate the separation of groups, because they are sensitive to different features and reveal patterns that are not apparent from the hierarchical trees (Fig. 2). Separation between the PB and NT samples was achieved by applying MDS to just the PB/NT samples, but for the β-chain repertoire only. In the heatmaps, visual inspection shows the relatively strong intra-breed connections in BE and LR, relative to the PB and NT, with differences between the patterns for the α and β chains (Fig. 2).
To understand the observed clustering, we investigated further the patterns in the relevant clones by examining their individual characteristics. The clones responsible for individual identity are primarily the high-abundance ones (Table 3, Fig. 4), while the breed-specific clones are relatively less abundant on average (Fig. 3).
Figure 4.
High expression of some individual-specific CDR3 clones. The y-axis (log scale) shows expression levels; the samples are along the x-axis. Panel A: Expression of an α-chain CDR3 that is highly expressed (several 100,000 copies) in a Late midlife dog, compared to expression in other dogs; Panel B: Expression of a β-chain CDR3 that is highly expressed (several 100,000 copies) in a Puppy dog. Each such CDR3 reduces the diversity (or entropy) in the sample, and the expression and number of these individual-specific CDR3 go up with age (Fig. 5). This indicates there are highly expressed, individual-specific clones that are not shared with littermates or other dogs from the same breed. Table 3 gives the counts of such clones in different age groups.
In the intra-individual and inter-breed study, the samples usually clustered by individual in HCA (Fig. 1) demonstrating that the TCR repertoire is stable for at least a year and individuals may be identified by their TCR, akin to a fingerprint. We were able to test this post hoc as 2 of these Labradors were also members of the 50-dog cohort. Despite different experimental processing, their data clustered with the samples from the corresponding dogs in the longitudinal study (Fig. 1), providing evidence that the data are robust, and interpretation may be unaffected by changes in methods of RNA isolation or sequencing instruments.
Despite the littermates being of the same age and sharing relatively similar exposures and genetic backgrounds, their repertoires diverge, as measured by pairwise JS distances. The closest samples in the cohort are often from individuals unrelated by age or litter. Additionally, abundant clones show little similarity in expression between littermates (an example is shown in Fig. 4). This is true even in very young (< 1 yr old) littermates. We know there are breed-specific CDR3 that provide some measure of similarity, but there are sufficient numbers of abundant, individual-specific clones, even amongst littermates, to enable distinguishing individuals based on the repertoire alone.
The repertoire model for breeds
Analysis of the distribution of individual CDR3 across breeds suggested that there was a set of CDR3 sequences common to members within a breed (examples for Labrador Retriever in both α and β chains are shown in Fig. 3). We used the breed-scoring scheme described in the methods section to identify the breeds of the 50-dog cohort, which were all Labrador Retrievers, on the basis of the repertoire alone. Greater than 90% of the samples were correctly identified for the β repertoire, but the model performed less effectively for the α repertoire (60% correct breed identification).
Effect of age on the TCR repertoire
The CDR3 clones were sorted by maximum frequency, and clones unique to an individual (Fig. 4) were placed in the S, L, E, Y, and P categories, based on the age of the dog. These were tabulated (Table 3) with the data organized by the top 20, the next 30, and so on. The S (Senior) category has the largest number of such clones, followed by the L group; the Y and E groups have fewer, and the P (Puppy) group trails behind all (Table 3). Thus, the S category dogs have more of their CDR3s occurring in high-abundance, individual-specific clones relative to the rest of their repertoire.
The distribution of high-abundance individual-specific clones in different age categories is significantly different from random (P-value < 0.0001 for top 100 clones using the χ² test).
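For illustration, a χ² statistic can be computed from the column totals of Table 3 with a short Python sketch. The exact design of the test used in the study is not described, so this sketch makes its own assumptions: a goodness-of-fit test against equal expected counts per age group (which presumes equal group sizes), each chain tested separately, and 4 degrees of freedom. The function names are hypothetical, and the result need not match the reported P-value.

```python
import math

AGE_GROUPS = ["P", "Y", "E", "L", "S"]
# Column totals of Table 3 over ranks 0 to 100, per chain.
ALPHA_TOTALS = [2, 10, 7, 17, 22]
BETA_TOTALS = [0, 10, 4, 15, 40]


def chi_square_uniform(observed):
    """Goodness-of-fit statistic against equal expected counts per group."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def p_value_df4(stat):
    """Upper-tail probability of a chi-square variable with 4 degrees of freedom.

    For 4 degrees of freedom the tail has the closed form exp(-x/2) * (1 + x/2).
    """
    return math.exp(-stat / 2) * (1 + stat / 2)


for chain, totals in (("alpha", ALPHA_TOTALS), ("beta", BETA_TOTALS)):
    stat = chi_square_uniform(totals)
    print(chain, round(stat, 2), p_value_df4(stat))
```

Under these assumptions both chains depart clearly from an even spread across age groups, with the β chain showing the stronger skew toward the Senior group.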
Samples with a larger number of highly expressed clones have a lower diversity, or more concentrated distribution (lower entropy), compared to samples that do not have such outliers. These data are consistent with dogs in the Puppy (P) stage having a relatively even distribution that may progress across the lifestages to a more “peaked” distribution, characterized by many individual-specific super-abundant clones, as the dogs grow older. This is seen as an abrupt change in the distribution of highly expressed individual-specific clones between the P (Puppy) and the Y (Youth) categories. The average expression levels of the top 3 individual-specific clones across samples in each age group corroborate these observations (Fig. 5).
Figure 5.
Expression levels of the highest expressed clones unique to each dog (A, B), averaged by age category of dogs (C, D). Average expression (y-axis, log scale) of the top 3 α-chain and β-chain clones unique to each dog (x-axis), ordered by age and colored by age category (A, B). The age progression of the average from Puppy (P) through Youth (Y), Early midlife (E), Late midlife (L), and Senior (S) life stages is also shown (panels C, D). This trend does not change if the top 20 individual-specific clones are chosen instead of the top 3. Panels C and D demonstrate a trend in the restriction of the TCR repertoire with age. The quadratic fit to the log values (red line) is a guide to the trend; considerably more data are needed to establish the parameters of any such fit.
Discussion
Our objective was to establish methods for bulk TCR repertoire characterization in dogs and identify key drivers of repertoire variance. We demonstrated that bulk TCR repertoire profiling can provide broader insights than single-cell analysis of the T cell receptor.
We optimized sample processing by implementing an RNAProtect-based method that requires minimal blood volume (0.5 ml vs 2 ml) with reduced laboratory processing compared to WBC lysis, while yielding comparable results (Fig. 1). This approach facilitates sample storage for subsequent comparative analyses of repertoire responses to immune challenges.
Individual TCR profiles remained remarkably consistent over at least one year, primarily due to the presence of highly abundant clones. This persistence suggests these expanded clones are not responses to acute immune activation but rather to persistent antigenic exposure, similar to microbiome-driven expansions reported in mice.26 This stability establishes the TCR repertoire as a valuable biomarker that, when monitored longitudinally, can provide meaningful insights into health status, consistent with findings of human studies.6
One of the merits of Tseek is its unbiased approach, which allows the discovery of novel segments. A recent study that demonstrated the promise of measuring the clonality of beta and delta receptors in gastrointestinal lymphoma in dogs sometimes failed to detect relevant clones due to a lack of appropriate primers.27 As more diverse breeds are investigated, identifying breed-specific segments becomes essential, which Tseek enables.
Tseek measures the bulk α and β repertoires separately, raising questions about the value of unpaired analysis versus paired αβ analysis for receptor specificity. However, the repertoire's complexity (millions of clones in a long-tail distribution) necessitates deep sampling, achievable only through bulk sequencing, especially for probing mid-distribution clones where much of the immune response signal resides based on human vaccine studies.6
Our data consistently revealed differences between the α and β chain repertoires, including (i) the distinct clustering of NT/PB dogs in the α and β chain repertoires (Fig. 1), (ii) differences between the α and β repertoires in the ability to predict breed, and (iii) different patterns in the heatmaps of the repertoires for the 2 chains (Fig. 2). These observations are consistent with the independent development of the 2 chains; for example, studies have shown that the numbers of responsive clones in the 2 chains are different: the responsive parts to the COVID vaccine in humans have more numerous α chain components compared to the β chain repertoire.6 The T cells in an individual are not numerous enough to sample all possible α/β combinations, so a reactive T cell cannot be optimized for binding both chains, suggesting that the binding of the T cell to epitopes is likely the result of independent, additive binding of the α and β chain in the receptor dimer. Thus, both chains should be evaluated, potentially independently, while assessing T cell responses.
Immune function often declines with age, presumably due to reduced repertoire diversity. Our study of a 50-dog cohort provides a means to identify changes in the TCR repertoire with age in Labrador Retrievers. One advantage of performing aging studies in purebred dogs, compared to humans or other mammals, is the reduced variability due to genetic background, allowing the effects of aging to stand out. There have been reports of a noticeable reduction in repertoire diversity in older Labradors (10-13 years of age), but these have been based on relatively crude methods using size distributions of CDR3 as evidence.28 Based on our data, we identify a trend of increasing concentration in a subset of clones in each individual, to the detriment of overall diversity, as the dogs age. These signatures can serve as a rough marker of age, but individual variability suggests they might not be good “clocks”. Instead, they might define an “immunological age” as a measure of health. The implications of this for immune function need further investigation.
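The notion of clones becoming concentrated at the expense of diversity can be made concrete with standard summary statistics. The sketch below assumes Shannon entropy, a normalized clonality score (1 minus Pielou's evenness), and the fraction of reads held by the top clones as the measures. These are common choices in repertoire analysis and are not necessarily the exact metrics used in this study. The function names and example counts are hypothetical.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (natural log) of a list of clone read counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("repertoire has no reads")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def clonality(counts):
    """1 - normalized entropy: 0 for a perfectly even repertoire,
    approaching 1 when a few clones dominate."""
    n_clones = sum(1 for c in counts if c > 0)
    if n_clones < 2:
        return 1.0  # a single clone is treated as fully clonal
    return 1.0 - shannon_entropy(counts) / math.log(n_clones)


def top_fraction(counts, k=10):
    """Fraction of all reads contributed by the k most abundant clones."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("repertoire has no reads")
    return sum(sorted(counts, reverse=True)[:k]) / total


if __name__ == "__main__":
    # Hypothetical read counts per clone (illustrative only, not study data).
    puppy_like = [12, 11, 10, 10, 9, 9, 8, 8, 7, 6]
    senior_like = [700, 150, 40, 5, 3, 2, 2, 1, 1, 1]
    for label, counts in (("even", puppy_like), ("skewed", senior_like)):
        print(label,
              f"entropy={shannon_entropy(counts):.2f}",
              f"clonality={clonality(counts):.2f}",
              f"top1={top_fraction(counts, k=1):.2f}")
```

In this framing, the age-related trend described above would appear as lower entropy, higher clonality, and a larger top-clone fraction in older dogs, with the dominant clones differing from one individual to the next.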
The breed-associated signatures in the TCR repertoire might originate in the differences in the Dog Leukocyte Antigen (DLA), potentially explaining the breed-specificity of certain disorders, particularly autoimmune conditions. This insight could guide the development of novel diagnostic assays and targeted treatments.
While breed limits genetic variability, it does not provide an understanding of variability between individuals, especially among littermates, who do not share highly expressed clones. This indicates that individual CDR3 sequences in the repertoires respond to stimuli in a stochastic manner. Therefore, we can conclude there are multiple possibilities for responsive clones in an individual, with selection and response magnitude occurring stochastically—an important consideration when identifying T cell targets from repertoire sequences.
Previous studies have observed TCR repertoire changes in dogs as potential health biomarkers. For example, reduced diversity of the TCR repertoire was associated with inflammatory gastrointestinal disease in dogs,29 and the TCR repertoire was used to study the progression of autoimmune disorders in dogs after bone-marrow transplants.30 Single-cell profiling of the TCR repertoire in dogs was used to associate responsive clones with immunotherapies,31 though limited clonal detection (thousands per sample) restricted definitive conclusions. Using the TCR as a biomarker for type 1 diabetes has also been proposed, along with a suggestion that building a database of repertoires would facilitate biomarker development.32 Despite attempts to introduce clonality testing in veterinary medicine by providing guidelines for its use,33 clinical impact remains minimal.
Several limitations warrant consideration. The time series was constrained to a year, preventing determination of longer-term repertoire stability. While we recruited Labrador Retrievers across a wide age range, they came from 2 locations (UK and US), introducing breeding sub-population differences and environmental/husbandry variations. The oldest dog group was exclusively US-based, which potentially introduces confounding effects. This study is a proof of concept; additional large-scale studies are needed to fully realize the potential of these measurements.
Future studies in dogs should consider the breed as a potentially confounding variable. It would be of great value to extend this study to a wider range of breeds. Another useful study might focus on the differences in the effect of age on the repertoires in dogs from breeds with widely differing average lifespans.
Developing these techniques in dogs provides multiple benefits. They directly impact the development of diagnostics and therapies for dogs specifically, but more importantly, if dogs become a good model for human diseases, then there can be an exchange of therapies (such as immunotherapies) from humans to dogs and vice versa.
Summary
We established robust methods for sampling and analyzing bulk TCR repertoires in dogs. Using a cohort of dogs representing different genotypes, we demonstrated a role for differential selection of α and β chains. Individuals have highly abundant features that are stable for at least a year, indicating sustained clonal expansion. We identified a significant age-associated decrease in immune repertoire diversity, most pronounced between the youngest (P) and the oldest (S) dogs, with a consistent trend across age-defined groups. We propose that longitudinal bulk TCR analysis using Tseek during routine health assessments could help identify the effects of interventions, diseases, and vaccinations while advancing immunological research.
Acknowledgments
The authors thank Janet Alexander for establishing the biobank of whole blood in RNAProtect that allowed the breed cohort study to take place, and Rhiannon Reynolds for supporting the appropriate individuals from the database. They also thank all the trainers, carers, and technical staff that enable our dogs to contribute to our caring science.
Contributor Information
David Allaway, Waltham, Petcare Science Institute, Leicestershire, United Kingdom.
Matt Harrison, Mars Advanced Research Institute, McLean, VA, United States.
Claire Pink, Waltham, Petcare Science Institute, Leicestershire, United Kingdom.
Richard Haydock, Waltham, Petcare Science Institute, Leicestershire, United Kingdom.
Anitha Devi Jayaprakash, Girihlet Inc., Oakland, CA, United States.
Ravi Sachidanandam, Girihlet Inc., Oakland, CA, United States; New York Medical College, Valhalla, NY, United States.
Author contributions
D.A.: Writing, Conceptualization, Resources, Project administration. M.H.: Resources, Conceptualization, Funding acquisition, Project administration. C.P.: Formal analysis, Writing (Editing, Review). R.H.: Writing (Editing, Review), Formal analysis. A.D.J.: Methodology, Investigation, Project administration, Funding acquisition. R.S.: Writing, Visualization, Data curation, Formal analysis, Software, Project administration.
Supplementary material
Supplementary material is available at ImmunoHorizons online.
Funding
Funding was provided by Mars Inc. The funders had no role in the design, analysis or writing of this article.
Conflicts of interest
A.D.J. and R.S. are inventors on the Tseek patent (USPTO 10,920,220) and are co-founders of Girihlet Inc., which has licensed the Tseek patent from the Icahn School of Medicine at Mount Sinai. D.A., M.H., R.H., and C.P. are employees of Mars Inc.
Data availability
A website with tools, resources, and downloads has been set up at https://katahdin.girihlet.com/shiny/dog. All data used in this paper will be made available at this site.
References
- 1. Kappler JW, Kushnir E, Marrack P. Analysis of V beta 17a expression in new mouse strains bearing the V beta a haplotype. J Exp Med. 1989;169:1533–1541.
- 2. Mitchell AM, Michels AW. T cell receptor sequencing in autoimmunity. J Life Sci (Westlake Village). 2020;2:38–58. 10.36069/jols/20201203
- 3. Springer I, Tickotsky N, Louzoun Y. Contribution of T cell receptor alpha and beta CDR3, MHC typing, V and J genes to peptide binding prediction. Front Immunol. 2021;12:664514. 10.3389/fimmu.2021.664514
- 4. Shen Y, Voigt A, Leng X, Rodriguez AA, Nguyen CQ. A current and future perspective on T cell receptor repertoire profiling. Front Genet. 2023;14:1159109. 10.3389/fgene.2023.1159109
- 5. Jayaprakash AD, Chess A, Sachidanandam R. Methods for determining recombination diversity at a genomic locus. US10920220B2. 2021. Accessed April 30, 2021. https://patents.google.com/patent/US10920220B2/en
- 6. Mohammed K et al. The T cell receptor repertoire reflects the dynamics of the immune response to vaccination [preprint]. bioRxiv. 2021. 10.1101/2021.12.09.471735
- 7. Chen L et al. Diet modifies colonic microbiota and CD4+ T-cell repertoire to induce flares of colitis in mice with myeloid-cell expression of interleukin 23. Gastroenterology. 2018;155:1177–1191.e16. 10.1053/j.gastro.2018.06.034
- 8. Yang G et al. Next generation sequencing of the pig αβ TCR repertoire identifies the porcine invariant NKT cell receptor. J Immunol. 2019;202:1981–1991. 10.4049/jimmunol.1801171
- 9. Pai C-CS et al. Clonal deletion of tumor-specific T cells by interferon-γ confers therapeutic resistance to combination immune checkpoint blockade. Immunity. 2019;50:477–492.e8. 10.1016/j.immuni.2019.01.006
- 10. Plassais J et al. Whole genome sequencing of canids reveals genomic regions under selection and variants influencing morphology. Nat Commun. 2019;10:1489. 10.1038/s41467-019-09373-w
- 11. Chow L, Wheat W, Ramirez D, Impastato R, Dow S. Direct comparison of canine and human immune responses using transcriptomic and functional analyses. Sci Rep. 2024;14:2207. 10.1038/s41598-023-50340-9
- 12. Dow S. A role for dogs in advancing cancer immunotherapy research. Front Immunol. 2019;10:2935. 10.3389/fimmu.2019.02935
- 13. Shearin AL, Ostrander EA. Leading the way: canine models of genomics and disease. Dis Models Mech. 2010;3:27–34. 10.1242/dmm.004358
- 14. Mineccia M et al. New insight into the genomic structure of dog T cell receptor beta (TRB) locus inferred from expression analysis. Dev Comp Immunol. 2012;37:279–293. 10.1016/j.dci.2012.03.010
- 15. Rabiger FV et al. Distinct features of canine non-conventional CD4−CD8α− double-negative TCRαβ+ vs. TCRγδ+ T cells. Front Immunol. 2019;10. Accessed December 18, 2023. https://www.frontiersin.org/articles/10.3389/fimmu.2019.02748
- 16. Eschke M, Moore PF, Chang H, Alber G, Keller SM. Canine peripheral blood TCRαβ T cell atlas: identification of diverse subsets including CD8A+ MAIT-like cells by combined single-cell transcriptome and V(D)J repertoire analysis. Front Immunol. 2023;14:1123366. 10.3389/fimmu.2023.1123366
- 17. Wu Y et al. Phenotypic characterisation of regulatory T cells in dogs reveals signature transcripts conserved in humans and mice. Sci Rep. 2019;9:13478. 10.1038/s41598-019-50065-8
- 18. Keller SM, Moore PF. Rearrangement patterns of the canine TCRγ locus in a distinct group of T cell lymphomas. Vet Immunol Immunopathol. 2012;145:350–361. 10.1016/j.vetimm.2011.12.008
- 19. Zuleger CL et al. Development of a next-generation sequencing protocol for the canine T cell receptor beta chain repertoire. Vet Immunol Immunopathol. 2024;268:110702. 10.1016/j.vetimm.2023.110702
- 20. Magic™ canine TCR repertoire profiling. Creative Biolabs. Accessed December 15, 2023. https://www.creative-biolabs.com/magic-canine-tcr-repertoire-profiling.html
- 21. Frankish A et al. GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res. 2019;47:D766–D773. 10.1093/nar/gky955
- 22. Lane J, Duroux P, Lefranc M-P. From IMGT-ONTOLOGY to IMGT/LIGMotif: the IMGT standardized approach for immunoglobulin and T cell receptor gene identification and description in large genomic sequences. BMC Bioinform. 2010;11:223. 10.1186/1471-2105-11-223
- 23. Lin J. Divergence measures based on the Shannon entropy. IEEE Trans Inform Theory. 1991;37:145–151. 10.1109/18.61115
- 24. Endres DM, Schindelin JE. A new metric for probability distributions. IEEE Trans Inform Theory. 2003;49:1858–1860. 10.1109/TIT.2003.813506
- 25. R Core Team. R: a language and environment for statistical computing. R Foundation for Statistical Computing. 2024. https://www.R-project.org/
- 26. Chen YE et al. Engineered skin bacteria induce antitumor T cell responses against melanoma. Science. 2023;380:203–210. 10.1126/science.abp9563
- 27. Takanosu M, Kagawa Y. A clonality analysis based on T-cell receptor beta and delta loci for high-grade gastrointestinal lymphoma in dogs. J Vet Diagn Invest. 2022;34:972–976. 10.1177/10406387221116285
- 28. Holder A et al. Perturbation of the T cell receptor repertoire occurs with increasing age in dogs. Dev Comp Immunol. 2018;79:150–157. 10.1016/j.dci.2017.10.020
- 29. Olivero D, Turba ME, Gentilini F. Reduced diversity of immunoglobulin and T-cell receptor gene rearrangements in chronic inflammatory gastrointestinal diseases in dogs. Vet Immunol Immunopathol. 2011;144:337–345. 10.1016/j.vetimm.2011.08.011
- 30. Vernau W et al. T cell repertoire development in XSCID dogs following nonconditioned allogeneic bone marrow transplantation. Biol Blood Marrow Transplant. 2007;13:1005–1015. 10.1016/j.bbmt.2007.05.013
- 31. Hoang MH et al. Single-cell T-cell receptor repertoire profiling in dogs. Commun Biol. 2024;7:484–416. 10.1038/s42003-024-06174-w
- 32. Nakayama M, Michels AW. Using the T cell receptor as a biomarker in type 1 diabetes. Front Immunol. 2021;12:777788. 10.3389/fimmu.2021.777788
- 33. Keller SM, Vernau W, Moore PF. Clonality testing in veterinary medicine: a review with diagnostic guidelines. Vet Pathol. 2016;53:711–725. 10.1177/0300985815626576