Extensive phylogenetic analysis of chemokine signalling molecules reveals the origin and evolution of canonical and non-canonical components, shedding light on the evolution of this complex system.
Abstract
Chemokine signalling performs key functions in cell migration via chemoattraction, such as attracting leukocytes to the site of infection during host defence. The system consists of a ligand, the chemokine, usually secreted outside the cell, and a chemokine receptor on the surface of a target cell that recognises the ligand. Several noncanonical components interact with the system. These include a variety of molecules that usually share some degree of sequence similarity with canonical components and, in some cases, are known to bind to canonical components and/or to modulate cell migration. Whereas canonical components have been described in vertebrate lineages, the distribution of the noncanonical components is less clear. Uncertainty over the relationships between canonical and noncanonical components hampers our understanding of the evolution of the system. We used phylogenetic methods, including gene–tree to species–tree reconciliation, to untangle the relationships between canonical and noncanonical components, identify gene duplication events, and clarify the origin of the system. We found that unrelated ligand groups independently evolved chemokine-like functions. We found noncanonical ligands outside vertebrates, such as TAFA “chemokines” found in urochordates. In contrast, all receptor groups are vertebrate-specific and all—except ACKR1—originated from a common ancestor in early vertebrates. Both ligand and receptor copy numbers expanded through gene duplication events at the base of jawed vertebrates, with subsequent waves of innovation occurring in bony fish and mammals.
Introduction
The chemokine system is responsible for regulating many biological processes, including host defence, neuronal communication, and homeostasis (1, 2, 3, 4, 5). The system has two components, a ligand, usually a small cytokine called a chemokine, and a receptor. It typically operates through chemoattraction, wherein one cell type produces and secretes chemokines, creating a chemical gradient as these molecules disperse. Cells equipped with the corresponding chemokine receptors on their membranes can recognise and bind to specific chemokines, promoting their migration along the gradient (4). This mechanism allows cells to reach target locations, such as infection sites during inflammation or tissues important for homeostatic functions, for example, leukocyte maturation and trafficking (3, 6). Chemokines involved in the latter homeostatic functions are usually constitutively expressed, whereas those involved in inflammatory responses have an inducible expression (7). Chemokine ligands are categorised into four groups, XC, CC, CXC, and CX3C, according to the pattern of cysteine residues in the N-terminal portion of the protein (8). Likewise, the receptors are classified based on the ligands they bind to into four groups, the XCR, CCR, CXCR, and CX3CR, and all of them belong to the GPCR class A superfamily (9). In addition to canonical components, other molecules have been discovered to function similarly to chemokine ligands (1) or receptors (2) (see Table 1). 
These include the following: the chemokine-like factor (CKLF), which binds to chemokine receptor CCR4 (10, 11) and drives cell migration in vivo (12); TAFA chemokines, expressed mainly in the nervous system, which share structural similarities with canonical chemokines (25, 26) and bind GPCRs related to chemokine receptors, for example, formyl peptide receptors (FPR) (27, 28) and GPR1 (29); and cytokine-like 1 (CYTL1), which binds CCR2 (22) and has been suggested to be related to CC ligands based on the presence of an IL8-like chemokine fold (40). There are also noncanonical chemokine receptors, such as the following: the chemokine-like receptor (CML1, also known as CMKLR1) (36); atypical chemokine receptors (ACKRs) (33); and viral chemokine receptors (41, 42, 43, 44). Unlike other chemokine receptors, atypical receptors cannot initiate classical chemokine signalling upon ligand binding (33, 45). The human genome encodes four types of ACKRs: ACKR1 (also known as DARC), ACKR2 (also known as D6), ACKR3 (also known as CXCR7), and ACKR4 (also known as CCRL1) (34, 35). In addition, several proteins of viral origin, such as US28 from human cytomegalovirus, have chemokine receptor or chemokine-binding activity (41, 42). These viral proteins can bind a wide array of chemokine ligands (42).
Table 1.
Summary table of all the canonical and noncanonical chemokine components analyzed in this study.
Group | Names | Abbreviations | H. sapiens orthologs | Functions | References
---|---|---|---|---|---
Ligand groups | Canonical chemokines | CCL, CXCL, XCL, CX3CL | CCL1-3, 3L1, 3L3, 4, 4L1-L2, 5, 7, 8, 11, 13, 14-28; CXCL1-4, 4L1, 5-14, 16, 17; XCL1, 2; CX3CL1 | Chemokine receptor binding and signalling; chemoattraction of leukocytes; homeostasis of leukocytes | (2, 4, 7)
Ligand groups | CKLF-like MARVEL transmembrane domain-containing proteins (chemokine-like factor super family) | CKLF, CMTM | CKLF1; CMTM1-8 (CKLF, CKLFSF1-8) | CKLF1 (CKLF) binds to chemokine receptor CCR4; CKLF1 (CKLF) has chemotactic activity for lymphocytes, macrophages, and neutrophils; other CMTMs are variably expressed in the immune system, with putative roles in immunity, programmed cell death, regulation of antitumour immunity etc. | (1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21)
Ligand groups | Cytokine-like protein 1 (Protein C17 or C4orf4) | CYTL | CYTL1 | Chemokine receptor binding (CCR2) and signalling; chemoattraction of monocytes/macrophages; chemotactic activity in neutrophils | (1, 22, 23, 24)
Ligand groups | TAFA chemokines (family with sequence similarity 19 (chemokine (C-C motif)-like) member A) | TAFA | TAFA1-5 (FAM19A1-5) | Formyl-peptide receptor binding and signalling (TAFA4 and 5); putative binding to other GPCRs, namely GPR1 (TAFA1) and S1PR2 (TAFA5); expressed in central and peripheral nervous system; implicated in a vast diversity of physiological processes | (1, 25, 26, 27, 28, 29, 30, 31, 32)
Receptor groups | Canonical chemokine receptors | CCR, CXCR, XCR, CX3CR | CCR1-10; CXCR1-6; XCR1; CX3CR1 | Chemokine binding and signalling; chemotaxis of leukocytes; homeostasis of leukocytes | (2, 4, 7)
Receptor groups | Atypical chemokine receptors | ACKR | ACKR1-4 (DARC; D6; CXCR7; CCRL1) | Chemokine binding, but no signalling; resolution of inflammatory response | (33, 34, 35)
Receptor groups | Chemokine receptor-like (chemokine (C-C motif) receptor-like 2) | CCRL | CCRL2 (ACKR5) | Binds CCL5 and CCL19, but no signalling; binds chemerin and presents it to CMKLR1 | (36, 37)
Receptor groups | Chemokine-like receptor 1 | CML | CML1 (CMKLR1; ChemR23) | Binds chemerin, inducing migration of macrophages and dendritic cells; also binds other anti-inflammatory molecules (e.g., Resolvin E1 (RvE1)) | (36)
Receptor groups | Formyl-peptide receptors | FPR | FPR1-3 | TAFA chemokine binding; chemoattraction, modulation of inflammation | (27, 28, 38)
Receptor groups | Putative chemokine receptors | ACKR6, CXCR8 | PITPNM3, CXCR8 (GPR35) | ACKR6/PITPNM3 binds CCL18 (NB: it is not a GPCR); CXCR8/GPR35 binds CXCL17 | (37, 39)
Despite the extensive research on the chemokine system, with over 320,000 articles available on PubMed, many aspects of its evolution remain unclear. For instance, the homology between canonical and noncanonical ligands is uncertain and supported by circumstantial evidence, such as shared specific motifs (12, 25, 40, 46). Furthermore, the relationships between canonical, atypical, and viral receptors and the outgroup of the canonical chemokine receptors remain uncertain. Finally, the evolutionary history of the canonical and noncanonical components remains poorly understood outside a few key model systems (9, 47, 48). These outstanding questions share common underlying causes, including the use of inadequate inference methods (such as relying solely on sequence similarities) and limited sampling of species (e.g., focusing mainly on humans, mice, and zebrafish (7, 49)). In addition, solving the phylogenetic relationships for short molecules such as chemokine receptors and ligands is particularly challenging because of the lack of strong phylogenetic signals (50).
Here, to address these outstanding questions, we combine state-of-the-art phylogenetic methods, including those designed for single-gene phylogenies; a broad taxonomic sampling comprising both vertebrate and invertebrate genomes; and the entire complement of canonical and noncanonical components of both receptors and ligands. Our findings substantially clarify the phylogenetic relationship between canonical and noncanonical ligands and receptors and suggest that unrelated proteins evolved “chemokine-like” ligand function multiple times independently. In addition, we discovered that all the canonical and noncanonical chemokine receptors (except ACKR1) originated from a single duplication in the vertebrate stem group, which also gave rise to many GPCRs. Finally, we characterized the complement of canonical and noncanonical components in the common ancestor of vertebrates and identified several other ligands and receptors with potential chemokine-related properties that could be explored in future functional work.
Results
There are five unrelated groups of ligands
Initially, we focused on the ligands, including all the canonical chemokines, the CYTL, the TAFAs, and the CKLF Super Family (CKLFSF) proteins (Table 1). The presence of a four-transmembrane MARVEL domain in the latter proteins (12, 13, 14) distinguishes them from canonical chemokines, the CYTL and the TAFAs. Therefore, we separated these two groups for further analysis. Using BLASTP or PSI-BLAST (51, 52, 53) (see the Materials and Methods section for more details) against 64 species from 19 animal phyla (Table S1), we identified 891 putative homologs for chemokines, TAFA, and CYTL and 602 putative homologs of the CKLF Super Family.
We used Cluster Analysis of Sequences (CLANS) (54, 55), a clustering tool based on sequence similarity and local alignment, to identify homology within these two groups. Unlike traditional phylogenetic methods, CLANS assigns homology between sequences based on BLAST and customisable stringency levels defined according to P-values (54). When two (or more) sequences are connected at a lower P-value (closer to 0), this indicates a high level of homology. Conversely, if two or more sequences only connect at a higher P-value, this suggests a relatively low level of sequence homology. Our analysis shows that canonical chemokines form a distinct group with a clear distinction between C-X-C-type and C-C-type (Fig 1A), whereas CXCL17, TAFA, and CYTL remain separate from canonical chemokines and from each other even at the loosest P-values tested (Fig 1A). The distinction between CXCL17 and all other canonical chemokines is consistent with our receptor results, showing that the potential receptor for CXCL17, GPR35 (39), is also not within the canonical chemokine receptor group (see below). However, it is important to note that recent studies fail to demonstrate CXCL17 activity at GPR35 (56, 57). Within the CKLFSF, two large clusters were identified, named CKLF I and CKLF II, although these ultimately connect to form one large superfamily (Fig 1B). These clusters are robust to the different stringency thresholds used (Figs S1 and S2 and see the Materials and Methods section for further details). Our results indicate that even when the stringency level to detect homology is relaxed, canonical chemokines, TAFA, CYTL, and CXCL17 remain in distinct clusters. This suggests that, similarly to CKLFs, these proteins are not homologous and convergently evolved chemokine-like properties. We have thus identified five distinct groups of ligands: (i) the canonical chemokines, (ii) TAFA “chemokines,” (iii) CYTL, (iv) CXCL17, and (v) CKLF Super Family (Fig 1A and B).
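The effect of loosening the stringency threshold can be illustrated with a minimal sketch. It is not the CLANS algorithm itself (CLANS lays sequences out in a force-directed graph); it only assumes that two sequences count as connected when their pairwise BLAST P-value is at or below the chosen threshold, and then reports the resulting connected groups. The function name `clusters_at_threshold` and the toy P-values are hypothetical and are not taken from our data.

```python
def clusters_at_threshold(names, pvalues, threshold):
    """Group sequences into connected components.

    Two sequences are linked when their pairwise P-value is <= threshold
    (a simplifying assumption; CLANS itself uses a force-directed layout).
    """
    parent = {name: name for name in names}

    def find(x):
        # Union-find lookup with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), p in pvalues.items():
        if p <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), set()).add(name)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Toy, made-up P-values for illustration only.
names = ["CCL2", "CCL5", "CXCL8", "CXCL10", "TAFA1"]
pvalues = {
    ("CCL2", "CCL5"): 1e-40,
    ("CXCL8", "CXCL10"): 1e-38,
    ("CCL5", "CXCL8"): 1e-8,
}

strict = clusters_at_threshold(names, pvalues, 1e-35)
loose = clusters_at_threshold(names, pvalues, 1e-6)
print(strict)  # CC-type and CXC-type pairs are separate; TAFA1 is isolated
print(loose)   # CC-type and CXC-type merge; TAFA1 is still isolated
```

In this toy example, the CC-type and CXC-type sequences merge into one group only at the looser threshold, whereas the sequence with no link at any tested threshold stays isolated, which mirrors the logic used above to separate the ligand groups.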
Figure 1. Cluster Analysis and phylogeny of ligand groups.
(A) Similarity-based clustering, using Cluster Analysis of Sequences, of canonical chemokines and related molecules with sequence similarity. Canonical chemokines are an independent group from other related molecules (TAFA, CYTL, and CXCL17). Canonical chemokines are composed of two large groups (CC type and CXC type) within which some divergent subgroups are highlighted. The clustering and connections shown are at the P-value threshold of 1 × 10⁻⁶. Other P-values tested are shown in Fig S1. Candidate invertebrate sequences are shown as crosses and further information regarding them can be found in the Supplementary results section. (B) Similarity-based clustering, using Cluster Analysis of Sequences, of the chemokine-like factor (CKLF) super family (CKLFSF). Two major clusters are formed: the smaller “CKLF group I” and the heterogeneous “CKLF group II” that also includes some invertebrate sequences (shown as crosses). Subclades, including the known members of the CKLF super family, are highlighted. The clustering and connections shown are at the P-value threshold of 1 × 10⁻¹⁵, as this is the threshold at which the two major clusters connect. Other P-values tested are shown in Fig S2. (C) Maximum-Likelihood un-rooted phylogenetic tree of canonical chemokines. CC type and CXC type are split into two separate clades. Supports for key nodes are indicated in boxes with Transfer Bootstrap Expectation represented by triangles and the Ultrafast Bootstraps as circles. A traffic light colour code is used to indicate the level of support: high (green), intermediate (yellow), and low (red). (D) Maximum-Likelihood un-rooted phylogenetic tree of the CKLF super family (CKLFSF). The CKLF group I is monophyletic, whereas the CKLF group II is not. Supports for key nodes are indicated in boxes with Transfer Bootstrap Expectation represented by triangles and the Ultrafast Bootstraps as circles. A traffic light colour code is used to indicate the level of support: high (green), intermediate (yellow), and low (red).
Figure S1. Cluster Analysis of Sequences clustering of chemokines and related molecules sequences.
Initial identification and annotation of clusters was performed at the strict P-value of 1 × 10⁻³⁵. (A, B) Subsequent loosening of the P-value clarified the relationships across clusters and defined bigger groups. At P-value 1 × 10⁻¹⁵ (B), two major canonical chemokine groups are well defined: the CCL group, which also includes XCL and CX3CL; and the CXCL group. At this level of stringency, only a few canonical chemokines remain isolated: CCL27/28, CXCL12, CXCL14, CXCL16, and CXCL17. TAFA and CYTL are also isolated. (C) At P-value 1 × 10⁻¹⁰, the two major chemokine groups connect to each other. CCL27/28 is connected to the CCL group and CXCL12 and CXCL14 are connected to the CXCL group, whereas CXCL16 and CXCL17 are still isolated. (D) At P-value 1 × 10⁻⁶, all chemokine groups are connected in one big cluster, except for CXCL17. TAFA and CYTL are also still isolated. Crosses indicate the few invertebrate sequences that were collected from the BLAST search; more information can be found in the Supplementary results section.
Figure S2. Cluster Analysis of Sequences clustering of chemokine-like factor (CKLF) Super Family sequences.
Initial identification and annotation of clusters was performed at the strict P-value of 1 × 10⁻⁶⁰. (A, B) Subsequent loosening of the P-value clarified the relationships across clusters and defined bigger groups. At 1 × 10⁻²⁰ (B), two major clusters have formed. One, which we called chemokine-like factor (CKLF) group I, includes CKLF, CMTM1, 2, 3, 5, and PLP2. The other, which we called CKLF group II, includes CMTM4/6, 7, 8, and other groups. (C) At 1 × 10⁻¹⁶, more sequences have joined the two major groups, which are still separate. (D) At 1 × 10⁻¹⁵, the two major groups connect, together with a few extra sequences; see the Supplementary results section for extra details. Crosses indicate invertebrate sequences.
The evolution of chemokine and chemokine-like ligands in animals
To better understand the evolution of both canonical and noncanonical chemokine ligands, we performed a separate phylogenetic reconstruction for each group (Figs 1C and D and S3, S4, S5, S6, S7, S8, S9, S10, S11, and S12) (see the Materials and Methods section for details). To evaluate the nodal support, in addition to the UltraFast bootstrap (UFB) (58, 59), we used Transfer Bootstrap Expectation (TBE), a method that has been developed for single-gene phylogeny (60). To evaluate ortholog/paralog relationships and the overall dynamics of the ligand complement, we used GeneRax (61). In brief, given a gene tree and a species tree, GeneRax uses a maximum likelihood approach to reconcile the two, optimising the inferred duplication and loss events (61, 62 Preprint).
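To make the difference between TBE and a classical bootstrap proportion concrete, the following minimal sketch computes TBE for a single branch. It assumes the standard definition: the transfer distance between two bipartitions is the minimum number of taxa that must be moved to make them identical, and TBE is one minus the average minimum transfer distance across bootstrap trees, divided by p − 1, where p is the size of the smaller side of the reference branch. Bootstrap trees are represented simply as lists of bipartitions rather than parsed from files; the function names and the six-taxon example are hypothetical.

```python
def transfer_distance(side_a, side_b, n_taxa):
    """Minimum number of taxa to move so that two bipartitions match."""
    diff = len(side_a ^ side_b)       # taxa on which the two sides disagree
    return min(diff, n_taxa - diff)   # also compare against the complement


def tbe_support(ref_side, bootstrap_trees, taxa):
    """Transfer Bootstrap Expectation for one internal reference branch.

    ref_side: taxa on one side of the reference branch.
    bootstrap_trees: list of trees, each a list of bipartition sides.
    """
    taxa = frozenset(taxa)
    ref_side = frozenset(ref_side)
    p = min(len(ref_side), len(taxa) - len(ref_side))
    if p < 2:
        raise ValueError("TBE is only defined for internal branches")
    total = 0
    for splits in bootstrap_trees:
        # Terminal branches are included, so the minimum never exceeds p - 1.
        candidates = [frozenset(s) for s in splits]
        candidates += [frozenset([t]) for t in taxa]
        total += min(transfer_distance(ref_side, s, len(taxa)) for s in candidates)
    return 1 - total / (len(bootstrap_trees) * (p - 1))


taxa = ["A", "B", "C", "D", "E", "F"]
reference_branch = {"A", "B", "C"}
bootstrap_trees = [
    [{"A", "B", "C"}, {"A", "B"}, {"E", "F"}],   # branch recovered exactly
    [{"A", "B"}, {"C", "D"}, {"E", "F"}],        # one taxon (C) misplaced
]
tbe = tbe_support(reference_branch, bootstrap_trees, taxa)
print(tbe)
```

In this example, the branch is recovered exactly in only one of the two replicates, so a classical bootstrap proportion would be 0.5, whereas TBE gives partial credit to the replicate in which a single taxon is misplaced. This behaviour is why TBE is better suited to trees built from short sequences with many taxa.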
Figure S3. Unrooted phylogenetic tree of canonical chemokines with transfer bootstrap expectation supports.
Phylogenetic tree under the model GTR20+F+R4. Nodal support is calculated from 100 nonparametric bootstrap repeats with transfer bootstrap expectation. CCL clade is in orange, CXCL clade in blue.
Figure S4. Unrooted phylogenetic tree of canonical chemokines with UFB supports.
Phylogenetic tree under the model GTR20+F+R4. Nodal support is calculated from 1,000 ultrafast bootstrap repeats. CCL clade is in orange, CXCL clade in blue.
Figure S5. Unrooted phylogenetic tree of TAFA with transfer bootstrap expectation supports.
Phylogenetic tree under the model JTT+R5. Nodal support is calculated from 100 nonparametric bootstrap repeats with transfer bootstrap expectation.
Figure S6. Unrooted phylogenetic tree of TAFA with UFB supports.
Phylogenetic tree under the model JTT+R5. Nodal support is calculated from 1,000 ultrafast bootstrap repeats.
Figure S7. Unrooted phylogenetic tree of CYTL with transfer bootstrap expectation supports.
Phylogenetic tree under the model JTT+I+G4. Nodal support is calculated from 100 nonparametric bootstrap repeats with transfer bootstrap expectation.
Figure S8. Unrooted phylogenetic tree of CYTL with UFB supports.
Phylogenetic tree under the model JTT+I+G4. Nodal support is calculated from 1,000 ultrafast bootstrap repeats.
Figure S9. Unrooted phylogenetic tree of CXCL17 with transfer bootstrap expectation supports.
Phylogenetic tree under the model JTT. Nodal support is calculated from 100 nonparametric bootstrap repeats with transfer bootstrap expectation.
Figure S10. Unrooted phylogenetic tree of CXCL17 with UFB supports.
Phylogenetic tree under the model JTT. Nodal support is calculated from 1,000 ultrafast bootstrap repeats.
Figure S11. Unrooted phylogenetic tree of CKLFSF with transfer bootstrap expectation supports.
Phylogenetic tree under the model GTR20+F+R7. Nodal support is calculated from 100 nonparametric bootstrap repeats with transfer bootstrap expectation. Red clade = CMTM4/6; blue clade = CKLF I group; green clade = CMTM7; turquoise clade = MAL/MALL/MAL2.
Figure S12. Unrooted phylogenetic tree of CKLFSF with UFB supports.
Phylogenetic tree under the model GTR20+F+R7. Nodal support is calculated from 1,000 ultrafast bootstrap repeats. Red clade = CMTM4/6; blue clade = CKLF I group; green clade = CMTM7; turquoise clade = MAL/MALL/MAL2.
Our analysis initially identified a few invertebrate putative chemokine ligands (Fig 1A); however, these sequences lacked protein signatures associated with the canonical ligands (Figs S13, S14, and S15 and Supplementary File 3 in the GitHub repository: Roberto-Feuda-Lab/Chemokine2023 (github.com)), and they were therefore excluded from further analysis (see the Supplementary results section for further information). The phylogenetic tree for the canonical ligands identifies two major groups, the CC type, which also includes the XC and CX3C types, and the CXC type (TBE = 0.95, UFB = 92%) (Figs 1C and S3 and S4), confirming the previous finding obtained using synteny data (63, 64). Next, to clarify the distribution of canonical chemokines, we first reconciled their gene tree with the species tree and then used the reconciled tree to trace the presence/absence of each chemokine group across all species (Figs 2A and S16). Our results confirm previous findings that canonical chemokines are uniquely present in vertebrates (47, 63). In addition, they indicate that chemokines are not evenly distributed across vertebrates and can differ even between closely related species (65). Some are very ancient: for example, CXCL12 is present in lamprey; CXCL14 and CCL20 are present in all jawed vertebrates; and CXCL8 is present throughout bony fishes and tetrapods, with few exceptions, notably mice and rats. However, a large part of the chemokine diversity evolved within mammals (e.g., CXCL1/2/3, CXCL16, and CCL25), particularly placentals (e.g., CXCL5/6 and CCL3/18). The phylogenetic relationships we uncovered in our reconciled tree were mostly compatible with known syntenic relationships as described in human (7). For example, the large cluster of CXC-type chemokine genes present on human chromosome 4 contains CXCL1-11 plus CXCL13 (7), all of which coalesce within a monophyletic group in our tree (Fig 2A). The micro-synteny within this cluster is also, to some extent, reflected in the phylogenetic relationships. Similarly, the other large syntenic cluster of chemokines, located on human chromosome 17, containing most of the CC-type chemokines (7), corresponds, with few exceptions, to a large monophyletic clade in our tree (Fig 2A). CXCL16, which is on a nearby locus of chromosome 17, is also phylogenetically related to this CC-type clade (Fig 2A). The complement of canonical chemokines underwent its largest expansion at the base of jawed vertebrates, from 4 to 18 genes (Fig 2B). A second expansion occurred at the base of bony fishes (i.e., Osteichthyes), followed by relative stability until placental mammals, where the total number of canonical chemokine ligands jumped to 45 genes. Finally, unlike previous works (66), our results support the presence of orthologs of both CC type and CXC type in the common ancestor of all vertebrates (Fig 2A).
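The idea of reading duplication events off a reconciled tree can be sketched with a much simpler technique than the one used here. The example below applies the classic parsimony-based last-common-ancestor (LCA) mapping, not the maximum likelihood model of GeneRax: each gene-tree node is mapped to the smallest species-tree clade containing all of its species, and a node is called a duplication ("D") when it maps to the same clade as one of its children, otherwise a speciation ("S"). Trees are written as nested tuples, and the gene names and toy topology are hypothetical.

```python
def leaf_set(tree):
    """All leaf names below a node of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def all_clades(tree):
    """Leaf sets of every node (including leaves) of a species tree."""
    clades = [leaf_set(tree)]
    if not isinstance(tree, str):
        for child in tree:
            clades.extend(all_clades(child))
    return clades


def lca(clades, species):
    """Smallest species-tree clade that contains all the given species."""
    return min((c for c in clades if species <= c), key=len)


def label_events(gene_tree, gene_to_species, clades, events=None):
    """Return (mapped clade, events), with events listed in post-order."""
    if events is None:
        events = []
    if isinstance(gene_tree, str):
        return frozenset([gene_to_species[gene_tree]]), events
    child_maps = [label_events(c, gene_to_species, clades, events)[0]
                  for c in gene_tree]
    here = lca(clades, frozenset().union(*child_maps))
    # Duplication if the node maps to the same clade as one of its children.
    kind = "D" if here in child_maps else "S"
    events.append((sorted(leaf_set(gene_tree)), kind))
    return here, events


# Toy species tree and a hypothetical gene family with paralogs A and B.
species_tree = ("lamprey", ("zebrafish", ("human", "mouse")))
gene_tree = ("g_lamprey", (("A_human", "A_mouse"), ("B_human", "B_zebrafish")))
gene_to_species = {
    "g_lamprey": "lamprey", "A_human": "human", "A_mouse": "mouse",
    "B_human": "human", "B_zebrafish": "zebrafish",
}

_, events = label_events(gene_tree, gene_to_species, all_clades(species_tree))
for genes, kind in events:
    print(kind, genes)
```

In this toy case, the node separating paralogs A and B is labelled as a duplication and maps to the jawed vertebrate clade of the toy species tree, which is the kind of inference summarised as "S" and "D" labels in the reconciled trees. Unlike this sketch, a likelihood-based reconciliation can also correct the gene tree itself and weigh duplications against losses.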
Figure S13. Alignment of candidate brachiopod CCL24 sequence with mammalian CCL24s.
Our BLAST searches picked up a sequence from the brachiopod Lingula unguis that when re-blasted versus SwissProt returned a CCL24 as hit. Alignment of the brachiopod sequence with mammalian CCL24 sequences reveals a poor overall conservation, with the brachiopod sequence also being significantly longer than any of the other sequences. Further details about this sequence can be found in Supplementary File 3 and in the Supplementary results section.
Figure S14. Alignment of candidate cnidarian CCL3 sequence with mammalian CCL3s.
Our BLAST searches picked up a sequence from the cnidarian Clytia hemisphaerica that when re-blasted versus SwissProt returned a CCL3 as hit. Alignment of the cnidarian sequence with mammalian CCL3 sequences reveals a poor overall conservation, with the cnidarian sequence being much longer than any of the other sequences. Further details about this sequence can be found in Supplementary File 3 and in the Supplementary results section.
Figure S15. Alignment of candidate echinoderm CXCL10 sequence with mammalian CXCL10s.
Our BLAST searches picked up a sequence from the echinoderm Acanthaster planci that when re-blasted versus SwissProt returned a CXCL10 as hit. Alignment of the echinoderm sequence with mammalian CXCL10 sequences reveals a poor overall conservation, with the echinoderm sequence also being significantly longer than any of the other sequences. Further details about this sequence can be found in Supplementary File 3 and in the Supplementary results section.
Figure 2. Distribution and duplication patterns of ligand groups.
(A) The presence of all ligand groups is mapped onto a species tree. Gene trees and duplication events are based on the gene tree to species tree reconciliation analyses. The nomenclature for canonical chemokines is primarily based on known chemokines of human (or mouse). Where human and mouse chemokines do not correspond, the default name refers to the human gene and the mouse (Mus musculus) one is indicated with “Mm.” Chemokines that have been classically described as having either homeostatic or inflammatory function are indicated with a circle or a star, respectively. The classification used here was based on reference 7, with the inflammatory type also including chemokines described there as plasma/platelet types. Overall, canonical chemokines originated in vertebrates and expanded a first time in jawed vertebrates and a second time in mammals. Homeostatic chemokines (e.g., CXCL12) are generally more ancient than inflammatory ones. CXCL17 and CYTL are mammal- and jawed vertebrate-specific, respectively. TAFA originated in the common ancestor of vertebrates and urochordates, whereas the chemokine-like factor super family is present in invertebrates, although key duplications occurred at the base of vertebrates. (B) The number of complements for each ligand group at key species nodes is mapped onto the species tree. The number of complements in each group reflects the pattern of duplications. The major increase occurred at the level of jawed vertebrates, with canonical chemokines undergoing a second significant increase within placentals.
Silhouette images are by Andreas Hejnol (Xenopus laevis); Andy Wilson (Anas platyrhynchos, Taeniopygia guttata); Carlos Cano-Barbacil (Salmo trutta); Christoph Schomburg (Anolis carolinensis, Ciona intestinalis, Eptatretus burgeri, Petromyzon marinus); Christopher Kenaley (Mola mola); Chuanixn Yu (Latimeria chalumnae); Daniel Jaron (Mus musculus); Daniel Stadtmauer (Monodelphis domestica); Fernando Carezzano (Asteroidea); Ingo Braasch (Callorhinchus milii); Jake Warner (Danio rerio); Kamil S. Jaron (Poecilia formosa); Mali’o Kodis, photograph by Hans Hillewaert (Branchiostoma lanceolatum, https://www.phylopic.org/images/719d7b41-cedc-4c97-9ffe-dd8809f85553/branchiostoma-lanceolatum); Margot Michaud (Canis lupus, Physeter macrocephalus); NASA (Homo sapiens sapiens); Nathan Hermann (Scophthalmus aquosus); Ryan Cupo (Rattus norvegicus); seung9park (Takifugu rubripes rubripes); Soledad Miranda-Rottmann (Pelodiscus sinensis, https://www.phylopic.org/images/929fd134-bbd7-4744-987f-1975107029f5/pelodiscus-sinensis); Steven Traver (Gallus gallus domesticus, Ornithorhynchus anatinus); Stuart Humphries (Thunnus thynnus); T. Michael Keesey (after Colin M. L. Burnett) (Gorilla gorilla gorilla); Thomas Hegna (based on picture by Nicolas Gompel) (Drosophila (Drosophila) mojavensis); and Yan Wong (Balanoglossus).
Figure S16. Rooted species tree reconciled gene tree for canonical chemokines.
The canonical chemokines gene tree was reconciled with the species tree using GeneRax. “S” or “D” at the node indicates a speciation or duplication event, respectively. CCL clade is in orange, and CXCL clade is in blue.
Unlike for the canonical chemokines, we identified a bona fide TAFA, that is, one with the specific protein motifs, in the urochordates, the sister group to vertebrates (see the Supplementary results section and Figs S17 and S18). The phylogenetic trees (Figs S5 and S6) identified monophyletic groups for TAFA5 (TBE = 0.98, UFB = 98%), TAFA1 (TBE = 0.94, UFB = 98%), TAFA4 (TBE = 0.77, UFB = 75%), and TAFA2/3 (TBE = 0.65, UFB = 84%). The reconciled tree from GeneRax places the root at the urochordate sequence (Fig S19), therefore clarifying that the TAFA5 clade is the sister group to TAFA1-4 (Fig 2A). The family originated in the ancestor of urochordates and vertebrates, and the first duplications occurred at the base of vertebrates, giving rise to the TAFA5 split followed by the TAFA1 split. Subsequently, at the base of jawed vertebrates, additional duplications brought the complements from 3 to 10 (Fig 2B), giving rise to the remaining groups so that all jawed vertebrates possess the full diversity of TAFAs.
Figure S17. Alignment of four candidate urochordate TAFA sequences with vertebrate TAFAs.
Our BLAST searches picked up four sequences from the urochordate Ciona intestinalis that connected with the TAFA cluster in the Cluster Analysis of Sequences analysis. One of these sequences when blasted versus SwissProt returned a TAFA as hit. This sequence was also annotated as TAFA by InterProScan. Alignment of the urochordate sequences with vertebrate TAFA sequences reveals that only the one annotated as TAFA aligns well, whereas the other three align poorly and are also significantly longer than any of the other sequences. Further details about these sequences can be found in Supplementary File 3 and in the Supplementary results section.
Figure S18. Alignment of best candidate urochordate TAFA sequence with vertebrate TAFAs.
Of the four urochordate candidate TAFA sequences, only one was annotated as TAFA by both SwissProt and InterProScan and appeared to align well with other TAFAs in a preliminary alignment with all urochordate sequences (Fig S17). Here, we removed the other three urochordate sequences and aligned only the best candidate with the vertebrate TAFAs. The sequence conservation is even more apparent with this alignment. Importantly, 8 of the 10 typical cysteine residues of TAFA1–4 are conserved, and the two missing cysteines are the same ones missing in TAFA5. Further discussion can be found in the Supplementary results section.
Figure S19. Rooted species tree reconciled gene tree for TAFA.
The TAFA gene tree was reconciled with the species tree using GeneRax. “S” or “D” at the node indicates a speciation or duplication event, respectively.
The phylogenetic trees for CYTL and CXCL17 mainly reflect the species trees (Figs S7, S8, S9, and S10), and the reconciliations revealed very simple complement dynamics (Figs 2B and S20 and S21). However, these molecules show a remarkable difference in their distribution. CYTLs are present throughout gnathostomes, whereas CXCL17 is found only in placental mammals (Fig 2A).
Figure S20. Rooted species tree reconciled gene tree for CYTL.
The CYTL gene tree was reconciled with the species tree using GeneRax. “S” or “D” at the node indicates a speciation or duplication event, respectively.
Figure S21. Rooted species tree reconciled gene tree for CXCL17.
The CXCL17 gene tree was reconciled with the species tree using GeneRax. “S” or “D” at the node indicates a speciation or duplication event, respectively.
The phylogenetic analysis for the CKLF super family (Figs 1D and S11 and S12) recovered a monophyletic clade for the CKLF I group (TBE = 0.96, UFB = 80%) that we had already identified through CLANS. This group contains CKLF, which is known to interact with C-C chemokine receptor 4 (10, 11), and CMTM1, 2, 3, 5, and proteolipid protein 2 (PLP2). Other monophyletic clades that are consistent with the CLANS are CMTM4/6 (TBE = 0.90, UFB = 61%), CMTM7 (TBE = 0.92, UFB = 83%) and a clade containing CMTM8 plus other related molecules such as plasmolipin (PLLP) and myelin and lymphocyte proteins (MAL) (TBE = 0.89, UFB = 60%). The latter were all part of a large cluster that we called CKLF II in the CLANS (Fig 1B). However, the placement of the root of the tree in Fig 1D can affect the interpretation of the relationships among CKLF II subgroups. To address this problem and clarify the patterns of duplications and the presence/absence of each group throughout animals, we used GeneRax to reconcile the gene tree with the species tree (see above and the Materials and Methods section for details). Our results suggest (Figs 2 and S22) that most CKLFSF groups, such as CMTM4, 6, and 8, originated in the vertebrate stem group from preexisting CMTM genes that are widely distributed in animals. The CKLF I subgroups originated from duplications at the base of jawed vertebrates, except for the split between CKLF and CMTM1, which occurred only within mammals (Fig 2A). We observe the two major expansions of the CKLFSF genes in the stem group of vertebrates (from 6 to 10 complements) and then in jawed vertebrates (from 10 to 16 complements). Interestingly, these expansions are less drastic than those we see for canonical chemokines (Fig 2B). Overall, the five distinct ligand groups have different origins in the animal tree of life and underwent divergent evolutionary histories.
Figure S22. Rooted species tree reconciled gene tree for CKLFSF.
The CKLFSF gene tree was reconciled with the species tree using GeneRax. “S” or “D” at the node indicates a speciation or duplication event, respectively. Red clade = CMTM4/6; blue clade = CKLF I group; green clade = CMTM7; turquoise clade = MAL/MALL/MAL2.
Canonical and noncanonical chemokine receptors are divided into four groups
Next, we investigated the origin and pattern of duplication for the chemokine receptors and chemokine-like receptors (Table 1). Using BLASTP against the 64 species, we identified 7,157 putative chemokine receptors (see the Materials and Methods section for more details) and investigated their relationships using CLANS (see above for justification). The result (Fig S23C) identifies four main groups of chemokine receptors and chemokine-like receptors. The first comprises canonical receptors (i.e., CCR, CXCR, CX3CR1, CX3C, and XCR1), and the second includes atypical receptor 3 and GPR182, which has been recently shown to have chemokine receptor activity (67). The third group, which we named Chemokine-like plus (CML-plus), contains the chemokine-like receptors (CML1 also known as chemerin receptor 1), FPR that bind the TAFA ligands (27, 28) and other GPCRs such as GPR1 (chemerin receptor 2), GPR33, PTGDR2. Furthermore, the CLANS analysis identifies an intermediate group that contains angiotensin, apelin, and other receptors and shows sequence similarity to canonical and chemokine-like receptors (Fig S23B). Finally, our analysis identifies a small cluster composed of only ACKR1 that does not connect to other GPCRs or other atypical receptors even at loose P-value thresholds. This indicates that its sequence is either nonhomologous or highly divergent from other chemokine receptors and atypical receptors. Overall, these groups are robust to the stringency threshold used (i.e., different P-values) (Fig S23A–C). Interestingly, no specific cluster of viral or viral-like receptors was identified, but six of the reference viral receptor sequences clustered with the canonical chemokine receptors.
Figure S23. Cluster Analysis of Sequences clustering of receptors and related molecules sequences.
A Cluster Analysis of Sequences clustering layout where shapes indicate sequences and lines are connections indicating similarity between sequences at or surpassing the P-value similarity threshold. Sequences are positioned in clusters based on similarity. Initial identification and annotation of clusters was performed using the inbuilt convex clustering at the P-value of 1 × 10^−100. (A) Clustering was loosened till the canonical receptor annotated groups formed a cluster at 1 × 10^−65. (B) Loosening of the P-value to 1 × 10^−60 identified relationships between clusters of interest and identified the intermediate group as connecting to both canonical and chemokine-like plus groups. All sequences connected to groups of interest are vertebrate sequences. (C) Further loosening to P-value 1 × 10^−50 connects the vertebrate sequences of interest to a large cluster of sequences which contains vertebrate and invertebrate sequences which are annotated as opioid and somatostatin receptors and other GPCRs. Crosses indicate invertebrate sequences and Y-shape indicates the reference viral sequences included. Shapes are colour-coded by the group of interest: purple = canonical chemokine receptors; yellow = chemokine-like plus; green = atypical receptor 3/GPR182; blue = intermediate group; pink = relaxin receptors.
Altogether, these results confirm the homology between the canonical receptors and atypical receptor 3/GPR182. In addition, these findings indicate that the other GPCRs, such as the chemokine-like receptors, FPR, GPR1, and GPR33, are also closely related to the canonical receptors. Remarkably, these results also indicate that ACKR1 is not homologous to the canonical chemokine receptors. Furthermore, all clusters of chemokine receptors contained only vertebrate sequences, except for the receptors of viral origin.
Canonical and chemokine-like receptors derive from single-gene duplication in the ancestor of vertebrates
Previous studies suggested that the chemokine receptors evolved from a duplication of angiotensin receptors (68) or adrenomedullin receptors (47, 69). However, these works were based on error-prone phylogenetic methods such as Neighbour Joining (69). Our CLANS results indicate that chemokine receptors and chemokine-like receptors have only been observed in vertebrates. Therefore, we focused on invertebrate genomes to clarify the chemokine receptor’s outgroup. To do so, we lowered the P-value thresholds of CLANS (to P-value < 1 × 10^−50) and collected a combined dataset including all chemokine receptor sequences and outgroups (i.e., sequences that connect to the chemokine receptor cluster), resulting in 3,026 sequences. We then inferred a phylogenetic tree from this dataset using maximum likelihood methods with UFB and TBE for evaluating nodal support (see above and Materials and Methods section for details).
Our combined phylogenetic analysis shows strong support for the monophyly of canonical chemokine receptors (UFB = 96, TBE = 1.0), the CML-plus (UFB = 95, TBE = 0.99) and the atypical 3/GPR182 (UFB = 100, TBE = 1) (Figs 3 and S24 and S25). In contrast, viral chemokine receptors are paraphyletic, with three sequences placed within the canonical chemokine receptors and three forming a monophyletic group sister to them (UFB = 84, TBE = 1.0). Our results also suggest that the intermediate group, which includes apelin receptors, angiotensin receptors, bradykinin receptors, and orphan GPCRs (e.g., GPR25; GPR15) forms a monophyletic clade with the canonical chemokine receptors, CML-plus group and atypical3/GPR182 (UFB = 61, TBE = 0.91). However, its position changes between the sister group to canonical chemokine receptors plus atypical3/GPR182 in the TBE tree (TBE = 0.84) and sister to CML plus in the UFB tree (UFB = 38).
Figure 3. Phylogeny of receptor groups.
An unrooted maximum likelihood phylogeny of chemokine receptors. The tree shown is the Transfer Bootstrap Expectation tree including just the chordate specific clade from the Ultrafast Bootstrap tree. Node supports from both Transfer Bootstrap Expectation (triangle) and UFB (circle) shown for equivalent key nodes in boxes with arrows to indicate node. A traffic light colour code is used to indicate the level of support: high (green); intermediate (yellow); and low (red). Key clades highlighted: yellow = chemokine like plus group (CMLplus); blue = intermediate group; green = atypical 3 and GPR182 (ACKR3/GPR182); purple = canonical chemokines (Canonical CKR); and pink = relaxin receptors (RL3R). Branches scaled by amino acid substitutions per site.
Figure S24. Unrooted phylogenetic tree of receptors with transfer bootstrap expectation supports.
Phylogenetic tree of receptor sequences of interest and putative outgroups under the model GTR20+F+G4. Sequences used are a subset of sequences extracted from Cluster Analysis of Sequences; specifically, they are those in the chordate specific clade in the ultrafast bootstrap tree. Nodal support is calculated from 100 nonparametric bootstrap repeats with transfer bootstrap expectation. Branches colour coded by group of interest: purple = canonical chemokine receptors; yellow = chemokine-like plus; green = atypical receptor 3/GPR182; blue = intermediate group; pink = relaxin receptors.
Figure S25. Unrooted phylogenetic tree of receptors with UFB supports.
Phylogenetic tree of all receptor sequences of interest and putative outgroups extracted from clans under the model GTR20+F+G4. Nodal support is calculated from 1,000 ultrafast bootstrap repeats. Branches colour coded by the group of interest: purple = canonical chemokine receptors; yellow = chemokine-like plus; green = atypical receptor 3/GPR182; blue = intermediate group; pink = relaxin receptors.
All the groups mentioned above form a large clade composed of vertebrate-specific GPCRs (UFB = 100, TBE = 0.96) that also includes other GPCRs, such as CLTR and P2RY receptors (Figs 3 and S24 and S25). Another orphan GPCR, GPR35, had been proposed as a potential chemokine receptor (39); however, this was later questioned (56, 57) and GPR35 is still generally considered orphan (70, 71, 72). Our analysis collected GPR35 and placed it within this large vertebrate-specific clade, indicating that it is also a vertebrate-specific gene but not phylogenetically a “canonical” chemokine receptor. The closest outgroup to this clade is composed of a few sequences from urochordates, the sister group of vertebrates (UFB = 49, TBE = 0.91) (Figs 3 and S24 and S25). Interestingly, as the sister group of this clade, we identify a group composed of Relaxin receptors, which contain sequences from both urochordates and vertebrates (UFB = 53, TBE = 0.95). Finally, as the sister group of these large clades, we identified a clade of cephalochordate-specific sequences (UFB = 44).
To clarify the duplication pattern and origin of the chemokine receptors, we used GeneRax (61) (see the Materials and Methods section). Our results indicate (Figs 4A and S26) that all chemokine receptors (except ACKR1) originated from a duplication in the stem lineage of vertebrates. This duplication of an unknown GPCR gave rise to the CML-plus group, the canonical chemokine receptors, the atypical 3/GPR182 group, the intermediate group, and other GPCRs (Figs 4A and S26). This result is consistent with the distribution of the paralogous Relaxin receptors, which are present both in urochordates and vertebrates, and with the position of the orphan urochordate sequences as the sister group of canonical chemokine receptors, CML-plus group, and atypical3/GPR182 and other GPCRs (see above). Furthermore, the phylogenetic relationships among canonical chemokine receptors are overall consistent with the syntenic gene patterns known in human (7). The largest cluster of chemokine receptor genes spans 3 closely located loci on human chromosome 3 (7). It includes most CCRs, XCR, and CX3CR and corresponds to one of the two major monophyletic clades in our tree (Fig 4A). Another example is the mini cluster of CXCR1 and CXCR2, located on human chromosome 2 (7), which we also found to form a monophyletic clade (Fig 4A).
Figure 4. Distribution and duplication patterns of receptor groups.
(A) Presence of all receptor groups is mapped onto a species tree. Gene trees and duplication events are based on the gene tree to species tree reconciliation analyses. The nomenclature for genes is primarily based on human chemokines. The canonical chemokines had five paralogs present in the vertebrate common ancestor. These undergo a heterogeneous pattern of duplication throughout vertebrates with different paralogs duplicating different numbers of times and in different groups of species. Chemokines that have been classically described as having either homeostatic or inflammatory function are indicated with a circle or a star respectively. The classification used here was based on reference 7. (B) Number of complements for each receptor group at key species nodes is mapped onto the species tree. The number of complements in each group reflects the pattern of duplications. The chemokine groups diverged in the vertebrate stem group. The major expansion occurred at the level of jawed vertebrates with canonical chemokine receptors, the chemokine-like receptor plus group and intermediate groups increasing in copy number. Canonical chemokine underwent another small subsequent increase within placentals. Silhouette images are by Andreas Hejnol (Xenopus laevis); Andy Wilson (Anas platyrhynchos, Taeniopygia guttata); Carlos Cano-Barbacil (Salmo trutta); Christoph Schomburg (Anolis carolinensis, Ciona intestinalis, Eptatretus burgeri, Petromyzon marinus); Christopher Kenaley (Mola mola); Chuanixn Yu (Latimeria chalumnae); Daniel Jaron (Mus musculus); Daniel Stadtmauer (Monodelphis domestica); Fernando Carezzano (Asteroidea); Ingo Braasch (Callorhinchus milii); Jake Warner (Danio rerio); Kamil S. Jaron (Poecilia formosa); Mali’o Kodis, photograph by Hans Hillewaert (Branchiostoma lanceolatum, https://www.phylopic.org/images/719d7b41-cedc-4c97-9ffe-dd8809f85553/branchiostoma-lanceolatum); Margot Michaud (Canis lupus, Physeter macrocephalus); NASA (Homo sapiens sapiens); Nathan Hermann (Scophthalmus aquosus); Ryan Cupo (Rattus norvegicus); seung9park (Takifugu rubripes rubripes); Soledad Miranda-Rottmann (Pelodiscus sinensis, https://www.phylopic.org/images/929fd134-bbd7-4744-987f-1975107029f5/pelodiscus-sinensis); Steven Traver (Gallus gallus domesticus, Ornithorhynchus anatinus); Stuart Humphries (Thunnus thynnus); T. Michael Keesey (after Colin M. L. Burnett) (Gorilla gorilla gorilla); Thomas Hegna (based on picture by Nicolas Gompel) (Drosophila (Drosophila) mojavensis); and Yan Wong (Balanoglossus).
Figure S26. Rooted species tree reconciled gene tree for receptors.
The ultrafast bootstrap receptor tree was modified to extract the subtree of the chordate specific clade. This gene tree was reconciled with the species tree using GeneRax. “S” or “D” at the node indicates a speciation or duplication event, respectively. Branches colour coded by group of interest: purple = canonical chemokine receptors; yellow = chemokine-like plus; green = atypical receptor 3/GPR182; blue = intermediate group; pink = relaxin receptors.
We used the reconciliation to better understand the repertoires of receptors present at key nodes during vertebrate evolution. Our results (Fig 4B) show a substantial difference in the duplication pattern of different receptor families. For example, the complement of the atypical3/GPR182 remains constant throughout vertebrate evolution, whereas the canonical and chemokine-like receptor groups expanded dramatically. The canonical chemokine receptors expanded from 5 to 20 genes and the CML-plus from 1 to 11 in the ancestor of the jawed vertebrates (Fig 4B). The expansion of the canonical CKRs is also not evenly distributed across its subgroups, with the ancestral CC type receptors undergoing a series of duplications in jawed vertebrates, whereas the CXCR paralogs did not, specifically one (CXCR4) remains in single copy across all vertebrates. We inferred that in the stem lineage of vertebrates, five canonical chemokine receptor paralogs had already diverged, representing the two major types of receptors (2 CCR and 3 CXCR paralogs). Also present in the stem lineage of vertebrates were ACKR3 and GPR182 and a single-copy gene, which would later diverge to produce all the CML-plus clade.
Discussion
This work substantially clarifies the evolutionary assembly of the chemokine system. Our analysis shows that contrary to the receptors which evolved from a single duplication event in the vertebrate stem group, several unrelated molecules acquired the ability to interact with chemokine receptors over the course of evolutionary history. Furthermore, our results (summarized in Fig 5) suggest that the key components of the chemokine system, including the chemokine receptors themselves, evolved in the stem group of vertebrates in the Cambrian around 500 million years ago and then underwent substantial diversification in the stem group of jawed vertebrates. These findings shed new light on the complex evolutionary history of the chemokine system.
Figure 5. Summary of the evolution of ligands and receptors.
A summary diagram of the evolution of the different chemokine system components. A simplified phylogenetic tree of species is shown, calibrated to time according to reference 73 for Deuterostomia and Bilateria nodes and reference 74 for all other nodes. Circles represent ligand groups, and seven transmembrane domain structure icons represent GPCR groups. Icons are colour-coded by group, and placed adjacent to the branch in the species tree where they first appear. X2 and X5 indicate the number of paralogs present for CXCL ligand group and the canonical CKR groups, respectively, on the branch where they first appear. Question mark refers to the uncertainty regarding the origin of the chemokine-like factor group I in jawed vertebrates or deuterostome stem group (see Fig 2). Geological column is shown along the bottom, in accordance with the ICS International Chronostratigraphic Chart (75).
Unrelated molecules converged to chemokine function
Based on the presence of shared protein motifs, TAFA “chemokines” (25, 26), CXCL17 (76, 77) and CYTL (40) have been proposed to be homologous to chemokine ligands. However, our findings strongly suggest that these molecules are not homologous (Fig 1) and likely acquired the ability to activate a chemokine-like response through convergent evolution. Our conclusions differ from those previous studies (25, 40, 76, 77) because of the differences in data completeness and methodological approach. Specifically, we used a complete set of canonical and noncanonical ligands and assessed the homology using overall sequence similarity rather than single motifs. Our results support and expand upon the findings of (46), which suggested that the presence of a CXC or CC motif is necessary but not sufficient for a protein to be defined as a chemokine ligand. Similarly, CKLF has been considered a “new member” of the chemokine family based on its function (12); we argue that classification based solely on function is insufficient and can be misleading. Instead, we recommend considering the evolutionary relationships among these molecules as the primary criterion for classification.
Most of the canonical and noncanonical ligands are vertebrate innovations
Our results clarify the distribution of canonical chemokine ligands in animals (Fig 2) and confirm that they are present only in vertebrates (47). We identify orthologs of CXCL and CCL ligands in both extant lineages of cyclostomes (Fig 2A). Although chemokines have already been described in lamprey (66, 78, 79), it is the first time, to the best of our knowledge, that they are described also in hagfish. Our findings also indicate that both CC and CXC types were present in the common ancestor of all vertebrates and that few ancestral genes gave rise to the entire diversity of ligands that we know in current animals. Furthermore, our results indicate that many chemokines, such as CXCL1-7, CXCL16, and CCL25, CCL11/13, and CCL2/7, are uniquely present in mammals, suggesting that the mammal ligand repertoire is substantially more complex than the one observed in other vertebrates.
Regarding noncanonical chemokine-like families, our findings indicate that the TAFA family originated in the ancestor of vertebrates and urochordates; CYTL is a novelty of jawed vertebrates; and CXCL17 is mammal-specific and likely unrelated to canonical chemokines (similar to its controversial putative receptor, GPR35 (39, 56, 57), which is not a canonical chemokine receptor). The CKLF superfamily has a more complex pattern, with a few groups present in invertebrates and large expansions occurring at the base of vertebrates. The CKLFSF includes a monophyletic clade (CKLF group I) comprising the original CKLF that binds CCR4, and CMTM1, 2, 3, 5, derived from duplications at the jawed vertebrates stem group. Interestingly, our analysis also revealed that additional molecules not previously considered part of the CKLF super family are closely related to classic members and should be included in it. For example, proteolipid protein 2 (PLP2) belongs to the CKLF I group and is therefore more closely related to the CKLF with chemokine function than several other CKLFSF members. Similarly, CMTM8 is more closely related to plasmolipin (PLLP) and myelin and lymphocyte protein (MAL) than to any of the classic CKLFSF members. Although this relationship had been proposed based only on sequence similarity (13), our phylogenetic analysis provides additional evidence for it. Therefore, the potential chemokine function of all these additional members should be explored in vitro and in vivo in both vertebrates and invertebrates.
Most receptors derive from a single gene duplication
Our results clarify the distribution of canonical chemokine receptors in vertebrates (Fig 4), and their evolutionary relationships and identify the pattern of duplication that leads to their origin (Figs 4A and S26). Unlike previous works (80), we identify that atypical receptors do not form a monophyletic group. Specifically, atypical 2 and 4 are part of the canonical clade specifically related to CC-type receptor subclades. Furthermore, we find that the atypical 3 receptors are related to GPR182, supporting previous functional data suggesting that the latter are ACKRs binding CXCL10, 12, and 13 (67). We attribute these differences to our use of wider GPCR sampling and improved methods for phylogenetic inference.
Remarkably, our results do not identify ACKR1 as related to the main chemokine receptors but rather as a divergent clade (Fig S23). To the best of our knowledge, this is the first time this observation has been made. Our current results do not allow us to clarify the evolutionary origin of ACKR1. However, the presence of 7TMD domains suggests that they are GPCRs that independently acquired the ability to bind chemokines. Alternatively, similarly to other genes evolved in the immune system, ACKR1 may have been subjected to strong selective pressures that substantially changed their sequence, obscuring their phylogenetic relationships. The case of ACKR1 being the most distantly related receptor is intriguing as it is one of the most promiscuous chemokine receptors (2, 81) and it has been shown to bind both CC and CXC chemokines (82, 83).
Viral chemokine receptors represent a cryptic group that can bind multiple chemokines (41, 42). Despite their functional similarity to canonical chemokine receptors, viral chemokine receptors’ evolutionary origin and distribution remain poorly understood. Our results indicate that viral GPCRs do not form a monophyletic group, suggesting that the ability to encode chemokine-like receptors has evolved independently in multiple viruses, including cytomegaloviruses and poxviruses. The placement of viral sequences within an otherwise vertebrate-specific clade supports the hypothesis that viruses acquired these genes through non-vertical inheritance. Given the paraphyly of viral receptors, this appears to have occurred multiple times. However, significant uncertainties remain regarding the details of viral chemokine receptors’ evolution.
Our analysis reveals that the clade comprising apelin receptors, angiotensin receptors, bradykinin receptors, and orphan GPCRs (shown in Figs 3 and 4 and S24, S25, and S26) is closely related to chemokine receptors. This finding partially supports previous studies (68) that suggested a gene duplication event gave rise to both chemokine receptors and angiotensin receptors. Interestingly, we found that single gene duplication in the vertebrate stem group led to the emergence of canonical receptors and atypical 2,3,4, GPR182, chemokine-like receptors, FPR, the intermediate group, and many other known and orphan GPCRs including the controversial putative CXCL17 receptor GPR35. These findings suggest that two rounds of genome duplication (84, 85) played a role in the expansion of GPCR gene families. Future research will focus on investigating the functions of the orphan genes and many-to-one orthologs discovered in urochordates. This will provide further insight into the evolution and diversification of GPCR families in vertebrates.
The molecular assembly of the chemokine system
In this work, we explored the evolution of both ligand and receptor components of the chemokine signaling system, including noncanonical molecules with either chemokine-like function or sequence similarity. Our analysis suggests that the canonical chemokine signaling evolved in the vertebrate stem group (about 500 Mya) likely because of the two rounds of genome duplication that gave rise to many vertebrate novelties (84, 85). We found that the ancestral vertebrate repertoire included orthologs of both major ligand groups (CXCL and CCL) and both CCR and CXCR receptors and noncanonical components such as TAFA and CKLFSF ligands, and the receptors Atypical 3 and GPR182 (Fig 5). The distribution of ligands and receptors in the ancestor of all vertebrates seems to confirm the hypothesis that the ancestral function of chemokines was homeostatic (e.g., CXCL12, CXCL14) with inflammatory functions arising from recent duplications (e.g., CXCL5, CXCL6), potentially reflecting a rapid evolution induced by the selective pressure of new pathogens (7). Chemokine ligand and receptor genes are known to cluster on specific chromosomes (7), consistent with the hypothesis that they may be the result of the combination of en bloc duplication followed by tandem duplications (47, 63, 64). Because of limited high-quality genomes, syntenic patterns of chemokine genes described so far are based primarily on humans and a handful of other species (47, 63, 64), hampering our understanding of the level of conservation of these syntenic patterns. Conversely, our large-scale phylogenetic analyses encompassed many species. We uncovered several phylogenetic relationships that are consistent with known syntenic patterns in human, providing stronger evidence for their evolutionary relationship.
The evolutionary history of canonical components includes several examples of known ligand–receptor pairs following a corresponding pattern of origin and temporal dynamics of duplications. This is true, for example, for the ancient homeostatic CXCL12 ligand and its corresponding receptors CXCR4 and ACKR3, that all originated in early vertebrates (7). The early origin and conservation of CXCR4 and CXCL12 in the ancestor of vertebrates is interesting as this pair plays a key role in the migration of neural crest cells (86)—a key vertebrate innovation (87). This, combined with the fact that homeostatic chemokine ligands/receptors tend to be restricted to monogamous pairing (2, 65) suggests that homeostatic chemokine pairings are more ancient and conserved being in single copy throughout much of the vertebrates. Contrastingly, inflammatory chemokine pairings are more promiscuous, and this could be linked to the more recent duplications in the genes, such as for CCL2/7/8/11/13 (Fig 2A) and their receptors CCR1/2/3/4/5 (Fig 4A). For many of the noncanonical components, however, the ligand–receptor interactions are largely unclear, and their pattern throughout vertebrate evolution remains to be explored. Overall, our results indicate that three waves of molecular innovation in the vertebrates, jawed vertebrates, bony fishes, and mammal stem group increased the chemokine system’s molecular complexity (Fig 5), allowing for the fine-tuning present in modern-day animals.
Materials and Methods
Data mining and dataset assembly
We collected 64 proteomes from 25 vertebrates, six chordates, and 33 other animals covering the whole animal tree (Table S1). BUSCO v4.0.6 (88, 89) and the metazoa_odb10 set of 954 genes were used to evaluate their completeness (Table S1).
To identify potential homologs of canonical chemokines, TAFA chemokines, and CYTL1, we used 207 curated sequences that we obtained from SwissProt (90, 91) as seeds for an initial BLASTP (51, 53) with E-value < 10^−10. To identify putative chemokines in cyclostomes, the lamprey Petromyzon marinus (92), and the hagfish Eptatretus burgeri (93 Preprint), we loosened the E-value to 0.05. Where putative chemokine sequences were found for one cyclostome species but not the other, those sequences were used to search the other species again. Furthermore, to investigate the presence of ligands outside vertebrates, we performed an additional BLASTP on invertebrate proteomes with an even looser E-value (0.1) and collected only up to five hits. This provided 18 initial candidate homologs spanning multiple invertebrate phyla. Further characterisation of these invertebrate sequences, through BLASTP versus SwissProt, protein domains search with InterProScan (94, 95), position in CLANS analysis (see below) and, where necessary, multiple sequence alignments, led us to retain only one urochordate sequence as a putative TAFA homolog (see the Supplementary results section for details).
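The E-value filtering described above can be illustrated with a minimal sketch. It assumes the BLASTP results were saved in the standard 12-column tabular format (E-value in the 11th column); the function name `filter_blast_hits`, the sequence identifiers, and the example rows are hypothetical and are not taken from the original analysis.

```python
from collections import defaultdict


def filter_blast_hits(lines, max_evalue=1e-10, max_hits_per_query=None):
    """Keep hits with E-value below the cutoff, optionally capped per query.

    `lines` are rows of BLAST tabular output (12 tab-separated columns:
    query, subject, ..., evalue, bitscore). This is an illustrative helper,
    not the script used in the study.
    """
    per_query = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue < max_evalue:
            per_query[query].append((evalue, subject))
    kept = {}
    for query, hits in per_query.items():
        hits.sort()  # lowest (best) E-value first
        if max_hits_per_query is not None:
            hits = hits[:max_hits_per_query]
        kept[query] = [subject for _, subject in hits]
    return kept


# Made-up rows for a single hypothetical seed sequence.
example_hits = [
    "seedA\thit1\t90.0\t100\t5\t0\t1\t100\t1\t100\t1e-30\t200",
    "seedA\thit2\t40.0\t80\t40\t2\t1\t80\t5\t85\t1e-5\t60",
    "seedA\thit3\t30.0\t60\t38\t3\t1\t60\t9\t70\t0.05\t35",
]

strict = filter_blast_hits(example_hits)  # default cutoff of 1e-10
loose = filter_blast_hits(example_hits, max_evalue=0.1, max_hits_per_query=5)
```

With the default cutoff only the strongest hit survives, whereas the looser setting (0.1, at most five hits per query) retains all three, mirroring the strict and exploratory searches described in the text.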
To identify homologs for the CKLF superfamily, we used 21 SwissProt-reviewed sequences. In addition to the BLASTP search, we used a position-specific iterative BLAST (PSI-BLAST) (52) with an E-value threshold of < 10^−10. Using this approach, we identified a total of 590 putative homologs, including 186 from invertebrates.
For the chemokine receptors, we ran BLASTP using 178 manually annotated receptor sequences from SwissProt as queries. These include all human canonical receptors and ACKRs (96). We also collected eight viral sequences with chemokine receptor activity from UniProt (97) and performed a second BLASTP. We extracted all BLAST hits with E-values < 10^−10 and used Phobius (98) to predict their transmembrane domain structure. Only sequences with five to eight transmembrane domains were kept. Hit sequences were annotated by their top five BLAST hits against SwissProt. All hits from both BLASTs were merged and filtered by cd-hit (99, 100) to remove redundant sequences at the 95% similarity threshold. This resulted in 7,157 putative chemokine GPCR sequences.
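The transmembrane-domain filter can be sketched as follows. The sketch assumes that the number of predicted transmembrane helices per sequence has already been extracted from the predictor output into a dictionary; the parsing step is not shown, and the function name and sequence identifiers are hypothetical.

```python
def keep_by_tm_count(tm_counts, min_tm=5, max_tm=8):
    """Return the IDs whose predicted TM-helix count lies in [min_tm, max_tm].

    `tm_counts` maps sequence ID -> predicted number of transmembrane
    domains (assumed to have been parsed beforehand).
    """
    return sorted(
        seq_id for seq_id, n_tm in tm_counts.items() if min_tm <= n_tm <= max_tm
    )


# Made-up predictions for four hypothetical BLAST hits.
predicted = {"hitA": 7, "hitB": 4, "hitC": 8, "hitD": 9}
retained = keep_by_tm_count(predicted)
```

Allowing five to eight domains rather than exactly seven tolerates fragmentary gene models and prediction errors while still excluding sequences that are unlikely to be GPCRs.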
Identification of subgroups with CLANS
We used CLANS (54, 55) with default parameters and different P-values (i.e., stringency values) to visualize the relationships between subgroups of ligands and receptors. We assessed the similarity and interrelationships between different clusters by gradually relaxing the P-value threshold (Figs S1, S2, and S13). In addition, we annotated each cluster using gene annotations for key species Homo, Mus, Gorilla, Gallus, Anolis, and Danio. In the case of the receptors, to improve the cluster annotation all human Class-A GPCRs (excluding olfactory receptors) from GPCRdb (101) were added to the dataset and the eight seed viral chemokine receptors from UniProt (97).
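The effect of relaxing the P-value threshold can be illustrated with a simplified sketch. It is not the CLANS algorithm (which uses a force-directed layout and its own clustering routines); instead, it treats sequences as nodes, connects pairs whose P-value is at or below the threshold, and reports connected components. The function name, sequence labels, and P-values are hypothetical.

```python
def clusters_at_threshold(pairs, threshold):
    """Group sequences linked by pairwise P-values <= threshold.

    `pairs` is an iterable of (seq_a, seq_b, p_value). Returns a sorted list
    of sorted clusters (single-linkage connected components, via union-find).
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b, p_value in pairs:
        root_a, root_b = find(a), find(b)  # also registers both sequences
        if p_value <= threshold and root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for seq in list(parent):
        groups.setdefault(find(seq), []).append(seq)
    return sorted(sorted(group) for group in groups.values())


# Made-up similarities between four hypothetical sequences.
similarities = [
    ("a", "b", 1e-120),
    ("b", "c", 1e-70),
    ("c", "d", 1e-55),
]

for cutoff in (1e-100, 1e-65, 1e-50):
    print(cutoff, clusters_at_threshold(similarities, cutoff))
```

At the strictest cutoff only `a` and `b` are joined; each looser cutoff merges one further sequence into the cluster, which is the behaviour used in the text to judge how closely groups are related.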
Alignment and phylogenetic analysis
Alignment
All ligand and receptor sequences were aligned using MAFFT (102, 103) with the --auto setting, and the alignments were trimmed with trimAl (104) to remove positions with >70% gaps.
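The gap-based trimming criterion can be illustrated with a short sketch. This is a plain re-implementation of the stated rule (drop alignment columns in which more than 70% of sequences have a gap), not trimAl itself; the function name and the toy alignment are hypothetical.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.7):
    """Remove columns whose gap fraction exceeds `max_gap_fraction`.

    `alignment` maps sequence name -> aligned sequence; all sequences must
    have the same length and gaps are written as '-'.
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("aligned sequences must have equal length")
    keep = [
        i
        for i in range(length)
        if sum(seq[i] == "-" for seq in seqs) / len(seqs) <= max_gap_fraction
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in zip(names, seqs)}


# Toy alignment: column 3 is all gaps and column 4 has 75% gaps.
toy = {
    "s1": "MK-LA",
    "s2": "M---A",
    "s3": "M---G",
    "s4": "MR--A",
}
trimmed = trim_gappy_columns(toy)
```

In the toy alignment the second column (50% gaps) is retained, while the third (100% gaps) and fourth (75% gaps) are removed.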
Gene trees
All gene alignments were analysed using IQTREE2 (105); the model test algorithm (106) was used to select the best substitution model for each analysis. The best models selected by IQTREE2 for each set are listed in Table S2 (for receptors we manually selected GTR20+F+G4 as the model as it was a large dataset). Nodal support was estimated using 1,000 UFB (58, 59) replicates. All analyses were repeated to run 100 nonparametric bootstrap repeats to calculate nodal support with TBE which is specifically designed to account for phylogenetic instability (60).
Table S2. Substitution models used in the phylogenetic analyses.
For the receptors, because of the high computational burden of running TBE analyses on sequence-dense datasets, we first analysed the full set of 3,026 sequences connected in CLANS at a P-value of < 1 × 10^−50 using UFB (Fig S25). Then, we extracted the chordate-specific clade sequences, including all chemokine receptor groups and their immediate outgroups, to analyse using TBE.
Gene tree–species tree reconciliation
To understand the pattern of duplication and the evolution of gene complement, we used GeneRax (61). GeneRax requires a gene tree, which was obtained as described above, and a species tree, which we constructed manually using publicly available information. Where the gene tree contained polytomies, we used ETE3 (107) to resolve them. The undated DL mode and the closest approximation of the best-fitting substitution model were used for each alignment. To track the evolution of sub-lineages within each group, we used annotated sequences of key species (e.g., Homo sapiens and Mus musculus) as reference. For the receptors, we used only the chordate-specific clade subtree and sequences because of the computational burden of running GeneRax on a high number of sequences. For species tree–gene tree reconciliation, we treated the viral sequences as human sequences.
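To give an intuition for the “S” and “D” labels in the reconciled trees, the sketch below applies the simple species-overlap rule to a rooted, fully bifurcating gene tree: a node is called a duplication if its two child subtrees share at least one species, and a speciation otherwise. This is a deliberately simplified stand-in and not the likelihood-based reconciliation performed by GeneRax; the tree encoding (nested tuples with `species|gene` leaf labels), the function name, and the gene names are hypothetical.

```python
def label_events(tree, events=None):
    """Label internal nodes of a binary gene tree as 'S' or 'D'.

    `tree` is either a leaf string of the form "species|gene" or a 2-tuple
    of subtrees. Returns (set of species below the node, list of event
    labels for internal nodes in post-order).
    """
    if events is None:
        events = []
    if isinstance(tree, str):
        return {tree.split("|")[0]}, events
    left_species, _ = label_events(tree[0], events)
    right_species, _ = label_events(tree[1], events)
    # Shared species on both sides imply the split predates those species.
    events.append("D" if left_species & right_species else "S")
    return left_species | right_species, events


# Hypothetical gene tree: two paralogs in human and mouse, one lamprey gene.
gene_tree = (
    (("human|geneA1", "mouse|geneA1"), ("human|geneA2", "mouse|geneA2")),
    "lamprey|geneA",
)
species, events = label_events(gene_tree)
```

Here the two human/mouse pairs are speciations, their common ancestor is a duplication (both child clades contain human and mouse), and the root separating the lamprey gene is a speciation. Counting the “D” labels assigned to each species-tree branch is what yields the per-node gene complements discussed in the Results.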
Supplementary results
Exploration of candidate invertebrate homologs to canonical chemokines and related molecules
To explore the possibility of finding canonical chemokines and/or related molecules (TAFA and CYTL) outside of vertebrates, we used BLASTP with a loose E-value threshold of 0.1 to search 39 invertebrate proteomes (see Table S2). Eighteen candidate sequences were collected and explored further.
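The effect of a loose E-value cut-off can be sketched as a simple filter over tabular BLAST results. The snippet assumes the default 12-column tabular layout of BLAST+ (-outfmt 6), in which the E-value is the 11th column; the function name, the sequence identifiers, and the hit values are hypothetical and do not come from the study.

```python
def filter_blast_hits(lines, max_evalue=0.1):
    """Return (query, subject, evalue) for tabular BLAST hits at or below the threshold."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])  # 11th column of the default tabular layout
        if evalue <= max_evalue:
            hits.append((fields[0], fields[1], evalue))
    return hits


# Hypothetical hits: query, subject, % identity, length, mismatches, gap opens,
# query start/end, subject start/end, E-value, bit score.
toy_lines = [
    "CCL3_query\tinvert_prot_1\t28.5\t70\t45\t2\t25\t92\t40\t108\t0.04\t32.1",
    "CCL3_query\tinvert_prot_2\t24.0\t50\t36\t1\t30\t78\t12\t61\t2.3\t27.0",
    "TAFA1_query\tinvert_prot_3\t41.2\t85\t48\t1\t40\t123\t33\t117\t3e-12\t61.5",
]
candidates = filter_blast_hits(toy_lines)
```

With a threshold of 0.1, the weak first hit is kept alongside the strong third hit, which illustrates why candidates collected this way need the follow-up checks described in the text.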
From the clustering analysis in CLANS (Figs 1A and S1A–D and Supplementary File 1 in the GitHub repository: Roberto-Feuda-Lab/Chemokine2023 (github.com)), it became apparent that four sequences were candidate TAFAs, whereas the remaining 14 sequences connected loosely to the canonical chemokines. We then performed BLASTP (51, 53) versus the curated SwissProt dataset (90, 91) and collected the first five hits. Furthermore, we used InterProScan (94, 95) to identify protein signatures. See Supplementary File 3 in the GitHub repository: Roberto-Feuda-Lab/Chemokine2023 (github.com) for a summary of all these results for all sequences.
Regarding the canonical chemokines, only three sequences received chemokine-related annotations from the BLASTP search against SwissProt: one brachiopod sequence (Lingula unguis) as candidate CCL24, one cnidarian sequence (Clytia hemisphaerica) as candidate CCL3, and one echinoderm sequence (Acanthaster planci) as candidate CXCL10. Although none of these sequences were categorised as chemokines by InterProScan, we examined them further. First, all three sequences were markedly longer than their vertebrate counterparts. Second, none of the three sequences possessed a signal peptide, as predicted with the SignalP 6.0 online tool (https://services.healthtech.dtu.dk/service.php?SignalP-6.0), whereas a signal peptide is expected in secreted proteins such as chemokines (108). Finally, we aligned the sequences (MAFFT --auto) with their respective candidate relatives and found poor conservation (Figs S13, S14, and S15). This lack of evidence that the sequences are true chemokines led us to exclude all invertebrate candidates from further analyses of canonical chemokines.
The four candidate invertebrate TAFA sequences all belong to the urochordate Ciona intestinalis. One sequence was annotated as TAFA by both SwissProt and InterProScan, whereas the other three appear to be prolyl hydroxylases (see Supplementary File 3 in the GitHub repository: Roberto-Feuda-Lab/Chemokine2023 (github.com)). We nevertheless studied all four sequences further and found that the only one possessing a signal peptide is the sequence that received the TAFA annotation. Moreover, the other three sequences appear to be too long and align poorly with vertebrate TAFAs (Fig S17). The TAFA-annotated sequence was of the expected length and showed sequence conservation in the alignments (Figs S17 and S18). Interestingly, it possesses 8 of the 10 typical cysteine residues of TAFA1–4, and the two missing cysteines are the same ones that are missing in TAFA5. Considering that TAFA5 is the sister group to TAFA1–4 and that the urochordate sequence is placed as orthologous to all TAFAs (see main text Results), it is reasonable to conclude that the ancestral TAFAs possessed eight cysteine residues and that the additional cysteines are a novelty of the TAFA1–4 lineage.
Taken together, these results show that whereas canonical chemokines are indeed a vertebrate innovation, TAFA “chemokines” likely originated in the ancestor of vertebrates and urochordates.
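The cysteine comparison described for the urochordate TAFA candidate can be expressed as a small check on aligned sequences: for each query, list the alignment columns where a reference sequence has a cysteine and the query does not, then test whether two queries lack the same positions. The sketch below uses short made-up strings, not real TAFA sequences, and the function names are assumptions.

```python
def cysteine_columns(aligned_seq):
    """Alignment columns (0-based) that hold a cysteine."""
    return {i for i, residue in enumerate(aligned_seq) if residue == "C"}


def missing_cysteines(reference, query):
    """Columns where the reference has a cysteine and the query does not."""
    if len(reference) != len(query):
        raise ValueError("sequences must come from the same alignment")
    return sorted(cysteine_columns(reference) - cysteine_columns(query))


# Toy aligned fragments (hypothetical, for illustration only).
reference = "MCKCAGCTCL"   # stands in for a sequence with the full cysteine set
query_a   = "MCKSAGCTAL"   # stands in for a sequence lacking two cysteines
query_b   = "MCRSAGCSAL"   # stands in for a second sequence lacking two cysteines

missing_a = missing_cysteines(reference, query_a)
missing_b = missing_cysteines(reference, query_b)
same_positions_missing = missing_a == missing_b
```

When two divergent sequences lack cysteines at the same aligned positions, a single ancestral state without those cysteines is a more economical explanation than two independent losses, which is the reasoning applied to TAFA5 and the urochordate sequence.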
Exclusion of some sequences from the CKLFSF dataset
Data mining with BLASTP and PSI-BLAST (52) yielded numerous candidate CKLFSF homologs in both vertebrates and invertebrates, and these were all included in a clustering analysis with CLANS (Figs 1C and S2A–D). Two main clusters emerged, which we called “CKLF I” and “CKLF II.” Whereas CKLF I was vertebrate specific, CKLF II included multiple invertebrate sequences. Although the two main clusters are already well defined at P-values of ∼1 × 10−20 (Fig S2B and Supplementary File 2 in the GitHub repository: Roberto-Feuda-Lab/Chemokine2023 (github.com)), they only connect to each other at 1 × 10−15. At this P-value, four additional sequences connected loosely to the CKLF II cluster. These were three echinoderm sequences (all from Stichopus japonicus) and one sequence from the placozoan Trichoplax adhaerens. The latter is the only non-bilaterian sequence collected from the original BLAST searches. These sequences not only joined the CKLFSF cluster at the limit threshold, but also connected only to sequences that were already marginal, and were therefore only indirectly connected to the core of the cluster. As above, we examined the sequences with BLAST against SwissProt and with InterProScan (see Supplementary File 3 in the GitHub repository: Roberto-Feuda-Lab/Chemokine2023 (github.com)). The evidence in favour of keeping these sequences was scant (see details in Supplementary File 3 in the GitHub repository: Roberto-Feuda-Lab/Chemokine2023 (github.com)), and we decided to exclude them from downstream phylogenetic analyses. The CKLFSF dataset therefore did not include any non-bilaterian sequences, although multiple bilaterian invertebrate phyla were represented.
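The way clusters separate or merge as the P-value threshold is relaxed can be illustrated with a small single-linkage grouping over pairwise P-values. CLANS itself produces a force-directed layout, so this sketch only mimics the connectivity reading used in the text; the function name, sequence labels, and P-values are invented for illustration.

```python
def clusters_at_threshold(pairs, threshold):
    """Group sequences linked by any pairwise P-value at or below the threshold.

    pairs: iterable of (seq_a, seq_b, p_value).
    Returns a list of sorted clusters, largest first.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b, p_value in pairs:
        root_a, root_b = find(a), find(b)
        if p_value <= threshold and root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for name in list(parent):
        groups.setdefault(find(name), []).append(name)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g))


# Hypothetical pairwise P-values: two tight clusters (A*, B*), a weak bridge
# between them, and one marginal sequence X attached to an edge of cluster B.
toy_pairs = [
    ("A1", "A2", 1e-40),
    ("B1", "B2", 1e-35),
    ("A2", "B1", 1e-16),
    ("B2", "X", 5e-16),
]
strict = clusters_at_threshold(toy_pairs, 1e-20)
relaxed = clusters_at_threshold(toy_pairs, 1e-15)
```

At the strict threshold the two clusters are distinct and X is unconnected; at the relaxed threshold everything merges, with X joining only through a peripheral member, which mirrors the pattern that motivated excluding the marginal sequences.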
Data Availability
Supplementary material and raw output files for all the analyses described in this article are available at the GitHub repository: Roberto-Feuda-Lab/Chemokine2023 (github.com). All data are also deposited in the Zenodo repository: (109).
Supplementary Material
Acknowledgements
This work is supported by a University Research Fellowship (UF160226) to R Feuda. A Aleotti is supported by a Research Grant from the Royal Society to R Feuda (RGF\R1\181012). M Goulty is supported by a PhD Scholarship from the University of Leicester. C Lewis is supported by a BBSRC MIBTP fellowship. This research used the ALICE High-Performance Computing Facility at the University of Leicester.
Author Contributions
A Aleotti: conceptualization, data curation, formal analysis, investigation, and writing—original draft, review, and editing.
M Goulty: conceptualization, data curation, formal analysis, investigation, methodology, and writing—original draft, review, and editing.
C Lewis: data curation, formal analysis, and visualization.
F Giorgini: conceptualization, supervision, methodology, and writing—original draft, review, and editing.
R Feuda: conceptualization, formal analysis, supervision, funding acquisition, project administration, and writing—original draft, review, and editing.
Conflict of Interest Statement
The authors declare that they have no conflict of interest.
References
- 1. Zhang K, Shi S, Han W (2018) Research progress in cytokines with chemokine-like function. Cell Mol Immunol 15: 660–662. 10.1038/cmi.2017.121
- 2. Chen K, Bao Z, Tang P, Gong W, Yoshimura T, Wang JM (2018) Chemokines in homeostasis and diseases. Cell Mol Immunol 15: 324–334. 10.1038/cmi.2017.134
- 3. Blanchet X, Langer M, Weber C, Koenen R, von Hundelshausen P (2012) Touch of chemokines. Front Immunol 3: 175. 10.3389/fimmu.2012.00175
- 4. López-Cotarelo P, Gómez-Moreira C, Criado-García O, Sánchez L, Rodríguez-Fernández JL (2017) Beyond chemoattraction: Multifunctionality of chemokine receptors in leukocytes. Trends Immunol 38: 927–941. 10.1016/j.it.2017.08.004
- 5. Tran PB, Miller RJ (2003) Chemokine receptors: Signposts to brain development and disease. Nat Rev Neurosci 4: 444–455. 10.1038/nrn1116
- 6. Moser B, Wolf M, Walz A, Loetscher P (2004) Chemokines: Multiple levels of leukocyte migration control. Trends Immunol 25: 75–84. 10.1016/j.it.2003.12.005
- 7. Zlotnik A, Yoshie O (2012) The chemokine superfamily revisited. Immunity 36: 705–716. 10.1016/j.immuni.2012.05.008
- 8. Zlotnik A, Yoshie O (2000) Chemokines: A new classification system and their role in immunity. Immunity 12: 121–127. 10.1016/s1074-7613(00)80165-x
- 9. Nomiyama H, Osada N, Yoshie O (2011) A family tree of vertebrate chemokine receptors for a unified nomenclature. Dev Comp Immunol 35: 705–715. 10.1016/j.dci.2011.01.019
- 10. Wang Y, Zhang Y, Yang X, Han W, Liu Y, Xu Q, Zhao R, Di C, Song Q, Ma D (2006) Chemokine-like factor 1 is a functional ligand for CC chemokine receptor 4 (CCR4). Life Sci 78: 614–621. 10.1016/j.lfs.2005.05.070
- 11. Wang Y, Zhang Y, Han W, Li D, Tian L, Yin C, Ma D (2008) Two C-terminal peptides of human CKLF1 interact with the chemokine receptor CCR4. Int J Biochem Cell Biol 40: 909–919. 10.1016/j.biocel.2007.10.028
- 12. Liu D-D, Song XY, Yang PF, Ai QD, Wang YY, Feng XY, He X, Chen NH (2018) Progress in pharmacological research of chemokine like factor 1 (CKLF1). Cytokine 102: 41–50. 10.1016/j.cyto.2017.12.002
- 13. Han W, Ding P, Xu M, Wang L, Rui M, Shi S, Liu Y, Zheng Y, Chen Y, Yang T, et al. (2003) Identification of eight genes encoding chemokine-like factor superfamily members 1–8 (CKLFSF1–8) by in silico cloning and experimental validation. Genomics 81: 609–617. 10.1016/s0888-7543(03)00095-8
- 14. Duan H-J, Li X-Y, Liu C, Deng X-L (2020) Chemokine-like factor-like MARVEL transmembrane domain-containing family in autoimmune diseases. Chin Med J 133: 951–958. 10.1097/CM9.0000000000000747
- 15. Han W, Lou Y, Tang J, Zhang Y, Chen Y, Li Y, Gu W, Huang J, Gui L, Tang Y, et al. (2001) Molecular cloning and characterization of chemokine-like factor 1 (CKLF1), a novel human cytokine with unique structure and potential chemotactic activity. Biochem J 357: 127–135. 10.1042/0264-6021:3570127
- 16. Wang L, Wu C, Zheng Y, Qiu X, Wang L, Fan H, Han W, Lv B, Wang Y, Zhu X, et al. (2004) Molecular cloning and characterization of chemokine-like factor super family member 1 (CKLFSF1), a novel human gene with at least 23 alternative splicing isoforms in testis tissue. Int J Biochem Cell Biol 36: 1492–1501. 10.1016/j.biocel.2003.11.017
- 17. Jin C, Ding P, Wang Y, Ma D (2005) Regulation of EGF receptor signaling by the MARVEL domain-containing protein CKLFSF8. FEBS Lett 579: 6375–6382. 10.1016/j.febslet.2005.10.021
- 18. Wang Z-Z, Li G, Chen XY, Zhao M, Yuan YH, Wang XL, Chen NH (2010) Chemokine-like factor 1, a novel cytokine, induces nerve cell migration through the non-extracellular Ca2+-dependent tyrosine kinases pathway. Brain Res 1308: 24–34. 10.1016/j.brainres.2009.10.047
- 19. Li T, Zhong J, Chen Y, Qiu X, Zhang T, Ma D, Han W (2006) Expression of chemokine-like factor 1 is upregulated during T lymphocyte activation. Life Sci 79: 519–524. 10.1016/j.lfs.2006.01.042
- 20. Zhang Y, Tian L, Zheng Y, Qi H, Guo C, Sun Q, Xu E, Zhang Y, Ma D, Wang Y (2011) C-terminal peptides of chemokine-like factor 1 signal through chemokine receptor CCR4 to cross-desensitize the CXCR4. Biochem Biophys Res Commun 409: 356–361. 10.1016/j.bbrc.2011.05.047
- 21. Li H, Li J, Su Y, Fan Y, Guo X, Li L, Su X, Rong R, Ying J, Mo X, et al. (2014) A novel 3p22.3 gene CMTM7 represses oncogenic EGFR signaling and inhibits cancer cell growth. Oncogene 33: 3109–3118. 10.1038/onc.2013.282
- 22. Wang X, Li T, Wang W, Yuan W, Liu H, Cheng Y, Wang P, Zhang Y, Han W (2016) Cytokine-like 1 chemoattracts monocytes/macrophages via CCR2. J Immunol 196: 4090–4099. 10.4049/jimmunol.1501908
- 23. Zhu S, Kuek V, Bennett S, Xu H, Rosen V, Xu J (2019) Protein Cytl1: Its role in chondrogenesis, cartilage homeostasis, and disease. Cell Mol Life Sci 76: 3515–3523. 10.1007/s00018-019-03137-x
- 24. Xue H, Li S, Zhao X, Guo F, Jiang L, Wang Y, Zhu F (2020) CYTL1 promotes the activation of neutrophils in a sepsis model. Inflammation 43: 274–285. 10.1007/s10753-019-01116-9
- 25. Tom Tang Y, Emtage P, Funk WD, Hu T, Arterburn M, Park EE, Rupp F (2004) TAFA: A novel secreted family with conserved cysteine residues and restricted expression in the brain. Genomics 83: 727–734. 10.1016/j.ygeno.2003.10.006
- 26. Sarver DC, Lei X, Wong GW (2021) FAM19A (TAFA): An emerging family of neurokines with diverse functions in the central and peripheral nervous system. ACS Chem Neurosci 12: 945–958. 10.1021/acschemneuro.0c00757
- 27. Wang W, Li T, Wang X, Yuan W, Cheng Y, Zhang H, Xu E, Zhang Y, Shi S, Ma D, et al. (2015) FAM19A4 is a novel cytokine ligand of formyl peptide receptor 1 (FPR1) and is able to promote the migration and phagocytosis of macrophages. Cell Mol Immunol 12: 615–624. 10.1038/cmi.2014.61
- 28. Park MY, Kim HS, Lee M, Park B, Lee HY, Cho EB, Seong JY, Bae YS (2017) FAM19A5, a brain-specific chemokine, inhibits RANKL-induced osteoclast formation through formyl peptide receptor 2. Sci Rep 7: 15575. 10.1038/s41598-017-15586-0
- 29. Zheng C, Chen D, Zhang Y, Bai Y, Huang S, Zheng D, Liang W, She S, Peng X, Wang P, et al. (2018) FAM19A1 is a new ligand for GPR1 that modulates neural stem-cell proliferation and differentiation. FASEB J 32: 5874–5890. 10.1096/fj.201800020rrr
- 30. Wang X, Shen C, Chen X, Wang J, Cui X, Wang Y, Zhang H, Tang L, Lu S, Fei J, et al. (2018) Tafa-2 plays an essential role in neuronal survival and neurobiological function in mice. Acta Biochim Biophys Sinica 50: 984–995. 10.1093/abbs/gmy097
- 31. Wang Y, Chen D, Zhang Y, Wang P, Zheng C, Zhang S, Yu B, Zhang L, Zhao G, Ma B, et al. (2018) Novel adipokine, FAM19A5, inhibits neointima formation after injury through sphingosine-1-phosphate receptor 2. Circulation 138: 48–63. 10.1161/CIRCULATIONAHA.117.032398
- 32. Okada J, Yamada E, Saito T, Ozawa A, Nakajima Y, Pessin JE, Okada S, Yamada M (2019) Analysis of FAM19A2/TAFA-2 function. Physiol Behav 208: 112581. 10.1016/j.physbeh.2019.112581
- 33. Bonecchi R, Graham GJ (2016) Atypical chemokine receptors and their roles in the resolution of the inflammatory response. Front Immunol 7: 224. 10.3389/fimmu.2016.00224
- 34. Nibbs RJB, Graham GJ (2013) Immune regulation by atypical chemokine receptors. Nat Rev Immunol 13: 815–829. 10.1038/nri3544
- 35. Bachelerie F, Graham GJ, Locati M, Mantovani A, Murphy PM, Nibbs R, Rot A, Sozzani S, Thelen M (2014) New nomenclature for atypical chemokine receptors. Nat Immunol 15: 207–208. 10.1038/ni.2812
- 36. Yoshimura T, Oppenheim JJ (2011) Chemokine-like receptor 1 (CMKLR1) and chemokine (C–C motif) receptor-like 2 (CCRL2); two multifunctional receptors with unusual properties. Exp Cell Res 317: 674–684. 10.1016/j.yexcr.2010.10.023
- 37. Lokeshwar BL, Kallifatidis G, Hoy JJ (2020) Chapter one - Atypical chemokine receptors in tumor cell growth and metastasis. In Advances in Cancer Research, GPCR Signaling in Cancer, Shukla AK (ed), pp 1–27. Cambridge, MA: Academic Press.
- 38. He H-Q, Ye RD (2017) The formyl peptide receptors: Diversity of ligands and mechanism for recognition. Molecules 22: 455. 10.3390/molecules22030455
- 39. Maravillas-Montero JL, Burkhardt AM, Hevezi PA, Carnevale CD, Smit MJ, Zlotnik A (2015) Cutting edge: GPR35/CXCR8 is the receptor of the mucosal chemokine CXCL17. J Immunol 194: 29–33. 10.4049/jimmunol.1401704
- 40. Tomczak A, Pisabarro MT (2011) Identification of CCR2-binding features in Cytl1 by a CCL2-like chemokine model. Proteins 79: 1277–1292. 10.1002/prot.22963
- 41. Kledal TN, Rosenkilde MM, Schwartz TW (1998) Selective recognition of the membrane-bound CX3C chemokine, fractalkine, by the human cytomegalovirus-encoded broad-spectrum receptor US28. FEBS Lett 441: 209–214. 10.1016/s0014-5793(98)01551-8
- 42. Miles TF, Spiess K, Jude KM, Tsutsumi N, Burg JS, Ingram JR, Waghray D, Hjorto GM, Larsen O, Ploegh HL, et al. (2018) Viral GPCR US28 can signal in response to chemokine agonists of nearly unlimited structural degeneracy. Elife 7: e35850. 10.7554/eLife.35850
- 43. Rosenkilde MM, Smit MJ, Waldhoer M (2008) Structure, function and physiological consequences of virally encoded chemokine seven transmembrane receptors. Br J Pharmacol 153: S154–S166. 10.1038/sj.bjp.0707660
- 44. Daiyasu H, Nemoto W, Toh H (2012) Evolutionary analysis of functional divergence among chemokine receptors, decoy receptors, and viral receptors. Front Microbiol 3: 264. 10.3389/fmicb.2012.00264
- 45. Meyrath M, Szpakowska M, Zeiner J, Massotte L, Merz MP, Benkel T, Simon K, Ohnmacht J, Turner JD, Krüger R, et al. (2020) The atypical chemokine receptor ACKR3/CXCR7 is a broad-spectrum scavenger for opioid peptides. Nat Commun 11: 3033. 10.1038/s41467-020-16664-0
- 46. Denisov SS (2021) CXCL17: The black sheep in the chemokine flock. Front Immunol 12: 712897. 10.3389/fimmu.2021.712897
- 47. DeVries ME, Kelvin AA, Xu L, Ran L, Robinson J, Kelvin DJ (2006) Defining the origins and evolution of the chemokine/chemokine receptor system. J Immunol 176: 401–415. 10.4049/jimmunol.176.1.401
- 48. Bajoghli B (2013) Evolution and function of chemokine receptors in the immune system of lower vertebrates. Eur J Immunol 43: 1686–1692. 10.1002/eji.201343557
- 49. Nomiyama H, Hieshima K, Osada N, Kato-Unoki Y, Otsuka-Ono K, Takegawa S, Izawa T, Yoshizawa A, Kikuchi Y, Tanase S, et al. (2008) Extensive expansion and diversification of the chemokine gene family in zebrafish: Identification of a novel chemokine subfamily CX. BMC Genomics 9: 222. 10.1186/1471-2164-9-222
- 50. Fleming JF, Feuda R, Roberts NW, Pisani D (2020) A novel approach to investigate the effect of tree reconstruction artifacts in single-gene analysis clarifies opsin evolution in nonbilaterian metazoans. Genome Biol Evol 12: 3906–3916. 10.1093/gbe/evaa015
- 51. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215: 403–410. 10.1016/S0022-2836(05)80360-2
- 52. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ, et al. (1997) Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402. 10.1093/nar/25.17.3389
- 53. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: Architecture and applications. BMC Bioinformatics 10: 421. 10.1186/1471-2105-10-421
- 54. Frickey T, Lupas A (2004) CLANS: A Java application for visualizing protein families based on pairwise similarity. Bioinformatics 20: 3702–3704. 10.1093/bioinformatics/bth444
- 55. Gabler F, Nam SZ, Till S, Mirdita M, Steinegger M, Söding J, Lupas AN, Alva V (2020) Protein sequence analysis using the MPI bioinformatics toolkit. Curr Protoc Bioinformatics 72: e108. 10.1002/cpbi.108
- 56. Park S-J, Lee S-J, Nam S-Y, Im D-S (2018) GPR35 mediates lodoxamide-induced migration inhibitory response but not CXCL17-induced migration stimulatory response in THP-1 cells; is GPR35 a receptor for CXCL17? Br J Pharmacol 175: 154–161. 10.1111/bph.14082
- 57. Binti Mohd Amir NAS, Mackenzie AE, Jenkins L, Boustani K, Hillier MC, Tsuchiya T, Milligan G, Pease JE (2018) Evidence for the existence of a CXCL17 receptor distinct from GPR35. J Immunol 201: 714–724. 10.4049/jimmunol.1700884
- 58. Minh BQ, Nguyen MAT, von Haeseler A (2013) Ultrafast approximation for phylogenetic bootstrap. Mol Biol Evol 30: 1188–1195. 10.1093/molbev/mst024
- 59. Hoang DT, Chernomor O, von Haeseler A, Minh BQ, Vinh LS (2018) UFBoot2: Improving the ultrafast bootstrap approximation. Mol Biol Evol 35: 518–522. 10.1093/molbev/msx281
- 60. Lemoine F, Domelevo Entfellner JB, Wilkinson E, Correia D, Dávila Felipe M, De Oliveira T, Gascuel O (2018) Renewing Felsenstein’s phylogenetic bootstrap in the era of big data. Nature 556: 452–456. 10.1038/s41586-018-0043-0
- 61. Morel B, Kozlov AM, Stamatakis A, Szöllősi GJ (2020) GeneRax: A tool for species-tree-aware maximum likelihood-based gene family tree inference under gene duplication, transfer, and loss. Mol Biol Evol 37: 2763–2774. 10.1093/molbev/msaa141
- 62. Williams TA, Davin AA, Morel B, Szánthó LL, Spang A, Stamatakis A, Hugenholtz P, Szöllősi GJ (2023) The power and limitations of species tree-aware phylogenetics. BioRxiv. 10.1101/2023.03.17.533068 (Preprint posted March 17, 2023).
- 63. Nomiyama H, Osada N, Yoshie O (2013) Systematic classification of vertebrate chemokines based on conserved synteny and evolutionary history. Genes Cells 18: 1–16. 10.1111/gtc.12013
- 64. Zlotnik A, Yoshie O, Nomiyama H (2006) The chemokine and chemokine receptor superfamilies and their molecular evolution. Genome Biol 7: 243. 10.1186/gb-2006-7-12-243
- 65. Murphy PM (2023) 15 - chemokines and chemokine receptors. In Clinical Immunology, Rich RR (ed), Sixth edn, pp 215–227. Amsterdam, Netherlands: Elsevier.
- 66. Sun Z, Qin Y, Liu D, Wang B, Jia Z, Wang J, Gao Q, Zou J, Pang Y (2021) The evolution and functional characterization of CXC chemokines and receptors in lamprey. Dev Comp Immunol 116: 103905. 10.1016/j.dci.2020.103905
- 67. Le Mercier A, Bonnavion R, Yu W, Alnouri MW, Ramas S, Zhang Y, Jäger Y, Roquid KA, Jeong HW, Sivaraj KK, et al. (2021) GPR182 is an endothelium-specific atypical chemokine receptor that maintains hematopoietic stem cell homeostasis. Proc Natl Acad Sci U S A 118: e2021596118. 10.1073/pnas.2021596118
- 68. Liò P, Vannucci M (2003) Investigating the evolution and structure of chemokine receptors. Gene 317: 29–37. 10.1016/s0378-1119(03)00666-8
- 69. Fredriksson R, Lagerström MC, Lundin L-G, Schiöth HB (2003) The G-protein-coupled receptors in the human genome form five main families. Phylogenetic analysis, paralogon groups, and fingerprints. Mol Pharmacol 63: 1256–1272. 10.1124/mol.63.6.1256
- 70. Xiao S, Xie W, Zhou L (2021) Mucosal chemokine CXCL17: What is known and not known. Scand J Immunol 93: e12965. 10.1111/sji.12965
- 71. Giblin SP, Pease JE (2023) What defines a chemokine? – The curious case of CXCL17. Cytokine 168: 156224. 10.1016/j.cyto.2023.156224
- 72. Duan J, Liu Q, Yuan Q, Ji Y, Zhu S, Tan Y, He X, Xu Y, Shi J, Cheng X, et al. (2022) Insights into divalent cation regulation and G13-coupling of orphan receptor GPR35. Cell Discov 8: 135. 10.1038/s41421-022-00499-8
- 73. Dohrmann M, Wörheide G (2017) Dating early animal evolution using phylogenomic data. Sci Rep 7: 3599. 10.1038/s41598-017-03791-w
- 74. Delsuc F, Philippe H, Tsagkogeorga G, Simion P, Tilak MK, Turon X, López-Legentil S, Piette J, Lemaire P, Douzery EJP (2018) A phylogenomic framework and timescale for comparative studies of tunicates. BMC Biol 16: 39. 10.1186/s12915-018-0499-2
- 75. Gradstein FM, Ogg JG (2012) Chapter 2 - the chronostratigraphic scale. In The Geologic Time Scale, Gradstein FM, Ogg JG, Schmitz MD, Ogg GM (eds), pp 31–42. Amsterdam, Netherlands: Elsevier.
- 76. Pisabarro MT, Leung B, Kwong M, Corpuz R, Frantz GD, Chiang N, Vandlen R, Diehl LJ, Skelton N, Kim HS, et al. (2006) Cutting edge: Novel human dendritic cell- and monocyte-attracting chemokine-like protein identified by fold recognition methods. J Immunol 176: 2069–2073. 10.4049/jimmunol.176.4.2069
- 77. Weinstein EJ, Head R, Griggs DW, Sun D, Evans RJ, Swearingen ML, Westlin MM, Mazzarella R (2006) VCC-1, a novel chemokine, promotes tumor growth. Biochem Biophys Res Commun 350: 74–81. 10.1016/j.bbrc.2006.08.194
- 78. Najakshin AM, Mechetina LV, Alabyev BY, Taranin AV (1999) Identification of an IL-8 homolog in lamprey (Lampetra fluviatilis): Early evolutionary divergence of chemokines. Eur J Immunol 29: 375–382.
- 79. Bajoghli B, Aghaallaei N, Hess I, Rode I, Netuschil N, Tay BH, Venkatesh B, Yu JK, Kaltenbach SL, Holland ND, et al. (2009) Evolution of genetic networks underlying the emergence of thymopoiesis in vertebrates. Cell 138: 186–197. 10.1016/j.cell.2009.04.017
- 80. Pan L, Lv J, Zhang Z, Zhang Y (2018) Adaptation and constraint in the atypical chemokine receptor family in mammals. BioMed Res Int 2018: 9065181. 10.1155/2018/9065181
- 81. Allen SJ, Crown SE, Handel TM (2007) Chemokine:Receptor structure, interactions, and antagonism. Annu Rev Immunol 25: 787–820. 10.1146/annurev.immunol.24.021605.090529
- 82. Horuk R, Chitnis CE, Darbonne WC, Colby TJ, Rybicki A, Hadley TJ, Miller LH (1993) A receptor for the malarial parasite plasmodium vivax: The erythrocyte chemokine receptor. Science 261: 1182–1184. 10.1126/science.7689250
- 83. Horuk R (2015) The duffy antigen receptor for chemokines DARC/ACKR1. Front Immunol 6: 279. 10.3389/fimmu.2015.00279
- 84. Kasahara M (2007) The 2R hypothesis: An update. Curr Opin Immunol 19: 547–552. 10.1016/j.coi.2007.07.009
- 85. Simakov O, Marlétaz F, Yue JX, O’Connell B, Jenkins J, Brandt A, Calef R, Tung CH, Huang TK, Schmutz J, et al. (2020) Deeply conserved synteny resolves early events in vertebrate evolution. Nat Ecol Evol 4: 820–830. 10.1038/s41559-020-1156-z
- 86. Tang W, Li Y, Li A, Bronner ME (2021) Clonal analysis and dynamic imaging identify multipotency of individual Gallus gallus caudal hindbrain neural crest cells toward cardiac and enteric fates. Nat Commun 12: 1894. 10.1038/s41467-021-22146-8
- 87. York JR, McCauley DW (2020) The origin and evolution of vertebrate neural crest cells. Open Biol 10: 190285. 10.1098/rsob.190285
- 88. Waterhouse RM, Seppey M, Simão FA, Manni M, Ioannidis P, Klioutchnikov G, Kriventseva EV, Zdobnov EM (2018) BUSCO applications from quality assessments to gene prediction and phylogenomics. Mol Biol Evol 35: 543–548. 10.1093/molbev/msx319
- 89. Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM (2021) BUSCO update: Novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol Biol Evol 38: 4647–4654. 10.1093/molbev/msab199
- 90. Boutet E, Lieberherr D, Tognolli M, Schneider M, Bansal P, Bridge AJ, Poux S, Bougueleret L, Xenarios I (2016) UniProtKB/Swiss-Prot, the manually annotated section of the UniProt KnowledgeBase: How to use the entry view. Methods Mol Biol 1374: 23–54. 10.1007/978-1-4939-3167-5_2
- 91. Poux S, Arighi CN, Magrane M, Bateman A, Wei CH, Lu Z, Boutet E, Bye-A-Jee H, Famiglietti ML, Roechert B, et al. (2017) On expert curation and scalability: UniProtKB/Swiss-Prot as a case study. Bioinformatics 33: 3454–3460. 10.1093/bioinformatics/btx439
- 92. Smith JJ, Kuraku S, Holt C, Sauka-Spengler T, Jiang N, Campbell MS, Yandell MD, Manousaki T, Meyer A, Bloom OE, et al. (2013) Sequencing of the sea lamprey (Petromyzon marinus) genome provides insights into vertebrate evolution. Nat Genet 45: 415–421, 421e1–2. 10.1038/ng.2568
- 93. Yu D, Ren Y, Uesaka M, Beavan AJS, Muffato M, Shen J, Li Y, Sato I, Wan W, Clark JW, et al. (2023) Hagfish genome illuminates vertebrate whole genome duplications and their evolutionary consequences. BioRxiv. 10.1101/2023.04.08.536076 (Preprint posted April 08, 2023).
- 94. Zdobnov EM, Apweiler R (2001) InterProScan – an integration platform for the signature-recognition methods in InterPro. Bioinformatics 17: 847–848. 10.1093/bioinformatics/17.9.847
- 95. Jones P, Binns D, Chang HY, Fraser M, Li W, McAnulla C, McWilliam H, Maslen J, Mitchell A, Nuka G, et al. (2014) InterProScan 5: Genome-scale protein function classification. Bioinformatics 30: 1236–1240. 10.1093/bioinformatics/btu031
- 96. Bachelerie F, Ben-Baruch A, Burkhardt AM, Charo IF, Combadiere C, Förster R, Farber JM, Graham GJ, Hills R, Horuk R, et al. (2020) Chemokine receptors (version 2020.5) in the IUPHAR/BPS guide to pharmacology database. IUPHAR/BPS Guide Pharmacol CITE 2020. 10.2218/gtopdb/F14/2020.5
- 97. UniProt Consortium (2023) UniProt: The universal protein knowledgebase in 2023. Nucleic Acids Res 51: D523–D531. 10.1093/nar/gkac1052
- 98. Käll L, Krogh A, Sonnhammer ELL (2007) Advantages of combined transmembrane topology and signal peptide prediction—the Phobius web server. Nucleic Acids Res 35: W429–W432. 10.1093/nar/gkm256
- 99. Li W, Jaroszewski L, Godzik A (2001) Clustering of highly homologous sequences to reduce the size of large protein databases. Bioinformatics 17: 282–283. 10.1093/bioinformatics/17.3.282
- 100. Fu L, Niu B, Zhu Z, Wu S, Li W (2012) CD-HIT: Accelerated for clustering the next-generation sequencing data. Bioinformatics 28: 3150–3152. 10.1093/bioinformatics/bts565
- 101. Pándy-Szekeres G, Caroli J, Mamyrbekov A, Kermani AA, Keserű GM, Kooistra AJ, Gloriam DE (2023) GPCRdb in 2023: State-specific structure models using AlphaFold2 and new ligand resources. Nucleic Acids Res 51: D395–D402. 10.1093/nar/gkac1013
- 102. Katoh K, Misawa K, Kuma K, Miyata T (2002) MAFFT: A novel method for rapid multiple sequence alignment based on fast fourier transform. Nucleic Acids Res 30: 3059–3066. 10.1093/nar/gkf436
- 103. Katoh K, Standley DM (2013) MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol Biol Evol 30: 772–780. 10.1093/molbev/mst010
- 104. Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T (2009) trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25: 1972–1973. 10.1093/bioinformatics/btp348
- 105. Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, von Haeseler A, Lanfear R (2020) IQ-TREE 2: New models and efficient methods for phylogenetic inference in the genomic era. Mol Biol Evol 37: 1530–1534. 10.1093/molbev/msaa015
- 106. Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS (2017) ModelFinder: Fast model selection for accurate phylogenetic estimates. Nat Methods 14: 587–589. 10.1038/nmeth.4285
- 107. Huerta-Cepas J, Serra F, Bork P (2016) ETE 3: Reconstruction, analysis, and visualization of phylogenomic data. Mol Biol Evol 33: 1635–1638. 10.1093/molbev/msw046
- 108. Hughes CE, Nibbs RJB (2018) A guide to chemokines and their receptors. FEBS J 285: 2944–2971. 10.1111/febs.14466
- 109. Aleotti A, Matthew G, Clifton L, Flaviano G, Roberto F (2024) The Origin, Evolution and Molecular Diversity of the Chemokine System (Version 1.1). Zenodo. 10.5281/zenodo.10460137
Associated Data
Supplementary Materials
Table S2. Substitution models used in the phylogenetic analyses (xlsx).
Data Availability Statement
Supplementary material and raw output files for all the analyses described in this article are available in the GitHub repository Roberto-Feuda-Lab/Chemokine2023 (github.com). All data are also deposited in the Zenodo repository (109).