Molecular Biology and Evolution. 2024 Oct 21;41(12):msae220. doi: 10.1093/molbev/msae220

Evolutionary Dynamics of Proinflammatory Caspases in Primates and Rodents

Mische Holland, Rachel Rutkowski, Tera C Levin
Editor: Diogo Meyer
PMCID: PMC11630849  PMID: 39431598

Abstract

Caspase-1 and related proteases are key players in inflammation and innate immunity. Here, we characterize the evolutionary history of caspase-1 and its close relatives across 19 primates and 21 rodents, focusing on differences that may cause discrepancies between humans and animal studies. While caspase-1 has been retained in all these taxa, other members of the caspase-1 subfamily (caspase-4, caspase-5, caspase-11, and caspase-12 and CARD16, 17, and 18) each have unique evolutionary trajectories. Caspase-4 is found across simian primates, whereas we identified multiple pseudogenization and gene loss events in caspase-5, caspase-11, and the CARDs. Because caspase-4 and caspase-11 are both key players in the noncanonical inflammasome pathway, we expected that these proteins would be likely to evolve rapidly. Instead, we found that these two proteins are largely conserved, whereas caspase-4's close paralog, caspase-5, showed significant indications of positive selection, as did primate caspase-1. Caspase-12 is a nonfunctional pseudogene in humans. We find this extends across most primates, although many rodents and some primates retain an intact, and likely functional, caspase-12. In mouse laboratory lines, we found that 50% of common strains carry nonsynonymous variants that may impact the functions of caspase-11 and caspase-12 and therefore recommend specific strains to be used (and avoided). Finally, unlike rodents, primate caspases have undergone repeated rounds of gene conversion, duplication, and loss leading to a highly dynamic proinflammatory caspase repertoire. Thus, we uncovered many differences in the evolution of primate and rodent proinflammatory caspases and discuss the potential implications of this history for caspase gene functions.

Keywords: caspase-1, caspase-4, caspase-11, caspase-5, caspase-12, innate immunity

Introduction

Pathogen-interacting genes are among the most rapidly evolving in mammalian genomes (Enard et al. 2016). Because of the strong selective pressures involved, such genes often diversify through evolutionary arms races, which involve multiple, recurrent changes in host and pathogen molecules as each protein reciprocally adapts (Daugherty and Malik 2012). Common mechanisms of diversification in arms races include gene family expansion and contraction, gene conversion, and rapid amino acid turnover, particularly at sites of host–pathogen binding (Daugherty and Malik 2012; McLaughlin and Malik 2017; Daugherty and Zanders 2019).

Innate immune pathways, including those involved in inflammation, are key defenses against many human pathogens, and deficiencies in these pathways lead to heightened pathogen susceptibility (e.g. Zhu et al. 2018). Overactivation of inflammation can also be dangerous for the host, both during infection (Wu et al. 2023) and in autoimmunity (Xu et al. 2024). Therefore, we expect natural selection to act strongly on genes in inflammatory pathways, tuning these proteins to recognize an ever-changing array of microbial infections while avoiding self-harm. Indeed, some of the proteins involved in these pathways (e.g. inflammasome proteins NLRPs and CARD8) have been found to rapidly diversify, with signatures of evolutionary arms races (Tenthorey et al. 2014; Tsu et al. 2021, 2023). We therefore sought to examine the evolutionary history and diversity of the proinflammatory caspase genes, which participate in these pathways.

Caspase-1 is a protease that is a hub of proinflammatory signaling. In the “canonical” signaling pathway, multiprotein complexes called inflammasomes assemble in response to cellular or microbial signals (Sundaram et al. 2024). Caspase-1 is recruited to inflammasomes via its caspase activation and recruitment domain (CARD), where its oligomerization leads to self-cleavage and activation. The active caspase-1 protease (made up of p20 and p10 subunits, see Fig. 4 for domain architectures) then targets for maturation the proinflammatory cytokines interleukin (IL)-1beta and IL-18, as well as gasdermin D, which forms membrane pores and leads to an explosive cell death known as pyroptosis (Bibo-Verdugo and Salvesen 2024; Xu et al. 2024).

Fig. 4.

Positively selected residues in proinflammatory caspases in primates and rodents. a to e) Sites of positive selection, as determined by the FUBAR and CODEML algorithms. Arrowheads mark sites passing the statistical threshold for positive selection. Vertical dotted lines separate recombination segments identified by GARD, which were each analyzed independently. The domain architecture of each caspase is shown as colored boxes, with active sites illustrated (yellow *) along with the proposed LPS-binding residues of CASP4 and Casp11 according to the study by Shi et al. (2014) (orange carets). In three cases, sites of positive selection in primate CASP1 were homologous to sites identified in one or more other caspase alignments (bolded green, blue, or purple residue names). All coordinates are named according to the residues in the human or mouse sequences for primate or rodent analyses, respectively. f and g) Each identified site of CASP1 positive selection (red) is illustrated on the predicted AlphaFold structure of human f) or mouse g) caspase-1. The CARD, p20, and p10 domains are colored as in a to e), with linker regions in gray and active sites in yellow. Many more sites of positive selection were identified in primates than in rodents, but in neither clade do the sites show much colocalization in 3D space.

In addition, there are several caspase-1 homologs (collectively known as the caspase-1 subfamily), which all reside within the same syntenic locus as the caspase-1 gene in primates and rodents (Eckhart et al. 2008; Fig. 1a). One of these genes, rodent Casp11, has two homologs in humans, called CASP4 and 5, which have been implicated in “noncanonical” inflammatory signaling. (Note: Although rodent Casp11 is clearly the ortholog of primate CASP4/5 and is sometimes called Casp4, we refer to it here as Casp11 as this name is more commonly used in the biomedical, mouse-centric literature.) In the noncanonical pathway, caspase-4 or caspase-11 directly binds to cytosolic, bacterial lipopolysaccharide (LPS), leading to caspase oligomerization, activation, cleavage of cytokines and gasdermin D, and inflammatory cell death (Shi et al. 2014; Sahoo et al. 2023). Thus, the noncanonical pathway can sense and respond to cytosolic bacterial pathogens through direct caspase activation. Some caspase-5 mutations have been associated with cancer, and it is largely assumed that it participates in similar noncanonical signaling (Sahoo et al. 2023). However, the molecular functions of caspase-5 have remained more enigmatic, with mainly in vitro evidence supporting similar biochemical activities to its caspase-4 paralog (Shi et al. 2014; Eckhart and Fischer 2024).

Fig. 1.

Differential patterns of gene gain and loss within the caspase-1 locus. a) Genes in the caspase-1 locus in human and mouse (not to scale). b) PhyML phylogenetic tree of intact caspase genes demonstrates that orthologs have been correctly categorized, despite gene gains and losses. Numbers at nodes are aLRT support values. c and d) Gene presence/absence across primate c) and rodent d) species, with colored Xs on the tree showing the history of gene losses, while * marks the CASP4/5 duplication event. See text for definitions of pseudogene versus partial sequence versus gene absence. CASP1 has been retained across all species examined, as has CASP4 in simian primates. In contrast, there have been multiple cases of gene degradation and loss of rodent Casp11, primate CASP5, and the primate CARD genes. Early-branching primates encode homologs of a CASP4/5 ancestor, with additional duplications in some lineages. Casp12 is retained and functional across most (but not all) rodents, while it has been lost from most primates. Species in bold were used in selection analyses.

Other caspase-1 subfamily genes with less characterized roles include caspase-12, which is generally considered nonfunctional in humans (Fischer et al. 2002; Kachapati et al. 2006; Xue et al. 2006), and three CASP1-like genes that contain a CARD as their only protein domain (the CARDs, aka CARD-only proteins or COPs). The CARD-only genes are named CARD16, 17, and 18 (or alternatively COP1, Inca, and Iceberg, respectively; Druilhe et al. 2001; Lamkanfi et al. 2004; Indramohan et al. 2018). Overexpression studies of the CARDs suggested that they may act as caspase-1 inhibitors (Druilhe et al. 2001; Lamkanfi et al. 2004; Karasawa et al. 2015), and in vitro studies suggested that CARD16 and 18 promote CASP1 CARD oligomerization, whereas CARD17 inhibits it (Lu et al. 2016). However, a recent in vivo study of CARDs found that they each inhibit caspase-1 activation by competitively binding to the CARDs of caspase-1 and inflammasome adapters, blocking caspase-1 recruitment to inflammasomes (Devi et al. 2023). These CARDs apparently act nonredundantly, dampening caspase-1 activation during different stages of inflammasome signaling.

In terms of their expression patterns, mouse Casp1, Casp11, and Casp12 all have similar qualitative expression patterns to human CASP1 and CASP4, i.e. relatively high expression across most tissues and elevated in blood, spleen, and lung (Ringwald et al. 2022; human data from https://www.gtexportal.org/ on 2024 August 15). CARD16 has a similar expression pattern, whereas CARD17 and 18 are expressed at very low levels. The basal expression of CASP5 is also very low except in blood, colon, and small intestine, although it has been reported that CASP5 expression is induced by LPS and interferon-gamma (Lin et al. 2000).

Pathogens have abundant mechanisms to evade and shut down these inflammatory signaling pathways, including antagonists that act both upstream (Brodsky et al. 2010; Chung et al. 2016) and downstream (Luchetti et al. 2021) of caspase-1 activation, as well as multicaspase inhibitors that can target caspase-1, caspase-4, and caspase-5 by forming covalent bonds at the caspase active site (Kamada et al. 1997; Best 2008). Specific microbial effectors that bind caspase-4 and caspase-11 have also been identified, including the OspC3 effector from Shigella that inhibits caspase function through adenosine diphosphate (ADP) riboxanation (Li et al. 2021). Because the members of the caspase-1 subfamily are key players in immune defenses, which can directly bind pathogen molecules and are often antagonized by pathogens, we hypothesized that they may evolve in evolutionary arms races, similar to those that have diversified inflammasome proteins (Castro and Daugherty 2023).

Previous studies have identified some hints that the caspase-1 subfamily has been evolutionarily dynamic. For example, instances of gene duplication and loss have been observed in mammals, mollusks, and insects (Lamkanfi et al. 2002; Eckhart et al. 2008). Some of the diversity of caspases has interesting and important functional implications, including the caspase-1/4 fusion proteins in carnivores (Devant et al. 2021). Because mice are the most commonly used animal models of proinflammatory signaling pathways, it is particularly important to understand the similarities and differences in caspases between primates and rodents. However, prior studies of caspase evolution included at most four distantly related rodents and three primates (Eckhart et al. 2008), limiting our evolutionary resolution in these clades.

Here, we present an in-depth comparison of the evolution of the inflammatory caspase locus across 19 primate and 21 rodent species. We find that the caspase-1 locus is much more evolutionarily dynamic in primates than in rodents, with evidence of recurrent gene duplications, pseudogenization, and gene conversion events. Contrary to our expectations, the signatures of selection were strongest and most evident in primate CASP1 and CASP5 when compared to the other caspases, with interesting phenotypic implications. We also found that many commonly used mouse strains carry one or more nonsynonymous variants in caspase-11 and caspase-12, which may contribute to variation in inflammatory responses across animal studies. Combined with prior evidence of human segregating polymorphisms (Fairley et al. 2020) and copy number variation (Vollger et al. 2022) within the caspase-1 locus, we propose that these genes have not only a history of diversification but also abundant, ongoing genetic innovation.

Results

Differential Caspase Gene Birth and Loss in Primates and Rodents

Caspase-1 subfamily genes all reside within a single syntenic locus (Fig. 1a), found in humans on chromosome 11q22.3. To assess the repertoires of these genes across species, we analyzed the locus across 19 primate and 21 rodent genomes. These species were selected based on: (i) sampling taxa across the phylogeny that were not too diverged from our reference human and house mouse species (i.e. CASP1 dS < 0.3) and (ii) whether the full, ∼350-kb locus was well assembled on a single contig in the genome. Within the locus, we identified caspase genes by searching for exons that had 60% nucleotide identity to the annotated human or house mouse genes. In cases where diverged exons were not identified, we used synteny and alignments to the reference coding sequences to locate the missing exons.

Through this process, we classified every homolog into one of four categories. First, we called genes as “present” if they were found within their syntenic location and had a conserved, in-frame sequence that aligned across its full length to the reference. If our initial searches failed to find the homolog or identified only partial genes, we used blastn to search the Refseq Representative Genomes database for the annotated transcripts for each homolog. If this search yielded evidence of an in-frame transcript, we then identified each exon in the locus, realigned the coding sequence based on transcript evidence, and called the gene as “present.” Some homologs only had shorter in-frame sequences that did not span the full length of the reference gene. We categorized these homologs in the second category, “partial in-frame sequence.” If we identified some traces of the gene but did not find evidence of an in-frame coding sequence in either the genome or transcript database, we placed them in a third category, “pseudogenes.” Finally, if the gene was undetectable and/or if we found fewer than 70% of the exons when compared to the reference gene, we categorized the homolog as “absent.”
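The four-way classification above can be summarized as a small decision function. The following Python sketch is illustrative only: the `HomologEvidence` fields, the function name, and the order in which the rules are checked are assumptions made for this example, not the authors' pipeline. Only the four category names and the 70% exon threshold come from the text.

```python
from dataclasses import dataclass


@dataclass
class HomologEvidence:
    """Hypothetical summary of the evidence gathered for one homolog in one species."""
    exon_fraction: float         # fraction of reference exons detected (0.0 to 1.0)
    in_syntenic_location: bool   # found at the expected position in the locus
    full_length_in_frame: bool   # in-frame CDS spanning the full reference (genome or transcript evidence)
    partial_in_frame: bool       # only a shorter in-frame sequence was recovered


def classify_homolog(ev: HomologEvidence) -> str:
    """Assign one of the four categories described in the text.

    The precedence of the checks is an assumption of this sketch.
    """
    # Undetectable, or fewer than 70% of the reference exons found.
    if ev.exon_fraction < 0.70:
        return "absent"
    # Syntenic, conserved, in-frame across the full length of the reference.
    if ev.in_syntenic_location and ev.full_length_in_frame:
        return "present"
    # Shorter in-frame sequence that does not span the whole reference gene.
    if ev.partial_in_frame:
        return "partial in-frame sequence"
    # Traces of the gene, but no in-frame coding sequence in genome or transcripts.
    return "pseudogene"


print(classify_homolog(HomologEvidence(1.0, True, True, False)))
```

Encoding the rules this way makes the precedence explicit: in this sketch the exon-count threshold is checked first, so a gene with too few exons is called absent regardless of the other evidence.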

For all caspases that were present and intact, we confirmed their identities using a phylogenetic tree, which showed that all genes were correctly categorized into their CASP1, 4, 5, 11, or 12 gene families (Fig. 1b). As expected based on prior studies (Lamkanfi et al. 2002; Eckhart et al. 2008), this analysis also showed that CASP4 and 5 are paralogs found only in primates and most closely related to the rodent Casp11 gene. Yet, when we assessed which species had which intact caspase homologs to understand the history of gene gains and losses, we uncovered some surprises as well.

Caspase-1

CASP1 was present within the syntenic locus, and we recovered full-length, well-aligned coding sequences for all 40 primate and rodent species evaluated (Fig. 1c and d). This pattern of widespread gene retention implies that CASP1 plays an important, nonredundant role in organismal fitness across primates and rodents.

Caspase-12

For CASP12, we found that the previously reported partial gene loss in humans (Fischer et al. 2002; Kachapati et al. 2006) extended across most primates, with multiple lineages experiencing gene degradation over time. Importantly, we did find intact CASP12 in the orangutan, Old World Monkeys, loris, lemur, and tarsier genomes. These are the first reported cases of intact CASP12 in primates. Their distribution across the tree implies that CASP12 has experienced at least four gene loss events across primate history (Fig. 1c). Neither the SHS to SHG mutation found within humans that disrupts catalytic activity (Fischer et al. 2002) nor the residue 125 human stop codon polymorphism (Xue et al. 2006) was shared between humans and other primates, although many primate species had numerous CASP12 frameshifts and early stop codons. In rodents, Casp12 was present in most species. However, we estimate that there have been up to six additional instances of independent, lineage-specific Casp12 pseudogenization and/or loss events within rodents (Fig. 1d), suggesting that Casp12 loss is frequent across multiple mammalian clades. To test if CASP12 is slowly becoming a pseudogene across rodents as it has in primates, we made an alignment of intact rodent Casp12 sequences and analyzed it with CODEML (model 0 vs. 0a; Yang 2007). We detected strong evidence that rodent Casp12 is under purifying selection (supplementary table S3, Supplementary Material online), suggesting that it is likely a functional gene in most rodent species. We were not able to perform this analysis for primate CASP12, as we did not recover enough CASP12 homologs at the appropriate phylogenetic distance.

Caspase-4, Caspase-5, and Caspase-11

For CASP4, 5, and 11, because these proteins are thought to share similar immune functions, we thought they might exhibit similar patterns of evolution. Alternatively, because recently duplicated genes often lose one paralog and return to single copy, we expected Casp11 might be conserved across rodents, whereas CASP4 and/or CASP5 would experience some reciprocal losses in primates. In contrast to both hypotheses, we found CASP4 was present across all simian primate genomes, whereas there were multiple cases of pseudogenization (and some cases of complete gene loss) in rodent Casp11 and simian primate CASP5 (Fig. 1c and d). Because of these losses, four diverse rodent species, Apodemus sylvaticus, Mesocricetus auratus, Microtus ochrogaster, and Marmota marmota, have CASP1 as their only caspase-1 subfamily gene.

Interestingly, we found that basal-branching primates (tarsier, loris, and lemur) did not encode CASP4 or CASP5, but rather a homolog of the CASP4/CASP5 ancestor, which we here name the CASP4/5 genes (Fig. 1b and c). We can thus date the caspase-4/5 duplication event as occurring between 43 and 69 Ma, in the ancestor of simian primates (Kumar et al. 2017). Notably, this conclusion differs from prior studies that have not sampled primates as thoroughly as the analysis presented here (Eckhart and Fischer 2024). In the tarsiers and Sunda slow loris, we discovered that the CASP4/5 genes had undergone additional, lineage-specific duplications, independent of the duplication event that birthed CASP4 and CASP5.

CARD Genes

Within primates, we also found at least four species that lack all three CARDs (Fig. 1c). Based on the distribution of CARDs across the primate tree, we can infer that CARD17 and 18 originated in the last common ancestor of all primates, ∼74 Ma (Kumar et al. 2017). However, since then all three CARDs have experienced multiple, repeated instances of gene loss.

Recent, Repeated Gene Conversion and Duplications in the Primate CASP1 Locus

In addition to the caspase and CARD genes described above, the human locus contains multiple caspase-like pseudogenes, indicating that there have been additional historical duplication events followed by gene loss. These include two CASP1-like pseudogenes that lie between the CARDs (CASP1P1 and CASP1P2) and a CASP4-like pseudogene (CASP4LP) near CASP12 (Fig. 2). Inspired by this history of innovation, we next examined the nucleotide sequence of the locus for evidence of recent duplications or gene conversion events that may not be reflected in simple gene presence/absence data. To do so, we searched for extended blocks of similar nucleotide sequence, defining blocks of self-similarity as regions at least 1-kb long that contained at least 50% nucleotide identity to another region within the 350-kb locus.

Fig. 2.

The caspase-1 locus has undergone recent gene conversions and tandem duplications in primates, but not in rodents. a to c) Annotated genes (filled colored arrows) and pseudogenes (white arrows with colored outlines) in the human a and b) and mouse c) loci. Colored arcs a) connect regions of extended nucleotide identity between caspase genes or pseudogenes, indicative of recent tandem gene duplications or gene conversions. Gray arcs b and c) show regions of sequence identity within intronic or intergenic regions.

In the human locus, we detected many regions of self-similarity, including those that duplicated both coding and noncoding sequences (Fig. 2a and b). For example, a ∼6-kb region of the CASP4 locus (including both exons and introns) was 60% identical to the CASP4LP pseudogene and 54% identical to the CASP5 locus. We found CASP4LP loci only within great apes (and it was a pseudogene in each case), suggesting that this locus likely arose ∼13 Ma, whereas CASP5 dates back to the last common ancestor of simian primates ∼50 Ma (Kumar et al. 2017; Fig. 1). These findings (and those of Figs. 3 and 4 below) suggest that CASP4 and CASP5 have experienced recent gene conversion events that overwrote the sequence of one locus with the other at one or more times during primate evolution. We found similar blocks of self-similarity in the human locus between CASP1 and CARD16, CARD17, and nearby pseudogenes (Fig. 2a), as well as blocks that included intronic, intergenic, or promoter regions (Fig. 2b). Although CARD18 arose via CASP1 duplication at approximately the same time as CARD16 and 17 (Fig. 1), it appears that CARD18 has not undergone recent gene conversion, as CARD18's sequence is now too diverged to be identified in this analysis. Overall, recurrent segmental duplications and gene conversion events within the primate CASP1 locus appear to be common.

Fig. 3.

PhyML phylogenetic tree of the CARD domains of each primate caspase-1 subfamily gene. Region within the dashed box in a) is shown in detail in b), with gene identity indicated by symbols. Although sequences for CASP4/5s, CASP12, and CARD18 have distinct branches, in several species, the CASP1, CARD16, and/or CARD17 sequences are highly similar and intermixed on the tree, indicating recent gene conversion. All illustrated nodes on the PhyML trees had aLRT support statistics >5. Other nodes were collapsed.

These events have significantly impacted the sequences, and potentially functions, of genes in the primate locus, as illustrated by a phylogenetic tree of primate CARD sequences (Fig. 3a). In this tree, we observed that the CASP4/5, CASP12, and CARD18 branches were clearly differentiated from each other, whereas there was an intermingling of the CASP1, CARD16, and CARD17 sequences (Fig. 3b). Specifically, we saw that the sequences of CASP1 and CARD16 were intermixed together in Old World Monkeys, independently intermixed within Great apes, and that the sequence of marmoset CASP1 was most similar to marmoset CARD16 and 17. Based on gene synteny and the presence of CARD16 and 17 across primates, we therefore infer that these genes have undergone recent gene conversion with CASP1 in at least three instances. Because these events overwrite the CARD sequence with that of CASP1, we relied on synteny to define the presence and absence of CARD16 and 17 in Fig. 1. We predict that such gene conversion events would enable CARD16 and CARD17 to retain strong, homotypic binding to the CASP1 CARD, despite divergence in the CASP1 CARD sequence over time.

In contrast to the highly dynamic human locus, few blocks of self-similarity exist within the house mouse locus (Fig. 2c). Indeed, we did not detect any instances of extended nucleotide identity that included coding regions of any caspase. We did detect multiple transposon insertions within intergenic and intronic regions (specifically the long interspersed nuclear element L1MdMus and endogenous retrovirus-like Mammalian-apparent long-terminal-repeat retrotransposon (ERVL-MaLR) family of long terminal repeat (LTR) transposon [Jurka 2000]), demonstrating that we had the power to detect such duplication events. We also performed similar phylogenetic analyses in rodent genomes as in Fig. 3 but detected no evidence of gene conversion among rodent Casp1, Casp11, or Casp12. Thus, unlike in the human locus, we did not find any evidence in the mouse genome of ongoing caspase-1 subfamily duplication or gene conversion events.

Finally, within the human population, it appears that duplications within the CASP1 locus are ongoing. Copy number variation among humans is common, including between the GRCh38 genome assembly, which we focused on here, and the recent telomere-to-telomere genome sequence of the CHM13 cell line, which derives from a single human haplotype (T2T-CHM13; Vollger et al. 2022). While segmental duplications tend to get collapsed in most genome assemblies, the T2T-CHM13 genome is currently unique in its completeness and ability to capture structural variation. In this haploid genome, the CASP1 locus was identified as one of the regions with highly variable gene dosage, with T2T-CHM13 containing an estimated 1 copy of CARD16; 2 copies of CASP12, CARD17, and CARD18; 3 copies of CASP1; 5 copies of CASP5; and 14 copies of CASP4. Therefore, throughout primate history and within humans, the CASP1 locus appears to be a frequent site of gene duplication and innovation.

Unexpected Patterns of Positive Selection in the Caspase-1 Subfamily Genes

Canonical proinflammatory pathways activate caspase-1 downstream of pathogen recognition. In contrast, caspase-4 and caspase-11 have been found to directly bind multiple types of pathogen molecules (Kobayashi et al. 2013; Shi et al. 2014; Li et al. 2021), whereas no pathogen antagonists have yet been identified that specifically target caspase-1. Because both LPS and microbial effectors tend to be highly variable, and because we expect to see rapid evolution at host–pathogen binding interfaces, we hypothesized that we would see strong signatures of positive selection within CASP4 and Casp11, potentially in the previously defined LPS-binding residues of the CARD region (Shi et al. 2014), whereas these signatures might be weaker or absent in CASP1 and CASP5.

To test this hypothesis, we analyzed evolutionary genomic signatures in alignments of primate CASP1, CASP4, and CASP5 as well as rodent Casp1, Casp11, and Casp12. Recurrent gene conversion events (as discussed above) can cause different parts of a gene to have different evolutionary histories. To detect if this was the case and identify sites of recombination breakpoints, we used the program GARD (Kosakovsky Pond et al. 2006). Consistent with our earlier results, GARD identified multiple recombination breakpoints across the caspases, with more sites identified in the primate than in the rodent genes: two recombination breakpoints in primate CASP1, one breakpoint in CASP4, three breakpoints in CASP5, one breakpoint in rodent Casp1, and none in rodent Casp11. Thus, many extant caspase genes are evolutionary patchworks, reflecting multiple gene conversion events.

This sort of recombination history can interfere with tests for positive selection. Therefore, we divided each alignment into different recombination segments and analyzed each segment independently. For each, we used multiple algorithms to estimate the dN/dS ratio, i.e. the rate of nonsynonymous to synonymous substitutions. If the dN/dS ratio is significantly higher than 1, that would indicate that the protein regions and amino acid sites have evolved under positive selection, with more rapid amino acid changes than expected by neutral mutation. For these statistical tests of positive selection, we used CODEML and FUBAR (Yang 2007; Murrell et al. 2013). Both algorithms are designed to detect positive selection that is pervasive across the phylogenetic tree, and they use different statistical models of molecular evolution. Therefore, these analyses detect related, but complementary, signatures of positive selection.
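The site-model comparisons in CODEML (for example M7 vs. M8, or M8a vs. M8) are likelihood ratio tests: twice the difference in log-likelihood between the nested models is compared to a chi-square distribution. The Python sketch below shows that arithmetic using only the standard library. The degrees of freedom used (2 for M7 vs. M8, 1 for M8a vs. M8) are the conventional choices and are an assumption here, since the text does not state them; the log-likelihood values are made up for illustration.

```python
import math


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Likelihood ratio test statistic, 2 * (lnL_alt - lnL_null), floored at 0."""
    return max(0.0, 2.0 * (lnl_alt - lnl_null))


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square distribution for df = 1 or 2.

    Closed forms: df=1 -> erfc(sqrt(x/2)); df=2 -> exp(-x/2).
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only supports df = 1 or 2")


# Hypothetical log-likelihoods for one recombination segment (not from the paper).
lnl_m7, lnl_m8 = -1000.0, -990.0
stat = lrt_statistic(lnl_m7, lnl_m8)   # 2 * 10 = 20
p_value = chi2_sf(stat, df=2)          # exp(-10)
print(f"LRT statistic = {stat:.1f}, P = {p_value:.2e}")
```

A small P-value indicates that the model allowing a class of sites with dN/dS > 1 fits significantly better than the null model that does not.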

Contrary to our initial hypothesis, CODEML did not detect significant evidence of positive selection in primate CASP4 or rodent Casp11 (Table 1; Fig. 4a and b). Similarly, for these genes, only a few sites were called by FUBAR: two sites in the CARD of CASP4 and two in the p20 subunit of Casp11. None of these sites overlapped with the LPS-binding residues as proposed by Shi et al. (2014). Of the FUBAR-identified sites, the most convincing was I156 in the p20 domain of mouse Casp11, which toggled among I, S, T, and N codons across the 10 rodents analyzed. Interestingly, I156 was the only residue whose homologous sites in primate CASP1 and CASP5 were also identified as positively selected (see below). In addition, this site lies near the interface through which caspase-4 and caspase-11 bind a bacterial effector from Shigella (Li et al. 2021), suggesting that this site may evolve rapidly to evade effector binding. However, with the possible exception of the I156 residue, we found little to no evidence that CASP4 or Casp11 has evolved under positive selection, with no enrichment at putative LPS-binding sites.

Table 1.

Signatures of selection in caspase genes

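| Gene | Clade | Recombination segment | dN/dS | M7 v. M8 P-value | M8 v. M8a P-value | Highest omega in M8 | # of positively selected sites in CODEML | # of positively selected sites in FUBAR |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CASP1 | Primate | 1 | 0.41516 | 9.31E−01 | 9.44E−01 | 3.60 | 0 | 1 |
| CASP1 | Primate | 2* | 1.36624 | 8.93E−09 | 1.00E−08 | 9.56 | 6 | 12 |
| CASP1 | Primate | 3* | 0.68748 | 6.51E−07 | 6.54E−07 | 3.72 | 13 | 14 |
| CASP4 | Primate | 1 | 0.35825 | 5.01E−01 | 6.04E−01 | 2.43 | 0 | 2 |
| CASP4 | Primate | 2 | 0.23668 | 1.40E−01 | 1.75E−01 | 6.96 | 0 | 0 |
| CASP5 | Primate | 1* | 3.15359 | 2.90E−03 | 2.91E−03 | 5.01 | 12 | 1 |
| CASP5 | Primate | 2 | 0.99276 | 1.80E−02 | 1.85E−02 | 2.33 | 2 | 4 |
| CASP5 | Primate | 3 | 0.3462 | 9.96E−01 | 9.73E−01 | 1.00 | 0 | 0 |
| CASP5 | Primate | 4 | 1.09378 | 1.60E−02 | 1.61E−02 | 5.27 | 2 | 2 |
| Casp1 | Rodent | 1 | 0.58534 | 3.60E−01 | 6.18E−01 | 1.29 | 0 | 1 |
| Casp1 | Rodent | 2 | 0.33065 | 8.47E−02 | 1.61E−01 | 1.83 | 1 | 6 |
| Casp11 | Rodent | 1 | 0.41356 | 3.65E−01 | 7.32E−01 | 1.37 | 0 | 2 |
| Casp12 | Rodent | 1 | 0.19438 | 1.00E+00 | 1.00E+00 | 24.07 | 0 | 0 |
| Casp12 | Rodent | 2 | 0.51008 | 1.00E+00 | 1.00E+00 | 1.00 | 0 | 0 |
| Casp12 | Rodent | 3 | 0.17643 | 1.00E+00 | 1.00E+00 | 1.00 | 0 | 2 |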

Recombination segments that showed significant signatures of positive selection in CODEML following Bonferroni correction (P < 0.003) are indicated with an asterisk. Only CODEML sites that were robustly identified when using different codon frequencies, starting omegas, and gene trees were included here and in Fig. 4.
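The Bonferroni threshold can be checked against the M7 v. M8 P-values in Table 1 with a few lines of Python. This sketch assumes a family-wise alpha of 0.05 divided across the 15 recombination segments in the table, which gives 0.0033 and is consistent with the stated cutoff of P < 0.003; the exact number of tests used for the correction is not given in the text, so that count is an assumption.

```python
# M7 v. M8 P-values from Table 1, keyed by (gene, recombination segment).
# Primate genes are upper case (CASP1); rodent genes are mixed case (Casp1).
m7_vs_m8 = {
    ("CASP1", 1): 9.31e-01, ("CASP1", 2): 8.93e-09, ("CASP1", 3): 6.51e-07,
    ("CASP4", 1): 5.01e-01, ("CASP4", 2): 1.40e-01,
    ("CASP5", 1): 2.90e-03, ("CASP5", 2): 1.80e-02,
    ("CASP5", 3): 9.96e-01, ("CASP5", 4): 1.60e-02,
    ("Casp1", 1): 3.60e-01, ("Casp1", 2): 8.47e-02,
    ("Casp11", 1): 3.65e-01,
    ("Casp12", 1): 1.00e+00, ("Casp12", 2): 1.00e+00, ("Casp12", 3): 1.00e+00,
}

ALPHA = 0.05                            # assumed family-wise error rate
threshold = ALPHA / len(m7_vs_m8)       # Bonferroni: alpha / number of tests

significant = sorted(key for key, p in m7_vs_m8.items() if p < threshold)

print(f"threshold = {threshold:.4f}")
for gene, segment in significant:
    print(f"{gene} segment {segment}: P = {m7_vs_m8[(gene, segment)]:.2e}")
```

Under these assumptions, the segments that pass are primate CASP1 segments 2 and 3 and CASP5 segment 1, matching the rows marked with an asterisk in Table 1.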

In contrast, we identified significant signatures of positive selection in both primate CASP1 and CASP5. For CASP5, we analyzed sequences only in apes and Old World Monkeys, because CASP5 in New World Monkeys appears to be a pseudogene (Fig. 1). Nevertheless, despite the reduced statistical power from fewer sequences, we identified 19 sites under positive selection in CASP5, four of which were called by both programs (Fig. 4c). Unlike CASP4, CASP5 has an extended N-terminus. This N-terminal region was enriched for positive selection hits (56% of total sites identified by both algorithms were located within the first 12% of the coding sequence length), with CODEML calling an especially large number of sites. In CODEML, positive selection testing can be less powerful in short alignments. Therefore, it is possible that we have missed some of the positively selected sites on this short, N-terminal segment. The CASP5 N-terminal tail also included an insertion within Old World Monkeys, further diversifying this region. When we analyzed the N-terminal region both before and after trimming out the Old World Monkey insertion, CODEML and FUBAR identified many positively selected sites in both cases, although the identities of the sites shifted. Therefore, while we are confident that the N-terminal region of CASP5 is rapidly evolving, the specific sites identified should be interpreted with caution. While the N-terminus of CASP5 has not previously been associated with a known function, the diversification observed here could indicate that this part of the protein is important for organismal fitness. We also note that CASP5 has multiple annotated splice isoforms, some of which exclude exon 2, which encodes most of the N-terminus. Therefore, although these residues passed our statistical thresholds in one or more algorithms, it is possible that the elevated rate of amino acid turnover in the N-terminus is due in part to relaxed selection, as this region is less frequently a part of the caspase-5 protein.

Even ignoring the CASP5 N-terminus, we still detected more residues under positive selection and with stronger support in CASP5 than in its close paralog CASP4 or rodent Casp11 (Fig. 4; Table 1). Of particular note, three positively selected sites were homologous between primate CASP5 and CASP1, including V217, the homolog of I156 in caspase-11. Within humans, the V217 residue of CASP5 is polymorphic at high frequencies (11% of individuals in Africa and 4% worldwide have L or M at this site; Fairley et al. 2020), demonstrating that variation at this site still segregates within human populations, not only across different primate species.

Finally, in CASP1, we identified 29 sites of positive selection in primates (16 called by both CODEML and FUBAR) and 6 in rodents (1 called by both algorithms; Fig. 4d and e). These sites were distributed across the length of the protein, including sites within the CARD, p20, and p10 domains and the flexible linkers connecting these domains. Positively selected residues that are distant in the primary sequence of a protein will sometimes colocalize in the folded protein, often overlapping with host–pathogen binding interfaces. To see if the CASP1 selected residues potentially formed this kind of interface, we examined them on the AlphaFold-predicted structures of primate and rodent caspase-1 but did not observe a large concentration of the residues in 3D space. The localization was similar when we used experimental caspase-1 crystal structures (not shown).

In summary, we detected little to no evidence of positive selection in CASP4 and Casp11, whereas there were strong signatures of selection in primate CASP5 and CASP1. In three cases, sites we identified in primate CASP1 were homologous to positively selected sites in other caspases. However, with the exception of the N-terminal tail of CASP5, the positively selected sites were not concentrated or colocalized in a particular region of the protein.

Variation in Caspase-1 Aspartate Self-cleavage Sites

During caspase protease activation, protein oligomerization induces self-cleavage at aspartate sites to generate the active enzyme (Makoni and Nichols 2021). The removal of the flexible linker between the p20 and p10 subunits is essential for caspase-1 activation. The aspartate cleavage sites flanking this linker are strictly conserved across caspase-1, 4, 5, and 11 (Fig. 5). However, it has been open to debate whether the CARD-p20 cleavage is essential for proinflammatory caspase function. When we examined the aspartate site at the N-terminal end of p20, we found it was conserved across caspase-4, 5, and 11, with the exception of one rodent species, Mus pahari, which had an asparagine at the caspase-11 site instead. Across caspase-1 homologs, we found many more species that had variation in the CARD-p20 cleavage site (Fig. 5), often with no aspartate residues nearby that could serve as alternate cleavage sites. These species included all of the New World Monkeys and four species of rodents, including rats. We suggest that functional studies of the caspase-1 homologs from these lineages may provide important information about the mechanisms and activation of proinflammatory caspases, particularly the role of CARD-p20 cleavage.

Fig. 5.

Caspase-1 aspartate cleavage sites between CARD and p20 domains are not strictly conserved. a) Domain architecture of caspase-1, showing locations of aspartate cleavage sites (“D”). b and c) Alignments of the caspase-1 aspartate cleavage sites, with residues that do not match consensus highlighted with colored boxes. While the two C-terminal cleavage sites are highly conserved within primates b) and rodents c), the N-terminal site is more highly variable, and some species lack the aspartate residue entirely. Arrowheads mark nearby positively selected sites.

Caspase Variants Within Inbred Mouse and Rat Lines

While inbred mouse and rat lines are widely used for studies of inflammation and immunity, these strains can harbor polymorphisms that impact caspase activity. Indeed, the noncanonical inflammasome signaling of caspase-4/11 was only discovered in 2011 after scientists realized that the widely used 129S1 mouse strain background carried an inactivating mutation in caspase-11 (Kayagaki et al. 2011). This mutation was a 5-bp deletion that eliminated the splice acceptor on exon 7 of Casp11, resulting in an out-of-frame splice isoform that behaved as a Casp11 null. Inspired by this history, we looked for nonsynonymous caspase-1 family mutations in the genomes of 16 mouse lines and 8 rat lines that are commonly used for laboratory study. In addition to the 129S1 deletion described above, we identified a wide variety of missense variants. By comparing these variants with our alignments, we categorized each variant as follows: (i) having a potential to impact caspase function (if the polymorphism altered a site highly conserved across 10 closely related rodents), (ii) unlikely to impact function (if the polymorphism introduced a residue that was shared with another rodent species), or (iii) unknown. We identified only a few variants across the rat strains, most of which we predicted were unlikely to affect caspase function (Table 2). One notable feature was a 1-bp “insertion” in exon 7 of Casp12 that was present in all 8 rat genomes. Whereas the reference rat Casp12 gene has a frameshift relative to the homologs, this 1-bp “insertion” resulted in a full-length, in-frame Casp12. Because this “variant” was in every strain and because the in-frame transcript is well-represented in RNA-seq datasets (NCBI Reference Sequence: NM_130422.2), we believe this position reflects an error in the reference rat genome assembly. Otherwise, we found very few nonsynonymous variants within the rat Casp1, Casp11, and Casp12 genes.
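The three-way categorization described above can be illustrated with a minimal Python sketch. It assumes each variant is compared against a single alignment column holding the residues found in related rodent homologs. The function name, the input format, and the reading of "highly conserved" as "invariant across the supplied homologs" are illustrative assumptions, not the procedure actually used in this study.

```python
def classify_variant(variant_residue, homolog_residues):
    """Classify a missense variant from one alignment column.

    variant_residue: amino acid introduced by the strain polymorphism.
    homolog_residues: residues at the same site in related rodent species
        (hypothetical input; the strain carrying the variant is excluded).
    """
    observed = set(homolog_residues)
    if variant_residue in observed:
        # The variant residue occurs naturally in another rodent species.
        return "unlikely to impact function"
    if len(observed) == 1:
        # The site is invariant across the homologs and the variant changes it.
        return "potential to impact function"
    # The site varies, but the variant residue is not seen in any homolog.
    return "unknown"
```

For example, a change to K at a site where all ten homologs carry E would be flagged as having potential impact, whereas a change to L at a site where another rodent already carries L would be considered unlikely to matter.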

Table 2.

Nonsynonymous variants in rat inbred lines

  Casp1 Casp11 Casp12
Strain Q343R V60A F115L 1-bp insertion
ACI_N x x x
BN_SsN x
BUF_N x x x
F344_N x x x
M520_N x x x
MR_N x X x
WKY_N x X x
WN_N x x

Italics and lowercase indicate sites that are unlikely to disrupt function, i.e. the same residue is present at that site in other rodent homologs. Bolded and uppercase indicate sites with the potential to disrupt function, as the variant alters a highly conserved site. Empty boxes indicate the presence of the reference allele.

In contrast, we identified many Casp11 and Casp12 variants in the inbred mouse lines, such that 50% of the mouse strains we examined carried missense variants in one or both genes (strains 129S1 to LP/J in Table 3). Only the 129S1 strain carried the previously characterized 5-bp deleterious deletion in Casp11. While the remaining missense variants will need to be tested experimentally, we predict that several are likely to impact caspase function. We therefore recommend that researchers studying Casp11 or Casp12 consider avoiding the following mouse lines: 129S1/SvImJ, CAST/EiJ, PWK/PhJ, WSB/EiJ, AKR/J, CBA/J, NOD/ShiLtJ, and LP/J. The SPRET/EiJ strain contained multiple nonsynonymous mutations, although we predict these are likely to have little impact. There were also seven mouse strains that did not carry any missense variants in Casp1, Casp11, or Casp12, which would be good models to use for Casp11 and Casp12 studies: BALB/cJ, C57BL/6NJ, A/J, NZO/HlLtJ, FVB/NJ, DBA/2J, and C3H/HeJ.

Table 3.

Nonsynonymous variants in mouse inbred lines

  Casp1 Casp11 Casp12
Strain R33K E126K N152K E163Q G257E H305C 5-bp deletion C331Y A3V I15L D24N E46D P105L M130I L137I Q153K F154L E217K Q230H H270R N311S N330T
129S1_SvImJ X X X X X
cAST_EiJ x X X x X x X
pWK_PhJ x X x X X x
sPRET_EiJ x x x x x x x x
wSB_EiJ X X X
aKR_J X X X
cBA_J X X
nOD_ShiLtJ X
IP_J X
bALB_cJ
c57BL_6NJ
a_J
nZO_HILtJ
fVB_NJ
dBA_2J
c3H_HeJ

Italics and lowercase indicate sites that are unlikely to disrupt function, i.e. the same residue is present at that site in other rodent homologs. Bolded and uppercase boxes indicate sites with the potential to disrupt function, as the variant alters a highly conserved site. Empty boxes indicate the presence of the reference allele.

Discussion

While researchers commonly use mice as model systems for studying human proinflammatory caspases, we find a number of differences in the genes and evolutionary dynamics between primates and rodents, with potential implications for caspase functions.

First, the caspase-1 locus is more evolutionarily dynamic in primates, with evidence of recurrent gene duplications, pseudogenization of new genes, and gene conversion events (Figs. 1 to 3). Such events can create blocks of nucleotide identity within the locus that can facilitate further rearrangements or recombination, thus perpetuating complex genetic dynamics. The rapid amino acid turnover, particularly in primate CASP5 and CASP1 (Fig. 4), further accelerates the evolution of these genes. Combined with prior evidence of human segregating polymorphisms and copy number variation within the caspase-1 locus, we propose that these genes have not only a history of diversification but also abundant, ongoing innovation in primates. The genetic variation in the caspases is likely to cause differences in proinflammatory immune responses across individual humans (and cell lines) as well as nonhuman primates, which will not be reflected in mouse models. Inversely, many rodent laboratory models harbor caspase segregating variants that are not reflective of humans (Tables 2 and 3). Therefore, we urge researchers to carefully assess and report the caspase genotypes and gene copy number for the organisms and cell lines used in their experimental studies.

Second, our work has uncovered new information about variation in caspase gene content among primates and rodents. The CASP4/5 duplication event occurred in the ancestor of simian primates, around 50 Ma (Kumar et al. 2017), while the tarsier and loris lineages had additional, independent duplications of this locus (Fig. 1). Because duplication events often lead to subfunctionalization or neofunctionalization if the duplicates are retained for long periods of time, it is likely that there are significant differences in the functions of CASP4 and CASP5 genes and their single-copy Casp11 rodent homolog. Similarly, the CARD genes arose early, near the base of the primate tree, around 74 Ma. Because each of these genes lacks 1-to-1 orthologs in rodents and primates, their human functions may not be reflected in mouse models. We also discovered that both primates and rodents encode intact likely functional copies of CASP12, although this gene has been recurrently lost in both lineages. More surprisingly, we discovered multiple pseudogenization and gene loss events of both primate CASP5 and rodent Casp11, including four rodent species that apparently have Casp1 as their only caspase-1 subfamily gene. Given the proposed importance of Casp11 for pathogen defense, we speculate that these four species may have acquired new genes or alleles that can detect cytoplasmic LPS, thus compensating for Casp11 loss.

The evolutionary signatures also have interesting implications for the molecular functions of the caspase-1 subfamily. For example, we were surprised to see such strong selective signatures in CASP5 when compared with its close homologs CASP4 and Casp11, because current studies have yet to identify in vivo phenotypes or molecular functions associated with caspase-5 (or its unique N-terminal tail), whereas both caspase-4 and caspase-11 are known to mediate important host–microbe molecular interactions. These evolutionary discrepancies suggest that caspase-4 and caspase-5 may have significant differences in their in vivo functions and that caspase-5 plays an important but as yet undefined role in supporting organismal fitness. For example, caspase-5 may substitute for caspase-4 in certain contexts, or it may act as a molecular decoy for caspase-4 inhibitors. Although these scenarios will remain speculative until future experiments clarify the role of caspase-5, the positively selected residues we identified in Fig. 4c are likely to be sites that mediate caspase-5-specific activities.

We were also surprised to see the widespread signatures of positive selection in primate CASP1, when compared with the other caspase genes. This pattern suggests that caspase-1 is under substantial selective pressure at multiple molecular interfaces and that these pressures are largely not shared with caspase-4 or caspase-11. (The only exception is at I156 of Casp11/V217 of CASP5/T187 of CASP1, which could interact with similar pathogen antagonists.) Analogous patterns of protein evolution (i.e. many positively selected sites distributed throughout the protein) have been found in examples such as protein kinase R (PKR) and human myxovirus resistance protein B (MxB), which are thought to be on the “defense” of evolutionary arms races (Daugherty and Malik 2012). In these scenarios, proteins are evolving to evade pathogen antagonists at multiple sites throughout the protein. Although pathogens are known to antagonize the caspase-1 signaling pathway both upstream (Brodsky et al. 2010; Chung et al. 2016) and downstream (Luchetti et al. 2021) of caspase-1, we would not expect these known, indirect antagonists to drive adaptation in caspase-1 itself. Several viruses encode multicaspase inhibitors that can target caspase-1, caspase-4, and caspase-5 by forming covalent bonds at the caspase active site (Kamada et al. 1997; Best 2008). However, as we did not observe an enrichment of positive selection around the caspase-1 active site (Fig. 4d to g) and as the inhibitors are not caspase-1 specific, we also do not believe that these inhibitors fully explain the patterns of caspase-1 evolution. Instead, we postulate that there are multiple, unknown pathogen proteins that specifically bind and inhibit caspase-1 at different molecular interfaces. We further predict that the residues we highlighted here are likely to mediate binding specificity between caspase-1 and various antagonists as well as between a given antagonist and different members of the caspase-1 family.

Materials and Methods

Selection of Species and Genomes to Analyze

To compare the evolutionary histories of primate and rodent caspases, our analyses required at least 10 to 15 caspase genomic loci from both the rodent and primate clades (McBee et al. 2015). We selected genomes to use in 2022 and therefore included the latest assembly version available at the time (supplementary table S1, Supplementary Material online). High-quality genome assemblies were included based on the following quality metrics: low assembly fragmentation (N50 > 300 kb and L50 < 2,000) and high read coverage (>40×). Rodent and primate species trees were based on previously published work (Menezes et al. 2010; Jameson et al. 2011; McBee et al. 2015; Swanson et al. 2019; Molaro et al. 2020; Côrte-Real et al. 2022) and the Zoonomia database. To approximate divergence of the species, we used CODEML model 0 (Yang 2007) to calculate the pairwise dS between the CASP1 homologs in the species of interest and either humans (for primates) or house mouse (for rodents). Species that had a pairwise dS > 0.3 from the reference were considered to be outgroups.
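The assembly quality thresholds and the outgroup cutoff stated above can be expressed as a short filter. The following Python sketch is illustrative only: the `Assembly` record, its field names, and the example values are hypothetical, and it assumes the metrics have already been collected for each genome.

```python
from dataclasses import dataclass


@dataclass
class Assembly:
    species: str            # hypothetical label
    n50_bp: int             # assembly N50 in base pairs
    l50: int                # assembly L50 (number of contigs/scaffolds)
    coverage: float         # read coverage (fold)
    ds_to_reference: float  # pairwise dS of CASP1 vs. human or house mouse


def passes_quality(a: Assembly) -> bool:
    """Apply the stated thresholds: N50 > 300 kb, L50 < 2,000, coverage > 40x."""
    return a.n50_bp > 300_000 and a.l50 < 2_000 and a.coverage > 40


def is_outgroup(a: Assembly) -> bool:
    """Species with pairwise dS > 0.3 from the reference are treated as outgroups."""
    return a.ds_to_reference > 0.3


# Invented example values, for illustration only.
example = Assembly("example_species", n50_bp=5_000_000, l50=150,
                   coverage=60.0, ds_to_reference=0.12)
print(passes_quality(example), is_outgroup(example))
```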

Extracting the Caspase-1 Locus From the Assemblies

To identify the CASP1 locus and CASP1-related homologs in each of our genome assemblies, we first created local BLAST databases of the collected, high-quality genomes for rodents and primates using BLAST+ v2.13.0 (Camacho et al. 2009) makeblastdb. We then used the RefSeq representative transcript of house mouse or human CASP1 as a query in a blastn search on our local databases with default settings. From the output of possible CASP1 hits in each genome, we filtered for strong exon hits using a bitscore > 100. The coordinates of the hits also revealed that each CASP1-like gene was contained within a single genomic locus in each species, and none were located on different chromosomes. We then identified start and end coordinates of each candidate CASP1 homolog in each assembly, based on homology to the beginning and end of the reference query. Using the coordinates of the strongest CASP1 hit, we then extracted the CASP1 genomic locus with 2 Mb of excess flanking sequence for each assembly.
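The hit filtering and locus extraction steps can be sketched in Python as follows. This is a minimal illustration under two assumptions that are not stated in the text: that the blastn hits are available in the 12-column tabular format produced by `-outfmt 6`, and that the 2 Mb flank is applied on each side of the strongest hit. The function names are hypothetical.

```python
FLANK_BP = 2_000_000  # assumption: 2 Mb is added on each side of the hit


def parse_hits(tabular_text, min_bitscore=100.0):
    """Keep hits with bitscore > min_bitscore from BLAST tabular output.

    Expects the standard 12 columns: qseqid sseqid pident length mismatch
    gapopen qstart qend sstart send evalue bitscore.
    """
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        sstart, send = int(fields[8]), int(fields[9])
        bitscore = float(fields[11])
        if bitscore > min_bitscore:
            hits.append({
                "contig": fields[1],
                # Minus-strand hits report sstart > send, so normalize.
                "start": min(sstart, send),
                "end": max(sstart, send),
                "bitscore": bitscore,
            })
    return hits


def locus_window(hits, contig_length, flank=FLANK_BP):
    """Return (contig, start, end) around the strongest hit, clamped to the contig."""
    best = max(hits, key=lambda h: h["bitscore"])
    start = max(1, best["start"] - flank)
    end = min(contig_length, best["end"] + flank)
    return best["contig"], start, end
```

The returned 1-based coordinates could then be passed to any sequence extraction tool; clamping prevents the window from running past the ends of a short contig.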

Exon Annotations, Alignments, Phylogenetic Trees, and Caspase Paralog Identification

To retrieve accurate CASP coding sequences (CDSs) from the CASP1 locus in each species, we first annotated each exon. For each of the CASP1 homologs and CARD-only proteins, we chose the RefSeq representative transcript from human or house mouse as our reference CDS for exon annotation. In humans, CASP12 is considered catalytically inactive and likely a pseudogene (Fischer et al. 2002; Eckhart et al. 2008). Because pseudogenes are undergoing degeneration of their coding sequence, annotation of conserved exons from a pseudogenic reference is challenging. Therefore, we performed exon annotation of primate CASP12 loci using both the Sunda slow loris and house mouse CASP12 CDSs.

We searched for conserved exons of each homolog using both nucleotide similarity and synteny. To identify exons with high nucleotide identity to the reference (>60% identity to humans in primates or to house mouse in rodents), we used the Geneious Prime v2024.0.3 “Transfer Annotations” tool to transfer exon annotations from the reference sequence to the locus of interest. We empirically found that lowering the similarity below 60% resulted in spurious exon hits that did not align with homologs or contribute to full coding sequences after alignment. While this approach identified most exons of interest, some diverged exons were missed, whereas exons similar to multiple caspase paralogs matched to multiple regions within the same locus. We therefore relied on gene synteny to assign exons to the correct CASP gene and identify paralogs, even if only part of the gene was present.

To identify missing exons and obtain full CDS sequences for each gene from each species, we aligned the coding sequences of each gene with the annotated genomic DNA using MAFFT (Katoh and Standley 2013) with automatic direction detection. We then used the exon annotations to guide the manual trimming of introns across the alignment. We then realigned the trimmed sequences by translation using the Geneious Prime translation alignment tool with the MUSCLE v5 (Edgar 2022) plugin. We confirmed the quality of our manual trimming by comparing our trimmed reference species CDSs with the publicly available reference CDSs and protein sequences. In some cases, like the Jerboa Casp11 and Casp12, our manual trimming method initially failed because the intronic regions were very different in size compared with the rest of the rodent homologs. However, using the predicted CDSs of the Jerboa caspases and our exon annotation, we were able to recover full-length in-frame CDSs of these genes through manual trimming, independent of the gDNA alignment. For primate CASP5, Old World Monkeys had an insertion within the N-terminal tail of the protein, which we manually trimmed out before further analyses. We generated all phylogenetic trees using the Geneious Prime tree builder PhyML (Guindon et al. 2010) plugin with the general time-reversible (GTR) model and approximate likelihood-ratio test (aLRT) statistics. We constructed the trees in Figs. 1 and 3 with multiple different substitution models, including GTR, HKY85, and/or Blosum62. In all cases, the trees were very similar, with only minor variation in support values and branch lengths.

Pseudogene and Gene Loss Calls

While CASP1 was well conserved in all lineages, other genes in the locus were not. To confirm the accuracy of our homolog genomic extractions, especially in cases of pseudogenes and gene absence, we searched for annotated transcript evidence of each caspase homolog. We performed an NCBI BLAST search of the NCBI Transcript Reference Sequences database (refseq_rna) with each caspase homolog CDS reference. The presence of a full-length in-frame transcript in the BLAST results, identical in sequence to our extracted CDS, was additional evidence of a functional full-length CASP homolog. If there was not a full-length, in-frame transcript but an alternative splice isoform in the BLAST results was still in-frame, we called these homologs “partial in-frame transcripts.” However, this CDS search strategy was inconclusive if CASP homologs did not have any transcript evidence at all, leaving open the possibilities of gene loss, pseudogenization, or lack of genome annotation.

We called events as “gene losses” if fewer than 70% of the exons were identified as described above. These partial loci were not included in subsequent gene alignments or trees. To define pseudogenes, we examined the sequences for frameshift indel mutations, exon truncations, or exon loss. Frameshift indel mutations were differentiated from genome assembly errors by examining closely related species to see if the same defect was present in multiple, independent studies/assemblies. See supplementary table S2, Supplementary Material online for a full description of the presence, absence, and pseudogene data for each gene in each species.
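The calling rules above can be summarized in a small decision function. This Python sketch is a simplified illustration: the function name, the boolean inputs, and the "intact" label are hypothetical, and the check against closely related assemblies that was used to rule out assembly errors is assumed to have been done before the flags are set.

```python
def call_gene_status(exons_found, exons_expected,
                     has_frameshift=False, has_truncated_exon=False):
    """Return 'gene loss', 'pseudogene', or 'intact' for one locus.

    exons_found / exons_expected: counts from the exon annotation step.
    has_frameshift / has_truncated_exon: hypothetical flags set after
        manual inspection of the sequence.
    """
    if exons_expected <= 0:
        raise ValueError("exons_expected must be positive")
    if exons_found * 100 < 70 * exons_expected:
        # Fewer than 70% of the exons were identified.
        return "gene loss"
    if has_frameshift or has_truncated_exon or exons_found < exons_expected:
        # Frameshift indel, exon truncation, or loss of an exon.
        return "pseudogene"
    return "intact"
```

Integer arithmetic is used for the 70% comparison so that a locus with exactly 70% of its exons is not misclassified through floating-point rounding.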

Identification of Regions of Nucleotide Identity Within the Caspase Locus

To search for evidence of recent gene conversion or duplication, we identified regions of extended nucleotide identity consistent with recent gene conversion in the caspase locus of humans (GCF_000001405.40) and Mus musculus (GCA_000001635.9; Fig. 2). To do so, we extracted a ∼350-kb region from each genome, centered on the CASP1 gene. We then used the Dotplot tool of Geneious Prime to detect any regions of self-similarity, which appeared as off-diagonal lines in the dotplot. We annotated any DNA regions that were at least 1-kb long that contained at least 50% nucleotide identity to another region within the 350-kb locus.
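The reporting criteria (at least 1 kb long and at least 50% nucleotide identity) can be illustrated with a simple check on a pair of already aligned regions. The Python sketch below is a stand-in for illustration only and does not reproduce the Geneious Prime Dotplot tool. It assumes the two regions are supplied as equal-length aligned strings, and it treats gap columns as mismatches, which is an assumption of this example.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned columns with identical, non-gap nucleotides."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if not seq_a:
        return 0.0
    matches = sum(
        1 for a, b in zip(seq_a.upper(), seq_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(seq_a)


def is_reportable(seq_a, seq_b, min_length=1000, min_identity=50.0):
    """Apply the stated thresholds: >= 1 kb long and >= 50% identity."""
    return (len(seq_a) >= min_length
            and percent_identity(seq_a, seq_b) >= min_identity)
```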

Measurements of Selection and Recombination

To perform selection analyses, we selected 15 primate assemblies, with the most diverged species from human being marmoset at a dS of 0.13. For rodents, we selected the Muridae clade of 12 species, with the most diverged lineage from house mouse being spiny mouse at a dS of 0.29. To detect recombination breakpoints, we ran GARD (Kosakovsky Pond et al. 2006) on each homolog nucleotide alignment with default DataMonkey settings: faster run mode, universal genetic code, no site-to-site rate variation, and two rate classes. Because recombination can cause false positives in selection analyses, we then split the alignments at recombination breakpoints and analyzed each recombination segment independently. To detect residues under positive selection, we ran two programs: PAML CODEML (Yang 2007) and FUBAR (Murrell et al. 2013). We performed PAML CODEML analysis with models 0, 0a, 1, 2, 7, 8, and 8a. All PAML analyses with estimated omega were initially run with a starting omega of 0.4 and the codon frequency model F3X4. We additionally re-ran all PAML analyses using both a species tree and a tree built from the segment alignment. Model 0 estimates a single dN/dS value across the entire gene. Model 0a fixes omega at 1 to simulate a gene evolving neutrally. Model 1 partitions sites into two dN/dS estimates, dN/dS = 1 or dN/dS < 1, whereas model 2 allows for a third bin, dN/dS > 1; statistical tests compare models 1 and 2 to assess if there are sites with dN/dS significantly >1. Similarly, model 7 includes many dN/dS bins <1, model 8a includes a bin = 1, and model 8 includes a bin with dN/dS > 1. P-values were obtained by comparing the lnL values among M0 vs. 0a, M1 vs. 2, M7 vs. 8, or M8a vs. 8. We performed FUBAR using the default universal genetic code. Residues under positive selection were defined as those with a Bayes Empirical Bayes posterior probability >0.95 (CODEML) or a Bayesian posterior probability >0.9 (FUBAR). For segments that had residues identified as positively selected by PAML, we then tested the robustness of these predictions by re-running PAML with varying starting omega values (0.4, 1, or 1.5) and with multiple codon frequency models (1/61 or F3X4). See supplementary table S3 and data S1, Supplementary Material online for more detailed CODEML and FUBAR statistics and results.
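The lnL comparisons described above are likelihood ratio tests between nested models. The Python sketch below shows how a P-value could be derived from two lnL values, assuming the test statistic 2ΔlnL is compared with a chi-square distribution. The degrees of freedom shown (2 for M1 vs. 2 and M7 vs. 8; 1 for M0 vs. 0a and M8a vs. 8) follow common practice and are an assumption here, as the text does not state them. The function name is hypothetical.

```python
import math


def lrt_pvalue(lnl_null, lnl_alt, df):
    """P-value for a likelihood ratio test between nested models.

    lnl_null, lnl_alt: log-likelihoods of the null and alternative models.
    df: degrees of freedom; only 1 and 2 are supported, because both have
        closed-form chi-square tail probabilities.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    if df == 1:
        # Chi-square (df = 1) survival function.
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        # Chi-square (df = 2) survival function.
        return math.exp(-stat / 2.0)
    raise ValueError("only df = 1 or df = 2 are supported in this sketch")


# Invented lnL values, for illustration only.
p_m7_vs_m8 = lrt_pvalue(lnl_null=-1000.0, lnl_alt=-990.0, df=2)
print(p_m7_vs_m8)
```

The resulting P-value would then be compared against the multiple-testing-corrected threshold used for the analysis.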

Identification of Nonsynonymous Variants in Mouse and Rat Strains

We examined caspase variants in a set of laboratory inbred mouse and rat strains from the Sanger mouse genome project (Lilue et al. 2018) and the rat Heterogeneous Stock founder strains (Hansen and Spuhler 1984). For the rat strains, we used the Rat Genome Database Variant Visualizer tool to identify nonsynonymous variants (Laulederkind et al. 2023). For the mouse strains, we identified nonsynonymous variants using the Mouse strain assembly hub on the UCSC genome browser (https://hgdownload.soe.ucsc.edu/hubs/mouseStrains/hubIndex.html).

Acknowledgments

We thank Matt Daugherty, Liz Fay, Janet Young, Patrick Mitchell, Mahtab Moayeri, Cammie Lesser, Sunny Shin, members of the Levin lab, and members of the Pitt Molecular Evolution discussion group for their helpful input on the project and manuscript. T.C.L. and M.H. were supported in this work by a grant from the National Institutes of Health (R35GM150681).

Contributor Information

Mische Holland, Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA.

Rachel Rutkowski, Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA.

Tera C. Levin, Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA.

Supplementary Material

Supplementary material is available at Molecular Biology and Evolution online.

Data Availability

All datasets were derived from sources in the public domain with the relevant accessions listed in the supplement. The data and analysis underlying this article are available in the article and in its online Supplementary material.

References

  1. Best  SM. Viral subversion of apoptotic enzymes: escape from death row. Annu Rev Microbiol. 2008:62:171–192. 10.1146/annurev.micro.62.081307.163009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  2. Bibo-Verdugo  B, Salvesen  G. Evolution of caspases and the invention of pyroptosis. Int J Mol Sci. 2024:25(10):5270. 10.3390/ijms25105270. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Brodsky  IE, Palm  NW, Sadanand  S, Ryndak  MB, Sutterwala  FS, Flavell  RA, Bliska  JB, Medzhitov  R. A Yersinia effector protein promotes virulence by preventing inflammasome recognition of the type III secretion system. Cell Host Microbe. 2010:7(5):376–387. 10.1016/j.chom.2010.04.009. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Camacho  C, Coulouris  G, Avagyan  V, Ma  N, Papadopoulos  J, Bealer  K, Madden  TL. BLAST+: architecture and applications. BMC Bioinformatics. 2009:10:421. 10.1186/1471-2105-10-421. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Castro  LK, Daugherty  MD. Tripping the wire: sensing of viral protease activity by CARD8 and NLRP1 inflammasomes. Curr Opin Immunol. 2023:83:102354. 10.1016/j.coi.2023.102354. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Chung  LK, Park  YH, Zheng  Y, Brodsky  IE, Hearing  P, Kastner  DL, Chae  JJ, Bliska  JB. The Yersinia virulence factor YopM hijacks host kinases to inhibit type III effector-triggered activation of the pyrin inflammasome. Cell Host Microbe. 2016:20(3):296–306. 10.1016/j.chom.2016.07.018. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Côrte-Real  JV, Baldauf  H-M, Melo-Ferreira  J, Abrantes  J, Esteves  PJ. Evolution of Guanylate Binding Protein (GBP) genes in muroid rodents (Muridae and Cricetidae) reveals an outstanding pattern of gain and loss. Front Immunol. 2022:13:752186. 10.3389/fimmu.2022.752186. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Daugherty  MD, Malik  HS. Rules of engagement: molecular insights from host-virus arms races. Annu Rev Genet. 2012:46:677–700. 10.1146/annurev-genet-110711-155522. [DOI] [PubMed] [Google Scholar]
  9. Daugherty  MD, Zanders  SE. Gene conversion generates evolutionary novelty that fuels genetic conflicts. Curr Opin Genet Dev. 2019:58-59:49–54. 10.1016/j.gde.2019.07.011. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Devant  P, Cao  A, Kagan  JC. Evolution-inspired redesign of the LPS receptor caspase-4 into an interleukin-1β converting enzyme. Sci Immunol. 2021:6(62):eabh3567. 10.1126/sciimmunol.abh3567. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Devi  S, Indramohan  M, Jäger  E, Carriere  J, Chu  LH, de Almeida  L, Greaves  DR, Stehlik  C, Dorfleutner  A. CARD-only proteins regulate in vivo inflammasome responses and ameliorate gout. Cell Rep. 2023:42(3):112265. 10.1016/j.celrep.2023.112265. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Druilhe  A, Srinivasula  SM, Razmara  M, Ahmad  M, Alnemri  ES. Regulation of IL-1β generation by Pseudo-ICE and ICEBERG, two dominant negative caspase recruitment domain proteins. Cell Death Differ. 2001:8(6):649–657. 10.1038/sj.cdd.4400881. [DOI] [PubMed] [Google Scholar]
  13. Eckhart  L, Ballaun  C, Hermann  M, VandeBerg  JL, Sipos  W, Uthman  A, Fischer  H, Tschachler  E. Identification of novel mammalian caspases reveals an important role of gene loss in shaping the human caspase repertoire. Mol Biol Evol. 2008:25(5):831–841. 10.1093/molbev/msn012. [DOI] [PubMed] [Google Scholar]
  14. Eckhart  L, Fischer  H. Caspase-5: structure, pro-inflammatory activity and evolution. Biomolecules. 2024:14(5):520. 10.3390/biom14050520. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Edgar RC. Muscle5: high-accuracy alignment ensembles enable unbiased assessments of sequence homology and phylogeny. Nat Commun. 2022:13(1). 10.1038/s41467-022-34630-w.
  16. Enard D, Cai L, Gwennap C, Petrov DA. Viruses are a dominant driver of protein adaptation in mammals. Elife. 2016:5:e12469. 10.7554/eLife.12469.
  17. Fairley S, Lowy-Gallego E, Perry E, Flicek P. The International Genome Sample Resource (IGSR) collection of open human genomic variation resources. Nucleic Acids Res. 2020:48(D1):D941–D947. 10.1093/nar/gkz836.
  18. Fischer H, Koenig U, Eckhart L, Tschachler E. Human caspase 12 has acquired deleterious mutations. Biochem Biophys Res Commun. 2002:293(2):722–726. 10.1016/S0006-291X(02)00289-9.
  19. Guindon S, Dufayard J-F, Lefort V, Anisimova M, Hordijk W, Gascuel O. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol. 2010:59(3):307–321. 10.1093/sysbio/syq010.
  20. Hansen C, Spuhler K. Development of the National Institutes of Health genetically heterogeneous rat stock. Alcohol Clin Exp Res. 1984:8(5):477–479. 10.1111/j.1530-0277.1984.tb05706.x.
  21. Indramohan M, Stehlik C, Dorfleutner A. COPs and POPs patrol inflammasome activation. J Mol Biol. 2018:430(2):153–173. 10.1016/j.jmb.2017.10.004.
  22. Jameson NM, Hou ZC, Sterner KN, Weckle A, Goodman M, Steiper ME, Wildman DE. Genomic data reject the hypothesis of a prosimian primate clade. J Hum Evol. 2011:61(3):295–305. 10.1016/j.jhevol.2011.04.004.
  23. Jurka J. Repbase update: a database and an electronic journal of repetitive elements. Trends Genet. 2000:16(9):418–420. 10.1016/S0168-9525(00)02093-X.
  24. Kachapati K, O’Brien TR, Bergeron J, Zhang M, Dean M. Population distribution of the functional caspase-12 allele. Hum Mutat. 2006:27(9):975. 10.1002/humu.9448.
  25. Kamada S, Funahashi Y, Tsujimoto Y. Caspase-4 and caspase-5, members of the ICE/CED-3 family of cysteine proteases, are CrmA-inhibitable proteases. Cell Death Differ. 1997:4(6):473–478. 10.1038/sj.cdd.4400268.
  26. Karasawa T, Kawashima A, Usui F, Kimura H, Shirasuna K, Inoue Y, Komada T, Kobayashi M, Mizushina Y, Sagara J, et al. Oligomerized CARD16 promotes caspase-1 assembly and IL-1β processing. FEBS Open Bio. 2015:5:348–356. 10.1016/j.fob.2015.04.011.
  27. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013:30(4):772–780. 10.1093/molbev/mst010.
  28. Kayagaki N, Warming S, Lamkanfi M, Walle LV, Louie S, Dong J, Newton K, Qu Y, Liu J, Heldens S, et al. Non-canonical inflammasome activation targets caspase-11. Nature. 2011:479(7371):117–121. 10.1038/nature10558.
  29. Kobayashi T, Ogawa M, Sanada T, Mimuro H, Kim M, Ashida H, Akakura R, Yoshida M, Kawalec M, Reichhart J-M, et al. The Shigella OspC3 effector inhibits caspase-4, antagonizes inflammatory cell death, and promotes epithelial infection. Cell Host Microbe. 2013:13(5):570–583. 10.1016/j.chom.2013.04.012.
  30. Kosakovsky Pond SL, Posada D, Gravenor MB, Woelk CH, Frost SDW. GARD: a genetic algorithm for recombination detection. Bioinformatics. 2006:22(24):3096–3098. 10.1093/bioinformatics/btl474.
  31. Kumar S, Stecher G, Suleski M, Hedges SB. TimeTree: a resource for timelines, timetrees, and divergence times. Mol Biol Evol. 2017:34(7):1812–1819. 10.1093/molbev/msx116.
  32. Lamkanfi M, Declercq W, Kalai M, Saelens X, Vandenabeele P. Alice in caspase land. A phylogenetic analysis of caspases from worm to man. Cell Death Differ. 2002:9(4):358–361. 10.1038/sj.cdd.4400989.
  33. Lamkanfi M, Denecker G, Kalai M, D’hondt K, Meeus A, Declercq W, Saelens X, Vandenabeele P. INCA, a novel human caspase recruitment domain protein that inhibits interleukin-1β generation. J Biol Chem. 2004:279(50):51729–51738. 10.1074/jbc.M407891200.
  34. Laulederkind SJF, Hayman GT, Wang SJ, Kaldunski ML, Vedi M, Demos WM, Tutaj M, Smith JR, Lamers L, Gibson AC, et al. The rat genome database: genetic, genomic, and phenotypic data across multiple species. Curr Protoc. 2023:3(6). 10.1002/cpz1.804.
  35. Li Z, Liu W, Fu J, Cheng S, Xu Y, Wang Z, Liu X, Shi X, Liu Y, Qi X, et al. Shigella evades pyroptosis by arginine ADP-riboxanation of caspase-11. Nature. 2021:599(7884):290–295. 10.1038/s41586-021-04020-1.
  36. Lilue J, Doran AG, Fiddes IT, Abrudan M, Armstrong J, Bennett R, Chow W, Collins J, Collins S, Czechanski A, et al. Sixteen diverse laboratory mouse reference genomes define strain-specific haplotypes and novel functional loci. Nat Genet. 2018:50(11):1574–1583. 10.1038/s41588-018-0223-8.
  37. Lin XY, Choi MSK, Porter AG. Expression analysis of the human caspase-1 subfamily reveals specific regulation of the CASP5 gene by lipopolysaccharide and interferon-γ. J Biol Chem. 2000:275(51):39920–39926. 10.1074/jbc.M007255200.
  38. Lu A, Li Y, Schmidt FI, Yin Q, Chen S, Fu T-M, Tong AB, Ploegh HL, Mao Y, Wu H. Molecular basis of caspase-1 polymerization and its inhibition by a new capping mechanism. Nat Struct Mol Biol. 2016:23(5):416–425. 10.1038/nsmb.3199.
  39. Luchetti G, Roncaioli JL, Chavez RA, Schubert AF, Kofoed EM, Reja R, Cheung TK, Liang Y, Webster JD, Lehoux I, et al. Shigella ubiquitin ligase IpaH7.8 targets gasdermin D for degradation to prevent pyroptosis and enable infection. Cell Host Microbe. 2021:29(10):1521–1530.e10. 10.1016/j.chom.2021.08.010.
  40. Makoni NJ, Nichols MR. The intricate biophysical puzzle of caspase-1 activation. Arch Biochem Biophys. 2021:699:108753. 10.1016/j.abb.2021.108753.
  41. McBee RM, Rozmiarek SA, Meyerson NR, Rowley PA, Sawyer SL. The effect of species representation on the detection of positive selection in primate gene data sets. Mol Biol Evol. 2015:32(4):1091–1096. 10.1093/molbev/msu399.
  42. McLaughlin RN Jr, Malik HS. Genetic conflicts: the usual suspects and beyond. J Exp Biol. 2017:220(Pt 1):6–17. 10.1242/jeb.148148.
  43. Menezes AN, Bonvicino CR, Seuánez HN. Identification, classification and evolution of owl monkeys (Aotus, Illiger 1811). BMC Evol Biol. 2010:10:248. 10.1186/1471-2148-10-248.
  44. Molaro A, Malik HS, Bourc’his D. Dynamic evolution of de novo DNA methyltransferases in rodent and primate genomes. Mol Biol Evol. 2020:37(7):1882–1892. 10.1093/molbev/msaa044.
  45. Murrell B, Moola S, Mabona A, Weighill T, Sheward D, Kosakovsky Pond SL, Scheffler K. FUBAR: a fast, unconstrained Bayesian approximation for inferring selection. Mol Biol Evol. 2013:30(5):1196–1205. 10.1093/molbev/mst030.
  46. Ringwald M, Richardson JE, Baldarelli RM, Blake JA, Kadin JA, Smith C, Bult CJ. Mouse Genome Informatics (MGI): latest news from MGD and GXD. Mamm Genome. 2022:33(1):4–18. 10.1007/s00335-021-09921-0.
  47. Sahoo G, Samal D, Khandayataray P, Murthy MK. A review on caspases: key regulators of biological activities and apoptosis. Mol Neurobiol. 2023:60(10):5805–5837. 10.1007/s12035-023-03433-5.
  48. Shi J, Zhao Y, Wang Y, Gao W, Ding J, Li P, Hu L, Shao F. Inflammatory caspases are innate immune receptors for intracellular LPS. Nature. 2014:514(7521):187–192. 10.1038/nature13683.
  49. Sundaram B, Tweedell RE, Prasanth Kumar S, Kanneganti T-D. The NLR family of innate immune and cell death sensors. Immunity. 2024:57(4):674–699. 10.1016/j.immuni.2024.03.012.
  50. Swanson MT, Oliveros CH, Esselstyn JA. A phylogenomic rodent tree reveals the repeated evolution of masseter architectures. Proc Biol Sci. 2019:286(1902):20190672. 10.1098/rspb.2019.0672.
  51. Tenthorey JL, Kofoed EM, Daugherty MD, Malik HS, Vance RE. Molecular basis for specific recognition of bacterial ligands by NAIP/NLRC4 inflammasomes. Mol Cell. 2014:54(1):17–29. 10.1016/j.molcel.2014.02.018.
  52. Tsu BV, Agarwal R, Gokhale NS, Kulsuptrakul J, Ryan AP, Fay EJ, Castro LK, Beierschmitt C, Yap C, Turcotte EA, et al. Host-specific sensing of coronaviruses and picornaviruses by the CARD8 inflammasome. PLoS Biol. 2023:21(6):e3002144. 10.1371/journal.pbio.3002144.
  53. Tsu BV, Beierschmitt C, Ryan AP, Agarwal R, Mitchell PS, Daugherty MD. Diverse viral proteases activate the NLRP1 inflammasome. Elife. 2021:10:e60609. 10.7554/eLife.60609.
  54. Vollger MR, Guitart X, Dishuck PC, Mercuri L, Harvey WT, Gershman A, Diekhans M, Sulovari A, Munson KM, Lewis AP, et al. Segmental duplications and their variation in a complete human genome. Science. 2022:376(6588):eabj6965. 10.1126/science.abj6965.
  55. Wu J, Cai J, Tang Y, Lu B. The noncanonical inflammasome-induced pyroptosis and septic shock. Semin Immunol. 2023:70:101844. 10.1016/j.smim.2023.101844.
  56. Xu Z, Kombe Kombe AJ, Deng S, Zhang H, Wu S, Ruan J, Zhou Y, Jin T. NLRP inflammasomes in health and disease. Mol Biomed. 2024:5(1):14. 10.1186/s43556-024-00179-x.
  57. Xue Y, Daly A, Yngvadottir B, Liu M, Coop G, Kim Y, Sabeti P, Chen Y, Stalker J, Huckle E, et al. Spread of an inactive form of caspase-12 in humans is due to recent positive selection. Am J Hum Genet. 2006:78(4):659–670. 10.1086/503116.
  58. Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007:24(8):1586–1591. 10.1093/molbev/msm088.
  59. Zhu Q, Zheng M, Balakrishnan A, Karki R, Kanneganti T-D. Gasdermin D promotes AIM2 inflammasome activation and is required for host protection against Francisella novicida. J Immunol. 2018:201(12):3662–3668. 10.4049/jimmunol.1800788.

Supplementary Materials

msae220_Supplementary_Data

Data Availability Statement

All datasets were derived from sources in the public domain, with the relevant accessions listed in the supplement. The data and analyses underlying this article are available in the article and in its online Supplementary material.


Articles from Molecular Biology and Evolution are provided here courtesy of Oxford University Press
