Significance
Self and nonself discrimination is essential to innate immunity. The 2′,5′-oligoadenylate synthetase (OAS)–ribonuclease L (RNase L) pathway senses double-stranded RNA as nonself and represents a crucial branch of innate immunity. In this study, we found that the functional OAS–RNase L pathway might have originated through tinkering with preexisting proteins before the most recent common ancestor of jawed vertebrates during or before the Silurian period. Our study illustrates how an innate immune pathway originated through molecular tinkering and how the arsenal of immunity has been supplemented at times during the evolution of life.
Keywords: innate immunity, OAS–RNase L pathway, molecular tinkering, phylogenetics
Abstract
Discriminating self from nonself is fundamental to immunity. Yet, it remains largely elusive how the mechanisms of self and nonself discrimination originated. Sensing double-stranded RNA as nonself, the 2′,5′-oligoadenylate synthetase (OAS)–ribonuclease L (RNase L) pathway represents a crucial component of innate immunity. Here, we combine phylogenomic and functional analyses to show that the functional OAS–RNase L pathway likely originated through tinkering with preexisting proteins before the rise of jawed vertebrates during or before the Silurian period (444 to 419 Mya). Multiple concerted losses of OAS and RNase L occurred during the evolution of jawed vertebrates, further supporting the ancient coupling between OAS and RNase L. Moreover, both OAS and RNase L genes evolved under episodic positive selection across jawed vertebrates, suggesting a long-running evolutionary arms race between the OAS–RNase L pathway and microbes. Our findings illuminate how an innate immune pathway originated via molecular tinkering.
Distinguishing self from nonself is essential to immunity. Nucleic acids can serve as microbe-associated molecular patterns (MAMPs) sensed by the innate immune system, allowing the host to discriminate between self and nonself (1–3). The 2′,5′-oligoadenylate synthetase (OAS) proteins are among the first RNA sensors discovered in the cytosol (4, 5). Upon binding of double-stranded RNA (dsRNA), OAS proteins convert ATP into 2′,5′-linked oligoadenylates (2-5As) consisting of p3A(2′p5′A)n (n ≥ 2) (5, 6). 2-5As act as second messenger molecules and, in turn, activate monomeric ribonuclease L (RNase L) (7). RNase L then dimerizes into its active state and degrades single-stranded viral and cellular RNA, blocking viral replication (1, 8–12). The OAS–RNase L pathway represents a crucial component of innate immunity.
OAS proteins belong to a highly diverse superfamily of nucleotidyltransferase (NTase) fold proteins that are widely present in cellular organisms and viruses (13). OAS homologs that can catalyze the synthesis of 2-5As have been documented sporadically in metazoans (14–19). In contrast, little is known about the origin and evolution of RNase L. A lack of comparative studies taking both OAS and RNase L proteins into account leaves the origin of the functional OAS–RNase L innate immune pathway enigmatic.
Here, we combined phylogenomic and functional analyses to interrogate the origin and evolution of the OAS–RNase L innate immune pathway. Our findings suggest that the functional OAS–RNase L pathway originated through tinkering with preexisting proteins before the most recent common ancestor (MRCA) of modern jawed vertebrates and probably after the divergence between jawless and jawed vertebrates during or before the Silurian period. We found that multiple concerted losses of OAS and RNase L took place during the evolution of jawed vertebrates, supporting the ancient coevolution between OAS and RNase L proteins. Signals of episodic positive selection were detected in OAS and RNase L genes across jawed vertebrates, indicating a long-running arms race between the OAS–RNase L pathway and microbes. Our findings illuminate how an innate immune pathway originated through molecular tinkering deep in the evolution of vertebrates and provide insights into the origins and evolution of innate immunity.
Results
The Origin and Evolution of OAS Proteins.
The human OAS gene family consists of four closely related genes, namely hsOAS1, hsOAS2, hsOAS3, and hsOASL (hs indicates the source of genes, Homo sapiens) (Fig. 1A). The hsOASL protein possesses two tandem ubiquitin-like (UBL) domains at the C terminus and has been shown to lack the enzymatic activity of 2-5A synthesis (20–22). All four human OAS proteins share two domains, NTase (more specifically NTP_transf_2; accession: PF01909) and OAS1_C (accession: PF10421). Thus, for convenience, a tandem NTase and OAS1_C combination (either with or without UBL) was defined as an OAS unit in this study. hsOAS1, hsOAS2, hsOAS3, and hsOASL proteins encode one, two, three, and one OAS units, respectively (17, 23) (Fig. 1A).
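The OAS unit definition above can be sketched as a small counting routine. This is a minimal illustration only: the domain lists are simplified, hand-written architectures that follow the unit counts stated in the text, and the names `count_oas_units` and `HUMAN_OAS` are hypothetical, not part of the study's pipeline.

```python
from typing import Dict, List

NTASE = "NTase"
OAS1_C = "OAS1_C"
UBL = "UBL"


def count_oas_units(domains: List[str]) -> int:
    """Count tandem NTase -> OAS1_C pairs in an N- to C-terminal domain list.

    Trailing UBL domains do not change the count, matching the definition
    of an OAS unit "either with or without UBL".
    """
    units = 0
    i = 0
    while i < len(domains) - 1:
        if domains[i] == NTASE and domains[i + 1] == OAS1_C:
            units += 1
            i += 2  # skip past the pair that was just counted
        else:
            i += 1
    return units


# Simplified, illustrative architectures (assumed for this sketch).
HUMAN_OAS: Dict[str, List[str]] = {
    "hsOAS1": [NTASE, OAS1_C],
    "hsOAS2": [NTASE, OAS1_C] * 2,
    "hsOAS3": [NTASE, OAS1_C] * 3,
    "hsOASL": [NTASE, OAS1_C, UBL, UBL],
}

unit_counts = {name: count_oas_units(arch) for name, arch in HUMAN_OAS.items()}
```

An isolated NTase or OAS1_C domain does not count as a unit in this sketch, which is relevant because both domains appear to have been lost frequently in OAS-related proteins.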
Fig. 1.
The evolutionary history of OAS-related proteins. (A) Domain architectures of human OAS (hsOAS1-3 and hsOASL) proteins. The clade number to which each human OAS protein belongs is shown above each OAS unit. (B) The phylogenetic relationship of NTase-containing proteins from cellular organisms reconstructed based on the NTase domain (accession: CL0260). Domain architectures of representative proteins are shown. Note: NTP_transf_2, SMODS, DNA_pol_B_palm, RlaP, and TUTase belong to the NTase superfamily and are shaded in brown. The lineage of OAS-related proteins is highlighted in red. The distribution for each NTase-containing protein in cellular organisms is shown, and solid and empty circles indicate presence and absence of related proteins, respectively. The eukaryote phylogeny is based on the literature (24). (C) The phylogenetic relationship of OAS-related proteins and bacterium-NTase proteins reconstructed based on the OAS unit (NTase and OAS1_C). OAS-related proteins are classified into three groups, namely OASa to OASc. The lines connecting OAS units indicate that these units are fused in one protein, and the units included or excluded in phylogenetic analysis are labeled in deep and shallow colors. The OASa group is collapsed into a gray triangle and shown in detail in panel (E). (D) The relationship among OAS units of OASa proteins. OAS units within a protein are connected by lines in blue (two units), yellow (three units), red (four units), and green (five units). (E) The phylogenetic relationship of OASa proteins, which shows the details of the gray triangle in panel (C). OASa proteins are classified into 22 groups, namely clade 1 to 22. Human OASa proteins are highlighted, and the name for each unit is shown. OASa proteins are divided into OASa.U group (with the UBL domain) and OASa.dU group (without the UBL domain), which is indicated by a red line. OASL proteins from placental mammals are classified into two lineages, OASL1 and OASL2. The units included or excluded in phylogenetic analysis are labeled in deep and shallow colors. In panels (A), (C), and (E), domain architectures for each protein are shown near the corresponding protein. Branch supports are shown near the selected nodes. The tips are labeled based on their source species.
Given that OAS proteins belong to the NTase superfamily, we first identified NTase homologs across 174 representative cellular organisms, including 21 bacteria, 15 archaea, and 138 eukaryotes (SI Appendix, Table S1). The NTase superfamily (accession: CL0260) includes NTP_transf_2 (accession: PF01909), SMODS (accession: PF18144), DNA_pol_B_palm (accession: PF14792), RlaP (accession: PF10127), TUTase (accession: PF19088), and others that share a certain level of sequence or structural similarity. Large-scale phylogenetic analyses of NTase-containing proteins show that NTase proteins with an OAS1_C domain cluster together with robust support (UFBoot = 100%); these are designated OAS-related proteins hereafter. OAS-related proteins were identified in amoebozoans, choanoflagellates, and metazoans (Fig. 1B) and cluster with a lineage of NTase-containing proteins from bacteria (designated bacterium-NTase) with robust support (UFBoot = 100%). Next, we performed phylogenetic analyses of OAS-related proteins with expanded metazoan sampling (370 cellular organism species with 280 metazoans) (SI Appendix, Table S2). Based on the phylogenetic analyses of the NTase-OAS1_C unit, OAS-related proteins are divided into three major groups, namely OASa (with UFBoot = 100%), OASb (with UFBoot = 86%), and OASc (Fig. 1C). The OASa group includes classical OAS proteins, such as hsOAS1-3 and hsOASL, and is only present in jawed vertebrates. The OASb group is patchily distributed across unikonts. The OASc group is only distributed in choanoflagellates (Fig. 1C). The OAS-related proteins share significant structural similarity to hsOAS1 protein (SI Appendix, Fig. S1). Combining phylogenetic analyses and domain architectures (Fig. 1 C and E), we found that NTase and OAS1_C domains appear to have been frequently lost during the evolution of OAS-related proteins.
OASa proteins typically encode one to five repeats of OAS units, indicating that OAS units experienced frequent intragene duplication (Fig. 1D). Moreover, we found that gene conversion occurred occasionally among paralogous OAS-related genes or among OAS units, indicating gene conversion might play a role in the diversification of OAS genes (SI Appendix, Table S3). Based on phylogenetic and domain architecture analyses, OASa proteins can be divided into two major classes designated OASa.U (with UBL; for example, hsOASL) and OASa.dU (without UBL; for example, hsOAS1 to hsOAS3). Phylogenetic analyses suggest that OASa.dU proteins nest within the diversity of OASa.U proteins, indicating that NTase-OAS1_C-UBL is the ancient architecture of OASa proteins. OASa.dU proteins are present in amniotes, whereas OASa.U are distributed across jawed vertebrates (Fig. 1E). We infer that OASa.dU proteins might have arisen in two steps: i) An ancient gene duplication event occurred in the MRCA of tetrapods; ii) One of them lost the UBL domain before the MRCA of amniotes, leading to the birth of OASa.dU proteins. It should be noted that several OASa.dU proteins from Squamata species carry the UBL domain, which may have been independently acquired in relatively recent time (clade 12 in Fig. 1E). Taken together, our results suggest that OASa proteins originated from a preexisting OAS protein through acquiring additional UBL domains before the MRCA of jawed vertebrates during or before the Silurian period (444 to 419 Mya) (25, 26).
The Origin and Evolution of RNase L Proteins.
RNase L is a latent endoribonuclease composed of three domains: N-terminal Ankyrin (ANK) repeat domain (accession: PF12796, PF13637, or PF13857) involved in the binding of 2-5As, Pkinase (PK) domain (accession: PF00069) involved in RNase L dimerization, and C-terminal ribonuclease (RNase) domain (accession: PF06479) that cleaves ssRNA (27–29). RNase L shares certain sequence and structure similarity with inositol-requiring enzyme 1 (IRE1) protein involved in unfolded protein response (30, 31). Through phylogenomic analyses of representative cellular organisms, we identified RNase homologs in a wide range of eukaryotes (Fig. 2A). Most of these RNase homologs encode an additional domain, the PK domain. Phylogenetic analyses of RNase domains show that a monophyletic group of RNase homologs encode two to four ANK repeats and includes classical RNase L proteins, which is designated the RNase L lineage (clade B in Fig. 2A). RNase L proteins share significant structural similarity to hsRNase L protein, but non-RNase L RNase proteins share weaker structural similarity to hsRNase L protein (SI Appendix, Fig. S2). Phylogenetic analyses of RNase L-related proteins with expanded metazoan sampling (SI Appendix, Table S2) reveal that RNase L proteins are ubiquitously present in all the major lineages of jawed vertebrates except Actinopterygii (ray-finned fishes) (Fig. 2B). Outside the diversity of RNase L proteins, several proteins also encode ANK repeats, which are likely to have been gained independently. Taken together, our results indicate that RNase L proteins might have originated from a preexisting protein with PK and RNase domains through acquiring ANK repeats before the MRCA of jawed vertebrates.
Fig. 2.
The evolutionary history of RNase L-related proteins. (A) The phylogenetic relationship of RNase L homologs based on RNase domains. RNase L homologs are classified into ten clades. The distribution for each clade in cellular organisms is shown, and the eukaryote tree is based on the literature (24). Solid and empty circles indicate presence and absence of the corresponding clades, respectively. Domain architectures of representative proteins are shown. (B) The phylogenetic relationship of RNase L-related proteins and proteins from clades C1-C4, D, and E in panel (A). The domain architecture for each protein is shown near the corresponding protein. Branch supports are shown near the selected nodes. The tips are labeled based on their source species.
The Origin of the Functional OAS–RNase L Pathway.
Sequence similarity does not necessarily demonstrate functional conservation (32). To investigate whether OAS and RNase L proteins work together as a functional pathway, we used a proven budding yeast (Saccharomyces cerevisiae) system (33). Budding yeast constitutively produces dsRNA in the cytosol (34, 35), and heterologous coexpression of hsOAS1 and hsRNase L can arrest yeast growth due to cellular RNA degradation (33). We first tested the human OAS–RNase L pathway through heterologous coexpression of hsOASa genes (hsOAS1, hsOAS2, hsOAS3, and hsOASL) and hsRNase L gene. While galactose-induced expression of either hsOASa or hsRNase L does not inhibit yeast growth, coexpression of hsOAS1 and hsRNase L, hsOAS2 and hsRNase L, or hsOAS3 and hsRNase L results in yeast growth arrest, ribosomal RNA (rRNA) cleavage, and reduced hsOASa and hsRNase L protein levels (Fig. 3A). Moreover, coexpression of hsOAS1, hsOAS2, or hsOAS3 and a catalytically inactive hsRNase L mutant (H672A) has no effect on yeast growth, and thus yeast growth arrest is dependent on the activation of hsRNase L (Fig. 3A). These results confirm that the yeast system provides an efficient platform to study the function of the OAS–RNase L pathway (33).
Fig. 3.
Functional analyses of the OAS–RNase L pathway in the yeast system. (A–D) Functional analyses of OASa and RNase L from representative jawed vertebrates, including H. sapiens (A), R. bivittatum (B), L. chalumnae (C), and C. milii (D). (E) Functional analyses of OASb and RNase L from R. bivittatum and C. milii. Yeast cells were transformed with the indicated plasmids encoding OAS-HA, RNase L-Flag, or both under a galactose-induced promoter (GAL1/10). Tenfold serial dilutions were spotted on the surface of solid medium containing either glucose or galactose and imaged after 48 h or 72 h, respectively. Proteins were separated by SDS-PAGE gel and detected by western blotting with an antibody to HA tag, Flag tag, and GAPDH, respectively. Total RNA was extracted and separated on agarose gel to assess RNA integrity. Transformants used in western blots and RNA integrity analyses were induced by galactose for 6 h.
Our phylogenomic analyses suggest that OASa and RNase L proteins originated before the MRCA of jawed vertebrates. To further investigate the origin of functional coupling of OASa and RNase L proteins, we coexpressed OASa and RNase L from three representative species that cover the diversity of jawed vertebrates, including the two-lined caecilian Rhinatrema bivittatum (amphibian), the coelacanth Latimeria chalumnae (lobe-finned fish), and the Australian ghost shark Callorhinchus milii (cartilaginous fish), in the yeast system. For all three vertebrates, coexpression of OASa and RNase L attenuates yeast growth, whereas yeast grows normally when OASa and a catalytically inactive RNase L are coexpressed (Fig. 3 B–D). Coexpression of rbOASa and rbRNase L genes from R. bivittatum results in significant rRNA degradation and reduced OASa and RNase L protein levels. We did not observe significant rRNA degradation when coexpressing OASa and RNase L from L. chalumnae and C. milii. However, the reduced OASa and RNase L protein levels indicate that degradation of cellular mRNA might occur. To verify mRNA degradation, we coexpressed OASa and RNase L from L. chalumnae and C. milii in yeast and used long-read sequencing to sequence mRNA with poly(A) (Fig. 4A and SI Appendix, Fig. S4). When coexpressing OASa and RNase L from L. chalumnae or C. milii, the length of sequenced mRNA with poly(A) was reduced (Fig. 4 B and C). When we closely examined four randomly selected example genes (ALG3, RIB1, ACH1, and RCR1), a lower sequencing coverage was observed for the 5′ end of their mRNA in the yeast with coexpression of OASa and RNase L (Fig. 4D). Moreover, the length of sequenced mRNA was significantly reduced in the yeast with coexpression of OASa and RNase L (Fig. 4E). These results support that the observed growth arrest of yeast with coexpression of OASa and RNase L from L. chalumnae and C. milii is caused by degradation of cellular mRNA, which is not readily observed through the total RNA gel. We observed that the strain expressing lcRNase L gene grew relatively weakly (but more strongly than the strain expressing both lcOASa and lcRNase L genes) (Fig. 3C and SI Appendix, Fig. S3). When expressing lcRNase L with a catalytic site mutation (lcRNase L H710D), we observed improved (but not fully rescued) yeast growth. These results suggest that other mechanisms (besides RNase L activity) might be responsible for the weaker growth of the yeast strain expressing lcRNase L, which remain to be explored. Nevertheless, our results suggest that OASa and RNase L work together as a functional pathway across jawed vertebrates.
Fig. 4.
Degradation of mRNA with poly(A) in yeasts transformed with OASa and RNase L genes of L. chalumnae and C. milii. (A) Schematic outline of mRNA degradation detection using Oxford Nanopore Technologies (ONT) long-read sequencing. Yeast cells were transformed with plasmids carrying OAS and RNase L genes under the control of the GAL1/10 promoter, and yeast cells transformed with plasmids carrying the GAL1/10 promoter were used as control. mRNA with poly(A) was captured from the total RNA by magnetic beads, followed by ONT long-read sequencing. (B) Distribution of the length of mRNA with poly(A) isolated from yeasts transformed with the indicated plasmids. lcOR and cmOR indicate plasmids with inducible OASa and RNase L genes from L. chalumnae and C. milii, respectively. Two replicates (R1 and R2) were performed. Sequences with length of > 5,000 nt are not included. Dotted lines represent N50 values. (C) Distribution of the length of mRNA with poly(A) isolated from yeasts transformed with the indicated plasmids. Rectangles, center lines, and whiskers indicate interquartile ranges (IQRs), medians, and 1.5× IQRs, respectively. (D) Sequence coverage along four example genes in yeast chromosome II. Gene structures are shown with lines indicating untranslated regions, rectangles indicating coding regions, and arrows indicating transcription direction. (E) Comparison of the length of reads mapped onto four example genes. Diamonds indicate the means of read length. Rectangles, center lines, and whiskers indicate IQRs, medians, and 1.5× IQRs, respectively. P-values are calculated using two-tailed Wilcoxon rank-sum tests.
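The read-length summary in panel (B) can be illustrated with a short sketch of an N50 calculation. It assumes the standard definition of N50 (the read length at which reads of that length or longer account for at least half of all sequenced bases) and applies the 5,000-nt display cutoff before computing it; whether the study applied the cutoff before or after calculating N50 is not stated here, and the function names and toy read lengths are hypothetical.

```python
from typing import Iterable, List


def filter_reads(lengths: Iterable[int], max_len: int = 5000) -> List[int]:
    """Drop reads longer than max_len nt (assumed to mirror the panel B cutoff)."""
    return [n for n in lengths if n <= max_len]


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of read lengths.

    Reads are sorted from longest to shortest and summed until the running
    total reaches half of all bases; the length of the read at which this
    happens is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one read length is required")
    half_total = sum(ordered) / 2
    running = 0
    for n in ordered:
        running += n
        if running >= half_total:
            return n
    return ordered[-1]  # unreachable for positive lengths; kept for safety


# Toy read lengths in nt (invented for illustration, not data from the study).
toy_reads = [300, 500, 800, 1200, 2000, 6500]
kept = filter_reads(toy_reads)
toy_n50 = n50(kept)
```

A shift of the N50 toward shorter lengths in the coexpression samples relative to the empty-vector control is the kind of signal that panel (B) summarizes with dotted lines.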
Our phylogenomic analyses indicate that OASb proteins are widely distributed in unikonts besides jawed vertebrates. To test whether OASb and RNase L work together in a functional pathway, we coexpressed OASb and RNase L from R. bivittatum and C. milii in the yeast system. We did not observe yeast growth arrest or rRNA degradation (Fig. 3E). These data do not support functional coupling of OASb and RNase L proteins. Taken together, our results suggest that the functional OAS–RNase L pathway might have originated in the MRCA of jawed vertebrates during or before the Silurian period (25, 26).
Coevolution between OASa and RNase L Proteins.
When analyzing the distribution of OAS and RNase L proteins across cellular organisms (Fig. 5 and SI Appendix, Fig. S5), we observed an interesting pattern: OASa and RNase L are often absent simultaneously in certain jawed vertebrates, in particular ray-finned fishes and two amphibian orders (Anura and Caudata), suggesting that concerted losses of OASa and RNase L took place independently in the MRCA of ray-finned fishes and the MRCA of Anura and Caudata (Fig. 5 and SI Appendix, Fig. S5). Moreover, many recent losses of OASa, RNase L, or both were also observed in jawed vertebrates (Fig. 5 and SI Appendix, Fig. S5). In contrast, OASb proteins are patchily distributed in unikonts, and OASb and RNase L proteins appear to differ greatly in distribution (Fig. 5 and SI Appendix, Fig. S5). We further quantified the correlation between the distribution of OAS and RNase L proteins across jawed vertebrates using the presence and absence of OASa, OASb, and RNase L as traits (SI Appendix, Table S4). Evidence favoring correlation was found between OASa and RNase L (Log BF = 8.50) (Fig. 5B), but not between OASb and RNase L (Log BF = −16.90) (Fig. 5C), further supporting the ancient coupling between OASa and RNase L proteins in jawed vertebrates.
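How the two Log BF values might be read can be sketched as follows. This section does not describe how Log BF was computed, so the sketch assumes the common convention in correlated-trait analyses: twice the difference in log marginal likelihood between a dependent (correlated) and an independent model of trait evolution. The interpretation bands follow the widely used Kass and Raftery scale for 2 ln(BF). Both the formula and the bands are assumptions of this illustration, and the function names are hypothetical.

```python
def log_bayes_factor(log_ml_dependent: float, log_ml_independent: float) -> float:
    """Assumed convention: 2 * (log marginal likelihood of the dependent
    model minus that of the independent model)."""
    return 2.0 * (log_ml_dependent - log_ml_independent)


def interpret_log_bf(log_bf: float) -> str:
    """Map a 2*ln(BF) value onto the Kass and Raftery evidence bands."""
    if log_bf < 0:
        return "favors the independent model"
    if log_bf <= 2:
        return "weak evidence for correlation"
    if log_bf <= 6:
        return "positive evidence for correlation"
    if log_bf <= 10:
        return "strong evidence for correlation"
    return "very strong evidence for correlation"


# Values reported in the text for the two trait pairs.
oasa_vs_rnasel = interpret_log_bf(8.50)
oasb_vs_rnasel = interpret_log_bf(-16.90)

# Toy marginal likelihoods (invented) showing how the statistic is formed.
toy_log_bf = log_bayes_factor(-50.0, -53.0)
```

Under these assumptions, the OASa and RNase L comparison falls in the "strong" band, while the negative value for OASb and RNase L indicates that the independent model is preferred.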
Fig. 5.
Distribution and coevolution of OAS and RNase L proteins across cellular organisms. (A) Distribution of OASa to OASc and RNase L proteins across cellular organisms. Numbers of species used are shown near the corresponding eukaryote groups. Solid and empty circles indicate the presence and absence of the related proteins in species of the corresponding groups, respectively. Semisolid circles indicate the presence of the related proteins in a portion of species of the corresponding groups, and the numbers in circles indicate the numbers of species possessing the related proteins. The eukaryote tree is based on the literature (24, 36), and phylogenetic uncertainty is indicated with dotted lines. (B) Correlation between the distribution of OASa (Left) and RNase L (Right) across jawed vertebrates. (C) Correlation between the distribution of OASb (Left) and RNase L (Right) across jawed vertebrates. For panels (B) and (C), the phylogeny of jawed vertebrates is based on the literature (24, 36). Solid and empty squares indicate the presence and absence of the related proteins in the corresponding species, respectively.
Episodic Positive Selection Acting on the OAS and RNase L.
To test whether an evolutionary arms race has occurred between the OAS–RNase L pathway and microbes, we searched for signals of positive selection in OASa and RNase L genes in diverse vertebrate groups. Varying numbers of positively selected sites were detected in nearly all the OASa clades across jawed vertebrates (SI Appendix, Fig. S6A) and in RNase L genes across jawed vertebrates (SI Appendix, Fig. S6B). Sites subject to positive selection appear to be dispersed along OASa and RNase L genes, indicating that diverse antagonists might have been involved in thwarting the OAS–RNase L pathway. Moreover, we also detected many lineages subject to positive selection in OASa and RNase L genes across jawed vertebrates (SI Appendix, Figs. S7–S9). Our results indicate that a long-running arms race might have occurred between the OAS–RNase L pathway and diverse microbes during the evolution of jawed vertebrates.
Discussion
Distinguishing nonself from self is fundamental to immunity. Sensing dsRNA as MAMPs, the OAS–RNase L pathway represents a classical innate immune system to discriminate nonself from self. In this study, we combined phylogenomic and functional analyses to investigate the origin and evolution of the OAS–RNase L pathway. We found that OAS and RNase L homologs can be identified in unikonts, but RNase L with intact functional domains might have originated in the MRCA of jawed vertebrates. Functional analyses using the yeast system show that OASa, rather than OASb, functions with RNase L across jawed vertebrates. The distribution of RNase L is correlated with that of OASa, but not OASb, in jawed vertebrates, indicating the ancient coupling of OASa and RNase L in jawed vertebrates. Moreover, no OASa or RNase L proteins were identified in jawless vertebrates. These lines of evidence suggest that the functional OAS–RNase L pathway originated before the MRCA of jawed vertebrates and probably after the divergence between jawless and jawed vertebrates during or before the Silurian period. It should be noted that the functional assays used in this study are the heterologous expression of OAS and/or RNase L genes in the yeast model, and may miss the complexity of the regulation of the OAS–RNase L pathway and other lineage- or species-specific protein partners that may be implicated.
Our phylogenetic analyses show that OASa and OASb proteins form two groups (Fig. 1C), but OASa and OASb proteins exhibit different distributions (Fig. 5 and SI Appendix, Fig. S5). The pattern might be explained by the following scenario: OASa and OASb arose through an ancient gene duplication during the early evolution of unikonts, and OASa was recurrently lost in unikonts outside vertebrates. Indeed, OASb appears to have also undergone extensive losses throughout the evolution of unikonts. Nevertheless, our analyses indicate that OASa proteins originated from an ancient OAS-related protein by gaining UBL domains before the MRCA of jawed vertebrates. In concert, RNase L originated from a preexisting protein with PK and RNase domains by acquiring ANK repeats that are involved in 2-5A binding (27–29) before the MRCA of jawed vertebrates. Therefore, our study reveals that the OAS–RNase L pathway originated via molecular tinkering with preexisting proteins (Fig. 6). Interestingly, our previous study shows that plant innate immune machinery, the HOPZ-ACTIVATED RESISTANCE 1 (ZAR1) resistosome, also originated through tinkering with preexisting immune proteins (37). Our findings highlight the role of molecular tinkering with preexisting proteins in the origins of innate immune mechanisms.
Fig. 6.
Model for the origin and evolution of the functional OAS–RNase L pathway. OAS proteins might have originated during the early evolution of unikonts. OASa protein originated from an ancestral OAS-related protein by gaining UBL domains before the MRCA of jawed vertebrates. RNase L with intact functional domains originated from an ancestral protein with PK and RNase domains through acquiring ANK repeats before the MRCA of jawed vertebrates. The functional pathway of OASa and RNase L originated before the MRCA of jawed vertebrates and probably after the divergence of jawed and jawless vertebrates. A lineage of OASa.U proteins lost their UBL domain before the MRCA of amniotes. The OAS–RNase L pathway was lost in the MRCA of ray-finned fishes and the MRCA of Anura and Caudata. The order of the losses of OAS and RNase L in specific lineages remains uncertain. OASa proteins underwent amplification and diversification in mammals.
Our analyses suggest that OASa proteins evolved in a birth-and-death manner, in which new genes are generated through recurrent gene duplication (in this case, duplicate genes are also frequently fused), and some duplicate genes are maintained for a long time whereas others become nonfunctional and are eventually deleted (38). OASa underwent multiple rounds of gene duplication, gene fusion, and gene conversion during the evolution of mammals, generating diverse OASa.U proteins (such as OASL1 and OASL2 lineages) and OASa.dU proteins (such as hsOAS1, hsOAS2, hsOAS3) (Fig. 1E). Some OASa proteins have retained the ancestral function, whereas others have evolved various new functions, which can explain the different effects of human OASa proteins on yeast growth (Fig. 3A) and the different strengths of positive selection acting on distinct OASa clades (SI Appendix, Fig. S6). OASa proteins were also frequently lost across the jawed vertebrates, consistent with a previous study showing that OAS1 loss-of-function variation is common in primates (33). Moreover, the OAS–RNase L pathway was lost in the MRCA of ray-finned fishes and the MRCA of Anura and Caudata. The frequent losses of OASa proteins or the OAS–RNase L pathway might be due to the cost associated with their activation (33).
Interferons (IFNs) regulate the expression of OAS genes and RNase L genes in human cells (7, 39). The origin of the functional OAS–RNase L pathway coincides with the emergence of IFNs before the MRCA of jawed vertebrates (40–42). However, the role of IFNs in the regulation of the OAS–RNase L pathway in jawed vertebrates outside mammals remains to be explored. Moreover, the functional OAS–RNase L pathway originated concurrently with the so-called “Big Bang” emergence of adaptive immunity in jawed vertebrates (43–45). It follows that innate immune systems might not necessarily be older than adaptive immunity. Signaling mediated by toll-like receptors, which are pattern recognition receptors in innate immunity, is likely to have a more ancient origin, probably during the early evolution of metazoans (46). The cyclic GMP–AMP synthase (cGAS) and Stimulator of IFN Genes (STING) pathway is an innate immune pathway responsible for surveillance of cytosolic dsDNA. While cGAS and STING signaling originated in bacteria (47, 48), detecting dsDNA has been thought to be an innovation in vertebrates (47, 49). Therefore, different innate immune systems might have arisen asynchronously. Our findings illuminate how an innate immune pathway originated through tinkering with preexisting proteins deep in the evolution of vertebrates.
Materials and Methods
Identification of OAS-Related and RNase L-Related Proteins.
Initially, we performed a similarity search of OAS and RNase L homologs in 174 representative species that cover the major diversity of cellular organisms, including 21 bacteria, 15 archaea, 15 fungi, 13 protists, 4 Chromista species, 22 plants, and 84 animals (SI Appendix, Table S1). For the identification of OAS homologs, we used HMMER, BLASTP, or TBLASTN algorithms to search against the proteomes or genomes of cellular organisms with NTase domain (accession: PF01909) sequences as seeds or queries and an E-value cutoff of 10^−5 (50). For the identification of RNase L homologs, we used HMMER, BLASTP, or TBLASTN algorithms to search against the proteomes or genomes of cellular organisms with RNase domain (accession: PF06479) sequences as seeds or queries and an E-value cutoff of 10^−5 (50). Putative pseudogenes were not used in phylogenetic analyses. Significant hits were aligned using the L-INS-i strategy implemented in MAFFT and refined manually (51). Phylogenetic analyses of OAS homologs and RNase L homologs were performed based on NTase domains and full-length RNase L homologs, respectively, using a maximum likelihood method implemented in IQ-TREE (v2.0) (52). The best-fit substitution model was selected using ModelFinder (53). The support values were assessed using an ultrafast bootstrap method with 1,000 replicates (54). Phylogenetic trees were then annotated using iTOL (55). Domain architecture analyses were performed using PfamScan and CD-Search (56, 57). Structures of representative OAS-related and RNase proteins predicted by AlphaFold were retrieved from AlphaFold Protein Structure Database (58). Template modeling score (TM-score) between protein structure pairs was calculated using Pairwise Structure Alignment tool in RCSB Protein Data Bank (59).
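The search, alignment, and tree-building steps described above could be assembled along the following lines. This is a sketch under stated assumptions, not the authors' actual commands: all file names are hypothetical, the executables are assumed to be on the PATH under their usual names (`hmmsearch`, `blastp`, `tblastn`, `mafft`, `iqtree2`), and only the E-value cutoff, the L-INS-i strategy, ModelFinder, and the 1,000 ultrafast bootstrap replicates come from the text. The code only builds the command lines; it does not run them.

```python
import shlex
from typing import List

E_VALUE = "1e-5"  # cutoff stated in the Methods


def hmmsearch_cmd(hmm: str, proteome: str, tblout: str) -> List[str]:
    """HMMER profile search; -E sets the reporting E-value threshold."""
    return ["hmmsearch", "-E", E_VALUE, "--tblout", tblout, hmm, proteome]


def blast_cmd(program: str, query: str, db: str, out: str) -> List[str]:
    """BLASTP (protein db) or TBLASTN (nucleotide db) with tabular output."""
    if program not in {"blastp", "tblastn"}:
        raise ValueError("program must be 'blastp' or 'tblastn'")
    return [program, "-query", query, "-db", db,
            "-evalue", E_VALUE, "-outfmt", "6", "-out", out]


def mafft_linsi_cmd(fasta: str) -> List[str]:
    """L-INS-i corresponds to --localpair --maxiterate 1000.

    MAFFT writes the alignment to standard output, so the caller would
    redirect it to a file.
    """
    return ["mafft", "--localpair", "--maxiterate", "1000", fasta]


def iqtree_cmd(alignment: str) -> List[str]:
    """Maximum likelihood tree: ModelFinder (-m MFP) plus 1,000 ultrafast
    bootstrap replicates (-B 1000 in IQ-TREE 2)."""
    return ["iqtree2", "-s", alignment, "-m", "MFP", "-B", "1000"]


# Hypothetical file names, for illustration only.
commands = [
    hmmsearch_cmd("PF01909.hmm", "proteome.faa", "ntase_hits.tbl"),
    blast_cmd("tblastn", "rnase_domain.faa", "genome_db", "rnase_hits.tsv"),
    mafft_linsi_cmd("ntase_hits.faa"),
    iqtree_cmd("ntase_hits.aln"),
]
printable = [shlex.join(cmd) for cmd in commands]
```

Keeping each command as an argument list makes it straightforward to pass to `subprocess.run` later without shell quoting issues; manual refinement of the alignment, as described in the text, would sit between the MAFFT and IQ-TREE steps.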
To further investigate the evolution of OAS-related and RNase L-related proteins, we used a dataset of cellular organisms with extended sampling in metazoans, including 21 bacteria, 15 archaea, 15 fungi, 13 protists, 4 Chromista species, 22 plants, and 280 animals (SI Appendix, Table S2). We used BLASTP or TBLASTN algorithms to search against the proteomes or genomes of the dataset with hsOAS1 (accession: NP_001027581.1) and RNase domain (accession: PF06479) sequences as queries, respectively, and an E-value cutoff of 10^−5 (50). Putative pseudogenes were not used in phylogenetic analyses. Significant hits were aligned using the L-INS-i strategy implemented in MAFFT and refined manually (51). Phylogenetic analyses of OAS-related proteins and RNase L-related proteins were performed based on OAS units and full-length RNase L homologs, respectively, using a maximum likelihood method implemented in IQ-TREE (v2.0) (52) as described above.
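Selecting "significant hits" from a BLAST search can be sketched as a filter on tabular output. The sketch assumes the default 12-column BLAST+ tabular layout (`-outfmt 6`), in which the E-value is the eleventh column, and treats the cutoff as inclusive; the output format actually used in the study is not stated, and the subject identifiers in the toy table are invented.

```python
import csv
import io
from typing import List, Tuple

E_VALUE_COLUMN = 10  # zero-based index of "evalue" in default -outfmt 6


def significant_hits(tabular_text: str, max_evalue: float = 1e-5) -> List[Tuple[str, str, float]]:
    """Return (query, subject, evalue) for rows at or below the cutoff."""
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        evalue = float(row[E_VALUE_COLUMN])
        if evalue <= max_evalue:
            hits.append((row[0], row[1], evalue))
    return hits


# Toy rows with invented subject IDs and statistics (illustration only).
toy_table = (
    "NP_001027581.1\tsubject_A\t45.2\t350\t180\t5\t1\t346\t10\t355\t3e-40\t160\n"
    "NP_001027581.1\tsubject_B\t28.0\t90\t60\t2\t5\t92\t40\t128\t0.002\t35.1\n"
)
kept_hits = significant_hits(toy_table)
```

Only `subject_A` passes the 10^−5 threshold here; hits retained this way would then still need pseudogene screening before alignment, as the text describes.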
The alignments and trees used in this study are available at -https://data.mendeley.com/datasets/66d3vw824w/1.
Gene Conversion Detection.
Fused OAS units within an OASa gene were divided and analyzed independently. The GENECONV 1.81 program was used to detect gene conversion events among paralogous OAS genes or among the NTase-OAS1_C units from representative species, with a global P-value cutoff of less than 0.05 (60). We only considered converted regions that were at least 50 bp long.
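The two acceptance criteria stated above (global P-value below 0.05 and a converted region of at least 50 bp) can be expressed as a simple filter. The sketch below does not parse GENECONV output; the `ConversionFragment` record, its field names, and the example values are hypothetical, and fragment coordinates are assumed to be 1-based and inclusive.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConversionFragment:
    """Hypothetical record for one candidate gene conversion fragment."""
    seq_pair: str    # names of the two sequences involved
    global_p: float  # global P-value reported for the fragment
    begin: int       # first aligned position (assumed 1-based, inclusive)
    end: int         # last aligned position (assumed inclusive)

    @property
    def length(self) -> int:
        return self.end - self.begin + 1


def accept(fragments: List[ConversionFragment],
           max_p: float = 0.05, min_len: int = 50) -> List[ConversionFragment]:
    """Keep fragments with global P < max_p and length >= min_len bp."""
    return [f for f in fragments if f.global_p < max_p and f.length >= min_len]


# Hypothetical candidates (not results from the study).
candidates = [
    ConversionFragment("unitA;unitB", 0.01, 100, 180),  # 81 bp, significant
    ConversionFragment("unitA;unitC", 0.01, 100, 120),  # 21 bp, too short
    ConversionFragment("unitB;unitC", 0.20, 1, 300),    # not significant
]
print([f.seq_pair for f in accept(candidates)])
```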
Cloning of OAS and RNase L cDNA.
For functional analyses of OAS and RNase L proteins, we selected four representative jawed vertebrates that occupy crucial phylogenetic positions, including H. sapiens (mammal), R. bivittatum (amphibian), L. chalumnae (lobe-finned fish), and C. milii (cartilaginous fish). Total RNA was extracted (FastPure Cell/Tissue Total RNA Isolation Kit V2, Vazyme #RC112-01) from human lung carcinoma cells (A549) treated with 50 ng/mL IFNα2a (Novoprotein #C025) for 20 h and was used as the template in reverse transcription reactions (HiScript III 1st Strand cDNA Synthesis Kit, Vazyme #R312-01). cDNA of the hsOAS1, hsOAS3, hsOASL, and hsRNase L genes was amplified from total cDNA with the primers listed in SI Appendix, Table S5. The cDNA of these genes was then sequenced to verify its authenticity (SI Appendix, Table S6). cDNA sequences of hsOAS2 (with codon optimization), rbOASa, rbOASb, rbRNase L, lcOASa, lcRNase L, cmOASa, cmOASb, and cmRNase L were synthesized by GenScript (SI Appendix, Table S6). Key catalytic sites of RNase L were determined based on previous studies (9, 28). Site-directed mutagenesis was performed using the primers listed in SI Appendix, Table S5.
Construction of Yeast Transformants.
OAS and RNase L genes were seamlessly cloned (ClonExpress II One Step Cloning Kit, Vazyme #C112-01) into the GAL1/10 dual expression plasmid Gal_HF (SI Appendix, Table S6). Expression of OAS and RNase L genes was driven by a galactose-inducible promoter (GAL1/10) and detected using HA and Flag tags, respectively. Recombinant plasmids were transformed into Escherichia coli by heat shock, selected on LB solid medium containing 100 mg/mL kanamycin, and verified by PCR and sequencing. Plasmids were then transformed into S. cerevisiae (strain S288C) using lithium acetate and screened for G418 sulfate resistance. Yeast transformants were verified by PCR.
Spot Assay.
Yeast transformants were grown at 30 °C in selective YPD liquid medium overnight and washed with sterile water. Tenfold serial dilutions (OD600 = 3.0, 0.3, 0.03, 0.003, and 0.0003) were prepared, and 2 μL of each dilution was spotted on the surface of solid medium containing 2% glucose or 2% galactose, followed by imaging after 48 h and 60 h of growth.
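The dilution series above follows from repeatedly dividing the starting density by ten. A minimal sketch of that arithmetic is given here; the function name `serial_dilution` is hypothetical.

```python
from typing import List


def serial_dilution(start_od: float, factor: float, steps: int) -> List[float]:
    """Return the OD600 of each step in a serial dilution.

    Step i has the starting density divided by factor**i, so the first
    entry is the undiluted starting suspension.
    """
    return [start_od / factor ** i for i in range(steps)]


# Five tenfold steps starting from OD600 = 3.0, as in the spot assay.
series = serial_dilution(3.0, 10, 5)
print([f"{od:g}" for od in series])
```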
Western Blotting.
Strains were cultured in selective liquid medium overnight and diluted with fresh liquid medium until grown to log phase (OD600 ~ 0.6). Cells were collected by centrifugation and washed before induction with 2% galactose for 6 h. Yeast cells were then lysed by treatment with 1.8 M NaOH containing 10 mM phenylmethanesulfonyl fluoride and 1% β-mercaptoethanol for 5 min. An equal volume of 50% trichloroacetic acid was added to the lysate to precipitate proteins. Pellets were obtained by centrifugation, followed by washing in 0.5 M Tris-HCl and ultrapure water. Total protein was dissolved in Protein Loading Buffer (Trans #DL101-02) and boiled for 10 min. Protein samples were separated on a 10% SDS-PAGE gel and transferred to a 0.45-μm PVDF transfer membrane (Biosharp #BS-PVDF-45). Membranes were blocked with 5% skimmed milk powder in TBS-T (20 mM Tris-HCl pH 7.6, 150 mM NaCl, 0.1% Tween-20) for 2 h at room temperature and incubated for 2 h, in turn, with different antibodies diluted in TBS-T [1:2,000 for mouse anti-HA (Trans #HT301-01), 1:10,000 for mouse anti-Flag (Affinity #T0003), and 1:10,000 for mouse anti-GAPDH (Proteintech #60004-1-Ig)]. After washing, the membrane was incubated for 1 h at room temperature with an IRDye infrared secondary antibody diluted in TBS-T [1:2,000 for goat anti-Mouse IgG (Fdbio #FD0147)]. Signals were detected using the Odyssey Imaging System and Image Studio software.
RNA Integrity Analysis.
Transformed yeast strains were grown to log phase in selective YPD liquid medium, followed by induction with 2% galactose for 6 h. Total RNA was extracted from yeast cultures (5 to 10 mL) using a Yeast RNA Kit (OMEGA #R6870-01) and quantified. RNA integrity was assessed by 2% agarose gel electrophoresis using Clinx Science Instruments equipment.
Nanopore DNA Library Preparation and Sequencing.
Total RNA was extracted using the Yeast RNA Kit (OMEGA #R6870-01) from yeast cells transformed with an empty vector carrying the GAL1/10 promoter or with plasmids carrying inducible OASa and RNase L from L. chalumnae and C. milii. Two biological replicates were used. Poly(A) mRNA was captured using AMPure beads (Beckman #A63882). 1D library preparation was performed using the Rapid Barcoding Kit (ONT #SQK-PBK004) and PCR-cDNA Sequencing Kit (ONT #SQK-PCS109). The six prepared libraries were sequenced on the PromethION platform using ONT R9.4 flow cells with MinKNOW (v8.3.1) to generate fast5 files (61). All generated files were then basecalled in GUPPY (v5.0.16) to yield fastq files. Full-length reads were identified using Pychopper (v2.5.0), and reads with a quality score Q of <7 were removed using NanoFilt (v2.8.0) (62). Filtered full-length reads were then mapped onto the S. cerevisiae reference genome (GCF_000146045.2_R64) using Minimap2 (63) and converted to BAM format using SAMTools (v1.9) (64). Coverage data were visualized with IGV (v2.15.4) (65). Sequencing statistics were computed using NanoPack (62). SAMTools (v1.9) was used to extract reads mapped to the yeast genome based on gene position (64). The raw sequence data reported in this study have been deposited in the National Genomics Data Center (GSA: CRA009763) and are publicly accessible at https://ngdc.cncb.ac.cn/gsa.
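The read-quality filter (removal of reads with Q < 7) can be illustrated with a small sketch. It is not a reimplementation of NanoFilt. It assumes Phred+33 encoded quality strings and assumes that the read-level quality is obtained by averaging per-base error probabilities rather than the Phred scores themselves, a common convention for nanopore reads. The function names are hypothetical.

```python
import math


def mean_read_quality(quality_string: str, offset: int = 33) -> float:
    """Mean Phred quality of a read, averaged in error-probability space.

    Each character encodes Q = ord(ch) - offset, with error probability
    10 ** (-Q / 10). The probabilities are averaged and converted back.
    """
    if not quality_string:
        return 0.0
    error_probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    mean_error = sum(error_probs) / len(error_probs)
    return -10 * math.log10(mean_error)


def passes_filter(quality_string: str, min_q: float = 7.0) -> bool:
    """Keep a read only if its mean quality is at least min_q."""
    return mean_read_quality(quality_string) >= min_q


# '+' encodes Q10, '#' encodes Q2, '5' encodes Q20 under Phred+33.
print(passes_filter("+++++"))  # uniform Q10 read
print(passes_filter("#####"))  # uniform Q2 read
print(passes_filter("5#"))     # one Q20 base and one Q2 base
```

Under this averaging, the last read is rejected even though the arithmetic mean of its Phred scores is 11, because the low-quality base dominates the mean error probability.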
Correlation Analyses of OAS and RNase L Distribution.
To quantify the correlation of OAS and RNase L distribution in jawed vertebrates, we assigned the presence/absence of OAS and RNase L as traits (SI Appendix, Table S4). We used BayesTraits (v4.0.0) (66) to test the correlation between OASa and RNase L proteins and between OASb and RNase L proteins along the species phylogeny. Two models were assessed, namely the independent model in which traits evolved independently and the dependent model in which traits evolved dependently. The marginal likelihood of the two models was estimated using the stepping stone sampler with 250 stones and 5,000 iterations for each stone (67). Log Bayes factor (Log BF) values were calculated by comparing the marginal likelihoods of the two models: Log BF = 2 × (log marginal likelihood of the dependent model − log marginal likelihood of the independent model). Log BF values of >2 indicate positive evidence of trait correlation, and Log BF values of >5 indicate strong evidence of trait correlation.
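The Log BF calculation and its interpretation thresholds can be written out as a short sketch. The function names and the example log marginal likelihoods are hypothetical, and the label used for values of 2 or below is an assumption, since the text only defines the >2 and >5 categories.

```python
def log_bayes_factor(log_ml_dependent: float, log_ml_independent: float) -> float:
    """Log BF = 2 * (log marginal likelihood of dependent - independent model)."""
    return 2.0 * (log_ml_dependent - log_ml_independent)


def interpret(log_bf: float) -> str:
    """Map a Log BF value onto the evidence categories given in the text."""
    if log_bf > 5:
        return "strong evidence of trait correlation"
    if log_bf > 2:
        return "positive evidence of trait correlation"
    return "no positive evidence"  # label assumed; not defined in the text


# Hypothetical log marginal likelihoods (not values from the study).
log_bf = log_bayes_factor(-100.0, -104.0)
print(log_bf, interpret(log_bf))
```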
Selection Analyses of OASa and RNase L Genes.
We performed selection analyses of OASa and RNase L genes in a wide range of jawed vertebrate groups. OASa and RNase L genes from five orders of mammals (Carnivora, Artiodactyla, Chiroptera, Primates, and Rodentia), two infraclasses of birds (Palaeognathae and Neognathae), two orders of reptiles (Testudines and Squamata), and the class Chondrichthyes were analyzed independently. Fused OAS units within an OASa gene were divided and analyzed independently based on their classification in Fig. 1E. Coding sequences were aligned using the codon model implemented in MUSCLE, and ambiguous regions were trimmed manually (68). Phylogenetic trees were reconstructed using a maximum likelihood method implemented in IQ-TREE (54). The free-ratio model (model = 1) in PAML (v4.9) was used to detect branches subject to positive selection. Two pairs of site models (M7 versus M8) in PAML (v4.9) were used to detect sites subject to positive selection with a Bayes Empirical Bayes posterior probability of >0.95 (69).
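The site-level criterion (Bayes Empirical Bayes posterior probability above 0.95) can be sketched as a filter. The sketch also includes a likelihood ratio test for the M7 versus M8 comparison. The text does not describe such a test, so it is included only as an assumption about how nested site models are commonly compared, using two degrees of freedom. All function names and numbers are hypothetical.

```python
import math
from typing import Dict, List


def lrt_p_value_df2(lnl_null: float, lnl_alt: float) -> float:
    """P-value of a likelihood ratio test with 2 degrees of freedom.

    The statistic is 2 * (lnL_alt - lnL_null). For a chi-square
    distribution with 2 degrees of freedom, the survival function
    is exp(-x / 2), so no external library is needed.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return math.exp(-stat / 2.0)


def selected_sites(beb: Dict[int, float], threshold: float = 0.95) -> List[int]:
    """Return codon positions whose BEB posterior probability exceeds threshold."""
    return sorted(site for site, pp in beb.items() if pp > threshold)


# Hypothetical log-likelihoods for M7 (null) and M8 (alternative).
print(lrt_p_value_df2(-1000.0, -995.0))

# Hypothetical BEB posterior probabilities keyed by codon position.
print(selected_sites({12: 0.99, 40: 0.95, 77: 0.50}))
```

Because the threshold is strict, a site with a posterior probability of exactly 0.95 is not reported.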
Supplementary Material
Appendix 01 (PDF)
Acknowledgments
This work was supported by the National Natural Science Foundation of China (32270684 and 31922001).
Author contributions
G.-Z.H. designed research; L.C., Z.G., and W.W. performed research; L.C. and G.-Z.H. analyzed data; and L.C. and G.-Z.H. wrote the paper.
Competing interests
The authors declare no competing interest.
Footnotes
This article is a PNAS Direct Submission. N.C.E. is a guest editor invited by the Editorial Board.
Data, Materials, and Software Availability
Raw sequence data have been deposited in the National Genomics Data Center (GSA: CRA009763) (70).
References
- 1. Hornung V., Hartmann R., Ablasser A., Hopfner K. P., OAS proteins and cGAS: Unifying concepts in sensing and responding to cytosolic nucleic acids. Nat. Rev. Immunol. 14, 521–528 (2014).
- 2. Wu J., Chen Z. J., Innate immune sensing and signaling of cytosolic nucleic acids. Annu. Rev. Immunol. 32, 461–488 (2014).
- 3. Bartok E., Hartmann G., Immune sensing mechanisms that discriminate self from altered self and foreign nucleic acids. Immunity 53, 54–77 (2020).
- 4. Hovanessian A. G., Brown R. E., Kerr I. M., Synthesis of low molecular weight inhibitor of protein synthesis with enzyme from interferon-treated cells. Nature 268, 537–540 (1977).
- 5. Zilberstein A., Kimchi A., Schmidt A., Revel M., Isolation of two interferon-induced translational inhibitors: a protein kinase and an oligo-isoadenylate synthetase. Proc. Natl. Acad. Sci. U.S.A. 75, 4734–4738 (1978).
- 6. Kerr I. M., Brown R. E., pppA2’p5’A2’p5’A: An inhibitor of protein synthesis synthesized with an enzyme fraction from interferon-treated cells. Proc. Natl. Acad. Sci. U.S.A. 75, 256–260 (1978).
- 7. Zhou A., Hassel B. A., Silverman R. H., Expression cloning of 2–5A-dependent RNAase: A uniquely regulated mediator of interferon action. Cell 72, 753–765 (1993).
- 8. Silverman R. H., Viral encounters with 2’,5’-oligoadenylate synthetase and RNase L during the interferon antiviral response. J. Virol. 81, 12720–12729 (2007).
- 9. Huang H., et al., Dimeric structure of pseudokinase RNase L bound to 2–5A reveals a basis for interferon-induced antiviral activity. Mol. Cell 53, 221–234 (2014).
- 10. Donovan J., Rath S., Kolet-Mandrikov D., Korennykh A., Rapid RNase L-driven arrest of protein synthesis in the dsRNA response without degradation of translation machinery. RNA 23, 1660–1671 (2017).
- 11. Burke J. M., Moon S. L., Matheny T., Parker R., RNase L reprograms translation by widespread mRNA turnover escaped by antiviral mRNAs. Mol. Cell 75, 1203–1217.e5 (2019).
- 12. Rath S., et al., Concerted 2–5A-mediated mRNA decay and transcription reprogram protein synthesis in the dsRNA response. Mol. Cell 75, 1218–1228.e6 (2019).
- 13. Kuchta K., Knizewski L., Wyrwicz L. S., Rychlewski L., Ginalski K., Comprehensive classification of nucleotidyltransferase fold proteins: Identification of novel families and their representatives in human. Nucleic Acids Res. 37, 7701–7714 (2009).
- 14. Kumar S., Mitnik C., Valente G., Floyd-Smith G., Expansion and molecular evolution of the interferon-induced 2’-5’ oligoadenylate synthetase gene family. Mol. Biol. Evol. 17, 738–750 (2000).
- 15. Kjaer K. H., et al., Evolution of the 2’-5’-oligoadenylate synthetase family in eukaryotes and bacteria. J. Mol. Evol. 69, 612–624 (2009).
- 16. Saby E., Poulsen J. B., Justesen J., Kelve M., Uriz M. J., 2’-phosphodiesterase and 2’,5’-oligoadenylate synthetase activities in the lowest metazoans, sponge [porifera]. Biochimie 91, 1531–1534 (2009).
- 17. Kristiansen H., Gad H. H., Eskildsen-Larsen S., Despres P., Hartmann R., The oligoadenylate synthetase family: An ancient protein family with multiple antiviral activities. J. Interferon Cytokine Res. 31, 41–47 (2011).
- 18. Pari M., et al., Enzymatically active 2’,5’-oligoadenylate synthetases are widely distributed among Metazoa, including protostome lineage. Biochimie 97, 200–209 (2014).
- 19. Hu J., et al., Origin and development of oligoadenylate synthetase immune system. BMC Evol. Biol. 18, 201 (2018).
- 20. Rong E., et al., Molecular mechanisms for the adaptive switching between the OAS/RNase L and OASL/RIG-I pathways in birds and mammals. Front. Immunol. 9, 1398 (2018).
- 21. Schwartz S. L., Conn G. L., RNA regulation of the antiviral protein 2’-5’-oligoadenylate synthetase. Wiley Interdiscip. Rev. RNA 10, e1534 (2019).
- 22. Ibsen M. S., et al., Structural and functional analysis reveals that human OASL binds dsRNA to enhance RIG-I signaling. Nucleic Acids Res. 43, 5236–5248 (2015).
- 23. Hovanessian A. G., Justesen J., The human 2’-5’oligoadenylate synthetase family: Unique interferon-inducible enzymes catalyzing 2’-5’ instead of 3’-5’ phosphodiester bond formation. Biochimie 89, 779–788 (2007).
- 24. Burki F., Roger A. J., Brown M. W., Simpson A. G. B., The new tree of eukaryotes. Trends Ecol. Evol. 35, 43–55 (2020).
- 25. Hedges S. B., Marin J., Suleski M., Paymer M., Kumar S., Tree of life reveals clock-like speciation and diversification. Mol. Biol. Evol. 32, 835–845 (2015).
- 26. Friedman M., Fossils reveal the deep roots of jawed vertebrates. Nature 609, 897–898 (2022).
- 27. Han Y., Whitney G., Donovan J., Korennykh A., Innate immune messenger 2–5A tethers human RNase L into active high-order complexes. Cell Rep. 2, 902–913 (2012).
- 28. Han Y., et al., Structure of human RNase L reveals the basis for regulated RNA decay in the IFN response. Science 343, 1244–1248 (2014).
- 29. Tanaka N., et al., Structural basis for recognition of 2’,5’-linked oligoadenylates by human ribonuclease L. EMBO J. 23, 3929–3938 (2004).
- 30. Lee K. P., et al., Structure of the dual enzyme Ire1 reveals the basis for catalysis and regulation in nonconventional RNA splicing. Cell 132, 89–100 (2008).
- 31. Korennykh A. V., et al., The unfolded protein response signals through high-order assembly of Ire1. Nature 457, 687–693 (2009).
- 32. Wang C., Gong Z., Han G. Z., On the origins and evolution of phytohormone signaling and biosynthesis in plants. Mol. Plant 16, 511–513 (2023), 10.1016/j.molp.2023.02.002.
- 33. Carey C. M., et al., Recurrent loss-of-function mutations reveal costs to OAS1 antiviral activity in primates. Cell Host Microbe 25, 336–343.e4 (2019).
- 34. Dever T. E., et al., Mammalian eukaryotic initiation factor 2 alpha kinases functionally substitute for GCN2 protein kinase in the GCN4 translational control mechanism of yeast. Proc. Natl. Acad. Sci. U.S.A. 90, 4616–4620 (1993).
- 35. Elde N. C., Child S. J., Geballe A. P., Malik H. S., Protein kinase R reveals an evolutionary model for defeating viral mimicry. Nature 457, 485–489 (2009).
- 36. Li Y., Shen X. X., Evans B., Dunn C. W., Rokas A., Rooting the animal tree of life. Mol. Biol. Evol. 38, 4322–4333 (2021).
- 37. Gong Z., et al., The origin and evolution of a plant resistosome. Plant Cell 34, 1600–1620 (2022).
- 38. Nei M., Gu X., Sitnikova T., Evolution by the birth-and-death process in multigene families of the vertebrate immune system. Proc. Natl. Acad. Sci. U.S.A. 94, 7799–7806 (1997).
- 39. Chebath J., Benech P., Hovanessian A., Galabru J., Revel M., Four different forms of interferon-induced 2’,5’-oligo(A) synthetase identified by immunoblotting in human cells. J. Biol. Chem. 262, 3852–3857 (1987).
- 40. Redmond A. K., Zou J., Secombes C. J., Macqueen D. J., Dooley H., Discovery of all three types in cartilaginous fishes enables phylogenetic resolution of the origins and evolution of interferons. Front. Immunol. 10, 1558 (2019).
- 41. Zou J., et al., Salmonids have an extraordinary complex type I IFN system: Characterization of the IFN locus in rainbow trout oncorhynchus mykiss reveals two novel IFN subgroups. J. Immunol. 193, 2273–2286 (2014).
- 42. Levraud J. P., et al., Identification of the zebrafish IFN receptor: Implications for the origin of the vertebrate IFN system. J. Immunol. 178, 4385–4394 (2007).
- 43. Litman G. W., Rast J. P., Fugmann S. D., The origins of vertebrate adaptive immunity. Nat. Rev. Immunol. 10, 543–553 (2010).
- 44. Huang S., et al., Discovery of an active RAG transposon illuminates the origins of V(D)J recombination. Cell 166, 102–114 (2016).
- 45. Flajnik M. F., A cold-blooded view of adaptive immunity. Nat. Rev. Immunol. 18, 438–453 (2018).
- 46. Brennan J. J., Gilmore T. D., Evolutionary origins of toll-like receptor signaling. Mol. Biol. Evol. 35, 1576–1587 (2018).
- 47. Kranzusch P. J., et al., Ancient origin of cGAS-STING reveals mechanism of universal 2’,3’ cGAMP signaling. Mol. Cell 59, 891–903 (2015).
- 48. Morehouse B. R., et al., STING cyclic dinucleotide sensing originated in bacteria. Nature 586, 429–433 (2020).
- 49. Margolis S. R., Wilson S. C., Vance R. E., Evolutionary origins of cGAS-STING signaling. Trends Immunol. 38, 733–743 (2017).
- 50. Eddy S. R., Accelerated profile HMM searches. PLoS Comput. Biol. 7, e1002195 (2011).
- 51. Katoh K., Standley D. M., MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
- 52. Nguyen L. T., Schmidt H. A., von Haeseler A., Minh B. Q., IQ-TREE: A fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).
- 53. Kalyaanamoorthy S., Minh B. Q., Wong T. K. F., von Haeseler A., Jermiin L. S., ModelFinder: Fast model selection for accurate phylogenetic estimates. Nat. Methods 14, 587–589 (2017).
- 54. Hoang D. T., Chernomor O., von Haeseler A., Minh B. Q., Vinh L. S., UFBoot2: Improving the ultrafast bootstrap approximation. Mol. Biol. Evol. 35, 518–522 (2018).
- 55. Letunic I., Bork P., Interactive tree of life (iTOL) v4: Recent updates and new developments. Nucleic Acids Res. 47, W256–W259 (2019).
- 56. El-Gebali S., et al., The Pfam protein families database in 2019. Nucleic Acids Res. 47, D427–D432 (2019).
- 57. Lu S., et al., CDD/SPARCLE: The conserved domain database in 2020. Nucleic Acids Res. 48, D265–D268 (2020).
- 58. Varadi M., et al., AlphaFold protein structure database: Massively expanding the structural coverage of protein-sequence space with high-accuracy models. Nucleic Acids Res. 50, D439–D444 (2022).
- 59. Berman H. M., et al., The protein data bank. Nucleic Acids Res. 28, 235–242 (2000).
- 60. Sawyer S., Statistical tests for detecting gene conversion. Mol. Biol. Evol. 6, 526–538 (1989).
- 61. Jain M., Olsen H. E., Paten B., Akeson M., The oxford nanopore MinION: Delivery of nanopore sequencing to the genomics community. Genome Biol. 17, 239 (2016).
- 62. De Coster W., D’Hert S., Schultz D. T., Cruts M., Van Broeckhoven C., NanoPack: Visualizing and processing long-read sequencing data. Bioinformatics 34, 2666–2669 (2018).
- 63. Li H., Minimap2: Pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
- 64. Li H., et al., The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
- 65. Thorvaldsdottir H., Robinson J. T., Mesirov J. P., Integrative genomics viewer (IGV): High-performance genomics data visualization and exploration. Brief Bioinform. 14, 178–192 (2013).
- 66. Pagel M., Meade A., Bayesian analysis of correlated evolution of discrete characters by reversible-jump Markov chain Monte Carlo. Am. Nat. 167, 808–825 (2006).
- 67. Xie W., Lewis P. O., Fan Y., Kuo L., Chen M. H., Improving marginal likelihood estimation for Bayesian phylogenetic model selection. Syst. Biol. 60, 150–160 (2011).
- 68. Kumar S., Stecher G., Tamura K., MEGA7: Molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol. Biol. Evol. 33, 1870–1874 (2016).
- 69. Yang Z., PAML 4: Phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007).
- 70. Chu L., Gong Z., Wang W., Han G.-Z., GSA: CRA009763. National Genomics Data Center. https://ngdc.cncb.ac.cn/gsa/browse/CRA009763. Deposited 9 February 2023.