Abstract
Siphonophores are complex colonial animals, consisting of asexually produced bodies (zooids) that are functionally specialized for specific tasks, including feeding, swimming, and sexual reproduction. Though this extreme functional specialization has captivated biologists for generations, its genomic underpinnings remain unknown. We use RNA-seq to investigate gene expression patterns in five zooids and one specialized tissue across seven siphonophore species. Analyses of gene expression across species present several challenges, including identification of comparable expression changes on gene trees with complex histories of speciation, duplication, and loss. We examine gene expression within species, conduct classical analyses examining expression patterns between species, and introduce species branch filtering, which allows us to examine the evolution of expression across species in a phylogenetic framework. Within and across species, we identified hundreds of zooid-specific and species-specific genes, as well as a number of putative transcription factors showing differential expression in particular zooids and developmental stages. We found that gene expression patterns tended to be largely consistent in zooids with the same function across species, but also some large lineage-specific shifts in gene expression. Our findings show that patterns of gene expression have the potential to define zooids in colonial organisms. Traditional analyses of the evolution of gene expression focus on the tips of gene phylogenies, identifying large-scale expression patterns that are zooid or species variable. The new explicit phylogenetic approach we propose here focuses on branches (not tips) offering a deeper evolutionary perspective into specific changes in gene expression within zooids along all branches of the gene (and species) trees.
Keywords: Siphonophora, Cnidaria, expression evolution, functional specialization
Introduction
Colonial animals, consisting of genetically identical bodies that are physically and physiologically connected, can be found across the metazoan tree of life (Hiebert et al. 2021). Functional specialization of such bodies has evolved multiple times within colonial animals, with siphonophores in particular showing the greatest diversity of functionally specialized bodies (Beklemishev 1969). Siphonophores are highly complex, colonial “superorganisms” consisting of asexually produced bodies (termed zooids) that are homologous to solitary free-living polyps and medusae (the typical body forms in Cnidaria), but that share a common gastrovascular cavity (Mackie 1963, 1986; Totton 1965; Mackie et al. 1987; Dunn and Wagner 2006). The extreme specialization of siphonophore zooids has been of central interest to zoologists since the 19th century, in part because these zooids are highly interdependent (Mackie 1963; Beklemishev 1969). Siphonophore zooids have been likened to animal organs, with each functionally specialized zooid performing distinct roles within the colony (fig. 1) (Mackie 1963). Although this analogy works well for understanding the function of zooids within the colony as a whole, it falls short in explaining the evolutionary origin and development of these biological units: functionally specialized zooids are evolutionarily homologous to free-living organisms, they are multicellular, possess distinct zooid-specific cell types, and show regional subfunctionalization (Mackie 1960; Totton 1965; Church et al. 2015). Although the developmental mechanisms generating zooids are very different in different clades of colonial animals (Carré 1967, 1969; Carré and Carré 1991; Dunn and Wagner 2006; Siebert et al. 2015), the evolutionary processes acting on zooids may be similar to those acting on other modular biological units such as cell types, tissues, and organs (Hiebert et al. 2021).
The cellular and molecular processes underlying the patterning and molecular function of functionally specialized cnidarian zooids remain an open biological question. In recent years, there has been a focus in particular on differential gene expression (DGE) patterns found in different functionally specialized zooids (Siebert et al. 2011; Plachetzki et al. 2014; Sanders et al. 2014, 2015).
Efforts to investigate the functional specialization of siphonophores have been limited, in part because there have been few detailed investigations of zooid structure. In the last half century, the microanatomy of siphonophore zooids and tissues has been investigated in only a handful of siphonophore species (Mackie 1960; Carré 1969; Bardi and Marques 2007; Church et al. 2015). This leaves many unknowns about how zooid structure and function differ across zooid types and species. Recent in situ gene expression analyses in siphonophores have described where a small number of preselected genes are expressed at high spatial resolution (Siebert et al. 2011, 2015; Church et al. 2015), but these methods are limited since they require a large number of specimens per gene and siphonophores are relatively difficult to collect. RNA-seq analyses of hand-dissected specimens (Siebert et al. 2011; Sanders et al. 2014, 2015; Macrander et al. 2015), in contrast, can describe the expression of a very large number of genes at low spatial resolution. The fact that so many data are obtained from each specimen is particularly advantageous in difficult-to-collect organisms like siphonophores. An earlier RNA-seq study of gene expression in two zooid types in a single siphonophore species showed the potential of this method to better understand differences between zooids (Siebert et al. 2011).
In this study, we use RNA-seq data to investigate the functional specialization and evolution of zooids across siphonophore species. We address this question at three comparative levels, namely within species, between species, and across species incorporating an explicit phylogenetic approach. To carry out these analyses, we also address three key challenges to study the evolution of gene expression in a comparative framework (Dunn, Luo, et al. 2013). First, we introduce a new metric to normalize gene expression which is valid for comparison across species. A common metric used to quantify gene expression is transcripts per million (TPM) (Li and Dewey 2011; Wagner et al. 2012). TPM is a relative measure of expression which depends on the number of genes present in a reference. Hence, TPM values are a valid metric when comparing libraries within a single species, but are not directly comparable across species when the references are incomplete, or genes have been gained and lost in the course of evolution. To address this issue, we introduce transcripts per million 10K (TPM10K), a metric which normalizes TPM to account for different sequencing depths among species (see Materials and Methods for details). We use TPM10K in our between species and phylogenetic-based analyses.
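The dependence of TPM on the reference can be illustrated with a minimal sketch. The `tpm` function follows the standard definition (length-normalized counts scaled to sum to one million). The `rescale_by_reference_size` function is only an assumed, illustrative form of a reference-size correction; it is not taken from this section, and the actual TPM10K definition is the one given in Materials and Methods. All input values are toy data.

```python
def tpm(counts, lengths):
    """Transcripts per million: length-normalize counts, then scale to sum to 1e6."""
    rates = [c / l for c, l in zip(counts, lengths)]
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


def rescale_by_reference_size(tpm_values, scale=1e4):
    """Hypothetical correction: multiply each TPM by (genes in reference / scale).

    Assumed form for illustration only; see Materials and Methods for TPM10K.
    """
    n_genes = len(tpm_values)
    return [t * n_genes / scale for t in tpm_values]


# Two toy references in which every gene is equally expressed.
small_ref = tpm(counts=[100] * 4, lengths=[100] * 4)  # 4 genes in the reference
large_ref = tpm(counts=[100] * 8, lengths=[100] * 8)  # 8 genes in the reference

# Per-gene TPM is halved when the reference doubles in size,
# even though per-gene expression is identical.
print(small_ref[0], large_ref[0])

# After the hypothetical rescaling, the per-gene values agree.
print(rescale_by_reference_size(small_ref)[0],
      rescale_by_reference_size(large_ref)[0])
```

The sketch shows only why raw TPM values from references of different sizes are not directly comparable; it makes no claim about how the published metric handles incomplete references.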
Second, we further account for gene and species effects to compare gene expression across genes and species. Normalized read counts produced by RNA-seq experiments are proportional to gene expression, but they are also impacted by factors that can vary across genes and species such as gene sequence and length. Therefore, to estimate and compare expected gene counts it is necessary to model such factors with unknown species- and gene-specific counting-efficiency coefficients (see Dunn, Luo, et al. [2013]). A direct comparison of expected gene counts among species, however, can be misleading if differences in counts are simply due to differences in counting efficiency and not due to differences in expression. To address this issue, we use ratios of expected counts. Using ratios of counts, we are able to eliminate species- and gene-specific counting efficiency within species, as the unknown counting efficiency factor is in both the numerator and the denominator, and is thus removed prior to comparisons among species (Dunn, Luo, et al. 2013). We use ratios of expected counts in our between species and phylogenetic-based analyses.
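The cancellation of counting efficiency in ratios can be sketched as follows. We assume, for illustration only, a simple multiplicative model in which the observed count equals true expression times a species- and gene-specific efficiency; the species names, efficiency values, and expression values below are hypothetical.

```python
def observed_count(expression, efficiency):
    """Assumed model: observed count = true expression x counting efficiency."""
    return expression * efficiency


def zooid_ratio(counts, numerator, denominator):
    """Ratio of counts for one gene between two zooids of the same species."""
    return counts[numerator] / counts[denominator]


# Same true expression of one gene in both species (toy values).
true_expression = {"nectophore": 40.0, "gastrozooid": 10.0}

# Hypothetical counting efficiencies for this gene in each species.
efficiency = {"species_1": 0.5, "species_2": 2.0}

counts = {}
ratios = {}
for species, k in efficiency.items():
    counts[species] = {z: observed_count(e, k) for z, e in true_expression.items()}
    ratios[species] = zooid_ratio(counts[species], "nectophore", "gastrozooid")

# Raw counts differ fourfold between species, purely because of efficiency...
print(counts["species_1"]["nectophore"], counts["species_2"]["nectophore"])
# ...whereas the within-species ratios are identical.
print(ratios)
```

Because the efficiency term appears in both numerator and denominator, it drops out before any comparison among species is made.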
Third, we use a novel phylogenetic approach to investigate the evolution of gene expression on gene trees with complex evolutionary histories. Most evolutionary studies of gene expression focus exclusively on comparing expression values of strict orthologs (i.e., gene lineages related only by speciation events) which are shared across all species. Analyses of strict orthologs are limited to a subset of genes that have no evidence of duplication, and we call this approach species tree filtering (STF) (Brawand et al. 2011; Yang and Wang 2013; Levin et al. 2016; Cardoso-Moreira et al. 2019). Because the history of genes is characterized by more complex scenarios involving gene duplication and speciation, we introduce a new method that takes advantage of this rich history to examine the evolution of gene expression across more genes and species. We call this method species branch filtering (SBF) (fig. 2), as we map expression data to gene phylogenies and identify equivalent branches in the species tree which are descended from speciation events in order to make valid comparisons across species.
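The core idea of SBF can be sketched on a toy gene tree. The sketch assumes that each internal node has already been annotated as a speciation or duplication event, keeps only branches whose parent node is a speciation event, and groups them by the set of species below the branch as a stand-in for the corresponding species-tree branch. The tree, the node names, and the grouping rule are hypothetical simplifications and not the published implementation.

```python
from collections import defaultdict

# Toy gene tree: internal node -> (event, children). Tips are "species|gene".
GENE_TREE = {
    "root": ("speciation", ["n1", "C|g1"]),
    "n1": ("duplication", ["n2", "n3"]),
    "n2": ("speciation", ["A|g1", "B|g1"]),
    "n3": ("speciation", ["A|g2", "B|g2"]),
}


def species_below(tree, node):
    """Set of species represented at the tips below a node."""
    if node not in tree:  # tip
        return frozenset([node.split("|")[0]])
    _event, children = tree[node]
    return frozenset().union(*(species_below(tree, c) for c in children))


def speciation_branches(tree, root):
    """Branches (parent, child) whose parent node is a speciation event."""
    kept, stack = [], [root]
    while stack:
        node = stack.pop()
        if node not in tree:
            continue
        event, children = tree[node]
        for child in children:
            if event == "speciation":
                kept.append((node, child))
            stack.append(child)
    return kept


def group_by_species_branch(tree, branches):
    """Group gene-tree branches by the species set they lead to."""
    groups = defaultdict(list)
    for parent, child in branches:
        groups[species_below(tree, child)].append((parent, child))
    return dict(groups)


branches = speciation_branches(GENE_TREE, "root")
groups = group_by_species_branch(GENE_TREE, branches)
for species_set, members in groups.items():
    print(sorted(species_set), members)
```

In this toy tree, the two branches descending from the duplication node are excluded, while both paralogous lineages still contribute a branch leading to species A and a branch leading to species B. A strict-ortholog filter would discard the whole tree.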
Using STF and SBF, combined with classical DGE analysis, we conducted DGE analyses among siphonophore zooids, and one specialized tissue (the pneumatophore, a gas-filled float) (fig. 1) in seven siphonophore species, to address three questions. First, we analyzed differences in gene expression within species to identify patterns that define particular zooid types, and gain insight into the genes that may be playing a role in the structure, maintenance, and functioning of zooids. We also examined whether gene expression patterns could distinguish novel, distinct species-specific zooid types within particular species. Second, we compared gene expression between species, using STF followed by linear models to identify zooid- and species-variable genes. And third, we used phylogenetic comparative methods and SBF to identify putative clade/lineage or zooid-specific expression patterns.
Results
Within Species Analyses: Which Genes Are Specific to Particular Zooid Types?
We sequenced mRNA from microdissected zooids from seven different siphonophore species (at least two replicate colonies each) and mapped these short-read libraries to previously published transcriptomes (Munro et al. 2018). We collected RNA-seq data from, where possible, five different zooids and one specialized tissue, the pneumatophore, as well as unique zooids specific to Agalma elegans (B palpons), Physalia physalis (tentacular palpon), and Bargmannia elongata (yellow and white gastrozooids), and where possible developing and mature zooids (supplementary table S1, Supplementary Material online). Due to species availability, we were not able to sample more than one replicate for some zooids—single replicates were excluded from downstream DGE analyses (see supplementary fig. S1, Supplementary Material online).
The first component of variation that we assessed was among technical replicates. The technical replicates consist of resequenced developing nectophores and developing gastrozooids from the same Frillagalma vityazi individual that were spiked in across multiple lanes and runs. Lane and run effects have been proposed as major sources of technical variability in RNA-seq data that may confound observations of biological variation (Auer and Doerge 2010; McIntyre et al. 2011). The differences between technical replicates (supplementary fig. S2, Supplementary Material online) were much smaller (0.39% variance of expression distance) than the differences between zooids (98.32% variance of expression distance). Differences among technical replicates of the same zooid were correlated with library size and run, not with lane.
The second component of variation we considered was biological variation among sampled colonies (supplementary figs. S3–S9, Supplementary Material online). Specimens were collected in the wild at different depths and over different time periods, but despite these environmental factors there was remarkably little variation among sampled colonies. Some samples did show greater variation across replicates, such as a developing palpon replicate in Nanomia bijuga (supplementary fig. S8, Supplementary Material online), and a developing gastrozooid and male gonodendra in F. vityazi (supplementary fig. S7, Supplementary Material online).
The third component of variation we considered was among zooids/specialized tissues within species. This was the greatest component of variation (supplementary figs. S3–S9, Supplementary Material online). We identified genes that were significantly differentially expressed in particular zooids, and in the pneumatophore, for each of the sampled species, based on pairwise comparisons (supplementary file 1, Supplementary Material online). In each case, significantly differentially expressed refers to genes with higher transcript abundance relative to the other zooid. As it is possible for the same gene to be significantly differentially expressed in pairwise comparisons of several zooids, we also identified a subset of genes that were significantly differentially expressed in only one zooid but not in any other zooid within a given species (supplementary fig. S10 and supplementary file 2, Supplementary Material online).
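The step from pairwise comparisons to genes specific to a single zooid can be sketched with set operations. The input structure (a mapping from an ordered zooid pair to the genes with higher abundance in the first zooid) and the gene names are hypothetical. The rule applied here, which excludes any gene that is also higher in another zooid, is one simple reading of the criterion described above.

```python
from collections import defaultdict


def up_by_zooid(pairwise_up):
    """Union of genes with higher abundance in each zooid across all its comparisons."""
    up = defaultdict(set)
    for (zooid, _other), genes in pairwise_up.items():
        up[zooid] |= genes
    return dict(up)


def zooid_specific(up):
    """Genes higher in exactly one zooid and in no other zooid."""
    specific = {}
    for zooid, genes in up.items():
        others = set().union(*(g for z, g in up.items() if z != zooid))
        specific[zooid] = genes - others
    return specific


# Toy pairwise results: (zooid, compared against) -> genes higher in `zooid`.
pairwise_up = {
    ("gastrozooid", "nectophore"): {"g1", "g2"},
    ("gastrozooid", "palpon"): {"g1"},
    ("nectophore", "gastrozooid"): {"g3"},
    ("nectophore", "palpon"): {"g2", "g3"},
    ("palpon", "gastrozooid"): {"g4"},
    ("palpon", "nectophore"): {"g2", "g4"},
}

specific = zooid_specific(up_by_zooid(pairwise_up))
print(specific)
```

In this toy example, gene `g2` is higher in some comparison for every zooid and is therefore not assigned to any single zooid.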
For each species and zooid, we found GO term enrichment for biological processes consistent with the functional specialization of the zooid (supplementary file 3, Supplementary Material online). Gastrozooids are solely responsible for feeding and digestion, for example, and we found these zooids to be enriched for genes involved in chitin, glutathione, and peptide catabolism, proteolysis, as well as metabolism of carbohydrates. Likewise, for male gonodendra, we found GO term enrichment for terms such as sperm flagellum, mitotic cell cycle process, and DNA replication. For female gonodendra, there was GO term enrichment for a number of biological processes including mitotic cell cycle process and DNA replication (in A. elegans), as well as a number of signaling pathways and developmental processes (in F. vityazi).
Within the four best sampled species, we also identified 92 higher abundance putative transcription factors (24 out of 71 in B. elongata, 43 out of 75 identified in F. vityazi, 50 out of 79 in A. elegans, and 34 out of 72 identified in N. bijuga). Many of these transcription factors have higher expression in several zooids regardless of developmental stage (both mature and developing zooids), and a subset have higher expression only in particular zooids and developmental stages (supplementary fig. S11, Supplementary Material online).
Novel Zooids within Species: Can Expression Patterns Distinguish Distinct Zooid Types?
In siphonophores, there are several instances of lineage-specific zooid diversification events. We investigated gene expression patterns between the novel zooid type and the hypothesized most closely related zooid type in three species. In B. elongata, there are two morphologically distinct gastrozooids, which we termed “white” and “yellow” gastrozooids (supplementary fig. S12A and B, Supplementary Material online). The “yellow” gastrozooid is larger and darker and occurs as the seventh to tenth gastrozooid on the stem (Dunn 2005). In the Portuguese man of war, P. physalis, the gastrozooid is unique compared with gastrozooids in other species: it has a mouth, but no tentacle, and the basigaster region is greatly reduced (Mackie 1960; Totton 1960). Meanwhile, the tentacle is associated with another zooid, the tentacular palpon (supplementary fig. S12C, Supplementary Material online) (Haeckel 1888; Totton 1960; Bardi and Marques 2007; Munro et al. 2019). In P. physalis, both the gastrozooid and the tentacular palpon are considered to be subfunctionalized from an ancestral gastrozooid type (Munro et al. 2019). Finally, in A. elegans, there are thought to be at least two different palpon types: gastric palpons that arise at the base of the peduncle of the gastrozooid, and a palpon called the B-palpon (supplementary fig. S12D, Supplementary Material online) (Dunn and Wagner 2006). The distinction between these two types of palpon is based on the location of these zooids: the gastrozooid is typically the last element of each repeating pattern of zooids along the stem (cormidium), but based on the budding sequence, Dunn and Wagner (2006) proposed that the enlarged B-palpon is the last element in A. elegans. Each of these cases represents a different type of novelty: in B. elongata, the distinction between zooids was made based on size and color but not on obvious differences in function; in P. physalis, the gastrozooids and tentacular palpons differ structurally and functionally; and finally, in A. elegans, gastric palpons and B palpons differ only in colony location, development, and possibly size.
In P. physalis, 976 genes showed significant differential expression (in all cases, higher transcript abundance relative to the other zooid) in the mature tentacular palpon, compared with 606 genes in the mature gastrozooid (supplementary file 4, Supplementary Material online). Of the genes that were significantly differentially expressed in the mature tentacular palpon relative to all other tissues, 670 were not shared with the gastrozooid. In the gastrozooid, 849 genes were significantly differentially expressed relative to other tissues and not shared with the tentacular palpon. A number of genes significantly differentially expressed in the tentacular palpon are uncharacterized; however, we identified 46 putative toxins in the tentacular palpon, including hemostasis-interfering and platelet aggregation-activating toxins, phospholipases, serine proteases, hydrolases, metalloendoproteases, calglandulin-like genes, and a neurotoxin (supplementary file 5, Supplementary Material online). By contrast, in the gastrozooid, we found 59 significantly differentially expressed putative toxins (supplementary file 6, Supplementary Material online), including pore-forming Conoporin-Cn1-like and Tereporin-Ca1-like toxins, multiple neurotoxins, hydrolases, serine proteases, and toxins likely involved in the promotion of blood coagulation and inhibition of platelet aggregation. Reflecting the role of the gastrozooid in digestion, we also found significant differential expression of digestive enzymes, including chymotrypsin-like genes.
Between the “white” mature gastrozooid and the “yellow” mature gastrozooid in B. elongata, few significantly differentially expressed genes were identified (eight genes were up in “white” mature gastrozooids, compared with 36 genes up in “yellow” gastrozooids) (supplementary file 4, Supplementary Material online). Among genes that were significantly differentially expressed in either “white” or “yellow” gastrozooids relative to all other tissues, 276 genes were unique to “yellow” gastrozooids and not found in “white” gastrozooids, and 886 genes were found in “white” gastrozooids and not found in “yellow” gastrozooids.
Finally, in A. elegans, very few significantly differentially expressed genes were identified between the B palpon and gastric palpons (one was significantly differentially expressed in B palpons and two were significantly differentially expressed in gastric palpons) (supplementary file 4, Supplementary Material online). All three genes have no significant BLAST hit and did not map to any gene trees. We also identified genes that were significantly differentially expressed in B palpons relative to all other zooids (gastric palpons were excluded); in total, 928 genes met this criterion. Most of these genes overlapped with those differentially expressed in the gastric palpons relative to all other zooids (746 genes).
Classical Analysis of Gene Expression between Species: How Different Is Zooid-Specific Expression between Species?
Most comparative studies of gene expression focus exclusively on strict 1:1 orthologs, requiring an assessment of orthology across all species. We call this type of analysis STF, as it limits analyses to a subset of gene trees with very specific evolutionary histories. Following this approach, we used OrthoFinder 2 (v2.4.0) (Emms and Kelly 2019) to identify strict orthologs. For all seven species, we identified 1,173 strict orthologs, of which 952 orthologs had expression data across all seven species. In order to increase the number of genes for analysis, we focused on the four best sampled species (A. elegans, B. elongata, F. vityazi, and N. bijuga). Using data from four taxa, we identified 4,009 strict orthologs, of which 3,174 orthologs had expression values in four zooids/tissues: gastrozooids (developing and mature), nectophores (developing), palpons (mature), and the pneumatophore. Using TPM10K values directly, we found that orthologs clustered largely by species rather than zooid/tissue (fig. 3A). Following Breschi et al. (2016) (see Materials and Methods), we used linear models to identify the proportion of variance that could be explained by species or zooid/tissue (fig. 3B), and found that of 3,174 total orthologs, 2,125 are species-variable genes (SVGs), and 168 are zooid/tissue-variable genes (TVGs).
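The variance partitioning behind this classification can be sketched for a single ortholog. The sketch assumes a balanced species-by-zooid matrix with one value per cell and an additive model (expression ~ species + zooid), for which the proportion of variance explained by each factor follows from the usual sums of squares. The thresholds used to call a gene species-variable or zooid/tissue-variable are not stated in this section and are deliberately omitted. All values are toy data.

```python
def variance_fractions(matrix):
    """Fractions of total variance explained by species (rows) and zooid (columns).

    matrix[i][j] is the expression of one ortholog in species i, zooid j.
    Assumes a balanced design with no missing values.
    """
    n_species, n_zooids = len(matrix), len(matrix[0])
    grand = sum(sum(row) for row in matrix) / (n_species * n_zooids)
    row_means = [sum(row) / n_zooids for row in matrix]
    col_means = [sum(matrix[i][j] for i in range(n_species)) / n_species
                 for j in range(n_zooids)]

    ss_total = sum((x - grand) ** 2 for row in matrix for x in row)
    ss_species = n_zooids * sum((m - grand) ** 2 for m in row_means)
    ss_zooid = n_species * sum((m - grand) ** 2 for m in col_means)

    if ss_total == 0:
        return 0.0, 0.0
    return ss_species / ss_total, ss_zooid / ss_total


# Toy ortholog whose expression differs only between species.
species_gene = [[1, 1, 1], [5, 5, 5], [9, 9, 9]]
# Toy ortholog whose expression differs only between zooids.
zooid_gene = [[1, 5, 9], [1, 5, 9], [1, 5, 9]]

print(variance_fractions(species_gene))
print(variance_fractions(zooid_gene))
```

In the first toy ortholog all variance is attributed to species, and in the second all variance is attributed to zooid. Real orthologs fall between these extremes, with a residual component not explained by either factor.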
The strong species-dominated clustering observed when comparing expression values directly may be due to differences in counting efficiency between species, especially as we used reference transcriptomes (Dunn, Luo, et al. 2013). To account for this, we used ratios of expression values, using TPM10K expression values of the most commonly sampled zooid (the mature gastrozooid) as the denominator (Dunn, Luo, et al. 2013). Using this approach, we found that orthologs clustered largely by zooid/tissue rather than species (fig. 3C). Using linear models, we found that, of 3,174 orthologs, 494 are SVGs and 349 are TVGs (fig. 3D). GO terms enriched among TVGs included embryonic morphogenesis, embryo development, cartilage development, and a number of metabolic and biosynthetic processes. Meanwhile, SVGs were enriched in GO terms such as cellular response to stress, DNA repair, and cell cycle processes. Lists of TVGs and SVGs can be found in supplementary files 7 and 8, Supplementary Material online, respectively.
Phylogenetic Analysis of Gene Expression with SBF: What Are the Evolutionary Changes in Zooid Expression along Branches in the Species Tree?
For our SBF analyses (see fig. 2; Materials and Methods), we used transcriptome and genome data from 41 cnidarian species to generate a total of 7,070 gene trees, of which 3,831 gene trees passed filtering criteria (see Materials and Methods for filtering approach). We used OrthoFinder 2 (Emms and Kelly 2019) to reconcile the species tree and the gene trees to annotate each gene tree node as either a speciation or duplication event. The number of genes represented in these gene trees is shown in supplementary table S1, Supplementary Material online. The internal nodes on these gene trees represent 20,088 speciation events and 9,082 duplication events. Expression values (TPM10K) were mapped to the tips of the gene trees for each zooid/tissue separately, and ancestral expression values were inferred at internal nodes in a maximum likelihood framework. We focused on a subset of zooids and tissues that are common across the sampled species: gastrozooids (developing and mature), nectophores (developing), palpons (mature), and the pneumatophore. The distribution of expression changes along species-equivalent branches is shown in supplementary figure S13, Supplementary Material online.
As with the STF and linear model analyses, we investigated the impact of counting efficiency on our comparative analyses. We used the TPM10K values at the tips and nodes to calculate expression ratios and calculated changes in expression ratios across a branch for pairs of tissues, using the mature gastrozooid values as the denominator. The distribution of expression ratio changes along species-equivalent branches (branches in the gene tree that correspond to equivalent branches in the species tree, and are descended from speciation events) is shown in supplementary figure S14, Supplementary Material online. We found that the variance of change across a given species-equivalent branch is considerably higher for raw TPM10K values than for ratios of expression (supplementary figs. S15 and S16, Supplementary Material online). We also identified the number of gene tree branches that had a negative (change ≤−2), positive (change ≥2), or neutral (change >−2 and <2) change in either TPM10K or TPM10K ratios across the branch (fig. 4). The mean number of species-equivalent branches considered in these analyses is shown in table 1; this value reflects the mean number of branches across each zooid or zooid ratio (due to incomplete sampling, some zooids may have fewer or more branches, as seen in fig. 4). For each species-equivalent branch, we found more negative and positive branches than neutral branches when we used TPM10K values, whereas for expression ratios, we found that the majority of branches show no change across the branch. This suggests that, as for the STF analyses, counting efficiency has a large impact on expression change and in turn leads to large gene/branch-specific signal. For expression ratios, these results indicate that the vast majority of branches show close to 0 (neutral) change across the branch, suggesting that for closely related genes, expression ratios tend not to differ.
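The classification of branches into negative, positive, and neutral can be sketched directly from the thresholds stated above. The sketch assumes that the change across a branch is the descendant value minus the ancestral value; the scale on which values are compared and the toy numbers are assumptions made for illustration.

```python
from collections import Counter


def branch_change(ancestor_value, descendant_value):
    """Assumed definition: change across a branch = descendant minus ancestor."""
    return descendant_value - ancestor_value


def classify_change(change, threshold=2.0):
    """Negative if change <= -2, positive if change >= 2, otherwise neutral."""
    if change <= -threshold:
        return "negative"
    if change >= threshold:
        return "positive"
    return "neutral"


# Toy branches: name -> (value at ancestral node, value at descendant node).
toy_branches = {
    "b1": (1.0, 3.5),
    "b2": (4.0, 1.0),
    "b3": (2.0, 2.4),
    "b4": (0.0, 2.0),
}

labels = {name: classify_change(branch_change(anc, desc))
          for name, (anc, desc) in toy_branches.items()}
tally = Counter(labels.values())
print(labels)
print(tally)
```

Applying the same function to raw values and to ratios, and tallying the labels per species-equivalent branch, gives counts of the kind summarized in fig. 4.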
Table 1.
Species-Equivalent Branch | Mean Number of Branches TPM10K | Mean Number of Branches Ratio |
---|---|---|
A | 1,331.00 | NA |
B | 1,215.50 | NA |
C | 1,509.50 | 1,499.00 |
D | 1,154.50 | 1,100.00 |
E | 982.00 | 982.00 |
F | 1,545.00 | 1,521.00 |
G | 1,338.50 | NA |
H | 1,947.00 | 1,928.00 |
I | 1,491.00 | 1,461.00 |
J | 1,790.60 | 1,765.75 |
K | 1,682.75 | 1,673.00 |
L | 1,591.75 | 1,582.00 |
Out of a total of 3,357 final gene trees considered in these analyses, 3,329 gene trees contained branches with neutral changes in expression ratios, 1,294 gene trees contained branches with positive changes in expression ratios, and 1,041 gene trees contained branches with negative changes in expression ratios. Expression ratio changes across species-equivalent branches from all gene trees are available in supplementary file 9, Supplementary Material online (note: in this file, “BLAST hit” is the most frequent BLAST hit for the gene tree; gene identity should still be confirmed for each gene, particularly in large gene trees).
The vast majority of gene trees contained species-equivalent branches with neutral changes in expression ratios, including a number of transcription factors and morphogenic signaling pathway genes. For example, in ratios of developing and mature gastrozooids, among the relevant species-equivalent branches in the Wnt gene phylogeny (Wnt gene identity based on Condamine et al. [2019]), changes were neutral across Wnt3 and Wnt2 branches, with the exception of a slight positive change across the branch leading to a Diphyes dispar Wnt2 paralog, suggesting a higher relative expression of this gene in the developing gastrozooid of D. dispar (fig. 5 and supplementary fig. S17, Supplementary Material online). For all zooid ratios, we found very consistent Wnt3 expression patterns with neutral or very small positive or negative changes in expression ratio across the branches. Indeed, across all cnidarians examined, Wnt3 has consistent localized expression at the oral pole, likely playing a role in axis formation and maintenance (Hobmayer et al. 2000; Guder et al. 2006; Momose et al. 2008; Nawrocki and Cartwright 2013; Hensel et al. 2014; Sanders and Cartwright 2015; Bagaeva et al. 2019). However, for Pneumatophore/Gastrozooid ratios, some large changes were seen across branches K and L, leading to an A. elegans and an N. bijuga Wnt2 paralog, respectively. Likewise, for Palpon/Gastrozooid ratios, a large change was seen across branch J, leading to an F. vityazi Wnt2 paralog. In Nematostella vectensis and Hydractinia echinata, Wnt2 is expressed in the middle of the polyp (Kusserow et al. 2005; Hensel et al. 2014). Without more detailed spatial expression patterns, it is difficult to know whether these differences in expression between species reflect species-specific differences in axial patterning and morphogenesis, such as an expansion or reduction of the expression domain.
We also looked at the 277 gene trees with species-relative branches that had very large changes (change > 5 or change < −5), representing putative lineage-specific expression patterns (see supplementary file 9, Supplementary Material online). In Frizzled5/8, Wnt4, and homeobox (putatively Hox-B8) gene trees, for example, we found very large positive changes in ratios of developing gastrozooids/mature gastrozooids across the same species-equivalent branches, J (leading to F. vityazi) and D (leading to Euphysonectae, the clade comprising A. elegans, N. bijuga, and F. vityazi). In Frizzled5/8, for developing nectophores/gastrozooids, we also saw a large positive change across branches L (leading to N. bijuga) and J (leading to F. vityazi) and a negative change across branch K (leading to A. elegans). In ratios of developing nectophores/gastrozooids, we also found very large positive changes in branch J for Wnt7b-like, Wnt4, (putatively) Hox-B8, and NKX1-2. Likewise, in pneumatophore/gastrozooid ratios, we found large positive changes in branches J and L for Frizzled5/8, and a negative change across branch K. Across J, we also saw positive changes in pneumatophore/gastrozooid ratios in Hox-B8 and NKX1-2. Finally, for branch K, we saw large positive changes in Wnt4 and Wnt7b-like. Many of these genes have been shown to have specific expression domains patterning cnidarian bodies (Kusserow et al. 2005; Ryan et al. 2007; Sinigaglia et al. 2013; Hensel et al. 2014), and further detailed expression analyses of these genes and others within the zooid are required to determine whether the patterns identified here reflect differences in expression domain between species and zooid.
The nature of these analyses makes it difficult to investigate GO term enrichment, as each gene tree contains multiple gene lineages often with diverse molecular function and that play diverse roles in biological processes—additionally, our approach focuses on species-relative branches rather than tips. Nevertheless, we assigned GO terms at a gene tree level, and investigated GO term enrichment among gene trees that contain branches with negative and positive changes. Among gene trees that contained species-relative branches with positive changes, we found an enrichment for a number of biological processes, including detection of stimulus, neuroblast proliferation, stem cell proliferation, and axon guidance (supplementary files 10 and 11, Supplementary Material online). Among gene trees that contained species-relative branches with negative changes, we found in particular an enrichment for a number of metabolic processes including amine, NAD, tryptophan, and indolalkylamine metabolic process (supplementary files 12 and 13, Supplementary Material online).
To assess the extent to which missing data may impact these analyses, we identified BUSCO scores for each of the reference transcriptomes (supplementary table S2, Supplementary Material online). BUSCO completeness scores ranged from 70.2% to 88.7% against the Metazoa BUSCO data set (Manni et al. 2021). Of the identified BUSCO genes, 92.65% are present in the initial expression data set. This suggests that our de novo transcriptome assemblies captured the vast majority of known metazoan genes, and that our analyses are therefore likely not sensitive to missing data. After filtering for the ortholog STF analyses, 38.65% of identified BUSCO genes were retained. By contrast, in the SBF analyses, 44.02% of identified BUSCO genes were retained. These findings show that our novel phylogenetic approach, SBF, may improve studies of the evolution of gene expression, as it retains more gene trees for downstream analyses.
The vast majority of zooid libraries mapped well to the reference transcriptomes, with the majority of libraries having between 85% and 95% of reads aligned to the reference, with an average of 85.84% of reads aligned (supplementary file 14, Supplementary Material online). Only eight libraries had poor alignment scores (<70%), notably all three developing nectophore libraries from N.bijuga (56.53–63.08%), male and female gonodendron libraries from a single F.vityazi replicate (68.18% and 68.89%, respectively), an Apolemia lanosa developing nectophore replicate (60.21%), and a developing bract and pneumatophore replicate in B.elongata (64.46% and 49.61%, respectively). The fact that all three N.bijuga developing nectophore replicates had lower mapping scores, points to a reduced presence of likely nectophore-specific genes in the N.bijuga reference transcriptome, which also had the lowest BUSCO score of all of the reference transcriptomes.
Discussion
Evolution of Gene Expression in Siphonophores
In this study, we used RNA-seq to investigate the functional specialization and evolution of zooids across siphonophore species. In our analysis of differential expression within species, we found a large number of differentially expressed genes across siphonophore zooids, reflecting the distinct anatomy and function of these zooids. In addition, we identified potential transcription factors that are significantly differentially expressed within particular zooids, with potential homologs found in several species, which are interesting candidates for future study in siphonophores or related cnidarian colonial groups. Using these within-species DGE analyses, we identified distinct expression patterns that may be used to define particular zooid types.
We also explored gene expression patterns in three zooid types that are unique to three species, P.physalis, A.elegans, and B.elongata. In siphonophores, different zooid types are typically defined based on morphological and functional differences/similarities, and on the location of the zooid within the colony (which is determined in most species by the asexual budding process in the growth zone, which gives rise to a repeating pattern of zooids along the stem) (Totton 1965; Mackie et al. 1987; Dunn 2005; Dunn and Wagner 2006). Based on both morphology and development, P.physalis tentacular palpons are considered to be a distinct and unique zooid type (Munro et al. 2019), and the DGE data indicate that there are clear morphological and functional differences between gastrozooids and tentacular palpons. We also identified several putative toxin genes with distinct expression profiles between these two zooids, which matches tissue-specific venom observed in other cnidarian species (Ames et al. 2016; Macrander et al. 2016; Klompen et al. 2021).
For “yellow” and “white” gastrozooids in B.elongata, the differential expression data point to few morphological or functional differences between these two zooids. Through pairwise differential expression patterns between “yellow” or “white” gastrozooids and other zooids within the colony, we were able to identify hundreds of genes that are significantly expressed in one zooid and not the other. This suggests that these two gastrozooids may be distinct zooid types, although they are functionally very similar to one another. Sequencing at greater depth, and also functional work within this species may help clarify the nature of these differences. By contrast, there is no strong evidence in our data that the B palpon and gastric palpons in A.elegans are sufficiently different to constitute a novel zooid type. These findings suggest that location within the colony is not necessarily sufficient to designate a novel zooid type.
With our STF and linear model analyses, we found that the vast majority of identified orthologs were neither exclusively species-variable nor exclusively tissue-variable, with only a subset of genes being either tissue- or species-variable. As has been observed for vertebrate organs, we find, based on GO terms, that the identified SVGs tend to play a role in basic cellular functions (so-called housekeeping genes), as compared with TVGs (Breschi et al. 2016).
What can we learn from the SBF analyses, using ratios of expression data? It is important to stress that changes in expression ratios cannot tell us about changes in expression magnitude. A TVG (identified via linear models in classical analyses) that is either highly or lowly expressed in, for example, nectophores relative to gastrozooids in all species, may show largely neutral changes across all species-equivalent branches. The sign (positive or negative) of change provides an indication of whether expression in the child node is relatively higher in the denominator or the numerator, relative to the parent node. We find overall that most expression ratios show very little change across species-equivalent gene tree branches, suggesting that gene expression patterns tend to be largely consistent among species. Positive or negative changes across branches, especially very large changes, represent putative lineage-specific shifts in expression. Whether these reflect distinct morphological or functional changes in the tissues of a particular species or clade requires further validation, which is a challenge in this system, where the species are difficult to collect and maintain in the lab.
Challenges and Solutions in the Analysis of the Evolution of Gene Expression
In this study, we used three different analyses to investigate gene expression patterns within homologous zooids across species; the three are complementary and enable different possibilities for biological discovery. Within-species analyses focus on expression among zooids and tissues within a single species: a very large number of genes identified in the transcriptome or genome can be considered, and the broader across-species gene tree need not necessarily be considered. This approach is especially useful for investigating expression patterns within novel zooids that do not have clear homologs in other species. The classical between-species analysis as implemented here, using STF followed by linear models, relies on phylogenetic assumptions yet is a nonphylogenetic tip-focused approach where a single gene is identified per species, and where the identified differences in values across tips are due to changes along all branches in the tree. With linear models, we can identify genes with expression values that are common to a particular zooid across all species, or with expression that is specific to a particular species. However, the vast majority of genes are neither zooid nor species variable, and show more complex patterns of expression. We proposed a third approach, called SBF, that gives access to changes in expression across gene phylogenies, which may differ significantly from the species phylogeny.
Where traditional approaches identify strict ortholog genes before conducting analyses, SBF uses information from all gene copies within a gene tree regardless of their evolutionary history of duplication or speciation. STF is a gene tree filtering approach, where entire trees or subtrees are discarded due to even a single duplication event, whereas SBF is a branch filtering approach. This means that SBF retains many informative branches from trees that would be removed by STF, and preserves many more evolutionary comparisons for analyses. By filtering our data by branches, we consider expression patterns at both the tips and internal nodes of gene phylogenies. We are thus able to identify shifts in expression leading to tips as well as shifts in expression leading to particular clades. Although we focus here on expression following speciation events, it is also possible to compare expression following duplication and speciation events.
Using ancestral trait reconstruction, we also overcame sampling issues at the tips of the gene and species phylogenies, as expression values of different treatments can be reconstructed at deep internal nodes, even where there may be inconsistent sampling at the tips. That is, even if expression values are missing for a given tissue and gene in a gene phylogeny, we are still able to examine expression patterns within this gene tree. By contrast, in the STF analyses, incomplete sampling in the expression matrix leads to the elimination of the ortholog from the analysis.
As with all methods that rely on mapping to reference transcriptomes rather than genomes (including STF analyses), this approach is limited by the quality of the reference transcriptomes. Ratios of expression helped significantly to improve issues of differences in count efficiency among species (Dunn, Luo, et al. 2013). However, reference transcriptome quality also has an impact on the quality of the gene trees used for expression mapping. Not all reference transcriptomes were sequenced to equal depth among species (supplementary table S2, Supplementary Material online), and this has important effects on the presence or absence of genes from particular species within the gene tree, as well as within the broader expression data set. This not only has an effect on the representation of expression values, but also impacts the power to investigate patterns of expression among branches within a gene tree. With genome sequencing becoming cheaper and more readily available, the widespread availability of reference genomes will help alleviate many of these issues. Reference genomes will also improve gene models, enabling the distinction of different alleles of the same gene from duplicated gene copies, which in turn will improve the quality of the gene trees. However, gene loss will nevertheless present a challenge to these analyses.
Conclusions
With the expansion of functional genomic tools, including RNA-seq and single cell-sequencing methods, there is considerable interest in looking not only at how genomic variation gives rise to phenotypic diversity in a single species or organism, but also at how functional genomic variation shapes phenotypic diversity across multiple closely and distantly related species to understand broader evolutionary patterns and processes (Brawand et al. 2011; Barbosa-Morais et al. 2012; Merkin et al. 2012; Perry et al. 2012; Yang and Wang 2013; Necsulea et al. 2014; Zhang et al. 2014; Sudmant et al. 2015; Breschi et al. 2016; Levin et al. 2016; Macrander et al. 2016; Clarke et al. 2017; Ma et al. 2018; Cardoso-Moreira et al. 2019; Darbellay and Necsulea 2020; Fukushima and Pollock 2020). Many of these analyses have the goal of identifying shared expression patterns among modular biological units (cell type, tissue, organ, zooid) across species, in order to identify commonalities in expression patterns among species. This is of particular interest for medically orientated fields interested in understanding the extent to which expression results can be extrapolated from one model organism to another. Another goal is to identify expression patterns in a particular biological unit that are unique to a particular species or even clade. For these questions, within-species analyses provide the greatest depth, in terms of number of genes investigated; however, comparisons between species on the basis of within-species DGE are limited and largely qualitative. Here, we use two approaches to investigate expression patterns in a quantitative manner across homologous tissues.
Classic between-species analyses, using STF in conjunction with linear models, enabled the investigation of a smaller number of strict orthologs that vary in a tissue- or species-specific manner; however, as this analysis focuses on the tips of gene trees, our ability to investigate lineage- or especially clade-specific patterns of expression in a given zooid is more limited. Meanwhile, phylogenetic analysis using SBF focuses on branches rather than tips, also with specific evolutionary histories (descended from speciation events), but it enables the identification of expression patterns that vary little among genes/species, as well as expression patterns that show strong lineage- or clade-specific patterns for a given zooid.
Materials and Methods
Collecting
Specimens were collected in the north-eastern Pacific Ocean in Monterey Bay and, in the case of P.physalis, the Gulf of Mexico. Specimens were collected by remotely operated vehicle or during blue-water SCUBA dives. Physalia specimens were collected by hand from the beach after being freshly washed on-shore by prevailing winds. Available physical vouchers have been deposited at the Peabody Museum of Natural History (Yale University), New Haven, CT (supplementary file 15, Supplementary Material online). Specimens were relaxed using 7.5% MgCl2 hexahydrate in Milli-Q water at a ratio of 1/3 MgCl2 and 2/3 seawater. Zooids were subsequently dissected from the colony and flash frozen in liquid nitrogen. Colonies were cooled to collection temperatures (e.g., 4 °C for deep sea species) while the dissections took place. Dissections took no longer than 15–20 min. In the case of large colonies, the stem was cut and only partial sections of the colony were placed under the microscope at a given time. Each replicate individual represents a genetically distinct colony from the same species. Replicate specimens were of an equivalent colony size, and zooid replicates were also equivalent sizes. Larger zooid types, such as gastrozooids, were sampled as a single zooid, but smaller zooids were pooled. Pooled zooids were of a comparable maturity and sampled from the same location in a single colony. Sampling data, including time, date, depth, and voucher ID, can be found in supplementary file 15, Supplementary Material online.
Sequencing
mRNA was extracted directly from tissue using Zymo Quick RNA MicroPrep (Zymo No. R1050), including a DNase step, and subsequently prepared for sequencing using the Illumina TruSeq Stranded Library Prep Kit (Illumina, No. RS-122-2101). 50 base-pair single-end libraries were all sequenced on the HiSeq 2500 sequencing platform. Three sequencing runs were conducted, representing three full flow cells. To avoid potential run/lane confounding effects, where possible, libraries of multiple zooids/tissue of a single individual in a species were barcoded and pooled in a single sequencing lane, and replicate lanes of zooids/tissue from different individuals of the same species were sequenced in separate runs. Additionally, two libraries were run as technical replicates across all runs and many lanes, for a total of 20 technical replicates.
Analysis
Differential Gene Expression
Short-read libraries were mapped to previously published transcriptomes (Munro et al. 2018) using Agalma v 2.0.0 (Dunn, Howison, et al. 2013; Guang et al. 2021), which uses a number of existing tools for transcript quantification, including RSEM (which uses Bowtie) (Langmead et al. 2009; Li and Dewey 2011). Using the agalmar package (https://github.com/caseywdunn/agalmar, last accessed February 7, 2022), we filtered out genes that were flagged as being rRNA, and selected only protein coding genes. We also only considered genes that had greater than 0 counts in at least two libraries. DGE analyses, including normalization, were conducted in R, using the DESeq2 package (Love et al. 2014). Libraries that were found to be outliers based on mean Cook’s distance were removed from the DESeq object and from downstream analyses and normalization. Testing for differential expression was conducted using the results() function in DESeq2. Genes were considered to be significantly differentially expressed if adjusted P values (Bonferroni correction) were less than 0.05. Differential expression analyses were only conducted on zooids/tissue with two or more replicates.
GO annotations were retrieved for each of the reference-translated transcriptomes (Munro et al. 2018) using the PANNZER2 web server (Törönen et al. 2018). The PANNZER2 format was modified to match the gene2GO format required for the package topGO (Alexa and Rahnenfuhrer 2020). Gene set enrichment analyses were carried out within species using the R package GOseq (Young et al. 2010), which takes gene length into account. Over- and underrepresented categories were calculated using the Wallenius approximation, and P values were adjusted using the Benjamini and Hochberg method. Categories with an adjusted P value below 0.05 were considered enriched. Gene set enrichment analyses were also conducted at the gene tree level, considering representative GO terms for particular gene trees. Representative GO terms were selected based on the frequency of occurrence among genes in the gene tree. As gene lengths vary among species and genes in the gene tree, the GOseq approach could not be used, and topGO was used to detect GO terms that are enriched based on Fisher’s exact test. This approach assumes that each gene tree has an equal probability of having genes shared among species that are detected as differentially expressed; however, results may be biased by a number of factors, including mean gene length among genes in the gene tree (Young et al. 2010). Putative toxin genes were identified using BlastP against the ToxProt data set (http://www.uniprot.org/program/Toxins, last accessed May 13, 2021). See Supplementary Material online for R package version numbers.
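The gene-tree-level enrichment test can be illustrated with a small sketch. It is not the topGO implementation: it assumes a one-sided Fisher's exact test for over-representation (the upper tail of the hypergeometric distribution), and the function name and all counts are hypothetical.

```python
from math import comb


def overrepresentation_p(n_trees, n_annotated, n_selected, n_overlap):
    """One-sided Fisher's exact test (hypergeometric upper tail).

    n_trees:     gene trees in the background set
    n_annotated: background trees carrying the GO term
    n_selected:  trees in the set of interest (e.g. with positive changes)
    n_overlap:   selected trees that carry the GO term
    """
    upper = min(n_annotated, n_selected)
    # Count the ways of drawing n_overlap or more annotated trees.
    favourable = sum(
        comb(n_annotated, k) * comb(n_trees - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return favourable / comb(n_trees, n_selected)


# Hypothetical counts: 40 of 1000 trees carry the term, 12 of 100 selected trees do.
p_value = overrepresentation_p(1000, 40, 100, 12)
```

Because each gene tree is counted once, the sketch shares the assumption stated above that every tree is equally likely to be selected.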
TPM10K
For all comparative expression analyses, expression values were normalized using a new method we call TPM10K. For gene $i$ of a given species, TPM is typically calculated as (Li et al. 2010):
$$\mathrm{TPM}_i = \frac{c_i / l_i}{\sum_{j=1}^{n} c_j / l_j} \times 10^{6}$$
where $c_i$ is the number of mapped reads to gene $i$, $l_i$ is the effective length of the gene, and $n$ is the number of genes in the reference. The intent of this measure is to make libraries comparable within a single species. The sum of TPM values within a library is $10^{6}$, and the mean is $10^{6}/n$. One implication of this is that TPM values are not directly comparable across species, since in practice, $n$ differs across species. If this were not accounted for, then it could appear, for example, that genes all have lower expression in a species with a more complete reference transcriptome and higher $n$. To account for differences in means among species, we use a new measure, TPM10K, that accounts for differences in $n$:
$$\mathrm{TPM10K}_i = \frac{c_i / l_i}{\sum_{j=1}^{n} c_j / l_j} \times \frac{n \times 10^{6}}{10^{4}}$$
where the sum of TPM10K values within a library is $n \times 10^{2}$ and the mean is $10^{2}$. By multiplying by $n$ we are able to account for different sequencing depths among species, and ensure a common mean. As $n \times 10^{6}$ is large, we divide by an arbitrary number (in this case $10^{4}$) in order to reduce the magnitude of the expression value.
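As a minimal sketch of this normalization, the code below assumes that TPM10K is TPM multiplied by the number of reference genes and divided by 10,000, as the name suggests. The function names, counts, and effective lengths are hypothetical. It shows that two references of different size end up with the same mean.

```python
def tpm(counts, eff_lengths):
    """Standard TPM: length-normalized rates rescaled to sum to one million."""
    rates = [c / l for c, l in zip(counts, eff_lengths)]
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


def tpm10k(counts, eff_lengths):
    """TPM rescaled by the number of reference genes n and divided by 10,000."""
    n = len(counts)
    return [value * n / 1e4 for value in tpm(counts, eff_lengths)]


# Two hypothetical species whose references contain 4 and 8 genes.
small_ref = tpm10k([10, 20, 30, 40], [1000, 1000, 2000, 2000])
large_ref = tpm10k([5, 10, 15, 20, 25, 30, 35, 40], [1000] * 8)

mean_small = sum(small_ref) / len(small_ref)
mean_large = sum(large_ref) / len(large_ref)
```

With plain TPM the two means would be 250,000 and 125,000; after rescaling, both are 100, so values can be compared across species with references of different completeness.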
Species Tree Filtering
For STF analyses, single copy orthologs were obtained using Orthofinder 2 (v2.4.0) (Emms and Kelly 2019). Linear models used in STF analyses were constructed using lm(), following methods and code developed by Breschi et al. (2016), https://github.com/abreschi/Rscripts/blob/master/anova.R (last accessed February 7, 2022). All SVGs and TVGs are genes for which both species and tissue explain greater than 75% of variance. Additionally, in SVGs, the proportion of variance explained by species is two times greater than that explained by tissues; whereas in TVGs, the proportion of variance explained by tissues is two times greater than that explained by species.
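The classification rule can be sketched as follows. This is not the Breschi et al. (2016) R code. It assumes a complete, balanced table with one log expression value per species and tissue, an additive model without interaction, and the reading that species and tissue together must explain more than 75% of variance. The function names and toy values are hypothetical.

```python
def variance_fractions(expr):
    """expr[s][t] is the expression of one ortholog in species s and tissue t.

    Returns the fractions of total variance explained by species and by tissue
    in a balanced two-way layout without interaction.
    """
    n_species, n_tissues = len(expr), len(expr[0])
    grand = sum(sum(row) for row in expr) / (n_species * n_tissues)
    species_means = [sum(row) / n_tissues for row in expr]
    tissue_means = [
        sum(expr[s][t] for s in range(n_species)) / n_species
        for t in range(n_tissues)
    ]
    ss_total = sum((x - grand) ** 2 for row in expr for x in row)
    if ss_total == 0:
        return 0.0, 0.0
    ss_species = n_tissues * sum((m - grand) ** 2 for m in species_means)
    ss_tissue = n_species * sum((m - grand) ** 2 for m in tissue_means)
    return ss_species / ss_total, ss_tissue / ss_total


def classify(expr, min_explained=0.75, fold=2.0):
    """Label an ortholog as species-variable, tissue-variable, or neither."""
    f_species, f_tissue = variance_fractions(expr)
    if f_species + f_tissue <= min_explained:
        return "neither"
    if f_species > fold * f_tissue:
        return "SVG"
    if f_tissue > fold * f_species:
        return "TVG"
    return "neither"


species_driven = [[1, 1, 1], [5, 5, 5], [9, 9, 9]]  # rows are species
tissue_driven = [[1, 5, 9], [1, 5, 9]]              # columns are tissues
```

Here `classify(species_driven)` gives "SVG" and `classify(tissue_driven)` gives "TVG"; a gene whose variance is mostly residual falls into "neither".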
Species Branch Filtering
The principles of SBF are as follows: first, we identify speciation and duplication events in a given gene tree, specifically labeling nodes that correspond to speciation events in the species tree (fig. 2, step 1). Next, we map gene expression values to the tips of the gene tree (fig. 2, step 2), and use phylogenetic methods to reconstruct expression values at the internal nodes (fig. 2, step 3). Expression values are mapped and reconstructed for each zooid/tissue separately, with the assumption that the structure is homologous across species. Then, we calculate scaled expression values across branches, by subtracting the expression value at the child node from the parent node, and dividing by branch length (branch lengths are calibrated against branches in the species tree) (fig. 2, step 4). Finally, we identify branches in the gene trees that correspond to equivalent branches in the species tree (hereafter species-equivalent branches) (fig. 2, step 5). Species-equivalent branches are gene tree branches with parent and child nodes that are both speciation events and that correspond to branches in the species tree. Each branch in the species tree is given a unique identifier (i.e., a letter) and the species-equivalent branches in gene trees are given the same identifier (fig. 2, step 5). This method enables the selection of branches within gene trees that are equivalent to branches within the species tree, and are thus comparable with one another across all gene trees. Unlike the STF approach, this approach considers equivalent branches that are descended from speciation events, but that have more complex evolutionary histories. For example, due to deeper gene duplication events, gene trees often contain multiple branches that correspond to the same branch in the species tree (fig. 2, step 5). Our method allows us to consider all of these branches.
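Steps 4 and 5 can be sketched on a toy gene tree. The data structure, node names, and expression values are hypothetical. Tips are treated as matching species-tree tips, and the change follows the wording above (the child value is subtracted from the parent value, then divided by branch length).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneNode:
    parent: Optional[str]        # name of the parent node, None for the root
    event: str                   # "speciation", "duplication", or "tip"
    species_node: Optional[str]  # matching species-tree node, None for duplications
    expr: float                  # observed (tip) or reconstructed (internal) value
    length: float                # calibrated length of the branch leading to this node


def species_equivalent_changes(nodes, species_branches):
    """Return (branch label, child name, scaled change) for species-equivalent branches."""
    changes = []
    for name, node in nodes.items():
        if node.parent is None:
            continue
        parent = nodes[node.parent]
        # Both ends of the branch must correspond to species-tree nodes.
        if parent.event != "speciation" or node.event == "duplication":
            continue
        label = species_branches.get((parent.species_node, node.species_node))
        if label is None:
            continue
        change = (parent.expr - node.expr) / node.length
        changes.append((label, name, change))
    return changes


# Species tree ((B, C)N1, A)R with one letter per branch.
species_branches = {("R", "A"): "a", ("R", "N1"): "b", ("N1", "B"): "c", ("N1", "C"): "d"}

# Gene tree with a duplication (d1) before the split of species B and C.
nodes = {
    "g0": GeneNode(None, "speciation", "R", 2.0, 0.0),
    "A1": GeneNode("g0", "tip", "A", 1.5, 1.0),
    "d1": GeneNode("g0", "duplication", None, 2.0, 0.2),
    "s1": GeneNode("d1", "speciation", "N1", 2.0, 0.3),
    "s2": GeneNode("d1", "speciation", "N1", 3.0, 0.3),
    "B1": GeneNode("s1", "tip", "B", 1.0, 0.5),
    "C1": GeneNode("s1", "tip", "C", 2.5, 0.5),
    "B2": GeneNode("s2", "tip", "B", 3.0, 0.5),
    "C2": GeneNode("s2", "tip", "C", 4.0, 0.5),
}
changes = species_equivalent_changes(nodes, species_branches)
```

In this toy tree, branches "c" and "d" are each recovered twice because of the duplication, while the two branches touching the duplication node are skipped. A tree filtering approach would have to discard or split this tree.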
Gene trees were generated using the transcriptomes and genomes from 41 species (Munro et al. 2018) using Agalma v 2.0.0 (Dunn, Howison, et al. 2013; Guang et al. 2021). Although we have expression data for seven species, we used a broader species sampling in order to infer more complete gene histories and more easily assign speciation and duplication events. Following the treeinform step of Agalma v 2.0.0, amino acid data were exported and supplied to Orthofinder 2 (v2.3.8) (Emms and Kelly 2019) for simultaneous co-estimation of gene trees with the published maximum likelihood species tree (Munro et al. 2018). Within Orthofinder 2, the selected multiple sequence alignment method was MAFFT (Katoh and Standley 2013) and maximum likelihood tree inference method was IQ-tree with the LG+F+R4 substitution model (Nguyen et al. 2015). BUSCO scores were generated from the reference transcriptomes using the metazoa_odb10 BUSCO data set with BUSCO v 5.2.2 (Manni et al. 2021).
Phylogenetic analyses were conducted in R using geiger, ape, phytools, Rphylopars, and hutan (Paradis et al. 2004; Harmon et al. 2008; Revell 2012; Goolsby et al. 2017). Phylogenetic trees were visualized in R using ggtree and treeio (Yu et al. 2017).
For each gene tree, we generated summary statistics of the branch lengths in the tree (maximum, minimum, SD) as well as the fraction of branches that have a default length value of 10⁻⁶ (that is indicative of branch length = 0). The majority of gene trees have branches with a maximum length around 1. Gene trees were filtered to exclude trees with branches whose maximum length is >2 and in which a fraction of more than 0.25 of branches had the default length value, representing 15.37% of gene trees. The goal of this is to exclude trees that include very long branches, or a large proportion of branches with very short branch lengths. Gene tree nodes were annotated as speciation events or duplication events, based on assignment by Orthofinder 2 (Emms and Kelly 2019). Speciation nodes were subsequently assigned a node ID equivalent to the species tree node, using species tip names from the gene tree to determine the most recent common ancestor in the species tree (Munro et al. 2018), using the phytools package. Due to the use of the species-overlap method by Orthofinder 2, some clades of single copy genes were assigned as speciation events, although the topology is inconsistent with the species tree. To avoid time calibration issues, due to descendant nodes being assigned the same species node ID, descendant speciation nodes with the same species node ID were marked as null, and gene trees with more than 0.3 null nodes per internal node, indicating widespread topological differences with the species tree, were excluded. Additionally, only trees with one or more speciation events were retained, as speciation events are used for time calibrations. The gene trees were then time calibrated to the species tree using chronos() in the ape package, so that the branch lengths were scaled to the same equivalent length across all gene trees (Paradis et al. 2004). Some gene trees could not be calibrated against the node constraints from the species tree and were discarded.
Additionally, we calculated the maximum node age for each gene tree, and excluded gene trees with roots that exceeded a maximum root depth of 5, which can be indicative of calibration problems (the root of the species tree is 1). Only a single tree was excluded as a result of a very deep root. Tips without expression values were then pruned out of the tree. Gene trees with fewer than three expression values at the tips were discarded, retaining only trees with three or more values. Pruned and unpruned calibrated gene tree files are available as Supplementary Material online.
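The branch-length and tip-count filters can be sketched as below. The thresholds come from the text, but the function names are hypothetical. The sketch also assumes that a tree is excluded if either branch-length criterion is met, which is one reading of the stated goal, and that SD is the sample standard deviation.

```python
from statistics import stdev

DEFAULT_LENGTH = 1e-6  # placeholder length indicating a zero-length branch


def branch_summary(lengths):
    """Summary statistics of the branch lengths of one gene tree."""
    n_default = sum(1 for x in lengths if abs(x - DEFAULT_LENGTH) < 1e-12)
    return {
        "max": max(lengths),
        "min": min(lengths),
        "sd": stdev(lengths),
        "default_fraction": n_default / len(lengths),
    }


def keep_gene_tree(lengths, n_tips_with_expression,
                   max_length=2.0, max_default_fraction=0.25, min_tips=3):
    """Apply the branch-length filters, then the minimum number of expression values."""
    summary = branch_summary(lengths)
    if summary["max"] > max_length:
        return False
    if summary["default_fraction"] > max_default_fraction:
        return False
    return n_tips_with_expression >= min_tips
```

For example, `keep_gene_tree([0.1, 0.4, 0.9, 1e-6], 4)` is retained, whereas a tree with a branch of length 2.5, or with only two expression values at the tips, is dropped.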
We then took the mean TPM10K value for each gene across replicates of the same zooid/tissue within a species and applied a log transformation. Using gene trees with expression values for each gene within a species at the tips, maximum likelihood ancestral trait values were generated at the nodes using the anc.recon() function in Rphylopars assuming a Brownian motion (BM) model of evolution (fig. 2, steps 2 and 3) (Goolsby et al. 2017). As not all zooids are present in all of the species, the trees were pruned down to the subset of tips with expression values for ancestral trait reconstructions. Node values were then added back to the unpruned tree with all of the reconstructed expression values. Change in expression was measured across a branch by taking the difference between a parent node and a child node, and then this difference is scaled by branch length (fig. 2, step 4).
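To give an intuition for reconstruction under BM, the sketch below estimates the value at the root of a small tree. It is not anc.recon() from Rphylopars: it swaps in Felsenstein's pruning recursion, which yields the maximum likelihood root value under BM but, for other internal nodes, uses only the tips below them. The tree encoding, function name, and values are hypothetical, and all branch lengths are assumed to be positive.

```python
def bm_root_estimate(node):
    """Return (estimate, extra_variance) for the subtree rooted at node.

    A node is ("tip", value) or ("node", [(child, branch_length), ...]).
    extra_variance is the uncertainty, in branch-length units, added when
    the estimate is passed up to the parent.
    """
    kind, payload = node
    if kind == "tip":
        return payload, 0.0
    weighted_sum = 0.0
    weight_total = 0.0
    for child, length in payload:
        estimate, extra = bm_root_estimate(child)
        weight = 1.0 / (length + extra)  # children on shorter branches count more
        weighted_sum += weight * estimate
        weight_total += weight
    return weighted_sum / weight_total, 1.0 / weight_total


# Hypothetical log expression values for one zooid type at three tips.
inner = ("node", [(("tip", 2.0), 1.0), (("tip", 4.0), 1.0)])
tree = ("node", [(inner, 0.5), (("tip", 3.0), 1.0)])
root_value, _ = bm_root_estimate(tree)

# Scaled change across the branch from the root to the inner node,
# following the parent-minus-child wording used for step 4.
inner_value, _ = bm_root_estimate(inner)
scaled_change = (root_value - inner_value) / 0.5
```

Each internal estimate is a weighted mean of the values below it, with weights that shrink as branches get longer. This is why values can still be reconstructed at deep nodes when some tips lack data and have been pruned.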
To examine whether expression values in our data set evolve under BM, we simulated a data set of random expression values using fastBM() from the phytools package with empirically derived mean and SD values for each gene tree. Under this model, expression variance accumulates among lineages on the gene tree as a function of time; BM is used as a model of evolution under drift, as well as some forms of natural selection (Felsenstein 1973; Revell and Harmon 2008; Revell et al. 2008).
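A BM simulation of this kind can be sketched in a few lines. This is an illustration of the model rather than the fastBM() implementation. The tree encoding and parameter names are hypothetical, and using the empirical mean as the root value and the empirical SD as the rate parameter is an assumption.

```python
import random


def simulate_bm(children, root, root_value, sigma, rng):
    """Simulate one trait under Brownian motion on a rooted tree.

    children maps a node name to a list of (child name, branch length) pairs.
    Each child value is the parent value plus a normal deviate whose variance
    is sigma**2 times the branch length.
    """
    values = {root: root_value}
    stack = [root]
    while stack:
        parent = stack.pop()
        for child, length in children.get(parent, []):
            step = rng.gauss(0.0, sigma * length ** 0.5)
            values[child] = values[parent] + step
            stack.append(child)
    return values


# Hypothetical calibrated gene tree with three tips.
children = {
    "root": [("n1", 0.5), ("A", 1.0)],
    "n1": [("B", 0.5), ("C", 0.5)],
}
simulated = simulate_bm(children, "root", root_value=2.0, sigma=1.0,
                        rng=random.Random(1))
```

Repeating the simulation across gene trees gives a null distribution of branch changes against which the observed changes can be compared.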
Supplementary Material
Supplementary data are available at Molecular Biology and Evolution online.
Acknowledgments
We would like to thank Steve Haddock, Lynne Christianson, and the whole crew of the Western Flyer for their help and expertise. We thank Lourdes Rojas and Eric Lazo-Wasem for cataloging the specimens used in this study. Additional thanks are due to Samuel Church for his assistance during a research cruise, and for productive conversations about comparative gene expression methods. We also thank Lucas Leclère for his comments on a draft of the manuscript, as well as two anonymous reviewers for their comments. This work was supported by the National Science Foundation (NSF), DEB-1256695, the Waterman Award. C.M. was also supported in part by a RI-EPSCoR Fellowship, NSF EPS-1004057. Analyses were conducted with computational resources and services at the Center for Computation and Visualization at Brown University, supported in part by the NSF EPSCoR EPS-1004057 and the State of Rhode Island. Analyses were also conducted at the Yale Center for Research Computing (YCRC).
Author Contributions
C.M., S.S., C.W.D., and F.Z. made major contributions to the conception and design of this work. C.M. and S.S. collected and dissected specimens. C.M. conducted lab work, data analysis, data interpretation, and drafting of the manuscript. C.W.D. and F.Z. contributed to data analysis and interpretation. M.H. and F.Z. were involved in the creation of software and code (Agalma) used in this work. All authors read, contributed to, and approved the final manuscript.
Data Availability
The data underlying this article are available in Figshare with the following DOIs: 10.6084/m9.figshare.14838384, 10.6084/m9.figshare.14838372, 10.6084/m9.figshare.14838315, 10.6084/m9.figshare.14838090, 10.6084/m9.figshare.14829183. Sequences, along with sampling metadata, are available on NCBI under BioProject ID PRJNA540747. All scripts for the analyses are available in a git repository at https://github.com/dunnlab/Siphonophore_Expression (last accessed February 7, 2022). The most recent commit at the time of the analysis presented here was f1808ad. The git repository is also available as a supplementary zipped folder, Supplementary Material online.
References
- Alexa A, Rahnenfuhrer J.. 2020. topGO: enrichment analysis for gene ontology. R package version 2.40.0. Buffalo (NY): Bioconductor. [Google Scholar]
- Ames CL, Ryan JF, Bely AE, Cartwright P, Collins AG.. 2016. A new transcriptome and transcriptome profiling of adult and larval tissue in the box jellyfish Alatina alata: an emerging model for studying venom, vision and sex. BMC Genomics 17(1):650. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Auer PL, Doerge R.. 2010. Statistical design and analysis of RNA-seq data. Genetics 185(2):405–416. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bagaeva TS, Kupaeva DM, Vetrova AA, Kosevich IA, Kraus YA, Kremnyov SV.. 2019. cWnt signaling modulation results in a change of the colony architecture in a hydrozoan. Dev Biol. 456(2):145–153. [DOI] [PubMed] [Google Scholar]
- Barbosa-Morais NL, Irimia M, Pan Q, Xiong HY, Gueroussov S, Lee LJ, Slobodeniuc V, Kutter C, Watt S, Colak R, et al. 2012. The evolutionary landscape of alternative splicing in vertebrate species. Science 338(6114):1587–1593. [DOI] [PubMed] [Google Scholar]
- Bardi J, Marques AC.. 2007. Taxonomic redescription of the Portuguese man-of-war, Physalia physalis (Cnidaria, Hydrozoa, Siphonophorae, Cystonectae) from Brazil. Iheringia, Sér Zool. 97(4):425–433. [Google Scholar]
- Beklemishev WN. 1969. Principles of comparative anatomy of invertebrates. Vol. I. Edinburgh (United Kingdom: ): Oliver & Boyd. [Google Scholar]
- Brawand D, Soumillon M, Necsulea A, Julien P, Csárdi G, Harrigan P, Weier M, Liechti A, Aximu-Petri A, Kircher M, et al. 2011. The evolution of gene expression levels in mammalian organs. Nature 478(7369):343–348. [DOI] [PubMed] [Google Scholar]
- Breschi A, Djebali S, Gillis J, Pervouchine DD, Dobin A, Davis CA, Gingeras TR, Guigó R.. 2016. Gene-specific patterns of expression variation across organs and species. Genome Biol. 17(1):151. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cardoso-Moreira M, Halbert J, Valloton D, Velten B, Chen C, Shao Y, Liechti A, Ascenção K, Rummel C, Ovchinnikova S, et al. 2019. Gene expression across mammalian organ development. Nature 571(7766):505–509. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Carré C. 1967. Le developpement larvaire d’Abylopsis tetragona. Cah Biol Mar. 8:185–193. [Google Scholar]
- Carré D. 1969. Etude histologique du developpement de Nanomia bijuga (Chiaje, 1841), Siphonophore Physonecte, Agalmidae. Cah Biol Mar. 10:325–341. [Google Scholar]
- Carré C, Carré D.. 1991. A complete life cycle of the calycophoran siphonophore Muggiaea kochi (Will) in the laboratory, under different temperature conditions: ecological implications. Philos Trans R Soc Lond B Biol Sci. 334(1269):27–32. [Google Scholar]
- Church SH, Siebert S, Bhattacharyya P, Dunn CW.. 2015. The histology of Nanomia bijuga (Hydrozoa: Siphonophora). J Exp Zool B Mol Dev Evol. 324(5):435–449. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Clarke TH, Garb JE, Haney RA, Chaw RC, Hayashi CY, Ayoub NA.. 2017. Evolutionary shifts in gene expression decoupled from gene duplication across functionally distinct spider silk glands. Sci Rep. 7(1):8393. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Condamine T, Jager M, Leclère L, Blugeon C, Lemoine S, Copley RR, Manuel M.. 2019. Molecular characterisation of a cellular conveyor belt in clytia medusae. Dev Biol. 456(2):212–225. [DOI] [PubMed] [Google Scholar]
- Darbellay F, Necsulea A. 2020. Comparative transcriptomics analyses across species, organs, and developmental stages reveal functionally constrained lncRNAs. Mol Biol Evol. 37(1):240–259.
- Dunn CW. 2005. Complex colony-level organization of the deep-sea siphonophore Bargmannia elongata (Cnidaria, Hydrozoa) is directionally asymmetric and arises by the subdivision of pro-buds. Dev Dyn. 234(4):835–845.
- Dunn CW, Howison M, Zapata F. 2013. Agalma: an automated phylogenomics workflow. BMC Bioinformatics 14(1):330.
- Dunn CW, Luo X, Wu Z. 2013. Phylogenetic analysis of gene expression. Integr Comp Biol. 53(5):847–856.
- Dunn CW, Wagner GP. 2006. The evolution of colony-level development in the Siphonophora (Cnidaria: Hydrozoa). Dev Genes Evol. 216(12):743–754.
- Emms DM, Kelly S. 2019. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20(1):238.
- Felsenstein J. 1973. Maximum-likelihood estimation of evolutionary trees from continuous characters. Am J Hum Genet. 25(5):471.
- Fukushima K, Pollock DD. 2020. Amalgamated cross-species transcriptomes reveal organ-specific propensity in gene expression evolution. Nat Commun. 11(1):4459.
- Goolsby EW, Bruggeman J, Ané C. 2017. Rphylopars: fast multivariate phylogenetic comparative methods for missing data and within-species variation. Methods Ecol Evol. 8(1):22–27.
- Guang A, Howison M, Zapata F, Lawrence C, Dunn CW. 2021. Revising transcriptome assemblies with phylogenetic information. PLoS One 16(1):e0244202.
- Guder C, Philipp I, Lengfeld T, Watanabe H, Hobmayer B, Holstein T. 2006. The Wnt code: cnidarians signal the way. Oncogene 25(57):7450–7460.
- Haeckel E. 1888. Report on the Siphonophorae collected by HMS Challenger during the years 1873–1876. Report of the Scientific Results of the Voyage of HMS Challenger Zoology. Vol. 28. London (United Kingdom): Eyre & Spottiswoode. p. 1–380.
- Harmon LJ, Weir JT, Brock CD, Glor RE, Challenger W. 2008. GEIGER: investigating evolutionary radiations. Bioinformatics 24(1):129–131.
- Hensel K, Lotan T, Sanders SM, Cartwright P, Frank U. 2014. Lineage-specific evolution of cnidarian Wnt ligands. Evol Dev. 16(5):259–269.
- Hiebert LS, Simpson C, Tiozzo S. 2021. Coloniality, clonality, and modularity in animals: the elephant in the room. J Exp Zool B Mol Dev Evol. 336(3):198–211.
- Hobmayer B, Rentzsch F, Kuhn K, Happel CM, von Laue CC, Snyder P, Rothbächer U, Holstein TW. 2000. WNT signalling molecules act in axis formation in the diploblastic metazoan Hydra. Nature 407(6801):186–189.
- Katoh K, Standley DM. 2013. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 30(4):772–780.
- Klompen AML, Kayal E, Collins AG, Cartwright P. 2021. Phylogenetic and selection analysis of an expanded family of putatively pore-forming jellyfish toxins (Cnidaria: Medusozoa). Genome Biol Evol. 13(6):evab081.
- Kusserow A, Pang K, Sturm C, Hrouda M, Lentfer J, Schmidt HA, Technau U, Von Haeseler A, Hobmayer B, Martindale MQ, et al. 2005. Unexpected complexity of the Wnt gene family in a sea anemone. Nature 433(7022):156–160.
- Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10(3):R25.
- Levin M, Anavy L, Cole AG, Winter E, Mostov N, Khair S, Senderovich N, Kovalev E, Silver DH, Feder M, et al. 2016. The mid-developmental transition and the evolution of animal body plans. Nature 531(7596):637–641.
- Li B, Dewey CN. 2011. RSEM: accurate transcript quantification from RNA-seq data with or without a reference genome. BMC Bioinformatics 12(1):323.
- Li B, Ruotti V, Stewart RM, Thomson JA, Dewey CN. 2010. RNA-seq gene expression estimation with read mapping uncertainty. Bioinformatics 26(4):493–500.
- Love MI, Huber W, Anders S. 2014. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15(12):550.
- Ma S, Avanesov AS, Porter E, Lee BC, Mariotti M, Zemskaya N, Guigo R, Moskalev AA, Gladyshev VN. 2018. Comparative transcriptomics across 14 Drosophila species reveals signatures of longevity. Aging Cell. 17(4):e12740.
- Mackie GO. 1960. Studies on Physalia physalis (L.). Part 2. Behavior and histology. Discovery Reports. Vol. 30. Cambridge (United Kingdom): Cambridge University Press. p. 371–407.
- Mackie GO. 1963. Siphonophores, bud colonies, and superorganisms. In: Dougherty E, editor. The lower metazoa. Berkeley (CA): University of California Press. p. 329–337.
- Mackie GO. 1986. From aggregates to integrates: physiological aspects of modularity in colonial animals. Philos Trans R Soc Lond B Biol Sci. 313(1159):175–196.
- Mackie GO, Pugh PR, Purcell JE. 1987. Siphonophore biology. Adv Mar Biol. 24:97–262.
- Macrander J, Broe M, Daly M. 2016. Tissue-specific venom composition and differential gene expression in sea anemones. Genome Biol Evol. 8(8):2358–2375.
- Macrander J, Brugler MR, Daly M. 2015. A RNA-seq approach to identify putative toxins from acrorhagi in aggressive and non-aggressive Anthopleura elegantissima polyps. BMC Genomics 16(1):221.
- Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. 2021. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol Biol Evol. 38(10):4647–4654.
- McIntyre LM, Lopiano KK, Morse AM, Amin V, Oberg AL, Young LJ, Nuzhdin SV. 2011. RNA-seq: technical variability and sampling. BMC Genomics 12(1):293.
- Merkin J, Russell C, Chen P, Burge CB. 2012. Evolutionary dynamics of gene and isoform regulation in mammalian tissues. Science 338(6114):1593–1599.
- Momose T, Derelle R, Houliston E. 2008. A maternally localised Wnt ligand required for axial patterning in the cnidarian Clytia hemisphaerica. Development 135(12):2105–2113.
- Munro C, Siebert S, Zapata F, Howison M, Damian-Serrano A, Church SH, Goetz FE, Pugh PR, Haddock SHD, Dunn CW. 2018. Improved phylogenetic resolution within Siphonophora (Cnidaria) with implications for trait evolution. Mol Phylogenet Evol. 127:823–833.
- Munro C, Vue Z, Behringer RR, Dunn CW. 2019. Morphology and development of the Portuguese man of war, Physalia physalis. Sci Rep. 9(1):15522.
- Nawrocki AM, Cartwright P. 2013. Expression of Wnt pathway genes in polyps and medusa-like structures of Ectopleura larynx (Cnidaria: Hydrozoa). Evol Dev. 15(5):373–384.
- Necsulea A, Soumillon M, Warnefors M, Liechti A, Daish T, Zeller U, Baker JC, Grützner F, Kaessmann H. 2014. The evolution of lncRNA repertoires and expression patterns in tetrapods. Nature 505(7485):635–640.
- Nguyen L-T, Schmidt HA, Von Haeseler A, Minh BQ. 2015. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol. 32(1):268–274.
- Paradis E, Claude J, Strimmer K. 2004. APE: analyses of phylogenetics and evolution in R language. Bioinformatics 20(2):289–290.
- Perry GH, Melsted P, Marioni JC, Wang Y, Bainer R, Pickrell JK, Michelini K, Zehr S, Yoder AD, Stephens M, et al. 2012. Comparative RNA sequencing reveals substantial genetic variation in endangered primates. Genome Res. 22(4):602–610.
- Plachetzki DC, Pankey SM, Johnson BR, Ronne EJ, Kopp A, Grosberg RK. 2014. Gene co-expression modules underlying polymorphic and monomorphic zooids in the colonial Hydrozoan, Hydractinia symbiolongicarpus. Integr Comp Biol. 54(2):276–283.
- Revell LJ. 2012. phytools: an R package for phylogenetic comparative biology (and other things). Methods Ecol Evol. 3(2):217–223.
- Revell LJ, Harmon LJ. 2008. Testing quantitative genetic hypotheses about the evolutionary rate matrix for continuous characters. Evol Ecol Res. 10(3):311–331.
- Revell LJ, Harmon LJ, Collar DC. 2008. Phylogenetic signal, evolutionary process, and rate. Syst Biol. 57(4):591–601.
- Ryan JF, Mazza ME, Pang K, Matus DQ, Baxevanis AD, Martindale MQ, Finnerty JR. 2007. Pre-bilaterian origins of the Hox cluster and the Hox code: evidence from the sea anemone, Nematostella vectensis. PLoS One 2(1):e153.
- Sanders SM, Cartwright P. 2015. Patterns of Wnt signaling in the life cycle of Podocoryna carnea and its implications for medusae evolution in Hydrozoa (Cnidaria). Evol Dev. 17(6):325–336.
- Sanders SM, Shcheglovitova M, Cartwright P. 2014. Differential gene expression between functionally specialized polyps of the colonial hydrozoan Hydractinia symbiolongicarpus (Phylum Cnidaria). BMC Genomics 15(1):406.
- Sanders SM, Shcheglovitova M, Cartwright P. 2015. Interspecific differential expression analysis of RNA-seq data yields insight into life cycle variation in hydractiniid hydrozoans. Genome Biol Evol. 7(8):2417–2431.
- Siebert S, Goetz FE, Church SH, Bhattacharyya P, Zapata F, Haddock SHD, Dunn CW. 2015. Stem cells in Nanomia bijuga (Siphonophora), a colonial animal with localized growth zones. EvoDevo 6(1):22.
- Siebert S, Robinson MD, Tintori SC, Goetz F, Helm RR, Smith SA, Shaner N, Haddock SHD, Dunn CW. 2011. Differential gene expression in the siphonophore Nanomia bijuga (Cnidaria) assessed with multiple next-generation sequencing workflows. PLoS One 6(7):e22953.
- Sinigaglia C, Busengdal H, Leclere L, Technau U, Rentzsch F. 2013. The bilaterian head patterning gene six3/6 controls aboral domain development in a cnidarian. PLoS Biol. 11(2):e1001488.
- Sudmant PH, Alexis MS, Burge CB. 2015. Meta-analysis of RNA-seq expression data across species, tissues and studies. Genome Biol. 16(1):287.
- Törönen P, Medlar A, Holm L. 2018. PANNZER2: a rapid functional annotation web server. Nucleic Acids Res. 46(W1):W84–W88.
- Totton AK. 1960. Studies on Physalia physalis (L.). Part 1. Natural history and morphology. Discovery Reports. Vol. 30. Cambridge (United Kingdom): Cambridge University Press. p. 301–368.
- Totton AK. 1965. A synopsis of the Siphonophora. London (United Kingdom): British Museum (Natural History).
- Wagner GP, Kin K, Lynch VJ. 2012. Measurement of mRNA abundance using RNA-seq data: RPKM measure is inconsistent among samples. Theory Biosci. 131(4):281–285.
- Yang R, Wang X. 2013. Organ evolution in angiosperms driven by correlated divergences of gene sequences and expression patterns. Plant Cell 25(1):71–82.
- Young MD, Wakefield MJ, Smyth GK, Oshlack A. 2010. Gene ontology analysis for RNA-seq: accounting for selection bias. Genome Biol. 11(2):R14.
- Yu G, Smith DK, Zhu H, Guan Y, Lam TT-Y. 2017. ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Methods Ecol Evol. 8(1):28–36.
- Zhang G, Li C, Li Q, Li B, Larkin DM, Lee C, Storz JF, Antunes A, Greenwold MJ, Meredith RW, et al. 2014. Comparative genomics reveals insights into avian genome evolution and adaptation. Science 346(6215):1311–1320.
Data Availability Statement
The data underlying this article are available in Figshare with the following DOIs: 10.6084/m9.figshare.14838384, 10.6084/m9.figshare.14838372, 10.6084/m9.figshare.14838315, 10.6084/m9.figshare.14838090, 10.6084/m9.figshare.14829183. Sequences, along with sampling metadata, are available on NCBI under BioProject ID PRJNA540747. All scripts for the analyses are available in a git repository at https://github.com/dunnlab/Siphonophore_Expression (last accessed February 7, 2022). The most recent commit at the time of the analysis presented here was f1808ad. The git repository is also available as a supplementary zipped folder, Supplementary Material online.