iScience. 2021 Jul 1;24(7):102811. doi: 10.1016/j.isci.2021.102811

New binding specificities evolve via point mutation in an invertebrate allorecognition gene

Aidan L Huene, Traci Chen, Matthew L Nicotra
PMCID: PMC8282982  PMID: 34296075

Summary

Many organisms use genetic self-recognition systems to distinguish themselves from conspecifics. In the cnidarian, Hydractinia symbiolongicarpus, self-recognition is partially controlled by allorecognition 2 (Alr2). Alr2 encodes a highly polymorphic transmembrane protein that discriminates self from nonself by binding in trans to other Alr2 proteins with identical or similar sequences. Here, we focused on the N-terminal domain of Alr2, which can determine its binding specificity. We pair ancestral sequence reconstruction and experimental assays to show that amino acid substitutions can create sequences with novel binding specificities either directly (via one mutation) or via sequential mutations and intermediates with relaxed specificities. We also show that one side of the domain has experienced positive selection and likely forms the binding interface. Our results provide direct evidence that point mutations can generate Alr2 proteins with novel binding specificities. This provides a plausible mechanism for the generation and maintenance of functional variation in nature.

Subject areas: Molecular Genetics, Molecular Biology, Evolutionary Biology


Highlights

  • Three binding specificities evolved in a clade of five domain 1 sequences

  • One new specificity evolved via a single amino acid mutation

  • Another new specificity evolved through a dual-specificity intermediate

  • Sequence analyses suggest a possible binding interface



Introduction

The ability to discriminate self from same-species nonself (often referred to as allorecognition) has evolved in plants (Fujii et al., 2016), fungi (Paoletti, 2016; Gonçalves et al., 2020), slime molds (Kundert and Shaulsky, 2019), marine invertebrates (Nicotra, 2019), and bacteria (Gibbs and Greenberg, 2011; Pathak et al., 2013; Cao et al., 2019). In all cases, it is based on an organism's genotype at polymorphic loci. This polymorphism is thought to be maintained by a form of balancing selection called negative frequency-dependent selection (Wright, 1939; Kimura and Crow, 1964). Under negative frequency-dependent selection, alleles become more fit as they become less frequent. This is because rare alleles are unlikely to be shared by chance, making them better markers of self. New alleles, the rarest of all, spread in a population until their frequencies reach that of other alleles (Richman and Kohn, 2000). These dynamics can maintain tens to hundreds of self-recognition alleles in a population (Casselton and Olesnicky, 1998; Lawrence, 2000; Gloria-Soria et al., 2012; James, 2015; Nydam et al., 2017; Goncalves et al., 2019). How new, functional self-recognition alleles are generated and ultimately contribute to this extreme polymorphism remains a puzzle.
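The dynamics described above can be illustrated with a toy calculation. The sketch below is not a model from this study: it assumes a haploid population in which an allele's fitness falls linearly with its own frequency (w = 1 - p), and the starting frequencies are hypothetical. Under these assumptions, a new, rare allele rises in frequency until all alleles are equally common.

```python
def next_generation(freqs):
    """Advance allele frequencies by one generation.

    Assumption (illustrative only): the fitness of an allele is
    w = 1 - p, so rarer alleles are fitter.
    """
    fitness = [1.0 - p for p in freqs]
    mean_fitness = sum(p * w for p, w in zip(freqs, fitness))
    return [p * w / mean_fitness for p, w in zip(freqs, fitness)]


def simulate(freqs, generations):
    """Iterate the toy model for a fixed number of generations."""
    for _ in range(generations):
        freqs = next_generation(freqs)
    return freqs


# Hypothetical starting point: two common alleles and one new, rare allele.
start = [0.495, 0.495, 0.01]
final = simulate(start, 100)
print([round(p, 3) for p in final])
```

In this sketch the rare allele gains frequency every generation until the three alleles are balanced, which is the qualitative behavior expected under negative frequency-dependent selection.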

Hydractinia symbiolongicarpus is a colonial cnidarian that uses proteins that are their own ligand for allorecognition (Frank et al., 2020). Hydractinia colonies begin when a sexually produced larva settles on a hermit crab shell and metamorphoses into a polyp. The animal then expands across the shell by elongating stolons (extensions of its gastrovascular system) or mat (a plate of tissue that fills the space between stolons), from which new polyps grow to form a mature colony. As it grows, a colony's stolons and mat edges meet and fuse to create an anastomosing network of gastrovascular canals embedded in a continuous sheet of mat. The colony will also fuse to itself as it grows around the shell or recovers from injury. Because nearly half of all shells bear more than one colony (Yund et al., 1987), colonies also frequently encounter conspecifics. This usually elicits an aggressive rejection response in which the colonies fight by firing nematocysts (harpoon-like organelles) until one dies (Nicotra and Buss, 2005).

Previous experiments with inbred, laboratory strains of Hydractinia have demonstrated that colonies can distinguish self from nonself by their genotype at two linked genes called allorecognition 1 (Alr1) and allorecognition 2 (Alr2) (Cadavid et al., 2004; Powell et al., 2007, 2011). Animals that shared at least one allele at both loci fused, while those that shared no alleles at either Alr1 or Alr2 rejected. If colonies only shared alleles at one locus, they fused but then separated. Because only two alleles were present at each locus in these strains, it was impossible to determine how similar alleles need to be for colonies to fuse. In addition, subsequent experiments with wild-type colonies have strongly suggested at least one additional allorecognition locus exists in the genomic region encoding Alr1 and Alr2 (Powell et al., 2007, 2011; Nicotra et al., 2009; Rosa et al., 2010).
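The fusion rule inferred from these inbred strains can be written as a short decision function. This is a minimal sketch: the function name, the dictionary layout, and the allele labels "a" and "b" are illustrative assumptions, and the rule covers only the two-allele laboratory situation described above.

```python
def predict_outcome(colony_1, colony_2):
    """Predict the allorecognition outcome between two colonies.

    Each colony is a dict mapping a locus name ("Alr1", "Alr2") to the
    set of alleles it carries at that locus.
    """
    shared = [bool(colony_1[locus] & colony_2[locus]) for locus in ("Alr1", "Alr2")]
    if all(shared):
        return "fusion"             # at least one allele shared at both loci
    if any(shared):
        return "transitory fusion"  # alleles shared at only one locus
    return "rejection"              # no alleles shared at either locus


# Hypothetical genotypes with placeholder allele names.
homozygote_a = {"Alr1": {"a"}, "Alr2": {"a"}}
heterozygote = {"Alr1": {"a", "b"}, "Alr2": {"a", "b"}}
homozygote_b = {"Alr1": {"b"}, "Alr2": {"b"}}
recombinant = {"Alr1": {"a"}, "Alr2": {"b"}}

print(predict_outcome(homozygote_a, heterozygote))
print(predict_outcome(homozygote_a, homozygote_b))
print(predict_outcome(homozygote_a, recombinant))
```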

Alr1 and Alr2 both encode type I transmembrane proteins with tandem Ig-like domains in their extracellular regions (Nicotra et al., 2009; Rosa et al., 2010). Each Alr is capable of cell-to-cell (i.e., trans) homophilic binding (Karadge et al., 2015). Binding is restricted to isoforms with identical or very similar sequences (Karadge et al., 2015). These results, combined with the fact that Hydractinia must share Alr1 and Alr2 alleles to recognize each other as self, have led to the hypothesis that homophilic binding of Alr1 and Alr2 between colonies is part of the in vivo self-recognition mechanism.

Alr1 and Alr2 are also highly polymorphic. A study of Alr2 identified 183 distinct Alr2 amino acid sequences from a single population (Gloria-Soria et al., 2012). Alr1 is expected to be similarly diverse based on the extreme levels of sequence polymorphism observed in 20 sequenced alleles (Rosa et al., 2010). These observations suggest hundreds of distinct binding specificities could exist in nature.

Two features of Hydractinia's natural history likely contribute to the evolution of this extreme polymorphism. First, colonies must be able to compete for space while simultaneously retaining the ability to recognize and fuse to themselves. Thus, a new allele that binds only to itself is favored because it permits a colony to compete with every other Hydractinia in the population but still fuse with itself. Second, Hydractinia has a pluripotent stem cell lineage that can differentiate into germ cells at any point in the colony's life. Fusion allows these stem cells to migrate from one colony into the other, where they could dominate its gametic output. This phenomenon, called stem cell parasitism, has been observed anecdotally in Hydractinia (Künzel et al., 2010; Dubuc et al., 2020) and is thought to be a common trait in most colonial organisms (Buss, 1987; Stoner and Weissman, 1996; Stoner et al., 1999; Laird et al., 2005; Aanen et al., 2008). Thus, a new allele that restricts fusion to self would be favored because it would reduce the risk of stem cell parasitism.

It has been assumed that novel Alr1 and Alr2 alleles are generated by random mutations that are then subjected to negative frequency-dependent selection. This raises the question of whether point mutations, by themselves, can generate alleles with novel homophilic binding specificities and, furthermore, whether this type of mutation could, in part, explain the large number of binding specificities thought to exist in natural populations.

Here, we sought to determine how binding specificities evolve in the N-terminal domain of Alr2. This domain, referred to as “domain 1,” is the most polymorphic region of Alr2. Changes in domain 1 can prevent Alr2 proteins from binding and therefore might be able to generate alleles with new identities. To determine how this domain has evolved in nature, we identified a clade of five domain 1 sequences encoding isoforms that differed by six or fewer amino acids. We then used ancestral sequence reconstruction and in vitro binding assays to determine the evolutionary history of the clade. Our results demonstrate that the binding specificity of domain 1 can be altered by single amino acid changes, resulting in novel specificities or intermediates with broadened specificities. Finally, we show that one face of the predicted domain 1 structure appears to be under diversifying selection, which also allows us to hypothesize that Alr2 protein-protein interactions occur in a side-to-side manner.

Results

Point mutations in domain 1 can create new binding specificities

We searched a data set of full-length, naturally occurring Alr2 alleles (Nicotra et al., 2009; Gloria-Soria et al., 2012) and identified two (111A06 and 214E06) that encoded Alr2 allelic isoforms (hereafter, “isoforms”) with six amino acid differences in domain 1 and identical sequences across the rest of the extracellular region (Figures 1A and 1B). Using cell aggregation assays (Karadge et al., 2015), we found that each isoform bound to itself across opposing cell membranes but did not bind to the other (Figure 1C). We therefore sought to identify the amino acid differences that prevented them from binding to each other.

Figure 1.


Isoform-specific, homophilic binding of Alr2 isoforms

(A) Alr2 protein structure. SP = Signal peptide, ECS = Extracellular spacer, TM = Transmembrane domain, CT = Cytoplasmic tail.

(B) Multiple sequence alignment of 111A06 and 214E06 domain 1. Polymorphisms highlighted in purple.

(C) Cell aggregation assays of 111A06 and 214E06. Cells transfected with vectors encoding only fluorescent proteins (eGFP or mRuby2) do not form aggregates (bottom right). Scale bar = 100 μm.

Each amino acid difference between 111A06 and 214E06 is the result of one point mutation. To reconstruct the evolutionary history of these mutations, we created a phylogeny of all known domain 1 coding sequences (Figure 2A). 111A06 and 214E06 were located in a clade with three additional sequences (Figure 2B). We then used ancestral sequence reconstruction to infer the sequence of each node. All but the ancestral node (Anc) were predicted to be identical to an extant sequence (Figure 2C). Because 214E06 and Hap010 differed only by a single synonymous mutation, we used 214E06 to represent their shared amino acid sequence.

Figure 2.


Evolution of novel binding specificities via point mutation

(A) Maximum-likelihood tree of 146 domain 1 coding sequences.

(B) Expansion of clade that includes 111A06 and 214E06. Allele names on branch. Amino acid changes indicated along branches.

(C) Multiple sequence alignment of clade. Variant residues highlighted.

(D) Plasmid map of Alr2 fusion proteins.

(E–H) Representative images of cell aggregation assays.

(E) Anc, 046B, and Hap074 against themselves.

(F) Anc, 046B, and Hap074 against 111A06.

(G) All pairwise combinations of Anc, 046B, and Hap074.

(H) Anc, 046B, and Hap074 versus 214E06. Arrowheads point to semi-mixed aggregates (See also Figure S1). Scale bar = 100 μm.

(I) Node network of isoforms colored by binding specificity. Triangles indicate the hypothesized direction of mutation from Anc. Green dotted lines indicate weaker heterophilic interactions.

To determine the binding specificity of the domain 1 isoforms encoded by these sequences, we expressed each as a fusion to domain 2 through the cytoplasmic tail of the 111A06 isoform, with a C-terminal fluorescent protein tag (Figure 2D). The resulting isoforms were tested against themselves and each other in cell aggregation assays (Figures 2E–2H). Each isoform, including the predicted ancestor, Anc, caused cells to form multicellular aggregates, indicating it was capable of homophilic binding (Figure 2E). In pairwise assays, 111A06 did not form mixed aggregates with any isoform, indicating it had a unique binding specificity within the clade (Figure 2F). In contrast, Anc, 046B, and Hap074 all formed mixed aggregates with each other, indicating a shared binding specificity (Figure 2G).

In assays that paired 214E06 with Anc, 046B, or Hap074, we observed single-color aggregates, some of which appeared to adhere to aggregates of a different color (Figure 2H, arrowheads). These semimixed aggregates were repeatable (Figure S1A) and qualitatively different from the mixed aggregates it formed when paired with itself and from the completely separate aggregates it formed with the other four isoforms. This ruled out a defect in 214E06 that prevented homophilic binding or caused it to bind to any isoform. Semimixed aggregates have been observed in studies of cell adhesion molecules that have strong homophilic affinities but weaker heterophilic affinities (Katsamba et al., 2009; Goodman et al., 2016). Because of this, we concluded that 214E06 binds more weakly to Anc, 046B, and Hap074 than to itself and that it therefore had a different binding profile from the other isoforms.
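One way to summarize these pairwise results is to treat isoforms that form fully mixed aggregates as sharing a specificity and to group them accordingly. The sketch below encodes only the outcomes reported in the text for the five clade isoforms. The three outcome labels and the grouping rule (only "mixed" counts as a shared specificity) are simplifying assumptions made for illustration.

```python
from itertools import combinations

ISOFORMS = ["Anc", "046B", "Hap074", "111A06", "214E06"]


def pair(a, b):
    """Unordered pair of isoform names, usable as a dictionary key."""
    return frozenset((a, b))


# Heterotypic outcomes as described in the text.
OBSERVED = {}
for a, b in combinations(["Anc", "046B", "Hap074"], 2):
    OBSERVED[pair(a, b)] = "mixed"
for other in ["Anc", "046B", "Hap074"]:
    OBSERVED[pair("214E06", other)] = "semimixed"
for other in ["Anc", "046B", "Hap074", "214E06"]:
    OBSERVED[pair("111A06", other)] = "separate"


def specificity_groups(isoforms, observed):
    """Merge isoforms into groups whenever a pair formed mixed aggregates."""
    groups = [{iso} for iso in isoforms]
    for key, outcome in observed.items():
        if outcome != "mixed":
            continue
        a, b = tuple(key)
        group_a = next(g for g in groups if a in g)
        group_b = next(g for g in groups if b in g)
        if group_a is not group_b:
            group_a |= group_b
            groups = [g for g in groups if g is not group_b]
    return groups


for group in specificity_groups(ISOFORMS, OBSERVED):
    print(sorted(group))
```

Under this rule the five isoforms fall into three groups: Anc, 046B, and Hap074 together, with 111A06 and 214E06 each on their own.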

Our results are consistent with the following evolutionary history (Figure 2I). An ancestral sequence, Anc, underwent a single mutation, N32Y, which created a daughter sequence, 111A06, with a novel binding specificity. In a separate lineage, the Anc sequence underwent two mutations, T76R and E93K, to create 046B, which retained the ability to bind to Anc. A third mutation, S89L, then created Hap074, which also remained able to bind Anc and 046B. Two more mutations, S44G and G47E, then created 214E06, which bound more weakly to the ancestral isoforms than to itself (Figure 2I, dotted lines). The result is a clade in which we can discern three binding specificities, one of which arose via a single point mutation.
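As a consistency check on this history, each isoform can be represented by the set of substitutions separating it from Anc. The number of differences between any two isoforms is then the size of the symmetric difference of their sets. This shortcut is valid here only because no position is mutated more than once in the clade. The dictionary and function names below are illustrative.

```python
# Substitutions carried by each isoform relative to Anc, as described in the text.
SUBSTITUTIONS = {
    "Anc": frozenset(),
    "111A06": frozenset({"N32Y"}),
    "046B": frozenset({"T76R", "E93K"}),
    "Hap074": frozenset({"T76R", "E93K", "S89L"}),
    "214E06": frozenset({"T76R", "E93K", "S89L", "S44G", "G47E"}),
}


def differences(isoform_a, isoform_b):
    """Substitutions present in exactly one of the two isoforms."""
    return SUBSTITUTIONS[isoform_a] ^ SUBSTITUTIONS[isoform_b]


print(sorted(differences("111A06", "214E06")))
print(len(differences("111A06", "214E06")))
```

The count for 111A06 versus 214E06 is six, matching the six amino acid differences between these two isoforms in domain 1.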

New homophilic specificities can evolve via less restricted intermediates

Within the phylogeny, two pairs of mutations occurred within single branches (Figure 2B), preventing us from determining which came first. To determine whether the missing single-step intermediates were functional (i.e., able to bind homophilically) or had a different binding specificity from their parent and daughter sequences, we recreated each one (Figure 3A) and tested it in cell aggregation assays. We found each intermediate could bind homophilically (Figure 3B), thus ruling out the possibility that there were nonfunctional intermediates in the clade.
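The missing intermediates follow directly from the branches that carry two mutations: each possible order of the two mutations passes through one single-step intermediate. A minimal sketch of that enumeration is shown below. The function name is hypothetical, and the naming scheme (parent name, hyphen, mutation) simply mirrors the labels used for the mutants in this study.

```python
from itertools import permutations


def mutational_paths(parent, mutations):
    """List the intermediates visited under each possible mutation order.

    Each path contains the genotypes between the parent and the daughter
    that carries all of the mutations.
    """
    paths = []
    for order in permutations(mutations):
        steps = [parent + "-" + "-".join(order[:i]) for i in range(1, len(order))]
        paths.append(steps)
    return paths


print(mutational_paths("Anc", ["T76R", "E93K"]))
print(mutational_paths("Hap074", ["S44G", "G47E"]))
```

For a branch with two mutations this gives two paths with one intermediate each, so four single-step intermediates in total across the two branches.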

Figure 3.


Domain 1 isoforms can evolve via intermediates with broadened specificity

(A) Expanded node network including hypothesized single-step mutants between Anc and 046B, Hap074 and 214E06. Scale bar = 100 μm and applies to all images.

(B–H) Representative images of cell aggregation assays.

(B) Mutants tested against themselves.

(C) Anc-T76R and Anc-E93K tested against Anc, 046B, Hap074.

(D) Anc-T76R versus 111A06 (left) and Anc-E93K versus 111A06 (right). Semi-mixed aggregates indicated with arrowheads (See also Figure S1A).

(E) Anc-T76R and Anc-E93K versus 214E06 (See also Figure S1C).

(F) Hap074-S44G and Hap074-G47E versus 111A06.

(G) Hap074-S44G and Hap074-G47E versus 214E06.

(H) Hap074-S44G and Hap074-G47E versus all remaining isoforms.

We next tested the specificity of each missing intermediate. The first pair, Anc-T76R and Anc-E93K, formed mixed aggregates with Anc, 046B, and Hap074 (Figure 3C). Assays pairing Anc-T76R with 111A06 resulted in single-color aggregates (Figure 3D), but those pairing Anc-E93K with 111A06 resulted in a few semimixed aggregates (Figure 3D, arrowheads, Figure S1B). Both mutants also formed semimixed aggregates when paired with 214E06 (Figures 3E and S1C). Thus, evolution from Anc to 046B is unlikely to have involved a significant change in binding specificity (Figure 3A).

In contrast, the specificity of the second pair of intermediates, Hap074-S44G and Hap074-G47E, was different from their parent and daughter sequences. These mutants failed to form mixed aggregates with 111A06 (Figure 3F) but did form mixed aggregates with 214E06 (Figure 3G) and all other ancestral sequences (Figure 3H). We did not observe semimixed aggregates in any assay. These results suggest the first mutation on the path from Hap074 to 214E06, either S44G or G47E, created a sequence that could still bind Hap074 (Figure 3A). The acquisition of the second mutation then generated a new allele, 214E06, which remained able to bind its parent sequence, but had a weaker affinity for Hap074. The evolution of new domain 1 sequences can therefore proceed through intermediates with broader specificities than their parental or daughter sequences.

The N32Y mutation preserves homophilic binding and alters specificity

Isoform 111A06 evolved when position 32 mutated from Asn to Tyr in Anc. We therefore hypothesized the N32Y mutation might turn 046B or Hap074, which had the same specificity as Anc, into isoforms with the same specificity as 111A06. To test this, we generated 046B-N32Y and Hap074-N32Y. In assays with themselves, each formed mixed aggregates, indicating the mutation did not disrupt homophilic binding (Figure S1D). In pairwise assays with each other and 111A06, the mutants formed mixed aggregates, indicating they had gained the ability to bind 111A06 and each other (Figures 4A and S1G). In pairwise assays with their immediate ancestors, however, the mutants formed semimixed aggregates (Figure 4B, asterisks, and Figure S1E). This indicated each could still bind its ancestor, albeit more weakly than it did itself. Finally, we performed pairwise assays with the remaining isoforms in the clade. This showed the mutants had different specificities than 111A06, 046B, or Hap074 (Figures 4B and S2). In sum, the N32Y mutation altered the specificities of 046B and Hap074 but did not generate daughter sequences with the same specificity as 111A06.
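Introducing the same substitution into different backgrounds amounts to applying one edit, written in the standard notation (original residue, position, new residue), to several sequences. The sketch below shows how such a label can be parsed and applied with a check on the original residue. The helper name is hypothetical, the sequence is a placeholder rather than a real domain 1 sequence, and positions are assumed to be 1-based indices into the string supplied.

```python
import re


def apply_substitution(sequence, substitution):
    """Apply a substitution such as "N32Y" to an amino acid sequence."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", substitution)
    if match is None:
        raise ValueError(f"cannot parse substitution {substitution!r}")
    old, position, new = match.group(1), int(match.group(2)), match.group(3)
    index = position - 1  # convert the 1-based position to a 0-based index
    if index < 0 or index >= len(sequence) or sequence[index] != old:
        raise ValueError(f"expected {old} at position {position}")
    return sequence[:index] + new + sequence[index + 1:]


# Placeholder sequence with Asn at position 32; NOT a real Alr2 sequence.
toy_sequence = "A" * 31 + "N" + "A" * 10
mutant = apply_substitution(toy_sequence, "N32Y")
print(mutant[31])
```

Checking the original residue before editing guards against applying a substitution to a background in which that position has already changed.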

Figure 4.


Effects of N32Y mutation on binding specificity and structural analysis

(A) Results of assays between N32Y mutants and 111A06 (See also Figures S1G and S1H).

(B) Binding profiles of 111A06, 046B-N32Y, Hap074-N32Y, 046B, and Hap074. Asterisk denotes the result of an allele and its N32Y mutant.

(C) Binding profiles of 214E06 and 214E06-N32Y.

(D) Predicted structure of Anc domain 1. Six variant residues labeled.

(E) Sequence conservation mapped onto domain 1.

(F) Residues predicted to have experienced either diversifying or purifying selection mapped onto domain 1. Colors correspond to the predictions of MEME and/or FEL. Arrowhead indicates the one residue predicted to be under positive selection by FEL only.

(G) Hypothetical binding topologies of Alr2.

We next tested whether the N32Y mutation would alter the specificity of 214E06, the remaining domain 1 isoform known to exist in nature. We generated 214E06-N32Y and found it formed mixed aggregates with itself (Figure S1D), indicating it was able to bind homophilically. It also formed semimixed aggregates with 214E06 (Figure 4C, asterisk, and Figure S1F), indicating a reduced binding affinity for its immediate ancestor compared to itself. However, 214E06-N32Y only formed semimixed aggregates with 111A06, 046B-N32Y, and Hap074-N32Y (Figure 4A). Thus, simply sharing a Tyr at position 32 was insufficient for isoforms to bind each other as strongly as they did themselves. Pairwise assays with the remaining isoforms revealed 214E06-N32Y to have a different binding profile than 214E06, with the exception of the mixed aggregates formed with Hap074-S44G (Figures 4C and S3). The effect of the N32Y mutation thus depends on the sequence context in which it occurs.

Structural and evolutionary analyses suggest a potential binding interface

In this study, three mutations changed the binding specificity of domain 1 (N32Y, S44G, and G47E), and three others did not. To investigate how these mutations might affect the tertiary structure of domain 1, and thus its binding specificity, we used I-TASSER to predict their structures. All were predicted to fold like V-set Ig-domains, which was consistent with previous work (Nicotra et al., 2009). Five mutations mapped to one face of the predicted beta-sandwich, with the three specificity-altering mutations in close proximity to each other in beta-strands C and C′ (Figure 4D shows the structure of Anc for illustration). This suggested these strands are involved in homophilic binding between compatible domain 1 isoforms.

To gain further insight into the mechanism of homophilic binding, we compared the predicted structures of domains differing by a single amino acid (e.g., Anc vs 111A06). We noted many differences in the orientation of the mutated residues and their nearby amino acids. However, molecular dynamics simulations indicated these orientations were probably unstable (data not shown), so we did not analyze the models any further.

As an alternative approach to identify functionally important parts in domain 1, we reasoned that selection should increase sequence variation at or near the binding site. We therefore calculated the level of sequence variation at each site across all known domain 1 sequences, then mapped this metric onto the predicted structure of Anc. We found that most of the variable sites were also concentrated on the side of the domain that includes strands C and C′ and residues 32, 44, and 47 (Figure 4E).
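A per-site variation score of this kind can be computed directly from a multiple sequence alignment. The text does not state which metric was used, so the sketch below assumes Shannon entropy per alignment column, which is one common choice. The four-residue alignment is a made-up toy example, not Alr2 data.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return sum((n / total) * log2(total / n) for n in counts.values())


def site_variation(alignment):
    """Entropy at every column of an alignment of equal-length sequences."""
    return [column_entropy(column) for column in zip(*alignment)]


# Toy alignment (hypothetical): four sequences, four aligned positions.
toy_alignment = ["NSGT", "YSGT", "NGET", "YSET"]
scores = site_variation(toy_alignment)
print([round(score, 3) for score in scores])
```

An invariant column scores zero and a column split evenly between two residues scores one bit, so higher values mark more variable sites that could then be mapped onto a structural model.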

One explanation for this increase in variation is that positive (diversifying) selection is acting on amino acid positions at the binding interface because this can generate new specificities.

Although current sequence-based methods do not allow one to test whether a single mutation on a single branch experienced positive selection (Murrell et al., 2012; Spielman et al., 2019), we were able to test whether positive selection has acted on specific sites in domain 1 across the entire phylogeny of domain 1 sequences. To do this, we analyzed the alignment of all known domain 1 sequences with MEME (Murrell et al., 2012) and FEL (Pond and Frost, 2005). Thirty sites were predicted to have experienced positive selection and were concentrated on the side of the domain that includes strands C and C′ (Figure 4F, Table S1). Twenty sites were predicted to be under negative (purifying) selection and mapped to this side of the domain.

With respect to the six positions at which mutations occurred in our clade of interest, sites 32, 44, 89, and 93 were predicted to have experienced positive selection on at least one branch of the full phylogeny, but site 47 was not. Site 76 was predicted by MEME to be under positive selection, but by FEL to be under negative selection, a pattern consistent with a burst of diversifying selection against a background of purifying selection (Spielman et al., 2019). In all, these results are consistent with positive selection acting to increase sequence variation at sites on a probable binding face.
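Because MEME is designed to detect episodic selection and FEL to detect pervasive selection, the two calls for a site can be combined into a single interpretation. The sketch below is one illustrative way to do so. The function name and the category labels are assumptions made here, not output produced by HyPhy.

```python
def interpret_site(meme_positive, fel_call):
    """Combine a MEME call and a FEL call for one alignment site.

    meme_positive: True if MEME reports episodic positive selection.
    fel_call: "positive", "negative", or "neutral" according to FEL.
    """
    if meme_positive and fel_call == "negative":
        return "episodic diversifying selection on a purifying background"
    if fel_call == "positive":
        return "pervasive positive selection"
    if meme_positive:
        return "episodic positive selection"
    if fel_call == "negative":
        return "purifying selection"
    return "no selection detected"


# Site 76 as described in the text: positive by MEME, negative by FEL.
print(interpret_site(True, "negative"))
```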

Taken together, these evolutionary signatures also suggest Alr2 proteins might bind via “side-to-side” interactions at their N-terminal domains. We speculate these interactions could occur in either an antiparallel or parallel topology (Figure 4G).

Discussion

Domain 1 is the most polymorphic region of Alr2 (Gloria-Soria et al., 2012). Here, we demonstrate that sequence differences in this domain can prevent Alr2 isoforms from binding to each other. Then, by reconstructing the history of a small domain 1 sequence family, we show that new sequences capable of discriminating between themselves and their ancestors can evolve via point mutation. This can occur with as little as one mutation or via sequential mutations leading through intermediates with relaxed specificities. The fact that so few mutations occurred within this family also increases our confidence in our sequence reconstructions. Because sequence differences in domain 1 are sufficient to alter Alr2 specificity, these mutations may have generated Alr2 alleles with novel identities. Moreover, because the sequences in this study were drawn from a single population, our results show that natural selection maintains ancestral sequences alongside ones encoding new specificities. Thus, our results reveal a mechanism capable of generating, maintaining, and increasing the functional diversity of Alr2.

In this study, we failed to identify domain 1 sequences that could not bind homophilically. This is somewhat surprising because alleles incapable of homophilic binding might be expected to exist in nature. Colonies that are Alr2a/null (where a is an allele encoding a homophilic binding protein and null cannot bind homophilically) might be functionally equivalent to Alr2a/a colonies. This is possible because fusions between colonies sharing only one allele are identical to fusions between colonies that share two alleles. Colonies with null alleles might even have a fitness advantage because the probability that they will fuse with nonself is reduced from the sum of two allele frequencies to the frequency of a single allele. So, why have we not detected null alleles in this and a previous study (Karadge et al., 2015)? One possibility is that null alleles are rare, and we have not found one yet because we have only studied ∼5% of sequence variation at Alr2. A second possibility is that Alr2a/null animals are not, in fact, equivalent to Alr2a/a animals. This might be true if Alr2 has essential functions beyond self-recognition at the colony border. In fact, we suspect this is the case because Alr2 is constitutively expressed from embryonic development through adulthood and across all tissues in a colony (Nicotra et al., 2009). Alr2 might therefore be required to maintain adhesion between epithelial cell layers. If true, Alr2wt/null colonies might be unfit, and Alr2null/null animals might be inviable. This would also place an upper limit on the total frequency of null alleles in a population. A third possibility is that our assay is unable to detect null alleles. This would be the case if cells aggregate in our assay at a lower affinity than that required for colonies to recognize a tissue as self.

Assuming our assay does correlate with the in vivo function of Alr2, the observation that three sequences (Anc, 046B, and Hap074) encode the same binding specificity might also seem surprising, since their common specificity would make them less fit than 111A06 or 214E06. Alr2 allele frequencies might differ from expected equilibrium frequencies if the population has experienced changes in gene flow or recent bottlenecks. Similarly, if there are beneficial alleles at nonallorecognition genes tightly linked to Alr2, some Alr2 alleles might have higher than expected frequencies due to genetic hitchhiking. In addition, we note that the tree for this clade does not represent actual allele frequencies because we removed duplicate sequences prior to constructing the phylogeny. Indeed, in the original study (Gloria-Soria et al., 2012), which reported near-saturation sampling of a single population in Long Island Sound, USA, the 111A06 specificity was represented by three alleles, the Anc/046B/Hap074 specificity by five alleles, and the 214E06 specificity by three alleles. Although essentially anecdotal, this distribution is closer to the expectation of equal phenotype frequencies. These considerations suggest that future work elucidating the population genetics of Hydractinia, comprehensively assessing the full breadth of Alr2 binding specificity diversity, and annotating genomic regions linked to Alr1 and Alr2 will be fruitful.

Many positions in domain 1 appear to have experienced positive selection that was either episodic (i.e., limited to particular branches and detected by MEME) or pervasive (under pressure throughout the phylogeny and detected by FEL). As previously mentioned, our evolutionary analyses cannot tell us whether the six specific mutations that occurred within the branches of our clade experienced positive selection. What we can say is that four mutations occurred at positions under positive selection somewhere in the phylogeny. Two of these mutations (at positions 32 and 44) altered binding specificity and two did not (89 and 93). One interpretation of this is that nonsynonymous mutations are favored at these positions because they can alter specificity in some sequence contexts, some of which are present in other branches of the tree. Alternatively, these latter mutations might actually alter specificity at a level that our assays could not detect. With respect to position 47, which was not found to be under positive selection but which did alter binding specificity, it is possible that positive selection was present but neither MEME nor FEL had power to detect it because the branches were short. The same explanation could apply to position 76, although our results suggest that positive selection acted only briefly and against a background of strong negative selection. This would also be consistent with mutations at this position altering specificity elsewhere in the larger tree. Several sites in the hypothesized binding surface were also predicted to be under negative selection. These sites could be highly conserved because altering them would render the domain incapable of homophilic binding at all. Further work to complement these analyses with functional assays will answer these questions.

Because our study identified residues that affect homophilic binding specificity, we attempted to use structural modeling to identify the biophysical basis of this specificity. Ultimately, we determined that the models produced by I-TASSER were not reliable enough for us to do so. Understanding the biophysical mechanism of this exquisite specificity must therefore await experimentally determined structures. We were, however, able to use sequence variation to generate a hypothesis for how the proteins interact. Across all Alr2 alleles, positions with the highest degree of variation, and those experiencing diversifying selection, were predicted to occur on one side of the Ig-like beta-sandwich. This suggests that the N-terminal domains of Alr2 bind in a side-to-side manner.

Although we focused here on domain 1, other regions might also determine binding specificity. Evidence for this comes from the fact that the entire extracellular region of Alr2 is polymorphic, and from the prediction that residues in domains 2–3 and the ECS have experienced diversifying selection (Nicotra et al., 2009; Gloria-Soria et al., 2012). Point mutations in these regions might also give rise to new alleles. Recombination might also generate novel binding specificities. Domains 1–3 and the ECS are each encoded by single exons. These exons frequently recombine between Alr2 alleles and get shuffled between Alr2 and several adjacent pseudogenes via gene conversion or unequal crossing over (Gloria-Soria et al., 2012). This could generate chimeric domain 1 sequences with novel specificities. It might also bring together new combinations of domains 1–3 or the ECS that would have different specificities than either of the nonrecombinant parental alleles.

In light of our results, Hydractinia would be a productive system in which to study “sequence space”—the theoretical universe of all possible peptides of a given length. Long-standing questions about how many functional variants of a protein exist in sequence space, how many of these actually appear in nature, and whether evolution is constrained in its ability to reach them remain unresolved (Weinreich et al., 2006; Povolotskaya and Kondrashov, 2010; Podgornaia and Laub, 2015). Because natural selection drives the continued evolution of new allorecognition alleles, allorecognition loci like Alr2 are essentially natural experiments exploring sequence space.

Limitations of the study

The main limitation of this study is the qualitative nature of our cell aggregation assays. Although such assays are commonly used to test binding in cell adhesion molecules (Kasinrerk et al., 1999; Katsamba et al., 2009; Schreiner and Weiner, 2010; Thu et al., 2014; Rubinstein et al., 2015; Goodman et al., 2016), it can be difficult to draw conclusions from them about quantitative binding affinities. This is particularly true here because we used transient transfections, which led to unavoidable variation in the expression of each Alr2 isoform between cell populations. This prevented us from using measures of aggregation speed or aggregate size to infer their binding strength. In other words, in this study, assays with just one allele reveal whether the encoded protein can bind to itself in trans but do not indicate its homophilic binding affinity. Similarly, assays in which two alleles are present only reveal whether homophilic or heterophilic interactions were favored. Therefore, it is possible that isoforms that did not bind each other in our assays would, in fact, bind heterophilically if homophilic interactions were prevented, as would likely be the case if they were expressed on the outward facing epithelia of opposing Hydractinia colonies. With this limitation in mind, we conservatively interpreted “semimixed” aggregates as indicating that two isoforms had heterophilic affinities that were relatively weaker than their homophilic affinities. We hypothesize this type of aggregate formed because the difference in affinities led to homophilic clusters that then associated heterophilically. This interpretation is in line with what is thought to happen when similar aggregates form with cadherins and other immunoglobulin superfamily cell adhesion proteins (Katsamba et al., 2009; Goodman et al., 2016). These caveats should be kept in mind when extrapolating our results to nature. 
Resolving this issue will require quantitative assays paired with transgenic experiments to ectopically express these alleles in living colonies and determine their phenotypic effect, an experimental approach now possible thanks to recent advances in Hydractinia functional genomics (Sanders et al., 2018).

STAR★Methods

Key resources table

REAGENT or RESOURCE SOURCE IDENTIFIER
Experimental models: cell lines

HEK293T cells ATCC Cat# CRL-3216

Recombinant DNA

pFLAG-CMV3-111A06-eGFP This paper pUP801
pFLAG-CMV3-111A06-mRuby2 This paper pUP738
pFLAG-CMV3-214E06-eGFP This paper pUP836
pFLAG-CMV3-214E06-mRuby2 This paper pUP746
pFLAG-CMV3-Anc-eGFP This paper pUP871
pFLAG-CMV3-Anc-mRuby2 This paper pUP748
pFLAG-CMV3-046B-eGFP This paper pUP872
pFLAG-CMV3-046B-mRuby2 This paper pUP750
pFLAG-CMV3-Hap074-eGFP This paper pUP894
pFLAG-CMV3-Hap074-mRuby2 This paper pUP752
pFLAG-CMV3-214E06-N32Y-eGFP This paper pUP838
pFLAG-CMV3-214E06-N32Y-mRuby2 This paper pUP839
pFLAG-CMV3-Hap074-G47E-eGFP This paper pUP875
pFLAG-CMV3-Hap074-G47E-mRuby2 This paper pUP876
pFLAG-CMV3-Hap074-S44G-eGFP This paper pUP878
pFLAG-CMV3-Hap074-S44G-mRuby2 This paper pUP877
pFLAG-CMV3-Anc-T76R-eGFP This paper pUP866
pFLAG-CMV3-Anc-T76R-mRuby2 This paper pUP865
pFLAG-CMV3-Anc-E93K-eGFP This paper pUP879
pFLAG-CMV3-Anc-E93K-mRuby2 This paper pUP880
pFLAG-CMV3-046B-N32Y-eGFP This paper pUP892
pFLAG-CMV3-046B-N32Y-mRuby2 This paper pUP794
pFLAG-CMV3-Hap074-N32Y-eGFP This paper pUP893
pFLAG-CMV3-Hap074-N32Y-mRuby2 This paper pUP795

Software and algorithms

I-TASSER v5.1 Roy et al. (2010); Yang et al. (2015); Zhang (2008) https://zhanglab.ccmb.med.umich.edu/I-TASSER/download/
Multialign Viewer (Chimera v1.14) UCSF Chimera; Meng et al. (2006); Pettersen et al. (2004) NA
MEME (HyPhy 2.2.4) Murrell et al. (2012) http://hyphy.org/w/index.php/Download
FEL (HyPhy 2.2.4) Pond and Frost (2005) http://hyphy.org/w/index.php/Download
ImageJ v1.53a Abràmoff et al. (2004); Schneider et al. (2012) https://imagej.nih.gov/ij/

Other

TransIT-293 Mirus Bio Cat#MIR 2700
35 μm strainer mesh Stellar Scientific Cat#FSC-FLTCP
DNase I Sigma Cat#D4527-10KU
24-well ultra-low attachment plate Fisher Scientific Cat#07-200-602
Orbital rotator IBI Scientific Model# BBUAAUVIS
NEBuilder HiFi DNA Assembly New England Biolabs Cat#E2621S
Hap200 Genbank JX048906.1
Hap199 Genbank JX048905.1
Hap197 Genbank JX048904.1
Hap196 Genbank JX048903.1
Hap195 Genbank JX048902.1
Hap194 Genbank JX048901.1
Hap193 Genbank JX048900.1
Hap192 Genbank JX048899.1
Hap191 Genbank JX048898.1
Hap190 Genbank JX048897.1
Hap189 Genbank JX048896.1
Hap188 Genbank JX048895.1
Hap187 Genbank JX048894.1
Hap186 Genbank JX048893.1
Hap185 Genbank JX048892.1
Hap184 Genbank JX048891.1
Hap183 Genbank JX048890.1
Hap182 Genbank JX048889.1
Hap181 Genbank JX048888.1
Hap180 Genbank JX048887.1
Hap179 Genbank JX048886.1
Hap178 Genbank JX048885.1
Hap176 Genbank JX048884.1
Hap175 Genbank JX048883.1
Hap173 Genbank JX048881.1
Hap172 Genbank JX048880.1
Hap171 Genbank JX048879.1
Hap169 Genbank JX048877.1
Hap168 Genbank JX048876.1
Hap167 Genbank JX048875.1
Hap166 Genbank JX048874.1
Hap165 Genbank JX048873.1
Hap164 Genbank JX048872.1
Hap163 Genbank JX048871.1
Hap161 Genbank JX048869.1
Hap159 Genbank JX048867.1
Hap158 Genbank JX048866.1
Hap157 Genbank JX048865.1
Hap156 Genbank JX048864.1
Hap155 Genbank JX048863.1
Hap154 Genbank JX048862.1
Hap153 Genbank JX048861.1
Hap152 Genbank JX048860.1
Hap151 Genbank JX048859.1
Hap150 Genbank JX048858.1
Hap149 Genbank JX048857.1
Hap148 Genbank JX048856.1
Hap147 Genbank JX048855.1
Hap146 Genbank JX048854.1
Hap145 Genbank JX048853.1
Hap144 Genbank JX048852.1
Hap143 Genbank JX048851.1
Hap142 Genbank JX048850.1
Hap141 Genbank JX048849.1
Hap139 Genbank JX048847.1
Hap138 Genbank JX048846.1
Hap137 Genbank JX048845.1
Hap136 Genbank JX048844.1
Hap135 Genbank JX048843.1
Hap134 Genbank JX048842.1
Hap133 Genbank JX048841.1
Hap132 Genbank JX048840.1
Hap131 Genbank JX048839.1
Hap130 Genbank JX048838.1
Hap129 Genbank JX048837.1
Hap128 Genbank JX048836.1
Hap127 Genbank JX048835.1
Hap126 Genbank JX048834.1
Hap125 Genbank JX048833.1
Hap124 Genbank JX048832.1
Hap123 Genbank JX048831.1
Hap122 Genbank JX048830.1
Hap121 Genbank JX048829.1
Hap120 Genbank JX048828.1
Hap119 Genbank JX048827.1
Hap118 Genbank JX048826.1
Hap117 Genbank JX048825.1
Hap116 Genbank JX048824.1
Hap115 Genbank JX048823.1
Hap114 Genbank JX048822.1
Hap113 Genbank JX048821.1
Hap112 Genbank JX048820.1
Hap111 Genbank JX048819.1
Hap110 Genbank JX048818.1
Hap109 Genbank JX048817.1
Hap108 Genbank JX048816.1
Hap107 Genbank JX048815.1
Hap106 Genbank JX048814.1
Hap105 Genbank JX048813.1
Hap104 Genbank JX048812.1
Hap103 Genbank JX048811.1
Hap102 Genbank JX048810.1
Hap101 Genbank JX048809.1
Hap100 Genbank JX048808.1
Hap099 Genbank JX048807.1
Hap098 Genbank JX048806.1
Hap097 Genbank JX048805.1
Hap096 Genbank JX048804.1
Hap095 Genbank JX048803.1
Hap094 Genbank JX048802.1
Hap093 Genbank JX048801.1
Hap092 Genbank JX048800.1
Hap091 Genbank JX048799.1
Hap090 Genbank JX048798.1
Hap089 Genbank JX048797.1
Hap088 Genbank JX048796.1
Hap087 Genbank JX048795.1
Hap086 Genbank JX048794.1
Hap085 Genbank JX048793.1
Hap084 Genbank JX048792.1
Hap083 Genbank JX048791.1
Hap082 Genbank JX048790.1
Hap081 Genbank JX048789.1
Hap080 Genbank JX048788.1
Hap079 Genbank JX048787.1
Hap078 Genbank JX048786.1
Hap077 Genbank JX048785.1
Hap076 Genbank JX048784.1
Hap075 Genbank JX048783.1
Hap074 Genbank JX048782.1
Hap073 Genbank JX048781.1
Hap072 Genbank JX048780.1
Hap071 Genbank JX048779.1
Hap070 Genbank JX048778.1
Hap069 Genbank JX048777.1
Hap068 Genbank JX048776.1
Hap067 Genbank JX048775.1
Hap066 Genbank JX048774.1
Hap065 Genbank JX048773.1
Hap064 Genbank JX048772.1
Hap063 Genbank JX048771.1
Hap062 Genbank JX048770.1
Hap061 Genbank JX048769.1
Hap060 Genbank JX048768.1
Hap058 Genbank JX048766.1
Hap057 Genbank JX048765.1
Hap056 Genbank JX048764.1
Hap054 Genbank JX048762.1
Hap051 Genbank JX048759.1
Hap048 Genbank JX048756.1
Hap046 Genbank JX048754.1
Hap045 Genbank JX048753.1
Hap044 Genbank JX048752.1
Hap043 Genbank JX048751.1
Hap042 Genbank JX048750.1
Hap041 Genbank JX048749.1
Hap040 Genbank JX048748.1
Hap039 Genbank JX048747.1
Hap038 Genbank JX048746.1
Hap037 Genbank JX048745.1
Hap036 Genbank JX048744.1
Hap035 Genbank JX048743.1
Hap034 Genbank JX048742.1
Hap033 Genbank JX048741.1
Hap032 Genbank JX048740.1
Hap031 Genbank JX048739.1
Hap030 Genbank JX048738.1
Hap029 Genbank JX048737.1
Hap028 Genbank JX048736.1
Hap027 Genbank JX048735.1
Hap026 Genbank JX048734.1
Hap025 Genbank JX048733.1
Hap024 Genbank JX048732.1
Hap022 Genbank JX048730.1
Hap021 Genbank JX048729.1
Hap020 Genbank JX048728.1
Hap019 Genbank JX048727.1
Hap018 Genbank JX048726.1
Hap017 Genbank JX048725.1
Hap016 Genbank JX048724.1
Hap015 Genbank JX048723.1
Hap014 Genbank JX048722.1
Hap012 Genbank JX048720.1
Hap010 Genbank JX048718.1
Hap008 Genbank JX048716.1
Hap007 Genbank JX048715.1
Hap006 Genbank JX048714.1
Hap005 Genbank JX048713.1
Hap004 Genbank JX048712.1
Hap003 Genbank JX048711.1
Hap002 Genbank JX048710.1
Hap001 Genbank JX048709.1
Hap174 Genbank JX048882.1
Hap162 Genbank JX048870.1
Hap160 Genbank JX048868.1
Hap050 Genbank JX048758.1
Hap013 Genbank JX048721.1
Hap009 Genbank JX048717.1
LH09_466G04 Genbank JX049024.1
LH09_466B06 Genbank JX049023.1
LH09_465F03 Genbank JX049022.1
LH09_465B09 Genbank JX049021.1
LH09_459C06 Genbank JX049020.1
LH09_459C03 Genbank JX049019.1
LH09_454D03 Genbank JX049018.1
LH09_452H02 Genbank JX049017.1
LH09_449H03 Genbank JX049016.1
LH09_449F06 Genbank JX049015.1
LH09_447F08 Genbank JX049014.1
LH09_447D09 Genbank JX049013.1
LH09_443D04 Genbank JX049012.1
LH09_443B07 Genbank JX049011.1
LH09_436B04 Genbank JX049010.1
LH09_435B06 Genbank JX049009.1
LH09_435B05 Genbank JX049008.1
LH09_431F06 Genbank JX049007.1
LH09_431C08 Genbank JX049006.1
LH09_429C03 Genbank JX049005.1
LH09_429A03 Genbank JX049004.1
LH09_425B08 Genbank JX049003.1
LH09_425B07 Genbank JX049002.1
LH09_417C08 Genbank JX049001.1
LH09_417B05 Genbank JX049000.1
LH09_406B01 Genbank JX048999.1
LH09_396C10 Genbank JX048998.1
LH09_396B06 Genbank JX048997.1
LH09_396A08 Genbank JX048996.1
LH09_395G03 Genbank JX048995.1
LH09_386C08 Genbank JX048994.1
LH09_384F06 Genbank JX048993.1
LH09_384E03 Genbank JX048992.1
LH09_380B02 Genbank JX048991.1
LH09_380A03 Genbank JX048990.1
LH09_274C02 Genbank JX048989.1
LH09_271B05 Genbank JX048988.1
LH09_270D09 Genbank JX048987.1
LH09_270C02 Genbank JX048986.1
LH09_268E09 Genbank JX048985.1
LH09_268B05 Genbank JX048984.1
LH09_265F08 Genbank JX048983.1
LH09_265B10 Genbank JX048982.1
LH09_261C01 Genbank JX048981.1
LH09_249C04 Genbank JX048980.1
LH09_249B04 Genbank JX048979.1
LH09_248A06 Genbank JX048978.1
LH09_244H05 Genbank JX048977.1
LH09_244B05 Genbank JX048976.1
LH09_230G03 Genbank JX048975.1
LH09_230F04 Genbank JX048974.1
LH09_230B04 Genbank JX048973.1
LH09_214H10 Genbank JX048972.1
LH09_214E06 Genbank JX048971.1
LH09_212C05 Genbank JX048970.1
LH09_212B03 Genbank JX048969.1
LH09_205E03 Genbank JX048968.1
LH09_205C02 Genbank JX048967.1
LH09_202E04 Genbank JX048966.1
LH09_162G06 Genbank JX048965.1
LH09_162D02 Genbank JX048964.1
LH09_158D11 Genbank JX048963.1
LH09_158A08 Genbank JX048962.1
LH09_158A03 Genbank JX048961.1
LH09_145D12 Genbank JX048960.1
LH09_116B02 Genbank JX048959.1
LH09_111C09 Genbank JX048958.1
LH09_111A06 Genbank JX048957.1
LH09_110C01 Genbank JX048956.1
LH09_110B01 Genbank JX048955.1
LH09_085A06 Genbank JX048954.1
LH09_084B07 Genbank JX048953.1
LH09_083D05 Genbank JX048952.1
LH09_083C10 Genbank JX048951.1
LH09_082F03 Genbank JX048950.1
LH09_082D07 Genbank JX048949.1
LH09_078E08 Genbank JX048948.1
LH09_068F07 Genbank JX048947.1
LH09_068B01 Genbank JX048946.1
LH09_064D04 Genbank JX048945.1
LH09_064C05 Genbank JX048944.1
LH09_061G09 Genbank JX048943.1
LH09_061G06 Genbank JX048942.1
LH09_059C03 Genbank JX048941.1
LH09_058G02 Genbank JX048940.1
LH09_058C05 Genbank JX048939.1
LH09_055H01 Genbank JX048938.1
LH09_055A07 Genbank JX048937.1
LH09_054E03 Genbank JX048936.1
LH09_054D05 Genbank JX048935.1
LH09_052D03 Genbank JX048934.1
LH09_052C02 Genbank JX048933.1
LH09_051E03 Genbank JX048932.1
LH09_051B03 Genbank JX048931.1
LH09_048E02 Genbank JX048930.1
LH09_044F06 Genbank JX048929.1
LH09_044C10 Genbank JX048928.1
LH09_042B02 Genbank JX048927.1
LH09_042A05 Genbank JX048926.1
LH09_039G03 Genbank JX048925.1
LH09_037B08 Genbank JX048924.1
LH09_037A08 Genbank JX048923.1
LH09_035C05 Genbank JX048922.1
LH09_034H09 Genbank JX048921.1
LH09_032F03 Genbank JX048920.1
LH09_032E02 Genbank JX048919.1
LH09_024B01 Genbank JX048918.1
LH09_023H02 Genbank JX048917.1
LH09_023E08 Genbank JX048916.1
LH09_019D10 Genbank JX048915.1
LH09_018D08 Genbank JX048914.1
LH09_018B08 Genbank JX048913.1
LH09_005G06 Genbank JX048912.1
LH09_005E05 Genbank JX048911.1
LH09_004E06 Genbank JX048910.1
LH09_004B06 Genbank JX048909.1
LH09_001E02 Genbank JX048908.1
14_7F Genbank JX048907.1
OQ-6Db Genbank HM013632.1
OQ-6Da Genbank HM013631.1
LH07:060b Genbank HM013630.1
LH07:060a Genbank HM013629.1
LH07:049b Genbank HM013628.1
LH07:049a Genbank HM013627.1
LH07:046b Genbank HM013626.1
LH07:046a Genbank HM013625.1
LH07:043a Genbank HM013624.1
LH07:041b Genbank HM013623.1
LH07:041a Genbank HM013622.1
LH07:037b Genbank HM013621.1
LH07:037a Genbank HM013620.1
LH07:036b Genbank HM013619.1
LH07:036a Genbank HM013618.1
LH07:026b Genbank HM013617.1
LH07:026a Genbank HM013616.1
LH06:050b Genbank HM013613.1
LH06:049b Genbank HM013611.1
LH06:028b Genbank HM013609.1
LH06:028a Genbank HM013608.1
LH06:003b Genbank HM013607.1
alr2-W60b Genbank FJ207419.1
alr2-W60a Genbank FJ207418.1
alr2-W49b Genbank FJ207417.1
alr2-W49a Genbank FJ207416.1
alr2-W41b Genbank FJ207415.1
alr2-W41a Genbank FJ207414.1
alr2-W36b Genbank FJ207413.1
alr2-W36a Genbank FJ207412.1
alr2-W14b Genbank FJ207411.1
alr2-W14a Genbank FJ207410.1
alr2-LH49b Genbank FJ207402.1
alr2-LH49a Genbank FJ207401.1
alr2-LH03i Genbank FJ207396.1
alr2-LH03a Genbank FJ207395.1
LH07:014b Genbank HM013615.1
LH07:014a Genbank HM013614.1
LH06:050a Genbank HM013612.1
LH06:049a Genbank HM013610.1
LH06:003a Genbank HM013606.1
alr2-LH53b Genbank FJ617568.1
alr2-LH53a Genbank FJ617567.1
alr2-LH04b Genbank FJ617566.1
alr2-LH04a Genbank FJ617565.1
alr2-R Genbank FJ207409.1
alr2-LH82b Genbank FJ207408.1
alr2-LH82a Genbank FJ207407.1
alr2-LH58b Genbank FJ207406.1
alr2-LH58a Genbank FJ207405.1
alr2-LH57b Genbank FJ207404.1
alr2-LH57a Genbank FJ207403.1
alr2-LH09b Genbank FJ207400.1
alr2-LH09a Genbank FJ207399.1
alr2-LH08b Genbank FJ207398.1
alr2-LH08a Genbank FJ207397.1

Resource availability

Lead contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Matthew Nicotra (matthew.nicotra@pitt.edu).

Materials availability

Plasmids generated in this study are available from the lead contact upon request. This study did not generate other new reagents.

Data and code availability

This paper analyzes existing, publicly available data. The accession numbers for the datasets are listed in the key resources table. This paper does not report original code. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.

Experimental model and subject details

HEK293T cells (ATCC Cat# CRL-3216) were cultured at 37°C with 5% CO2 in accordance with ATCC guidelines. Complete HEK culture medium was made using DMEM (Fisher Scientific, SH30081.01), 10% fetal bovine serum (Thermo Fisher Scientific, #16000044), 0.001% beta-mercaptoethanol (Fisher Scientific, 21-985-023), 100 U/mL penicillin, and 100 μg/mL streptomycin (Sigma, P4333-100ML).

Method details

Alr2 sequence acquisition and processing

Alr2 alleles 111A06 and 214E06 were identified from previously published Alr2 sequences (Gloria-Soria et al., 2012). To obtain a dataset of Alr2 domain 1 sequences, we downloaded all 373 Hydractinia symbiolongicarpus Alr2 cDNA sequences from GenBank, aligned them with MAFFT (Katoh et al., 2005) as implemented in Jalview 2.10.5 (Waterhouse et al., 2009), then trimmed the alignment leaving only the region encoding domain 1. Duplicate sequences were then removed with ElimDupes (www.hiv.lanl.gov), to yield 146 distinct domain 1 cDNA sequences, encoding 137 distinct amino acid sequences.
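The two-level deduplication described above (distinct cDNA sequences first, then distinct encoded proteins) can be illustrated with a minimal sketch. This is not the ElimDupes tool used in the study; the function names and the toy sequences are hypothetical, and the only assumption is the standard genetic code.

```python
from typing import Dict, Iterable, List

# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE: Dict[str, str] = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cdna: str) -> str:
    """Translate an in-frame cDNA; alignment gaps are dropped, unknown codons become X."""
    seq = cdna.upper().replace("-", "")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def unique_in_order(items: Iterable[str]) -> List[str]:
    """Remove exact duplicates while keeping first-seen order."""
    return list(dict.fromkeys(items))


# Toy input (invented, not Alr2 data): one exact duplicate and one synonymous variant.
toy_cdna = ["ATGAAATTT", "ATGAAATTT", "ATGAAGTTT", "ATGCAATTT"]
distinct_cdna = unique_in_order(toy_cdna)
distinct_protein = unique_in_order(translate(s) for s in distinct_cdna)
```

In this toy case, four input sequences collapse to three distinct cDNAs and two distinct proteins, because synonymous variants differ at the nucleotide level but not at the amino acid level. The same effect explains why 146 distinct cDNA sequences encode only 137 distinct amino acid sequences.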

Phylogenetic analysis and ancestral state reconstruction

The 146 domain 1 cDNA sequences were aligned with PRANK (Löytynoja, 2014), a codon-aware alignment program (File S1). The alignment was then used to construct a phylogenetic tree using maximum likelihood through IQ-TREE (http://iqtree.cibiv.univie.ac.at/) (Trifinopoulos et al., 2016) (File S2). On the web portal, the default settings were used, with codon selected as the sequence type, the standard/universal genetic code, ultrafast bootstrap analysis with a maximum of 1000 alignments, a minimum correlation coefficient of 0.99, 1000 replicates of the SH-aLRT branch test, a perturbation strength of 0.5, and an IQ-TREE stopping rule of 100. Ancestral states were estimated using the phylogenetic tree generated from IQ-TREE and the ancestral reconstruction function within PRANK (File S3) (Dutheil and Boussau, 2008; Löytynoja, 2014). An unrooted tree was generated using iTOL v5.5.1 with one iteration of equal-daylight (Letunic and Bork, 2019).

Constructs for ectopic expression of Alr2 alleles

The plasmid backbone used for all constructs in this study was pFLAG-CMV-3 (Sigma, E6783). Previously, it was determined that the N-terminal FLAG tag did not have an effect on the binding capability of Alr2 (Karadge et al., 2015). The Hydractinia Alr2 allele sequences were optimized for human expression using the Integrated DNA Technologies (IDT) Codon Optimization Tool (https://www.idtdna.com/CodonOpt). The full Alr2 sequence (domain 1 in the ectodomain through the cytoplasmic tail) for 111A06 and domain 1 sequences for Anc, 046B, Hap074, and 214E06 were ordered as gBlocks Gene Fragments from IDT. All other mutant domain sequences were ordered from Twist Bioscience as Gene Fragments. Coding sequences for fluorescent proteins were cloned from vectors encoding eGFP and mRuby2 (gift from Michael Davidson, Addgene plasmid #54614 (Lam et al., 2012)). Cloning was performed using the NEBuilder HiFi DNA Assembly (New England Biolabs, E2621S) with primers designed to amplify the vector and insert sequences with ≥20 bp overlap. The FLAG-111A06-eGFP/mRuby2 plasmids (pUP801, pUP738) were cloned first and then used as the template for cloning in the other domain 1 isoforms. Within the construct, linker sequences were used before (Leu-Ala-Ala-Ala) and after (Gly-Pro-Pro-Val-Glu-Lys) the Alr2 allele.
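The mutant constructs in the key resources table are named with standard substitution notation (for example, N32Y: the original residue, the position, and the new residue). A minimal sketch of how such a substitution could be applied to a protein sequence in silico follows. The helper name, the toy sequence, and the assumption that positions are 1-based within the sequence supplied are all hypothetical and do not reproduce the authors' design workflow or numbering scheme.

```python
import re


def apply_point_mutation(protein: str, mutation: str) -> str:
    """Apply a substitution written as <ref><1-based position><alt>, e.g. 'N4Y'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"Unrecognized mutation format: {mutation!r}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= pos <= len(protein):
        raise ValueError(f"Position {pos} is outside the sequence")
    if protein[pos - 1] != ref:
        # Guard against off-by-one errors or the wrong parental sequence.
        raise ValueError(
            f"Expected {ref} at position {pos}, found {protein[pos - 1]}"
        )
    return protein[:pos - 1] + alt + protein[pos:]


# Toy parental sequence (invented, not an Alr2 sequence).
parent = "MKTNAY"
mutant = apply_point_mutation(parent, "N4Y")
```

Checking the reference residue before substituting is the main design choice here: it catches numbering mismatches between the mutation name and the parental sequence before a fragment is ordered.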

Expression of Alr2 alleles in mammalian cells

To prepare plasmids for transfection, plasmids were transformed into chemically competent bacteria and isolated from cultures using the GeneJET Plasmid Midi-prep Kit (Thermo Fisher Scientific, K0481) or the PureLink HiPure Plasmid Maxiprep Kit (Thermo Fisher Scientific, K2100006). Plasmids were transiently transfected into HEK293T cells using TransIT-293 (Mirus Bio, MIR 2700) according to the manufacturer's instructions. To summarize, on Day 1, HEK293T cells were plated in a 12-well plate (Fisher Scientific, #353043) at a density of 3 × 10^5 cells/well in 1 mL of complete HEK medium to achieve approximately 60–70% confluency on Day 2. On Day 2, the transfection mixture was prepared in a total volume of 100 μL using 1 μg (X μL) of plasmid DNA (plasmid concentrations between 300 and 1000 ng/μL), diluted with Opti-MEM (Gibco, #31985-070) (97 − X μL), and 3 μL of TransIT-293 reagent. While the DNA:lipid complexes were incubating, the cells were washed with 500 μL of DPBS (Fisher Scientific, BW17-512F), given 1 mL of transfection medium (complete HEK medium without antibiotics), and returned to the 5% CO2 incubator. Once the DNA:lipid complexes had incubated, the 100 μL mixture was added to the appropriate well, and the plate was gently shaken back and forth and then returned to the incubator. On Day 4, cells were used in the aggregation assay.
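The volumes in the transfection mixture follow directly from the plasmid stock concentration: X μL delivers 1 μg of DNA, and Opti-MEM makes up the remainder of the 100 μL after 3 μL of reagent. A minimal sketch of that arithmetic is given below; the function name and the example concentration are illustrative assumptions.

```python
from typing import Dict


def transfection_mix(
    plasmid_ng_per_ul: float,
    dna_ug: float = 1.0,
    total_ul: float = 100.0,
    reagent_ul: float = 3.0,
) -> Dict[str, float]:
    """Return component volumes (in microliters) for one transfection mixture."""
    if plasmid_ng_per_ul <= 0:
        raise ValueError("Plasmid concentration must be positive")
    dna_ul = dna_ug * 1000.0 / plasmid_ng_per_ul  # X in the text
    optimem_ul = total_ul - reagent_ul - dna_ul   # 97 - X in the text
    if optimem_ul < 0:
        raise ValueError("Plasmid stock is too dilute for this total volume")
    return {"dna_ul": dna_ul, "optimem_ul": optimem_ul, "reagent_ul": reagent_ul}


# Example with a hypothetical 500 ng/uL stock, within the 300-1000 ng/uL range.
mix = transfection_mix(500.0)
```

For a 500 ng/μL stock, this gives 2 μL of DNA and 95 μL of Opti-MEM. Across the stated concentration range, X varies only between 1 and about 3.3 μL.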

Aggregation assay

Our aggregation protocol is adapted from previous work (Karadge et al., 2015). To summarize, previously transfected HEK293T cells were incubated with 0.25% Trypsin/0.1% EDTA solution (Corning, MT25053CI), washed in complete HEK culture medium, mechanically disrupted via pipette, and filtered through a 35 μm strainer mesh (Stellar Scientific, FSC-FLTCP) to create a single-cell suspension. For each aggregation assay, a total of 5 × 10^4 cells were resuspended in 500 μL aggregation assay medium (complete HEK medium, 70 U/mL DNase I [Sigma, D4527-10KU], and 2 mM EGTA [Goldbio, E-217-25]) and added to one well of a 24-well ultra-low attachment plate (Fisher Scientific, 07-200-602). When testing isoforms pairwise, 2.5 × 10^4 cells of each transfection were added to the same well and resuspended in a total of 500 μL. The plate was incubated for one hour at 37°C in 5% CO2 on an orbital rotator (IBI Scientific, Model# BBUAAUVIS) set at 90 rpm. Assays were visualized using an inverted fluorescence microscope (Nikon Eclipse TS100). Each pairwise assay was repeated at least three times. In cases when the assay results could not be viewed immediately, cell aggregates were fixed by adding 500 μL of 8% paraformaldehyde (Fisher, AA433689M) diluted in DPBS to each well and the results imaged within 5 h. All images and merged images were processed using ImageJ (Abràmoff et al., 2004; Schneider et al., 2012).

Sequence variability and visualization of domain 1

The structure for the Anc domain 1 isoform was predicted using I-TASSER v5.1 (Zhang, 2008; Roy et al., 2010; Yang et al., 2015), which resulted in a domain with a V-set-like fold. To visualize the variable positions within domain 1, the aligned 137 protein sequences were uploaded to the Multialign Viewer in UCSF Chimera (Pettersen et al., 2004; Meng et al., 2006) and the conservation rendered onto the structure. Sites under positive selection were identified using MEME (Murrell et al., 2012) and FEL (Pond and Frost, 2005) as implemented in HyPhy 2.5.8 (Pond et al., 2019). Both algorithms were run using synonymous rate variation and a significance threshold of p = 0.1, as recommended by the developers (Spielman et al., 2019).

Quantification and statistical analysis

Sites under positive and/or negative selection were identified using statistical tests as implemented in MEME and FEL, at significance thresholds of p = 0.1. No other quantification or statistical analyses were performed in this study.
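As an illustration of how a per-site threshold of p = 0.1 separates sites into positively selected, negatively selected, and not significant, a minimal sketch follows. It assumes an FEL-style result in which each site has a synonymous rate (alpha), a nonsynonymous rate (beta), and a p-value, and that a significant site is called positive when beta exceeds alpha and negative otherwise. The record type, the function, and all numbers are invented for illustration; they are not results from this study and do not reproduce HyPhy's output format.

```python
from typing import Dict, List, NamedTuple


class SiteResult(NamedTuple):
    site: int       # codon position in the alignment
    alpha: float    # estimated synonymous rate
    beta: float     # estimated nonsynonymous rate
    p_value: float  # p-value of the test that alpha and beta differ


def classify_site(result: SiteResult, threshold: float = 0.1) -> str:
    """Label one site using a p-value threshold and the direction of beta vs alpha."""
    if result.p_value > threshold:
        return "not significant"
    return "positive" if result.beta > result.alpha else "negative"


def summarize(results: List[SiteResult], threshold: float = 0.1) -> Dict[str, List[int]]:
    """Group site numbers by their classification."""
    groups: Dict[str, List[int]] = {"positive": [], "negative": [], "not significant": []}
    for result in results:
        groups[classify_site(result, threshold)].append(result.site)
    return groups


# Invented example values, for illustration only.
toy_results = [
    SiteResult(site=1, alpha=0.8, beta=3.1, p_value=0.04),
    SiteResult(site=2, alpha=1.0, beta=0.1, p_value=0.01),
    SiteResult(site=3, alpha=0.9, beta=1.2, p_value=0.35),
]
summary = summarize(toy_results)
```

Site 3 in the toy data has beta greater than alpha but is left unclassified because its p-value exceeds the threshold, which is the role the significance cutoff plays in the analysis described above.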

Acknowledgments

Molecular graphics and analyses were performed with UCSF Chimera, developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco, with support from NIH P41-GM103311. We thank Kristina Paris for assistance and discussion with running molecular dynamics for the models. M.N. was supported by NSF grant IOS-1557339. A.H. was supported by NIH T32 AI074491.

Author contributions

A.H. and T.C. performed the experiments. A.H. did the data analysis and structural visualizations. A.H. and M.N. designed the experiments and wrote the paper.

Declaration of interests

The authors declare no competing interests.

Published: July 23, 2021

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.isci.2021.102811.

Supplemental information

Table S1. Sites in domain 1 under positive or negative selection, Related to Figure 4
Data S1–S3. Sequences and alignments
Data S4. High resolution version of Figure S1
Data S5. High resolution version of Figure S2
Data S6. High resolution version of Figure S3

References

  1. Aanen D.K., Debets A.J.M., Visser J.A.G.M. De, Hoekstra R.F. The social evolution of somatic fusion. Bioessays. 2008;30:1193–1203. doi: 10.1002/bies.20840. [DOI] [PubMed] [Google Scholar]
  2. Abràmoff M.D., Magalhães P.J., Ram S.J. Image processing with ImageJ. Biophotonics Int. 2004;7:36–42. [Google Scholar]
  3. Buss L.W. Princeton University Press; 1987. The Evolution of Individuality. [Google Scholar]
  4. Cadavid L.F., Powell A.E., Nicotra M.L., Moreno M., Buss L.W. An invertebrate histocompatibility complex. Genetics. 2004;167:357–365. doi: 10.1534/genetics.167.1.357. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Cao P., Wei X., Awal P., Müller R. A highly polymorphic receptor governs many distinct self-recognition types within the Myxococcales order. MBio. 2019;10:1–15. doi: 10.1128/mBio.02751-18. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Casselton L.A., Olesnicky N.S. Molecular genetics of mating recognition in Basidiomycete fungi. Microbiol. Mol. Biol. Rev. 1998;62:55–70. doi: 10.1128/mmbr.62.1.55-70.1998. [DOI] [PMC free article] [PubMed] [Google Scholar]
  7. Dubuc T.Q., Schnitzler C.E., Chrysostomou E., Mcmahon E.T., Gahan J.M., Buggie T., Gornik S.G., Hanley S., Barreira S.N., Gonzalez P. Transcription factor AP2 controls cnidarian germ cell induction. Science. 2020;367:757–762. doi: 10.1126/science.aay6782. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Dutheil J., Boussau B. 2008. Of Libraries and Programs 12; pp. 1–12. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Frank U., Nicotra M.L., Schnitzler C.E. The colonial cnidarian Hydractinia. Evodevo. 2020;11:7–12. doi: 10.1186/s13227-020-00151-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Fujii S., Kubo K., Takayama S. Non-self- and self-recognition models in plant self-incompatibility. Nat. Plants. 2016;2:1–9. doi: 10.1038/NPLANTS.2016.130. [DOI] [PubMed] [Google Scholar]
  11. Gibbs K.A., Greenberg E.P. Territoriality in proteus: advertisement and aggression. Chem. Rev. 2011;111:188–194. doi: 10.1021/cr100051v. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Gloria-Soria A., Moreno M.A., Yund P.O., Lakkis F.G., Dellaporta S.L., Buss L.W. Evolutionary genetics of the hydroid allodeterminant alr2. Mol. Biol. Evol. 2012;29:3921–3932. doi: 10.1093/molbev/mss197. [DOI] [PubMed] [Google Scholar]
  13. Gonçalves A.P., Heller J., Rico-ramírez A.M., Daskalov A., Rosenfield G., Glass N.L. conflict, competition, and cooperation regulate social interactions in Filamentous fungi. Annu. Rev. Microbiol. 2020;74:693–712. doi: 10.1146/annurev-micro-012420-080905. [DOI] [PubMed] [Google Scholar]
  14. Goncalves A.P., Heller J., Span E.A., Rosenfiled G., Do H.P., Palma-Guerrero J., Requena N., Marletta M.A., Glass N.L. Allorecognition upon fungal cell-cell contact determines social cooperation and impacts the acquisition of multicellularity article allorecognition upon fungal cell-cell contact determines social cooperation. Curr. Biol. 2019;29:3006–3017. doi: 10.1016/j.cub.2019.07.060. [DOI] [PubMed] [Google Scholar]
  15. Goodman K.M., Yamagata M., Jin X., Mannepalli S., Katsamba P.S., Ahlsen G., Sergeeva A.P., Honig B., Sanes J.R., Shapiro L. Molecular basis of sidekick-mediated cell-cell adhesion and specificity. Elife. 2016;5:1–21. doi: 10.7554/eLife.19058. [DOI] [PMC free article] [PubMed] [Google Scholar]
  16. James T.Y. Why mushrooms have evolved to be so promiscuous: insights from evolutionary and ecological patterns. Fungal Biol. Rev. 2015;29:167–178. doi: 10.1016/j.fbr.2015.10.002. [DOI] [Google Scholar]
  17. Karadge U.B., Gosto M., Nicotra M.L. Allorecognition proteins in an invertebrate exhibit homophilic interactions. Curr. Biol. 2015;25:2845–2850. doi: 10.1016/j.cub.2015.09.030. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Kasinrerk W., Tokrasinwit N., Phunpae P. CD147 monoclonal antibodies induce homotypic cell aggregation of monocytic cell line U937 via LFA-1/ICAM-1 pathway. Immunology. 1999;96:184–192. doi: 10.1046/j.1365-2567.1999.00653.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Katoh K., Kuma K., Toh H., Miyata T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 2005;33:511–518. doi: 10.1093/nar/gki198. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Katsamba P., Carroll K., Ahlsen G., Bahna F., Vendome J., Posy S., Rajebhosale M., Price S., Jessell T.M., Ben-Shaul A. Linking molecular affinity and cellular specificity in cadherin-mediated adhesion. Proc. Natl. Acad. Sci. U. S. A. 2009;106:11594–11599. doi: 10.1073/pnas.0905349106. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. Kimura M., Crow J.F. The number of alleles that can be maintained in a finite population. Genetics. 1964;49:725–738. doi: 10.1093/genetics/49.4.725. [DOI] [PMC free article] [PubMed] [Google Scholar]
  22. Kundert P., Shaulsky G. Cellular allorecognition and its roles in Dictyostelium development and social evolution. Int. J. Dev. Biol. 2019;393:383–393. doi: 10.1387/ijdb.190239gs. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Künzel T., Heiermann R., Frank U., Müller W., Tilmann W., Bause M., Nonn A., Helling M., Schwarz R.S., Plickert G. Migration and differentiation potential of stem cells in the cnidarian Hydractinia analysed in eGFP-transgenic animals and chimeras. Dev. Biol. 2010;348:120–129. doi: 10.1016/j.ydbio.2010.08.017. [DOI] [PubMed] [Google Scholar]
  24. Laird D.J., De Tomaso A.W., Weissman I.L. Stem cells are units of natural selection in a colonial ascidian. Cell. 2005;123:1351–1360. doi: 10.1016/j.cell.2005.10.026. [DOI] [PubMed] [Google Scholar]
  25. Lam A.J., St-pierre F., Gong Y., Marshall J.D., Cranfill P.J., Baird M.A., Mckeown M.R., Wiedenmann J., Davidson M.W., Schnitzer M.J. Improving FRET dynamic range with bright green and red fluorescent proteins. Nat. Methods. 2012;9:1005–1012. doi: 10.1038/NMETH.2171. [DOI] [PMC free article] [PubMed] [Google Scholar]
  26. Lawrence M.J. Population genetics of the homomorphic self-incompatibility polymorphisms in flowering plants. Ann. Bot. 2000;85:221–226. [Google Scholar]
  27. Letunic I., Bork P. Interactive Tree of Life (iTOL) v4: recent updates and. Nucleic Acids Res. 2019;47:256–259. doi: 10.1093/nar/gkz239. [DOI] [PMC free article] [PubMed] [Google Scholar]
  28. Löytynoja A. Phylogeny-aware alignment with PRANK. Methods Mol. Biol. 2014;1079:155—170. doi: 10.1007/978-1-62703-646-7_10. [DOI] [PubMed] [Google Scholar]
  29. Meng E.C., Pettersen E.F., Couch G.S., Huang C.C., Ferrin T.E. Tools for integrated sequence-structure analysis with UCSF Chimera. BMC Bioinformatics. 2006;7:1–10. doi: 10.1186/1471-2105-7-339. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Murrell B., Wertheim J.O., Moola S., Weighill T., Scheffler K., Pond S.L.K. Detecting Individual sites subject to episodic diversifying selection. PLoS Genet. 2012;8:1–10. doi: 10.1371/journal.pgen.1002764. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Nicotra M.L. Invertebrate allorecognition. Curr. Biol. 2019;29:R463–R467. doi: 10.1016/j.cub.2019.03.039. [DOI] [PubMed] [Google Scholar]
  32. Nicotra M.L., Buss L.W. A test for larval kin aggregations. Biol. Bull. 2005;208:157–158. doi: 10.2307/3593147. [DOI] [PubMed] [Google Scholar]
  33. Nicotra M.L., Powell A.E., Rosengarten R.D., Moreno M., Grimwood J., Lakkis F.G., Dellaporta S.L., Buss L.W. A hypervariable invertebrate allodeterminant. Curr. Biol. 2009;19:583–589. doi: 10.1016/j.cub.2009.02.040. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Nydam M.L., Stephenson E.E., Waldman C.E., Tomaso A.W. De. Balancing selection on allorecognition genes in the colonial ascidian Botryllus schlosseri. Dev. Comp. Immunol. 2017;69:60–74. doi: 10.1016/j.dci.2016.12.006. [DOI] [PubMed] [Google Scholar]
  35. Paoletti M. Vegetative incompatibility in fungi: from recognition to cell death, whatever does the trick. Fungal Biol. Rev. 2016;30:152–162. doi: 10.1016/j.fbr.2016.08.002. [DOI] [Google Scholar]
  36. Pathak D.T., Wei X., Dey A., Wall D. Molecular recognition by a polymorphic cell surface receptor governs cooperative behaviors in bacteria. PLoS Genet. 2013;9:1–12. doi: 10.1371/journal.pgen.1003891. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Pettersen E.F., Goddard T.D., Huang C.C., Couch G.S., Greenblatt D.M., Meng E.C., Ferrin T.E. UCSF Chimera — a visualization system for exploratory research and analysis. J. Comput. Chem. 2004;25:1605–1612. doi: 10.1002/jcc.20084. [DOI] [PubMed] [Google Scholar]
  38. Podgornaia A.I., Laub M.T. Pervasive degeneracy and epistasis in a protein-protein interface. Science. 2015;347:673–677. doi: 10.1126/science.1257360. [DOI] [PubMed] [Google Scholar]
  39. Pond S.L.K., Frost S.D.W. Not so different after all: a comparison of methods for detecting amino acid sites under selection. Mol. Biol. Evol. 2005;22:1208–1222. doi: 10.1093/molbev/msi105. [DOI] [PubMed] [Google Scholar]
  40. Pond S.L.K., Poon A.F.Y., Velazquez R., Weaver S., Hepler N.L., Murrell B., Shank S.D., Magalis B.R., Bouvier D., Nekrutenko A. HyPhy 2.5 — a Customizable platform for evolutionary hypothesis testing using Phylogenies. Mol. Biol. Evol. 2019;37:295–299. doi: 10.1093/molbev/msz197. [DOI] [PMC free article] [PubMed] [Google Scholar]
  41. Povolotskaya I.S., Kondrashov F.A. Sequence space and the ongoing expansion of the protein universe. Nature. 2010;465:922–926. doi: 10.1038/nature09105. [DOI] [PubMed] [Google Scholar]
  42. Powell A.E., Moreno M., Gloria-soria A., Lakkis F.G., Dellaporta S.L., Buss L.W. Genetic background and allorecognition phenotype in Hydractinia symbiolongicarpus. G. 2011;1:499–503. doi: 10.1534/g3.111.001149. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Powell A.E., Nicotra M.L., Moreno M.A., Lakkis F.G., Dellaporta S.L., Buss L.W. Differential effect of allorecognition loci on phenotype in Hydractinia symbiolongicarpus (Cnidaria: Hydrozoa) Genetics. 2007;177:2101–2107. doi: 10.1534/genetics.107.075689. [DOI] [PMC free article] [PubMed] [Google Scholar]
  44. Richman A.D., Kohn J.R. Evolutionary genetics of self-incompatibility in the Solanaceae. Plant Mol. Biol. 2000;42:169–179. [PubMed] [Google Scholar]
  45. Rosa S.F.P., Powell A.E., Rosengarten R.D., Nicotra M.L., Moreno M.A., Grimwood J., Lakkis F.G., Dellaporta S.L., Buss L.W. Hydractinia allodeterminant alr1 resides in an immunoglobulin superfamily-like gene complex. Curr. Biol. 2010;20:1122–1127. doi: 10.1016/j.cub.2010.04.050. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Roy A., Kucukural A., Zhang Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nat. Protoc. 2010;5:725–738. doi: 10.1038/nprot.2010.5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Rubinstein R., Thu C.A., Goodman K.M., Wolcott H.N., Bahna F., Mannepalli S., Ahlsen G., Chevee M., Halim A., Clausen H. Molecular logic of neuronal self-recognition through protocadherin domain interactions. Cell. 2015;163:629–642. doi: 10.1016/j.cell.2015.09.026. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Sanders S.M., Ma Z., Hughes J.M., Riscoe B.M., Gibson G.A., Watson A.M., Flici H., Frank U., Schnitzler C.E., Baxevanis A.D., Nicotra M.L. CRISPR/Cas9-mediated gene knockin in the hydroid Hydractinia symbiolongicarpus. BMC Genomics. 2018;19:1–17. doi: 10.1186/s12864-018-5032-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Schneider C.A., Rasband W.S., Eliceiri K.W. NIH Image to ImageJ: 25 years of image analysis. Nat. Methods. 2012;9:671–675. doi: 10.1038/nmeth.2089. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Schreiner D., Weiner J.A. Combinatorial homophilic interaction between gamma-protocadherin multimers greatly expands the molecular diversity of cell adhesion. Proc. Natl. Acad. Sci. U. S. A. 2010;107:14893–14898. doi: 10.1073/pnas.1004526107. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Spielman S.J., Weaver S., Shank S.D., Magalis B.R., Li M., Pond S.L.K. Evolutionary Genomics. Methods in Molecular Biology. 2019. Evolution of viral genomes: interplay between selection, recombination, and other forces; pp. 427–468. [DOI] [PubMed] [Google Scholar]
  52. Stoner D.S., Rinkevich B., Weissman I.L. Heritable germ and somatic cell lineage competitions in chimeric colonial protochordates. Proc. Natl. Acad. Sci. U. S. A. 1999;96:9148–9153. doi: 10.1073/pnas.96.16.9148. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Stoner D.S., Weissman I.L. Somatic and germ cell parasitism in a colonial ascidian: possible role for a highly polymorphic allorecognition system. Proc. Natl. Acad. Sci. U. S. A. 1996;93:15254–15259. doi: 10.1073/pnas.93.26.15254. [DOI] [PMC free article] [PubMed] [Google Scholar]
  54. Thu C.A., Chen W.V., Rubinstein R., Chevee M., Wolcott H.N., Felsovalyi K.O., Tapia J.C., Shapiro L., Honig B., Maniatis T. Single-cell identity generated by combinatorial homophilic interactions between alpha, beta, and gamma protocadherins. Cell. 2014;158:1045–1059. doi: 10.1016/j.cell.2014.07.012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  55. Trifinopoulos J., Nguyen L.T., von Haeseler A., Minh B.Q. W-IQ-TREE: a fast online phylogenetic tool for maximum likelihood analysis. Nucleic Acids Res. 2016;44:W232–W235. doi: 10.1093/nar/gkw256. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Waterhouse A.M., Procter J.B., Martin D.M.A., Clamp M., Barton G.J. Jalview Version 2: a multiple sequence alignment editor and analysis workbench. Bioinformatics. 2009;25:1189–1191. doi: 10.1093/bioinformatics/btp033. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Weinreich D.M., Delaney N.F., DePristo M.A., Hartl D.L. Darwinian evolution can follow only very few mutational paths to fitter proteins. Science. 2006;312:111–114. doi: 10.1126/science.1123539. [DOI] [PubMed] [Google Scholar]
  58. Wright S. The distribution of self-sterility alleles in populations. Genetics. 1939;24:538–552. doi: 10.1093/genetics/24.4.538. [DOI] [PMC free article] [PubMed] [Google Scholar]
  59. Yang J., Yan R., Roy A., Xu D., Poisson J., Zhang Y. The I-TASSER suite: protein structure and function prediction. Nat. Methods. 2015;12:7–8. doi: 10.1038/nmeth.3213. [DOI] [PMC free article] [PubMed] [Google Scholar]
  60. Yund P., Cunningham C.W., Buss L.W. Recruitment and postrecruitment interactions in a colonial hydroid. Ecology. 1987;68:971–982. doi: 10.2307/1938368. [DOI] [Google Scholar]
  61. Zhang Y. I-TASSER server for protein 3D structure prediction. BMC Bioinformatics. 2008;9:1–18. doi: 10.1186/1471-2105-9-40. [DOI] [PMC free article] [PubMed] [Google Scholar]

Associated Data


Supplementary Materials

Table S1. Sites in domain 1 under positive or negative selection, Related to Figure 4
mmc1.xlsx (11.7KB, xlsx)
Data S1–S3. Sequences and alignments
mmc2.zip (10.6KB, zip)
Data S4. High resolution version of Figure S1
mmc3.zip (6.3MB, zip)
Data S5. High resolution version of Figure S2
mmc4.zip (6.1MB, zip)
Data S6. High resolution version of Figure S3
mmc5.zip (2.9MB, zip)

Data Availability Statement

This paper analyzes existing, publicly available data. The accession numbers for these datasets are listed in the key resources table. This paper does not report original code. Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.


Articles from iScience are provided here courtesy of Elsevier
