Author manuscript; available in PMC: 2022 Feb 25.
Published in final edited form as: Curr Biol. 2017 Nov 9;27(22):3568–3575.e3. doi: 10.1016/j.cub.2017.10.008

Idiosyncratic genome degradation in a bacterial endosymbiont of periodical cicadas

Matthew A Campbell a, Piotr Łukasik a, Chris Simon b, John P McCutcheon a,1
PMCID: PMC8879801  NIHMSID: NIHMS933338  PMID: 29129532

Summary

When a free-living bacterium transitions to a host-beneficial endosymbiotic lifestyle, it almost invariably loses a large fraction of its genome [1, 2]. The resulting small genomes often become unusually stable in size, structure, and coding capacity [3–5]. Candidatus Hodgkinia cicadicola (Hodgkinia), a bacterial endosymbiont of cicadas, sometimes exemplifies this genomic stability. The Hodgkinia genome has remained completely co-linear in some cicadas diverged by tens of millions of years [6, 7]. But in the long-lived periodical cicada Magicicada tredecim, the Hodgkinia genome has split into dozens of tiny, gene-sparse genomic circles that sometimes reside in distinct Hodgkinia cells [8]. Previous data suggested that other Magicicada species harbor similarly complex Hodgkinia populations, but the timing, number of origins, and outcomes of the splitting process were unknown. Here, by sequencing Hodgkinia metagenomes from the remaining six Magicicada species and two sister species, we show that all Magicicada species harbor Hodgkinia populations of at least twenty genomic circles each. We find little synteny among the 256 Hodgkinia circles analyzed except between the most closely related species. Individual gene phylogenies show that Hodgkinia first split in the common ancestor of Magicicada and its closest known relatives, but that most splitting has occurred within Magicicada and has given rise to highly variable Hodgkinia gene dosages between cicada species. These data show that Hodgkinia genome degradation has proceeded down different paths in different Magicicada species, and support a model of genomic degradation that is stochastic in outcome and likely nonadaptive for the host. These patterns mirror the genomic instability seen in some mitochondria.

Keywords: Hodgkinia cicadicola, Magicicada, endosymbiosis, nonadaptive evolution, organelle genomes, mutation, periodical cicadas, levels of selection

Results

Like all sap-feeding insects, cicadas depend on specialized endosymbiotic microorganisms for supplementation of their nutrient-poor diet of plant sap [9–11]. One of these microbes, an alphaproteobacterium called Hodgkinia, is associated with all reported cicadas [10]. As is typical for bacterial endosymbiont genomes, Hodgkinia’s genome is extremely reduced (~150 kb at its largest in some cicadas), rendering it completely dependent on its cicada host and partner bacterial endosymbiont Sulcia for basic biological function [4]. Despite a very high rate of sequence evolution [8], Hodgkinia genomes can be remarkably stable in structure and gene content, changing little in cicadas diverged by tens of millions of years [7, 12]. In other cicadas, Hodgkinia has evolved into two or more distinct but codependent genomic and cellular lineages that are present in individual hosts, which have undergone reciprocal gene inactivation and loss [7]. We refer to this unusual process as “splitting”. We have shown that in the long-lived periodical cicada Magicicada tredecim, Hodgkinia has split into dozens of small distinct genomic circles encoding few recognizable genes [8]. Here we analyze the Hodgkinia metagenomes of the six remaining Magicicada species in order to understand the timing and outcome of this splitting process across the cicada genus.

Hodgkinia is complex in all Magicicada species

Our new sequencing data confirm [8] that Hodgkinia is comprised of many distinct genomic circles in all species of Magicicada (Fig. 1, Tables 1, S1). We refer to individual circular-mapping Hodgkinia genomic contigs as ‘circles’ because, though we know some reside in distinct cells [8], we currently do not know whether most of these molecules are chromosomes that share the same cell or genomes representing different cell types [8]. We refer to the total complement of Hodgkinia contigs assembled in a single species of cicada (whether closed into circles or not) as that species’ Hodgkinia Genome Complex (HGC). The smallest HGC is found in M. tredecula and consists of at least 153 contigs totaling 1.20 megabases (Mb) of DNA, while the largest is from M. neotredecim and consists of 215 contigs totaling 1.58 Mb (Table 1). In each Magicicada species, we identified between 26 and 42 contigs with large-insert mate pair data suggesting they were circular DNA molecules. We were able to fully close these contigs into circular molecules in at least 20 instances in all cicada species (Fig. 1, Table 1). Contigs with mate pair, PCR, and/or Sanger sequence data supporting their circularity were considered putative circles if they were not fully closed (Fig. 1). The combination of confirmed and putative circular molecules comprised between 51.4% and 72.5% of the total DNA in each HGC (the remaining contigs lack end-joining data; Fig. 1). Individual completed circles range in size from 0.69 kb to 70.5 kb, contain a maximum of 27 genes and a minimum of 1 gene (with a single exception, a circle encoding only a single pseudogene), and span as much as a 653-fold range of sequencing coverage (Table S1). The range in coverage is even higher for contigs that did not assemble into circular molecules: contigs from Magicicada HGCs span at least a 2,500-fold difference in average sequencing coverage in each Magicicada species (Table 1).

Figure 1. Hodgkinia genomic complexity in all study species.


Left: Phylogeny of the cicada species used in this study, based on all 13 protein-coding and both ribosomal genes from the mitochondrial genomes. The cicada Diceroprocta semicincta was used to root the tree, but was not included in the figure. Bootstrap support values are shown on each resolved node. Right: Diagrams representing the confirmed and putatively circular molecules of the HGC in all study species. Rows with an asterisk at the end represent putative circular molecules. On each circle, red regions represent rRNA genes, green represents histidine synthesis genes, orange represents cobalamin synthesis genes, purple represents methionine synthesis genes, blue represents all other genes, and white space represents noncoding DNA. Values in parentheses indicate the proportion of total Hodgkinia DNA from each cicada species represented by the circular molecules. The three species groups are annotated next to the species labels.

Table 1. Summary statistics for all HGCs described in this work.

Total HGC size is the sum of all Hodgkinia contigs, whether or not they are closed into circular molecules. The number of unique genes found in other Hodgkinia genomes ranges from 168 to 183.

| Species | Species abbreviation | Number of contigs | Total HGC size (Mb) | Total # of circles | Cumulative size of circles (Mb) | Unique genes | Total genes | Fold-coverage difference |
|---|---|---|---|---|---|---|---|---|
| M. cassini | MAGCAS | 118 | 1.27 | 29 | 0.77 | 142 | 306 | 6,376 |
| M. tredecassini | MAGTCS | 201 | 1.42 | 26 | 0.73 | 145 | 316 | 5,494 |
| M. septendecula | MAGSDC | 119 | 1.22 | 27 | 0.71 | 139 | 297 | 4,827 |
| M. tredecula | MAGTDC | 153 | 1.20 | 27 | 0.68 | 140 | 318 | 3,189 |
| M. neotredecim | MAGNEO | 215 | 1.68 | 41 | 1.00 | 139 | 333 | 3,379 |
| M. septendecim | MAGSEP | 166 | 1.64 | 39 | 1.11 | 137 | 314 | 5,723 |
| M. tredecim | MAGTRE | 118 | 1.53 | 42 | 1.11 | 135 | 305 | 2,500 |
| T. crassa | TRYCRA | 106 | 1.16 | 14 | 0.26 | 137 | 203 | 947 |
| A. curvicosta | ALECUR | 138 | 0.95 | 11 | 0.35 | 136 | 199 | 830 |

We identified between 135 and 145 of the 186 unique protein- and RNA-coding genes annotated as functional in at least one of D. semicincta and T. ulnaria in each Magicicada HGC [7, 12]. In all species, several additional genes were identified as truncated fragments or obvious pseudogenes. Because of the very low coverage of some contigs and the extremely rapid rate of Hodgkinia sequence evolution [8], it is likely that many of the remaining genes are present but were either not fully assembled or are not recognizable due to their low sequence similarity to other annotated Hodgkinia genes.

We also sequenced Hodgkinia from two cicada species that are closely related to Magicicada [13, 14] (Fig. 1, Table 1). The HGCs from Australian cicada species Aleeta curvicosta and Tryella crassa are similar in many ways to HGCs from Magicicada, but somewhat less complex. However, we generated less sequencing data for these species (~8.7 gigabases (Gb) for A. curvicosta, ~1.5 Gb for T. crassa, compared with an average of ~30.5 Gb per species of Magicicada) and so this relative simplicity may be due to sequencing effort. We therefore primarily focus our analyses on the Hodgkinia genomes from Magicicada species. Nevertheless, data from these outgroup species allowed us to infer whether Hodgkinia lineage splitting began in Magicicada or whether independent Hodgkinia lineages existed before Magicicada diverged from its common ancestor.

The origin of Hodgkinia lineage splitting predates the diversification of the genus Magicicada

To determine whether Hodgkinia lineage splitting started in the Magicicada genus or predated its origin, we reconstructed phylogenetic trees for Hodgkinia genes present in multiple copies in at least one of the two outgroup species. If splitting began independently in each genus, we would expect phylogenetic trees inferred from individual Hodgkinia genes to be monophyletic within each cicada genus (Fig. S1A). However, if splitting predated the divergence of Magicicada from Tryella/Aleeta, then gene trees should show two or more strongly supported monophyletic clades, each consisting of copies of genes from Magicicada along with Tryella and Aleeta (Figs. S1B–D). Though we see some cases where gene trees form monophyletic groupings within cicada genera (Fig. S1A), we also find several instances where gene phylogenies reveal two (Fig. S1B–C) or three (Fig. S1D) well-supported clades that group Magicicada genes with at least one gene copy from Tryella or Aleeta. It is possible to see both patterns because not all redundant genes from split lineages are retained in the new lineages [7]. Overall these patterns show that lineage splitting in Hodgkinia began before the Magicicada, Tryella, and Aleeta diverged from one another. We estimate that the last common ancestor of these genera had a minimum of three Hodgkinia lineages (Fig. S2A–C), similar to the complexity of Hodgkinia in the cicada Tettigades undata [7].

Hodgkinia lineage splitting is ongoing in Magicicada species groups

Having found evidence that Hodgkinia splitting had started prior to the divergence of Magicicada from its ancestor with Tryella and Aleeta, we tried to assess whether most circular molecules were formed prior to the diversification of Magicicada and were conserved throughout the genus, or if lineage splitting is a process that has been ongoing since the origin of Magicicada. If most lineage splitting occurred in the ancestor to all Magicicada, then most gene copies should have representatives in all extant cicada species. If most lineage splitting occurred within species groups, then many gene copies should be unique to species groups and not shared throughout the genus. Gene phylogenies generated with six representative Hodgkinia genes show multiple but relatively few well-supported clades with representatives of all three Magicicada species groups (Fig. S2). In some cases, we identified only one gene copy per HGC, with gene phylogenies that recapitulate host phylogeny (Fig. S2A). This pattern suggests that there was a single copy of the gene in the last common ancestor of Magicicada, and that the genomic circle it was on may not have undergone splits. In other cases, the single ancestral gene copy co-diversified with hosts, but also underwent splits in some species groups (Fig. S2B). Trees of other genes form between two and five highly supported clades that include copies from all species groups (Fig. S2C–F), showing evidence for variable amount of additional splitting after species groups diverged (Fig. S2G). These phylogenies show that a minimum of five distinct Hodgkinia lineages existed in the last common ancestor of Magicicada.

These data suggest that most of the splitting we see in Fig. 1 happened after Magicicada started to diversify. If this is true, we expect that the similarity of HGCs should diminish as a function of cicada phylogenetic distance. In comparing extant circular molecules between cicada species, we find few clearly homologous circles with identical gene sets conserved in all Magicicada species. Because comparative genomic methods are generally based on sequence similarity and synteny comparisons, and we found little obvious synteny to compare, we developed a metric based on the Jaccard Index [15] to quantify the similarity in gene content of the finished circles between cicada species. We call this metric the Hodgkinia Similarity Index (HSI, Fig. 2). We calculate the HSI as follows, for hypothetical circular molecules A and B:

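$$
\mathrm{HSI} = \frac{|\text{Genes in } A \cap \text{Genes in } B|}{|\text{Genes in } A \cup \text{Genes in } B|} \times \frac{\text{Length (in bp) of smaller of } A \text{ and } B}{\text{Length (in bp) of larger of } A \text{ and } B} \qquad \text{(Equation 1)}
$$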

Figure 2. HSI scores for individual circles and HGCs.


(A) Illustration of synteny conservation between homologous circular molecules. Shown is the reference circle MAGCAS001 and the circle most similar to it from all other Magicicada species (abbreviations are taken from Table 1). Horizontal black lines represent the genome backbone, orange boxes are genes shared between a given circular molecule and MAGCAS001. Blue bars represent genes present in a given circular molecule but not present on MAGCAS001. Shaded vertical lines show gene homologs present on different circles, black lines connect putative homologs over gaps in some genomes. HSI scores between the reference and all circular molecules are shown on the right. Numbers on the phylogenies represent inferred mutational events on the respective lineage: (1) genome rearrangement, (2) and (3) individual gene loss events, (4) loss of five genes. The exact branch on which (4) occurred is ambiguous. Three contigs from M. tredecim seem to be homologous to the reference circle when joined together, but we could not close them to a single circle so they were not included in the HSI analysis. (B) Distribution of all HSI scores for M. cassini. The x-axis shows each species M. cassini was compared with, and the y-axis shows the HSI score. The bold orange line represents the circles shown in (A). (C) Heatmap showing pairwise average HSI scores between all species. (D) Heatmap showing pairwise average Jaccard Index of the whole Hodgkinia gene set in each species. In both B and C, a score of one indicates that the two species are identical, and zero indicates that they share no genes in common. All trees are taken from Figure 1.

Briefly, a finished circular molecule of one cicada species is compared to a circular molecule of another cicada species. We calculate the Jaccard Index of the two gene sets (the intersection of gene sets divided by the union, the left half of Eq. 1) and multiply that by the ratio of the length of the smaller circle divided by the length of the larger one (right half of Eq. 1). We calculate this pairwise value for all circles of a species and report the average HSI score between those two cicada species. We then repeat this for all pairwise comparisons of cicada species. An HSI value of one indicates the two circles have the same functional genes and are the same length, whereas a value of zero indicates they share no common genes. Because the circles have on average very low coding densities and have apparently undergone rearrangements in some cases (Fig. 2A), this metric does not take gene co-linearity into account. It is also not (necessarily) a true measure of homology since it does not distinguish between conservation of an ancestral circle and convergent evolution to a similar state. Rather it is a rough metric to score the overall similarity of HGCs between cicada species in the absence of much calculable similarity (Fig. 2B). It is also a conservative metric, since there will undoubtedly be homologous circular molecules that were not completely assembled and thus not calculated in the HSI.
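The calculation in Equation 1 can be sketched for a single pair of circles as follows. This is a minimal illustration rather than the code used in the study: the function name `hsi`, the gene names, and the circle lengths are hypothetical, and gene sets are assumed to contain only genes annotated as functional.

```python
def hsi(genes_a, length_a, genes_b, length_b):
    """Hodgkinia Similarity Index for two finished circles (Equation 1).

    genes_a, genes_b: collections of functional gene names on each circle.
    length_a, length_b: circle lengths in bp.
    """
    genes_a, genes_b = set(genes_a), set(genes_b)
    union = genes_a | genes_b
    # Guard against empty gene sets or zero lengths (an added assumption).
    if not union or max(length_a, length_b) == 0:
        return 0.0
    jaccard = len(genes_a & genes_b) / len(union)                     # left half of Eq. 1
    length_ratio = min(length_a, length_b) / max(length_a, length_b)  # right half of Eq. 1
    return jaccard * length_ratio


# Hypothetical circles: gene names and lengths are invented for illustration.
score = hsi({"hisA", "hisB", "cobQ", "rpoB"}, 20_000,
            {"hisA", "hisB", "cobQ"}, 15_000)
print(score)  # (3/4) * (15000/20000) = 0.5625
```

Because the metric uses only set membership and length, two circles with the same genes in a different order score the same as two co-linear circles, which matches the statement above that co-linearity is not taken into account.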

We find a strong phylogenetic signal in HSI scores, where HGCs between species pairs (M. cassini–M. tredecassini, M. septendecula–M. tredecula, and M. septendecim–M. neotredecim) are highly similar to one another (0.80 HSI on average, Fig. 2C). This is expected given that each of these species pairs is estimated to have diverged less than 50 thousand years ago [13]. The HSI scores degrade quickly with increased phylogenetic distance, however. Pairwise comparisons between M. tredecim–M. neotredecim and M. tredecim–M. septendecim (500 thousand years diverged [13]), Cassini species with Decula species (2.5 million years diverged [13]), and Cassini and Decula species with Decim species (4 million years diverged [13]) give average HSI scores of 0.43, 0.46, and 0.29, respectively. This lack of similarity is remarkable given that the HSI between the single Hodgkinia genomes of Diceroprocta semicincta and Tettigades ulnaria, which diverged more than 60 million years ago [16–19], is 0.88.

Our combined phylogenetic and HSI analyses suggest that splitting began in the ancestor of Magicicada, Tryella, and Aleeta (into 2–3 circles), continued somewhat in the ancestor of all Magicicada (into at least 5–6 circles), but that splitting accelerated dramatically (into 20+ circles) after Magicicada began diversifying.

Hodgkinia’s overall function mostly remains intact

The long-term stability of endosymbiont genomes is often attributed to the importance of their function to host survival [3, 20, 21]. Since Hodgkinia is clearly experiencing dramatic genomic instability, we wanted to test whether the complete ancestral Hodgkinia gene set was retained in HGCs in different Magicicada species. To directly compare gene complements between Hodgkinia HGCs, and to be consistent with the HSI, we calculated the Jaccard Index of each gene set for all pairwise comparisons of all Magicicada species. Similar to the HSI, a score of 1 would indicate that two cicada species have identical Hodgkinia gene sets, and a score of 0 would indicate that no genes are shared. We find that HGC gene sets within closely related species pairs are very similar (0.90 on average, Fig. 2D). Pairwise comparisons between M. tredecim–M. neotredecim and M. tredecim–M. septendecim (0.86), Cassini species with Decula species (0.87), and Cassini and Decula species with Decim species (0.86) also remain very similar, in contrast to the HSI scores calculated for these comparisons (compare Fig. 2C to 2D). These data indicate that while the patterns of Hodgkinia genome fragmentation are different in divergent Magicicada species, the overall set of retained genes is similar. For a sense of scale, the same analysis for cicadas diverged for tens of millions of years [16–19], such as Magicicada and D. semicincta, Magicicada and T. ulnaria, and D. semicincta and T. ulnaria, gives values of 0.82, 0.77, and 0.92, respectively. We note again that not all Hodgkinia genes present in Magicicada may have fully assembled due to the complexity of the dataset, so the true values for Magicicada HGCs may be higher than what we report here.

Lineage splitting leads to different gene dosages

It seemed possible that lineage splitting in Hodgkinia might be beneficial for the host, perhaps as a mechanism to control the dosage of Hodgkinia genes. Under this hypothesis, we would expect similar gene dosages in comparisons of various Magicicada species. To calculate gene dosage in an HGC, we summed the average coverage of all contigs on which a given functional gene is found, scaled to the most abundant gene for each species. We find that the relative abundances of genes are similar within species groups (cicadas diverged less than 50 thousand years ago [13]), but not between species groups (Fig. 3). Principal coordinates analysis of relative gene abundances of all genes present in any species clusters the Decula and Cassini species groups together, M. neotredecim and M. septendecim together, and the remaining species (including M. tredecim, which diverged from M. neotredecim and M. septendecim only 0.5 million years ago [13]) separately (Fig. S3A). This can be more clearly seen when only considering genes annotated in all species (Fig. S3B). This grouping is qualitatively similar to the HSI results, and suggests that there is no convergent pattern of gene dosage outcomes, as might be expected if the host were dictating the process. Rather, the gene dosage outcomes are stochastic and thus only similar in comparisons between very closely related cicadas.
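The gene dosage calculation described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' pipeline: the function name `relative_gene_dosage`, the contig names, gene names, and coverage values are all hypothetical, and a gene is assumed to be counted once per contig.

```python
from collections import defaultdict


def relative_gene_dosage(contig_coverage, contig_genes):
    """Sum contig coverage per gene, then scale to the most abundant gene.

    contig_coverage: dict mapping contig name -> average sequencing coverage.
    contig_genes: dict mapping contig name -> functional genes on that contig.
    """
    dosage = defaultdict(float)
    for contig, genes in contig_genes.items():
        for gene in set(genes):  # assumption: each gene counted once per contig
            dosage[gene] += contig_coverage[contig]
    if not dosage:
        return {}
    top = max(dosage.values())
    if top == 0:
        return {gene: 0.0 for gene in dosage}
    return {gene: value / top for gene, value in dosage.items()}


# Hypothetical HGC with three contigs.
coverage = {"contig1": 100.0, "contig2": 50.0, "contig3": 10.0}
genes = {"contig1": ["hisA", "rpoB"], "contig2": ["hisA"], "contig3": ["cobQ"]}
relative = relative_gene_dosage(coverage, genes)
for gene in sorted(relative):
    print(gene, round(relative[gene], 3))
```

In this toy example hisA sits on two contigs, so its summed coverage (150) becomes the reference value of 1, and every other gene is expressed as a fraction of it.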

Figure 3. Relative gene abundance in all study species.


Heatmaps showing the relative abundance of each gene in each species, ordered by gene category. A value of one (black) indicates the most abundant gene in that species, and zero (white) indicates that gene is absent in that species. Columns that are completely white represent genes that were not annotated in any species, so have either been lost, are present on broken contigs, or are present on contigs that did not otherwise assemble in our experiments. Trees are taken from Figure 1.

Discussion

Many endosymbioses consist of two or more partners that are strictly reliant on one another for survival. The eukaryotic cell is now completely intertwined with and dependent upon its mitochondria (with one known exception [22]), and mitochondria cannot survive outside of their host cell. Similarly, cicadas require Hodgkinia for survival, and it is unlikely that Hodgkinia could survive outside of cicada cells. Despite these obligate host-symbiont co-dependencies, each partner can experience selection and drift independently of the other, so their evolutionary trajectories are not inevitably aligned and may directly conflict with one another [23–32]. Although the engulfed partner is capable of exerting selfish tendencies in some cases [33–35], there exist several mechanisms for the host to constrain the evolution of its symbionts [36–39]. In bacterial endosymbioses, this host-level constraint is often reflected in the genomic stasis of the bacterial partner. Endosymbiont genomes can remain stable in gene content and structure for tens [3], hundreds [12], or even thousands [5, 40] of millions of years, and we interpret this stability as a reflection of host-dominated evolution for the preservation of endosymbiont function.

However, secondary genome instability subsequent to this stasis is now recognized as relatively common, especially in mitochondria [41, 42]. Mitochondrial genomic instability manifests both as genome reduction [43, 44] that sometimes leads to outright genome loss [45–49], and genome fragmentation [50–53] that sometimes leads to massive genome expansion with little obvious functional change [54–57]. We suggest that what unites these starkly different outcomes is a shift away from the host-driven constraint of the endosymbiont genome towards (sometimes temporary) symbiont-driven instability. In cases of mitochondrial reduction and loss, the host ecology changes such that the function of the organelle is no longer needed and therefore no longer under selective constraint from the host [46–48]. For example, many eukaryotes that live in anaerobic environments no longer require the oxidative respiratory function of their mitochondria, so the genes for this process are free to be lost [44]. The forces promoting mitochondrial genome fragmentation and expansion are less clear, but these expansions sometimes seem to be associated with increases in mitochondrial mutation rates [55] and have been hypothesized to result from less efficacious host-level selection against slightly deleterious symbiont mutations [57, 58].

Depending on whether one takes a Hodgkinia- or cicada-centric perspective, the outcomes we report here could be interpreted either as a genome reductive or genome expansive process [7, 8]. From the perspective of individual Hodgkinia lineages, each circular molecule in Magicicada gets smaller after a split, eventually resulting in circles less than half the size and encoding fewer than ~30% of the genes that were present on the already tiny ancestral Hodgkinia genome (Table S1)[6, 7]. This reductive process likely reflects the deletional mutation bias of bacteria [59], which in part explains the extremely small size of bacterial endosymbiont genomes in general [1]. In Hodgkinia, newly split lineages resemble a gene duplication event [7] that often results in one gene copy being pseudogenized and eventually lost entirely. Hodgkinia’s splitting and deletion process leads to individual circular molecules that resemble the extremely degraded genomes of mitochondria found in some eukaryotes. The idiosyncratic nature of these circles in closely related cicada species (Fig. 2C) is consistent with stochastic mutational loss and suggests a process with no particular goal or end point. But an important difference between cases of mitochondrial genome reduction and Hodgkinia from Magicicada is that the host ecology has not changed such that Hodgkinia’s functions are no longer required. The massive gene loss on individual Hodgkinia circles is likely only tolerable because, from the host’s perspective, the combined HGCs seem to have retained Hodgkinia’s overall nutritional contribution to the symbiosis (Fig. 2D). From the host perspective, this splitting and genome reductive process results in a combined Hodgkinia “genome” size over an order of magnitude larger than the ancestral single genome (Table 1).

In our view, the most interesting parallel to what we report here for Hodgkinia can be found in the mitochondrial genomes of the angiosperm genus Silene [55, 60]. Like many plants, some Silene mitochondrial genomes consist of a single “master circle” with multiple “subcircles” that arise primarily from recombination [61]. Other Silene species, though, have experienced dramatic increases in mitochondrial mutation rates, which seem to be accompanied by the expansion to dozens of enormous mitochondrial chromosomes [55]. These mitochondrial chromosomes, some encoding few or no detectable genes, can be rapidly lost or gained in closely related Silene lineages [60]. Like Hodgkinia, this diversity of genome expansion outcomes in closely related plant hosts is not accompanied by any detectable increase in functional capacity. We previously hypothesized that the increased complexity of Hodgkinia in Magicicada results from a similar increased effective mutation rate in Hodgkinia [8], with a conceptual modification related to lifecycle changes of the host cicada. While Hodgkinia’s inherent mutation rate may not be different in various cicada hosts, longer host lifecycles such as the 13- or 17-year lifecycle of Magicicada [62] may allow more symbiont generations and thus more Hodgkinia mutations per host lifecycle. We hypothesize that this increase in effective mutation rate enables Hodgkinia’s lineage splitting process and eventually results in stochastic differences between HGCs from different species groups at the level of genome structure (Fig. 2C). While Hodgkinia genes are (mostly) maintained in all HGCs, they are now present at wildly different abundances in different cicada species groups (Fig. 3). We hypothesize that lineage splitting and changes in gene dosages are either maladaptive or neutral for the host. The cicada does not benefit from Hodgkinia degeneration but must tolerate it because the cicada is wholly dependent on Hodgkinia for survival.

Materials and Methods

DNA extraction

Bacteriomes were dissected from a single male of T. crassa, a single female of A. curvicosta and M. tredecim, and two females of the remaining species. DNA was extracted from all dissected bacteriomes using a DNeasy Blood and Tissue kit (Qiagen catalog #69506). Extracted DNA was stored at −20 °C.

Library preparation and sequencing

Genomic DNA from M. tredecim was sheared to an average fragment size of 550 base pairs using a Covaris E220. Sheared DNA was made into a sequencing library using the NEBNext Ultra DNA Library Prep Kit for Illumina (catalog #E7370S), according to the standard protocol. The library was sequenced at the University of Montana Genomics Core on a MiSeq benchtop sequencer with a v3 600 cycle kit.

Genomic DNA from A. curvicosta was sheared to an average size of 480 base pairs using a Covaris E220. Sheared DNA was made into a sequencing library using a TruSeq PCR-free kit (Illumina) and sequenced as ~1/12 of a multiplexed lane at NGX Bio in San Francisco, CA using a HiSeq 2500 Rapid SBS kit (Illumina).

Genomic DNA from T. crassa was sheared to an average of 570 base pairs using a Covaris E220. Sheared DNA was made into a sequencing library using the NEBNext Ultra DNA Library Prep Kit for Illumina (catalog #E7370S), according to the standard protocol. The library was sequenced as ~1/4 of a multiplexed lane at the University of Montana Genomics Core on a MiSeq benchtop sequencer with a v3 600 cycle kit.

Genomic DNA from M. neotredecim, septendecim, tredecassini, cassini, tredecula, and septendecula was sheared to an average of 500 base pairs using a Covaris E220. Sheared DNA was made into a sequencing library using the NEBNext Ultra DNA Library Prep Kit for Illumina (catalog #E7370S), according to the standard protocol. Libraries were sequenced on two lanes on a HiSeq 2500 with 250 cycles at the Johns Hopkins School of Medicine Genetic Resources Core Facility.

Genomic DNA from M. neotredecim, septendecim, tredecassini, cassini, tredecula, and septendecula was used for making libraries with a Nextera Mate Pair Sample Prep Kit (Illumina), according to the standard protocol. These libraries were sequenced on a single lane on a HiSeq 2500 with 100 cycles at the Case Western Reserve University Genomics Core Facility.

Genome assembly and annotation

Raw reads were quality filtered using Trimmomatic version 0.35 [63]. Remaining reads were further filtered using fastq_quality_filter from FASTX version 0.0.13 (http://hannonlab.cshl.edu/fastx_toolkit/).

Assembly of the filtered reads was done using SPAdes version 3.6.2 [64], using kmer sizes 127, 151, 191, and 291, both individually as well as combined together. Putative Hodgkinia contigs were identified with TBLASTN 2.2.31+ [65] with an E-value cutoff of 10e-5 using previously annotated Hodgkinia genes as the query. To remove redundant contigs, all putative Hodgkinia contigs were queried against themselves, and any contig >= 97% identical to another over >= 80% of its length was considered redundant and removed. Any contigs with BLASTN E-values less than 10e-10 to the mitochondrial genome were also removed. Coverage of individual contigs was calculated as the sum of the coverage at each base divided by the length of the contig. Completely assembled Hodgkinia circles were identified based on sequence overlap on both ends of the contig. To identify putative circular contigs, filtered paired end and mate pair reads were mapped back to the assembly using BWA version 0.7.12-r1039 [66] with default parameters. Contigs were considered putatively circular if more than five read pairs mapped with one mate mapping in the first 10% of the contig, while its mate mapped in the last 10% of the contig. Putatively circular contigs were then closed when possible by PCR and Sanger sequencing.
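The per-contig coverage calculation and the read-pair criterion for putative circularity can be sketched as below. This is a simplified illustration, not the code used in the study: the function names, the representation of each read pair as a tuple of 0-based mapping positions, and the example values are assumptions, and real data would first need to be parsed from the BWA alignments.

```python
def mean_coverage(per_base_depth):
    """Average coverage: summed per-base depth divided by contig length."""
    if not per_base_depth:
        return 0.0
    return sum(per_base_depth) / len(per_base_depth)


def is_putatively_circular(contig_length, read_pairs, min_pairs=5, edge_fraction=0.10):
    """Apply the read-pair criterion for putative circularity.

    read_pairs: iterable of (pos1, pos2), the 0-based mapping positions of the
    two mates on the same contig (a simplified, hypothetical representation).
    Returns True if MORE than min_pairs pairs have one mate in the first 10%
    of the contig and the other mate in the last 10%.
    """
    edge = contig_length * edge_fraction
    spanning = 0
    for pos1, pos2 in read_pairs:
        left, right = min(pos1, pos2), max(pos1, pos2)
        if left < edge and right >= contig_length - edge:
            spanning += 1
    return spanning > min_pairs


# Hypothetical 1,000 bp contig with six end-spanning pairs and one internal pair.
pairs = [(10, 990)] * 6 + [(400, 600)]
print(is_putatively_circular(1000, pairs))      # True: six spanning pairs > 5
print(is_putatively_circular(1000, pairs[:5]))  # False: only five spanning pairs
print(mean_coverage([2, 4, 6]))                 # 4.0
```

A contig that passes this test is only a candidate; as described above, closure was then attempted by PCR and Sanger sequencing.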

Annotation of the Hodgkinia circles was done using a custom Python pipeline based around the Jackhmmer module of HMMER v. 3.1b2 [67], RNAmmer 1.2 [68], and Aragorn v1.2.36 [69]. Occasionally RNAmmer misannotated the 23S rRNA gene, so barrnap 0.6 [70] was used for corrections. The completely closed Hodgkinia circles were then checked manually for any long open reading frames that could contain missing genes.

Phylogenetic analysis

Host phylogeny was reconstructed using RAxML v. 8.2.0 [71] based on manually inspected alignments of 15 mitochondrial genes (13 protein-coding and two rRNA) with a total length of 12,744 bp, divided into four partitions corresponding to the three codon positions and the rRNA genes. Rapid bootstrapping (100 replicates) was used to estimate node support.

To construct individual gene phylogenies, homologous nucleotide sequences were translated into amino acids and aligned using MAFFT v. 7.221 [74]. Visually inspected alignments were analyzed using RAxML v. 8.2.4 [71] with the PROTGAMMAWAG model of amino acid substitution and 100 bootstrap replicates. Trees were rooted using Aleeta-Tryella as the outgroup (whenever they formed a single monophyletic clade), or alternatively on the longest branch separating well-supported clades that included species from all or most hosts in a comparison.

Comparative Hodgkinia genome analysis

To compare the homology of HGC circles between cicada species, a Hodgkinia Similarity Index (HSI) score was calculated for all pairwise comparisons of all circles, as explained in Results. The pair with the highest HSI score was kept for each circle.
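
The best-match step can be illustrated with a short Python sketch. The HSI itself is defined in Results and is not reproduced here; the plain Jaccard index on gene sets used below is only a stand-in scoring function chosen for illustration, and the function names, gene names, and data layout are hypothetical.

```python
from typing import Callable, Dict, Set, Tuple


def jaccard(genes_a: Set[str], genes_b: Set[str]) -> float:
    """Plain Jaccard index: shared genes divided by all genes in either set.

    Stand-in score only; this is NOT the HSI defined in Results.
    """
    union = genes_a | genes_b
    return len(genes_a & genes_b) / len(union) if union else 0.0


def best_matches(
    circles_a: Dict[str, Set[str]],
    circles_b: Dict[str, Set[str]],
    score: Callable[[Set[str], Set[str]], float] = jaccard,
) -> Dict[str, Tuple[str, float]]:
    """For each circle in species A, keep the highest-scoring circle in species B."""
    best: Dict[str, Tuple[str, float]] = {}
    for name_a, genes_a in circles_a.items():
        # Score every pairwise comparison for this circle.
        scored = [(score(genes_a, genes_b), name_b)
                  for name_b, genes_b in circles_b.items()]
        if scored:
            # Ties are broken by circle name, an arbitrary choice for this sketch.
            top_score, top_name = max(scored)
            best[name_a] = (top_name, top_score)
    return best
```

Passing a different `score` function would let the same loop be reused with the actual HSI once it is implemented from its definition.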

To determine the relative coverage of all Hodgkinia genes, the coverage of each gene was summed from the coverage of every contig on which it was annotated. These abundance values were then normalized to the most abundant gene. Principal coordinates analysis was done using the R package vegan 2.4-3 [72].
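
As a rough illustration of the gene-dosage calculation, the sketch below sums contig coverage per gene and scales the result so that the most abundant gene has a value of 1. The function name, the input layout (a contig-to-coverage mapping and a list of contig and gene pairs), and the example gene names are assumptions made for this example and do not come from the study's code.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def relative_gene_coverage(
    contig_coverage: Dict[str, float],
    gene_annotations: Iterable[Tuple[str, str]],
) -> Dict[str, float]:
    """Sum contig coverage for each gene, then normalize to the most abundant gene.

    `gene_annotations` holds (contig, gene) pairs, one per annotated gene copy.
    """
    totals: Dict[str, float] = defaultdict(float)
    for contig, gene in gene_annotations:
        # Each annotated copy contributes the coverage of its contig.
        totals[gene] += contig_coverage[contig]
    if not totals:
        return {}
    top = max(totals.values())
    if top == 0:
        raise ValueError("all genes have zero coverage")
    return {gene: value / top for gene, value in totals.items()}
```

For example, a gene annotated on two contigs with coverages of 100 and 50 has a summed coverage of 150; if that is the highest total, its normalized value is 1.0 and every other gene is expressed as a fraction of it.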

Supplementary Material

supplemental

Acknowledgments

We thank all members of the McCutcheon lab for helpful discussion and comments, and Scott Miller for suggesting the use of the Jaccard Index. Sequencing and analysis were supported by National Science Foundation grants IOS-1256680 and IOS-1553529, and National Aeronautics and Space Administration Astrobiology Institute Award NNA15BB04A. Funding for cicada collecting was provided by NSF DEB-09-55849.

Footnotes

Author Contributions

Conceptualization, M.A.C. and J.P.M.; Methodology, M.A.C. and P.L.; Formal Analysis, M.A.C.; Investigation, M.A.C.; Resources, C.S. and J.P.M.; Data Curation, M.A.C. and C.S.; Writing – Original Draft, M.A.C.; Writing – Review & Editing, M.A.C., P.L., C.S., and J.P.M.; Visualization, M.A.C.; Supervision, J.P.M.; Funding Acquisition, J.P.M.

The authors declare no conflict of interest.

References

  • 1. McCutcheon JP, Moran NA. Extreme genome reduction in symbiotic bacteria. Nature Rev Microbiol. 2012;10:13–26. doi: 10.1038/nrmicro2670.
  • 2. Bennett GM, Moran NA. Heritable symbiosis: The advantages and perils of an evolutionary rabbit hole. Proc. Natl. Acad. Sci. U.S.A. 2015;112:10169–10176. doi: 10.1073/pnas.1421388112.
  • 3. Tamas I. 50 Million Years of Genomic Stasis in Endosymbiotic Bacteria. Science. 2002;296:2376–2379. doi: 10.1126/science.1071278.
  • 4. McCutcheon JP. The bacterial essence of tiny symbiont genomes. Curr Opin Microbiol. 2010;13:73–78. doi: 10.1016/j.mib.2009.12.002.
  • 5. Boore JL. Animal mitochondrial genomes. Nucl. Acids Res. 1999;27:1767–1780. doi: 10.1093/nar/27.8.1767.
  • 6. McCutcheon JP, McDonald BR, Moran NA. Origin of an Alternative Genetic Code in the Extremely Small and GC-Rich Genome of a Bacterial Symbiont. PLOS Genet. 2009;5:e1000565. doi: 10.1371/journal.pgen.1000565.
  • 7. Van Leuven JT, Meister RC, Simon C, McCutcheon JP. Sympatric speciation in a bacterial endosymbiont results in two genomes with the functionality of one. Cell. 2014;158:1270–1280. doi: 10.1016/j.cell.2014.07.047.
  • 8. Campbell MA, Van Leuven JT, Meister RC, Carey KM, Simon C, McCutcheon JP. Genome expansion via lineage splitting and genome reduction in the cicada endosymbiont Hodgkinia. Proc. Natl. Acad. Sci. U.S.A. 2015;112:10192–10199. doi: 10.1073/pnas.1421386112.
  • 9. White J, Strehl CE. Xylem feeding by periodical cicada nymphs on tree roots. Ecol Entomol. 1978;3:323–327.
  • 10. Buchner P. Endosymbiosis of animals with plant microorganisms. John Wiley & Sons; 1965.
  • 11. Zientz E, Dandekar T, Gross R. Metabolic interdependence of obligate intracellular bacteria and their insect hosts. Microbiol. Mol. Biol. Rev. 2004;68:745–770. doi: 10.1128/MMBR.68.4.745-770.2004.
  • 12. McCutcheon JP, McDonald BR, Moran NA. Convergent evolution of metabolic roles in bacterial co-symbionts of insects. Proc. Natl. Acad. Sci. U.S.A. 2009;106:15394–15399. doi: 10.1073/pnas.0906424106.
  • 13. Sota T, Yamamoto S, Cooley JR, Hill KBR, Simon C, Yoshimura J. Independent divergence of 13- and 17-y life cycles among three periodical cicada lineages. Proc. Natl. Acad. Sci. U.S.A. 2013;110:6919–6924. doi: 10.1073/pnas.1220060110.
  • 14. Moulds MS. An appraisal of the cicadas of the genus Abricta Stal and allied genera (Hemiptera: Auchenorrhyncha: Cicadidae). Records of the Australian Museum. 2003.
  • 15. Jaccard P. Etude comparative de la distribution florale dans une portion des Alpes et des Jura. Bull Soc Vaudoise Sci Nat. 1901;37:547–579.
  • 16. Cooper KW. Davispia bearcreekensis Cooper, a new cicada from the Paleocene, with a brief review of the fossil Cicadidae. Am J Sci. 1941;239:286–304.
  • 17. Poinar G, Jr, Kritsky G. Morphological conservatism in the foreleg structure of cicada hatchlings, Burmacicada protera n. gen., n. sp. in Burmese amber, Dominicicada youngi n. gen., n. sp. in Dominican amber and the extant Magicicada septendecim (L.) (Hemiptera: Cicadidae). Historical Biology. 2012;24:461–466.
  • 18. Poinar G, Jr, Kritsky G, Brown A. Minyscapheus dominicanus n. gen., n. sp. (Hemiptera: Cicadidae), a fossil cicada in Dominican amber. Historical Biology. 2012;103:1–5.
  • 19. Marshall DC, Hill KBR, Moulds M, Vanderpool D, Cooley JR, Mohagan AB, Simon C. Inflation of Molecular Clock Rates and Dates: Molecular Phylogenetics, Biogeography, and Diversification of a Global Cicada Radiation from Australasia (Hemiptera: Cicadidae: Cicadettini). Syst Biol. 2016;65:16–34. doi: 10.1093/sysbio/syv069.
  • 20. Wernegreen JJ. Genome evolution in bacterial endosymbionts of insects. Nat. Rev. Genet. 2002;3:850–861. doi: 10.1038/nrg931.
  • 21. Patiño-Navarrete R, Moya A, Latorre A, Peretó J. Comparative Genomics of Blattabacterium cuenoti: The Frozen Legacy of an Ancient Endosymbiont Genome. Genome Biol Evol. 2013;5:351–361. doi: 10.1093/gbe/evt011.
  • 22. Karnkowska A, Vacek V, Zubáčová Z, Treitli SC, Petrželková R, Eme L, Novák L, Žárský V, Barlow LD, Herman EK, et al. A Eukaryote without a Mitochondrial Organelle. Curr Biol. 2016;26:1274–1284. doi: 10.1016/j.cub.2016.03.053.
  • 23. Bennett GM, McCutcheon JP, MacDonald BR, Romanovicz D, Moran NA. Differential Genome Evolution Between Companion Symbionts in an Insect-Bacterial Symbiosis. mBio. 2014;5:e01697-14. doi: 10.1128/mBio.01697-14.
  • 24. Keeling PJ, McCutcheon JP. Endosymbiosis: The feeling is not mutual. J Theor Biol. 2017. doi: 10.1016/j.jtbi.2017.06.008.
  • 25. Eberhard WG. Evolution in bacterial plasmids and levels of selection. Q Rev Biol. 1990;65:3–22. doi: 10.1086/416582.
  • 26. Otto SP, Orive ME. Evolutionary consequences of mutation and selection within an individual. Genetics. 1995;141:1173–1187. doi: 10.1093/genetics/141.3.1173.
  • 27. Okasha S. Multilevel Selection and the Major Transitions in Evolution. Philosophy of Science. 2005;72:1013–1025.
  • 28. Maynard Smith J. Group selection and kin selection. Nature. 1964;201:1145–1147.
  • 29. Hurst LD, Atlan A, Bengtsson BO. Genetic Conflicts. Q Rev Biol. 1996;71:317–364. doi: 10.1086/419442.
  • 30. Kiers ET, West SA. Evolving new organisms via symbiosis. Science. 2015;348:392–394. doi: 10.1126/science.aaa9605.
  • 31. West SA, Fisher RM, Gardner A, Kiers ET. Major evolutionary transitions in individuality. Proc. Natl. Acad. Sci. U.S.A. 2015;112:10112–10119. doi: 10.1073/pnas.1421402112.
  • 32. Sachs JL, Skophammer RG, Regus JU. Evolutionary transitions in bacterial symbiosis. Proc. Natl. Acad. Sci. U.S.A. 2011;108(Suppl 2):10800–10807. doi: 10.1073/pnas.1100304108.
  • 33. Taylor DR, Zeyl C, Cooke E. Conflicting levels of selection in the accumulation of mitochondrial defects in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. U.S.A. 2002;99:3690–3694. doi: 10.1073/pnas.072660299.
  • 34. Bastiaans E, Aanen DK, Debets AJM, Hoekstra RF, Lestrade B, Maas MFPM. Regular bottlenecks and restrictions to somatic fusion prevent the accumulation of mitochondrial defects in Neurospora. Philos Trans R Soc Lond B Biol Sci. 2014;369:20130448. doi: 10.1098/rstb.2013.0448.
  • 35. Ma H, O'Farrell PH. Selfish drive can trump function when animal mitochondrial genomes compete. Nat Genet. 2016;48:798–802. doi: 10.1038/ng.3587.
  • 36. Bergstrom CT, Pritchard J. Germline Bottlenecks and the Evolutionary Maintenance of Mitochondrial Genomes. Genetics. 1998;149:2135–2146. doi: 10.1093/genetics/149.4.2135.
  • 37. Rispe C, Moran NA. Accumulation of Deleterious Mutations in Endosymbionts: Muller’s Ratchet with Two Levels of Selection. The American Naturalist. 2000;156:425–441. doi: 10.1086/303396.
  • 38. Leigh EG. When does the good of the group override the advantage of the individual? Proc. Natl. Acad. Sci. U.S.A. 1983;80:2985–2989. doi: 10.1073/pnas.80.10.2985.
  • 39. Maynard Smith J. Group selection. Q Rev Biol. 1976;51:277–283.
  • 40. Adams K. Evolution of mitochondrial gene content: gene loss and transfer to the nucleus. Mol Phylogenet Evol. 2003;29:380–395. doi: 10.1016/s1055-7903(03)00194-5.
  • 41. Smith DR, Keeling PJ. Mitochondrial and plastid genome architecture: Reoccurring themes, but significant differences at the extremes. Proc. Natl. Acad. Sci. U.S.A. 2015;112:10177–10184. doi: 10.1073/pnas.1422049112.
  • 42. Burger G, Gray MW, Lang BF. Mitochondrial genomes: anything goes. Trends Genet. 2003;19:709–716. doi: 10.1016/j.tig.2003.10.012.
  • 43. Turmel M, Lemieux C, Burger G, Lang BF, Otis C, Plante I, Gray MW. The Complete Mitochondrial DNA Sequences of Nephroselmis olivacea and Pedinomonas minor: Two Radically Different Evolutionary Patterns within Green Algae. The Plant Cell Online. 1999;11:1717–1729. doi: 10.1105/tpc.11.9.1717.
  • 44. Conway DJ, Fanello C, Lloyd JM, Al-Joubori BMAS, Baloch AH, Somanath SD, Roper C, Oduola AMJ, Mulder B, Povoa MM, et al. Origin of Plasmodium falciparum malaria is traced by mitochondrial DNA. Mol Biochem Parasitol. 2000;111:163–171. doi: 10.1016/s0166-6851(00)00313-3.
  • 45. Mai Z, Ghosh S, Frisardi M, Rosenthal B, Rogers R, Samuelson J. Hsp60 Is Targeted to a Cryptic Mitochondrion-Derived Organelle (“Crypton”) in the Microaerophilic Protozoan Parasite Entamoeba histolytica. Mol. Cell. Biol. 1999;19:2198–2205. doi: 10.1128/mcb.19.3.2198.
  • 46. Hackstein JHP, Akhmanova A, Boxma B, Harhangi HR, Voncken FGJ. Hydrogenosomes: eukaryotic adaptations to anaerobic environments. Trends Microbiol. 1999;7:441–447. doi: 10.1016/s0966-842x(99)01613-3.
  • 47. van der Giezen M. Hydrogenosomes and Mitosomes: Conservation and Evolution of Functions. J Eukaryot Microbiol. 2009;56:221–231. doi: 10.1111/j.1550-7408.2009.00407.x.
  • 48. Embley M, van der Giezen M, Horner DS, Dyal PL, Foster P. Mitochondria and hydrogenosomes are two forms of the same fundamental organelle. Philos Trans R Soc Lond B Biol Sci. 2003;358:191–203. doi: 10.1098/rstb.2002.1190.
  • 49. Tovar J, León-Avila G, Sánchez LB, Sutak R, Tachezy J, van der Giezen M, Hernández M, Müller M, Lucocq JM. Mitochondrial remnant organelles of Giardia function in iron-sulphur protein maturation. Nature. 2003;426:172–176. doi: 10.1038/nature01945.
  • 50. Shao R, Kirkness EF, Barker SC. The single mitochondrial chromosome typical of animals has evolved into 18 minichromosomes in the human body louse, Pediculus humanus. Genome Res. 2009;19:904–912. doi: 10.1101/gr.083188.108.
  • 51. Shao R, Zhu X-Q, Barker SC, Herd K. Evolution of Extensively Fragmented Mitochondrial Genomes in the Lice of Humans. Genome Biol Evol. 2012;4:1088–1101. doi: 10.1093/gbe/evs088.
  • 52. Shao R, Li H, Barker SC, Song S. The Mitochondrial Genome of the Guanaco Louse, Microthoracius praelongiceps: Insights into the Ancestral Mitochondrial Karyotype of Sucking Lice (Anoplura, Insecta). Genome Biol Evol. 2017;9:431–445. doi: 10.1093/gbe/evx007.
  • 53. Vlček Č, Marande W, Teijeiro S, Lukeš J, Burger G. Systematically fragmented genes in a multipartite mitochondrial genome. Nucl. Acids Res. 2011;39:979–988. doi: 10.1093/nar/gkq883.
  • 54. Sloan DB. One ring to rule them all? Genome sequencing provides new insights into the “master circle” model of plant mitochondrial DNA structure. New Phytologist. 2013;200:978–985. doi: 10.1111/nph.12395.
  • 55. Sloan DB, Alverson AJ, Chuckalovcak JP, Wu M, McCauley DE, Palmer JD, Taylor DR. Rapid Evolution of Enormous, Multichromosomal Genomes in Flowering Plant Mitochondria with Exceptionally High Mutation Rates. PLOS Biol. 2012;10:e1001241. doi: 10.1371/journal.pbio.1001241.
  • 56. Alverson AJ, Rice DW, Dickinson S, Barry K, Palmer JD. Origins and recombination of the bacterial-sized multichromosomal mitochondrial genome of cucumber. Plant Cell. 2011;23:2499–2513. doi: 10.1105/tpc.111.087189.
  • 57. Rice DW, Alverson AJ, Richardson AO, Young GJ, Sanchez-Puerta MV, Munzinger J, Barry K, Boore JL, Zhang Y, dePamphilis CW, et al. Horizontal Transfer of Entire Genomes via Mitochondrial Fusion in the Angiosperm Amborella. Science. 2013;342:1468–1473. doi: 10.1126/science.1246275.
  • 58. Lynch M, Koskella B, Schaack S. Mutation Pressure and the Evolution of Organelle Genomic Architecture. Science. 2006;311:1727–1730. doi: 10.1126/science.1118884.
  • 59. Mira A, Ochman H, Moran NA. Deletional bias and the evolution of bacterial genomes. Trends Genet. 2001;17:589–596. doi: 10.1016/s0168-9525(01)02447-7.
  • 60. Wu Z, Cuthbert JM, Taylor DR, Sloan DB. The massive mitochondrial genome of the angiosperm Silene noctiflora is evolving by gain or loss of entire chromosomes. Proc. Natl. Acad. Sci. U.S.A. 2015;112:10185–10191. doi: 10.1073/pnas.1421397112.
  • 61. Palmer JD, Shields CR. Tripartite structure of the Brassica campestris mitochondrial genome. Nature. 1984.
  • 62. Williams KS, Simon C. The ecology, behavior, and evolution of periodical cicadas. Annu Rev Entomol. 1995;40:269–295.
  • 63. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–2120. doi: 10.1093/bioinformatics/btu170.
  • 64. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, et al. SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing. J Comput Biol. 2012;19:455–477. doi: 10.1089/cmb.2012.0021.
  • 65. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10:421. doi: 10.1186/1471-2105-10-421.
  • 66. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–1760. doi: 10.1093/bioinformatics/btp324.
  • 67. Finn RD, Clements J, Eddy SR. HMMER web server: interactive sequence similarity searching. Nucl. Acids Res. 2011;39:W29–W37. doi: 10.1093/nar/gkr367.
  • 68. Lagesen K, Hallin P, Rødland EA, Starfeldt H-H, Rognes T, Ussery DW. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucl. Acids Res. 2007;35:3100–3108. doi: 10.1093/nar/gkm160.
  • 69. Laslett D, Canback B. ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucl. Acids Res. 2004;32:11–16. doi: 10.1093/nar/gkh152.
  • 70. Seemann T. barrnap. Available at: https://github.com/tseemann/barrnap.
  • 71. Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30:1312–1313. doi: 10.1093/bioinformatics/btu033.
  • 72. Oksanen J, Kindt R, Legendre P, O'Hara B. The vegan package. Community ecology Package. 2007;10:631–637.
  • 73. R Core Team. R: A Language and Environment for Statistical Computing. Available at: https://www.R-project.org.
  • 74. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30:772–780. doi: 10.1093/molbev/mst010.
