Abstract
The evolution of ants is marked by remarkable adaptations that allowed the development of very complex social systems. To identify how ant-specific adaptations are associated with patterns of molecular evolution, we searched for signs of positive selection on amino-acid changes in proteins. We identified 24 functional categories of genes which were enriched for positively selected genes in the ant lineage. We also reanalyzed genome-wide data sets in bees and flies with the same methodology to check whether positive selection was specific to ants or also present in other insects. Notably, genes implicated in immunity were enriched for positively selected genes in the three lineages, ruling out the hypothesis that the evolution of hygienic behaviors in social insects caused a major relaxation of selective pressure on immune genes. Our scan also indicated that genes implicated in neurogenesis and olfaction started to undergo increased positive selection before the evolution of sociality in Hymenoptera. Finally, the comparison between these three lineages allowed us to pinpoint molecular evolution patterns that were specific to the ant lineage. In particular, there was ant-specific recurrent positive selection on genes with mitochondrial functions, suggesting that mitochondrial activity was improved during the evolution of this lineage. This might have been an important step toward the evolution of extreme lifespan that is a hallmark of ants.
Keywords: comparative genomics, sociality, dN/dS, aging, lifespan, immunity, neurogenesis, olfactory receptors, metabolism, Hymenoptera, bees, Drosophila
Introduction
Ants constitute an extremely successful lineage of animals which has colonized virtually all ecosystems on Earth (Hölldobler and Wilson 1990). The pivotal feature at the basis of this ecological success is their highly social system with a reproductive division of labor, where one or a few queens specialize in reproduction, whereas workers conduct all the colony tasks such as brood care, nest maintenance, and food collection. In this article, we take advantage of the recent availability of seven sequenced ant genomes (Bonasio et al. 2010; Nygaard et al. 2011; Smith, Zimin, et al. 2011; Smith, Smith, et al. 2011; Suen et al. 2011; Wurm et al. 2011) to perform a genome-wide scan for positive selection on amino-acid changes in protein coding genes during the evolution of the ant lineage. We addressed three main questions.
First, we compared the amount of positive selection across functional categories of genes. Previous large-scale scans for positive selection in animals indicated that positive selection predominantly affects certain types of genes, such as those involved in evolutionary arms races, sexual selection, or conflicts with pathogens (Bakewell et al. 2007; Drosophila 12 Genomes Consortium 2007; Kosiol et al. 2008; Vamathevan et al. 2008; Oliver et al. 2010; George et al. 2011; Woodard et al. 2011). Such genes experienced positive selection events recurrently on broad evolutionary time scales, and it is likely that they contribute to a fraction of the positive selection events that occurred in the ant lineage. To identify these genes, we reasoned that they were likely also under positive selection in other insect lineages. A systematic comparison of the targets of positive selection from published studies in insects is not straightforward because genome-wide scans for positive selection were often performed with different methods in different lineages. For example, a positive selection scan on 12 Drosophila species (all solitary) (Drosophila 12 Genomes Consortium 2007) used the site test of Codeml (Yang et al. 2000), which is aimed at detecting recurrent positive selection events affecting particular sites of a protein, whereas a scan on 10 bee species (including solitary, primitively social, and highly social species) (Woodard et al. 2011) used the branch test (Yang 1998), which tends to detect positive selection events affecting a large number of sites of a protein but during a limited period of time. To perform a robust comparison of the genes that were under positive selection in ants and other insects, we conducted similar scans for positive selection in ants and in the fly and bee outgroups. Genes involved in defense and immunity are one example of genes expected to be repeatedly under positive selection in insects (Drosophila 12 Genomes Consortium 2007; Bulmer 2010).
On the basis of the observed smaller set of immunity genes in the honeybee compared with Drosophila melanogaster, it has been suggested that selective pressure on these genes might have been relaxed in social insects, perhaps because they have social hygienic behaviors (Honeybee Genome Sequencing Consortium 2006; Smith et al. 2008; Viljakainen et al. 2009; Smith, Zimin, et al. 2011; Suen et al. 2011; Harpur and Zayed 2013). However, the addition of several newly sequenced insect genomes revealed that the large immunity gene complement of the fruit fly is a derived character (Werren et al. 2010; Fischman et al. 2011; Smith, Smith, et al. 2011). We used our data sets to test whether there was evidence for weaker positive selection on immunity genes in ants and bees compared with flies.
Next, we aimed at detecting sets of genes involved in functions likely to reflect ant-specific adaptations. We focused on three main adaptations. The first relates to the wide range of coordinated collective behaviors associated with division of labor in ant societies. Complex cooperative behaviors occur among nestmates for tasks such as communal nest construction and defense, brood rearing, social hygienic behavior, and collective foraging (Hölldobler and Wilson 1990). It has been suggested that the evolution of social interactions may be traced to molecular changes affecting nervous system development and function. In particular, it may translate into increased rates of positive selection on nervous system-related genes, as documented in primitively social lineages of bees, which evolved social behaviors independently from ants (Fischman et al. 2011; Woodard et al. 2011). Complex collective behaviors also require efficient communication systems that are essentially mediated by chemical signaling in social insects. Ants distinguish nestmates from non-nestmates, as well as from ants of other species, through their scent. Individuals also use various types of pheromones as alarm signals and to mark their trails and territories. It has therefore been suggested that genes involved in chemical signaling, notably pheromone production and perception, should experience increased positive selection in ants compared with solitary insects (Ingram et al. 2005; Robertson and Wanner 2006; Bonasio et al. 2010; Smith, Zimin, et al. 2011; Wurm et al. 2011; Kulmuni et al. 2013; Leboeuf et al. 2013). A manually curated data set of 873 olfactory receptor genes (ORs) allowed us to conduct a test for increased positive selection on these genes in ants.
The second type of potential molecular adaptation relates to phenotypic plasticity among castes. Although queens and workers usually develop from totipotent eggs (Schwander et al. 2010), they display dramatic morphological and physiological differences. Queens are often larger, have wings, and have much more highly developed ovaries than workers, which are often sterile and lack a sperm storage organ (Hölldobler and Wilson 1990). In most species, the differences between castes result from developmental differences induced by environmental factors rather than genetic differences (Abouheif and Wray 2002; Schwander et al. 2010; Penick et al. 2012; Rajakumar et al. 2012). We therefore investigated whether there was evidence for increased positive selection in genes and pathways potentially involved in developmental plasticity (Smith et al. 2008; Fischman et al. 2011).
A third, particularly interesting type of ant-specific adaptation relates to the extremely long lifespan of ant queens, which can live more than 20 years in some species (Keller and Genoud 1997; Jemielity et al. 2005). This corresponds to a 100-fold increase in lifespan compared with solitary insects. The variation in lifespan among castes is also remarkable, with queens living up to 10 times longer than workers and 500 times longer than males. So far, a limited number of molecular candidates have been identified to explain this pattern, mainly inspired by work in Drosophila (Jemielity et al. 2005; Keller and Jemielity 2006). We therefore investigated whether there was evidence of positive selection on genes that have previously been associated with aging in model organisms. It is possible that positive selection acted on the same sets of genes in the bee lineage, where queens also live longer than other castes and than solitary insects, but such a signal should not be observed in short-lived species of the Drosophila lineage. To further assess the link between positive selection and aging, we investigated whether genes that experienced positive selection in ants were genes shown in D. melanogaster to be differentially expressed between old and young individuals, and between oxygen-stressed and control individuals (Landis et al. 2004).
Finally, we investigated whether there was a difference in the level of positive selection between genes showing biased expression in queens, workers, and males. The efficiency of natural selection acting on an advantageous mutation, and thus the probability of its long-term fixation, is proportional to its effect on fitness (Duret 2008). The fitness effects of mutations in genes that are expressed only in nonreproductive workers are indirect, so everything else being equal, selection should be less efficient at fixing them than mutations in genes expressed in queens and males. This could translate into lower levels of positive selection on genes expressed specifically in workers compared with males and queens (Linksvayer and Wade 2009; Hall and Goodisman 2012). We therefore analyzed previously published microarray data from the red fire ant Solenopsis invicta (Ometto et al. 2011), and compared the amount of positive selection between groups of genes varying in the level of caste-biased expression.
Results
Pervasive Positive Selection Detected in Ants
To detect positive selection episodes that acted on protein-coding genes during the evolution of the ant lineage, the branch-site test of Codeml was run on 4,261 protein alignments of single-copy orthologs composed of four to seven ant and three to five outgroup species (see Materials and Methods). All branches that led to ant species in each gene family tree (including 2 hymenopteran and 13 ant branches; fig. 1) were successively tested for the presence of episodic positive selection. As many as 1,832 single-copy ortholog families (43%) displayed a signal of positive selection (at 10% false discovery rate [FDR]) in at least one of the branches tested (supplementary table S1, Supplementary Material online). In 91% of the significant alignments, at least one residue targeted by positive selection could be identified with a posterior probability greater than 0.9 (Bayes Empirical Bayes test; fig. 2) (Yang et al. 2005). There was evidence for positive selection in at least one branch of the ant lineage for 830 (20%) of the genes analyzed. For 74% of them, positive selection was specific to ants and not observed in the basal hymenopteran branches #7 and #8 (fig. 1). The 10 gene families with the most significant test values in the ant lineage are given in table 1.
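As a rough check on how the log-likelihood differences (Δln L) and P-values reported in table 1 relate to each other, the sketch below converts Δln L into a P-value. It assumes that twice the log-likelihood difference is compared with a chi-square distribution with one degree of freedom; this is a common convention for such likelihood-ratio tests but is not stated in this section, and the function name `lrt_pvalue` is ours.

```python
import math


def lrt_pvalue(delta_lnl: float) -> float:
    """P-value of a likelihood-ratio test with 1 degree of freedom.

    delta_lnl is the log-likelihood gain of the alternative model over
    the null model. The statistic 2 * delta_lnl is compared with a
    chi-square distribution with 1 df (an assumption, see lead-in),
    whose survival function equals erfc(sqrt(x / 2)).
    """
    if delta_lnl <= 0:
        return 1.0  # no improvement over the null model
    statistic = 2.0 * delta_lnl
    return math.erfc(math.sqrt(statistic / 2.0))


# Top branch-site hit of table 1: gene family 150, delta lnL = 16.1
print(f"{lrt_pvalue(16.1):.1e}")
```

Under this assumption, the value obtained for gene family 150 agrees, after rounding, with the P-value of 1.4e-8 listed in table 1.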
Table 1.
Test Used | Gene Family | Branch | Δln L | P-value | FDR | dS | ω (proportion) | Drosophila melanogaster Gene Name | Function Annotated in Flybase and Uniprot | Duplicates in Ants | Uniprot ID | References |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Branch-site test | 150 | 6 | 16.1 | 1.4e-8 | 2.2e-6 | 0.93 | 281 (4.2%) | Tequila | Serine-type endopeptidase activity; long-term memory; aging | – | O45029 | Didelot et al. (2006), Chen et al. (2012), and Remolina et al. (2012) |
Branch-site test | 11650a | 5 | 12.1 | 8.4e-7 | 7.9e-5 | 0.059 | 299 (2.5%) | CG17321 | Unknown | – | Q9VJ40 | – |
Branch-site test | 453 | 6 | 11.4 | 1.9e-6 | 1.5e-4 | 0.29 | 44 (2.2%) | Guanine nucleotide exchange factor in mesoderm | Ral GTPase binding; imaginal disc-derived wing vein specification | – | A1ZBA1 | Blanke and Jackle (2006) |
Branch-site test | 361 | 1 | 10.9 | 3.0e-6 | 2.3e-4 | 0.090 | 1.2 (6.9%) | Megator | Spindle assembly | – | A1Z8P9 | Qi et al. (2004) |
Branch-site test | 5623 | 1 | 10.6 | 4.0e-6 | 3.0e-4 | 0.13 | 24 (4.6%) | Methylthioribose-1-phosphate isomerase | Catalyzes interconversion of methylthioribose-1-phosphate into methylthioribulose-1-phosphate; wing disc development | – | Q9V9X4 | Bronstein et al. (2010) |
Branch-site test | 1050 | 6 | 10.1 | 6.7e-6 | 4.7e-4 | 0.49 | 45 (3%) | Dis3 | Regulation of gene expression; nuclear RNA surveillance; neurogenesis | – | Q8MSY2 | Kuan et al. (2009), Kiss and Andrulis (2010), Neumuller et al. (2011) |
Branch-site test | 793 | 4 | 9.6 | 1.2e-5 | 7.6e-4 | 0.085 | ∞ (0.6%) | Embargoed | Protein binding; protein transporter activity; protein export from nucleus; multicellular organismal development; centriole replication | – | Q9TVM2 | Collier et al. (2000); Roth et al. (2003) |
Branch-site test | 8639 | 6 | 9.5 | 1.4e-5 | 8.3e-4 | 0.26 | ∞ (11%) | ATP synthase, subunit b, mitochondria | Hydrogen-exporting ATPase activity, phosphorylative mechanism; phagocytosis, engulfment | – | Q94516 | Stroschein-Stevenson et al. (2005) |
Branch-site test | 3983 | 4 | 8.9 | 2.4e-5 | 0.0014 | 0.036 | ∞ (0.5%) | Lysyl oxidase-like 2 | Protein-lysine 6-oxidase activity | – | Q8IH65 | Molnar et al. (2003), Molnar et al. (2005) |
Branch-site test | 2208 | 6 | 8.7 | 3.0e-5 | 0.0016 | 0.49 | ∞ (1.2%) | Cytochrome P450 reductase | NADPH-hemoprotein reductase activity; oxidation-reduction process; putative function in olfactory clearance | – | Q27597 | Hovemann et al. (1997) |
Site test | 3245 | – | 10.6 | 4.2e-6 | 0.0041 | – | 2.4 (2.1%) | CG6752 | Unknown | Yes | Q9VFC4, Q8SZS1 | – |
Site test | 6214 | – | 8.3 | 4.6e-5 | 0.038 | – | 8.3 (0.9%) | CG42343 | Unknown | No | B7Z153, Q9VRI6 | – |
Site test | 6649 | – | 8.0 | 6.6e-5 | 0.045 | – | 4.9 (4.7%) | CG7845 | Muscle cell homeostasis | No | Q7K4B2 | Kucherenko et al. (2011) |
Site test | 5707 | – | 7.9 | 7.2e-5 | 0.045 | – | 3.4 (2.1%) | Mitochondrial ribosomal protein L37 | Structural constituent of ribosome; translation | No | Q9VGW9, Q3YNF4, Q3YNF5 | Kim, Morrow, et al. (2010) |
Site test | 2372 | – | 7.8 | 7.6e-5 | 0.045 | – | 3.2 (1.9%) | Mitochondrial trifunctional protein α subunit | Long-chain-3-hydroxyacyl-CoA dehydrogenase activity; long-chain-enoyl-CoA hydratase activity; response to starvation; determination of adult lifespan; fatty acid beta-oxidation; wound healing | No | Q8IPE8, Q9V397 | Kishita, Tsuda, Aigaki (2012) |
Site test | 8490 | – | 7.4 | 1.2e-4 | 0.062 | – | 6.2 (2.0%) | Phosphatidylinositol synthase | CDP-diacylglycerol-inositol 3-phosphatidyltransferase activity; phototransduction | No | Q8IR29, Q8SX37 | Wang and Montell (2006) |
Site test | 3891 | – | 6.8 | 2.2e-4 | 0.11 | – | 9.4 (0.3%) | CG1607 | Potential amino acid transmembrane transporter activity | No | Q9V9Y0, Q95T33 | – |
Site test | 2074 | – | 6.5 | 3.1e-4 | 0.14 | – | 4.2 (4.0%) | CG9715 | Unknown | Yes | Q9VVA9, Q960D5 | – |
Site test | 1584 | – | 6.3 | 4.0e-4 | 0.17 | – | 2.3 (2.7%) | Unextended | Potential role in cellular ion homeostasis | No | A8Y516 | – |
Site test | 1053 | – | 6.2 | 4.2e-4 | 0.17 | – | 9.1 (0.2%) | Coat protein (coatomer) β | Biosynthetic protein transport from the ER, via the Golgi up to the trans Golgi network. Required for limiting lipid storage in lipid droplets. Involved in innate immune response | No | P45437 | Bard et al. (2006), Beller et al. (2008), Cronin et al. (2009) |
Note.–Gene families are ranked based on their log-likelihood ratios (Δln L). Results of the branch-site test were filtered to keep only internal ant branches of the phylogenetic tree (branches #1 to #6) with a dS on the tested branch below 1. Results of both tests were filtered to keep families with good support for the detection of sites evolving under positive selection (BEB posterior probability > 0.9). Manual inspection of the best hits confirmed that the signal of positive selection seemed genuine in all cases, except for family 12370 in the branch-site test results, which was removed from the list.
aExample used in figure 2.
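The FDR column of table 1 reflects a correction for the large number of gene family and branch combinations that were tested. This section does not say which FDR procedure was used; the sketch below illustrates one widely used option, the Benjamini-Hochberg adjustment, on made-up P-values. The function name and the toy values are hypothetical and serve only to show how a 10% FDR threshold operates.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values, in input order."""
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values
    # never increase as the raw P-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical P-values, not taken from the article.
toy_pvalues = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
adjusted = benjamini_hochberg(toy_pvalues)
significant = [p for p, q in zip(toy_pvalues, adjusted) if q <= 0.10]
print(len(significant), "of", len(toy_pvalues), "tests pass a 10% FDR")
```

The adjusted value depends on the total number of tests, so the FDR values of table 1 cannot be recomputed from the 20 rows shown there alone.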
The proportion of positively selected genes varied significantly across the different branches tested (χ2 test, P < 1e-15; table 2), similarly to previous analyses with experimental and simulated data sets. This likely results at least in part from lower power of the branch-site test in shorter branches (Anisimova and Yang 2007; Kosiol et al. 2008; Studer et al. 2008; Fletcher and Yang 2010; George et al. 2011; Gharib and Robinson-Rechavi 2013). Consistent with this view, there was a significant correlation in our data set between the length of tested branches and the test score (log-likelihood ratio; Spearman correlation ρ = 0.41, P < 1e-15). Additional analyses ruled out the hypothesis that false positives caused by convergence problems of the test, selective constraints acting on synonymous sites, saturation of synonymous substitution rate dS, or sequencing errors could be responsible for this pattern (supplementary text, Supplementary Material online).
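The relationship between branch length and test score described above is a rank correlation. The following minimal sketch computes Spearman's ρ on a handful of invented branch lengths and log-likelihood ratios; the numbers are hypothetical and the helper names are ours, so it only illustrates the statistic, not the article's data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical branch lengths and likelihood-ratio scores.
branch_lengths = [0.02, 0.05, 0.11, 0.30, 0.45]
lrt_scores = [0.0, 1.2, 0.4, 3.9, 8.1]
rho = spearman_rho(branch_lengths, lrt_scores)
print(round(rho, 2))
```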
Table 2.
Branch Namea | Lineage Delineated | Fraction of Positively Selected Gene Families | Number of Positively Selected Gene Familiesb |
---|---|---|---|
Acep | Atta cephalotes | 0.056 | 144 |
Aech | Acromyrmex echinatior | 0.043 | 109 |
Sinv | Solenopsis invicta | 0.029 | 85 |
Pbar | Pogonomyrmex barbatus | 0.038 | 80 |
Cflo | Camponotus floridae | 0.017 | 65 |
Lhum | Linepithema humile | 0.036 | 97 |
Hsal | Harpegnathos saltator | 0.020 | 76 |
1 | Attini | 0.0088 | 16 |
2 | Myrmicinae | 0.0071 | 10 |
3 | Myrmicinae | 0.0087 | 16 |
4 | Formicoid | 0.0072 | 17 |
5 | Formicoid | 0.025 | 58 |
6 | Formicidae | 0.030 | 87 |
7 | Aculeata | 0.10 | 176 |
8 | Hymenoptera/Apocrita | 0.39 | 762 |
All above branches except 7 and 8 | Formicidae | 0.20 | 830 |
All above branches | Hymenoptera/Apocrita | 0.43 | 1,832 |
aAs illustrated in figure 1.
bBranches of gene family trees can be merged if genes are missing (or removed for quality reasons), in which case the resulting branches do not correspond to the canonical branches defined by the species topology (fig. 1). When positive selection was found on such branches, it was not counted in the branch-specific numbers displayed in table 2, but it was counted when a whole lineage was considered (e.g., Hymenoptera).
Taken together, these results demonstrate that positive selection was common in the evolution of ant genes. The proportion of significant genes was similar in magnitude in the outgroup data set of 10 bees analyzed with the same methodology (20%; supplementary table S2, Supplementary Material online), but even higher in the outgroup data set of 12 flies (36%; supplementary table S3, Supplementary Material online). This difference might reflect biological differences between the lineages, such as effective population size (Ne), but also differences in the topology and branch lengths of the species trees, which influence the power to detect positive selection events in protein alignments (see supplementary text, Supplementary Material online).
To compare the amount of positive selection experienced by different functional categories of genes, we classified genes based on their Gene Ontology (GO) annotation in D. melanogaster orthologs, and performed a gene set enrichment test using for each gene family a score reflecting the overall occurrence of positive selection in the ant lineage (Materials and Methods; supplementary text, Supplementary Material online). Such an approach of grouping genes enables a more sensitive search for positive selection, while buffering the impact of potential false positives (e.g., from remaining alignment errors or GC-biased gene conversion events which are difficult to distinguish from real positive selection signals; see supplementary text and supplementary tables S20–S24, Supplementary Material online). Twenty-four functional categories of genes were significantly enriched for positively selected genes in the ant lineage (at 20% FDR; table 3). A large number of them (11 out of 24) were related to mitochondria and mitochondrial activity. The other significant categories were related to nervous system development, behavior, immunity, protein translation and degradation, cell membrane, and receptor activity. Thus, positive selection apparently targeted a diverse array of gene functions during the evolution of the ant lineage.
Table 3.
SetID | Ontology | SetName | SetSize | Score | P-value | FDR |
---|---|---|---|---|---|---|
GO:0000313 | CC | Organellar ribosome | 59 | 26.8 | 1.4e-10 | 0 |
GO:0006120 | BP | Mitochondrial electron transport, NADH to ubiquinone | 18 | 10.6 | 1.1e-9 | 0 |
GO:0005759 | CC | Mitochondrial matrix | 98 | 39.7 | 1.6e-9 | 0 |
GO:0005762 | CC | Mitochondrial large ribosomal subunit | 36 | 16.8 | 1.1e-7 | 0.0025 |
GO:0005746 | CC | Mitochondrial respiratory chain | 31 | 14.6 | 4.5e-7 | 0.0033 |
GO:0005747 | CC | Mitochondrial respiratory chain complex I | 22 | 11.0 | 1.3e-6 | 0.0033 |
GO:0008137 | MF | NADH dehydrogenase (ubiquinone) activity | 16 | 8.0 | 3.2e-5 | 0.013 |
GO:0005763 | CC | Mitochondrial small ribosomal subunit | 25 | 10.9 | 0.00018 | 0.047 |
GO:0008038 | BP | Neuron recognition | 19 | 8.7 | 0.00023 | 0.047 |
GO:0008344 | BP | Adult locomotory behavior | 19 | 8.4 | 0.00082 | 0.086 |
GO:0042254 | BP | Ribosome biogenesis | 39 | 15.0 | 0.0011 | 0.099 |
GO:0003735 | MF | Structural constituent of ribosome | 107 | 36.4 | 0.0012 | 0.099 |
GO:0044459 | CC | Plasma membrane part | 129 | 42.9 | 0.0016 | 0.12 |
GO:0006508 | BP | Proteolysis | 145 | 47.4 | 0.0022 | 0.14 |
GO:0006412 | BP | Translation | 191 | 61.0 | 0.0025 | 0.15 |
GO:0016491 | MF | Oxidoreductase activity | 127 | 41.8 | 0.0028 | 0.15 |
GO:0004872 | MF | Receptor activity | 90 | 30.6 | 0.0028 | 0.15 |
GO:0055114 | BP | Oxidation-reduction process | 129 | 42.2 | 0.0038 | 0.16 |
GO:0008237 | MF | Metallopeptidase activity | 36 | 13.6 | 0.0039 | 0.16 |
GO:0061134 | MF | Peptidase regulator activity | 17 | 7.2 | 0.0046 | 0.18 |
GO:0002520 | BP | Immune system development | 26 | 10.2 | 0.0053 | 0.19 |
GO:0048534 | BP | Hemopoietic or lymphoid organ development | 26 | 10.2 | 0.0053 | 0.19 |
GO:0016616 | MF | Oxidoreductase activity, acting on the CH–OH group of donors, NAD, or NADP as acceptor | 18 | 7.5 | 0.0053 | 0.19 |
GO:0016836 | MF | Hydro-lyase activity | 14 | 6.1 | 0.0055 | 0.19 |
Note.—The enrichment test considers a combined score for all analyzed branches of the ant lineage (Materials and Methods). The full table of results is shown in supplementary table S6, Supplementary Material online.
Usual Targets of Positive Selection in Insects
To identify GO categories that experienced positive selection not only in ants but also in other insects, we reanalyzed the fly and bee data sets with the same methodology used for the ant data set. These analyses revealed 106 GO categories significantly enriched for flies and 38 for bees (tables 4 and 5; supplementary tables S4 and S5, Supplementary Material online). We investigated which categories were enriched for positively selected genes in the three lineages. The first group of genes commonly enriched in ants, flies, and bees was related to proteolysis. This group included 4 of the 24 significantly enriched GO categories in ants (“proteolysis,” “metallopeptidase activity,” “peptidase regulator activity,” and “hydro-lyase activity”), 8 of the 106 GO categories enriched in flies (“serine-type endopeptidase activity,” “endopeptidase activity,” “proteolysis,” “metalloendopeptidase activity,” “peptidase activity,” “peptidase activity, acting on L-amino acid peptides,” “metallopeptidase activity,” and “exopeptidase activity”), and 6 of the 38 GO categories enriched in bees (“amine metabolic process,” “metalloendopeptidase activity,” “metallopeptidase activity,” “signal transduction,” “cellular amine metabolic process,” and “cellular amino acid metabolic process”).
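The comparison across lineages amounts to intersecting the lists of enriched GO categories. The sketch below does this for a small subset of identifiers copied from tables 3, 4, and 5 (plus metallopeptidase activity in bees, which is named in the text above); because only a few rows are included, the result illustrates the operation and is not the complete overlap between the three lineages.

```python
# A few enriched categories per lineage, copied from tables 3 to 5.
ants = {
    "GO:0006508": "Proteolysis",
    "GO:0008237": "Metallopeptidase activity",
    "GO:0004872": "Receptor activity",
    "GO:0016491": "Oxidoreductase activity",
    "GO:0008038": "Neuron recognition",
}
flies = {
    "GO:0006508": "Proteolysis",
    "GO:0008237": "Metallopeptidase activity",
    "GO:0004872": "Receptor activity",
    "GO:0016491": "Oxidoreductase activity",
    "GO:0006952": "Defense response",
    "GO:0005635": "Nuclear envelope",
}
bees = {
    "GO:0004872": "Receptor activity",
    "GO:0005635": "Nuclear envelope",
    "GO:0005099": "Ras GTPase activator activity",
    # Named in the text as enriched in bees.
    "GO:0008237": "Metallopeptidase activity",
}

# Categories enriched in all three lineages (within this subset).
shared = set(ants) & set(flies) & set(bees)
for go_id in sorted(shared):
    print(go_id, ants[go_id])
```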
Table 4.
SetID | Ontology | SetName | SetSize | Score | P-value | FDR |
---|---|---|---|---|---|---|
GO:0006030 | BP | Chitin metabolic process | 29 | 16.2 | 4.0e-6 | 0.0015 |
GO:0006022 | BP | Aminoglycan metabolic process | 36 | 19.4 | 5.7e-6 | 0.0018 |
GO:0006952 | BP | Defense response | 36 | 19.2 | 9.8e-6 | 0.0018 |
GO:0008061 | MF | Chitin binding | 24 | 13.5 | 2.0e-5 | 0.0020 |
GO:0004252 | MF | Serine-type endopeptidase activity | 52 | 26.0 | 3.3e-5 | 0.0023 |
GO:0008026 | MF | ATP-dependent helicase activity | 18 | 10.5 | 4.0e-5 | 0.0026 |
GO:0004872 | MF | Receptor activity | 13 | 7.9 | 8.5e-5 | 0.0048 |
GO:0006006 | BP | Glucose metabolic process | 13 | 7.8 | 0.00021 | 0.0082 |
GO:0046486 | BP | Glycerolipid metabolic process | 17 | 9.7 | 0.00023 | 0.0090 |
GO:0005819 | CC | Spindle | 20 | 10.9 | 0.00046 | 0.012 |
GO:0004175 | MF | Endopeptidase activity | 78 | 35.9 | 0.00048 | 0.012 |
GO:0009607 | BP | Response to biotic stimulus | 31 | 15.8 | 0.00060 | 0.013 |
GO:0051707 | BP | Response to other organism | 31 | 15.8 | 0.00060 | 0.013 |
GO:0006508 | BP | Proteolysis | 136 | 59.5 | 0.00071 | 0.014 |
GO:0006007 | BP | Glucose catabolic process | 12 | 7.0 | 0.00072 | 0.014 |
GO:0019320 | BP | Hexose catabolic process | 12 | 7.0 | 0.00072 | 0.014 |
GO:0030312 | CC | External encapsulating structure | 12 | 7.0 | 0.00074 | 0.014 |
GO:0015081 | MF | Sodium ion transmembrane transporter activity | 16 | 8.9 | 0.00088 | 0.016 |
GO:0051649 | BP | Establishment of localization in cell | 14 | 7.9 | 0.00093 | 0.017 |
GO:0007126 | BP | Meiosis | 34 | 16.9 | 0.00096 | 0.017 |
GO:0003824 | MF | Catalytic activity | 844 | 337.0 | 0.0011 | 0.018 |
GO:0046488 | BP | Phosphatidylinositol metabolic process | 11 | 6.5 | 0.0012 | 0.018 |
GO:0016490 | MF | Structural constituent of peritrophic membrane | 11 | 6.4 | 0.0013 | 0.018 |
GO:0005975 | BP | Carbohydrate metabolic process | 64 | 29.5 | 0.0015 | 0.019 |
GO:0004888 | MF | Transmembrane receptor activity | 49 | 23.2 | 0.0016 | 0.020 |
GO:0051276 | BP | Chromosome organization | 59 | 27.2 | 0.0020 | 0.024 |
GO:0008270 | MF | Zinc ion binding | 173 | 73.4 | 0.0027 | 0.029 |
GO:0002376 | BP | Immune system process | 43 | 20.4 | 0.0027 | 0.029 |
GO:0002759 | BP | Regulation of antimicrobial humoral response | 11 | 6.3 | 0.0028 | 0.029 |
GO:0004984 | MF | Olfactory receptor activity | 19 | 9.9 | 0.0030 | 0.029 |
GO:0002697 | BP | Regulation of immune effector process | 11 | 6.2 | 0.0038 | 0.035 |
GO:0000819 | BP | Sister chromatid segregation | 11 | 6.2 | 0.0038 | 0.035 |
GO:0007143 | BP | Female meiosis | 11 | 6.2 | 0.0047 | 0.040 |
GO:0016021 | CC | Integral to membrane | 224 | 93.1 | 0.0047 | 0.040 |
GO:0031347 | BP | Regulation of defense response | 12 | 6.6 | 0.0055 | 0.044 |
GO:0015370 | MF | Solute:sodium symporter activity | 12 | 6.6 | 0.0061 | 0.047 |
GO:0000272 | BP | Polysaccharide catabolic process | 11 | 6.1 | 0.0065 | 0.049 |
GO:0016810 | MF | Hydrolase activity, acting on carbon–nitrogen (but not peptide) bonds | 28 | 13.6 | 0.0065 | 0.049 |
GO:0004521 | MF | Endoribonuclease activity | 11 | 6.1 | 0.0069 | 0.052 |
GO:0007291 | BP | Sperm individualization | 14 | 7.4 | 0.0074 | 0.053 |
GO:0010564 | BP | Regulation of cell cycle process | 30 | 14.4 | 0.0075 | 0.053 |
GO:0005635 | CC | Nuclear envelope | 17 | 8.7 | 0.0077 | 0.054 |
GO:0016773 | MF | Phosphotransferase activity, alcohol group as acceptor | 82 | 36.0 | 0.0077 | 0.054 |
GO:0051253 | BP | Negative regulation of RNA metabolic process | 35 | 16.5 | 0.0080 | 0.055 |
GO:0007608 | BP | Sensory perception of smell | 21 | 10.5 | 0.0080 | 0.055 |
GO:0004222 | MF | Metalloendopeptidase activity | 26 | 12.7 | 0.0082 | 0.055 |
GO:0006807 | BP | Nitrogen compound metabolic process | 298 | 121.5 | 0.0090 | 0.059 |
GO:0005576 | CC | Extracellular region | 97 | 41.9 | 0.010 | 0.066 |
GO:0006814 | BP | Sodium ion transport | 19 | 9.5 | 0.011 | 0.068 |
GO:0045132 | BP | Meiotic chromosome segregation | 11 | 5.9 | 0.011 | 0.070 |
GO:0034641 | BP | Cellular nitrogen compound metabolic process | 296 | 120.4 | 0.012 | 0.072 |
GO:0010629 | BP | Negative regulation of gene expression | 42 | 19.2 | 0.013 | 0.073 |
GO:0090304 | BP | Nucleic acid metabolic process | 162 | 67.6 | 0.013 | 0.075 |
GO:0016301 | MF | Kinase activity | 91 | 39.2 | 0.013 | 0.075 |
GO:0048584 | BP | Positive regulation of response to stimulus | 11 | 5.9 | 0.015 | 0.084 |
GO:0016798 | MF | Hydrolase activity, acting on glycosyl bonds | 26 | 12.4 | 0.017 | 0.086 |
GO:0006139 | BP | Nucleobase, nucleoside, nucleotide, and nucleic acid metabolic process | 236 | 96.4 | 0.017 | 0.086 |
GO:0016491 | MF | Oxidoreductase activity | 201 | 82.6 | 0.018 | 0.090 |
GO:0009987 | BP | Cellular process | 790 | 310.5 | 0.019 | 0.095 |
GO:0007088 | BP | Regulation of mitosis | 16 | 8.0 | 0.019 | 0.095 |
GO:0051783 | BP | Regulation of nuclear division | 16 | 8.0 | 0.019 | 0.095 |
GO:0006810 | BP | Transport | 200 | 82.1 | 0.021 | 0.10 |
GO:0051234 | BP | Establishment of localization | 197 | 80.9 | 0.021 | 0.10 |
GO:0006066 | BP | Alcohol metabolic process | 35 | 16.1 | 0.021 | 0.10 |
GO:0004553 | MF | Hydrolase activity, hydrolyzing O-glycosyl compounds | 22 | 10.6 | 0.022 | 0.11 |
GO:0008233 | MF | Peptidase activity | 26 | 12.3 | 0.022 | 0.11 |
GO:0070011 | MF | Peptidase activity, acting on L-amino acid peptides | 22 | 10.6 | 0.022 | 0.11 |
GO:0046914 | MF | Transition metal ion binding | 58 | 25.5 | 0.022 | 0.11 |
GO:0050660 | MF | Flavin adenine dinucleotide binding | 16 | 8.0 | 0.024 | 0.11 |
GO:0045892 | BP | Negative regulation of transcription, DNA-dependent | 26 | 12.2 | 0.024 | 0.11 |
GO:0032553 | MF | Ribonucleotide binding | 161 | 66.5 | 0.024 | 0.11 |
GO:0032555 | MF | Purine ribonucleotide binding | 161 | 66.5 | 0.024 | 0.11 |
GO:0035639 | MF | Purine ribonucleoside triphosphate binding | 161 | 66.5 | 0.024 | 0.11 |
GO:0006396 | BP | RNA processing | 36 | 16.4 | 0.025 | 0.11 |
GO:0031226 | CC | Intrinsic to plasma membrane | 30 | 13.9 | 0.026 | 0.11 |
GO:0035222 | BP | Wing disc pattern formation | 11 | 5.7 | 0.026 | 0.12 |
GO:0007346 | BP | Regulation of mitotic cell cycle | 31 | 14.3 | 0.028 | 0.12 |
GO:0045017 | BP | Glycerolipid biosynthetic process | 11 | 5.7 | 0.030 | 0.13 |
GO:0006955 | BP | Immune response | 30 | 13.8 | 0.030 | 0.13 |
GO:0044262 | BP | Cellular carbohydrate metabolic process | 44 | 19.6 | 0.030 | 0.13 |
GO:0017076 | MF | Purine nucleotide binding | 164 | 67.5 | 0.032 | 0.14 |
GO:0016705 | MF | Oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen | 21 | 10.0 | 0.032 | 0.14 |
GO:0005524 | MF | ATP binding | 163 | 67.0 | 0.034 | 0.14 |
GO:0030554 | MF | Adenyl nucleotide binding | 163 | 67.0 | 0.034 | 0.14 |
GO:0032559 | MF | Adenyl ribonucleotide binding | 163 | 67.0 | 0.034 | 0.14 |
GO:0008237 | MF | Metallopeptidase activity | 12 | 6.1 | 0.034 | 0.14 |
GO:0007127 | BP | Meiosis I | 18 | 8.7 | 0.035 | 0.14 |
GO:0019730 | BP | Antimicrobial humoral response | 14 | 6.9 | 0.039 | 0.15 |
GO:0005815 | CC | Microtubule organizing center | 16 | 7.8 | 0.041 | 0.16 |
GO:0055114 | BP | Oxidation–reduction process | 167 | 68.3 | 0.043 | 0.17 |
GO:0019899 | MF | Enzyme binding | 14 | 6.9 | 0.044 | 0.17 |
GO:0048232 | BP | Male gamete generation | 45 | 19.8 | 0.045 | 0.17 |
GO:0008033 | BP | tRNA processing | 17 | 8.2 | 0.045 | 0.17 |
GO:0005887 | CC | Integral to plasma membrane | 29 | 13.2 | 0.046 | 0.17 |
GO:0044281 | BP | Small molecule metabolic process | 212 | 85.8 | 0.046 | 0.17 |
GO:0008238 | MF | Exopeptidase activity | 18 | 8.6 | 0.046 | 0.17 |
GO:0051179 | BP | Localization | 236 | 95.1 | 0.047 | 0.17 |
GO:0007283 | BP | Spermatogenesis | 44 | 19.3 | 0.047 | 0.18 |
GO:0050662 | MF | Coenzyme binding | 48 | 21.0 | 0.048 | 0.18 |
GO:0034470 | BP | ncRNA processing | 27 | 12.3 | 0.049 | 0.18 |
GO:0048515 | BP | Spermatid differentiation | 24 | 11.1 | 0.050 | 0.18 |
GO:0045786 | BP | Negative regulation of cell cycle | 11 | 5.5 | 0.050 | 0.18 |
GO:0045934 | BP | Negative regulation of nucleobase, nucleoside, nucleotide, and nucleic acid metabolic process | 41 | 18.1 | 0.052 | 0.18 |
GO:0010639 | BP | Negative regulation of organelle organization | 12 | 5.9 | 0.055 | 0.19 |
GO:0015631 | MF | Tubulin binding | 12 | 5.9 | 0.056 | 0.19 |
GO:0005549 | MF | Odorant binding | 43 | 18.8 | 0.058 | 0.20 |
Note.—Depletion results are shown in supplementary table S4, Supplementary Material online.
Table 5.
SetID | Ontology | SetName | SetSize | Score | P-value | FDR |
---|---|---|---|---|---|---|
GO:0005099 | MF | Ras GTPase activator activity | 11 | 6.0 | 1.1e-5 | 0.03 |
GO:0005083 | MF | Small GTPase regulator activity | 18 | 8.1 | 0.00010 | 0.041 |
GO:0004872 | MF | Receptor activity | 16 | 7.2 | 0.00020 | 0.041 |
GO:0022836 | MF | Gated channel activity | 11 | 5.4 | 0.00021 | 0.041 |
GO:0006399 | BP | tRNA metabolic process | 26 | 10.3 | 0.00040 | 0.053 |
GO:0071842 | BP | Cellular component organization at cellular level | 190 | 55.5 | 0.0013 | 0.065 |
GO:0006418 | BP | tRNA aminoacylation for protein translation | 19 | 7.6 | 0.0021 | 0.084 |
GO:0009725 | BP | Response to hormone stimulus | 11 | 4.9 | 0.0022 | 0.084 |
GO:0005635 | CC | Nuclear envelope | 13 | 5.6 | 0.0022 | 0.084 |
GO:0032507 | BP | Maintenance of protein location in cell | 12 | 5.3 | 0.0023 | 0.084 |
GO:0051336 | BP | Regulation of hydrolase activity | 20 | 7.9 | 0.0026 | 0.089 |
GO:0006629 | BP | Lipid metabolic process | 51 | 17.0 | 0.0031 | 0.095 |
GO:0031072 | MF | Heat shock protein binding | 17 | 6.7 | 0.0046 | 0.11 |
GO:0008152 | BP | Metabolic process | 335 | 91.3 | 0.0070 | 0.14 |
GO:0004812 | MF | Aminoacyl-tRNA ligase activity | 19 | 7.2 | 0.0075 | 0.14 |
GO:0016740 | MF | Transferase activity | 211 | 59.1 | 0.0087 | 0.14 |
GO:0019899 | MF | Enzyme binding | 19 | 7.1 | 0.0097 | 0.14 |
GO:0005216 | MF | Ion channel activity | 14 | 5.5 | 0.011 | 0.14 |
GO:0022838 | MF | Substrate-specific channel activity | 14 | 5.5 | 0.011 | 0.14 |
GO:0009308 | BP | Amine metabolic process | 47 | 15.2 | 0.011 | 0.14 |
GO:0004222 | MF | Metalloendopeptidase activity | 15 | 5.8 | 0.012 | 0.14 |
GO:0005938 | CC | Cell cortex | 15 | 5.8 | 0.012 | 0.14 |
GO:0008237 | MF | Metallopeptidase activity | 23 | 8.2 | 0.012 | 0.14 |
GO:0007275 | BP | Multicellular organismal development | 274 | 74.9 | 0.013 | 0.14 |
GO:0007165 | BP | Signal transduction | 97 | 28.8 | 0.013 | 0.14 |
GO:0044106 | BP | Cellular amine metabolic process | 40 | 12.9 | 0.019 | 0.19 |
GO:0044459 | CC | Plasma membrane part | 45 | 14.3 | 0.021 | 0.19 |
GO:0006520 | BP | Cellular amino acid metabolic process | 32 | 10.6 | 0.021 | 0.19 |
GO:0032879 | BP | Regulation of localization | 36 | 11.7 | 0.022 | 0.19 |
GO:0006140 | BP | Regulation of nucleotide metabolic process | 13 | 5.0 | 0.022 | 0.19 |
GO:0030811 | BP | Regulation of nucleotide catabolic process | 13 | 5.0 | 0.022 | 0.19 |
GO:0033121 | BP | Regulation of purine nucleotide catabolic process | 13 | 5.0 | 0.022 | 0.19 |
GO:0033124 | BP | Regulation of GTP catabolic process | 13 | 5.0 | 0.022 | 0.19 |
GO:0043087 | BP | Regulation of GTPase activity | 13 | 5.0 | 0.022 | 0.19 |
GO:0006793 | BP | Phosphorus metabolic process | 97 | 28.3 | 0.023 | 0.19 |
GO:0006796 | BP | Phosphate metabolic process | 97 | 28.3 | 0.023 | 0.19 |
GO:0042578 | MF | Phosphoric ester hydrolase activity | 42 | 13.4 | 0.024 | 0.19 |
GO:0016758 | MF | Transferase activity, transferring hexosyl groups | 26 | 8.8 | 0.024 | 0.19 |
Note.—Depletion results are shown in supplementary table S5, Supplementary Material online.
The second group of genes enriched for positive selection signal in ants, flies, and bees was involved in response to stimuli. There was an enrichment of the GO category “receptor activity” in the three lineages as well as the GO categories “transmembrane receptor activity” and “olfactory receptor activity” in flies. This class of genes plays a pivotal role in the interactions between individuals and their environment. In addition, the GO categories “response to biotic stimulus” and “response to other organism” were enriched in flies, and the GO category “response to hormone stimulus” was enriched in bees. In ants, “response to ecdysone” and “response to steroid hormone stimulus” were marginally significant (FDR = 21%; supplementary table S6, Supplementary Material online).
Some functions were enriched for positively selected genes in only two of the three lineages. These included GO categories related to immunity that were enriched in ants and flies, and some categories related to metabolism which were enriched in flies and bees. Evidence for positive selection on immunity-related functions in ants came from a significant enrichment of the GO categories “immune system development” and “hemopoietic or lymphoid organ development”; in insects, this organ produces the cells mediating the immune response during larval development (Corley and Lavine 2006). Seven GO categories related to immunity were also enriched in flies (“defense response,” “immune system process,” “regulation of antimicrobial humoral response,” “regulation of immune effector process,” “regulation of defense response,” “immune response,” and “antimicrobial humoral response”). The absence of significant enrichment for related categories in bees might reflect a lack of power of the gene set enrichment test, because the set of immunity genes is small in the honeybee (Honeybee Genome Sequencing Consortium 2006) and the data set analyzed was further depleted in genes with immunity-related functions (supplementary text and table S7, Supplementary Material online). Consistent with this interpretation, there was a nonsignificant trend toward enrichment for 5 of the 7 tested GO categories related to immunity in bees (data not shown), suggesting that immunity might be a common target of positive selection in insects.
The second set of GO categories enriched in two of the three insect lineages included various metabolic processes and their regulators, with metabolism of chitin, aminoglycan, carbohydrate, polysaccharide, glucose, hexose, glycerolipid, and phosphatidylinositol being enriched in flies and metabolism of lipid, amino-acid, nucleotide, and phosphorus being enriched in bees. There was no significant enrichment for GO categories related to metabolism in ants, but some categories were close to significance (e.g., “chitin metabolic process” and “rRNA metabolic process,” with FDR = 21% and 24%, respectively; supplementary table S6, Supplementary Material online). Metabolic functions, such as amino-acid, fatty acid, lipid, or RNA metabolism, were significantly enriched in ants when we used KEGG pathways annotation instead of the GO to perform the gene set enrichment test, as well as when the single-copy orthologs data set was reanalyzed with another multiple alignment method and a different quality filtering method (supplementary tables S25 and S26 and supplementary text, Supplementary Material online). It thus seems that metabolism is a common target of positive selection in insects.
Social Behaviors
Two of the GO categories enriched in ants (“neuron recognition” and “adult locomotory behavior”) might be linked to the evolution of neural systems and behavior (table 3). The first category was “neuron recognition.” However, GO categories related to neural systems were also enriched in a nonsocial Hymenoptera lineage (“regulation of synaptogenesis,” “mushroom body development,” and “memory” on branch #8, basal to the Hymenoptera; fig. 1 and supplementary table S9, Supplementary Material online), and in the branches leading to primitively social lineages of bees (“synapse,” “synapse organization,” “regulation of synaptic growth at neuromuscular junction”; data not shown) (also reported in Woodard et al. 2011), suggesting that positive selection on neural system genes in ants might not be directly associated with the emergence of social behaviors.
The second GO category enriched in ants was “adult locomotory behavior.” This category was not enriched in any of the other tested lineages. The three genes contributing most to the positive selection signal in this GO category were DCX-EMAP, turtle, and beethoven. Mutational analyses of these genes in Drosophila suggest that they play an important role in sensory perception functions. Adult flies carrying a piggyBac insertion in DCX-EMAP are uncoordinated and deaf and display loss of mechanosensory transduction and amplification (Bechstedt et al. 2010). Turtle plays an essential role in the execution of coordinated motor output in complex behaviors in flies, notably regarding the response to tactile stimulation (Bodily et al. 2001). Finally, beethoven is involved in male courtship behavior, adult walking behavior, and sensory perception of sound in flies (Tauber and Eberl 2001). This suggests that positive selection might have been important for the evolution of sensory perception functions in ants.
A specific analysis of ORs did not provide support for the evolution of sociality being associated with increased levels of positive selection on ORs. A scan for positive selection across branches of a tree gathering 873 manually annotated ORs from two ants (Pogonomyrmex barbatus and Linepithema humile) and the solitary wasp Nasonia vitripennis (see Materials and Methods) revealed that positive selection was pervasive, with 277 branches (23%) displaying significant signals for positive selection (fig. 3 and supplementary fig. S1, Supplementary Material online). However, positive selection was detected in only 19% of the 929 branches leading to ant species, whereas as many as 40% of the 156 branches leading to wasps were under positive selection (Fisher's exact test P = 7.6e-9).
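The ant versus wasp comparison above can be illustrated with a minimal sketch of a two-sided Fisher's exact test. The branch counts used here are assumptions back-calculated from the rounded percentages in the text (about 19% of 929 ant branches and 40% of 156 wasp branches), not the exact counts of the study, so the resulting P value should only be of the same order of magnitude as the reported one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )


# Assumed counts (rounded from the percentages in the text, hypothetical):
# ant branches: 177 with positive selection, 752 without
# wasp branches: 62 with positive selection, 94 without
p_value = fisher_exact_two_sided(177, 752, 62, 94)
print(f"P = {p_value:.2g}")
```

The function enumerates every 2x2 table with the same margins, which is practical for counts of this size.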
Phenotypic Plasticity among Castes
None of the GO categories enriched for positively selected genes in ants could be linked to phenotypic plasticity (i.e., caste differences). In particular, there was no evidence of a significant enrichment for GO categories related to morphology or morphogenesis in the ant lineage. Another enrichment test using annotations obtained from mutant phenotypes in D. melanogaster (which are more relevant than GO annotations for analyzing genes involved in morphogenesis, since gene sets mostly refer to anatomical structures) also provided no clear support for positive selection on genes associated with phenotypic plasticity in ants (supplementary table S12, Supplementary Material online).
However, among the genes with the highest support for positive selection in ants (table 1), two genes had a role in wing development (Guanine nucleotide exchange factor in mesoderm and Methylthioribose-1-phosphate isomerase) and one in larval development (Dis3), suggesting that even if positive selection did not act consistently on large sets of genes related to morphogenesis, it could have acted specifically on a few individual genes.
Mitochondrial Genes
Eleven GO categories enriched for positively selected genes in ants were related to mitochondrial activity (e.g., “mitochondrial electron transport,” “mitochondrial matrix,” “mitochondrial respiratory chain,” “NADH dehydrogenase ubiquinone activity,” and “oxidoreductase activity”; table 3). The mitochondrial processes under positive selection were not restricted to respiration and energy production, but also included translation (“organellar ribosome,” “mitochondrial small/large ribosomal subunit”). GO categories related to mitochondria were also enriched for positively selected genes on many individual branches of the ant lineage analyzed separately (supplementary table S9, Supplementary Material online), and in a larger data set including duplicated genes analyzed with the site test (see Materials and Methods; table 1, supplementary tables S13 and S19, Supplementary Material online). This indicates that recurrent events of positive selection occurred on genes with mitochondrial functions during the evolution of the ant lineage. In contrast, mitochondria-related GO categories did not display any enrichment for positively selected genes in flies and bees (tables 4 and 5), despite a high power to detect it on the respective data sets (supplementary text, Supplementary Material online). Similarly, no mitochondrial function was significantly enriched in the branches #7 and #8, basal to the ant lineage (fig. 1 and supplementary table S9, Supplementary Material online), reinforcing the idea that increased positive selection on mitochondria is restricted to the ant lineage.
Of note, none of the 13 protein coding genes (Oliveira et al. 2008; Gotzek et al. 2010) from the mitochondrial genome was included in our main data set because the mitochondrial genomes of most of the ant species analyzed were not annotated. Our results thus reflect positive selection on nuclear genes encoding proteins that function in the mitochondrion. We annotated mitochondrial genomes in 5 of the 7 ant species analyzed and tested whether positive selection could also be detected on the mitochondrial genomes themselves (Materials and Methods) (Gerber et al. 2001; Bazin et al. 2006; Meiklejohn et al. 2007). However, we did not find evidence for positive selection on these alignments, either with the branch-site test (supplementary table S15, Supplementary Material online) or with the site test (supplementary table S14, Supplementary Material online).
Lifespan Genes
There was a significant enrichment for positively selected genes in the ant orthologs of D. melanogaster genes that were downregulated in 61-day-old flies compared with 10-day-old flies, based on a published microarray analysis (P = 0.011; below Bonferroni threshold α = 0.05/4 = 0.0125; supplementary table S16, Supplementary Material online) (Landis et al. 2004).
Two other genes known to be involved in aging were among the top-scoring genes for positive selection in our data set. The first was Tequila, which has been shown to be associated with aging in an experimental evolution study in D. melanogaster (Remolina et al. 2012). The other was mitochondrial trifunctional protein α subunit, whose knock-out also reduces lifespan in D. melanogaster (Kishita et al. 2012). Although not in the list of top hits, Sod2 (superoxide dismutase [Mn], mitochondrial), a gene known to have antioxidant activity and whose overexpression has been shown to be associated with increased lifespan in some strains of D. melanogaster (Mockett et al. 1999; Curtis et al. 2007), underwent positive selection at the base of the Hymenoptera lineage (FDR = 0.0073) and in the Acromyrmex echinatior branch (FDR = 9.6e-8).
Selective Pressure on Genes with Caste-Biased Expression
There was a marginally significant enrichment for positively selected genes among genes with biased expression in adult workers in S. invicta (effect size = 1.2, P = 0.025; not significant after Bonferroni correction α = 0.05/6 = 0.0083; supplementary table S17, Supplementary Material online) and a stronger enrichment for genes with queen-biased expression in adults (effect size = 1.8, P = 0.0028). Surprisingly, however, there was a pattern of weaker enrichment for genes with male-biased expression in adults (effect size = 1.04, P = 0.2381). At the pupal stage, we did not detect a significant enrichment for positively selected genes among any group of genes showing caste-biased expression. But similarly to the adult stage, the enrichment effect size was higher for genes with queen-biased expression (effect size = 1.2) than for genes with worker-biased expression (effect size = 1.1), and it was the lowest for genes showing male-biased expression (effect size = 1.06).
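The Bonferroni arithmetic used for the caste-biased gene sets can be written out as a short sketch. The P values and the number of tests (six) are taken from the text; the function and variable names are illustrative only.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# P values reported in the text for the adult stage.
adult_p_values = {"worker": 0.025, "queen": 0.0028, "male": 0.2381}

threshold = bonferroni_threshold(0.05, 6)
significant = {caste: p < threshold for caste, p in adult_p_values.items()}

print(f"threshold = {threshold:.4f}")
for caste, is_significant in significant.items():
    print(caste, "significant" if is_significant else "not significant")
```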
Discussion
In this article, we report results from a genome-wide scan for positive selection in protein-coding sequences of seven ant genomes, using the rigorous branch-site model of Codeml (Zhang et al. 2005) with stringent data quality control. Positive selection was detected in the ant lineage for 20% of the gene families analyzed. This proportion is similar in magnitude to the values observed in the other two insect lineages that we reanalyzed in this study: 20% in the 10 bee species and 36% in the 12 fly species.
Our analysis identified similarities in patterns of positive selection between the ants and other insect lineages. Notably, at the broadest phylogenetic scale that our data sets allowed us to study, functional categories related to proteolysis, metabolism, response to stimuli, and immunity were enriched for positively selected genes in ants, bees, and flies. Interestingly, studies in mammals, fishes, and urchins also provided evidence for positive selection on similar functional categories (Kosiol et al. 2008; Studer et al. 2008; Oliver et al. 2010; Montoya-Burgos 2011). Recurrent positive selection on such long evolutionary time scales is typical of genes involved in the interaction with changing environments or in conflict and competition, such as evolutionary arms races between sexes or between different species, which cause the perpetuation of adaptations and counter-adaptations in competing sets of coevolving genes (Dawkins and Krebs 1979). It is notable that positive selection patterns on these categories of genes do not seem to reflect or be strongly affected by the large life-history differences between the lineages analyzed here, such as the evolution of eusociality in the hymenopteran lineages. In particular, our results on immunity-related genes challenge the hypothesis that hygienic behaviors in social insects could have relaxed the selective pressure on immune genes, since this should be reflected in reduced levels of positive selection on these genes (Honeybee Genome Sequencing Consortium 2006; Smith et al. 2008; Viljakainen et al. 2009; Werren et al. 2010; Fischman et al. 2011; Smith, Zimin, et al. 2011; Smith, Smith, et al. 2011; Suen et al. 2011; Harpur and Zayed 2013).
Our analysis indicated that genes involved in neurogenesis were under positive selection in ants and the primitively social lineages of bees. It was previously hypothesized that stronger selection on genes related to brain function and development should be observed in eusocial Hymenoptera species due to high cognitive demands associated with social life (Fischman et al. 2011). However, our results are not consistent with this prediction because we also uncovered signs of positive selection at the base of the Hymenoptera lineage, i.e., before the evolution of sociality. Interestingly, a similar pattern had previously been reported with brain morphological data. A comparative analysis of insects showed that the size of the mushroom bodies started to increase at the base of the Euhymenopteran (Orussioidea + Apocrita) lineage, approximately 90 My before the evolution of sociality in the Aculeata, and that there was no clear correlation between the size of brain components and the levels of sociality or cognitive capabilities (Farris and Schulmeister 2011; Lihoreau et al. 2012). To account for this observation, Fischman et al. (2011) tried to identify factors, other than sociality, that may have placed unique selective pressure on brain evolution in species of the Hymenoptera lineage. Based on the observation that there was less positive selection on neurogenesis genes in highly social bees than in primitively social bees, they proposed that cognitive challenges might be associated with the mode of colony founding in social Hymenoptera. In particular, primitively social bees, which transition from a solitary phase during the process of colony founding to a social phase, could experience higher cognitive needs than highly social bees, which never go through a solitary phase. However, our results are also inconsistent with this model since increased positive selection was observed before the evolution of sociality in Hymenoptera. A comprehensive survey of positive selection on neurogenesis genes in Hymenoptera species, including species basal to the lineage, is required to identify precisely when the selective regime of these genes started to change, and in which hymenopteran sublineages it was maintained.
Our results also challenge the hypothesis that genes involved in chemical signaling experienced increased positive selection in social insects (Ingram et al. 2005; Robertson and Wanner 2006; Bonasio et al. 2010; Smith, Zimin, et al. 2011; Wurm et al. 2011; Zhou et al. 2012; Leboeuf et al. 2013). The analysis of olfactory receptor repertoires in two ants and a nonsocial wasp indicates that positive selection on amino-acid substitutions was surprisingly less frequent in ant than in wasp branches. Given the limited number of species used in this analysis, future work should concentrate on generating extensive annotation of olfactory receptors from more Hymenoptera as well as outgroup species to identify characters or traits that could be associated with the pattern of positive selection on olfactory receptors.
Although our analyses did not provide support for previous hypotheses about the expected effect of sociality on gene evolution, we identified several interesting functional categories which were enriched for positively selected genes exclusively in the ant lineage, possibly reflecting ant-specific adaptations. The most consistent and robust result was that genes functioning in the mitochondria were particularly likely to be under positive selection. Mitochondrial activity plays an important role in the process of reproductive isolation and speciation (Lee et al. 2008; Burton and Barreto 2012), interactions with endosymbionts such as Wolbachia (Werren 1997), diseases (Cortopassi 2002; Richly et al. 2003; Trifunovic et al. 2004, 2005), and aging (Lenaz 1998; Cortopassi 2002; Kowald and Kirkwood 2011). In that respect it is notable that the evolution of sociality has been accompanied by a nearly 100-fold increase in lifespan of queens compared with their solitary ancestors (Keller and Genoud 1997; Jemielity et al. 2005). Three lines of evidence suggest that increased lifespan of queens might be related to increased positive selection on mitochondrial genes in the ant lineage.
First, lifespan extension, not only in insects but also in other lineages such as birds and bats, appears to be associated with decreased production of reactive oxygen species (ROS) (Perez-Campo et al. 1998; Brunet-Rossinni 2004; Parker et al. 2004; Corona et al. 2005; Jemielity et al. 2005). ROS are a normal by-product of cellular metabolism. In particular, one major contributor to oxidative damage is hydrogen peroxide (H2O2), which is produced from leaks of the respiratory chain in the mitochondria (Harman 1972; Lenaz 1998; Finkel and Holbrook 2000; Cui et al. 2012). Positive selection in ants on genes functioning in the mitochondria may thus reflect selection to increase mitochondrial efficiency and reduce ROS production. Interestingly, positive selection on genes with mitochondrial functions was previously documented in the bat lineage (Shen et al. 2010; Zhang et al. 2013), which includes species with exceptional longevity (Brunet-Rossinni and Austad 2004). In the bat Myotis lucifugus, ROS production was also shown to be significantly lower than in two similar-sized mammal species (a mouse and a shrew), although the metabolic rates, and thus mitochondrial activity, of the former were much higher because of flight demands (Brunet-Rossinni 2004).
Second, on the basis of gene expression data obtained in the fire ant S. invicta, our analyses revealed that positive selection was strongest on genes with queen-biased expression, intermediate on genes with worker-biased expression, and weakest on genes with male-biased expression. This association between levels of positive selection and caste-biased differences in gene expression cannot simply be accounted for by differences in expression levels of mitochondrial genes (which are enriched for positively selected genes in ants), since in S. invicta mitochondrial genes are significantly less expressed in queens than in workers at the larval stage, and not differentially expressed at the adult stage (supplementary fig. S2, Supplementary Material online). The finding of higher levels of positive selection for genes more highly expressed in the castes with the longer lifespan (queens can live decades in some species, whereas workers have lifespans on the order of months, and males on the order of days) suggests that increased positive selection on queen-specific genes could be related to longer lifespan.
Third, our analyses showed that the levels of positive selection were higher on orthologs of genes which are downregulated during aging in flies. These genes include numerous energy metabolism genes, and their downregulation in old flies is thought to reflect a decline of normal and functional mitochondria with age (Yui et al. 2003; Landis et al. 2004). The finding of increased levels of positive selection on genes whose expression declines at older ages suggests that the function of these genes might be improved in ants, potentially delaying the loss of normal activity in mitochondria with age. It would be interesting to test whether parallel mechanisms also evolved in the ant lineage to maintain the expression of these genes and delay the decline of mitochondrial activity throughout the lifespan of queens.
In contrast to ants, there was no evidence of elevated levels of positive selection on mitochondrial functions in bees. Like most social species, bees evolved longer queen lifespans (more than 2 years) compared with males and workers (a few weeks) (Keller and Genoud 1997; Munch et al. 2008). There are four possible explanations for the difference between ants and bees in the level of positive selection on mitochondrial genes. First, lifespan differences between castes are less pronounced in bees, where queens live up to 2–5 years, than in ants, where queens can live up to 30 years, possibly resulting in lower selective pressure to increase lifespan in bees than in ants. Second, because eusociality evolved independently in ants and bees it is possible that extended queen lifespans evolved by different molecular mechanisms (Jemielity et al. 2005; Jobson et al. 2010). For example, vitellogenin may play a more central role for aging in bees than ants (Amdam and Omholt 2002; Corona et al. 2007; Munch et al. 2008). Third, the evolution of mitochondria-related genes may have been differently constrained in ants and bees. For example, metabolic rates differ greatly between flying bee workers and non-flying ant workers because flight is an energetically costly behavior requiring highly elevated metabolic rates (Jensen and Holm-Jensen 1980; Suarez 2000; Niven and Scharlemann 2005). Because metabolism and mitochondrial activity are closely connected, lower metabolic rates in ants might have alleviated functional constraints on mitochondria-related genes, allowing selection to act on lifespan extension. Fourth, the GC content in bee genomes was shown to be lower than in ant genomes (Honeybee Genome Sequencing Consortium 2006; Simola et al. 2013). Some parts of the bee genomes, in particular their mitochondrial genomes (Crozier and Crozier 1993; Gotzek et al. 2010; Tan et al. 2011), display extreme bias in nucleotide composition, which has a significant effect on both the codon usage patterns and amino-acid composition of proteins and may have interfered with the action of positive selection.
If positive selection acted to optimize the functioning of mitochondria in ants, it could be expected that the mitochondrial genome itself should be targeted by positive selection. However, mitochondrial genes generally exhibit very low dN/dS ratios (Montooth and Rand 2008) and there was no clear evidence in our results for positive selection on the 13 genes of the mitochondrial genome itself. This suggests that innovations related to mitochondrial activity could arise more easily on nuclear genes, whereas mitochondrial genes seem more likely to maintain conserved core functionalities.
In conclusion, this study provides a detailed analysis of the extent of positive selection events on protein-coding genes in seven ant species. Because false positives are a major concern for whole-genome scans for positive selection, we used a conservative methodology. We also reanalyzed data in bees and flies with the same methods to permit an unbiased and robust comparison of positive selection between lineages. The comparison between these three lineages provided interesting perspectives on the evolution of genes implicated in immunity, neurogenesis, and olfaction, and allowed us to pinpoint positive selection events that were specific to the ant lineage. In particular, we found that the evolution of extreme lifespan in ants was associated with positive selection on genes with mitochondrial functions, suggesting that a more efficient functioning of mitochondrial genes might have been an important step toward the extreme lifespan extension that characterizes this lineage. It would be interesting to complement this study by scans for genes under lineage-specific strong or relaxed purifying selection, to get a more global picture of natural selection patterns in ant genomes, and uncover additional genes that could have played a significant role during the evolution of the ant lineage.
Materials and Methods
Single-Copy Orthologs Gene Families Data Set
Protein-coding gene sequences of the seven ant genomes were downloaded from the Hymenoptera Genome Database (http://hymenopteragenome.org/ant_genomes/, last accessed April 24, 2014) (Munoz-Torres et al. 2011).
The complete annotated gene sets were OGS_1.0 for Acromyrmex echinatior (Nygaard et al. 2011), OGS_1.2 for Atta cephalotes (Suen et al. 2011), OGS_2.2.3 for Solenopsis invicta (Wurm et al. 2011), OGS_1.2 for Pogonomyrmex barbatus (Smith, Smith, et al. 2011), OGS_3.3 for Camponotus floridanus (Bonasio et al. 2010), OGS_1.2 for Linepithema humile (Smith, Zimin, et al. 2011), and OGS_3.3 for Harpegnathos saltator (Bonasio et al. 2010). Coding sequences of five outgroup species were downloaded from the Hymenoptera Genome Database for the honey bee (Apis mellifera Amel_pre_release2) (Honeybee Genome Sequencing Consortium 2006) and the jewel wasp (Nasonia vitripennis OGS_v1.2) (Werren et al. 2010), from FlyBase (Tweedie et al. 2009) for the fruit fly (Drosophila melanogaster FB5.29) (Adams et al. 2000), from BeetleBase (Kim, Murphy, et al. 2010) for the flour beetle (Tribolium castaneum Tcas_3.0) (Tribolium Genome Sequencing Consortium 2008), and from VectorBase (Lawson et al. 2009) for the body louse (Pediculus humanus PhumU1.2) (Kirkness et al. 2010).
Gene families were obtained from a custom run of the OrthoDB pipeline for the Ant Genomic Consortium (http://cegg.unige.ch/orthodbants and http://bioinfo.unil.ch/supdata/Roux_positive_selection_ants/orthoDB_run.zip, last accessed April 24, 2014; pipeline of OrthoDB release 4) (Waterhouse, Zdobnov, Tegenfeldt, et al. 2011; Simola et al. 2013). Briefly, OrthoDB implements a Best Reciprocal Hit clustering algorithm based on all-against-all Smith–Waterman protein sequence comparisons. The longest alternatively spliced form of genes is used. The orthologous groups are built at different taxonomic levels and it is possible to query for specific phyletic profiles by combining the criteria of absent, present, single-copy, multicopy, or no restriction, for each species within the studied clade.
Gene families including strictly one ortholog in each of the 12 species were selected (2,756 gene families). Because annotations of the studied genomes are likely to be incomplete (Simola et al. 2013), families with a few missing genes (gene losses or unannotated genes) were also included, with the restriction that at least four genes out of the seven ant species, and three genes out of the five outgroup species, should be present in the gene family. Simola et al. (2013) have shown that among the seven ant species, there were generally few lost or missing genes, apart from S. invicta (fewer than 400 S. invicta genes were missing in single-copy ortholog families) and Ac. echinatior (<150 Ac. echinatior genes were missing in single-copy ortholog families). Our gene family selection criteria can accommodate such a moderate number of missing genes in families. In order to transfer functional annotations from D. melanogaster, only families including a fruit fly ortholog were retained. With these criteria, the number of OrthoDB groups in the data set increased to 4,337. All gene families were assumed to follow the species tree topology (fig. 1). The exclusion of families that experienced gene duplication facilitates the comparison of branches between gene families, and protects our analysis from biases related to differential duplication among lineages (Waterhouse, Zdobnov, Kriventseva 2011) and among genes (Davis and Petrov 2004; He and Zhang 2006), and to the consequences of duplication (Force et al. 1999; Brunet et al. 2006). Finally, results on single-copy orthologs can be easily compared with previously published studies using similar gene family topologies (Drosophila 12 Genomes Consortium 2007; Kosiol et al. 2008; Studer et al. 2008; Lindblad-Toh et al. 2011).
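The family selection criteria described above can be summarized in a minimal sketch. The species codes and the dictionary layout (species code mapped to a list of gene identifiers) are hypothetical and chosen for illustration; they do not reflect the OrthoDB output format.

```python
# Hypothetical four-letter species codes, used for illustration only.
ANTS = {"Aech", "Acep", "Sinv", "Pbar", "Cflo", "Lhum", "Hsal"}
OUTGROUPS = {"Amel", "Nvit", "Dmel", "Tcas", "Phum"}


def keep_family(family, min_ants=4, min_outgroups=3, annotation_species="Dmel"):
    """Return True if a gene family meets the selection criteria.

    `family` maps a species code to the list of its genes in the family.
    """
    # Exclude families with a duplication in any species.
    if any(len(genes) > 1 for genes in family.values()):
        return False
    present = {species for species, genes in family.items() if len(genes) == 1}
    # Require a fruit fly ortholog to transfer functional annotations.
    if annotation_species not in present:
        return False
    # Tolerate a few missing genes in ants and in outgroups.
    return (
        len(present & ANTS) >= min_ants
        and len(present & OUTGROUPS) >= min_outgroups
    )


example = {"Aech": ["g1"], "Sinv": ["g2"], "Pbar": ["g3"], "Cflo": ["g4"],
           "Amel": ["g5"], "Nvit": ["g6"], "Dmel": ["g7"]}
print(keep_family(example))
```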
Basic sequence quality features were first controlled as in Hambuch and Parsch (2005). CDS (coding sequences) whose length was not a multiple of 3 or did not correspond to the length of the predicted protein, or that contained an internal stop codon, were eliminated; the longest CDS of genes showing multiple isoforms was retained; CDS shorter than 100 nt were eliminated.
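A minimal sketch of these sequence checks is given below. It assumes the standard nuclear genetic code and assumes that a terminal stop codon, if present, is not counted in the protein length; both points, and the function name, are assumptions rather than details stated in the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)

def passes_cds_checks(cds, protein_length, min_length=100):
    """Apply the basic CDS quality filters to one coding sequence."""
    cds = cds.upper()
    if len(cds) < min_length:      # CDS shorter than 100 nt are eliminated
        return False
    if len(cds) % 3 != 0:          # length must be a multiple of 3
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if codons[-1] in STOP_CODONS:  # a terminal stop codon is tolerated
        codons = codons[:-1]
    if any(c in STOP_CODONS for c in codons):  # internal stop codon
        return False
    # CDS length must correspond to the length of the predicted protein
    return len(codons) == protein_length
```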
Because misalignment errors can be an important source of false positives in genome-wide scans for positive selection in coding sequences (Schneider et al. 2009; Markova-Raina and Petrov 2011; Yang and dos Reis 2011; Jordan and Goldman 2012), we took great care in filtering the potentially problematic sites in the alignments. The quality filtering pipeline used here is adapted from the pipeline of the Selectome database release 4 (http://selectome.unil.ch, last accessed April 24, 2014) (Proux et al. 2009; Moretti et al. 2014). The multiple alignment of the protein sequences in each gene family was computed by M-Coffee (Wallace et al. 2006) from the T-Coffee package v8.93 (Notredame et al. 2000), which combines the output of different aligners. Similarly to Ensembl Compara (see http://www.ensembl.org/info/docs/compara/homology_method.html [last accessed April 24, 2014] for more details) (Vilella et al. 2009), four different aligners were used for M-Coffee (mafftgins_msa, muscle_msa, kalign_msa, and t_coffee_msa). M-Coffee outputs a consensus of four alignments from the different aligners, and a quality score for each residue based on the concordance of the alignment at each position by different aligners. Scores lie between 0, if a residue was not aligned at the same position by the different aligners, and 9 if it is reliably aligned at the same position in all cases. Reliably aligned residues with a score of 7 or above were retained. We used the heuristic algorithm of MaxAlign v1.1 (Gouveia-Oliveira et al. 2007) to detect and remove sequences badly aligned as a whole (gap-rich sequences) in the multiple sequence alignments. When a sequence was removed, the gene family was realigned and refiltered using M-Coffee. Families left with fewer than four sequences were discarded because of insufficient power to detect positive selection.
The protein alignments were reverse-translated to nucleotide alignments using the seq_reformat utility of the T-Coffee package (Notredame et al. 2000).
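The principle of reverse translation can be illustrated as follows. This sketch does not reproduce the seq_reformat utility; it only shows how each aligned residue is replaced by its codon and each gap by three gap characters, assuming the CDS carries no terminal stop codon.

```python
def back_translate(aligned_protein, cds, gap="-"):
    """Thread an unaligned CDS onto its aligned protein sequence."""
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if n_residues != len(codons):
        raise ValueError("CDS length does not match the protein sequence")
    codon_iter = iter(codons)
    # Each residue becomes its codon; each gap becomes a gap of three columns.
    return "".join(gap * 3 if aa == gap else next(codon_iter)
                   for aa in aligned_protein)
```

Because gaps are expanded in multiples of three, the resulting nucleotide alignment keeps the reading frame required by codon models.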
We used a stringent Gblocks filtering (v0.91b; type = codons; minimum length of a block = 4; no gaps allowed) (Talavera and Castresana 2007) to remove gap-rich regions from the alignments, as these are problematic for positive selection inference (Fletcher and Yang 2010; Markova-Raina and Petrov 2011). The large memory requirements of M-Coffee for long alignments led us to use only Gblocks without M-Coffee scoring if the length of the alignment was greater than 9,000 nt.
After filtering, our data set included 4,261 gene families with an average of 10.4 branches per family to test (fig. 1; 44,306 branches to test; median = 10 branches per family). The mean length of filtered alignment was 1,133 nt (median = 885 nt), ranging from a minimum of 54 nt to a maximum of 22,248 nt. Of note, lost or missing genes in families affect the topology of the trees and the possibility to compare equivalent branches of different families. In total, our data set contains 36,681 branches (83%) in 4,256 families which corresponded to the canonical topology defined by the species tree (fig. 1) and could be compared across families (e.g., table 2).
Our analyses are likely to underestimate the genome-wide number of positive selection events because 1) single-copy orthologs tend to evolve under stronger purifying selection than multicopy gene families (Waterhouse, Zdobnov, Kriventseva 2011), 2) the ant genomes still lack good annotation of gene models and single-copy orthologs gene families could be missed, and 3) we filtered out unreliable parts of sequence alignments including fast evolving residues that are difficult to align (Fletcher and Yang 2010; Privman et al. 2012). The last point is balanced by the fact that conserved regions might be more prone to positively selected substitutions (Bazykin and Kondrashov 2012) and that the removal of unreliable regions seems to increase the power to detect positive selection (Jordan and Goldman 2012; Privman et al. 2012).
Extensive Gene Families Data Set
Another data set gathered all gene families from the OrthoDB database that could pass our quality filters, and notably families that experienced duplications. The CDS were filtered as described earlier. Amino-acid sequences were aligned using PAGAN version 0.47 (Loytynoja et al. 2012). The program GUIDANCE (v1.1) was used to assess alignment confidence and mask unreliably aligned residues (Penn et al. 2010; Privman et al. 2012). The combination of a phylogeny-aware aligner (PAGAN replaces PRANK [Löytynoja and Goldman 2008] and is based on the same principle) and of this filtering algorithm was shown to perform the best in recent benchmark studies on simulated data (Jordan and Goldman 2012; Privman et al. 2012). Gene family phylogenies were built using RAxML (v7.2.9) (Stamatakis 2006) from the amino-acid sequences, with the LG matrix and the CAT model. Amino-acid alignments were reverse-translated into the corresponding codon alignments. This resulted in 6,186 families tested, with an average of 11 genes, and an average length of filtered alignment of 3,129 nt (median of 2,385 nt, ranging from a minimum of 192 nt to a maximum of 20,556 nt).
Mitochondrial Gene Families Data Set
Contigs corresponding to mitochondrial genomes could be downloaded for five of the seven ant genomes (Ac. echinatior, At. cephalotes, S. invicta, P. barbatus, and L. humile). They were submitted to MITOS, a web server for the annotation of metazoan mitochondrial genomes (http://mitos.bioinf.uni-leipzig.de/index.py, last accessed April 24, 2014) (Bernt et al. 2012). This gave us the predicted coordinates of 13 mitochondrial protein-coding genes in these species. Frameshift errors or incomplete gene predictions were manually corrected. Mitochondrial genes from the outgroup species Ap. mellifera, N. vitripennis, and T. castaneum were downloaded from GenBank (accession numbers: L06178; EU746609.1, and EU746613.1; AJ312413.2 and NC_003081.2, respectively). Mitochondrial genes from D. melanogaster were downloaded from Flybase at ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r5.43_FB2012_01/fasta/dmel-dmel_mitochondrion_genome-CDS-r5.43.fasta.gz (last accessed April 24, 2014). The alignment and filtering steps for the 13 mitochondrial gene families were identical to the data set of single-copy orthologs nuclear gene families (see above). A total of 119 branches were tested in this data set (average of 9.2 and median of 9 branches per family; average length of filtered alignment of 641 nt and median of 621 nt, ranging from a minimum of 39 nt to a maximum of 1,413 nt).
Twelve Drosophila Data Set
Single-copy ortholog gene families from the twelve sequenced Drosophila species were downloaded from ftp://ftp.flybase.net/12_species_analysis/clark_eisen/alignments/ (last accessed April 24, 2014) (files “all_species.guide_tree.longest.cds.tar.gz” and “all_species.guide_tree.longest.translation.tar.gz”) (Drosophila 12 Genomes Consortium 2007). The alignment and filtering steps for these gene families were identical to the data set of single-copy ortholog gene families used for the ant analysis. Out of 6,698 initially downloaded Drosophila gene families, 3,749 (56%) passed our filters and could be tested for positive selection, resulting in 77,495 branches tested (average of 20.7 and median of 21 branches per family; average length of filtered alignment of 876 nt and median of 708 nt, ranging from a minimum of 15 nt to a maximum of 14,535 nt).
Bee Data Set
Single-copy ortholog gene families from 10 bee species were downloaded from http://insectsociogenomics.illinois.edu/ (last accessed April 24, 2014). This set of gene families is incomplete as it is derived from the sequencing of expressed sequence tags (using 454 Life Science/Roche GS-FLX platform) from nine bee species (Woodard et al. 2011), and from gene models of the honey bee Ap. mellifera (Honeybee Genome Sequencing Consortium 2006). The alignment and filtering steps for these gene families were identical to the data set of single-copy ortholog gene families used for the ant analysis. Out of 3,647 initially downloaded gene families, 2,256 (62%) passed our filters and could be tested for positive selection, resulting in 20,169 branches tested (average of 8.9 and median of 9 branches per family; average length of filtered alignment of 611 nt and median of 528 nt, ranging from a minimum of 27 nt to a maximum of 3,945 nt).
Branch-Site Test for Positive Selection
We used the updated branch-site test (Zhang et al. 2005) of Codeml from the package PAML v4.4c (Yang 2007) to detect Darwinian positive selection experienced by a gene family in a subset of sites in a specific branch of its phylogenetic tree. This test was previously used in genome-wide scans for positive selection in various lineages (Bakewell et al. 2007; Kosiol et al. 2008; Studer et al. 2008; Vamathevan et al. 2008; Oliver et al. 2010; George et al. 2011) and is used by the Selectome project (http://selectome.unil.ch, last accessed April 24, 2014) (Proux et al. 2009; Moretti et al. 2014). It is acknowledged to be more sensitive for the detection of positive selection than branch tests (Yang 1998) or site tests (Yang et al. 2000), because it does not average the signal over all codons in the alignment (branch test) nor over all branches of the phylogeny (site test) (Yang and dos Reis 2011). It is also robust to relaxation of purifying selection (ω close to 1) since this scenario is accounted for in the null model (Zhang 2004; Zhang et al. 2005). The alternative model is contrasted to the null model using a likelihood-ratio test (LRT), where log-likelihood ratios are compared to a chi-square distribution with 1 degree of freedom (Zhang et al. 2005). Previous studies have reported the branch-site test to be conservative in this setup (Bakewell et al. 2007; Studer et al. 2008; Fletcher and Yang 2010; Yang and dos Reis 2011; Gharib and Robinson-Rechavi 2013). We did not use the ω estimates to infer the strength of positive selection because they were shown to be unreliable (Bakewell et al. 2007; Yang and dos Reis 2011).
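As an illustration of the LRT described above, the following sketch converts a pair of log-likelihoods into a P-value. It assumes that the test statistic is twice the difference in log-likelihood between the alternative and null models; the function name is hypothetical. For a chi-square distribution with 1 degree of freedom, the upper tail probability has the closed form erfc(sqrt(x/2)), so only the standard library is needed.

```python
import math

def lrt_pvalue(lnl_null, lnl_alt):
    """Likelihood-ratio test against a chi-square with 1 degree of freedom."""
    stat = 2.0 * (lnl_alt - lnl_null)
    # Slightly negative statistics (imperfect convergence) are treated as 0.
    stat = max(stat, 0.0)
    # Upper tail of chi-square(1): P(X > x) = erfc(sqrt(x / 2))
    pvalue = math.erfc(math.sqrt(stat / 2.0))
    return stat, pvalue
```

A statistic of about 3.84 corresponds to a P-value of about 0.05 under this distribution.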
In the absence of a specific a priori hypothesis regarding which branches to test for positive selection, our implementation runs the test multiple times on each gene family, successively changing the branch selected as foreground. The branches considered as foreground are highlighted in red in figure 1. This approach was shown to be legitimate if P-values from the successive tests are corrected for multiple testing (Anisimova and Yang 2007; Yang and dos Reis 2011). We applied an FDR correction (Benjamini and Hochberg 1995) over all the P-values treated as one series (number of branches tested × number of gene families tested). In the ant single-copy orthologs nuclear data set, we analyzed a maximum of 15 branches leading to the 7 ant species, summing to 44,306 tests performed. In the ant mitochondrial data set, we analyzed a maximum of 11 branches leading to 5 ant species, summing to 119 tests (branches in red in supplementary fig. S3, Supplementary Material online). In the Drosophila single-copy orthologs data set, we analyzed a maximum of 21 branches, leading to a total of 77,495 tests (supplementary fig. S4, Supplementary Material online). Finally in the bee data set, we analyzed a maximum of 17 branches, leading to a total of 20,169 tests (supplementary fig. S5, Supplementary Material online).
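The Benjamini–Hochberg procedure applied to the pooled series of P-values can be sketched as below. This is a generic implementation of the step-up adjustment, given for illustration; it is not the code used in the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

All branch-by-family tests of a data set would be passed as a single list, so that the correction is applied over the whole series at once.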
All computations were performed using Slimcodeml (release May 4, 2011) (Schabauer et al. 2012), an optimized version of Codeml, based on the release 4.4c of the PAML package (downloadable at http://selectome.unil.ch/cgi-bin/download.cgi, last accessed April 24, 2014). Slimcodeml was estimated to run the branch-site models about 1.77 times faster than the original Codeml thanks to the use of external standard libraries for linear algebra calculations and specific optimizations for the computer architecture used. We verified on a subset of the gene families that the results given by Slimcodeml were identical with the original Codeml. Examples of Slimcodeml/Codeml control files used are provided in supplementary text, Supplementary Material online. For the ant mitochondrial data set, Codeml was used with the option “icode = 4” to use the Invertebrate mitochondrial genetic code (http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi#SG5, last accessed April 24, 2014).
The branch-site model is known to display convergence problems in the calculation of likelihoods (Yang and dos Reis 2011), leading to negative or artificially large log-likelihood ratios. We thus launched three independent runs for both the alternative and null hypotheses, for each branch of each gene family, and kept the best likelihood value of each run to calculate the log-likelihood ratio (Yang and dos Reis 2011). Of note, the likelihood differences observed across the three runs were most of the time very small. Even after reconciliation of three replicate runs, we still observed a number of negative log-likelihood ratios (8% of the tests—most of them very close to 0). In such cases, we manually set the log-likelihood ratios to 0 (meaning nonsignificance). We recorded the largest differences in likelihood values between the three independent runs in both fixed and alternative models (d). The distribution of differences was bimodal, with a first major mode at d = 0, gathering most data, and a second minor mode at d ∼ 1. A cutoff at d = 0.004 clearly separated the two peaks. We used this stringent cutoff (d > 0.004) to eliminate all tests with potential convergence problems in the fixed and alternative models (see supplementary text and table S23, Supplementary Material online).
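The reconciliation of replicate runs can be sketched as follows. The sketch assumes that the test statistic is twice the difference between the best log-likelihoods, and it reads d as the largest spread among the three replicate log-likelihoods in either model; both are interpretations of the text, and the function name is hypothetical.

```python
def reconcile_runs(lnl_null_runs, lnl_alt_runs, max_spread=0.004):
    """Combine replicate Codeml runs into one LRT statistic.

    Returns None when the replicates disagree by more than max_spread,
    which flags a potential convergence problem.
    """
    # d: largest difference between replicate runs, in either model
    d = max(max(lnl_null_runs) - min(lnl_null_runs),
            max(lnl_alt_runs) - min(lnl_alt_runs))
    if d > max_spread:
        return None
    # Keep the best (highest) log-likelihood of each model.
    stat = 2.0 * (max(lnl_alt_runs) - max(lnl_null_runs))
    # Remaining negative ratios are set to 0 (nonsignificance).
    return max(stat, 0.0)
```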
Values of dN and dS were calculated with parameters extracted from Codeml results files (.mlc files).
All calculations were performed on the SIB Vital-IT cluster in Lausanne (http://www.vital-it.ch/, last accessed April 24, 2014). All three runs and the two hypotheses of each test were performed on the same node of the cluster.
Site Test for Positive Selection
The site test (Yang et al. 2000) of Codeml from the package PAML v4.4e (Yang 2007), allowing the dN/dS ratio (ω) to vary among sites, was run on the extensive data set of 6,186 families (see above). We contrasted the null model M8a (beta and ω with ω = 1) to the alternative model M8 (beta and ω with ω ≥ 1) with 11 site classes (Swanson et al. 2003; Wong et al. 2004). Examples of Codeml control files used are provided in supplementary text, Supplementary Material online. Similar to the branch-site test, we launched three independent runs for both the alternative and null hypotheses for each gene family and kept the best likelihood value of each run for the LRT (supplementary table S19, Supplementary Material online). The likelihood ratios were compared to a chi-square distribution with 1 degree of freedom as recommended in PAML user’s guide (http://abacus.gene.ucl.ac.uk/software/pamlDOC.pdf, last accessed April 24, 2014).
Reconstruction of Ancestral G + C Content
The program nhPhyml (Galtier and Gouy 1998; Guindon and Gascuel 2003; Boussau and Gouy 2006) was used to estimate the G + C content at third codon positions at each node of the gene family trees (topology fixed, transition/transversion ratio estimated, alpha parameter estimated with eight categories). Following Studer et al. (2008), we calculated the shift in GC3 content at each branch as the difference between GC3 contents at the nodes delimitating that branch.
Olfactory Receptors Family
Olfactory receptors are difficult to process in automated pipelines since they are characterized by dynamic patterns of duplications and pseudogenization during evolution (Nozawa and Nei 2007). Furthermore, the sequences of ORs are highly variable and notoriously difficult for automatic gene annotation. Accordingly, our main data set of single-copy orthologs was depleted in genes involved in olfaction (supplementary tables S7, S8, S10, and S11, Supplementary Material online) and GO categories related to olfaction could not be tested for enrichment of positively selected genes because they included too few annotated genes. We therefore used a more comprehensive data set of 873 manually annotated protein-coding sequences of OR genes (excluding suspected pseudogenes) provided by Hugh Robertson for P. barbatus (291 genes) (Smith, Smith, et al. 2011), L. humile (320 genes) (Smith, Zimin, et al. 2011), and N. vitripennis (262 genes) (Werren et al. 2010). Nucleotide sequences were translated and amino-acid sequences were aligned using MAFFT (Katoh et al. 2005). Unreliably aligned residues were masked using GUIDANCE based on 32 bootstrap samples and a cutoff of 0.2 that was chosen so that the 15% lowest scoring residues are masked (Penn et al. 2010; Privman et al. 2012). Phylogeny was reconstructed using RAxML with the JTT substitution matrix, the CAT approximation, and 100 bootstrap samples (Stamatakis 2006). Because the resulting gene tree was too large for an analysis with the branch-site test of Codeml, we divided it into 16 smaller subtrees, each containing less than 100 leaves. Branches with as high as possible bootstrap support were chosen as splitting points. The 16 subtrees include all ant sequences but only 105 N. vitripennis sequences. The sequences from each subtree were realigned using PRANK version 100701 (Löytynoja and Goldman 2008) and reverse-translated into corresponding codon alignments. GUIDANCE was used to mask unreliably aligned codons (0.8 cutoff). 
Phylogeny was reconstructed using RAxML as above. Out of 1,744 branches in the initial tree, 1,400 branches from the subtrees were tested using the branch-site test of Codeml (see above), and the computation was successful (both null and alternative hypotheses) for 1,184 branches. Significant branches are highlighted in red in figure 3 and in supplementary figure S1, Supplementary Material online. Full results of the branch-site test on all 16 clades are shown in supplementary table S18, Supplementary Material online. A full tree with branch names and bootstrap values is provided as supplementary figure S1, Supplementary Material online. Newick trees of the 16 individual subtrees along with annotation of tested branches are available in supplementary text, Supplementary Material online.
Tests of Functional Category Enrichment
GO (Ashburner et al. 2000) annotations for gene families were taken from the annotation of the D. melanogaster gene member they include (downloaded from http://flybase.org/static_pages/downloads/FB2011_02/go/gene_association.fb.gz, last accessed April 24, 2014). The annotation of children GO categories was propagated to their parent categories following the GO graph structure. GO categories mapped to 10 genes or less were discarded for the enrichment analysis.
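The propagation of annotations to parent categories and the size filter can be illustrated with a small sketch. The data structures (a gene-to-terms mapping and a term-to-parents mapping) and the function names are hypothetical; parsing of the GO graph itself is not shown.

```python
def propagate_annotations(gene_to_terms, parents):
    """Map each GO term to all genes annotated to it or to any descendant."""
    term_to_genes = {}
    for gene, terms in gene_to_terms.items():
        stack = list(terms)
        seen = set()
        while stack:  # walk up the graph from each direct annotation
            term = stack.pop()
            if term in seen:
                continue
            seen.add(term)
            term_to_genes.setdefault(term, set()).add(gene)
            stack.extend(parents.get(term, ()))
    return term_to_genes

def drop_small_categories(term_to_genes, max_discarded_size=10):
    """Discard categories mapped to 10 genes or fewer."""
    return {t: g for t, g in term_to_genes.items()
            if len(g) > max_discarded_size}
```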
To identify over- and underrepresented functional categories present in the data sets used in this study, the package topGO version 2.4 (Alexa et al. 2006) of Bioconductor (Gentleman et al. 2004) was used. A Fisher's exact test was used, combined with the “elim” algorithm of topGO, which decorrelates the graph structure of the GO to reduce nonindependence problems (Alexa et al. 2006). The reference set was constituted of all OrthoDB families including a D. melanogaster gene with GO annotation. GO categories with an FDR < 20% are reported (Benjamini and Hochberg 1995).
Regarding the functional enrichment of genes targeted by positive selection, the Fisher's exact test approach has been criticized because it imposes the choice of an arbitrary cutoff to dichotomize genes into “significant” and “nonsignificant” categories. This leads to a loss of information and limits the power and robustness of this method (Allison et al. 2006; Tintle et al. 2009; Daub et al. 2013). To test GO functional categories for enrichment in positively selected genes, we instead used a gene set enrichment approach, which tests whether the distribution of scores of genes from a gene set differs from the whole data set scores distribution, allowing the detection of gene sets that contain many marginally significant genes. Different implementations for this approach have been proposed. The most widely used is the gene set enrichment analysis (GSEA) (Subramanian et al. 2005), but it was shown to perform relatively poorly (Kim and Volsky 2005; Efron and Tibshirani 2007; Tintle et al. 2009). Here, we used a SUMSTAT test: for a given gene set g including n genes, the SUMSTAT statistic is defined as the sum of scores of the n genes. This statistic was shown to be more sensitive than a panel of other methods, while controlling well for the rate of false positives (Ackermann and Strimmer 2009; Tintle et al. 2009; Fehringer et al. 2012; Daub et al. 2013). To be able to use the distribution of log-likelihood ratios of the positive selection test (which follows a chi-square distribution with 1 degree of freedom and spans several orders of magnitude) as scores in the SUMSTAT test, we applied a fourth root transformation as a variance-stabilizing method. This transformation conserves the ranks of gene families (see http://udel.edu/∼mcdonald/stattransform.html, last accessed April 24, 2014) (Canal 2005; McDonald 2009).
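According to the Central Limit Theorem, the distribution of SUMSTAT scores from random gene sets of size n approaches a normal distribution whose mean and variance derive from the mean ($\mu_G$) and variance ($\sigma^2_G$) of the scores of the complete list of tested genes G:
$$\mu_{\mathrm{SUMSTAT}} = n \times \mu_G$$
and
$$\sigma^2_{\mathrm{SUMSTAT}} = n \times \sigma^2_G$$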
We performed bidirectional tests against this distribution to test whether the SUMSTAT statistic for a given gene set is higher or lower than expected by chance, corresponding to respectively enrichment or depletion for positively selected genes in this gene set. We verified the accuracy of this methodology by drawing an empirical null distribution for each gene set of size n found in the real data set, based on scores of 10,000 gene sets of same size n randomly picked from the whole data set. The distribution of SUMSTAT scores of these randomized gene sets approximates closely a normal distribution, even when the set size is small (supplementary fig. S6, Supplementary Material online). This makes the SUMSTAT test less computationally intensive than other gene set enrichment approaches (e.g., GSEA) (Subramanian et al. 2005) where the null distribution cannot be inferred mathematically and randomizations have to be performed for each individual test. We verified that a GSEA approach gave broadly similar results (not shown).
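A minimal sketch of the SUMSTAT test with its normal approximation is given below. It assumes that the null mean and variance of the sum are n times the mean and variance of the transformed scores of all tested genes, uses the population variance, and ignores any finite-population correction; these choices and the function name are assumptions made for illustration.

```python
import math
from statistics import NormalDist, mean, pvariance

def sumstat_test(lrt_scores, gene_set):
    """SUMSTAT test of one gene set against a normal null distribution.

    lrt_scores : dict mapping every tested gene to its LRT statistic
    gene_set   : iterable of genes belonging to the category
    Returns (z, p_enrichment, p_depletion).
    """
    # Fourth-root transformation of the (nonnegative) LRT statistics
    scores = {g: max(s, 0.0) ** 0.25 for g, s in lrt_scores.items()}
    values = list(scores.values())
    mu, var = mean(values), pvariance(values)
    members = [g for g in gene_set if g in scores]
    n = len(members)
    if n == 0 or var == 0:
        raise ValueError("empty gene set or constant scores")
    sumstat = sum(scores[g] for g in members)
    # Null distribution: Normal(n * mu, n * var)
    z = (sumstat - n * mu) / math.sqrt(n * var)
    cdf = NormalDist().cdf(z)
    return z, 1.0 - cdf, cdf
```

A large positive z indicates enrichment for positively selected genes in the set, and a large negative z indicates depletion.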
Because different gene sets sometimes share many genes in common, the list of significant gene sets resulting from enrichment tests is usually highly redundant. We implemented the “elim” algorithm from the Bioconductor package topGO, to decorrelate the graph structure of the GO (Alexa et al. 2006). Briefly, the GO categories are tested recursively starting from the deeper levels of the GO tree, and the genes annotated to these significant categories are removed from all their parent categories. As the tests for different categories are not independent, it is not clear whether classical approaches to assess the FDR (e.g., Benjamini and Hochberg 1995) are accurate. Thus, we calculated empirically an FDR at each P-value threshold by performing 100 randomizations where the scores of gene families were permuted and the gene set enrichment test rerun. The FDR is estimated as
$$\mathrm{FDR} = \frac{N_0}{N_t}$$
where, at a given P-value threshold, $N_0$ represents the mean number of false positives obtained in the randomizations and $N_t$ represents the number of positives obtained with the real data set. The FDR obtained with this approach was in good agreement with the Benjamini–Hochberg FDR (Benjamini and Hochberg 1995). GO categories with an FDR < 20% are reported. Functional categories depleted in positive selection reflect the most conserved sets of functional categories, under the action of purifying selection. These are not discussed in this article.
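The empirical FDR can be sketched as the ratio of the mean number of positives in the randomizations to the number of positives in the real data. In this illustration the P-values of the permuted data sets are assumed to be already computed; capping the estimate at 1 and returning 0 when there are no positives are conventions added here, not details from the text.

```python
def empirical_fdr(real_pvalues, permuted_pvalue_sets, threshold):
    """Estimate the FDR at a P-value threshold from permuted data sets."""
    # Number of positives obtained with the real data set
    n_t = sum(p <= threshold for p in real_pvalues)
    if n_t == 0:
        return 0.0  # no positives, hence no false positives (convention)
    # Mean number of positives across randomizations (all false by design)
    n_0 = sum(sum(p <= threshold for p in perm)
              for perm in permuted_pvalue_sets) / len(permuted_pvalue_sets)
    return min(n_0 / n_t, 1.0)  # capped at 1 (convention)
```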
The gene set enrichment test run on each individual branch of the tree with results of the branch-site test yielded heterogeneous results, probably resulting from differences in power of the branch-site test on different branches of the phylogeny (supplementary table S9, Supplementary Material online; only branches Sinv, Pbar, Hsal, #3 and #6 show some significant categories at FDR 20%). This test could also be sensitive to false positive results of the branch-site test (e.g., GC-biased gene conversion, discussed in supplementary text, Supplementary Material online). Thus, we designed a test less sensitive to these problems. We considered a unique score per gene family reflecting the evidence for positive selection globally in the ant lineage: the mean of the branch-site test scores on the 13 individual ant branches. This scoring scheme should unveil functional categories of genes that experienced extensive and probably recurrent episodes of positive selection in the ant lineage, but is not strictly equivalent to using the results of a site test on ant branches, since it allows the detection of gene families with positive selection events affecting different sites on different branches. We also checked that in most cases, the enriched categories were not significant only because of a single outlier gene with a strong positive selection score, but displayed a significant shift in the distribution of positive selection scores of numerous genes (supplementary fig. S7, Supplementary Material online).
Finally, as a sanity check, the gene set enrichment test was also performed using KEGG pathways annotation. KEGG pathways and the mapping to D. melanogaster genes were downloaded with the KEGG REST API (http://www.kegg.jp/kegg/rest/keggapi.html, last accessed April 24, 2014). Because hierarchical relationships among KEGG pathways are limited, we did not use the “elim” decorrelation algorithm. Pathways mapped to more than 10 genes were retained. In total, 51 KEGG pathways were tested.
Tests of Phenotypic Category Enrichment
Mutant phenotype annotations of D. melanogaster genes were extracted from Flybase (Drysdale 2001; Osumi-Sutherland et al. 2013). The following ontologies were downloaded from the OBO foundry (Smith et al. 2007): the Flybase controlled vocabulary ontology (http://www.berkeleybop.org/ontologies/obo-all/flybase_vocab/flybase_vocab.obo, last accessed April 24, 2014), the Drosophila anatomical ontology (http://www.berkeleybop.org/ontologies/obo-all/fly_anatomy/fly_anatomy.obo, last accessed April 24, 2014), and the Drosophila developmental stages ontology (http://www.berkeleybop.org/ontologies/obo-all/fly_development/fly_development.obo, last accessed April 24, 2014). The relationships between genes and alleles, and between alleles and phenotypes (anatomical and developmental ontology categories) were extracted from Flybase (ftp://ftp.flybase.net/releases/FB2012_01/reporting-xml/FBgn.xml.gz, last accessed April 24, 2014; “derived_pheno_class” and “derived_pheno_manifest” entities). The information on gain or loss-of-function alleles was extracted from the file ftp://ftp.flybase.net/releases/FB2012_01/reporting-xml/FBal.xml.gz (last accessed April 24, 2014) (loss of function: controlled vocabulary term FBcv:0000287 and child terms; gain of function: FBcv:0000290 and child terms). The annotation of child phenotypic categories (anatomy or development) was propagated to their parent categories following the structure of the respective ontologies.
To perform an enrichment analysis based on mutant phenotypes in fruit fly, we used the SUMSTAT test. Because the annotation is scarcer than the GO annotation, we used only the categories mapped to more than five genes for the enrichment analysis. The reported results include the annotation for gain and loss-of-function alleles. We observed very similar results when gain-of-function alleles were removed from the annotation (Weng and Liao 2011) (not shown).
Expression Data
Microarray expression data from S. invicta (Ometto et al. 2011) were provided by the authors upon request. These included expression levels of clones of the spotted microarray used, as well as the list of genes identified to be overexpressed in each of the three castes (workers, queens, and males), both at pupal and adult stages. The mapping of clones to the gene model of S. invicta (OGS_2.2.3) (Wurm et al. 2011) was provided by Y. Wurm, and is similar to the mapping used in Hunt et al. (2011). If multiple clones mapped to the same gene, the average signal was used for expression. For differential expression, we used the results of the original study (BAGEL analysis, where a clone was considered to be differentially expressed between conditions if the Bayesian posterior probability was P < 0.001, corresponding to an FDR ∼ 5%) (Ometto et al. 2011). A gene was considered differentially expressed if at least one clone mapped to it was found differentially expressed. Expression data were available for 1,327 genes of the single-copy orthologs data set, including 603 genes overexpressed in at least one condition. We ran a SUMSTAT gene set enrichment test on the sets of genes with caste-specific expression (pupal male, pupal queen, pupal worker, adult male, adult queen, and adult worker). P-values were obtained by comparison to an empirical distribution created with 10,000 randomizations of gene scores.
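The clone-to-gene summarization rules can be sketched as follows: signals of clones mapping to the same gene are averaged, and a gene is flagged as differentially expressed if at least one of its clones is. The input structures and the function name are hypothetical.

```python
from collections import defaultdict

def summarize_gene_expression(clone_to_gene, clone_signal, de_clones):
    """Collapse clone-level microarray data to gene-level data.

    clone_to_gene : dict clone id -> gene id
    clone_signal  : dict clone id -> expression signal
    de_clones     : set of clone ids called differentially expressed
    Returns (mean signal per gene, set of differentially expressed genes).
    """
    signals = defaultdict(list)
    de_genes = set()
    for clone, gene in clone_to_gene.items():
        if clone in clone_signal:
            signals[gene].append(clone_signal[clone])
        if clone in de_clones:  # one differentially expressed clone suffices
            de_genes.add(gene)
    mean_signal = {g: sum(v) / len(v) for g, v in signals.items()}
    return mean_signal, de_genes
```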
Aging Genes
Aging and oxidative stress associated genes were obtained from a microarray study in D. melanogaster comparing the expression of genes in 10-day-old flies to 61-day-old flies, and flies exposed to 100% O2 for 7 days to controls (Landis et al. 2004). We tested the enrichment for positively selected genes (SUMSTAT test) in four gene sets constituted of up and downregulated genes in both contrasts. P-values were obtained by comparison to an empirical distribution created with 10,000 randomizations of gene scores.
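An empirical P-value of this kind can be obtained by comparing the observed SUMSTAT statistic with that of random gene sets of the same size. The sketch below is one-sided (enrichment), takes scores that are assumed to be already transformed, and adds 1 to the numerator and denominator so that the P-value is never exactly 0; the seed, this correction, and the function name are choices made for the illustration.

```python
import random

def empirical_sumstat_pvalue(scores, gene_set, n_random=10_000, seed=0):
    """One-sided empirical P-value for enrichment of a gene set.

    scores   : dict gene -> score for all tested genes
    gene_set : iterable of genes in the set of interest
    """
    rng = random.Random(seed)
    genes = list(scores)
    members = [g for g in gene_set if g in scores]
    observed = sum(scores[g] for g in members)
    hits = 0
    for _ in range(n_random):
        # Random gene set of the same size, drawn without replacement
        sample = rng.sample(genes, len(members))
        if sum(scores[g] for g in sample) >= observed:
            hits += 1
    return (hits + 1) / (n_random + 1)
```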
Genes with Mitochondrial Function
Genes with mitochondrial function were identified as those mapped to any of the 310 GO categories including “mitochondria*” in their names or synonym names (using the search engine on http://amigo.geneontology.org/, last accessed April 24, 2014). Three hundred and thirteen of the identified genes had available microarray expression data in S. invicta.
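The name-based selection of categories can be mimicked offline with a simple pattern match on term names and synonyms. This sketch does not reproduce the AmiGO search engine; the term identifiers in the usage note are toy labels, and the function name is hypothetical.

```python
import re

# Matches words starting with "mitochondria", e.g. "mitochondrial"
MITO_PATTERN = re.compile(r"\bmitochondria", re.IGNORECASE)

def mitochondrial_terms(term_labels):
    """Select terms whose name or any synonym matches "mitochondria*".

    term_labels : dict term id -> list of names and synonyms
    """
    return {term for term, labels in term_labels.items()
            if any(MITO_PATTERN.search(label) for label in labels)}
```

For example, a toy term `T1` labeled "mitochondrial matrix" is selected, whereas a toy term `T2` labeled "nucleus" is not.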
Data Availability
Raw and filtered alignments used in these analyses, track files for the alignment editor Jalview (Clamp et al. 2004), Codeml control files, and result files can be downloaded at http://bioinfo.unil.ch/supdata/Roux_positive_selection_ants/Roux_et_al_datasets.tar.gz (last accessed April 24, 2014).
A simple web interface displaying gene families, GO mapping, Codeml results, and alignments (through a Jalview applet) is available at http://bioinfo.unil.ch/supdata/Roux_positive_selection_ants/families.html (last accessed April 24, 2014). Jalview tracks display the regions used or filtered out in the original protein alignments, as well as the residues found to be under positive selection by Bayes Empirical Bayes (Yang et al. 2005) in all the branches tested for each of the three replicate runs (fig. 2).
Supplementary Material
Supplementary tables S1–S26, figures S1–S7, and supplementary text are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
The authors thank Yannick Wurm, Miguel Corona Villegas, Nicolas Salamin, Corrie Moreau, and members of the Keller laboratory for stimulating discussions. They are grateful to Oksana Riba-Grognuz, Lino Ometto, Roberto Bonasio, Robert Waterhouse, Hannes Schabauer, Walid Gharib, and members of the Ant Comparative Genomics Consortium for making data or software available for this study; Ben-Yang Liao and Meng-Pin Weng for help with Flybase phenotypic data extraction; Alexander Wild for providing illustrations for figure 1; and four anonymous reviewers for valuable comments. Computations were performed at the Vital-IT (http://www.vital-it.ch) center for high-performance computing of the SIB Swiss Institute of Bioinformatics. J.R. was funded by a Swiss NSF grant attributed to L.K., a Swiss NSF postdoc mobility fellowship (PBLAP3-134342) and a Marie Curie fellowship. M.R.R. and S.M. acknowledge funding from a Swiss NSF grant (31003A 133011/1), the Swiss Platform for High-Performance and High-Productivity Computing (HP2C), and project UNIL.5/SMSCG as part of the AAA/SWITCH. M.R.R. and J.T.D. acknowledge funding from a Swiss NSF ProDoc grant (PDFMP3_130309). L.K. is supported by several grants from the Swiss NSF and an ERC advanced grant. J.R., E.P., and L.K. designed the study; J.R. and E.P. analyzed data; J.T.D., S.M., and M.R.R. contributed code or programs; J.R. and L.K. wrote the manuscript with input from M.R.R.
References
- Abouheif E, Wray GA. Evolution of the gene network underlying wing polyphenism in ants. Science. 2002;297:249–252. doi: 10.1126/science.1071468.
- Ackermann M, Strimmer K. A general modular framework for gene set enrichment analysis. BMC Bioinformatics. 2009;10:47. doi: 10.1186/1471-2105-10-47.
- Adams MD, Celniker SE, Holt RA, Evans CA, Gocayne JD, Amanatides PG, Scherer SE, Li PW, Hoskins RA, Galle RF, et al. The genome sequence of Drosophila melanogaster. Science. 2000;287:2185–2195. doi: 10.1126/science.287.5461.2185.
- Alexa A, Rahnenfuhrer J, Lengauer T. Improved scoring of functional groups from gene expression data by decorrelating GO graph structure. Bioinformatics. 2006;22:1600–1607. doi: 10.1093/bioinformatics/btl140.
- Allison DB, Cui X, Page GP, Sabripour M. Microarray data analysis: from disarray to consolidation and consensus. Nat Rev Genet. 2006;7:55–65. doi: 10.1038/nrg1749.
- Amdam GV, Omholt SW. The regulatory anatomy of honeybee lifespan. J Theor Biol. 2002;216:209–228. doi: 10.1006/jtbi.2002.2545.
- Anisimova M, Yang Z. Multiple hypothesis testing to detect lineages under positive selection that affects only a few sites. Mol Biol Evol. 2007;24:1219–1228. doi: 10.1093/molbev/msm042.
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000;25:25–29. doi: 10.1038/75556.
- Bakewell MA, Shi P, Zhang J. More genes underwent positive selection in chimpanzee evolution than in human evolution. Proc Natl Acad Sci U S A. 2007;104:7489–7494. doi: 10.1073/pnas.0701705104.
- Bard F, Casano L, Mallabiabarrena A, Wallace E, Saito K, Kitayama H, Guizzunti G, Hu Y, Wendler F, Dasgupta R, et al. Functional genomics reveals genes involved in protein secretion and Golgi organization. Nature. 2006;439:604–607. doi: 10.1038/nature04377.
- Bazin E, Glemin S, Galtier N. Population size does not influence mitochondrial genetic diversity in animals. Science. 2006;312:570–572. doi: 10.1126/science.1122033.
- Bazykin GA, Kondrashov AS. Major role of positive selection in the evolution of conservative segments of Drosophila proteins. Proc R Soc B Biol Sci. 2012;279:3409–3417. doi: 10.1098/rspb.2012.0776.
- Bechstedt S, Albert JT, Kreil DP, Muller-Reichert T, Gopfert MC, Howard J. A doublecortin containing microtubule-associated protein is implicated in mechanotransduction in Drosophila sensory cilia. Nat Commun. 2010;1:11. doi: 10.1038/ncomms1007.
- Beller M, Sztalryd C, Southall N, Bell M, Jäckle H, Auld DS, Oliver B. COPI complex is a regulator of lipid homeostasis. PLoS Biol. 2008;6:e292. doi: 10.1371/journal.pbio.0060292.
- Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Methodol. 1995;57:289–300.
- Bernt M, Donath A, Juhling F, Externbrink F, Florentz C, Fritzsch G, Putz J, Middendorf M, Stadler PF. MITOS: improved de novo metazoan mitochondrial genome annotation. Mol Phylogenet Evol. 2012;69:313–319. doi: 10.1016/j.ympev.2012.08.023.
- Blanke S, Jackle H. Novel guanine nucleotide exchange factor GEFmeso of Drosophila melanogaster interacts with Ral and Rho GTPase Cdc42. FASEB J. 2006;20:683–691. doi: 10.1096/fj.05-5376com.
- Bodily KD, Morrison CM, Renden RB, Broadie K. A novel member of the Ig superfamily, turtle, is a CNS-specific protein required for coordinated motor control. J Neurosci. 2001;21:3113–3125. doi: 10.1523/JNEUROSCI.21-09-03113.2001.
- Bonasio R, Zhang G, Ye C, Mutti NS, Fang X, Qin N, Donahue G, Yang P, Li Q, Li C, et al. Genomic comparison of the ants Camponotus floridanus and Harpegnathos saltator. Science. 2010;329:1068–1071. doi: 10.1126/science.1192428.
- Boussau B, Gouy M. Efficient likelihood computations with nonreversible models of evolution. Syst Biol. 2006;55:756–768. doi: 10.1080/10635150600975218.
- Brady SG, Schultz TR, Fisher BL, Ward PS. Evaluating alternative hypotheses for the early evolution and diversification of ants. Proc Natl Acad Sci U S A. 2006;103:18172–18177. doi: 10.1073/pnas.0605858103.
- Bronstein R, Levkovitz L, Yosef N, Yanku M, Ruppin E, Sharan R, Westphal H, Oliver B, Segal D. Transcriptional regulation by CHIP/LDB complexes. PLoS Genet. 2010;6:e1001063. doi: 10.1371/journal.pgen.1001063.
- Brunet FG, Crollius HR, Paris M, Aury JM, Gibert P, Jaillon O, Laudet V, Robinson-Rechavi M. Gene loss and evolutionary rates following whole-genome duplication in teleost fishes. Mol Biol Evol. 2006;23:1808–1816. doi: 10.1093/molbev/msl049.
- Brunet-Rossinni A, Austad S. Ageing studies on bats: a review. Biogerontology. 2004;5:211–222. doi: 10.1023/B:BGEN.0000038022.65024.d8.
- Brunet-Rossinni AK. Reduced free-radical production and extreme longevity in the little brown bat (Myotis lucifugus) versus two non-flying mammals. Mech Ageing Dev. 2004;125:11–20. doi: 10.1016/j.mad.2003.09.003.
- Bulmer MS. Evolution of immune proteins in insects. Encyclopedia of life sciences (ELS). Chichester (NH): John Wiley & Sons, Ltd; 2010.
- Burton RS, Barreto FS. A disproportionate role for mtDNA in Dobzhansky–Muller incompatibilities? Mol Ecol. 2012;21:4942–4957. doi: 10.1111/mec.12006.
- Canal L. A normal approximation for the chi-square distribution. Comput Stat Data Anal. 2005;48:803–808.
- Chen CC, Wu JK, Lin HW, Pai TP, Fu TF, Wu CL, Tully T, Chiang AS. Visualizing long-term memory formation in two neurons of the Drosophila brain. Science. 2012;335:678–685. doi: 10.1126/science.1212735.
- Clamp M, Cuff J, Searle SM, Barton GJ. The Jalview Java alignment editor. Bioinformatics. 2004;20:426–427. doi: 10.1093/bioinformatics/btg430.
- Collier S, Chan HYE, Toda T, McKimmie C, Johnson G, Adler PN, O’Kane C, Ashburner M. The Drosophila embargoed gene is required for larval progression and encodes the functional homolog of Schizosaccharomyces Crm1. Genetics. 2000;155:1799–1807. doi: 10.1093/genetics/155.4.1799.
- Corley LS, Lavine MD. A review of insect stem cell types. Semin Cell Dev Biol. 2006;17:510–517. doi: 10.1016/j.semcdb.2006.07.002.
- Corona M, Hughes KA, Weaver DB, Robinson GE. Gene expression patterns associated with queen honey bee longevity. Mech Ageing Dev. 2005;126:1230–1238. doi: 10.1016/j.mad.2005.07.004.
- Corona M, Velarde RA, Remolina S, Moran-Lauter A, Wang Y, Hughes KA, Robinson GE. Vitellogenin, juvenile hormone, insulin signaling, and queen honey bee longevity. Proc Natl Acad Sci U S A. 2007;104:7128–7133. doi: 10.1073/pnas.0701909104.
- Cortopassi GA. A neutral theory predicts multigenic aging and increased concentrations of deleterious mutations on the mitochondrial and Y chromosomes. Free Radic Biol Med. 2002;33:605–610. doi: 10.1016/s0891-5849(02)00966-8.
- Cronin SJ, Nehme NT, Limmer S, Liegeois S, Pospisilik JA, Schramek D, Leibbrandt A, Simoes Rde M, Gruber S, Puc U, et al. Genome-wide RNAi screen identifies genes involved in intestinal pathogenic bacterial infection. Science. 2009;325:340–343. doi: 10.1126/science.1173164.
- Crozier RH, Crozier YC. The mitochondrial genome of the honeybee Apis mellifera: complete sequence and genome organization. Genetics. 1993;133:97–117. doi: 10.1093/genetics/133.1.97.
- Cui H, Kong Y, Zhang H. Oxidative stress, mitochondrial dysfunction, and aging. J Signal Transduct. 2012;2012:646354. doi: 10.1155/2012/646354.
- Curtis C, Landis GN, Folk D, Wehr NB, Hoe N, Waskar M, Abdueva D, Skvortsov D, Ford D, Luu A, et al. Transcriptional profiling of MnSOD-mediated lifespan extension in Drosophila reveals a species-general network of aging and metabolic genes. Genome Biol. 2007;8:R262. doi: 10.1186/gb-2007-8-12-r262.
- Daub JT, Hofer T, Cutivet E, Dupanloup I, Quintana-Murci L, Robinson-Rechavi M, Excoffier L. Evidence for polygenic adaptation to pathogens in the human genome. Mol Biol Evol. 2013;30:1544–1558. doi: 10.1093/molbev/mst080.
- Davis JC, Petrov DA. Preferential duplication of conserved proteins in eukaryotic genomes. PLoS Biol. 2004;2:e55. doi: 10.1371/journal.pbio.0020055.
- Dawkins R, Krebs JR. Arms races between and within species. Proc R Soc B Biol Sci. 1979;205:489–511. doi: 10.1098/rspb.1979.0081.
- Didelot G, Molinari F, Tchenio P, Comas D, Milhiet E, Munnich A, Colleaux L, Preat T. Tequila, a neurotrypsin ortholog, regulates long-term memory formation in Drosophila. Science. 2006;313:851–853. doi: 10.1126/science.1127215.
- Drosophila 12 Genomes Consortium. Evolution of genes and genomes on the Drosophila phylogeny. Nature. 2007;450:203–218. doi: 10.1038/nature06341.
- Drysdale R. Phenotypic data in FlyBase. Brief Bioinform. 2001;2:68–80. doi: 10.1093/bib/2.1.68.
- Duret L. Neutral theory: the null hypothesis of molecular evolution. Nat Educ. 2008;1:218.
- Efron B, Tibshirani R. On testing the significance of sets of genes. Ann Appl Stat. 2007;1:107–129.
- Farris SM, Schulmeister S. Parasitoidism, not sociality, is associated with the evolution of elaborate mushroom bodies in the brains of hymenopteran insects. Proc R Soc B Biol Sci. 2011;278:940–951. doi: 10.1098/rspb.2010.2161.
- Fehringer G, Liu G, Briollais L, Brennan P, Amos CI, Spitz MR, Bickeböller H, Wichmann HE, Risch A, Hung RJ. Comparison of pathway analysis approaches using lung cancer GWAS data sets. PLoS One. 2012;7:e31816. doi: 10.1371/journal.pone.0031816.
- Finkel T, Holbrook NJ. Oxidants, oxidative stress and the biology of ageing. Nature. 2000;408:239–247. doi: 10.1038/35041687.
- Fischman BJ, Woodard SH, Robinson GE. Molecular evolutionary analyses of insect societies. Proc Natl Acad Sci U S A. 2011;108(Suppl 2):10847–10854. doi: 10.1073/pnas.1100301108.
- Fletcher W, Yang Z. The effect of insertions, deletions, and alignment errors on the branch-site test of positive selection. Mol Biol Evol. 2010;27:2257–2267. doi: 10.1093/molbev/msq115.
- Force A, Lynch M, Pickett FB, Amores A, Yan YL, Postlethwait J. Preservation of duplicate genes by complementary, degenerative mutations. Genetics. 1999;151:1531–1545. doi: 10.1093/genetics/151.4.1531.
- Galtier N, Gouy M. Inferring pattern and process: maximum-likelihood implementation of a nonhomogeneous model of DNA sequence evolution for phylogenetic analysis. Mol Biol Evol. 1998;15:871–879. doi: 10.1093/oxfordjournals.molbev.a025991.
- Gentleman R, Carey V, Bates D, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004;5:R80. doi: 10.1186/gb-2004-5-10-r80.
- George RD, McVicker G, Diederich R, Ng SB, MacKenzie AP, Swanson WJ, Shendure J, Thomas JH. Trans genomic capture and sequencing of primate exomes reveals new targets of positive selection. Genome Res. 2011;21:1686–1694. doi: 10.1101/gr.121327.111.
- Gerber AS, Loggins R, Kumar S, Dowling TE. Does nonneutral evolution shape observed patterns of DNA variation in animal mitochondrial genomes? Annu Rev Genet. 2001;35:539–566. doi: 10.1146/annurev.genet.35.102401.091106.
- Gharib WH, Robinson-Rechavi M. The branch-site test of positive selection is surprisingly robust but lacks power under synonymous substitution saturation and variation in GC. Mol Biol Evol. 2013;30:1675–1686. doi: 10.1093/molbev/mst062.
- Gotzek D, Clarke J, Shoemaker D. Mitochondrial genome evolution in fire ants (Hymenoptera: Formicidae). BMC Evol Biol. 2010;10:300. doi: 10.1186/1471-2148-10-300.
- Gouveia-Oliveira R, Sackett P, Pedersen A. MaxAlign: maximizing usable data in an alignment. BMC Bioinformatics. 2007;8:312. doi: 10.1186/1471-2105-8-312.
- Guindon S, Gascuel O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol. 2003;52:696–704. doi: 10.1080/10635150390235520.
- Hall DW, Goodisman MA. The effects of kin selection on rates of molecular evolution in social insects. Evolution. 2012;66:2080–2093. doi: 10.1111/j.1558-5646.2012.01602.x.
- Hambuch TM, Parsch J. Patterns of synonymous codon usage in Drosophila melanogaster genes with sex-biased expression. Genetics. 2005;170:1691–1700. doi: 10.1534/genetics.104.038109.
- Harman D. The biologic clock: the mitochondria? J Am Geriatr Soc. 1972;20:145–147. doi: 10.1111/j.1532-5415.1972.tb00787.x.
- Harpur BA, Zayed A. Accelerated evolution of innate immunity proteins in social insects: adaptive evolution or relaxed constraint? Mol Biol Evol. 2013;30:1665–1674. doi: 10.1093/molbev/mst061.
- He X, Zhang J. Higher duplicability of less important genes in yeast genomes. Mol Biol Evol. 2006;23:144–151. doi: 10.1093/molbev/msj015.
- Hölldobler B, Wilson E. The ants. Cambridge (MA): Belknap Press of Harvard University Press; 1990.
- Honeybee Genome Sequencing Consortium. Insights into social insects from the genome of the honeybee Apis mellifera. Nature. 2006;443:931–949. doi: 10.1038/nature05260.
- Hovemann BT, Sehlmeyer F, Malz J. Drosophila melanogaster NADPH–cytochrome P450 oxidoreductase: pronounced expression in antennae may be related to odorant clearance. Gene. 1997;189:213–219. doi: 10.1016/s0378-1119(96)00851-7.
- Hunt BG, Ometto L, Wurm Y, Shoemaker D, Yi SV, Keller L, Goodisman MAD. Relaxed selection is a precursor to the evolution of phenotypic plasticity. Proc Natl Acad Sci U S A. 2011;108:15936–15941. doi: 10.1073/pnas.1104825108.
- Ingram KK, Oefner P, Gordon DM. Task-specific expression of the foraging gene in harvester ants. Mol Ecol. 2005;14:813–818. doi: 10.1111/j.1365-294X.2005.02450.x.
- Jemielity S, Chapuisat M, Parker JD, Keller L. Long live the queen: studying aging in social insects. Age. 2005;27:241–248. doi: 10.1007/s11357-005-2916-z.
- Jensen TF, Holm-Jensen I. Energetic cost of running in workers of three ant species, Formica fusca L., Formica rufa L., and Camponotus herculeanus L. (Hymenoptera, Formicidae). J Comp Physiol. 1980;137:151–156.
- Jobson RW, Nabholz B, Galtier N. An evolutionary genome scan for longevity-related natural selection in mammals. Mol Biol Evol. 2010;27:840–847. doi: 10.1093/molbev/msp293.
- Jordan G, Goldman N. The effects of alignment error and alignment filtering on the sitewise detection of positive selection. Mol Biol Evol. 2012;29:1125–1139. doi: 10.1093/molbev/msr272.
- Katoh K, Kuma K, Toh H, Miyata T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 2005;33:511–518. doi: 10.1093/nar/gki198.
- Keller L, Genoud M. Extraordinary lifespans in ants: a test of evolutionary theories of ageing. Nature. 1997;389:958–960.
- Keller L, Jemielity S. Social insects as a model to study the molecular basis of ageing. Exp Gerontol. 2006;41:553–556. doi: 10.1016/j.exger.2006.04.002.
- Kim HJ, Morrow G, Westwood JT, Michaud S, Tanguay RM. Gene expression profiling implicates OXPHOS complexes in lifespan extension of flies over-expressing a small mitochondrial chaperone, Hsp22. Exp Gerontol. 2010;45:611–620. doi: 10.1016/j.exger.2009.12.012.
- Kim HS, Murphy T, Xia J, Caragea D, Park Y, Beeman RW, Lorenzen MD, Butcher S, Manak JR, Brown SJ. BeetleBase in 2010: revisions to provide comprehensive genomic information for Tribolium castaneum. Nucleic Acids Res. 2010;38:D437–D442. doi: 10.1093/nar/gkp807.
- Kim SY, Volsky DJ. PAGE: parametric analysis of gene set enrichment. BMC Bioinformatics. 2005;6:144. doi: 10.1186/1471-2105-6-144.
- Kirkness EF, Haas BJ, Sun W, Braig HR, Perotti MA, Clark JM, Lee SH, Robertson HM, Kennedy RC, Elhaik E, et al. Genome sequences of the human body louse and its primary endosymbiont provide insights into the permanent parasitic lifestyle. Proc Natl Acad Sci U S A. 2010;107:12168–12173. doi: 10.1073/pnas.1003379107.
- Kishita Y, Tsuda M, Aigaki T. Impaired fatty acid oxidation in a Drosophila model of mitochondrial trifunctional protein (MTP) deficiency. Biochem Biophys Res Commun. 2012;419:344–349. doi: 10.1016/j.bbrc.2012.02.026.
- Kiss DL, Andrulis ED. Genome-wide analysis reveals distinct substrate specificities of Rrp6, Dis3, and core exosome subunits. RNA. 2010;16:781–791. doi: 10.1261/rna.1906710.
- Kosiol C, Vinar T, da Fonseca RR, Hubisz MJ, Bustamante CD, Nielsen R, Siepel A. Patterns of positive selection in six mammalian genomes. PLoS Genet. 2008;4:e1000144. doi: 10.1371/journal.pgen.1000144.
- Kowald A, Kirkwood TB. Evolution of the mitochondrial fusion-fission cycle and its role in aging. Proc Natl Acad Sci U S A. 2011;108:10237–10242. doi: 10.1073/pnas.1101604108.
- Kuan YS, Brewer-Jensen P, Bai WL, Hunter C, Wilson CB, Bass S, Abernethy J, Wing JS, Searles LL. Drosophila suppressor of sable protein [Su(s)] promotes degradation of aberrant and transposon-derived RNAs. Mol Cell Biol. 2009;29:5590–5603. doi: 10.1128/MCB.00039-09.
- Kucherenko MM, Marrone AK, Rishko VM, Magliarelli Hde F, Shcherbata HR. Stress and muscular dystrophy: a genetic screen for dystroglycan and dystrophin interactors in Drosophila identifies cellular stress response components. Dev Biol. 2011;352:228–242. doi: 10.1016/j.ydbio.2011.01.013.
- Kulmuni J, Wurm Y, Pamilo P. Comparative genomics of chemosensory protein genes reveals rapid evolution and positive selection in ant-specific duplicates. Heredity. 2013;110:538–547. doi: 10.1038/hdy.2012.122.
- Landis GN, Abdueva D, Skvortsov D, Yang J, Rabin BE, Carrick J, Tavare S, Tower J. Similar gene expression patterns characterize aging and oxidative stress in Drosophila melanogaster. Proc Natl Acad Sci U S A. 2004;101:7663–7668. doi: 10.1073/pnas.0307605101.
- Lawson D, Arensburger P, Atkinson P, Besansky NJ, Bruggner RV, Butler R, Campbell KS, Christophides GK, Christley S, Dialynas E, et al. VectorBase: a data resource for invertebrate vector genomics. Nucleic Acids Res. 2009;37:D583–D587. doi: 10.1093/nar/gkn857.
- Leboeuf AC, Benton R, Keller L. The molecular basis of social behavior: models, methods and advances. Curr Opin Neurobiol. 2013;23:3–10. doi: 10.1016/j.conb.2012.08.008.
- Lee HY, Chou JY, Cheong L, Chang NH, Yang SY, Leu JY. Incompatibility of nuclear and mitochondrial genomes causes hybrid sterility between two yeast species. Cell. 2008;135:1065–1073. doi: 10.1016/j.cell.2008.10.047.
- Lenaz G. Role of mitochondria in oxidative stress and ageing. Biochim Biophys Acta. 1998;1366:53–67. doi: 10.1016/s0005-2728(98)00120-0.
- Lihoreau M, Latty T, Chittka L. An exploration of the social brain hypothesis in insects. Front Physiol. 2012;3:442. doi: 10.3389/fphys.2012.00442.
- Lindblad-Toh K, Garber M, Zuk O, Lin MF, Parker BJ, Washietl S, Kheradpour P, Ernst J, Jordan G, Mauceli E, et al. A high-resolution map of human evolutionary constraint using 29 mammals. Nature. 2011;478:476–482. doi: 10.1038/nature10530.
- Linksvayer TA, Wade MJ. Genes with social effects are expected to harbor more sequence variation within and between species. Evolution. 2009;63:1685–1696. doi: 10.1111/j.1558-5646.2009.00670.x.
- Löytynoja A, Goldman N. Phylogeny-aware gap placement prevents errors in sequence alignment and evolutionary analysis. Science. 2008;320:1632–1635. doi: 10.1126/science.1158395.
- Löytynoja A, Vilella AJ, Goldman N. Accurate extension of multiple sequence alignments using a phylogeny-aware graph algorithm. Bioinformatics. 2012;28:1684–1691. doi: 10.1093/bioinformatics/bts198.
- Markova-Raina P, Petrov D. High sensitivity to aligner and high rate of false positives in the estimates of positive selection in the 12 Drosophila genomes. Genome Res. 2011;21:863–874. doi: 10.1101/gr.115949.110.
- McDonald JH. Handbook of biological statistics. Baltimore (MD): Sparky House Publishing; 2009.
- Meiklejohn CD, Montooth KL, Rand DM. Positive and negative selection on the mitochondrial genome. Trends Genet. 2007;23:259–263. doi: 10.1016/j.tig.2007.03.008.
- Mockett RJ, Orr WC, Rahmandar JJ, Benes JJ, Radyuk SN, Klichko VI, Sohal RS. Overexpression of Mn-containing superoxide dismutase in transgenic Drosophila melanogaster. Arch Biochem Biophys. 1999;371:260–269. doi: 10.1006/abbi.1999.1460.
- Molnar J, Fong KSK, He QP, Hayashi K, Kim Y, Fong SFT, Fogelgren B, Molnarne Szauter K, Mink M, Csiszar K. Structural and functional diversity of lysyl oxidase and the LOX-like proteins. Biochim Biophys Acta. 2003;1647:220–224. doi: 10.1016/s1570-9639(03)00053-0.
- Molnar J, Ujfaludi Z, Fong SF, Bollinger JA, Waro G, Fogelgren B, Dooley DM, Mink M, Csiszar K. Drosophila lysyl oxidases Dmloxl-1 and Dmloxl-2 are differentially expressed and the active DmLOXL-1 influences gene expression and development. J Biol Chem. 2005;280:22977–22985. doi: 10.1074/jbc.M503006200.
- Montooth KL, Rand DM. The spectrum of mitochondrial mutation differs across species. PLoS Biol. 2008;6:e213. doi: 10.1371/journal.pbio.0060213.
- Montoya-Burgos JI. Patterns of positive selection and neutral evolution in the protein-coding genes of Tetraodon and Takifugu. PLoS One. 2011;6:e24800. doi: 10.1371/journal.pone.0024800.
- Moreau CS, Bell CD, Vila R, Archibald SB, Pierce NE. Phylogeny of the ants: diversification in the age of angiosperms. Science. 2006;312:101–104. doi: 10.1126/science.1124891.
- Moretti S, Laurenczy B, Gharib WH, Castella B, Kuzniar A, Schabauer H, Studer RA, Valle M, Salamin N, Stockinger H, et al. Selectome update: quality control and computational improvements to a database of positive selection. Nucleic Acids Res. 2014;42:D917–D921. doi: 10.1093/nar/gkt1065.
- Munch D, Amdam GV, Wolschin F. Ageing in a eusocial insect: molecular and physiological characteristics of life span plasticity in the honey bee. Funct Ecol. 2008;22:407–421. doi: 10.1111/j.1365-2435.2008.01419.x.
- Munoz-Torres MC, Reese JT, Childers CP, Bennett AK, Sundaram JP, Childs KL, Anzola JM, Milshina N, Elsik CG. Hymenoptera Genome Database: integrated community resources for insect species of the order Hymenoptera. Nucleic Acids Res. 2011;39:D658–D662. doi: 10.1093/nar/gkq1145.
- Neumuller RA, Richter C, Fischer A, Novatchkova M, Neumuller KG, Knoblich JA. Genome-wide analysis of self-renewal in Drosophila neural stem cells by transgenic RNAi. Cell Stem Cell. 2011;8:580–593. doi: 10.1016/j.stem.2011.02.022.
- Niven JE, Scharlemann JP. Do insect metabolic rates at rest and during flight scale with body mass? Biol Lett. 2005;1:346–349. doi: 10.1098/rsbl.2005.0311.
- Notredame C, Higgins DG, Heringa J. T-coffee: a novel method for fast and accurate multiple sequence alignment. J Mol Biol. 2000;302:205–217. doi: 10.1006/jmbi.2000.4042.
- Nozawa M, Nei M. Evolutionary dynamics of olfactory receptor genes in Drosophila species. Proc Natl Acad Sci U S A. 2007;104:7122–7127. doi: 10.1073/pnas.0702133104.
- Nygaard S, Zhang G, Schiøtt M, Li C, Wurm Y, Hu H, Zhou J, Ji L, Qiu F, Rasmussen M, et al. The genome of the leaf-cutting ant Acromyrmex echinatior suggests key adaptations to advanced social life and fungus farming. Genome Res. 2011;21:1339–1348. doi: 10.1101/gr.121392.111.
- Oliveira DC, Raychoudhury R, Lavrov DV, Werren JH. Rapidly evolving mitochondrial genome and directional selection in mitochondrial genes in the parasitic wasp Nasonia (Hymenoptera: Pteromalidae). Mol Biol Evol. 2008;25:2167–2180. doi: 10.1093/molbev/msn159.
- Oliver TA, Garfield DA, Manier MK, Haygood R, Wray GA, Palumbi SR. Whole-genome positive selection and habitat-driven evolution in a shallow and a deep-sea urchin. Genome Biol Evol. 2010;2:800–814. doi: 10.1093/gbe/evq063.
- Ometto L, Shoemaker D, Ross KG, Keller L. Evolution of gene expression in fire ants: the effects of developmental stage, caste, and species. Mol Biol Evol. 2011;28:1381–1392. doi: 10.1093/molbev/msq322.
- Osumi-Sutherland D, Marygold S, Millburn G, McQuilton P, Ponting L, Stefancsik R, Falls K, Brown N, Gkoutos G. The Drosophila phenotype ontology. J Biomed Semantics. 2013;4:30. doi: 10.1186/2041-1480-4-30.
- Parker JD, Parker KM, Sohal BH, Sohal RS, Keller L. Decreased expression of Cu-Zn superoxide dismutase 1 in ants with extreme lifespan. Proc Natl Acad Sci U S A. 2004;101:3486–3489. doi: 10.1073/pnas.0400222101.
- Penick CA, Prager SS, Liebig J. Juvenile hormone induces queen development in late-stage larvae of the ant Harpegnathos saltator. J Insect Physiol. 2012;58:1643–1649. doi: 10.1016/j.jinsphys.2012.10.004.
- Penn O, Privman E, Landan G, Graur D, Pupko T. An alignment confidence score capturing robustness to guide tree uncertainty. Mol Biol Evol. 2010;27:1759–1767. doi: 10.1093/molbev/msq066.
- Perez-Campo R, López-Torres M, Cadenas S, Rojas C, Barja G. The rate of free radical production as a determinant of the rate of aging: evidence from the comparative approach. J Comp Physiol B Biochem Syst Environ Physiol. 1998;168:149–158. doi: 10.1007/s003600050131.
- Privman E, Penn O, Pupko T. Improving the performance of positive selection inference by filtering unreliable alignment regions. Mol Biol Evol. 2012;29:1–5. doi: 10.1093/molbev/msr177.
- Proux E, Studer RA, Moretti S, Robinson-Rechavi M. Selectome: a database of positive selection. Nucleic Acids Res. 2009;37:D404–D407. doi: 10.1093/nar/gkn768.
- Qi H, Rath U, Wang D, Xu YZ, Ding Y, Zhang W, Blacketer MJ, Paddy MR, Girton J, Johansen J, et al. Megator, an essential coiled-coil protein that localizes to the putative spindle matrix during mitosis in Drosophila. Mol Biol Cell. 2004;15:4854–4865. doi: 10.1091/mbc.E04-07-0579.
- Rajakumar R, San Mauro D, Dijkstra MB, Huang MH, Wheeler DE, Hiou-Tim F, Khila A, Cournoyea M, Abouheif E. Ancestral developmental potential facilitates parallel evolution in ants. Science. 2012;335:79–82. doi: 10.1126/science.1211451.
- Remolina SC, Chang PL, Leips J, Nuzhdin SV, Hughes KA. Genomic basis of aging and life-history evolution in Drosophila melanogaster. Evolution. 2012;66:3390–3403. doi: 10.1111/j.1558-5646.2012.01710.x.
- Richly E, Chinnery PF, Leister D. Evolutionary diversification of mitochondrial proteomes: implications for human disease. Trends Genet. 2003;19:356–362. doi: 10.1016/S0168-9525(03)00137-9.
- Robertson HM, Wanner KW. The chemoreceptor superfamily in the honey bee, Apis mellifera: expansion of the odorant, but not gustatory, receptor family. Genome Res. 2006;16:1395–1403. doi: 10.1101/gr.5057506.
- Roth P, Xylourgidis N, Sabri N, Uv A, Fornerod M, Samakovlis C. The Drosophila nucleoporin DNup88 localizes DNup214 and CRM1 on the nuclear envelope and attenuates NES-mediated nuclear export. J Cell Biol. 2003;163:701–706. doi: 10.1083/jcb.200304046.
- Schabauer H, Valle M, Pacher C, Stockinger H, Stamatakis A, Robinson-Rechavi M, Yang Z, Salamin N. SlimCodeML: an optimized version of CodeML for the branch-site model. 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum. Shanghai. p. 2012:706–714. [Google Scholar]
- Schneider A, Souvorov A, Sabath N, Landan G, Gonnet GH, Graur D. Estimates of positive Darwinian selection are inflated by errors in sequencing, annotation, and alignment. Genome Biol Evol. 2009;1:114–118. doi: 10.1093/gbe/evp012. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Schwander T, Lo N, Beekman M, Oldroyd BP, Keller L. Nature versus nurture in social insect caste differentiation. Trends Ecol Evol. 2010;25:275–282. doi: 10.1016/j.tree.2009.12.001. [DOI] [PubMed] [Google Scholar]
- Shen YY, Liang L, Zhu ZH, Zhou WP, Irwin DM, Zhang YP. Adaptive evolution of energy metabolism genes and the origin of flight in bats. Proc Natl Acad Sci U S A. 2010;107:8666–8671. doi: 10.1073/pnas.0912613107. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Simola DF, Wissler L, Donahue G, Waterhouse RM, Helmkampf M, Roux J, Nygaard S, Glastad KM, Hagen DE, Viljakainen L, et al. Social insect genomes exhibit dramatic evolution in gene composition and regulation while preserving regulatory features linked to sociality. Genome Res. 2013;23:1235–1247. doi: 10.1101/gr.155408.113. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Smith B, Ashburner M, Rosse C, Bard J, Bug W, Ceusters W, Goldberg LJ, Eilbeck K, Ireland A, Mungall CJ, et al. The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nat Biotechnol. 2007;25:1251–1255. doi: 10.1038/nbt1346. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Smith CD, Zimin A, Holt C, Abouheif E, Benton R, Cash E, Croset V, Currie CR, Elhaik E, Elsik CG, et al. Draft genome of the globally widespread and invasive Argentine ant (Linepithema humile) Proc Natl Acad Sci U S A. 2011;108:5673–5678. doi: 10.1073/pnas.1008617108. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Smith CR, Smith CD, Robertson HM, Helmkampf M, Zimin A, Yandell M, Holt C, Hu H, Abouheif E, Benton R, et al. Draft genome of the red harvester ant Pogonomyrmex barbatus. Proc Natl Acad Sci U S A. 2011;108:5667–5672. doi: 10.1073/pnas.1007901108.
- Smith CR, Toth AL, Suarez AV, Robinson GE. Genetic and genomic analyses of the division of labour in insect societies. Nat Rev Genet. 2008;9:735–748. doi: 10.1038/nrg2429.
- Stamatakis A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 2006;22:2688–2690. doi: 10.1093/bioinformatics/btl446.
- Stroschein-Stevenson SL, Foley E, O’Farrell PH, Johnson AD. Identification of Drosophila gene products required for phagocytosis of Candida albicans. PLoS Biol. 2005;4:e4. doi: 10.1371/journal.pbio.0040004.
- Studer RA, Penel S, Duret L, Robinson-Rechavi M. Pervasive positive selection on duplicated and nonduplicated vertebrate protein coding genes. Genome Res. 2008;18:1393–1402. doi: 10.1101/gr.076992.108.
- Suarez RK. Energy metabolism during insect flight: biochemical design and physiological performance. Physiol Biochem Zool. 2000;73:765–771. doi: 10.1086/318112.
- Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005;102:15545–15550. doi: 10.1073/pnas.0506580102.
- Suen G, Teiling C, Li L, Holt C, Abouheif E, Bornberg-Bauer E, Bouffard P, Caldera EJ, Cash E, Cavanaugh A, et al. The genome sequence of the leaf-cutter ant Atta cephalotes reveals insights into its obligate symbiotic lifestyle. PLoS Genet. 2011;7:e1002007. doi: 10.1371/journal.pgen.1002007.
- Swanson WJ, Nielsen R, Yang Q. Pervasive adaptive evolution in mammalian fertilization proteins. Mol Biol Evol. 2003;20:18–20. doi: 10.1093/oxfordjournals.molbev.a004233.
- Talavera G, Castresana J. Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst Biol. 2007;56:564–577. doi: 10.1080/10635150701472164.
- Tan H-W, Liu G-H, Dong X, Lin R-Q, Song H-Q, Huang S-Y, Yuan Z-G, Zhao G-H, Zhu X-Q. The complete mitochondrial genome of the Asiatic cavity-nesting honeybee Apis cerana (Hymenoptera: Apidae). PLoS One. 2011;6:e23008. doi: 10.1371/journal.pone.0023008.
- Tauber E, Eberl DF. Song production in auditory mutants of Drosophila: the role of sensory feedback. J Comp Physiol A. 2001;187:341–348. doi: 10.1007/s003590100206.
- Tintle N, Borchers B, Brown M, Bekmetjev A. Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16. BMC Proc. 2009;3:S96. doi: 10.1186/1753-6561-3-s7-s96.
- Tribolium Genome Sequencing Consortium. The genome of the model beetle and pest Tribolium castaneum. Nature. 2008;452:949–955. doi: 10.1038/nature06784.
- Trifunovic A, Hansson A, Wredenberg A, Rovio AT, Dufour E, Khvorostov I, Spelbrink JN, Wibom R, Jacobs HT, Larsson NG. Somatic mtDNA mutations cause aging phenotypes without affecting reactive oxygen species production. Proc Natl Acad Sci U S A. 2005;102:17993–17998. doi: 10.1073/pnas.0508886102.
- Trifunovic A, Wredenberg A, Falkenberg M, Spelbrink JN, Rovio AT, Bruder CE, Bohlooly-Y M, Gidlöf S, Oldfors A, Wibom R, et al. Premature ageing in mice expressing defective mitochondrial DNA polymerase. Nature. 2004;429:417–423. doi: 10.1038/nature02517.
- Tweedie S, Ashburner M, Falls K, Leyland P, McQuilton P, Marygold S, Millburn G, Osumi-Sutherland D, Schroeder A, Seal R, et al. FlyBase: enhancing Drosophila Gene Ontology annotations. Nucleic Acids Res. 2009;37:D555–D559. doi: 10.1093/nar/gkn788.
- Vamathevan J, Hasan S, Emes R, Amrine-Madsen H, Rajagopalan D, Topp SD, Kumar V, Word M, Simmons MD, Foord SM, et al. The role of positive selection in determining the molecular cause of species differences in disease. BMC Evol Biol. 2008;8:273. doi: 10.1186/1471-2148-8-273.
- Vilella AJ, Severin J, Ureta-Vidal A, Heng L, Durbin R, Birney E. EnsemblCompara GeneTrees: complete, duplication-aware phylogenetic trees in vertebrates. Genome Res. 2009;19:327–335. doi: 10.1101/gr.073585.107.
- Viljakainen L, Evans JD, Hasselmann M, Rueppell O, Tingek S, Pamilo P. Rapid evolution of immune proteins in social insects. Mol Biol Evol. 2009;26:1791–1801. doi: 10.1093/molbev/msp086.
- Wallace IM, O’Sullivan O, Higgins DG, Notredame C. M-Coffee: combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Res. 2006;34:1692–1699. doi: 10.1093/nar/gkl091.
- Wang T, Montell C. A phosphoinositide synthase required for a sustained light response. J Neurosci. 2006;26:12816–12825. doi: 10.1523/JNEUROSCI.3673-06.2006.
- Waterhouse RM, Zdobnov EM, Kriventseva EV. Correlating traits of gene retention, sequence divergence, duplicability and essentiality in vertebrates, arthropods, and fungi. Genome Biol Evol. 2011;3:75–86. doi: 10.1093/gbe/evq083.
- Waterhouse RM, Zdobnov EM, Tegenfeldt F, Li J, Kriventseva EV. OrthoDB: the hierarchical catalog of eukaryotic orthologs in 2011. Nucleic Acids Res. 2011;39:D283–D288. doi: 10.1093/nar/gkq930.
- Weng MP, Liao BY. DroPhEA: Drosophila phenotype enrichment analysis for insect functional genomics. Bioinformatics. 2011;27:3218–3219. doi: 10.1093/bioinformatics/btr530.
- Werren JH. Biology of Wolbachia. Annu Rev Entomol. 1997;42:587–609. doi: 10.1146/annurev.ento.42.1.587.
- Werren JH, Richards S, Desjardins CA, Niehuis O, Gadau J, Colbourne JK, The Nasonia Genome Working Group. Functional and evolutionary insights from the genomes of three parasitoid Nasonia species. Science. 2010;327:343–348. doi: 10.1126/science.1178028.
- Wong WS, Yang Z, Goldman N, Nielsen R. Accuracy and power of statistical methods for detecting adaptive evolution in protein coding sequences and for identifying positively selected sites. Genetics. 2004;168:1041–1051. doi: 10.1534/genetics.104.031153.
- Woodard SH, Fischman BJ, Venkat A, Hudson ME, Varala K, Cameron SA, Clark AG, Robinson GE. Genes involved in convergent evolution of eusociality in bees. Proc Natl Acad Sci U S A. 2011;108:7472–7477. doi: 10.1073/pnas.1103457108.
- Wurm Y, Wang J, Riba-Grognuz O, Corona M, Nygaard S, Hunt BG, Ingram KK, Falquet L, Nipitwattanaphon M, Gotzek D, et al. The genome of the fire ant Solenopsis invicta. Proc Natl Acad Sci U S A. 2011;108:5679–5684. doi: 10.1073/pnas.1009690108.
- Yang Z. Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol Biol Evol. 1998;15:568–573. doi: 10.1093/oxfordjournals.molbev.a025957.
- Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24:1586–1591. doi: 10.1093/molbev/msm088.
- Yang Z, dos Reis M. Statistical properties of the branch-site test of positive selection. Mol Biol Evol. 2011;28:1217–1228. doi: 10.1093/molbev/msq303.
- Yang Z, Nielsen R, Goldman N, Pedersen A-MK. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics. 2000;155:431–449. doi: 10.1093/genetics/155.1.431.
- Yang Z, Wong WSW, Nielsen R. Bayes empirical Bayes inference of amino acid sites under positive selection. Mol Biol Evol. 2005;22:1107–1118. doi: 10.1093/molbev/msi097.
- Yui R, Ohno Y, Matsuura ET. Accumulation of deleted mitochondrial DNA in aging Drosophila melanogaster. Genes Genet Syst. 2003;78:245–251. doi: 10.1266/ggs.78.245.
- Zhang G, Cowled C, Shi Z, Huang Z, Bishop-Lilly KA, Fang X, Wynne JW, Xiong Z, Baker ML, Zhao W, et al. Comparative analysis of bat genomes provides insight into the evolution of flight and immunity. Science. 2013;339:456–460. doi: 10.1126/science.1230835.
- Zhang J. Frequent false detection of positive selection by the likelihood method with branch-site models. Mol Biol Evol. 2004;21:1332–1339. doi: 10.1093/molbev/msh117.
- Zhang J, Nielsen R, Yang Z. Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Mol Biol Evol. 2005;22:2472–2479. doi: 10.1093/molbev/msi237.
- Zhou X, Slone JD, Rokas A, Berger SL, Liebig J, Ray A, Reinberg D, Zwiebel LJ. Phylogenetic and transcriptomic analysis of chemosensory receptors in a pair of divergent ant species reveals sex-specific signatures of odor coding. PLoS Genet. 2012;8:e1002930. doi: 10.1371/journal.pgen.1002930.
Data Availability Statement
Raw and filtered alignments used in these analyses, track files for the alignment editor Jalview (Clamp et al. 2004), Codeml control files, and result files can be downloaded at http://bioinfo.unil.ch/supdata/Roux_positive_selection_ants/Roux_et_al_datasets.tar.gz (last accessed April 24, 2014).
A simple web interface displaying gene families, GO mapping, Codeml results, and alignments (through a Jalview applet) is available at http://bioinfo.unil.ch/supdata/Roux_positive_selection_ants/families.html (last accessed April 24, 2014). Jalview tracks display the regions used or filtered out in the original protein alignments, as well as the residues found to be under positive selection by Bayes Empirical Bayes (Yang et al. 2005) in all the branches tested for each of the three replicate runs (fig. 2).