Molecular Biology and Evolution. 2014 Apr 29;31(7):1661–1685. doi: 10.1093/molbev/msu141

Patterns of Positive Selection in Seven Ant Genomes

Julien Roux 1,2,*, Eyal Privman 1,2,§, Sébastien Moretti 1,2,3, Josephine T Daub 1,2,4, Marc Robinson-Rechavi 1,2, Laurent Keller 1
PMCID: PMC4069625  PMID: 24782441

Abstract

The evolution of ants is marked by remarkable adaptations that allowed the development of very complex social systems. To identify how ant-specific adaptations are associated with patterns of molecular evolution, we searched for signs of positive selection on amino-acid changes in proteins. We identified 24 functional categories of genes which were enriched for positively selected genes in the ant lineage. We also reanalyzed genome-wide data sets in bees and flies with the same methodology to check whether positive selection was specific to ants or also present in other insects. Notably, genes implicated in immunity were enriched for positively selected genes in the three lineages, ruling out the hypothesis that the evolution of hygienic behaviors in social insects caused a major relaxation of selective pressure on immune genes. Our scan also indicated that genes implicated in neurogenesis and olfaction started to undergo increased positive selection before the evolution of sociality in Hymenoptera. Finally, the comparison between these three lineages allowed us to pinpoint molecular evolution patterns that were specific to the ant lineage. In particular, there was ant-specific recurrent positive selection on genes with mitochondrial functions, suggesting that mitochondrial activity was improved during the evolution of this lineage. This might have been an important step toward the evolution of extreme lifespan that is a hallmark of ants.

Keywords: comparative genomics, sociality, dN/dS, aging, lifespan, immunity, neurogenesis, olfactory receptors, metabolism, Hymenoptera, bees, Drosophila

Introduction

Ants constitute an extremely successful lineage of animals which has colonized virtually all ecosystems on Earth (Hölldobler and Wilson 1990). The pivotal feature at the basis of this ecological success is their highly social system with a reproductive division of labor, where one or a few queens specialize in reproduction, whereas workers conduct all the colony tasks such as brood care, nest maintenance, and food collection. In this article, we take advantage of the recent availability of seven sequenced ant genomes (Bonasio et al. 2010; Nygaard et al. 2011; Smith, Zimin, et al. 2011; Smith, Smith, et al. 2011; Suen et al. 2011; Wurm et al. 2011) to perform a genome-wide scan for positive selection on amino-acid changes in protein coding genes during the evolution of the ant lineage. We addressed three main questions.

First, we compared the amount of positive selection in functional categories of genes. Previous large-scale scans for positive selection in animals indicated that positive selection predominantly affects certain types of genes, such as those involved in evolutionary arms races, sexual selection, or conflicts with pathogens (Bakewell et al. 2007; Drosophila 12 Genomes Consortium 2007; Kosiol et al. 2008; Vamathevan et al. 2008; Oliver et al. 2010; George et al. 2011; Woodard et al. 2011). Such genes experienced positive selection events recurrently on broad evolutionary time scales, and it is likely that they contribute to a fraction of the positive selection events that occurred in the ant lineage. To identify these genes, we reasoned that they likely also were under positive selection in other insect lineages. A systematic comparison of the targets of positive selection from published studies in insects is not straightforward because genome-wide scans for positive selection were often performed with different methods in different lineages. For example, a positive selection scan on 12 Drosophila species (all solitary) (Drosophila 12 Genomes Consortium 2007) used the site test of Codeml (Yang et al. 2000), which is aimed at detecting recurrent positive selection events affecting particular sites of a protein, whereas a scan on 10 bee species (including solitary, primitively social, and highly social species) (Woodard et al. 2011) used the branch test (Yang 1998), which tends to detect positive selection events affecting a large number of sites of a protein but during a limited period of time. To perform a robust comparison of the genes that were under positive selection in ants and other insects, we conducted similar scans for positive selection in ants and in the fly and bee outgroups. Genes involved in defense and immunity are an example of genes expected to be repeatedly under positive selection in insects (Drosophila 12 Genomes Consortium 2007; Bulmer 2010).
On the basis of the observed smaller set of immunity genes in the honeybee compared with Drosophila melanogaster, it has been suggested that selective pressure on these genes might have been relaxed in social insects, perhaps because they have social hygienic behaviors (Honeybee Genome Sequencing Consortium 2006; Smith et al. 2008; Viljakainen et al. 2009; Smith, Zimin, et al. 2011; Suen et al. 2011; Harpur and Zayed 2013). However, the addition of several newly sequenced insect genomes revealed that the large immune gene complement of the fruit fly is a derived character (Werren et al. 2010; Fischman et al. 2011; Smith, Smith, et al. 2011). We used our data sets to test whether there was evidence for weaker positive selection on immunity genes in ants and bees compared with flies.

Next, we aimed at detecting sets of genes involved in functions likely to reflect ant-specific adaptations. We focused on three main adaptations. The first relates to the wide range of coordinated collective behaviors associated with division of labor in ant societies. Complex cooperative behaviors occur among nestmates for tasks such as communal nest construction and defense, brood rearing, social hygienic behavior, and collective foraging (Hölldobler and Wilson 1990). It has been suggested that the evolution of social interactions may be traced back to molecular changes affecting nervous system development and function. In particular, it may translate into increased rates of positive selection on nervous system-related genes, as documented in primitively social lineages of bees, which evolved social behaviors independently from ants (Fischman et al. 2011; Woodard et al. 2011). Complex collective behaviors also require efficient communication systems that are essentially mediated by chemical signaling in social insects. Ants distinguish nestmates from non-nestmates, as well as ants from other species, through their scent. Individuals also use various types of pheromones as alarm signals and to mark their trails and territories. It has therefore been suggested that genes involved in chemical signaling, notably pheromone production and perception, should experience increased positive selection in ants compared with solitary insects (Ingram et al. 2005; Robertson and Wanner 2006; Bonasio et al. 2010; Smith, Zimin, et al. 2011; Wurm et al. 2011; Kulmuni et al. 2013; Leboeuf et al. 2013). A manually curated data set of 873 olfactory receptor genes (ORs) allowed us to conduct a test for increased positive selection on these genes in ants.

The second type of potential molecular adaptation relates to phenotypic plasticity among castes. Although queens and workers usually develop from totipotent eggs (Schwander et al. 2010), they display dramatic morphological and physiological differences. Queens are often larger, have wings, and have much more highly developed ovaries than workers, which are often sterile and lack a sperm storage organ (Hölldobler and Wilson 1990). In most species, the differences between castes result from developmental differences induced by environmental factors rather than genetic differences (Abouheif and Wray 2002; Schwander et al. 2010; Penick et al. 2012; Rajakumar et al. 2012). We therefore investigated whether there was evidence for increased positive selection in genes and pathways potentially involved in developmental plasticity (Smith et al. 2008; Fischman et al. 2011).

A third type of ant-specific adaptation relates to the extremely long lifespan of ant queens, which can live more than 20 years in some species (Keller and Genoud 1997; Jemielity et al. 2005). This corresponds to a 100-fold increase in lifespan compared with solitary insects. The variation in lifespan among castes is also remarkable, with queens living up to 10 times longer than workers and 500 times longer than males. So far, a limited number of molecular candidates have been identified to explain this pattern, mainly inspired from work in Drosophila (Jemielity et al. 2005; Keller and Jemielity 2006). We therefore investigated whether there was evidence of positive selection on genes that have previously been associated with aging in model organisms. It is possible that positive selection acted on the same sets of genes in the bee lineage, where queens also live longer than other castes and than solitary insects, but such a signal should not be observed in short-lived species of the Drosophila lineage. To further assess the link between positive selection and aging, we investigated whether genes that experienced positive selection in ants were genes shown in D. melanogaster to be differentially expressed between old and young individuals, and between oxygen-stressed and control individuals (Landis et al. 2004).

Finally, we investigated whether there was a difference in the level of positive selection between genes showing biased expression in queens, workers, and males. The efficiency of natural selection acting on an advantageous mutation, and thus the probability of its long-term fixation, is proportional to its effect on fitness (Duret 2008). The fitness effects of mutations in genes that are expressed only in nonreproductive workers are indirect, so everything else being equal, selection should be less efficient at fixing them than mutations in genes expressed in queens and males. This could translate into lower levels of positive selection on genes expressed specifically in workers compared with males and queens (Linksvayer and Wade 2009; Hall and Goodisman 2012). We therefore analyzed previously published microarray data from the red fire ant Solenopsis invicta (Ometto et al. 2011), and compared the amount of positive selection between groups of genes varying in the level of caste-biased expression.

Results

Pervasive Positive Selection Detected in Ants

To detect positive selection episodes that acted on protein-coding genes during the evolution of the ant lineage, the branch-site test of Codeml was run on 4,261 protein alignments of single-copy orthologs composed of four to seven ant and three to five outgroup species (see Materials and Methods). All branches that led to ant species in each gene family tree (including 2 hymenopteran and 13 ant branches; fig. 1) were successively tested for the presence of episodic positive selection. As many as 1,832 single-copy ortholog families (43%) displayed a signal of positive selection (at 10% false discovery rate [FDR]) in at least one of the branches tested (supplementary table S1, Supplementary Material online). In 91% of the significant alignments, at least one residue targeted by positive selection could be identified with a posterior probability greater than 0.9 (Bayes Empirical Bayes test; fig. 2) (Yang et al. 2005). There was evidence for positive selection in at least one branch of the ant lineage for 830 (20%) of the genes analyzed. For 74% of them, positive selection was specific to ants and not observed in the basal hymenopteran branches #7 and #8 (fig. 1). The 10 gene families with the most significant test values in the ant lineage are given in table 1.
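To make the "10% FDR" threshold concrete, the following is a minimal sketch of one common way to control the false discovery rate over many branch tests. It assumes the Benjamini-Hochberg step-up procedure; this section does not state which FDR procedure was actually used, and the function name and the toy P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for five branch tests (not taken from the paper).
toy_p = [0.001, 0.02, 0.04, 0.30, 0.80]
q = benjamini_hochberg(toy_p)
# A test is called significant when its adjusted value is at most 0.10.
significant = [qi <= 0.10 for qi in q]
```

With these toy values, the first three tests pass the 10% threshold and the last two do not.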

Fig. 1.

Phylogeny of the seven sequenced ant species and the five outgroups used in this study. The maximum-likelihood phylogeny was computed by R. Waterhouse from the concatenated alignment of the conserved protein sequences of 2,756 single-copy orthologs from OrthoDB (Simola et al. 2013). The scale bar indicates the average number of amino acid substitutions per site. The phylogeny is consistent with a previously published study (Brady et al. 2006). A second study only found a difference in the branching of Pogonomyrmex barbatus and Solenopsis invicta (Moreau et al. 2006). The 15 different branches where positive selection was tested are highlighted in red (the seven terminal branches leading to ant species and the branches numbered #1 to #8). The percentage of gene families showing positive selection in each of these branches at FDR = 10% is displayed in table 2. Illustrations of the seven ant species and Apis mellifera are courtesy of Alexander Wild at http://www.alexanderwild.com (last accessed April 24, 2014). Pediculus humanus illustration was downloaded from Vectorbase, Drosophila melanogaster, Tribolium castaneum, and Nasonia vitripennis illustrations were downloaded from Wikipedia. Illustrations are not to scale.

Fig. 2.

Protein alignment view of positive selection signal on gene family 11650. See table 1 for description of the potential function of this gene family. Protein alignment is shown partially, from position 230 to 350, and Drosophila melanogaster, Tribolium castaneum, and Pediculus humanus genes were removed by MaxAlign because of insufficient alignment quality (Materials and Methods). The second annotation track under the protein alignment (branch 5 BEB site) indicates positively selected sites on the tested branch #5 (fig. 1). Site 285 of the alignment (indicated with a red arrow) has been selected; it shows the fixation of isoleucine in lieu of the ancestral tryptophan, in the formicoid clade including six of the seven ant species.

Table 1.

Top Scoring Gene Families at Branch-Site and Site Tests for Positive Selection.

Test Used Gene Family Branch Δln L P-value FDR dS ω (proportion) Drosophila melanogaster Gene Name Function Annotated in Flybase and Uniprot Duplicates in Ants Uniprot ID References
Branch-site test 150 6 16.1 1.4e-8 2.2e-6 0.93 281 (4.2%) Tequila Serine-type endopeptidase activity; long-term memory; aging O45029 Didelot et al. (2006), Chen et al. (2012), and Remolina et al. (2012)
11650(a) 5 12.1 8.4e-7 7.9e-5 0.059 299 (2.5%) CG17321 Unknown Q9VJ40
453 6 11.4 1.9e-6 1.5e-4 0.29 44 (2.2%) Guanine nucleotide exchange factor in mesoderm Ral GTPase binding; imaginal disc-derived wing vein specification A1ZBA1 Blanke and Jackle (2006)
361 1 10.9 3.0e-6 2.3e-4 0.090 1.2 (6.9%) Megator Spindle assembly A1Z8P9 Qi et al. (2004)
5623 1 10.6 4.0e-6 3.0e-4 0.13 24 (4.6%) Methylthioribose-1-phosphate isomerase Catalyzes interconversion of methylthioribose-1-phosphate into methylthioribulose-1-phosphate; wing disc development Q9V9X4 Bronstein et al. (2010)
1050 6 10.1 6.7e-6 4.7e-4 0.49 45 (3%) Dis3 Regulation of gene expression; nuclear RNA surveillance; neurogenesis Q8MSY2 Kuan et al. (2009), Kiss and Andrulis (2010), Neumuller et al. (2011)
793 4 9.6 1.2e-5 7.6e-4 0.085 ∞ (0.6%) Embargoed Protein binding; protein transporter activity; protein export from nucleus; multicellular organismal development; centriole replication Q9TVM2 Collier et al. (2000); Roth et al. (2003)
8639 6 9.5 1.4e-5 8.3e-4 0.26 ∞ (11%) ATP synthase, subunit b, mitochondria Hydrogen-exporting ATPase activity, phosphorylative mechanism; phagocytosis, engulfment Q94516 Stroschein-Stevenson et al. (2005)
3983 4 8.9 2.4e-5 0.0014 0.036 ∞ (0.5%) Lysyl oxidase-like 2 Protein-lysine 6-oxidase activity Q8IH65 Molnar et al. (2003), Molnar et al. (2005)
2208 6 8.7 3.0e-5 0.0016 0.49 ∞ (1.2%) Cytochrome P450 reductase NADPH-hemoprotein reductase activity; oxidation-reduction process; putative function in olfactory clearance Q27597 Hovemann et al. (1997)
Site test 3245 10.6 4.2e-6 0.0041 2.4 (2.1%) CG6752 Unknown Yes Q9VFC4, Q8SZS1
6214 8.3 4.6e-5 0.038 8.3 (0.9%) CG42343 Unknown No B7Z153, Q9VRI6
6649 8.0 6.6e-5 0.045 4.9 (4.7%) CG7845 Muscle cell homeostasis No Q7K4B2 Kucherenko et al. (2011)
5707 7.9 7.2e-5 0.045 3.4 (2.1%) Mitochondrial ribosomal protein L37 Structural constituent of ribosome; translation No Q9VGW9, Q3YNF4, Q3YNF5 Kim, Morrow, et al. (2010)
2372 7.8 7.6e-5 0.045 3.2 (1.9%) Mitochondrial trifunctional protein α subunit Long-chain-3-hydroxyacyl-CoA dehydrogenase activity; long-chain-enoyl-CoA hydratase activity; response to starvation; determination of adult lifespan; fatty acid beta-oxidation; wound healing No Q8IPE8, Q9V397 Kishita, Tsuda, Aigaki (2012)
8490 7.4 1.2e-4 0.062 6.2 (2.0%) Phosphatidylinositol synthase CDP-diacylglycerol-inositol 3-phosphatidyltransferase activity; phototransduction No Q8IR29, Q8SX37 Wang and Montell (2006)
3891 6.8 2.2e-4 0.11 9.4 (0.3%) CG1607 Potential amino acid transmembrane transporter activity No Q9V9Y0, Q95T33
2074 6.5 3.1e-4 0.14 4.2 (4.0%) CG9715 Unknown Yes Q9VVA9, Q960D5
1584 6.3 4.0e-4 0.17 2.3 (2.7%) Unextended Potential role in cellular ion homeostasis No A8Y516
1053 6.2 4.2e-4 0.17 9.1 (0.2%) Coat protein (coatomer) β Biosynthetic protein transport from the ER, via the Golgi up to the trans Golgi network. Required for limiting lipid storage in lipid droplets. Involved in innate immune response No P45437 Bard et al. (2006), Beller et al. (2008), Cronin et al. (2009)

Note.–Gene families are ranked based on their log-likelihood ratios (Δln L). Results of the branch-site test were filtered to keep only internal ant branches of the phylogenetic tree (branches #1 to #6) and with a dS on the tested branch below 1. Results of both tests were filtered to keep families with a good support for the detection of sites evolving under positive selection (BEB posterior probability > 0.9). Manual inspection of the best hits confirmed that the signal of positive selection seemed genuine for all cases, except for family 12370 in the branch-site test results, which was removed from the list.

(a) Example used in figure 2.
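The relationship between the Δln L and P-value columns of table 1 can be illustrated with a short sketch. It assumes that the likelihood-ratio statistic 2Δln L is compared with a chi-square distribution with 1 degree of freedom; this is an inference from the tabulated numbers, not a statement of the authors' exact procedure, and the function name is hypothetical.

```python
import math


def lrt_p_value_df1(delta_lnl):
    """P-value of a likelihood-ratio test with 1 degree of freedom.

    The test statistic is 2 * delta_lnl. For a chi-square variable with
    1 df, the upper-tail probability at x equals erfc(sqrt(x / 2)).
    """
    statistic = 2.0 * delta_lnl
    return math.erfc(math.sqrt(statistic / 2.0))


# Family 150 on branch #6 (table 1): delta ln L = 16.1.
p_family_150 = lrt_p_value_df1(16.1)
```

Under this assumption, Δln L = 16.1 gives a P-value of roughly 1.4e-8, in line with the value listed for family 150. Small discrepancies for other rows are expected because Δln L is rounded to one decimal in the table.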

The proportion of positively selected genes varied significantly across the different branches tested (χ2 test, P < 1e-15; table 2), similarly to previous analyses with experimental and simulated data sets. This likely results at least in part from lower power of the branch-site test in shorter branches (Anisimova and Yang 2007; Kosiol et al. 2008; Studer et al. 2008; Fletcher and Yang 2010; George et al. 2011; Gharib and Robinson-Rechavi 2013). Consistent with this view, there was a significant correlation in our data set between the length of tested branches and the test score (log-likelihood ratio; Spearman correlation ρ = 0.41, P < 1e-15). Additional analyses ruled out the hypothesis that false positives caused by convergence problems of the test, selective constraints acting on synonymous sites, saturation of synonymous substitution rate dS, or sequencing errors could be responsible for this pattern (supplementary text, Supplementary Material online).

Table 2.

Amount of Positive Selection Detected on Different Branches of the Analyzed Phylogeny.

Branch Name(a) Lineage Delineated Fraction of Positively Selected Gene Families Number of Positively Selected Gene Families(b)
Acep Atta cephalotes 0.056 144
Aech Acromyrmex echinatior 0.043 109
Sinv Solenopsis invicta 0.029 85
Pbar Pogonomyrmex barbatus 0.038 80
Cflo Camponotus floridae 0.017 65
Lhum Linepithema humile 0.036 97
Hsal Harpegnathos saltator 0.020 76
1 Attini 0.0088 16
2 Myrmicinae 0.0071 10
3 Myrmicinae 0.0087 16
4 Formicoid 0.0072 17
5 Formicoid 0.025 58
6 Formicidae 0.030 87
7 Aculeata 0.10 176
8 Hymenoptera/Apocrita 0.39 762
All above branches except 7 and 8 Formicidae 0.20 830
All above branches Hymenoptera/Apocrita 0.43 1,832

(a) As illustrated in figure 1.

(b) Branches of gene family trees can be merged if genes are missing (or removed for quality reasons), and the resulting branches do not correspond to canonical branches defined by the species topology (fig. 1). When positive selection was found on such branches, it was not counted in the branch-specific numbers displayed in table 2, but it was counted when a whole lineage was considered (e.g., Hymenoptera).
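The comparison of proportions across branches can be sketched as a chi-square test of homogeneity on the 15 individual branches of table 2. The number of gene families tested per branch is not given in this section, so the sketch approximates it as count / fraction; because the fractions are rounded, these denominators are rough and the statistic is illustrative only. All names are hypothetical.

```python
# (branch, fraction positively selected, number positively selected), table 2.
TABLE2 = [
    ("Acep", 0.056, 144), ("Aech", 0.043, 109), ("Sinv", 0.029, 85),
    ("Pbar", 0.038, 80), ("Cflo", 0.017, 65), ("Lhum", 0.036, 97),
    ("Hsal", 0.020, 76), ("1", 0.0088, 16), ("2", 0.0071, 10),
    ("3", 0.0087, 16), ("4", 0.0072, 17), ("5", 0.025, 58),
    ("6", 0.030, 87), ("7", 0.10, 176), ("8", 0.39, 762),
]


def chi_square_homogeneity(rows):
    """Chi-square statistic and degrees of freedom for a 2 x k table."""
    # Assumption: families tested per branch is approximately count / fraction.
    tested = [round(n / f) for _, f, n in rows]
    positives = [n for _, _, n in rows]
    overall_rate = sum(positives) / sum(tested)
    stat = 0.0
    for pos, tot in zip(positives, tested):
        expected_pos = tot * overall_rate
        expected_neg = tot * (1.0 - overall_rate)
        stat += (pos - expected_pos) ** 2 / expected_pos
        stat += ((tot - pos) - expected_neg) ** 2 / expected_neg
    return stat, len(rows) - 1


stat, df = chi_square_homogeneity(TABLE2)
```

The resulting statistic is compared with a chi-square distribution with 14 degrees of freedom. Most of its magnitude comes from branches #7 and #8, whose fractions are far above the pooled rate.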

Taken together, these results demonstrate that positive selection was common in the evolution of ant genes. The proportion of significant genes was similar in magnitude in the outgroup data set of 10 bees analyzed with the same methodology (20%; supplementary table S2, Supplementary Material online), but even higher in the outgroup data set of 12 flies (36%; supplementary table S3, Supplementary Material online). This difference might reflect biological differences between the lineages, such as effective population size Ne, but also differences in the topology and branch lengths of the species trees, which influence the power to detect positive selection events in protein alignments (see supplementary text, Supplementary Material online).

To compare the amount of positive selection experienced by different functional categories of genes, we classified genes based on their Gene Ontology (GO) annotation in D. melanogaster orthologs, and performed a gene set enrichment test using for each gene family a score reflecting the overall occurrence of positive selection in the ant lineage (Materials and Methods; supplementary text, Supplementary Material online). Such an approach of grouping genes enables a more sensitive search for positive selection, while buffering the impact of potential false positives (e.g., from remaining alignment errors or GC-biased gene conversion events which are difficult to distinguish from real positive selection signals; see supplementary text and supplementary tables S20–S24, Supplementary Material online). Twenty-four functional categories of genes were significantly enriched for positively selected genes in the ant lineage (at 20% FDR; table 3). A large number of them (11 out of 24) were related to mitochondria and mitochondrial activity. The other significant categories were related to nervous system development, behavior, immunity, protein translation and degradation, cell membrane, and receptor activity. Thus, positive selection apparently targeted a diverse array of gene functions during the evolution of the ant lineage.

Table 3.

GO Categories Enriched for Positively Selected Genes, Based on Scores from the Branch-Site Test from Codeml in Ants.

SetID Ontology SetName SetSize Score P-value FDR
GO:0000313 CC Organellar ribosome 59 26.8 1.4e-10 0
GO:0006120 BP Mitochondrial electron transport, NADH to ubiquinone 18 10.6 1.1e-9 0
GO:0005759 CC Mitochondrial matrix 98 39.7 1.6e-9 0
GO:0005762 CC Mitochondrial large ribosomal subunit 36 16.8 1.1e-7 0.0025
GO:0005746 CC Mitochondrial respiratory chain 31 14.6 4.5e-7 0.0033
GO:0005747 CC Mitochondrial respiratory chain complex I 22 11.0 1.3e-6 0.0033
GO:0008137 MF NADH dehydrogenase (ubiquinone) activity 16 8.0 3.2e-5 0.013
GO:0005763 CC Mitochondrial small ribosomal subunit 25 10.9 0.00018 0.047
GO:0008038 BP Neuron recognition 19 8.7 0.00023 0.047
GO:0008344 BP Adult locomotory behavior 19 8.4 0.00082 0.086
GO:0042254 BP Ribosome biogenesis 39 15.0 0.0011 0.099
GO:0003735 MF Structural constituent of ribosome 107 36.4 0.0012 0.099
GO:0044459 CC Plasma membrane part 129 42.9 0.0016 0.12
GO:0006508 BP Proteolysis 145 47.4 0.0022 0.14
GO:0006412 BP Translation 191 61.0 0.0025 0.15
GO:0016491 MF Oxidoreductase activity 127 41.8 0.0028 0.15
GO:0004872 MF Receptor activity 90 30.6 0.0028 0.15
GO:0055114 BP Oxidation-reduction process 129 42.2 0.0038 0.16
GO:0008237 MF Metallopeptidase activity 36 13.6 0.0039 0.16
GO:0061134 MF Peptidase regulator activity 17 7.2 0.0046 0.18
GO:0002520 BP Immune system development 26 10.2 0.0053 0.19
GO:0048534 BP Hemopoietic or lymphoid organ development 26 10.2 0.0053 0.19
GO:0016616 MF Oxidoreductase activity, acting on the CH–OH group of donors, NAD, or NADP as acceptor 18 7.5 0.0053 0.19
GO:0016836 MF Hydro-lyase activity 14 6.1 0.0055 0.19

Note.—The enrichment test considers a combined score for all analyzed branches of the ant lineage (Materials and Methods). The full table of results is shown in supplementary table S6, Supplementary Material online.
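The general idea of a score-based gene set enrichment test can be sketched as follows. The actual per-gene score and null model are described in Materials and Methods, which are not part of this section, so the sketch assumes the simplest variant: the set score is the sum of per-gene scores, and its significance is estimated by drawing random gene sets of the same size. The function name, the toy scores, and the toy set are all hypothetical.

```python
import random


def set_enrichment_p(scores, gene_set, n_perm=2000, seed=0):
    """Permutation p-value for the summed score of a gene set.

    scores   : dict mapping gene id -> per-gene positive selection score
    gene_set : list of gene ids belonging to the functional category
    """
    rng = random.Random(seed)
    genes = list(scores)
    observed = sum(scores[g] for g in gene_set)
    k = len(gene_set)
    hits = 0
    for _ in range(n_perm):
        # Random set of the same size drawn from all scored genes.
        sample = rng.sample(genes, k)
        if sum(scores[g] for g in sample) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical data: 200 genes, the first 10 form a set with elevated scores.
toy_scores = {f"g{i}": (1.0 if i < 10 else 0.1 * (i % 5)) for i in range(200)}
toy_set = [f"g{i}" for i in range(10)]
observed, p = set_enrichment_p(toy_scores, toy_set)
```

Summing scores over a set, rather than counting genes that individually pass a significance threshold, is what lets many weak signals in one category add up, which matches the motivation given in the text for grouping genes.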

Usual Targets of Positive Selection in Insects

To identify GO categories that experienced positive selection not only in ants but also in other insects, we reanalyzed the fly and bee data sets with the same methodology used for the ant data set. These analyses revealed 106 GO categories significantly enriched for flies and 38 for bees (tables 4 and 5; supplementary tables S4 and S5, Supplementary Material online). We investigated which categories were enriched for positively selected genes in the three lineages. The first group of genes commonly enriched in ants, flies, and bees was related to proteolysis. This group included 4 of the 24 significantly enriched GO categories in ants (“proteolysis,” “metallopeptidase activity,” “peptidase regulator activity,” and “hydro-lyase activity”), 8 of the 106 GO categories enriched in flies (“serine-type endopeptidase activity,” “endopeptidase activity,” “proteolysis,” “metalloendopeptidase activity,” “peptidase activity,” “peptidase activity, acting on L-amino acid peptides,” “metallopeptidase activity,” and “exopeptidase activity”), and 6 of the 38 GO categories enriched in bees (“amine metabolic process,” “metalloendopeptidase activity,” “metallopeptidase activity,” “signal transduction,” “cellular amine metabolic process,” and “cellular amino acid metabolic process”).

Table 4.

GO Categories Enriched for Positively Selected Genes, Based on Scores from the Branch-Site Test from Codeml in Drosophila.

SetID Ontology SetName SetSize Score P-value FDR
GO:0006030 BP Chitin metabolic process 29 16.2 4.0e-6 0.0015
GO:0006022 BP Aminoglycan metabolic process 36 19.4 5.7e-6 0.0018
GO:0006952 BP Defense response 36 19.2 9.8e-6 0.0018
GO:0008061 MF Chitin binding 24 13.5 2.0e-5 0.0020
GO:0004252 MF Serine-type endopeptidase activity 52 26.0 3.3e-5 0.0023
GO:0008026 MF ATP-dependent helicase activity 18 10.5 4.0e-5 0.0026
GO:0004872 MF Receptor activity 13 7.9 8.5e-5 0.0048
GO:0006006 BP Glucose metabolic process 13 7.8 0.00021 0.0082
GO:0046486 BP Glycerolipid metabolic process 17 9.7 0.00023 0.0090
GO:0005819 CC Spindle 20 10.9 0.00046 0.012
GO:0004175 MF Endopeptidase activity 78 35.9 0.00048 0.012
GO:0009607 BP Response to biotic stimulus 31 15.8 0.00060 0.013
GO:0051707 BP Response to other organism 31 15.8 0.00060 0.013
GO:0006508 BP Proteolysis 136 59.5 0.00071 0.014
GO:0006007 BP Glucose catabolic process 12 7.0 0.00072 0.014
GO:0019320 BP Hexose catabolic process 12 7.0 0.00072 0.014
GO:0030312 CC External encapsulating structure 12 7.0 0.00074 0.014
GO:0015081 MF Sodium ion transmembrane transporter activity 16 8.9 0.00088 0.016
GO:0051649 BP Establishment of localization in cell 14 7.9 0.00093 0.017
GO:0007126 BP Meiosis 34 16.9 0.00096 0.017
GO:0003824 MF Catalytic activity 844 337.0 0.0011 0.018
GO:0046488 BP Phosphatidylinositol metabolic process 11 6.5 0.0012 0.018
GO:0016490 MF Structural constituent of peritrophic membrane 11 6.4 0.0013 0.018
GO:0005975 BP Carbohydrate metabolic process 64 29.5 0.0015 0.019
GO:0004888 MF Transmembrane receptor activity 49 23.2 0.0016 0.020
GO:0051276 BP Chromosome organization 59 27.2 0.0020 0.024
GO:0008270 MF Zinc ion binding 173 73.4 0.0027 0.029
GO:0002376 BP Immune system process 43 20.4 0.0027 0.029
GO:0002759 BP Regulation of antimicrobial humoral response 11 6.3 0.0028 0.029
GO:0004984 MF Olfactory receptor activity 19 9.9 0.0030 0.029
GO:0002697 BP Regulation of immune effector process 11 6.2 0.0038 0.035
GO:0000819 BP Sister chromatid segregation 11 6.2 0.0038 0.035
GO:0007143 BP Female meiosis 11 6.2 0.0047 0.040
GO:0016021 CC Integral to membrane 224 93.1 0.0047 0.040
GO:0031347 BP Regulation of defense response 12 6.6 0.0055 0.044
GO:0015370 MF Solute:sodium symporter activity 12 6.6 0.0061 0.047
GO:0000272 BP Polysaccharide catabolic process 11 6.1 0.0065 0.049
GO:0016810 MF Hydrolase activity, acting on carbon–nitrogen (but not peptide) bonds 28 13.6 0.0065 0.049
GO:0004521 MF Endoribonuclease activity 11 6.1 0.0069 0.052
GO:0007291 BP Sperm individualization 14 7.4 0.0074 0.053
GO:0010564 BP Regulation of cell cycle process 30 14.4 0.0075 0.053
GO:0005635 CC Nuclear envelope 17 8.7 0.0077 0.054
GO:0016773 MF Phosphotransferase activity, alcohol group as acceptor 82 36.0 0.0077 0.054
GO:0051253 BP Negative regulation of RNA metabolic process 35 16.5 0.0080 0.055
GO:0007608 BP Sensory perception of smell 21 10.5 0.0080 0.055
GO:0004222 MF Metalloendopeptidase activity 26 12.7 0.0082 0.055
GO:0006807 BP Nitrogen compound metabolic process 298 121.5 0.0090 0.059
GO:0005576 CC Extracellular region 97 41.9 0.010 0.066
GO:0006814 BP Sodium ion transport 19 9.5 0.011 0.068
GO:0045132 BP Meiotic chromosome segregation 11 5.9 0.011 0.070
GO:0034641 BP Cellular nitrogen compound metabolic process 296 120.4 0.012 0.072
GO:0010629 BP Negative regulation of gene expression 42 19.2 0.013 0.073
GO:0090304 BP Nucleic acid metabolic process 162 67.6 0.013 0.075
GO:0016301 MF Kinase activity 91 39.2 0.013 0.075
GO:0048584 BP Positive regulation of response to stimulus 11 5.9 0.015 0.084
GO:0016798 MF Hydrolase activity, acting on glycosyl bonds 26 12.4 0.017 0.086
GO:0006139 BP Nucleobase, nucleoside, nucleotide, and nucleic acid metabolic process 236 96.4 0.017 0.086
GO:0016491 MF Oxidoreductase activity 201 82.6 0.018 0.090
GO:0009987 BP Cellular process 790 310.5 0.019 0.095
GO:0007088 BP Regulation of mitosis 16 8.0 0.019 0.095
GO:0051783 BP Regulation of nuclear division 16 8.0 0.019 0.095
GO:0006810 BP Transport 200 82.1 0.021 0.10
GO:0051234 BP Establishment of localization 197 80.9 0.021 0.10
GO:0006066 BP Alcohol metabolic process 35 16.1 0.021 0.10
GO:0004553 MF Hydrolase activity, hydrolyzing O-glycosyl compounds 22 10.6 0.022 0.11
GO:0008233 MF Peptidase activity 26 12.3 0.022 0.11
GO:0070011 MF Peptidase activity, acting on l-amino acid peptides 22 10.6 0.022 0.11
GO:0046914 MF Transition metal ion binding 58 25.5 0.022 0.11
GO:0050660 MF Flavin adenine dinucleotide binding 16 8.0 0.024 0.11
GO:0045892 BP Negative regulation of transcription, DNA-dependent 26 12.2 0.024 0.11
GO:0032553 MF Ribonucleotide binding 161 66.5 0.024 0.11
GO:0032555 MF Purine ribonucleotide binding 161 66.5 0.024 0.11
GO:0035639 MF Purine ribonucleoside triphosphate binding 161 66.5 0.024 0.11
GO:0006396 BP RNA processing 36 16.4 0.025 0.11
GO:0031226 CC Intrinsic to plasma membrane 30 13.9 0.026 0.11
GO:0035222 BP Wing disc pattern formation 11 5.7 0.026 0.12
GO:0007346 BP Regulation of mitotic cell cycle 31 14.3 0.028 0.12
GO:0045017 BP Glycerolipid biosynthetic process 11 5.7 0.030 0.13
GO:0006955 BP Immune response 30 13.8 0.030 0.13
GO:0044262 BP Cellular carbohydrate metabolic process 44 19.6 0.030 0.13
GO:0017076 MF Purine nucleotide binding 164 67.5 0.032 0.14
GO:0016705 MF Oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen 21 10.0 0.032 0.14
GO:0005524 MF ATP binding 163 67.0 0.034 0.14
GO:0030554 MF Adenyl nucleotide binding 163 67.0 0.034 0.14
GO:0032559 MF Adenyl ribonucleotide binding 163 67.0 0.034 0.14
GO:0008237 MF Metallopeptidase activity 12 6.1 0.034 0.14
GO:0007127 BP Meiosis I 18 8.7 0.035 0.14
GO:0019730 BP Antimicrobial humoral response 14 6.9 0.039 0.15
GO:0005815 CC Microtubule organizing center 16 7.8 0.041 0.16
GO:0055114 BP Oxidation–reduction process 167 68.3 0.043 0.17
GO:0019899 MF Enzyme binding 14 6.9 0.044 0.17
GO:0048232 BP Male gamete generation 45 19.8 0.045 0.17
GO:0008033 BP tRNA processing 17 8.2 0.045 0.17
GO:0005887 CC Integral to plasma membrane 29 13.2 0.046 0.17
GO:0044281 BP Small molecule metabolic process 212 85.8 0.046 0.17
GO:0008238 MF Exopeptidase activity 18 8.6 0.046 0.17
GO:0051179 BP Localization 236 95.1 0.047 0.17
GO:0007283 BP Spermatogenesis 44 19.3 0.047 0.18
GO:0050662 MF Coenzyme binding 48 21.0 0.048 0.18
GO:0034470 BP ncRNA processing 27 12.3 0.049 0.18
GO:0048515 BP Spermatid differentiation 24 11.1 0.050 0.18
GO:0045786 BP Negative regulation of cell cycle 11 5.5 0.050 0.18
GO:0045934 BP Negative regulation of nucleobase, nucleoside, nucleotide, and nucleic acid metabolic process 41 18.1 0.052 0.18
GO:0010639 BP Negative regulation of organelle organization 12 5.9 0.055 0.19
GO:0015631 MF Tubulin binding 12 5.9 0.056 0.19
GO:0005549 MF Odorant binding 43 18.8 0.058 0.20

Note.—Depletion results are shown in supplementary table S4, Supplementary Material online.

Table 5.

GO Categories Enriched for Positively Selected Genes, Based on Scores from the Branch-Site Test from Codeml in Bees.

SetID Ontology SetName SetSize Score P-value FDR
GO:0005099 MF Ras GTPase activator activity 11 6.0 1.1e-5 0.03
GO:0005083 MF Small GTPase regulator activity 18 8.1 0.00010 0.041
GO:0004872 MF Receptor activity 16 7.2 0.00020 0.041
GO:0022836 MF Gated channel activity 11 5.4 0.00021 0.041
GO:0006399 BP tRNA metabolic process 26 10.3 0.00040 0.053
GO:0071842 BP Cellular component organization at cellular level 190 55.5 0.0013 0.065
GO:0006418 BP tRNA aminoacylation for protein translation 19 7.6 0.0021 0.084
GO:0009725 BP Response to hormone stimulus 11 4.9 0.0022 0.084
GO:0005635 CC Nuclear envelope 13 5.6 0.0022 0.084
GO:0032507 BP Maintenance of protein location in cell 12 5.3 0.0023 0.084
GO:0051336 BP Regulation of hydrolase activity 20 7.9 0.0026 0.089
GO:0006629 BP Lipid metabolic process 51 17.0 0.0031 0.095
GO:0031072 MF Heat shock protein binding 17 6.7 0.0046 0.11
GO:0008152 BP Metabolic process 335 91.3 0.0070 0.14
GO:0004812 MF Aminoacyl-tRNA ligase activity 19 7.2 0.0075 0.14
GO:0016740 MF Transferase activity 211 59.1 0.0087 0.14
GO:0019899 MF Enzyme binding 19 7.1 0.0097 0.14
GO:0005216 MF Ion channel activity 14 5.5 0.011 0.14
GO:0022838 MF Substrate-specific channel activity 14 5.5 0.011 0.14
GO:0009308 BP Amine metabolic process 47 15.2 0.011 0.14
GO:0004222 MF Metalloendopeptidase activity 15 5.8 0.012 0.14
GO:0005938 CC Cell cortex 15 5.8 0.012 0.14
GO:0008237 MF Metallopeptidase activity 23 8.2 0.012 0.14
GO:0007275 BP Multicellular organismal development 274 74.9 0.013 0.14
GO:0007165 BP Signal transduction 97 28.8 0.013 0.14
GO:0044106 BP Cellular amine metabolic process 40 12.9 0.019 0.19
GO:0044459 CC Plasma membrane part 45 14.3 0.021 0.19
GO:0006520 BP Cellular amino acid metabolic process 32 10.6 0.021 0.19
GO:0032879 BP Regulation of localization 36 11.7 0.022 0.19
GO:0006140 BP Regulation of nucleotide metabolic process 13 5.0 0.022 0.19
GO:0030811 BP Regulation of nucleotide catabolic process 13 5.0 0.022 0.19
GO:0033121 BP Regulation of purine nucleotide catabolic process 13 5.0 0.022 0.19
GO:0033124 BP Regulation of GTP catabolic process 13 5.0 0.022 0.19
GO:0043087 BP Regulation of GTPase activity 13 5.0 0.022 0.19
GO:0006793 BP Phosphorus metabolic process 97 28.3 0.023 0.19
GO:0006796 BP Phosphate metabolic process 97 28.3 0.023 0.19
GO:0042578 MF Phosphoric ester hydrolase activity 42 13.4 0.024 0.19
GO:0016758 MF Transferase activity, transferring hexosyl groups 26 8.8 0.024 0.19

Note.—Depletion results are shown in supplementary table S5, Supplementary Material online.

The second group of genes enriched for positive selection signal in ants, flies, and bees was involved in response to stimuli. There was an enrichment of the GO category “receptor activity” in the three lineages as well as the GO categories “transmembrane receptor activity” and “olfactory receptor activity” in flies. This class of genes plays a pivotal role in the interactions between individuals and their environment. In addition, the GO categories “response to biotic stimulus” and “response to other organism” were enriched in flies, and the GO category “response to hormone stimulus” was enriched in bees. In ants, “response to ecdysone” and “response to steroid hormone stimulus” were marginally significant (FDR = 21%; supplementary table S6, Supplementary Material online).

Some functions were enriched for positively selected genes in only two of the three lineages. These included GO categories related to immunity that were enriched in ants and flies, and some categories related to metabolism which were enriched in flies and bees. Evidence for positive selection on immunity-related functions in ants came from a significant enrichment of the GO categories “immune system development” and “hemopoietic or lymphoid organ development,” the latter referring to the organ that produces, during larval development, the cells mediating the immune response in insects (Corley and Lavine 2006). Seven GO categories related to immunity were also enriched in flies (“defense response,” “immune system process,” “regulation of antimicrobial humoral response,” “regulation of immune effector process,” “regulation of defense response,” “immune response,” and “antimicrobial humoral response”). The absence of significant enrichment for related categories in bees might reflect a lack of power of the gene set enrichment test, because the set of immunity genes is small in the honeybee (Honeybee Genome Sequencing Consortium 2006) and the data set analyzed was further depleted in genes with immunity-related functions (supplementary text and table S7, Supplementary Material online). Consistent with this interpretation, there was a trend in the direction of an enrichment, although nonsignificant, for 5 of the 7 tested GO categories related to immunity in bees (data not shown), suggesting that immunity might be a common target of positive selection in insects.

The second set of GO categories enriched in two of the three insect lineages included various metabolic processes and their regulators, with metabolism of chitin, aminoglycan, carbohydrate, polysaccharide, glucose, hexose, glycerolipid, and phosphatidylinositol being enriched in flies and metabolism of lipid, amino-acid, nucleotide, and phosphorus being enriched in bees. There was no significant enrichment for GO categories related to metabolism in ants, but some categories were close to significance (e.g., “chitin metabolic process” and “rRNA metabolic process,” with FDR = 21% and 24%, respectively; supplementary table S6, Supplementary Material online). Metabolic functions, such as amino-acid, fatty acid, lipid, or RNA metabolism, were significantly enriched in ants when we used KEGG pathways annotation instead of the GO to perform the gene set enrichment test, as well as when the single-copy orthologs data set was reanalyzed with another multiple alignment method and a different quality filtering method (supplementary tables S25 and S26 and supplementary text, Supplementary Material online). It thus seems that metabolism is a common target of positive selection in insects.

Social Behaviors

Two of the GO categories enriched in ants (“neuron recognition” and “adult locomotory behavior”) might be linked to the evolution of neural systems and behavior (table 3). The first category was “neuron recognition.” However, GO categories related to neural systems were also enriched in a nonsocial hymenopteran lineage (“regulation of synaptogenesis,” “mushroom body development,” and “memory” on branch #8, basal to the Hymenoptera; fig. 1 and supplementary table S9, Supplementary Material online), and in the branches leading to primitively social lineages of bees (“synapse,” “synapse organization,” “regulation of synaptic growth at neuromuscular junction”; data not shown) (also reported in Woodard et al. 2011), suggesting that positive selection on neural system genes in ants might not be directly associated with the emergence of social behaviors.

The second GO category enriched in ants was “adult locomotory behavior.” This category was not enriched in any of the other tested lineages. The three genes contributing most to the positive selection signal in this GO category were DCX-EMAP, turtle, and beethoven. Mutational analyses of these genes in Drosophila suggest that they play an important role in sensory perception functions. Adult flies carrying a piggyBac insertion in DCX-EMAP are uncoordinated and deaf and display loss of mechanosensory transduction and amplification (Bechstedt et al. 2010). Turtle plays an essential role in the execution of coordinated motor output in complex behaviors in flies, notably regarding the response to tactile stimulation (Bodily et al. 2001). Finally, beethoven is involved in male courtship behavior, adult walking behavior, and sensory perception of sound in flies (Tauber and Eberl 2001). This suggests that positive selection might have been important for the evolution of sensory perception functions in ants.

A specific analysis of ORs did not provide support for the evolution of sociality being associated with increased levels of positive selection on ORs. A scan for positive selection across branches of a tree gathering 873 manually annotated ORs from two ants (Pogonomyrmex barbatus and Linepithema humile) and the solitary wasp Nasonia vitripennis (see Materials and Methods) revealed that positive selection was pervasive, with 277 branches (23%) displaying significant signals for positive selection (fig. 3 and supplementary fig. S1, Supplementary Material online). However, positive selection was detected in only 19% of the 929 branches leading to ant species, whereas as many as 40% of the 156 branches leading to wasps were under positive selection (Fisher's exact test P = 7.6e-9).
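The comparison between ant and wasp branches can be approximated with a two-sided Fisher's exact test. The sketch below uses only the Python standard library. The branch counts are an assumption: they are back-calculated from the rounded percentages given above (19% of 929 and 40% of 156), so the resulting P-value is only expected to be of the same order as the reported one, not identical to it.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table (with the same
    margins) that is at most as likely as the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def log_p(x):
        return log_comb(row1, x) + log_comb(row2, col1 - x) - log_comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = log_p(a)
    total = sum(
        exp(log_p(x))
        for x in range(lowest, highest + 1)
        if log_p(x) <= observed + 1e-7  # small tolerance for rounding error
    )
    return min(1.0, total)


# Assumed counts, reconstructed from the rounded percentages in the text.
ant_selected, ant_total = 177, 929      # about 19% of ant branches
wasp_selected, wasp_total = 62, 156     # about 40% of wasp branches

p_value = fisher_exact_two_sided(
    ant_selected, ant_total - ant_selected,
    wasp_selected, wasp_total - wasp_selected,
)
print(f"Two-sided Fisher's exact test P = {p_value:.2e}")
```

With the exact per-branch test results from supplementary fig. S1, the same function would give the published value.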

Fig. 3.

Positive selection in the olfactory receptor gene family. Phylogenetic tree of manually annotated protein-coding sequences of olfactory receptor genes, including 291 genes from Pogonomyrmex barbatus in blue, 320 genes from Linepithema humile in green, and 262 genes from Nasonia vitripennis in black. The topology of the tree depicts the assemblage of 16 subtrees where positive selection was tested using the branch-site test of Codeml (Materials and Methods). Tested branches are depicted in gray if there was no evidence for positive selection and in red if there was significant evidence for positive selection at 10% FDR. Untested branches are depicted in black. Scale bar indicates the number of amino acid substitutions per site.

Phenotypic Plasticity among Castes

None of the GO categories enriched for positively selected genes in ants could be linked to phenotypic plasticity (i.e., caste differences). In particular, there was no evidence of a significant enrichment for GO categories related to morphology or morphogenesis in the ant lineage. Another enrichment test using annotations obtained from mutant phenotypes in D. melanogaster (which are more relevant than GO annotations to analyze genes involved in morphogenesis since gene sets mostly refer to anatomical structures) also provided no clear support for positive selection on genes associated with phenotypic plasticity in ants (supplementary table S12, Supplementary Material online).

However, among the genes with the highest support for positive selection in ants (table 1), two genes had a role in wing development (Guanine nucleotide exchange factor in mesoderm and Methylthioribose-1-phosphate isomerase) and one in larval development (Dis3), suggesting that even if positive selection did not act consistently on large sets of genes related to morphogenesis, it could have acted specifically on a few individual genes.

Mitochondrial Genes

Eleven GO categories enriched for positively selected genes in ants were related to mitochondrial activity (e.g., “mitochondrial electron transport,” “mitochondrial matrix,” “mitochondrial respiratory chain,” “NADH dehydrogenase ubiquinone activity,” and “oxidoreductase activity”; table 3). The mitochondrial processes under positive selection were not restricted to respiration and energy production, but also included translation (“organellar ribosome,” “mitochondrial small/large ribosomal subunit”). GO categories related to mitochondria were also enriched for positively selected genes on many individual branches of the ant lineage analyzed separately (supplementary table S9, Supplementary Material online), and in a larger data set including duplicated genes analyzed with the site test (see Materials and Methods; table 1, supplementary tables S13 and S19, Supplementary Material online). This indicates that recurrent events of positive selection occurred on genes with mitochondrial functions during the evolution of the ant lineage. In contrast, mitochondria-related GO categories did not display any enrichment for positively selected genes in flies and bees (tables 4 and 5), despite a high power to detect it on the respective data sets (supplementary text, Supplementary Material online). Similarly, no mitochondrial function was significantly enriched in the branches #7 and #8, basal to the ant lineage (fig. 1 and supplementary table S9, Supplementary Material online), reinforcing the idea that increased positive selection on mitochondria is restricted to the ant lineage.

Of note, none of the 13 protein coding genes (Oliveira et al. 2008; Gotzek et al. 2010) from the mitochondrial genome was included in our main data set because the mitochondrial genomes of most of the ant species analyzed were not annotated. Our results thus reflect positive selection on nuclear genes encoding proteins that function in the mitochondrion. We annotated mitochondrial genomes in 5 of the 7 ant species analyzed and tested whether positive selection could also be detected on the mitochondrial genomes themselves (Materials and Methods) (Gerber et al. 2001; Bazin et al. 2006; Meiklejohn et al. 2007). However, we did not find evidence for positive selection on these alignments, either with the branch-site test (supplementary table S15, Supplementary Material online) or with the site test (supplementary table S14, Supplementary Material online).

Lifespan Genes

There was a significant enrichment for positively selected genes in the ant orthologs of D. melanogaster genes that were downregulated in 61-day-old flies compared with 10-day-old flies, based on a published microarray analysis (P = 0.011; below Bonferroni threshold α = 0.05/4 = 0.0125; supplementary table S16, Supplementary Material online) (Landis et al. 2004).

Two other genes known to be involved in aging were among the top-scoring genes for positive selection in our data set. The first was Tequila, which has been shown to be associated with aging in an experimental evolution study in D. melanogaster (Remolina et al. 2012). The other was mitochondrial trifunctional protein α subunit, whose knock-out also reduces lifespan in D. melanogaster (Kishita et al. 2012). Although not in the list of top hits, Sod2 (superoxide dismutase [Mn], mitochondrial), a gene known to have antioxidant activity and whose overexpression has been shown to be associated with increased lifespan in some strains of D. melanogaster (Mockett et al. 1999; Curtis et al. 2007), underwent positive selection at the base of the Hymenoptera lineage (FDR = 0.0073) and in the Acromyrmex echinatior branch (FDR = 9.6e-8).

Selective Pressure on Genes with Caste-Biased Expression

There was a marginally significant enrichment for positively selected genes among genes with biased expression in adult workers in S. invicta (effect size = 1.2, P = 0.025; not significant after Bonferroni correction α = 0.05/6 = 0.0083; supplementary table S17, Supplementary Material online) and a stronger enrichment for genes with queen-biased expression in adults (effect size = 1.8, P = 0.0028). Surprisingly, however, there was a pattern of weaker enrichment for genes with male-biased expression in adults (effect size = 1.04, P = 0.2381). At the pupal stage, we did not detect a significant enrichment for positively selected genes among any group of genes showing caste-biased expression. But similarly to the adult stage, the enrichment effect size was higher for genes with queen-biased expression (effect size = 1.2) than for genes with worker-biased expression (effect size = 1.1), and it was the lowest for genes showing male-biased expression (effect size = 1.06).
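The Bonferroni thresholds quoted for the lifespan and caste-biased expression tests follow from dividing the nominal α by the number of tests in each family. A minimal sketch of that arithmetic, using only the P-values and test counts stated in the text (the function names are illustrative):

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def is_significant(p_value, alpha, n_tests):
    """True if p_value is below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# Caste-biased expression in adults: family of 6 tests (threshold ~0.0083).
adult_caste_p_values = {
    "worker-biased": 0.025,
    "queen-biased": 0.0028,
    "male-biased": 0.2381,
}
for group, p in adult_caste_p_values.items():
    verdict = "significant" if is_significant(p, 0.05, 6) else "not significant"
    print(f"{group}: P = {p} -> {verdict}")

# Genes downregulated in old flies: family of 4 tests (threshold 0.0125).
print("lifespan set:", is_significant(0.011, 0.05, 4))
```

Under these thresholds only the queen-biased adult set and the lifespan set pass correction, which matches the statements above.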

Discussion

In this article, we report results from a genome-wide scan for positive selection in protein-coding sequences of seven ant genomes, using the rigorous branch-site model of Codeml (Zhang et al. 2005) with stringent data quality control. Positive selection was detected in the ant lineage for 20% of the gene families analyzed. This proportion is similar in magnitude to the values observed in the other two insect lineages that we reanalyzed in this study: 20% in the 10 bee species and 36% in the 12 fly species.

Our analysis identified similarities in patterns of positive selection between the ants and other insect lineages. Notably, at the broadest phylogenetic scale that our data sets allowed us to study, functional categories related to proteolysis, metabolism, response to stimuli, and immunity, were enriched for positively selected genes in ants, bees, and flies. Interestingly, studies in mammals, fishes, and urchins also provided evidence for positive selection on similar functional categories (Kosiol et al. 2008; Studer et al. 2008; Oliver et al. 2010; Montoya-Burgos 2011). Recurrent positive selection on such long evolutionary time scales is typical of genes involved in the interaction with changing environments or in conflict and competition, such as evolutionary arms races between sexes or between different species, which cause the perpetuation of adaptations and counter-adaptations in competing sets of coevolving genes (Dawkins and Krebs 1979). It is notable that positive selection patterns on these categories of genes do not seem to reflect or be strongly affected by the large life-history differences between lineages analyzed here, for example the evolution of eusociality in the hymenopteran lineages. In particular, our results on immunity-related genes challenge the hypothesis that hygienic behaviors in social insects could have relaxed the selective pressure on immune genes, since this should be reflected in reduced levels of positive selection on these genes (Honeybee Genome Sequencing Consortium 2006; Smith et al. 2008; Viljakainen et al. 2009; Werren et al. 2010; Fischman et al. 2011; Smith, Zimin, et al. 2011; Smith, Smith, et al. 2011; Suen et al. 2011; Harpur and Zayed 2013).

Our analysis indicated that genes involved in neurogenesis were under positive selection in ants and the primitively social lineages of bees. It was previously hypothesized that stronger selection on genes related to brain function and development should be observed in eusocial Hymenoptera species due to high cognitive demands associated with social life (Fischman et al. 2011). However, our results are not consistent with this prediction because we also uncovered signs of positive selection at the base of the Hymenoptera lineage, i.e., before the evolution of sociality. Interestingly, a similar pattern had previously been reported with brain morphological data. A comparative analysis of insects showed that the size of the mushroom bodies started to increase at the base of the Euhymenopteran (Orussioidea + Apocrita) lineage, approximately 90 My before the evolution of sociality in the Aculeata, and that there was no clear correlation between the size of brain components and the levels of sociality or cognitive capabilities (Farris and Schulmeister 2011; Lihoreau et al. 2012). To account for this observation, Fischman et al. (2011) tried to identify factors, other than sociality, that may have placed unique selective pressure on brain evolution in species of the Hymenoptera lineage. Based on the observation that there was less positive selection on neurogenesis genes in highly social bees than in primitively social bees, they proposed that cognitive challenges might be associated with the mode of colony founding in social Hymenoptera. In particular, primitively social bees, which transition from a solitary phase during the process of colony founding to a social phase, could experience higher cognitive needs than highly social bees, which never go through a solitary phase. However, our results are also inconsistent with this model since increased positive selection was observed before the evolution of sociality in Hymenoptera. A comprehensive survey of positive selection on neurogenesis genes in Hymenoptera species, including species basal to the lineage, is required to identify precisely when the selective regime of these genes started to change, and in which hymenopteran sublineages it was maintained.

Our results also challenge the hypothesis that genes involved in chemical signaling experienced increased positive selection in social insects (Ingram et al. 2005; Robertson and Wanner 2006; Bonasio et al. 2010; Smith, Zimin, et al. 2011; Wurm et al. 2011; Zhou et al. 2012; Leboeuf et al. 2013). The analysis of olfactory receptor repertoires in two ants and a nonsocial wasp indicates that positive selection on amino-acid substitutions was surprisingly less frequent in ant than in wasp branches. Given the limited number of species used in this analysis, future work should concentrate on generating extensive annotation of olfactory receptors from more Hymenoptera as well as outgroup species to identify characters or traits that could be associated with the pattern of positive selection on olfactory receptors.

Although our analyses did not provide support for previous hypotheses about the expected effect of sociality on gene evolution, we identified several interesting functional categories which were enriched for positively selected genes exclusively in the ant lineage, possibly reflecting ant-specific adaptations. The most consistent and robust result was that genes functioning in the mitochondria were particularly likely to be under positive selection. Mitochondrial activity plays an important role in the process of reproductive isolation and speciation (Lee et al. 2008; Burton and Barreto 2012), interactions with endosymbionts such as Wolbachia (Werren 1997), diseases (Cortopassi 2002; Richly et al. 2003; Trifunovic et al. 2004, 2005), and aging (Lenaz 1998; Cortopassi 2002; Kowald and Kirkwood 2011). In that respect it is notable that the evolution of sociality has been accompanied by a nearly 100-fold increase in lifespan of queens compared with their solitary ancestors (Keller and Genoud 1997; Jemielity et al. 2005). Three lines of evidence suggest that increased lifespan of queens might be related to increased positive selection on mitochondrial genes in the ant lineage.

First, lifespan extension, not only in insects but also in other lineages such as birds and bats, appears to be associated with decreased production of reactive oxygen species (ROS) (Perez-Campo et al. 1998; Brunet-Rossinni 2004; Parker et al. 2004; Corona et al. 2005; Jemielity et al. 2005). ROS are a normal by-product of cellular metabolism. In particular, one major contributor to oxidative damage is hydrogen peroxide (H2O2), which is produced from leaks of the respiratory chain in the mitochondria (Harman 1972; Lenaz 1998; Finkel and Holbrook 2000; Cui et al. 2012). Positive selection in ants on genes functioning in the mitochondria may thus reflect selection to increase mitochondrial efficiency and reduce ROS production. Interestingly, positive selection on genes with mitochondrial functions was previously documented in the bat lineage (Shen et al. 2010; Zhang et al. 2013), which includes species with exceptional longevity (Brunet-Rossinni and Austad 2004). In the bat Myotis lucifugus, ROS production was also shown to be significantly lower than in two similar-sized mammal species (a mouse and a shrew) although the metabolic rates, and thus mitochondrial activity, of the former were much higher because of flight demands (Brunet-Rossinni 2004).

Second, on the basis of gene expression data obtained in the fire ant S. invicta, our analyses revealed that positive selection was strongest on genes with queen-biased expression, intermediate on genes with worker-biased expression, and weakest on genes with male-biased expression. This association between levels of positive selection and caste-biased differences in gene expression cannot simply be accounted for by differences in expression levels of mitochondrial genes (which are enriched for positively selected genes in ants) since in S. invicta mitochondrial genes are significantly less expressed in queens than in workers at the larval stage, and not differentially expressed at the adult stage (supplementary fig. S2, Supplementary Material online). The finding of higher levels of positive selection for genes more highly expressed in the castes with the longer lifespan (queens can live decades in some species, whereas workers have lifespans on the order of months, and males on the order of days) suggests that increased positive selection on queen-specific genes could be related to longer lifespan.

Third, our analyses showed that the levels of positive selection were higher on orthologs of genes which are downregulated during aging in flies. These genes include numerous energy metabolism genes, and their downregulation in old flies is thought to reflect a decline of normal and functional mitochondria with age (Yui et al. 2003; Landis et al. 2004). The finding of increased levels of positive selection on genes whose expression declines at older ages suggests that the function of these genes might be improved in ants, potentially delaying the loss of normal activity in mitochondria with age. It would be interesting to test if parallel mechanisms also evolved in the ant lineage to maintain the expression of these genes and delay the decline of mitochondrial activity through lifespan in queens.

In contrast to ants, there was no evidence of elevated levels of positive selection on mitochondrial functions in bees. Like most social species, bees also evolved longer queen lifespans (more than 2 years) compared with males and workers (a few weeks) (Keller and Genoud 1997; Munch et al. 2008). There are four possible explanations for the difference between ants and bees in the level of positive selection on mitochondrial genes. First, lifespan differences between castes are less pronounced in bees, where queens live up to 2–5 years, than in ants, where queens can live up to 30 years, possibly resulting in lower selective pressure to increase lifespan in bees than in ants. Second, because eusociality evolved independently in ants and bees it is possible that extended queen lifespans evolved by different molecular mechanisms (Jemielity et al. 2005; Jobson et al. 2010). For example, vitellogenin may play a more central role in aging in bees than in ants (Amdam and Omholt 2002; Corona et al. 2007; Munch et al. 2008). Third, the evolution of mitochondria-related genes may have been differently constrained in ants and bees. For example, metabolic rates differ greatly between flying bee workers and non-flying ant workers because flight is an energetically costly behavior requiring highly elevated metabolic rates (Jensen and Holm-Jensen 1980; Suarez 2000; Niven and Scharlemann 2005). Because metabolism and mitochondrial activity are closely connected, lower metabolic rates in ants might have alleviated functional constraints on mitochondria-related genes, allowing selection to act on lifespan extension. Fourth, the GC content in bee genomes was shown to be lower than in ant genomes (Honeybee Genome Sequencing Consortium 2006; Simola et al. 2013). Some parts of the bee genomes, in particular their mitochondrial genomes (Crozier and Crozier 1993; Gotzek et al. 2010; Tan et al. 2011), display extreme bias in nucleotide composition, which has a significant effect on both the codon usage patterns and amino-acid composition of proteins and may have interfered with the action of positive selection.

If positive selection acted to optimize the functioning of mitochondria in ants, it could be expected that the mitochondrial genome itself should be targeted by positive selection. However, mitochondrial genes generally exhibit very low dN/dS ratios (Montooth and Rand 2008) and there was no clear evidence in our results for positive selection on the 13 genes of the mitochondrial genome itself. This suggests that innovations related to mitochondrial activity could arise more easily on nuclear genes, whereas mitochondrial genes seem more likely to maintain conserved core functionalities.

In conclusion, this study provides a detailed analysis of the extent of positive selection events on protein-coding genes in seven ant species. Because false positives are a major concern for whole-genome scans for positive selection, we used a conservative methodology. We also reanalyzed data in bees and flies with the same methods to permit an unbiased and robust comparison of positive selection between lineages. The comparison between these three lineages provided interesting perspectives on the evolution of genes implicated in immunity, neurogenesis, and olfaction, and allowed us to pinpoint positive selection events that were specific to the ant lineage. In particular, we found that the evolution of extreme lifespan in ants was associated with positive selection on genes with mitochondrial functions, suggesting that a more efficient functioning of mitochondrial genes might have been an important step toward the extreme lifespan extension that characterizes this lineage. It would be interesting to complement this study by scans for genes under lineage-specific strong or relaxed purifying selection, to get a more global picture of natural selection patterns in ant genomes, and uncover additional genes that could have played a significant role during the evolution of the ant lineage.

Materials and Methods

Single-Copy Orthologs Gene Families Data Set

Protein-coding gene sequences of the seven ant genomes were downloaded from the Hymenoptera Genome Database (http://hymenopteragenome.org/ant_genomes/, last accessed April 24, 2014) (Munoz-Torres et al. 2011).

The complete annotated gene sets were OGS_1.0 for Acromyrmex echinatior (Nygaard et al. 2011), OGS_1.2 for Atta cephalotes (Suen et al. 2011), OGS_2.2.3 for Solenopsis invicta (Wurm et al. 2011), OGS_1.2 for Pogonomyrmex barbatus (Smith, Smith, et al. 2011), OGS_3.3 for Camponotus floridanus (Bonasio et al. 2010), OGS_1.2 for Linepithema humile (Smith, Zimin, et al. 2011), and OGS_3.3 for Harpegnathos saltator (Bonasio et al. 2010). Coding sequences of five outgroup species were downloaded from the Hymenoptera Genome Database for the honey bee (Apis mellifera Amel_pre_release2) (Honeybee Genome Sequencing Consortium 2006) and the jewel wasp (Nasonia vitripennis OGS_v1.2) (Werren et al. 2010), from Flybase (Tweedie et al. 2009) for the fruit fly (Drosophila melanogaster FB5.29) (Adams et al. 2000), from BeetleBase (Kim, Murphy, et al. 2010) for the flour beetle (Tribolium castaneum Tcas_3.0) (Tribolium Genome Sequencing Consortium 2008), and from VectorBase (Lawson et al. 2009) for the body louse (Pediculus humanus PhumU1.2) (Kirkness et al. 2010).

Gene families were obtained from a custom run of the OrthoDB pipeline for the Ant Genomic Consortium (http://cegg.unige.ch/orthodbants and http://bioinfo.unil.ch/supdata/Roux_positive_selection_ants/orthoDB_run.zip, last accessed April 24, 2014; pipeline of OrthoDB release 4) (Waterhouse, Zdobnov, Tegenfeldt, et al. 2011; Simola et al. 2013). Briefly, OrthoDB implements a Best Reciprocal Hit clustering algorithm based on all-against-all Smith–Waterman protein sequence comparisons. The longest alternatively spliced form of genes is used. The orthologous groups are built at different taxonomic levels and it is possible to query for specific phyletic profiles by combining the criteria of absent, present, single-copy, multicopy, or no restriction, for each species within the studied clade.
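The core idea of best reciprocal hit matching can be illustrated for two species as follows. This is a deliberately simplified sketch and not the OrthoDB implementation, which clusters hits across many species at several taxonomic levels. The gene names and alignment scores are hypothetical.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring target gene.

    scores: dict mapping (query_gene, target_gene) -> alignment score.
    Ties are resolved in favor of the first pair encountered.
    """
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted(
        (a, b) for a, b in best_ab.items() if best_ba.get(b) == a
    )


# Hypothetical Smith-Waterman scores between an ant and a fly gene set.
ant_vs_fly = {
    ("ant1", "fly1"): 250, ("ant1", "fly2"): 90,
    ("ant2", "fly2"): 180, ("ant3", "fly2"): 120,
}
fly_vs_ant = {
    ("fly1", "ant1"): 250, ("fly1", "ant2"): 60,
    ("fly2", "ant2"): 180, ("fly2", "ant3"): 120,
}

pairs = reciprocal_best_hits(ant_vs_fly, fly_vs_ant)
print(pairs)
```

In this toy example ant3 has fly2 as its best hit, but fly2 prefers ant2, so ant3 is left without a reciprocal partner.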

Gene families including strictly one ortholog in each of the 12 species were selected (2,756 gene families). Because annotations of the studied genomes are likely to be incomplete (Simola et al. 2013), families with a few missing genes (gene losses or unannotated genes) were also included, with the restriction that at least four genes out of the seven ant species and at least three genes out of the five outgroup species should be present in the gene family. Simola et al. (2013) have shown that among the seven ant species, there were generally few lost or missing genes, apart from S. invicta (fewer than 400 S. invicta genes were missing in single-copy orthologs families) and Ac. echinatior (<150 Ac. echinatior genes were missing in single-copy orthologs families). Our gene family selection criteria can accommodate this moderate number of missing genes in families. In order to transfer functional annotations from D. melanogaster, only families including a fruit fly ortholog were retained. With these criteria, the number of OrthoDB groups in the data set increased to 4,337. All gene families were assumed to follow the species tree topology (fig. 1). The exclusion of families that experienced gene duplication facilitates the comparison of branches between gene families, and protects our analysis from biases related to differential duplication among lineages (Waterhouse, Zdobnov, Kriventseva 2011) and among genes (Davis and Petrov 2004; He and Zhang 2006), and to the consequences of duplication (Force et al. 1999; Brunet et al. 2006). Finally, results on single-copy orthologs can be easily compared with previously published studies using similar gene family topologies (Drosophila 12 Genomes Consortium 2007; Kosiol et al. 2008; Studer et al. 2008; Lindblad-Toh et al. 2011).
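The selection criteria above can be sketched as a small predicate. This is only an illustration, not the OrthoDB query actually used: the function name, the input layout, and most species codes are hypothetical (only Sinv, Pbar, and Hsal appear as branch labels in this article).

```python
# Hypothetical species codes; the thresholds are those stated in the text.
ANTS = {"Aech", "Acep", "Sinv", "Pbar", "Cflo", "Lhum", "Hsal"}
OUTGROUPS = {"Amel", "Nvit", "Dmel", "Tcas", "Phum"}

def keep_family(genes_by_species, min_ants=4, min_outgroups=3):
    """Return True if a family passes the single-copy selection criteria.

    genes_by_species maps a species code to the list of genes that the
    family contains for that species (missing species are simply absent).
    """
    # Families with a duplication in any species are excluded.
    if any(len(genes) > 1 for genes in genes_by_species.values()):
        return False
    present = {sp for sp, genes in genes_by_species.items() if len(genes) == 1}
    # A fruit fly ortholog is required to transfer functional annotations.
    if "Dmel" not in present:
        return False
    return (len(present & ANTS) >= min_ants
            and len(present & OUTGROUPS) >= min_outgroups)
```

A family with one gene in each of the 12 species passes, as does a family missing up to three ant genes and up to two outgroup genes, provided the D. melanogaster gene is present.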

Basic sequence quality features were first controlled as in Hambuch and Parsch (2005). CDS (coding sequences) whose length was not a multiple of 3 or did not correspond to the length of the predicted protein, or that contained an internal stop codon, were eliminated; the longest CDS of genes showing multiple isoforms was retained; CDS shorter than 100 nt were eliminated.
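As an illustration of these checks, a minimal sketch follows. It assumes the standard nuclear genetic code and assumes that a single terminal stop codon is allowed and not counted toward the protein length; the source does not specify either detail, and the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard nuclear code

def cds_passes_qc(cds, protein, min_len=100):
    """Apply the basic CDS quality checks described in the text."""
    cds = cds.upper()
    # CDS shorter than 100 nt or not a multiple of 3 are eliminated.
    if len(cds) < min_len or len(cds) % 3 != 0:
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    # Assumption: a terminal stop codon is tolerated and ignored.
    if codons[-1] in STOP_CODONS:
        codons = codons[:-1]
    # Any remaining stop codon is internal.
    if any(codon in STOP_CODONS for codon in codons):
        return False
    # The CDS must correspond to the length of the predicted protein.
    return len(codons) == len(protein.rstrip("*"))
```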

Because misalignment errors can be an important source of false positives in genome-wide scans for positive selection in coding sequences (Schneider et al. 2009; Markova-Raina and Petrov 2011; Yang and dos Reis 2011; Jordan and Goldman 2012), we took great care in filtering the potentially problematic sites in the alignments. The quality filtering pipeline used here is adapted from the pipeline of the Selectome database release 4 (http://selectome.unil.ch, last accessed April 24, 2014) (Proux et al. 2009; Moretti et al. 2014). The multiple alignment of the protein sequences in each gene family was computed by M-Coffee (Wallace et al. 2006) from the T-Coffee package v8.93 (Notredame et al. 2000), which combines the output of different aligners. Similarly to Ensembl Compara (see http://www.ensembl.org/info/docs/compara/homology_method.html [last accessed April 24, 2014] for more details) (Vilella et al. 2009), four different aligners were used for M-Coffee (mafftgins_msa, muscle_msa, kalign_msa, and t_coffee_msa). M-Coffee outputs a consensus of the four alignments from the different aligners, and a quality score for each residue based on the concordance of the alignment at each position by the different aligners. Scores lie between 0, if a residue was not aligned at the same position by the different aligners, and 9, if it is reliably aligned at the same position in all cases. Reliably aligned residues with a score of 7 or above were retained. We used the heuristic algorithm of MaxAlign v1.1 (Gouveia-Oliveira et al. 2007) to detect and remove sequences badly aligned as a whole (gap-rich sequences) in the multiple sequence alignments. When a sequence was removed, the gene family was realigned and refiltered using M-Coffee. Families left with fewer than four sequences were discarded because of insufficient power to detect positive selection.
The protein alignments were reverse-translated to nucleotide alignments using the seq_reformat utility of the T-Coffee package (Notredame et al. 2000).
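Reverse translation threads each unaligned CDS through its aligned protein sequence, so that every residue becomes its original codon and every gap becomes three gap characters. The sketch below illustrates the principle only; it is not the seq_reformat implementation, and the function name is hypothetical.

```python
def back_translate(aligned_protein, cds, gap="-"):
    """Thread an unaligned CDS through its aligned protein sequence."""
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if n_residues > len(codons):
        raise ValueError("CDS is too short for the aligned protein")
    codon_iter = iter(codons)
    aligned_codons = []
    for aa in aligned_protein:
        # A gap column in the protein becomes a three-nucleotide gap.
        aligned_codons.append(gap * 3 if aa == gap else next(codon_iter))
    return "".join(aligned_codons)
```

For example, `back_translate("M-A", "ATGGCT")` gives `"ATG---GCT"`.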

We used a stringent Gblocks filtering (v0.91b; type = codons; minimum length of a block = 4; no gaps allowed) (Talavera and Castresana 2007) to remove gap-rich regions from the alignments, as these are problematic for positive selection inference (Fletcher and Yang 2010; Markova-Raina and Petrov 2011). The large memory requirements of M-Coffee for long alignments led us to use only Gblocks without M-Coffee scoring if the length of the alignment was greater than 9,000 nt.

After filtering, our data set included 4,261 gene families with an average of 10.4 branches per family to test (fig. 1; 44,306 branches to test; median = 10 branches per family). The mean length of filtered alignment was 1,133 nt (median = 885 nt), ranging from a minimum of 54 nt to a maximum of 22,248 nt. Of note, lost or missing genes in families affect the topology of the trees and the possibility to compare equivalent branches of different families. In total, our data set contains 36,681 branches (83%) in 4,256 families which corresponded to the canonical topology defined by the species tree (fig. 1) and could be compared across families (e.g., table 2).

Our analyses are likely to underestimate the genome-wide number of positive selection events because 1) single-copy orthologs tend to evolve under stronger purifying selection than multicopy gene families (Waterhouse, Zdobnov, Kriventseva 2011), 2) the ant genomes still lack good annotation of gene models and single-copy orthologs gene families could be missed, and 3) we filtered out unreliable parts of sequence alignments including fast evolving residues that are difficult to align (Fletcher and Yang 2010; Privman et al. 2012). The last point is balanced by the fact that conserved regions might be more prone to positively selected substitutions (Bazykin and Kondrashov 2012) and that the removal of unreliable regions seems to increase the power to detect positive selection (Jordan and Goldman 2012; Privman et al. 2012).

Extensive Gene Families Data Set

Another data set gathered all gene families from the OrthoDB database that could pass our quality filters, and notably families that experienced duplications. The CDS were filtered as described earlier. Amino-acid sequences were aligned using PAGAN version 0.47 (Löytynoja et al. 2012). The program GUIDANCE (v1.1) was used to assess alignment confidence and mask unreliably aligned residues (Penn et al. 2010; Privman et al. 2012). The combination of a phylogeny-aware aligner (PAGAN replaces PRANK [Löytynoja and Goldman 2008] and is based on the same principle) and of this filtering algorithm was shown to perform the best in recent benchmark studies on simulated data (Jordan and Goldman 2012; Privman et al. 2012). Gene family phylogenies were built using RAxML (v7.2.9) (Stamatakis 2006) from the amino-acid sequences, with the LG matrix and the CAT model. Amino-acid alignments were reverse-translated into the corresponding codon alignments. This resulted in 6,186 families tested, with an average of 11 genes, and an average length of filtered alignment of 3,129 nt (median of 2,385 nt, ranging from a minimum of 192 nt to a maximum of 20,556 nt).

Mitochondrial Gene Families Data Set

Contigs corresponding to mitochondrial genomes could be downloaded for five of the seven ant genomes (Ac. echinatior, At. cephalotes, S. invicta, P. barbatus, and L. humile). They were submitted to MITOS, a web server for the annotation of metazoan mitochondrial genomes (http://mitos.bioinf.uni-leipzig.de/index.py, last accessed April 24, 2014) (Bernt et al. 2012). This gave us the predicted coordinates of 13 mitochondrial protein-coding genes in these species. Frameshift errors or incomplete gene predictions were manually corrected. Mitochondrial genes from the outgroup species Ap. mellifera, N. vitripennis, and T. castaneum were downloaded from GenBank (accession numbers: L06178; EU746609.1, and EU746613.1; AJ312413.2 and NC_003081.2, respectively). Mitochondrial genes from D. melanogaster were downloaded from Flybase at ftp://ftp.flybase.net/genomes/Drosophila_melanogaster/dmel_r5.43_FB2012_01/fasta/dmel-dmel_mitochondrion_genome-CDS-r5.43.fasta.gz (last accessed April 24, 2014). The alignment and filtering steps for the 13 mitochondrial gene families were identical to the data set of single-copy orthologs nuclear gene families (see above). A total of 119 branches were tested in this data set (average of 9.2 and median of 9 branches per family; average length of filtered alignment of 641 nt and median of 621 nt, ranging from a minimum of 39 nt to a maximum of 1,413 nt).

Twelve Drosophila Data Set

Single-copy ortholog gene families from the twelve sequenced Drosophila species were downloaded from ftp://ftp.flybase.net/12_species_analysis/clark_eisen/alignments/ (last accessed April 24, 2014) (files “all_species.guide_tree.longest.cds.tar.gz” and “all_species.guide_tree.longest.translation.tar.gz”) (Drosophila 12 Genomes Consortium 2007). The alignment and filtering steps for these gene families were identical to the data set of single-copy ortholog gene families used for the ant analysis. Out of 6,698 initially downloaded Drosophila gene families, 3,749 (56%) passed our filters and could be tested for positive selection, resulting in 77,495 branches tested (average of 20.7 and median of 21 branches per family; average length of filtered alignment of 876 nt and median of 708 nt, ranging from a minimum of 15 nt to a maximum of 14,535 nt).

Bee Data Set

Single-copy ortholog gene families from 10 bee species were downloaded from http://insectsociogenomics.illinois.edu/ (last accessed April 24, 2014). This set of gene families is incomplete as it is derived from the sequencing of expressed sequence tags (using 454 Life Science/Roche GS-FLX platform) from nine bee species (Woodard et al. 2011), and from gene models of the honey bee Ap. mellifera (Honeybee Genome Sequencing Consortium 2006). The alignment and filtering steps for these gene families were identical to the data set of single-copy ortholog gene families used for the ant analysis. Out of 3,647 initially downloaded gene families, 2,256 (62%) passed our filters and could be tested for positive selection, resulting in 20,169 branches tested (average of 8.9 and median of 9 branches per family; average length of filtered alignment of 611 nt and median of 528 nt, ranging from a minimum of 27 nt to a maximum of 3,945 nt).

Branch-Site Test for Positive Selection

We used the updated branch-site test (Zhang et al. 2005) of Codeml from the package PAML v4.4c (Yang 2007) to detect Darwinian positive selection experienced by a gene family in a subset of sites in a specific branch of its phylogenetic tree. This test was previously used in genome-wide scans for positive selection in various lineages (Bakewell et al. 2007; Kosiol et al. 2008; Studer et al. 2008; Vamathevan et al. 2008; Oliver et al. 2010; George et al. 2011) and is used by the Selectome project (http://selectome.unil.ch, last accessed April 24, 2014) (Proux et al. 2009; Moretti et al. 2014). It is acknowledged to be more sensitive for the detection of positive selection than branch tests (Yang 1998) or site tests (Yang et al. 2000), because it does not average the signal over all codons in the alignment (branch test) nor over all branches of the phylogeny (site test) (Yang and dos Reis 2011). It is also robust to relaxation of purifying selection (ω close to 1) since this scenario is accounted for in the null model (Zhang 2004; Zhang et al. 2005). The alternative model is contrasted to the null model using a likelihood-ratio test (LRT), where log-likelihood ratios are compared to a chi-square distribution with 1 degree of freedom (Zhang et al. 2005). Previous studies have reported the branch-site test to be conservative in this setup (Bakewell et al. 2007; Studer et al. 2008; Fletcher and Yang 2010; Yang and dos Reis 2011; Gharib and Robinson-Rechavi 2013). We did not use the ω estimates to infer the strength of positive selection because they were shown to be unreliable (Bakewell et al. 2007; Yang and dos Reis 2011).

In the absence of a specific a priori hypothesis regarding which branches to test for positive selection, our implementation runs the test multiple times on each gene family, successively changing the branch selected as foreground. The branches considered as foreground are highlighted in red in figure 1. This approach was shown to be legitimate if P-values from the successive tests are corrected for multiple testing (Anisimova and Yang 2007; Yang and dos Reis 2011). We applied a FDR correction (Benjamini and Hochberg 1995) over all the P-values treated as one series (number of branches tested × number of gene families tested). In the ant single-copy orthologs nuclear data set, we analyzed a maximum of 15 branches leading to the 7 ant species, summing to 44,306 tests performed. In the ant mitochondrial data set, we analyzed a maximum of 11 branches leading to 5 ant species, summing to 119 tests (branches in red in supplementary fig. S3, Supplementary Material online). In the Drosophila single-copy orthologs data set, we analyzed a maximum of 21 branches, leading to a total of 77,495 tests (supplementary fig. S4, Supplementary Material online). Finally in the bee data set, we analyzed a maximum of 17 branches, leading to a total of 20,169 tests (supplementary fig. S5, Supplementary Material online).
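The conversion of likelihood-ratio statistics to P-values and their correction as one series can be sketched as follows. This is a generic illustration using only the standard library, not the scripts used in the study. It assumes the test statistic is twice the log-likelihood difference, and the function names are hypothetical.

```python
import math

def lrt_pvalue(lnl_null, lnl_alt):
    """P-value of an LRT against a chi-square distribution with 1 df."""
    # Assumption: statistic = 2 * (lnL_alt - lnL_null), floored at 0.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In this setting, `pvalues` would hold one entry per tested branch of every gene family, so that the correction is applied over all tests at once.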

All computations were performed using Slimcodeml (release May 4, 2011) (Schabauer et al. 2012), an optimized version of Codeml, based on the release 4.4c of the PAML package (downloadable at http://selectome.unil.ch/cgi-bin/download.cgi, last accessed April 24, 2014). Slimcodeml was estimated to run the branch-site models about 1.77 times faster than the original Codeml thanks to the use of external standard libraries for linear algebra calculations and specific optimizations for the computer architecture used. We verified on a subset of the gene families that the results given by Slimcodeml were identical with the original Codeml. Examples of Slimcodeml/Codeml control files used are provided in supplementary text, Supplementary Material online. For the ant mitochondrial data set, Codeml was used with the option “icode = 4” to use the Invertebrate mitochondrial genetic code (http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi#SG5, last accessed April 24, 2014).
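For orientation, the sketch below generates the handful of Codeml control-file settings that distinguish the null and alternative branch-site models. It is not a copy of the control files used in the study, which are provided in the supplementary text: only the options shown are set, all other options are left out, and the function name and file names are hypothetical.

```python
def branch_site_ctl(seqfile, treefile, outfile, null=False, mitochondrial=False):
    """Return a minimal Codeml control file for the branch-site test."""
    settings = {
        "seqfile": seqfile,
        "treefile": treefile,   # tree with the foreground branch marked "#1"
        "outfile": outfile,
        "seqtype": 1,           # codon sequences
        "model": 2,             # different omega for foreground branches
        "NSsites": 2,           # combined with model = 2: branch-site model A
        "icode": 4 if mitochondrial else 0,  # 4 = invertebrate mitochondrial
        "fix_omega": 1 if null else 0,       # null model fixes omega
        "omega": 1,             # fixed value (null) or starting value (alternative)
    }
    return "\n".join(f"{key} = {value}" for key, value in settings.items()) + "\n"
```

Calling the function twice per foreground branch, once with `null=True` and once with `null=False`, yields the pair of models contrasted by the LRT.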

The branch-site model is known to display convergence problems in the calculation of likelihoods (Yang and dos Reis 2011), leading to negative or artificially large log-likelihood ratios. We thus launched three independent runs for both the alternative and null hypotheses, for each branch of each gene family, and kept the best likelihood value of each run to calculate the log-likelihood ratio (Yang and dos Reis 2011). Of note, the likelihood differences observed across the three runs were most of the time very small. Even after reconciliation of three replicate runs, we still observed a number of negative log-likelihood ratios (8% of the tests—most of them very close to 0). In such cases, we manually set the log-likelihood ratios to 0 (meaning nonsignificance). We recorded the largest differences in likelihood values between the three independent runs in both fixed and alternative models (d). The distribution of differences was bimodal, with a first major mode at d = 0, gathering most data, and a second minor mode at d ∼ 1. A cutoff at d = 0.004 clearly separated the two peaks. We used this stringent cutoff (d > 0.004) to eliminate all tests with potential convergence problems in the fixed and alternative models (see supplementary text and table S23, Supplementary Material online).
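The reconciliation of replicate runs can be sketched as below. Two points are assumptions rather than statements from the source: d is taken as the larger of the two within-model spreads in log-likelihood, and the log-likelihood ratio is taken as twice the difference between the best alternative and best null values. The function name is hypothetical.

```python
def reconcile_runs(null_lnls, alt_lnls, d_cutoff=0.004):
    """Combine replicate runs of one branch-site test.

    Returns the log-likelihood ratio, or None when the replicates
    disagree by more than d_cutoff (potential convergence problem).
    """
    # Assumption: d is the largest spread observed within either model.
    d = max(max(null_lnls) - min(null_lnls),
            max(alt_lnls) - min(alt_lnls))
    if d > d_cutoff:
        return None
    # Keep the best (highest) likelihood of each model.
    lrt = 2.0 * (max(alt_lnls) - max(null_lnls))
    # Negative ratios are set to 0, meaning nonsignificance.
    return max(lrt, 0.0)
```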

Values of dN and dS were calculated with parameters extracted from Codeml results files (.mlc files).

All calculations were performed on the SIB Vital-IT cluster in Lausanne (http://www.vital-it.ch/, last accessed April 24, 2014). All three runs and the two hypotheses of each test were performed on the same node of the cluster.

Site Test for Positive Selection

The site test (Yang et al. 2000) of Codeml from the package PAML v4.4e (Yang 2007), allowing the dN/dS ratio (ω) to vary among sites, was run on the extensive data set of 6,186 families (see above). We contrasted the null model M8a (beta and ω with ω = 1) to the alternative model M8 (beta and ω with ω ≥ 1) with 11 site classes (Swanson et al. 2003; Wong et al. 2004). Examples of Codeml control files used are provided in supplementary text, Supplementary Material online. Similar to the branch-site test, we launched three independent runs for both the alternative and null hypotheses for each gene family and kept the best likelihood value of each run for the LRT (supplementary table S19, Supplementary Material online). The likelihood ratios were compared to a chi-square distribution with 1 degree of freedom as recommended in PAML user’s guide (http://abacus.gene.ucl.ac.uk/software/pamlDOC.pdf, last accessed April 24, 2014).

Reconstruction of Ancestral G + C Content

The program nhPhyml (Galtier and Gouy 1998; Guindon and Gascuel 2003; Boussau and Gouy 2006) was used to estimate the G + C content at third codon positions at each node of the gene family trees (topology fixed, transition/transversion ratio estimated, alpha parameter estimated with eight categories). Following Studer et al. (2008), we calculated the shift in GC3 content at each branch as the difference between GC3 contents at the nodes delimitating that branch.
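The quantity compared across branches can be illustrated with a short sketch. Ancestral GC3 values come from the nhPhyml model estimates and cannot be computed this way. The helper `gc3` only shows the definition on an observed sequence, and the sign convention of the shift (descendant minus ancestor) is an assumption. Both function names are hypothetical.

```python
def gc3(cds):
    """G + C content at third codon positions of an in-frame CDS."""
    third_positions = cds.upper()[2::3]
    # Ignore gaps and ambiguous characters.
    informative = [base for base in third_positions if base in "ACGT"]
    if not informative:
        raise ValueError("no informative third codon positions")
    return sum(base in "GC" for base in informative) / len(informative)

def gc3_shift(ancestral_gc3, descendant_gc3):
    """Shift in GC3 along a branch (assumed: descendant minus ancestor)."""
    return descendant_gc3 - ancestral_gc3
```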

Olfactory Receptors Family

Olfactory receptors (ORs) are difficult to process in automated pipelines since they are characterized by dynamic patterns of duplications and pseudogenization during evolution (Nozawa and Nei 2007). Furthermore, the sequences of ORs are highly variable and notoriously difficult for automatic gene annotation. Accordingly, our main data set of single-copy orthologs was depleted in genes involved in olfaction (supplementary tables S7, S8, S10, and S11, Supplementary Material online) and GO categories related to olfaction could not be tested for enrichment of positively selected genes because they included too few annotated genes. We therefore used a more comprehensive data set of 873 manually annotated protein-coding sequences of OR genes (excluding suspected pseudogenes) provided by Hugh Robertson for P. barbatus (291 genes) (Smith, Smith, et al. 2011), L. humile (320 genes) (Smith, Zimin, et al. 2011), and N. vitripennis (262 genes) (Werren et al. 2010). Nucleotide sequences were translated and amino-acid sequences were aligned using MAFFT (Katoh et al. 2005). Unreliably aligned residues were masked using GUIDANCE based on 32 bootstrap samples and a cutoff of 0.2 that was chosen so that the 15% lowest scoring residues are masked (Penn et al. 2010; Privman et al. 2012). Phylogeny was reconstructed using RAxML with the JTT substitution matrix, the CAT approximation, and 100 bootstrap samples (Stamatakis 2006). Because the resulting gene tree was too large for an analysis with the branch-site test of Codeml, we divided it into 16 smaller subtrees, each containing fewer than 100 leaves. Branches with the highest possible bootstrap support were chosen as splitting points. The 16 subtrees include all ant sequences but only 105 N. vitripennis sequences. The sequences from each subtree were realigned using PRANK version 100701 (Löytynoja and Goldman 2008) and reverse-translated into corresponding codon alignments. GUIDANCE was used to mask unreliably aligned codons (0.8 cutoff).
Phylogeny was reconstructed using RAxML as above. Out of 1,744 branches in the initial tree, 1,400 branches from the subtrees were tested using the branch-site test of Codeml (see above), and the computation was successful (both null and alternative hypotheses) for 1,184 branches. Significant branches are highlighted in red in figure 3 and in supplementary figure S1, Supplementary Material online. Full results of the branch-site test on all 16 clades are shown in supplementary table S18, Supplementary Material online. A full tree with branch names and bootstrap values is provided as supplementary figure S1, Supplementary Material online. Newick trees of the 16 individual subtrees along with annotation of tested branches are available in supplementary text, Supplementary Material online.

Tests of Functional Category Enrichment

GO (Ashburner et al. 2000) annotations for gene families were taken from the annotation of the D. melanogaster gene member they include (downloaded from http://flybase.org/static_pages/downloads/FB2011_02/go/gene_association.fb.gz, last accessed April 24, 2014). The annotation of children GO categories was propagated to their parent categories following the GO graph structure. GO categories mapped to 10 genes or less were discarded for the enrichment analysis.
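The propagation and size filter can be sketched as follows. This is an illustration with hypothetical function names and a hypothetical input layout (a gene-to-terms mapping and a term-to-parents mapping), not the code used in the study.

```python
def propagate_annotations(gene_to_terms, parents):
    """Propagate each gene's GO terms to all ancestor terms.

    gene_to_terms maps a gene to its directly annotated terms;
    parents maps a term to its direct parent terms in the GO graph.
    """
    term_to_genes = {}
    for gene, terms in gene_to_terms.items():
        stack = list(terms)
        seen = set()
        while stack:
            term = stack.pop()
            if term in seen:
                continue
            seen.add(term)
            term_to_genes.setdefault(term, set()).add(gene)
            stack.extend(parents.get(term, ()))
    return term_to_genes

def drop_small_categories(term_to_genes, max_discarded_size=10):
    """Discard categories mapped to 10 genes or fewer."""
    return {term: genes for term, genes in term_to_genes.items()
            if len(genes) > max_discarded_size}
```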

To identify over- and underrepresented functional categories present in the data sets used in this study, the package topGO version 2.4 (Alexa et al. 2006) of Bioconductor (Gentleman et al. 2004) was used. A Fisher's exact test was used, combined with the “elim” algorithm of topGO, which decorrelates the graph structure of the GO to reduce nonindependence problems (Alexa et al. 2006). The reference set was constituted of all OrthoDB families including a D. melanogaster gene with GO annotation. GO categories with an FDR < 20% are reported (Benjamini and Hochberg 1995).

Regarding the functional enrichment of genes targeted by positive selection, the Fisher's exact test approach has been criticized because it imposes the choice of an arbitrary cutoff to dichotomize genes into “significant” and “nonsignificant” categories. This leads to a loss of information and limits the power and robustness of this method (Allison et al. 2006; Tintle et al. 2009; Daub et al. 2013). To test GO functional categories for enrichment in positively selected genes, we instead used a gene set enrichment approach, which tests whether the distribution of scores of genes from a gene set differs from the distribution of scores in the whole data set, allowing the detection of gene sets that contain many marginally significant genes. Different implementations of this approach have been proposed. The most widely used is the gene set enrichment analysis (GSEA) (Subramanian et al. 2005), but it was shown to perform relatively poorly (Kim and Volsky 2005; Efron and Tibshirani 2007; Tintle et al. 2009). Here, we used a SUMSTAT test: for a given gene set g including n genes, the SUMSTAT statistic is defined as the sum of scores of the n genes. This statistic was shown to be more sensitive than a panel of other methods, while controlling well for the rate of false positives (Ackermann and Strimmer 2009; Tintle et al. 2009; Fehringer et al. 2012; Daub et al. 2013). The log-likelihood ratios of the positive selection test follow a chi-square distribution with 1 degree of freedom and span several orders of magnitude. To be able to use them as scores in the SUMSTAT test, we applied a fourth root transformation as a variance-stabilizing method. This transformation conserves the ranks of gene families (see http://udel.edu/~mcdonald/stattransform.html, last accessed April 24, 2014) (Canal 2005; McDonald 2009).
According to the Central Limit Theorem, the distribution of SUMSTAT scores from random gene sets of size n approaches a normal distribution whose mean and variance derive from the mean and variance of the scores of the complete list of tested genes G:

$$\mu_g = n \cdot \mathrm{mean}(G)$$

and

$$\sigma_g^2 = n \cdot \mathrm{var}(G)$$

We performed bidirectional tests against this distribution to test whether the SUMSTAT statistic for a given gene set is higher or lower than expected by chance, corresponding to respectively enrichment or depletion for positively selected genes in this gene set. We verified the accuracy of this methodology by drawing an empirical null distribution for each gene set of size n found in the real data set, based on scores of 10,000 gene sets of same size n randomly picked from the whole data set. The distribution of SUMSTAT scores of these randomized gene sets approximates closely a normal distribution, even when the set size is small (supplementary fig. S6, Supplementary Material online). This makes the SUMSTAT test less computationally intensive than other gene set enrichment approaches (e.g., GSEA) (Subramanian et al. 2005) where the null distribution cannot be inferred mathematically and randomizations have to be performed for each individual test. We verified that a GSEA approach gave broadly similar results (not shown).
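A minimal sketch of the SUMSTAT test under this normal approximation is given below, taking the null mean as n times the mean of all scores and the null variance as n times their variance. The use of the population variance and the function names are assumptions, and the study's own implementation may differ in such details.

```python
import math
from statistics import mean, pvariance

def fourth_root(lrt):
    """Variance-stabilizing transformation applied to LRT statistics."""
    return lrt ** 0.25

def sumstat_test(set_scores, all_scores):
    """SUMSTAT z-score and one-sided P-values for one gene set."""
    n = len(set_scores)
    mu = n * mean(all_scores)                      # expected sum under the null
    sigma = math.sqrt(n * pvariance(all_scores))   # assumption: population variance
    z = (sum(set_scores) - mu) / sigma
    # Upper and lower tails of the standard normal distribution.
    p_enriched = 0.5 * math.erfc(z / math.sqrt(2))
    p_depleted = 0.5 * math.erfc(-z / math.sqrt(2))
    return z, p_enriched, p_depleted
```

Here `all_scores` would hold the transformed scores of every tested gene family and `set_scores` those of the families annotated to one category. A small `p_enriched` indicates enrichment for positively selected genes and a small `p_depleted` indicates depletion.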

Because different gene sets sometimes share many genes in common, the list of significant gene sets resulting from enrichment tests is usually highly redundant. We implemented the “elim” algorithm from the Bioconductor package topGO, to decorrelate the graph structure of the GO (Alexa et al. 2006). Briefly, the GO categories are tested recursively starting from the deeper levels of the GO tree, and the genes annotated to these significant categories are removed from all their parent categories. As the tests for different categories are not independent, it is not clear whether classical approaches to assess the FDR (e.g., Benjamini and Hochberg 1995) are accurate. Thus, we calculated empirically an FDR at each P-value threshold by performing 100 randomizations where the scores of gene families were permuted and the gene set enrichment test rerun. The FDR is estimated as

$$\mathrm{FDR} = \frac{N_0}{N_t}$$

where at a given P-value threshold N0 represents the mean number of false positives obtained in the randomizations and Nt represents the number of positives obtained with the real data set. The FDR obtained with this approach was in good agreement with the Benjamini–Hochberg FDR (Benjamini and Hochberg 1995). GO categories with an FDR < 20% are reported. Functional categories depleted in positive selection reflect the most conserved sets of functional categories, under the action of purifying selection. These are not discussed in this article.
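The empirical FDR estimate can be sketched as the mean number of positives in the randomizations divided by the number of positives in the real data. The function name, the use of "less than or equal to" at the threshold, and the cap at 1 are assumptions made for this illustration.

```python
def empirical_fdr(real_pvalues, permuted_pvalue_sets, threshold):
    """Estimate the FDR at a P-value threshold from permutations.

    real_pvalues: P-values of all gene sets with the real scores.
    permuted_pvalue_sets: one list of P-values per randomization.
    """
    n_t = sum(p <= threshold for p in real_pvalues)
    if n_t == 0:
        return 0.0  # nothing is called significant at this threshold
    # Mean number of (false) positives over the randomizations.
    n_0 = sum(sum(p <= threshold for p in perm)
              for perm in permuted_pvalue_sets) / len(permuted_pvalue_sets)
    return min(1.0, n_0 / n_t)
```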

The gene set enrichment test run on each individual branch of the tree with the results of the branch-site test yielded heterogeneous results, probably resulting from differences in power of the branch-site test on different branches of the phylogeny (supplementary table S9, Supplementary Material online; only branches Sinv, Pbar, Hsal, #3 and #6 show some significant categories at FDR 20%). This test could also be sensitive to false positive results of the branch-site test (e.g., GC-biased gene conversion, discussed in supplementary text, Supplementary Material online). Thus, we designed a test less sensitive to these problems. We considered a unique score per gene family reflecting the evidence for positive selection globally in the ant lineage: the mean of the branch-site test scores on the 13 individual ant branches. This scoring scheme should unveil functional categories of genes that experienced extensive and probably recurrent episodes of positive selection in the ant lineage, but is not strictly equivalent to using the results of a site test on ant branches, since it allows the detection of gene families with positive selection events affecting different sites on different branches. We also checked that in most cases, the enriched categories were not significant only because of a single outlier gene with a strong positive selection score, but displayed a significant shift in the distribution of positive selection scores of numerous genes (supplementary fig. S7, Supplementary Material online).

Finally, as a sanity check, the gene set enrichment test was also performed using KEGG pathways annotation. KEGG pathways and the mapping to D. melanogaster genes were downloaded with the KEGG REST API (http://www.kegg.jp/kegg/rest/keggapi.html, last accessed April 24, 2014). Because hierarchical relationships among KEGG pathways are limited, we did not use the “elim” decorrelation algorithm. Pathways mapped to more than 10 genes were retained. In total, 51 KEGG pathways were tested.

Tests of Phenotypic Category Enrichment

Mutant phenotype annotations of D. melanogaster genes were extracted from Flybase (Drysdale 2001; Osumi-Sutherland et al. 2013). The following ontologies were downloaded from the OBO foundry (Smith et al. 2007): the Flybase controlled vocabulary ontology (http://www.berkeleybop.org/ontologies/obo-all/flybase_vocab/flybase_vocab.obo, last accessed April 24, 2014), the Drosophila anatomical ontology (http://www.berkeleybop.org/ontologies/obo-all/fly_anatomy/fly_anatomy.obo, last accessed April 24, 2014), and the Drosophila developmental stages ontology (http://www.berkeleybop.org/ontologies/obo-all/fly_development/fly_development.obo, last accessed April 24, 2014). The relationships between genes and alleles, and between alleles and phenotypes (anatomical and developmental ontology categories) were extracted from Flybase (ftp://ftp.flybase.net/releases/FB2012_01/reporting-xml/FBgn.xml.gz, last accessed April 24, 2014; “derived_pheno_class” and “derived_pheno_manifest” entities). The information on gain or loss-of-function alleles was extracted from the file ftp://ftp.flybase.net/releases/FB2012_01/reporting-xml/FBal.xml.gz (last accessed April 24, 2014) (loss of function: controlled vocabulary term FBcv:0000287 and child terms; gain of function: FBcv:0000290 and child terms). The annotation of child phenotypic categories (anatomy or development) was propagated to their parent categories following the structure of the respective ontologies.

To perform an enrichment analysis based on mutant phenotypes in fruit fly, we used the SUMSTAT test. Because the annotation is scarcer than the GO annotation, we used only the categories mapped to more than five genes for the enrichment analysis. The reported results include the annotation for gain and loss-of-function alleles. We observed very similar results when gain-of-function alleles were removed from the annotation (Weng and Liao 2011) (not shown).

Expression Data

Microarray expression data from S. invicta (Ometto et al. 2011) were provided by the authors upon request. These included expression levels of clones of the spotted microarray used, as well as the list of genes identified to be overexpressed in each of the three castes (workers, queens, and males), both at pupal and adult stages. The mapping of clones to the gene model of S. invicta (OGS_2.2.3) (Wurm et al. 2011) was provided by Y. Wurm, and is similar to the mapping used in Hunt et al. (2011). If multiple clones mapped to the same gene, the average signal was used for expression. For differential expression, we used the results of the original study (BAGEL analysis, where a clone was considered to be differentially expressed between conditions if the Bayesian posterior probability was P < 0.001, corresponding to an FDR ∼ 5%) (Ometto et al. 2011). A gene was considered differentially expressed if at least one clone mapped to it was found differentially expressed. Expression data were available for 1,327 genes of the single-copy orthologs data set, including 603 genes overexpressed in at least one condition. We ran a SUMSTAT gene set enrichment test on the sets of genes with caste-specific expression (pupal male, pupal queen, pupal worker, adult male, adult queen, and adult worker). P-values were obtained by comparison to an empirical distribution created with 10,000 randomizations of gene scores.

Aging Genes

Aging and oxidative stress associated genes were obtained from a microarray study in D. melanogaster comparing the expression of genes in 10-day-old flies to 61-day-old flies, and flies exposed to 100% O2 for 7 days to controls (Landis et al. 2004). We tested the enrichment for positively selected genes (SUMSTAT test) in four gene sets constituted of up and downregulated genes in both contrasts. P-values were obtained by comparison to an empirical distribution created with 10,000 randomizations of gene scores.

Genes with Mitochondrial Function

Genes with mitochondrial function were identified as those mapped to any of the 310 GO categories including “mitochondria*” in their names or synonym names (using the search engine on http://amigo.geneontology.org/, last accessed April 24, 2014). Three hundred and thirteen of the identified genes had available microarray expression data in S. invicta.
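The selection of GO categories by name or synonym can be approximated as follows. This sketch does not reproduce the AmiGO search engine: it assumes that the wildcard query “mitochondria*” matches any word beginning with "mitochondria", and the data structures, function names, and toy term identifiers are hypothetical.

```python
import re

# Assumed interpretation of the "mitochondria*" wildcard: any word that
# starts with "mitochondria" (e.g., "mitochondria", "mitochondrial").
MITO_PATTERN = re.compile(r"\bmitochondria\w*", re.IGNORECASE)


def select_mito_terms(go_terms):
    """Return IDs of terms whose name or any synonym matches the pattern.

    go_terms: dict mapping term ID -> (name, list of synonym names)
    """
    selected = set()
    for term_id, (name, synonyms) in go_terms.items():
        if any(MITO_PATTERN.search(text) for text in (name, *synonyms)):
            selected.add(term_id)
    return selected


def genes_with_mito_function(gene_to_terms, mito_terms):
    """Return genes mapped to at least one of the selected terms."""
    return {gene for gene, terms in gene_to_terms.items()
            if mito_terms & set(terms)}


if __name__ == "__main__":
    toy_terms = {
        "TERM:1": ("mitochondrial matrix", []),
        "TERM:2": ("mitochondrion", ["mitochondria"]),
        "TERM:3": ("nucleus", []),
    }
    toy_annotation = {"geneA": ["TERM:1"], "geneB": ["TERM:3"]}
    mito_terms = select_mito_terms(toy_terms)
    print(sorted(mito_terms))
    print(genes_with_mito_function(toy_annotation, mito_terms))
```

Under this interpretation, a term named "mitochondrion" is only retained if one of its synonyms contains a matching word, which is why synonyms are searched along with names.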

Data Availability

Raw and filtered alignments used in these analyses, track files for the alignment editor Jalview (Clamp et al. 2004), and Codeml control and result files can be downloaded at http://bioinfo.unil.ch/supdata/Roux_positive_selection_ants/Roux_et_al_datasets.tar.gz (last accessed April 24, 2014).

A simple web interface displaying gene families, GO mapping, Codeml results, and alignments (through a Jalview applet) is available at http://bioinfo.unil.ch/supdata/Roux_positive_selection_ants/families.html (last accessed April 24, 2014). Jalview tracks display the regions used or filtered out in the original protein alignments, as well as the residues found to be under positive selection by Bayes Empirical Bayes (Yang et al. 2005) in all the branches tested for each of the three replicate runs (fig. 2).

Supplementary Material

Supplementary tables S1–S26, figures S1–S7, and supplementary text are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments

The authors thank Yannick Wurm, Miguel Corona Villegas, Nicolas Salamin, Corrie Moreau, and members of the Keller laboratory for stimulating discussions. They are grateful to Oksana Riba-Grognuz, Lino Ometto, Roberto Bonasio, Robert Waterhouse, Hannes Schabauer, Walid Gharib, and members of the Ant Comparative Genomics Consortium for making data or software available for this study; Ben-Yang Liao and Meng-Pin Weng for help with Flybase phenotypic data extraction; Alexander Wild for providing illustrations for figure 1; and four anonymous reviewers for valuable comments. Computations were performed at the Vital-IT (http://www.vital-it.ch) center for high-performance computing of the SIB Swiss Institute of Bioinformatics. J.R. was funded by a Swiss NSF grant attributed to L.K., a Swiss NSF postdoc mobility fellowship (PBLAP3-134342) and a Marie Curie fellowship. M.R.R. and S.M. acknowledge funding from a Swiss NSF grant (31003A 133011/1), the Swiss Platform for High-Performance and High-Productivity Computing (HP2C), and project UNIL.5/SMSCG as part of the AAA/SWITCH. M.R.R. and J.T.D. acknowledge funding from a Swiss NSF ProDoc grant (PDFMP3_130309). L.K. is supported by several grants from the Swiss NSF and an ERC advanced grant. J.R., E.P., and L.K. designed the study; J.R. and E.P. analyzed data; J.T.D., S.M., and M.R.R. contributed code or programs; J.R. and L.K. wrote the manuscript with input from M.R.R.

References

  1. Abouheif E, Wray GA. Evolution of the gene network underlying wing polyphenism in ants. Science. 2002;297:249–252. doi: 10.1126/science.1071468.
  2. Ackermann M, Strimmer K. A general modular framework for gene set enrichment analysis. BMC Bioinformatics. 2009;10:47. doi: 10.1186/1471-2105-10-47.
  3. Adams MD, Celniker SE, Holt RA, Evans CA, Gocayne JD, Amanatides PG, Scherer SE, Li PW, Hoskins RA, Galle RF, et al. The genome sequence of Drosophila melanogaster. Science. 2000;287:2185–2195. doi: 10.1126/science.287.5461.2185.
  4. Alexa A, Rahnenfuhrer J, Lengauer T. Improved scoring of functional groups from gene expression data by decorrelating GO graph structure. Bioinformatics. 2006;22:1600–1607. doi: 10.1093/bioinformatics/btl140.
  5. Allison DB, Cui X, Page GP, Sabripour M. Microarray data analysis: from disarray to consolidation and consensus. Nat Rev Genet. 2006;7:55–65. doi: 10.1038/nrg1749.
  6. Amdam GV, Omholt SW. The regulatory anatomy of honeybee lifespan. J Theor Biol. 2002;216:209–228. doi: 10.1006/jtbi.2002.2545.
  7. Anisimova M, Yang Z. Multiple hypothesis testing to detect lineages under positive selection that affects only a few sites. Mol Biol Evol. 2007;24:1219–1228. doi: 10.1093/molbev/msm042.
  8. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000;25:25–29. doi: 10.1038/75556.
  9. Bakewell MA, Shi P, Zhang J. More genes underwent positive selection in chimpanzee evolution than in human evolution. Proc Natl Acad Sci U S A. 2007;104:7489–7494. doi: 10.1073/pnas.0701705104.
  10. Bard F, Casano L, Mallabiabarrena A, Wallace E, Saito K, Kitayama H, Guizzunti G, Hu Y, Wendler F, Dasgupta R, et al. Functional genomics reveals genes involved in protein secretion and Golgi organization. Nature. 2006;439:604–607. doi: 10.1038/nature04377.
  11. Bazin E, Glemin S, Galtier N. Population size does not influence mitochondrial genetic diversity in animals. Science. 2006;312:570–572. doi: 10.1126/science.1122033.
  12. Bazykin GA, Kondrashov AS. Major role of positive selection in the evolution of conservative segments of Drosophila proteins. Proc R Soc B Biol Sci. 2012;279:3409–3417. doi: 10.1098/rspb.2012.0776.
  13. Bechstedt S, Albert JT, Kreil DP, Muller-Reichert T, Gopfert MC, Howard J. A doublecortin containing microtubule-associated protein is implicated in mechanotransduction in Drosophila sensory cilia. Nat Commun. 2010;1:11. doi: 10.1038/ncomms1007.
  14. Beller M, Sztalryd C, Southall N, Bell M, Jäckle H, Auld DS, Oliver B. COPI complex is a regulator of lipid homeostasis. PLoS Biol. 2008;6:e292. doi: 10.1371/journal.pbio.0060292.
  15. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Methodol. 1995;57:289–300.
  16. Bernt M, Donath A, Juhling F, Externbrink F, Florentz C, Fritzsch G, Putz J, Middendorf M, Stadler PF. MITOS: improved de novo metazoan mitochondrial genome annotation. Mol Phylogenet Evol. 2012;69:313–319. doi: 10.1016/j.ympev.2012.08.023.
  17. Blanke S, Jackle H. Novel guanine nucleotide exchange factor GEFmeso of Drosophila melanogaster interacts with Ral and Rho GTPase Cdc42. FASEB J. 2006;20:683–691. doi: 10.1096/fj.05-5376com.
  18. Bodily KD, Morrison CM, Renden RB, Broadie K. A novel member of the Ig superfamily, turtle, is a CNS-specific protein required for coordinated motor control. J Neurosci. 2001;21:3113–3125. doi: 10.1523/JNEUROSCI.21-09-03113.2001.
  19. Bonasio R, Zhang G, Ye C, Mutti NS, Fang X, Qin N, Donahue G, Yang P, Li Q, Li C, et al. Genomic comparison of the ants Camponotus floridanus and Harpegnathos saltator. Science. 2010;329:1068–1071. doi: 10.1126/science.1192428.
  20. Boussau B, Gouy M. Efficient likelihood computations with nonreversible models of evolution. Syst Biol. 2006;55:756–768. doi: 10.1080/10635150600975218.
  21. Brady SG, Schultz TR, Fisher BL, Ward PS. Evaluating alternative hypotheses for the early evolution and diversification of ants. Proc Natl Acad Sci U S A. 2006;103:18172–18177. doi: 10.1073/pnas.0605858103.
  22. Bronstein R, Levkovitz L, Yosef N, Yanku M, Ruppin E, Sharan R, Westphal H, Oliver B, Segal D. Transcriptional regulation by CHIP/LDB complexes. PLoS Genet. 2010;6:e1001063. doi: 10.1371/journal.pgen.1001063.
  23. Brunet FG, Crollius HR, Paris M, Aury JM, Gibert P, Jaillon O, Laudet V, Robinson-Rechavi M. Gene loss and evolutionary rates following whole-genome duplication in teleost fishes. Mol Biol Evol. 2006;23:1808–1816. doi: 10.1093/molbev/msl049.
  24. Brunet-Rossinni A, Austad S. Ageing studies on bats: a review. Biogerontology. 2004;5:211–222. doi: 10.1023/B:BGEN.0000038022.65024.d8.
  25. Brunet-Rossinni AK. Reduced free-radical production and extreme longevity in the little brown bat (Myotis lucifugus) versus two non-flying mammals. Mech Ageing Dev. 2004;125:11–20. doi: 10.1016/j.mad.2003.09.003.
  26. Bulmer MS. Evolution of immune proteins in insects. Encyclopedia of life sciences (ELS). Chichester (NH): John Wiley & Sons, Ltd; 2010.
  27. Burton RS, Barreto FS. A disproportionate role for mtDNA in Dobzhansky–Muller incompatibilities? Mol Ecol. 2012;21:4942–4957. doi: 10.1111/mec.12006.
  28. Canal L. A normal approximation for the chi-square distribution. Comput Stat Data Anal. 2005;48:803–808.
  29. Chen CC, Wu JK, Lin HW, Pai TP, Fu TF, Wu CL, Tully T, Chiang AS. Visualizing long-term memory formation in two neurons of the Drosophila brain. Science. 2012;335:678–685. doi: 10.1126/science.1212735.
  30. Clamp M, Cuff J, Searle SM, Barton GJ. The Jalview Java alignment editor. Bioinformatics. 2004;20:426–427. doi: 10.1093/bioinformatics/btg430.
  31. Collier S, Chan HYE, Toda T, McKimmie C, Johnson G, Adler PN, O’Kane C, Ashburner M. The Drosophila embargoed gene is required for larval progression and encodes the functional homolog of Schizosaccharomyces Crm1. Genetics. 2000;155:1799–1807. doi: 10.1093/genetics/155.4.1799.
  32. Corley LS, Lavine MD. A review of insect stem cell types. Semin Cell Dev Biol. 2006;17:510–517. doi: 10.1016/j.semcdb.2006.07.002.
  33. Corona M, Hughes KA, Weaver DB, Robinson GE. Gene expression patterns associated with queen honey bee longevity. Mech Ageing Dev. 2005;126:1230–1238. doi: 10.1016/j.mad.2005.07.004.
  34. Corona M, Velarde RA, Remolina S, Moran-Lauter A, Wang Y, Hughes KA, Robinson GE. Vitellogenin, juvenile hormone, insulin signaling, and queen honey bee longevity. Proc Natl Acad Sci U S A. 2007;104:7128–7133. doi: 10.1073/pnas.0701909104.
  35. Cortopassi GA. A neutral theory predicts multigenic aging and increased concentrations of deleterious mutations on the mitochondrial and Y chromosomes. Free Radic Biol Med. 2002;33:605–610. doi: 10.1016/s0891-5849(02)00966-8.
  36. Cronin SJ, Nehme NT, Limmer S, Liegeois S, Pospisilik JA, Schramek D, Leibbrandt A, Simoes Rde M, Gruber S, Puc U, et al. Genome-wide RNAi screen identifies genes involved in intestinal pathogenic bacterial infection. Science. 2009;325:340–343. doi: 10.1126/science.1173164.
  37. Crozier RH, Crozier YC. The mitochondrial genome of the honeybee Apis mellifera: complete sequence and genome organization. Genetics. 1993;133:97–117. doi: 10.1093/genetics/133.1.97.
  38. Cui H, Kong Y, Zhang H. Oxidative stress, mitochondrial dysfunction, and aging. J Signal Transduct. 2012;2012:646354. doi: 10.1155/2012/646354.
  39. Curtis C, Landis GN, Folk D, Wehr NB, Hoe N, Waskar M, Abdueva D, Skvortsov D, Ford D, Luu A, et al. Transcriptional profiling of MnSOD-mediated lifespan extension in Drosophila reveals a species-general network of aging and metabolic genes. Genome Biol. 2007;8:R262. doi: 10.1186/gb-2007-8-12-r262.
  40. Daub JT, Hofer T, Cutivet E, Dupanloup I, Quintana-Murci L, Robinson-Rechavi M, Excoffier L. Evidence for polygenic adaptation to pathogens in the human genome. Mol Biol Evol. 2013;30:1544–1558. doi: 10.1093/molbev/mst080.
  41. Davis JC, Petrov DA. Preferential duplication of conserved proteins in eukaryotic genomes. PLoS Biol. 2004;2:e55. doi: 10.1371/journal.pbio.0020055.
  42. Dawkins R, Krebs JR. Arms races between and within species. Proc R Soc B Biol Sci. 1979;205:489–511. doi: 10.1098/rspb.1979.0081.
  43. Didelot G, Molinari F, Tchenio P, Comas D, Milhiet E, Munnich A, Colleaux L, Preat T. Tequila, a neurotrypsin ortholog, regulates long-term memory formation in Drosophila. Science. 2006;313:851–853. doi: 10.1126/science.1127215.
  44. Drosophila 12 Genomes Consortium. Evolution of genes and genomes on the Drosophila phylogeny. Nature. 2007;450:203–218. doi: 10.1038/nature06341.
  45. Drysdale R. Phenotypic data in FlyBase. Brief Bioinform. 2001;2:68–80. doi: 10.1093/bib/2.1.68.
  46. Duret L. Neutral theory: the null hypothesis of molecular evolution. Nat Educ. 2008;1:218.
  47. Efron B, Tibshirani R. On testing the significance of sets of genes. Ann Appl Stat. 2007;1:107–129.
  48. Farris SM, Schulmeister S. Parasitoidism, not sociality, is associated with the evolution of elaborate mushroom bodies in the brains of hymenopteran insects. Proc R Soc B Biol Sci. 2011;278:940–951. doi: 10.1098/rspb.2010.2161.
  49. Fehringer G, Liu G, Briollais L, Brennan P, Amos CI, Spitz MR, Bickeböller H, Wichmann HE, Risch A, Hung RJ. Comparison of pathway analysis approaches using lung cancer GWAS data sets. PLoS One. 2012;7:e31816. doi: 10.1371/journal.pone.0031816.
  50. Finkel T, Holbrook NJ. Oxidants, oxidative stress and the biology of ageing. Nature. 2000;408:239–247. doi: 10.1038/35041687.
  51. Fischman BJ, Woodard SH, Robinson GE. Molecular evolutionary analyses of insect societies. Proc Natl Acad Sci U S A. 2011;108(Suppl 2):10847–10854. doi: 10.1073/pnas.1100301108.
  52. Fletcher W, Yang Z. The effect of insertions, deletions, and alignment errors on the branch-site test of positive selection. Mol Biol Evol. 2010;27:2257–2267. doi: 10.1093/molbev/msq115.
  53. Force A, Lynch M, Pickett FB, Amores A, Yan YL, Postlethwait J. Preservation of duplicate genes by complementary, degenerative mutations. Genetics. 1999;151:1531–1545. doi: 10.1093/genetics/151.4.1531.
  54. Galtier N, Gouy M. Inferring pattern and process: maximum-likelihood implementation of a nonhomogeneous model of DNA sequence evolution for phylogenetic analysis. Mol Biol Evol. 1998;15:871–879. doi: 10.1093/oxfordjournals.molbev.a025991.
  55. Gentleman R, Carey V, Bates D, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004;5:R80. doi: 10.1186/gb-2004-5-10-r80.
  56. George RD, McVicker G, Diederich R, Ng SB, MacKenzie AP, Swanson WJ, Shendure J, Thomas JH. Trans genomic capture and sequencing of primate exomes reveals new targets of positive selection. Genome Res. 2011;21:1686–1694. doi: 10.1101/gr.121327.111.
  57. Gerber AS, Loggins R, Kumar S, Dowling TE. Does nonneutral evolution shape observed patterns of DNA variation in animal mitochondrial genomes? Annu Rev Genet. 2001;35:539–566. doi: 10.1146/annurev.genet.35.102401.091106.
  58. Gharib WH, Robinson-Rechavi M. The branch-site test of positive selection is surprisingly robust but lacks power under synonymous substitution saturation and variation in GC. Mol Biol Evol. 2013;30:1675–1686. doi: 10.1093/molbev/mst062.
  59. Gotzek D, Clarke J, Shoemaker D. Mitochondrial genome evolution in fire ants (Hymenoptera: Formicidae). BMC Evol Biol. 2010;10:300. doi: 10.1186/1471-2148-10-300.
  60. Gouveia-Oliveira R, Sackett P, Pedersen A. MaxAlign: maximizing usable data in an alignment. BMC Bioinformatics. 2007;8:312. doi: 10.1186/1471-2105-8-312.
  61. Guindon S, Gascuel O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol. 2003;52:696–704. doi: 10.1080/10635150390235520.
  62. Hall DW, Goodisman MA. The effects of kin selection on rates of molecular evolution in social insects. Evolution. 2012;66:2080–2093. doi: 10.1111/j.1558-5646.2012.01602.x.
  63. Hambuch TM, Parsch J. Patterns of synonymous codon usage in Drosophila melanogaster genes with sex-biased expression. Genetics. 2005;170:1691–1700. doi: 10.1534/genetics.104.038109.
  64. Harman D. The biologic clock: the mitochondria? J Am Geriatr Soc. 1972;20:145–147. doi: 10.1111/j.1532-5415.1972.tb00787.x.
  65. Harpur BA, Zayed A. Accelerated evolution of innate immunity proteins in social insects: adaptive evolution or relaxed constraint? Mol Biol Evol. 2013;30:1665–1674. doi: 10.1093/molbev/mst061.
  66. He X, Zhang J. Higher duplicability of less important genes in yeast genomes. Mol Biol Evol. 2006;23:144–151. doi: 10.1093/molbev/msj015.
  67. Hölldobler B, Wilson E. The ants. Cambridge (MA): Belknap Press of Harvard University Press; 1990.
  68. Honeybee Genome Sequencing Consortium. Insights into social insects from the genome of the honeybee Apis mellifera. Nature. 2006;443:931–949. doi: 10.1038/nature05260.
  69. Hovemann BT, Sehlmeyer F, Malz J. Drosophila melanogaster NADPH–cytochrome P450 oxidoreductase: pronounced expression in antennae may be related to odorant clearance. Gene. 1997;189:213–219. doi: 10.1016/s0378-1119(96)00851-7.
  70. Hunt BG, Ometto L, Wurm Y, Shoemaker D, Yi SV, Keller L, Goodisman MAD. Relaxed selection is a precursor to the evolution of phenotypic plasticity. Proc Natl Acad Sci U S A. 2011;108:15936–15941. doi: 10.1073/pnas.1104825108.
  71. Ingram KK, Oefner P, Gordon DM. Task-specific expression of the foraging gene in harvester ants. Mol Ecol. 2005;14:813–818. doi: 10.1111/j.1365-294X.2005.02450.x.
  72. Jemielity S, Chapuisat M, Parker JD, Keller L. Long live the queen: studying aging in social insects. Age. 2005;27:241–248. doi: 10.1007/s11357-005-2916-z.
  73. Jensen TF, Holm-Jensen I. Energetic cost of running in workers of three ant species, Formica fusca L., Formica rufa L., and Camponotus herculeanus L. (Hymenoptera, Formicidae). J Comp Physiol. 1980;137:151–156.
  74. Jobson RW, Nabholz B, Galtier N. An evolutionary genome scan for longevity-related natural selection in mammals. Mol Biol Evol. 2010;27:840–847. doi: 10.1093/molbev/msp293.
  75. Jordan G, Goldman N. The effects of alignment error and alignment filtering on the sitewise detection of positive selection. Mol Biol Evol. 2012;29:1125–1139. doi: 10.1093/molbev/msr272.
  76. Katoh K, Kuma K, Toh H, Miyata T. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 2005;33:511–518. doi: 10.1093/nar/gki198.
  77. Keller L, Genoud M. Extraordinary lifespans in ants: a test of evolutionary theories of ageing. Nature. 1997;389:958–960.
  78. Keller L, Jemielity S. Social insects as a model to study the molecular basis of ageing. Exp Gerontol. 2006;41:553–556. doi: 10.1016/j.exger.2006.04.002.
  79. Kim HJ, Morrow G, Westwood JT, Michaud S, Tanguay RM. Gene expression profiling implicates OXPHOS complexes in lifespan extension of flies over-expressing a small mitochondrial chaperone, Hsp22. Exp Gerontol. 2010;45:611–620. doi: 10.1016/j.exger.2009.12.012.
  80. Kim HS, Murphy T, Xia J, Caragea D, Park Y, Beeman RW, Lorenzen MD, Butcher S, Manak JR, Brown SJ. BeetleBase in 2010: revisions to provide comprehensive genomic information for Tribolium castaneum. Nucleic Acids Res. 2010;38:D437–D442. doi: 10.1093/nar/gkp807.
  81. Kim SY, Volsky DJ. PAGE: parametric analysis of gene set enrichment. BMC Bioinformatics. 2005;6:144. doi: 10.1186/1471-2105-6-144.
  82. Kirkness EF, Haas BJ, Sun W, Braig HR, Perotti MA, Clark JM, Lee SH, Robertson HM, Kennedy RC, Elhaik E, et al. Genome sequences of the human body louse and its primary endosymbiont provide insights into the permanent parasitic lifestyle. Proc Natl Acad Sci U S A. 2010;107:12168–12173. doi: 10.1073/pnas.1003379107.
  83. Kishita Y, Tsuda M, Aigaki T. Impaired fatty acid oxidation in a Drosophila model of mitochondrial trifunctional protein (MTP) deficiency. Biochem Biophys Res Commun. 2012;419:344–349. doi: 10.1016/j.bbrc.2012.02.026.
  84. Kiss DL, Andrulis ED. Genome-wide analysis reveals distinct substrate specificities of Rrp6, Dis3, and core exosome subunits. RNA. 2010;16:781–791. doi: 10.1261/rna.1906710.
  85. Kosiol C, Vinar T, da Fonseca RR, Hubisz MJ, Bustamante CD, Nielsen R, Siepel A. Patterns of positive selection in six mammalian genomes. PLoS Genet. 2008;4:e1000144. doi: 10.1371/journal.pgen.1000144.
  86. Kowald A, Kirkwood TB. Evolution of the mitochondrial fusion-fission cycle and its role in aging. Proc Natl Acad Sci U S A. 2011;108:10237–10242. doi: 10.1073/pnas.1101604108.
  87. Kuan YS, Brewer-Jensen P, Bai WL, Hunter C, Wilson CB, Bass S, Abernethy J, Wing JS, Searles LL. Drosophila suppressor of sable protein [Su(s)] promotes degradation of aberrant and transposon-derived RNAs. Mol Cell Biol. 2009;29:5590–5603. doi: 10.1128/MCB.00039-09.
  88. Kucherenko MM, Marrone AK, Rishko VM, Magliarelli Hde F, Shcherbata HR. Stress and muscular dystrophy: a genetic screen for dystroglycan and dystrophin interactors in Drosophila identifies cellular stress response components. Dev Biol. 2011;352:228–242. doi: 10.1016/j.ydbio.2011.01.013.
  89. Kulmuni J, Wurm Y, Pamilo P. Comparative genomics of chemosensory protein genes reveals rapid evolution and positive selection in ant-specific duplicates. Heredity. 2013;110:538–547. doi: 10.1038/hdy.2012.122.
  90. Landis GN, Abdueva D, Skvortsov D, Yang J, Rabin BE, Carrick J, Tavare S, Tower J. Similar gene expression patterns characterize aging and oxidative stress in Drosophila melanogaster. Proc Natl Acad Sci U S A. 2004;101:7663–7668. doi: 10.1073/pnas.0307605101.
  91. Lawson D, Arensburger P, Atkinson P, Besansky NJ, Bruggner RV, Butler R, Campbell KS, Christophides GK, Christley S, Dialynas E, et al. VectorBase: a data resource for invertebrate vector genomics. Nucleic Acids Res. 2009;37:D583–D587. doi: 10.1093/nar/gkn857.
  92. Leboeuf AC, Benton R, Keller L. The molecular basis of social behavior: models, methods and advances. Curr Opin Neurobiol. 2013;23:3–10. doi: 10.1016/j.conb.2012.08.008.
  93. Lee HY, Chou JY, Cheong L, Chang NH, Yang SY, Leu JY. Incompatibility of nuclear and mitochondrial genomes causes hybrid sterility between two yeast species. Cell. 2008;135:1065–1073. doi: 10.1016/j.cell.2008.10.047.
  94. Lenaz G. Role of mitochondria in oxidative stress and ageing. Biochim Biophys Acta. 1998;1366:53–67. doi: 10.1016/s0005-2728(98)00120-0.
  95. Lihoreau M, Latty T, Chittka L. An exploration of the social brain hypothesis in insects. Front Physiol. 2012;3:442. doi: 10.3389/fphys.2012.00442.
  96. Lindblad-Toh K, Garber M, Zuk O, Lin MF, Parker BJ, Washietl S, Kheradpour P, Ernst J, Jordan G, Mauceli E, et al. A high-resolution map of human evolutionary constraint using 29 mammals. Nature. 2011;478:476–482. doi: 10.1038/nature10530.
  97. Linksvayer TA, Wade MJ. Genes with social effects are expected to harbor more sequence variation within and between species. Evolution. 2009;63:1685–1696. doi: 10.1111/j.1558-5646.2009.00670.x.
  98. Löytynoja A, Goldman N. Phylogeny-aware gap placement prevents errors in sequence alignment and evolutionary analysis. Science. 2008;320:1632–1635. doi: 10.1126/science.1158395.
  99. Löytynoja A, Vilella AJ, Goldman N. Accurate extension of multiple sequence alignments using a phylogeny-aware graph algorithm. Bioinformatics. 2012;28:1684–1691. doi: 10.1093/bioinformatics/bts198.
  100. Markova-Raina P, Petrov D. High sensitivity to aligner and high rate of false positives in the estimates of positive selection in the 12 Drosophila genomes. Genome Res. 2011;21:863–874. doi: 10.1101/gr.115949.110.
  101. McDonald JH. Handbook of biological statistics. Baltimore (MD): Sparky House Publishing; 2009. [Google Scholar]
  102. Meiklejohn CD, Montooth KL, Rand DM. Positive and negative selection on the mitochondrial genome. Trends Genet. 2007;23:259–263. doi: 10.1016/j.tig.2007.03.008. [DOI] [PubMed] [Google Scholar]
  103. Mockett RJ, Orr WC, Rahmandar JJ, Benes JJ, Radyuk SN, Klichko VI, Sohal RS. Overexpression of Mn-containing superoxide dismutase in transgenic Drosophila melanogaster. Arch Biochem Biophys. 1999;371:260–269. doi: 10.1006/abbi.1999.1460. [DOI] [PubMed] [Google Scholar]
  104. Molnar J, Fong KSK, He QP, Hayashi K, Kim Y, Fong SFT, Fogelgren B, Molnarne Szauter K, Mink M, Csiszar K. Structural and functional diversity of lysyl oxidase and the LOX-like proteins. Biochim Biophys Acta. 2003;1647:220–224. doi: 10.1016/s1570-9639(03)00053-0. [DOI] [PubMed] [Google Scholar]
  105. Molnar J, Ujfaludi Z, Fong SF, Bollinger JA, Waro G, Fogelgren B, Dooley DM, Mink M, Csiszar K. Drosophila lysyl oxidases Dmloxl-1 and Dmloxl-2 are differentially expressed and the active DmLOXL-1 influences gene expression and development. J Biol Chem. 2005;280:22977–22985. doi: 10.1074/jbc.M503006200. [DOI] [PubMed] [Google Scholar]
  106. Montooth KL, Rand DM. The spectrum of mitochondrial mutation differs across species. PLoS Biol. 2008;6:e213. doi: 10.1371/journal.pbio.0060213. [DOI] [PMC free article] [PubMed] [Google Scholar]
  107. Montoya-Burgos JI. Patterns of positive selection and neutral evolution in the protein-coding genes of Tetraodon and Takifugu. PLoS One. 2011;6:e24800. doi: 10.1371/journal.pone.0024800. [DOI] [PMC free article] [PubMed] [Google Scholar]
  108. Moreau CS, Bell CD, Vila R, Archibald SB, Pierce NE. Phylogeny of the ants: diversification in the age of angiosperms. Science. 2006;312:101–104. doi: 10.1126/science.1124891. [DOI] [PubMed] [Google Scholar]
  109. Moretti S, Laurenczy B, Gharib WH, Castella B, Kuzniar A, Schabauer H, Studer RA, Valle M, Salamin N, Stockinger H, et al. Selectome update: quality control and computational improvements to a database of positive selection. Nucleic Acids Res. 2014;42:D917–D921. doi: 10.1093/nar/gkt1065. [DOI] [PMC free article] [PubMed] [Google Scholar]
  110. Munch D, Amdam GV, Wolschin F. Ageing in a eusocial insect: molecular and physiological characteristics of life span plasticity in the honey bee. Funct Ecol. 2008;22:407–421. doi: 10.1111/j.1365-2435.2008.01419.x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  111. Munoz-Torres MC, Reese JT, Childers CP, Bennett AK, Sundaram JP, Childs KL, Anzola JM, Milshina N, Elsik CG. Hymenoptera Genome Database: integrated community resources for insect species of the order Hymenoptera. Nucleic Acids Res. 2011;39:D658–D662. doi: 10.1093/nar/gkq1145. [DOI] [PMC free article] [PubMed] [Google Scholar]
  112. Neumuller RA, Richter C, Fischer A, Novatchkova M, Neumuller KG, Knoblich JA. Genome-wide analysis of self-renewal in Drosophila neural stem cells by transgenic RNAi. Cell Stem Cell. 2011;8:580–593. doi: 10.1016/j.stem.2011.02.022. [DOI] [PMC free article] [PubMed] [Google Scholar]
  113. Niven JE, Scharlemann JP. Do insect metabolic rates at rest and during flight scale with body mass? Biol Lett. 2005;1:346–349. doi: 10.1098/rsbl.2005.0311. [DOI] [PMC free article] [PubMed] [Google Scholar]
  114. Notredame C, Higgins DG, Heringa J. T-coffee: a novel method for fast and accurate multiple sequence alignment. J Mol Biol. 2000;302:205–217. doi: 10.1006/jmbi.2000.4042. [DOI] [PubMed] [Google Scholar]
  115. Nozawa M, Nei M. Evolutionary dynamics of olfactory receptor genes in Drosophila species. Proc Natl Acad Sci U S A. 2007;104:7122–7127. doi: 10.1073/pnas.0702133104. [DOI] [PMC free article] [PubMed] [Google Scholar]
  116. Nygaard S, Zhang G, Schiøtt M, Li C, Wurm Y, Hu H, Zhou J, Ji L, Qiu F, Rasmussen M, et al. The genome of the leaf-cutting ant Acromyrmex echinatior suggests key adaptations to advanced social life and fungus farming. Genome Res. 2011;21:1339–1348. doi: 10.1101/gr.121392.111. [DOI] [PMC free article] [PubMed] [Google Scholar]
  117. Oliveira DC, Raychoudhury R, Lavrov DV, Werren JH. Rapidly evolving mitochondrial genome and directional selection in mitochondrial genes in the parasitic wasp nasonia (hymenoptera: pteromalidae) Mol Biol Evol. 2008;25:2167–2180. doi: 10.1093/molbev/msn159. [DOI] [PMC free article] [PubMed] [Google Scholar]
  118. Oliver TA, Garfield DA, Manier MK, Haygood R, Wray GA, Palumbi SR. Whole-genome positive selection and habitat-driven evolution in a shallow and a deep-sea urchin. Genome Biol Evol. 2010;2:800–814. doi: 10.1093/gbe/evq063.
  119. Ometto L, Shoemaker D, Ross KG, Keller L. Evolution of gene expression in fire ants: the effects of developmental stage, caste, and species. Mol Biol Evol. 2011;28:1381–1392. doi: 10.1093/molbev/msq322.
  120. Osumi-Sutherland D, Marygold S, Millburn G, McQuilton P, Ponting L, Stefancsik R, Falls K, Brown N, Gkoutos G. The Drosophila phenotype ontology. J Biomed Semantics. 2013;4:30. doi: 10.1186/2041-1480-4-30.
  121. Parker JD, Parker KM, Sohal BH, Sohal RS, Keller L. Decreased expression of Cu-Zn superoxide dismutase 1 in ants with extreme lifespan. Proc Natl Acad Sci U S A. 2004;101:3486–3489. doi: 10.1073/pnas.0400222101.
  122. Penick CA, Prager SS, Liebig J. Juvenile hormone induces queen development in late-stage larvae of the ant Harpegnathos saltator. J Insect Physiol. 2012;58:1643–1649. doi: 10.1016/j.jinsphys.2012.10.004.
  123. Penn O, Privman E, Landan G, Graur D, Pupko T. An alignment confidence score capturing robustness to guide tree uncertainty. Mol Biol Evol. 2010;27:1759–1767. doi: 10.1093/molbev/msq066.
  124. Perez-Campo R, López-Torres M, Cadenas S, Rojas C, Barja G. The rate of free radical production as a determinant of the rate of aging: evidence from the comparative approach. J Comp Physiol B Biochem Syst Environ Physiol. 1998;168:149–158. doi: 10.1007/s003600050131.
  125. Privman E, Penn O, Pupko T. Improving the performance of positive selection inference by filtering unreliable alignment regions. Mol Biol Evol. 2012;29:1–5. doi: 10.1093/molbev/msr177.
  126. Proux E, Studer RA, Moretti S, Robinson-Rechavi M. Selectome: a database of positive selection. Nucleic Acids Res. 2009;37:D404–D407. doi: 10.1093/nar/gkn768.
  127. Qi H, Rath U, Wang D, Xu YZ, Ding Y, Zhang W, Blacketer MJ, Paddy MR, Girton J, Johansen J, et al. Megator, an essential coiled-coil protein that localizes to the putative spindle matrix during mitosis in Drosophila. Mol Biol Cell. 2004;15:4854–4865. doi: 10.1091/mbc.E04-07-0579.
  128. Rajakumar R, San Mauro D, Dijkstra MB, Huang MH, Wheeler DE, Hiou-Tim F, Khila A, Cournoyea M, Abouheif E. Ancestral developmental potential facilitates parallel evolution in ants. Science. 2012;335:79–82. doi: 10.1126/science.1211451.
  129. Remolina SC, Chang PL, Leips J, Nuzhdin SV, Hughes KA. Genomic basis of aging and life-history evolution in Drosophila melanogaster. Evolution. 2012;66:3390–3403. doi: 10.1111/j.1558-5646.2012.01710.x.
  130. Richly E, Chinnery PF, Leister D. Evolutionary diversification of mitochondrial proteomes: implications for human disease. Trends Genet. 2003;19:356–362. doi: 10.1016/S0168-9525(03)00137-9.
  131. Robertson HM, Wanner KW. The chemoreceptor superfamily in the honey bee, Apis mellifera: expansion of the odorant, but not gustatory, receptor family. Genome Res. 2006;16:1395–1403. doi: 10.1101/gr.5057506.
  132. Roth P, Xylourgidis N, Sabri N, Uv A, Fornerod M, Samakovlis C. The Drosophila nucleoporin DNup88 localizes DNup214 and CRM1 on the nuclear envelope and attenuates NES-mediated nuclear export. J Cell Biol. 2003;163:701–706. doi: 10.1083/jcb.200304046.
  133. Schabauer H, Valle M, Pacher C, Stockinger H, Stamatakis A, Robinson-Rechavi M, Yang Z, Salamin N. SlimCodeML: an optimized version of CodeML for the branch-site model. 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum. Shanghai. 2012. p. 706–714.
  134. Schneider A, Souvorov A, Sabath N, Landan G, Gonnet GH, Graur D. Estimates of positive Darwinian selection are inflated by errors in sequencing, annotation, and alignment. Genome Biol Evol. 2009;1:114–118. doi: 10.1093/gbe/evp012.
  135. Schwander T, Lo N, Beekman M, Oldroyd BP, Keller L. Nature versus nurture in social insect caste differentiation. Trends Ecol Evol. 2010;25:275–282. doi: 10.1016/j.tree.2009.12.001.
  136. Shen YY, Liang L, Zhu ZH, Zhou WP, Irwin DM, Zhang YP. Adaptive evolution of energy metabolism genes and the origin of flight in bats. Proc Natl Acad Sci U S A. 2010;107:8666–8671. doi: 10.1073/pnas.0912613107.
  137. Simola DF, Wissler L, Donahue G, Waterhouse RM, Helmkampf M, Roux J, Nygaard S, Glastad KM, Hagen DE, Viljakainen L, et al. Social insect genomes exhibit dramatic evolution in gene composition and regulation while preserving regulatory features linked to sociality. Genome Res. 2013;23:1235–1247. doi: 10.1101/gr.155408.113.
  138. Smith B, Ashburner M, Rosse C, Bard J, Bug W, Ceusters W, Goldberg LJ, Eilbeck K, Ireland A, Mungall CJ, et al. The OBO Foundry: coordinated evolution of ontologies to support biomedical data integration. Nat Biotechnol. 2007;25:1251–1255. doi: 10.1038/nbt1346.
  139. Smith CD, Zimin A, Holt C, Abouheif E, Benton R, Cash E, Croset V, Currie CR, Elhaik E, Elsik CG, et al. Draft genome of the globally widespread and invasive Argentine ant (Linepithema humile). Proc Natl Acad Sci U S A. 2011;108:5673–5678. doi: 10.1073/pnas.1008617108.
  140. Smith CR, Smith CD, Robertson HM, Helmkampf M, Zimin A, Yandell M, Holt C, Hu H, Abouheif E, Benton R, et al. Draft genome of the red harvester ant Pogonomyrmex barbatus. Proc Natl Acad Sci U S A. 2011;108:5667–5672. doi: 10.1073/pnas.1007901108.
  141. Smith CR, Toth AL, Suarez AV, Robinson GE. Genetic and genomic analyses of the division of labour in insect societies. Nat Rev Genet. 2008;9:735–748. doi: 10.1038/nrg2429.
  142. Stamatakis A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 2006;22:2688–2690. doi: 10.1093/bioinformatics/btl446.
  143. Stroschein-Stevenson SL, Foley E, O’Farrell PH, Johnson AD. Identification of Drosophila gene products required for phagocytosis of Candida albicans. PLoS Biol. 2005;4:e4. doi: 10.1371/journal.pbio.0040004.
  144. Studer RA, Penel S, Duret L, Robinson-Rechavi M. Pervasive positive selection on duplicated and nonduplicated vertebrate protein coding genes. Genome Res. 2008;18:1393–1402. doi: 10.1101/gr.076992.108.
  145. Suarez RK. Energy metabolism during insect flight: biochemical design and physiological performance. Physiol Biochem Zool. 2000;73:765–771. doi: 10.1086/318112.
  146. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005;102:15545–15550. doi: 10.1073/pnas.0506580102.
  147. Suen G, Teiling C, Li L, Holt C, Abouheif E, Bornberg-Bauer E, Bouffard P, Caldera EJ, Cash E, Cavanaugh A, et al. The genome sequence of the leaf-cutter ant Atta cephalotes reveals insights into its obligate symbiotic lifestyle. PLoS Genet. 2011;7:e1002007. doi: 10.1371/journal.pgen.1002007.
  148. Swanson WJ, Nielsen R, Yang Q. Pervasive adaptive evolution in mammalian fertilization proteins. Mol Biol Evol. 2003;20:18–20. doi: 10.1093/oxfordjournals.molbev.a004233.
  149. Talavera G, Castresana J. Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst Biol. 2007;56:564–577. doi: 10.1080/10635150701472164.
  150. Tan H-W, Liu G-H, Dong X, Lin R-Q, Song H-Q, Huang S-Y, Yuan Z-G, Zhao G-H, Zhu X-Q. The complete mitochondrial genome of the Asiatic cavity-nesting honeybee Apis cerana (Hymenoptera: Apidae). PLoS One. 2011;6:e23008. doi: 10.1371/journal.pone.0023008.
  151. Tauber E, Eberl DF. Song production in auditory mutants of Drosophila: the role of sensory feedback. J Comp Physiol A. 2001;187:341–348. doi: 10.1007/s003590100206.
  152. Tintle N, Borchers B, Brown M, Bekmetjev A. Comparing gene set analysis methods on single-nucleotide polymorphism data from Genetic Analysis Workshop 16. BMC Proc. 2009;3:S96. doi: 10.1186/1753-6561-3-s7-s96.
  153. Tribolium Genome Sequencing Consortium. The genome of the model beetle and pest Tribolium castaneum. Nature. 2008;452:949–955. doi: 10.1038/nature06784.
  154. Trifunovic A, Hansson A, Wredenberg A, Rovio AT, Dufour E, Khvorostov I, Spelbrink JN, Wibom R, Jacobs HT, Larsson NG. Somatic mtDNA mutations cause aging phenotypes without affecting reactive oxygen species production. Proc Natl Acad Sci U S A. 2005;102:17993–17998. doi: 10.1073/pnas.0508886102.
  155. Trifunovic A, Wredenberg A, Falkenberg M, Spelbrink JN, Rovio AT, Bruder CE, Bohlooly-Y M, Gidlöf S, Oldfors A, Wibom R, et al. Premature ageing in mice expressing defective mitochondrial DNA polymerase. Nature. 2004;429:417–423. doi: 10.1038/nature02517.
  156. Tweedie S, Ashburner M, Falls K, Leyland P, McQuilton P, Marygold S, Millburn G, Osumi-Sutherland D, Schroeder A, Seal R, et al. FlyBase: enhancing Drosophila Gene Ontology annotations. Nucleic Acids Res. 2009;37:D555–D559. doi: 10.1093/nar/gkn788.
  157. Vamathevan J, Hasan S, Emes R, Amrine-Madsen H, Rajagopalan D, Topp SD, Kumar V, Word M, Simmons MD, Foord SM, et al. The role of positive selection in determining the molecular cause of species differences in disease. BMC Evol Biol. 2008;8:273. doi: 10.1186/1471-2148-8-273.
  158. Vilella AJ, Severin J, Ureta-Vidal A, Heng L, Durbin R, Birney E. EnsemblCompara GeneTrees: complete, duplication-aware phylogenetic trees in vertebrates. Genome Res. 2009;19:327–335. doi: 10.1101/gr.073585.107.
  159. Viljakainen L, Evans JD, Hasselmann M, Rueppell O, Tingek S, Pamilo P. Rapid evolution of immune proteins in social insects. Mol Biol Evol. 2009;26:1791–1801. doi: 10.1093/molbev/msp086.
  160. Wallace IM, O’Sullivan O, Higgins DG, Notredame C. M-Coffee: combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Res. 2006;34:1692–1699. doi: 10.1093/nar/gkl091.
  161. Wang T, Montell C. A phosphoinositide synthase required for a sustained light response. J Neurosci. 2006;26:12816–12825. doi: 10.1523/JNEUROSCI.3673-06.2006.
  162. Waterhouse RM, Zdobnov EM, Kriventseva EV. Correlating traits of gene retention, sequence divergence, duplicability and essentiality in vertebrates, arthropods, and fungi. Genome Biol Evol. 2011;3:75–86. doi: 10.1093/gbe/evq083.
  163. Waterhouse RM, Zdobnov EM, Tegenfeldt F, Li J, Kriventseva EV. OrthoDB: the hierarchical catalog of eukaryotic orthologs in 2011. Nucleic Acids Res. 2011;39:D283–D288. doi: 10.1093/nar/gkq930.
  164. Weng MP, Liao BY. DroPhEA: Drosophila phenotype enrichment analysis for insect functional genomics. Bioinformatics. 2011;27:3218–3219. doi: 10.1093/bioinformatics/btr530.
  165. Werren JH. Biology of Wolbachia. Annu Rev Entomol. 1997;42:587–609. doi: 10.1146/annurev.ento.42.1.587.
  166. Werren JH, Richards S, Desjardins CA, Niehuis O, Gadau J, Colbourne JK, The Nasonia Genome Working Group. Functional and evolutionary insights from the genomes of three parasitoid Nasonia species. Science. 2010;327:343–348. doi: 10.1126/science.1178028.
  167. Wong WS, Yang Z, Goldman N, Nielsen R. Accuracy and power of statistical methods for detecting adaptive evolution in protein coding sequences and for identifying positively selected sites. Genetics. 2004;168:1041–1051. doi: 10.1534/genetics.104.031153.
  168. Woodard SH, Fischman BJ, Venkat A, Hudson ME, Varala K, Cameron SA, Clark AG, Robinson GE. Genes involved in convergent evolution of eusociality in bees. Proc Natl Acad Sci U S A. 2011;108:7472–7477. doi: 10.1073/pnas.1103457108.
  169. Wurm Y, Wang J, Riba-Grognuz O, Corona M, Nygaard S, Hunt BG, Ingram KK, Falquet L, Nipitwattanaphon M, Gotzek D, et al. The genome of the fire ant Solenopsis invicta. Proc Natl Acad Sci U S A. 2011;108:5679–5684. doi: 10.1073/pnas.1009690108.
  170. Yang Z. Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol Biol Evol. 1998;15:568–573. doi: 10.1093/oxfordjournals.molbev.a025957.
  171. Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24:1586–1591. doi: 10.1093/molbev/msm088.
  172. Yang Z, dos Reis M. Statistical properties of the branch-site test of positive selection. Mol Biol Evol. 2011;28:1217–1228. doi: 10.1093/molbev/msq303.
  173. Yang Z, Nielsen R, Goldman N, Pedersen A-MK. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics. 2000;155:431–449. doi: 10.1093/genetics/155.1.431.
  174. Yang Z, Wong WSW, Nielsen R. Bayes empirical Bayes inference of amino acid sites under positive selection. Mol Biol Evol. 2005;22:1107–1118. doi: 10.1093/molbev/msi097.
  175. Yui R, Ohno Y, Matsuura ET. Accumulation of deleted mitochondrial DNA in aging Drosophila melanogaster. Genes Genet Syst. 2003;78:245–251. doi: 10.1266/ggs.78.245.
  176. Zhang G, Cowled C, Shi Z, Huang Z, Bishop-Lilly KA, Fang X, Wynne JW, Xiong Z, Baker ML, Zhao W, et al. Comparative analysis of bat genomes provides insight into the evolution of flight and immunity. Science. 2013;339:456–460. doi: 10.1126/science.1230835.
  177. Zhang J. Frequent false detection of positive selection by the likelihood method with branch-site models. Mol Biol Evol. 2004;21:1332–1339. doi: 10.1093/molbev/msh117.
  178. Zhang J, Nielsen R, Yang Z. Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Mol Biol Evol. 2005;22:2472–2479. doi: 10.1093/molbev/msi237.
  179. Zhou X, Slone JD, Rokas A, Berger SL, Liebig J, Ray A, Reinberg D, Zwiebel LJ. Phylogenetic and transcriptomic analysis of chemosensory receptors in a pair of divergent ant species reveals sex-specific signatures of odor coding. PLoS Genet. 2012;8:e1002930. doi: 10.1371/journal.pgen.1002930.

Data Availability Statement

Raw and filtered alignments used in these analyses, track files for the alignment editor Jalview (Clamp et al. 2004), Codeml control files, and result files can be downloaded at http://bioinfo.unil.ch/supdata/Roux_positive_selection_ants/Roux_et_al_datasets.tar.gz (last accessed April 24, 2014).

A simple web interface displaying gene families, GO mapping, Codeml results, and alignments (through a Jalview applet) is available at http://bioinfo.unil.ch/supdata/Roux_positive_selection_ants/families.html (last accessed April 24, 2014). Jalview tracks display the regions used or filtered out in the original protein alignments, as well as the residues found to be under positive selection by Bayes Empirical Bayes (Yang et al. 2005) in all the branches tested for each of the three replicate runs (fig. 2).


Articles from Molecular Biology and Evolution are provided here courtesy of Oxford University Press
