Scientific Reports. 2019 Nov 1;9:15849. doi: 10.1038/s41598-019-52312-4

Tissue-specific expression profiles and positive selection analysis in the tree swallow (Tachycineta bicolor) using a de novo transcriptome assembly

Alexandra B Bentz 1,2, Gregg W C Thomas 1,3, Douglas B Rusch 1,4, Kimberly A Rosvall 1,2
PMCID: PMC6825141  PMID: 31676844

Abstract

Tree swallows (Tachycineta bicolor) are one of the most commonly studied wild birds in North America. They have advanced numerous research areas, including life history, physiology, and organismal responses to global change; however, transcriptomic resources are scarce. To further advance the utility of this system for biologists across disciplines, we generated a transcriptome for the tree swallow using six tissues (brain, blood, ovary, spleen, liver, and muscle) collected from breeding females. We de novo assembled 207,739 transcripts, which we aligned to 14,717 high confidence protein-coding genes. We then characterized each tissue with regard to its unique genes and processes and applied this transcriptome to two fundamental questions in evolutionary biology and endocrinology. First, we analyzed 3,015 single-copy orthologs and identified 46 genes under positive selection in the tree swallow lineage, including those with putative links to adaptations in this species. Second, we analyzed tissue-specific expression patterns of genes involved in sex steroidogenesis and processing. Enzymes capable of synthesizing these behaviorally relevant hormones were largely limited to the ovary, whereas steroid binding genes were found in nearly all other tissues, highlighting the potential for local regulation of sex steroid-mediated traits. These analyses provide new insights into potential sources of phenotypic variation in a free-living female bird and advance our understanding of fundamental questions in evolutionary and organismal biology.

Subject terms: Molecular evolution, Transcriptomics

Introduction

The rapid increase of available transcriptomes for non-model organisms over the past few years has greatly contributed to ecological and evolutionary advances in natural systems1–3. This is particularly true for avian species4, yet one key species used in eco-evolutionary and organismal biology has a surprising absence of molecular resources. Tree swallows (Tachycineta bicolor) are one of the best-studied, free-living native species in North America5. They are found continent-wide and their willingness to breed in artificial nest boxes makes them easy to observe and experimentally manipulate. This has led to decades of data on their breeding ecology and life-history, which has advanced understanding of the effects of anthropogenic global change6, ecotoxicology7, and trade-offs in physiology and behavior8–10. Tree swallows are arguably the “white rat” of free-living, North American birds11, yet only a handful of studies have performed candidate gene analyses12–16. A reference genome was recently assembled for this species17, but a genome lacks the ability to indicate gene function, whereas transcriptomic data can facilitate a better understanding of the molecular underpinnings of phenotypic plasticity. Thus, while tree swallows are an invaluable free-living model organism for exploring phenotypic responses to social and environmental variation, transcriptomic resources are sorely needed. Moreover, the wealth of prior research in this system makes the tree swallow well-suited to address fundamental questions in evolutionary and organismal biology.

A major goal in evolutionary biology is to identify genes that are subject to adaptive evolution (i.e., evolving under positive selection)18. Avian transcriptomes have been assembled for a wide range of species4, which holds promise for new insights into the molecular processes driving adaptation. For example, research in mammals suggests that genes expressed at low levels and those that are tissue-specific may evolve more rapidly19, and work thus far in birds largely mirrors these findings20–22. However, only a few studies have used comparative transcriptomics to investigate evolutionary patterns between bird lineages23–26, and even fewer include multiple independent tissues22. Tree swallows are well-suited for exploring the relationship between protein evolution and tissue specificity, particularly related to tissues involved in reproductive competition and immune function. As aerial insectivores that feed on emergent aquatic insects, they are more susceptible to accumulated contaminants27 and this can negatively impact immune function28 and reproductive success29. Tree swallows are also secondary-cavity nesters with limited nesting sites, which generates steep competition for territories and a substantive population of non-breeding individuals30. Predictably, tree swallows have one of the highest rates of extra-pair paternity of any socially monogamous songbird (50–90% of nests contain extra-pair young)31–33. Thus, this system can facilitate the discovery of protein-coding genes undergoing positive selection, providing insights into lineage-specific adaptations related to competition and immunity.

Transcriptomes can also be used to explore candidate gene pathways to improve understanding of organismal biology34,35. Sex steroids, such as testosterone and estradiol, regulate many physiological and behavioral traits, and there is growing interest in whether or how different sex steroid-mediated traits can evolve in concert or independently36. Specifically in females, selection is hypothesized to favor mechanisms that allow for greater phenotypic independence37, where traits are less dependent on the direct actions of circulating steroids due to tissue-specific variation in steroid production and/or sensitivity. Unfortunately, most prior work in this field of study has only focused on genes that function downstream of androgen production (e.g., androgen and estrogen receptors and aromatase) in a limited set of tissues (e.g., gonad and brain)38. Female tree swallows are especially amenable for addressing these knowledge gaps because they fiercely compete to obtain9 and maintain39,40 limited nesting cavities and this aggression is, in part, mediated by sex steroids41, the consequences of which have important transgenerational effects42,43. Future work providing detailed knowledge of tissue-specific regulation of the full suite of steroid processing genes would greatly improve our understanding of the potential for correlated phenotypic evolution across tissues.

Here, we present a functionally annotated tree swallow transcriptome generated from six tissues (brain, blood, ovary, muscle, spleen, and liver) in two females. To capture genes expressed across the breeding season, one female was collected early in the breeding season during territory establishment and another was collected later during incubation. Because we included multiple tissues, we also identified unique genes and processes to better characterize tissue functions. In addition, we report on two initial applications of this new molecular resource. First, to identify genes that are more rapidly evolving in the tree swallow lineage, we performed a comparative analysis of protein-coding sequences, comparing our transcriptome to 8 other avian species with high-quality sequence data. We further analyze characteristics of positively selected genes thought to impact protein evolution19, including gene expression levels and tissue specificity, with a focus on the tissue-specific functions identified here. We relate these findings back to tree swallow life history for a greater understanding of molecular evolution in this exceptionally well-studied wild bird. Second, we leveraged analyses of tissue-specific gene expression profiles to ask a fundamental question about organismal regulation of sex steroid-mediated phenotypes. Collectively, these analyses shed light on important questions in evolutionary biology and endocrinology, showcasing this transcriptome as a molecular tool to expand understanding of evolutionary and organismal biology.

Results

Assembly evaluation

Sequencing was performed with both NextSeq and MiSeq platforms to generate both short and long reads. The average number of paired-end reads per sample was ~136 million (Table S1). Trimmed reads from all six tissues were assembled into 207,739 transcripts, totaling 278,915,174 nucleotides with a mean length of 1,343 base pairs (bp) and an N50 of 1,889 bp (see Table 1 for a summary of assessment metrics). Sequence length distribution of these transcripts and other assembly metrics were comparable to other avian species (Supplementary Fig. S1, Table S2). Ultimately, there were 144,119 (69.4%) transcripts that aligned to known proteins in the NCBI database using BLAST (e < 1e-10). We applied a stringent filter to remove any transcripts that did not display high confidence (with at least 50% coverage of the full-length protein and at least 70% identity), which yielded 22,825 transcripts (Table 1). Putatively unspliced introns and largely redundant transcripts were also removed, resulting in a set of high confidence protein-coding segments (n = 14,717). Of these, 9,344 (63.5%) had alignments covering ≥ 90% of the full-length protein. Some of the annotated transcripts were identified as ‘uncharacterized’ proteins (467; 3.2%); however, we included these in subsequent analyses as some could represent genes unique to birds with unknown functions. Over 50% of annotated transcripts aligned best to Parus major or Sturnus vulgaris (Supplementary Fig. S2), and the number of protein-coding genes we identified (14,717) is comparable to the 15,183 in the Parus major assembly (Parus_major1.0.3).
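For readers less familiar with the N50 statistic reported above, the following minimal sketch shows how it is typically computed. It assumes the standard definition (the length L such that contigs of length ≥ L together contain at least half of all assembled bases); the function name `n50` and the toy contig lengths are illustrative and are not taken from the study.

```python
def n50(lengths):
    """Return N50: the largest length L such that contigs of length >= L
    together contain at least half of all assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical contig lengths (bp), for illustration only.
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))  # 500 + 400 = 900 >= 750, so N50 is 400
```

Because long contigs dominate the cumulative sum, N50 (1,889 bp here) is usually larger than the mean (1,343 bp) or median (809 bp) contig length, as in Table 1.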

Table 1.

Summary statistics for the de novo assembly.

Category                                            Value
Total transcripts                                   207,739
GC (%)                                              45.87
All transcript contigs
  N50 transcript length (bp)                        1,889
  Median contig length (bp)                         809
  Mean contig length (bp)                           1,343
  Total assembled bases                             278,915,174
Annotation
  Transcripts with BLAST hits                       144,119 (69.4%)
  Transcripts with ≥70% identity and ≥50% coverage  22,825 (11.0%)
  Genes with unique BLAST match                     14,717 (7.1%)

We evaluated transcriptome assembly quality and accuracy using TransRate44 (including backmapping), and completeness using BUSCO (Benchmarking Universal Single-Copy Orthologs)45. The final transcriptome had a TransRate score of 0.38 (optimized score of 0.51) and, of the total number of assembled reads, 82% successfully mapped back to the transcriptome. Assessment of completeness using the BUSCO45 database of 4,915 orthologs shared among all metazoans suggested that our assembly is 91.7% complete (3,626 complete single-copy BUSCOs and 879 complete duplicated BUSCOs), with 4.8% of contigs fragmented (237 BUSCOs) and 3.5% missing (173 BUSCOs). We assigned Gene Ontology (GO) terms to 9,563 (65.0%) transcripts, and categories were well distributed among biological processes, molecular functions, and cellular components (Fig. 1). Thus, this first draft transcriptome assembly is a good foundation for future research.
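The BUSCO percentages above follow directly from the reported counts. The short check below recomputes them; the variable names and the helper `pct` are illustrative, and the only inputs are the counts stated in the text.

```python
# BUSCO counts as reported in the text.
total_buscos = 4915
complete_single = 3626
complete_duplicated = 879
fragmented = 237
missing = 173


def pct(count, total=total_buscos):
    """Percentage of the BUSCO set, rounded to one decimal place."""
    return round(100 * count / total, 1)


complete = complete_single + complete_duplicated

# The three categories should account for every ortholog searched.
assert complete + fragmented + missing == total_buscos

print(pct(complete), pct(fragmented), pct(missing))  # 91.7 4.8 3.5
```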

Figure 1.

Functional classification of the de novo assembly of the tree swallow transcriptome. All tissues are combined and the three main Gene Ontology categories are depicted (biological process, molecular function, and cellular component).

Tissue expression profiles

An UpSet plot was constructed to compare presence/absence of genes across tissues (Fig. 2). Most of the 14,717 putative genes were expressed in the ovary (n = 13,479 genes; 91.6%), followed by brain (n = 13,332; 90.6%), spleen (n = 12,419; 84.4%), liver (n = 9,868; 67.1%), muscle (n = 8,545; 58.1%), and blood (n = 7,789; 52.9%). Libraries were constructed with equal amounts of cDNA, suggesting these differences likely reflect biological variation. We identified genes only present in one tissue to gain insight into tissue-specific functions and provide context for downstream analyses on tissue specificity and positive selection. Brain and then ovary had the most unique genes expressed, while blood had the least (Fig. 2). The most abundant, unique gene in each tissue was myelin proteolipid protein (PLP1) in brain, zona pellucida sperm-binding protein (ZP2) in ovary, C-X-C motif chemokine 2-like (CXCL2) in spleen, complement component C6 (C6) in liver, troponin C, skeletal muscle (TNNC2) in muscle, and translocator protein 2 (TSPO2) in blood. The top 10 most abundant, unique genes for each tissue are listed in Supplementary Table S3. GO overrepresentation analyses of all uniquely expressed genes for each tissue indicated that immune response (GO:0006955) was the most significant biological process in spleen, musculoskeletal movement (GO:0050881) in muscle, gene silencing by RNA (GO:0031047) in ovary, and developmental induction (GO:0031128) in brain; for a full list see Supplementary Tables S4-7. Uniquely expressed genes in liver and blood did not have overrepresented GO terms.

Figure 2.

UpSet plot depicting the number of unique and shared transcripts with protein hits found in each tissue. Intersection size is the number of transcripts with TPM ≥ 1 in designated sets or groups.
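The presence/absence comparison summarized in Fig. 2 can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes a gene counts as present in a tissue when TPM ≥ 1 (the cutoff given in the caption) and that intersections are exclusive, as UpSet plots conventionally draw them. The gene names, tissue subset, and TPM values are hypothetical.

```python
TPM_THRESHOLD = 1.0  # presence cutoff stated in the Fig. 2 caption

# Hypothetical toy expression table: gene -> {tissue: TPM}.
toy_tpm = {
    "geneA": {"brain": 50.0, "ovary": 0.2, "spleen": 0.0},
    "geneB": {"brain": 3.0, "ovary": 12.0, "spleen": 0.5},
    "geneC": {"brain": 0.0, "ovary": 0.9, "spleen": 7.0},
    "geneD": {"brain": 2.0, "ovary": 4.0, "spleen": 9.0},
}


def expressed_in(tpm_by_tissue, threshold=TPM_THRESHOLD):
    """Return the set of tissues in which a gene meets the TPM cutoff."""
    return frozenset(t for t, v in tpm_by_tissue.items() if v >= threshold)


def intersections(tpm_table, threshold=TPM_THRESHOLD):
    """Group genes by the exact set of tissues in which they are present
    (the exclusive intersections drawn in an UpSet plot)."""
    groups = {}
    for gene, values in tpm_table.items():
        key = expressed_in(values, threshold)
        groups.setdefault(key, []).append(gene)
    return groups


groups = intersections(toy_tpm)
unique_to_brain = groups.get(frozenset({"brain"}), [])
brain_and_ovary_only = groups.get(frozenset({"brain", "ovary"}), [])
print(unique_to_brain, brain_and_ovary_only)
```

In this toy table, geneA is unique to brain, geneB is shared by brain and ovary only, and geneD falls in the all-tissue intersection.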

The greatest overlap in presence/absence of genes occurred between brain and ovary (n = 854 genes; Fig. 2), and an overrepresentation analysis indicated that these genes are involved in the regulation of hormone secretion (GO:00468; FDR < 0.001) and aggressive behavior (GO:0002118; FDR = 0.03). Spleen overlapped in expression most with ovary (n = 213 genes), liver with spleen (n = 50), blood with spleen (n = 42), and muscle with ovary (n = 40).

Expression levels (reported as transcripts-per-million, TPM) were highly variable among tissues and genes (median = 105.0 TPM; mean = 404.8 TPM ± 5348.9 SD). The summed expression level (log TPM) for a given gene across tissues was negatively correlated with the index of tissue specificity (τ) (Spearman ρ = −0.40, p < 0.0001; Fig. 3), suggesting that highly expressed genes are less tissue specific. However, tissue specificity also varied based on which tissue maximally expressed the gene (ANOVA, F5,14617 = 204.1, p < 0.0001). The lowest τ was found in genes that had their highest expression in blood, while the highest τ was found in genes with maximal expression in brain and liver (Table 2).
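To make the tissue specificity index concrete, the sketch below computes τ for a single gene under a commonly used definition, τ = Σ(1 − x_i / x_max) / (n − 1), where x_i is the expression in tissue i and n is the number of tissues. This definition is an assumption here: the exact formula and any transformation of TPM values used in the study are given in its Methods and are not reproduced in this section. The function name `tau` and the example profiles are hypothetical.

```python
def tau(expression):
    """Tissue specificity index for one gene.

    0 means identical expression in every tissue (broadly expressed);
    1 means expression confined to a single tissue (highly specific).
    """
    values = [max(float(v), 0.0) for v in expression]
    n = len(values)
    if n < 2:
        raise ValueError("need expression values from at least two tissues")
    peak = max(values)
    if peak == 0:
        raise ValueError("gene is not expressed in any tissue")
    return sum(1 - v / peak for v in values) / (n - 1)


# Hypothetical six-tissue profiles (TPM), for illustration only.
print(tau([10, 10, 10, 10, 10, 10]))  # 0.0: equally expressed everywhere
print(tau([0, 0, 0, 0, 0, 8]))        # 1.0: expressed in one tissue only
```

Under this definition, the thresholds used in Fig. 3 (τ < 0.3 broadly expressed, τ > 0.8 highly specific) correspond to profiles close to the two extremes shown.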

Figure 3.

Correlation of summed expression levels (log TPM) across all tissues and index of tissue specificity (τ). Genes under positive selection are depicted as points (color and shape indicate the tissue in which each gene was maximally expressed). All other genes are shown as 2D kernel density estimates. Dashed lines create quadrats that denote expression values on the x axis (low, TPM < 10; medium, TPM = 10–50; high, TPM > 50) and τ on the y axis (broadly expressed, τ < 0.3; moderate, τ = 0.3–0.8; highly specific, τ > 0.8).

Table 2.

Mean index of tissue specificity of expression (τ) for genes with maximal expression in each of the six tissues, together with 95% confidence intervals (CI). Nmax represents the number of genes with maximal expression in each of the tissues.

Tissue   Nmax   τavg    95% CI (τ)
Liver    847    0.522   0.510–0.534
Brain    3977   0.520   0.515–0.526
Muscle   413    0.459   0.438–0.480
Ovary    5507   0.445   0.441–0.449
Spleen   3127   0.417   0.412–0.423
Blood    752    0.388   0.377–0.398

Positive selection analysis

To identify candidate genes under positive selection, we compared protein-coding sequences in the tree swallow with 8 other avian species. The branch-site test in PAML46,47 identified 46 genes with significant evidence of positive selection in the lineage leading to tree swallows using two different codon alignment methods (see Methods). Although there were no significantly overrepresented GO terms for this set, several genes were associated with processes that are relevant in the context of tree swallow life history or ecology, including feeding behavior (RMI1, CNRIP1), lipid metabolic processes (DECR2, LIPT1), reproduction (ALKBH5, PTTG1, EIF2B4, DRC7, CCDC40), immune system processes (IL12B, TRIM25, USP14, CPPED1, ICOS, TSPAN2), muscle system process (GTF2IRD1), and visual perception (FAM161A). The complete list of positively selected genes is in Supplementary Table S8.
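A branch-site test of this kind is usually evaluated as a likelihood-ratio test between a null model (no positive selection on the foreground branch) and an alternative model that allows it. The sketch below shows that final step only, assuming the common practice of comparing 2ΔlnL against a χ² distribution with one degree of freedom. The log-likelihood values are hypothetical, the function name is illustrative, and the authors' actual settings and any multiple-testing correction are described in their Methods rather than here.

```python
import math


def branch_site_lrt(lnl_null, lnl_alt):
    """Likelihood-ratio test for one gene.

    Returns the statistic 2 * (lnL_alt - lnL_null) and its p-value under a
    chi-square distribution with 1 degree of freedom. For 1 df the survival
    function has the closed form erfc(sqrt(x / 2)).
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods from the two model fits of a single gene.
stat, p_value = branch_site_lrt(lnl_null=-5234.8, lnl_alt=-5230.1)
print(f"2*dlnL = {stat:.2f}, p = {p_value:.4f}")
```

Only standard-library functions are used so that the example runs without a statistics package; in practice a dedicated χ² routine would serve equally well.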

We next explored whether accelerated molecular evolution of these proteins was associated with expression levels or tissue specificity. Similar to the overall patterns reported above, the summed expression level (log TPM) for positively selected genes was also negatively correlated with τ (ρ = −0.37, p = 0.01; Fig. 3). Positively selected genes tended to be expressed at lower levels across tissues (260.08 TPM, 95% CI = 126.33–393.83) compared with non-positively selected genes (405.23, 95% CI = 318.26–492.20; p = 0.08, by bootstrapping), but they did not differ in their tissue specificity (positively selected genes: τ = 0.434, 95% CI = 0.382–0.486 vs. non-positively selected genes: τ = 0.460, 95% CI = 0.457–0.463; p = 0.33). Two positively selected genes were highly tissue specific in their expression, cannabinoid receptor interacting protein 1 (CNRIP1) and inducible T-cell costimulator (ICOS), which had maximal expression in the brain and spleen, respectively (Fig. 3). Expression levels of positively selected genes did vary significantly across tissues (ANOVA, F5 = 3.27, p < 0.01), with ovary expressing significantly higher levels of these genes (average = 84.35 TPM, 95% CI = 44.82) than muscle (24.53 TPM, 95% CI = 20.87) or liver (27.50 TPM, 95% CI = 20.69), which expressed the lowest levels (Tukey test, p = 0.02).
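The text states that the comparison of positively selected and other genes was made "by bootstrapping" but does not detail the procedure in this section. One plausible form, shown below purely as an assumption, resamples gene sets of the same size as the focal set from the background and asks how often their mean is as low as the observed mean. The function name, the one-sided direction, the number of resamples, and the toy values are all hypothetical.

```python
import random
from statistics import mean


def bootstrap_mean_p(focal, background, n_boot=10_000, seed=0):
    """One-sided resampling test.

    Draws n_boot samples (with replacement) of len(focal) values from the
    background and returns the observed focal mean together with the
    fraction of resampled means that are <= that observed mean.
    """
    rng = random.Random(seed)
    observed = mean(focal)
    k = len(focal)
    hits = 0
    for _ in range(n_boot):
        sample = [rng.choice(background) for _ in range(k)]
        if mean(sample) <= observed:
            hits += 1
    return observed, hits / n_boot


# Hypothetical expression values (TPM), for illustration only.
toy_background = list(range(1, 1001))
toy_focal = [5, 10, 15, 20]
observed, p = bootstrap_mean_p(toy_focal, toy_background)
print(observed, p)
```

A small returned fraction indicates that a focal mean this low is rarely obtained by chance from the background set.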

Tissue-specific capacity for steroid synthesis and sensitivity

We further explored tissue-specific expression patterns of steroidogenic genes because of the emerging endocrine research on the role of different tissues in mediating sex steroid-related phenotypes. Our transcriptome included the major enzymes involved in sex steroid synthesis (StAR, p450scc, CYP17, 3βHSD1; see 17βHSD analysis below), although most were limited in their expression to the ovary and occasionally spleen (all τ > 0.63), with CYP17 being the most ovary specific (τ = 0.84) followed by 3βHSD1 (τ = 0.78). P450scc expression was generally low (max = 5.2 TPM), and MiSeq detected very low 3βHSD1 expression in brain and liver (2.0 and 1.7 TPM, respectively), suggesting only limited de novo steroidogenic potential outside of the ovary and spleen. Enzymes involved in later sex steroid conversion were primarily expressed in the brain and ovary, with SRD5A2 and AROM being predominantly ovary specific (τ = 0.81 and 0.79, respectively) and SRD5A1 less so (τ = 0.55). Sex steroid binding capabilities, on the other hand, were more ubiquitous; most sex steroid receptors examined (e.g. AR and ESR1) were expressed across most tissues (excluding blood), with GPER1 being more specific to the brain (all τ < 0.66). Thus, tissue specificity appeared to decrease along the steroidogenic pathway, from steroid synthesis to receptor binding (Fig. 4).

Figure 4.

Steroidogenic capacity and sensitivity by tissue. The y-axis values are held constant throughout but are arbitrary as only relative expression levels are of importance. 17βHSD data were excluded because of uncertainty around isoform function (see text). Abbreviations: StAR, steroidogenic acute regulatory protein; P450scc, cytochrome P450 side-chain cleavage; CYP17, cytochrome P450 17α-hydroxylase/17,20-lyase; DHEA, dehydroepiandrosterone; 3βHSD1, 3β-hydroxysteroid dehydrogenase/isomerase; A4, androstenedione; 17βHSD, 17β-hydroxysteroid dehydrogenase; T, testosterone; SRD5A1, steroid 5 alpha-reductase 1; SRD5A2, steroid 5 alpha-reductase 2; AROM, cytochrome P450 aromatase; DHT, dihydrotestosterone; E2, 17β-estradiol; AR, androgen receptor; ESR1, estrogen receptor alpha; ESR2, estrogen receptor beta; GPER1, G protein-coupled estrogen receptor 1.

We excluded 17βHSD from the above analysis because it has several isoforms that both synthesize and metabolize sex steroids, and there is some suggestion that avian isoforms may function differently than the more well-characterized mammalian isoforms48,49. To ameliorate this uncertainty regarding pathways of sex steroid synthesis in birds, we identified 7 isoforms of 17βHSD and clustered them into two groups based on tissue-level expression patterns: cluster 1 was more ovary-specific and included 17βHSD1, 17βHSD2, and 17βHSD8; and cluster 2 had more general expression and included 17βHSD4, 17βHSD7, 17βHSD12, and 17βHSD13 (Supplementary Fig. S3). The most tissue-specific isoforms (17βHSD1, τ = 0.74, and 17βHSD2, τ = 0.79) were expressed most highly in the ovary, the primary site of sex steroidogenesis, while all other isoforms ranged from τ = 0.34–0.58.

Discussion

Transcriptomic data from multiple tissues can shed light on important questions in evolutionary and organismal biology. Tree swallows are one of the most studied free-living birds across many disciplines within ecology, evolution, and behavior, and the de novo transcriptome assembly here provides a key molecular resource for further advancing these fields. Using this new resource, we first identified several dozen genes showing signs of positive selection along the tree swallow lineage, many of which are related to metabolic, reproductive, and immune processes. We also identified tissue-specific functions, while highlighting the potential for various tissues to locally produce and bind sex steroids. Below, we discuss the implications of these findings and how they inform our understanding of molecular and organismal evolution.

Validations suggest this transcriptome covers a wide range of protein-coding sequences, including 91.7% of BUSCOs thought to be found in all metazoans. This compares quite favorably to other de novo transcriptomes in passerines using BUSCO to evaluate completeness against the vertebrate gene set (30–62% complete)50. Additionally, our TransRate score of 0.38 is higher than that of more than 50% of the transcriptomes deposited in the NCBI Transcriptome Shotgun Assembly database44. Finally, our backmapping rate of 82% is also comparable to other de novo transcriptomes in birds, including European starlings (Sturnus vulgaris, 82%)50 and rock doves (Columba livia, 70–80%)35. In addition, the 14,717 annotated transcripts closely match the 15,183 protein-coding genes identified in the Parus major assembly (Parus_major1.0.3), which shared the highest degree of sequence similarity to our data. This number is also similar to other de novo transcriptomes in passerines (ranges from 7,135–17,898; Table S2) and well-established reference transcriptomes in songbird species, like the zebra finch (Taeniopygia guttata; taeGut3.2.4) and collared flycatcher (Ficedula albicollis; FicAlb_1.4), which contain 17,488 and 15,303 annotated protein-coding genes, respectively. Subsequent studies would ideally include samples from males, non-breeding stages, and juveniles; however, data thus far suggest that this transcriptome from 2 females at 2 breeding stages (6 tissues each) has good coverage.

Tissue-specific gene expression provides one mechanism by which the same genome can generate differentiated phenotypes among tissues. In our dataset, uniquely expressed genes were largely considered typical for each tissue. For example, muscle-specific genes were associated with muscle contraction and spleen-specific genes with immune response. Two tissues, blood and liver, had few unique genes and no overrepresented GO terms, indicating they may perform more diverse functions. However, genes with maximum expression in the liver had relatively high tissue-specificity, and unique genes in the liver were related to processes like complement system, lipoprotein production (presumably for eggs), and metabolism (e.g., glucuronidation). The brain had the most uniquely expressed genes, which were associated with synaptic transmission, neuropeptide signaling, and other neural processes. Furthermore, genes maximally expressed in the brain were relatively tissue-specific, suggesting the brain is enriched with differentiated cell types and/or unique functionality. This finding is supported by studies in mammals indicating the brain may have more fine-tuned expression networks than other tissues51–53. The ovary expressed the second highest number of unique genes and showed the most significant overrepresentation for terms pertaining to gene silencing, likely having to do with germ cell development. We also explored the shared gene expression patterns between ovary and brain, which had the greatest gene overlap. While shared expression profiles do not necessarily imply similar function54, these tissues shared genes associated with processes like hormone secretion and aggression, consistent with the well-established coordination of these tissues via the hypothalamic-pituitary-gonadal (HPG) axis, which mediates many reproductive and aggressive behaviors.

In an initial application of this transcriptome, we identified 46 putative positively selected genes in the lineage leading to tree swallows, compared to 8 other avian species. While we did not find any significantly overrepresented processes among these apparently rapidly evolving genes, GO terms associated with these genes represent potentially important and well-studied adaptations in this system. For example, tree swallows are the only aerial insectivore in our analysis (data for other swallows and swifts are currently being generated55–57). Aerial insectivores can spend 80% or more of the day in flight5, gathering prey during acrobatic flight, potentially requiring enhanced muscle functioning and visual perception, but they must also contend with periods of food shortages, which they do by adjusting growth to environmental conditions58–61. This metabolic flexibility could relate to positively selected genes involved in feeding behavior and lipid metabolism, such as RMI1, which regulates feeding behavior and energy homeostasis62, and CNRIP1, a highly brain-specific gene that modulates appetite through cannabinoid receptor 1 activity63. Genes related to fat metabolism generally appear to be more susceptible to rapid evolution along the passerine lineage23. We also found several genes associated with immune function that are potentially under positive selection, such as ICOS, a highly spleen-specific gene, associated with enhanced T-cell responses. T-cell functioning is important for tree swallows because their diet of emergent aquatic insects makes them more susceptible to the bioaccumulation of toxins that can reduce immune responsiveness28. Our selection analysis also identified genes associated with sperm performance (CCDC40 and DRC7)64 that could be related to the high rates of extra-pair paternity in tree swallows31,33, which is known to create strong selection pressures associated with sperm competition65–67. Therefore, many of the candidate genes under positive selection can be explained by well-studied adaptations specific to tree swallow life history and ecology.

Several factors interact to influence adaptive molecular evolution, including protein function, expression level, tissue specificity, and more19. Of these factors, the effects of expression level and tissue specificity are particularly well suited to transcriptomic analyses. For instance, more rapidly evolving genes tend to have lower expression, potentially due to selection against protein misfolding19. Our finding that genes under positive selection tended to be expressed at a lower level compared to the rest of the transcriptome supports the idea that fast-evolving genes are characterized by lower expression levels20,22. Furthermore, we found that expression levels were negatively correlated with tissue specificity, suggesting genes with more specific functions have lower expression levels20,21,34. Higher tissue specificity may act to release genes from evolutionary constraints by allowing for greater compartmentalization and avoiding potentially competing roles19. Specifically, genes expressed primarily in immune and reproductive tissues have been shown to be under strong selection20,21,68,69. While we did not find that positively selected genes were significantly more tissue specific, our analyses highlighted many positively selected genes related to reproduction (ALKBH5, PTTG1, EIF2B4, DRC7, CCDC40) and immune system processes (IL12B, TRIM25, USP14, CPPED1, ICOS, TSPAN2). Positively selected genes were also most abundant in the ovary, and two highly tissue-specific positively selected genes were expressed in the brain and spleen. Thus, genes in these tissues may be more rapidly evolving in tree swallows, which is consistent with their high rate of extra-pair paternity31,33 and propensity to accumulate environmental contaminants27. However, these conclusions should be regarded conservatively, because there are likely other genes under positive selection that we did not detect due to the species we included. For example, the cavity-nesting species (Ficedula albicollis, Parus major, Cyanistes caeruleus, and Sturnus vulgaris) shared a high degree of protein similarity with the tree swallow, likely preventing us from finding unique protein changes associated with adaptations required for cavity nesting (e.g., heightened territorial aggression). As more high-quality transcriptomes are developed for non-model organisms, we can begin to more accurately address these evolutionary questions.

In a second application of our transcriptome, we analyzed tissue-level variability along the pathway of sex steroid hormone production and binding as a window into the organismal potential for independent regulation of suites of sex steroid-mediated traits. The degree to which tissues can independently regulate this has become a hot topic in evolutionary endocrinology, due to the potential to decouple hormones from their organism-wide pleiotropic effects and elicit more specific, potentially adaptive suites of traits37,70,71. Recent work suggests that this kind of hormonal and behavioral plasticity may be particularly important for females, potentially allowing regulation of sex steroids, like testosterone, in a tissue-specific manner to avoid the systemic costs of elevated testosterone15,41. We found that steroid receptors were expressed almost ubiquitously across tissues (excluding blood), suggesting that most tissues studied here have the potential to locally regulate steroid sensitivity. This finding is not surprising given the dynamic functions steroids perform72, and past studies showing expression of these genes within the tissues included here35,73–75. Fewer studies, however, have measured expression of steroidogenic enzymes, particularly in peripheral tissues15,74–76. We found that these enzymes were primarily ovary-specific in gene expression, but genes became less tissue specific in their expression further along the steroidogenic pathway, from de novo synthesis to later metabolism. The spleen was the only other tissue to express a sex steroid-synthesizing enzyme; however, it was 3βHSD1, which can be co-opted to synthesize corticosterone or other immunosteroids known to respond to metabolic and immune stressors74,77. Strikingly, steroid-synthesizing enzymes early in the pathway were largely absent from brain expression profiles. This is surprising considering that both male and female birds typically express enzymes capable of de novo steroid synthesis in the brain48,78–80. One possibility is that we failed to detect genes expressed at very low levels, and had we measured protein levels, which tend to be considerably higher than mRNA expression levels79, we would have found these genes in more tissues. Another possibility is that neurosteroid synthesis varies seasonally in its expression and our lack of non-breeding females precluded us from finding these genes in the brain. Enzymes capable of neurosteroid synthesis may have higher neural activity during the non-breeding season80, which could be indicative of a seasonal switch in how sex steroids are regulated. Future work with more biological replication is needed to fully tease apart these alternatives, though our findings clearly show marked sex steroid processing and binding abilities across many different tissues.

Components of the steroid pathway are still unresolved in birds, specifically for 17βHSD, which has multiple isoforms with unknown expression and/or function. Thus, as a final application of this transcriptome, we used patterns of tissue-specific gene expression to improve knowledge of putative 17βHSD functionality. We were unable to identify all known mammalian isoforms, an issue also addressed in the zebra finch genome assembly48. Of the 14 known 17βHSD isoforms81, we found 7 in our transcriptome. The two isoforms responsible for testosterone synthesis in most mammals (17βHSD3 and 17βHSD5)49 were not present in the current transcriptome. Within the zebra finch genome assembly, 17βHSD5 was also not found and 17βHSD3 was localized to the Z chromosome48. These findings could stem from low sequence homology, or birds may not have or express all known steroidogenic enzymes, or they may synthesize testosterone with different isoforms than mammals. In general, the role of 17βHSD enzymes is controversial because they can be species-specific and act on a large set of substrates (e.g., steroids, fatty acids, and retinols), so their function is partly defined by tissue expression81. We found three isoforms that were primarily expressed in the ovary and had putative roles in steroid processes, including estradiol synthesis (17βHSD1) and steroid inactivation (17βHSD2 and 17βHSD8) in mammals49. Of these, 17βHSD1 is the most likely candidate to perform testosterone synthesis in female tree swallows. It was largely ovary-specific (τ = 0.74) and has demonstrated the ability to synthesize both estrogen and testosterone in other species (e.g., mice82). Regardless, this highlights the need for future studies on 17βHSD in birds, particularly 17βHSD1, which our data implicate as a key steroidogenic isoform.

Methods

Sample collection

Tissues were collected from adult female tree swallows during territory establishment early in the breeding season (n = 1 female) and during incubation (n = 1). Collections occurred in Monroe and Brown Counties, Indiana (39°9′ N, 86°31′ W) in April and May 2016 between 0900 and 1200. Females were euthanized with an overdose of isoflurane, followed by decapitation, and tissues were immediately collected. Gonad, liver, spleen, pectoral muscle, trunk blood, and brain were frozen on powdered dry ice and transferred to −80 °C in the lab. The female collected during territory establishment was actively engaging in aggressive interactions at an empty nest box (KAR, pers. obs.) and had recrudesced ovaries with small white follicles. The female collected in May was incubating a clutch of 5 eggs completed 3 days earlier; she had mostly white ovarian follicles with approximately 4 small yellow follicles. While the use of only two adult females does limit our ability to detect some genes (e.g., genes only expressed in juveniles or males), it helps reduce concerns about integrating allelic variation, thereby improving the assembly. This study was approved by the Bloomington Institutional Animal Care and Use Committee under protocol #15-004 and all methods were performed in accordance with the relevant guidelines and regulations.

RNA extraction, library preparation, and sequencing

Total RNA was extracted from each sample separately using the phenol-chloroform-based Trizol method, following the manufacturer’s instructions (Invitrogen, Carlsbad, CA). Total RNA was resuspended in water, and the quality (RIN > 8.0) and quantity of RNA were analyzed with an Agilent 2200 TapeStation (Agilent Technologies, Santa Clara, CA). Total RNA was prepared into equimolar pools for each tissue and submitted to Indiana University’s Center for Genomics and Bioinformatics for cDNA library construction using a TruSeq Stranded mRNA LT Sample Prep Kit (Illumina) following the manufacturer’s standard protocol. When preparing larger fragments, the fragmentation step was reduced from 8 min to 10 s at 94 °C. Sequencing was performed using an Illumina NextSeq 500/550 Kit v2 with a 150-cycle sequencing module generating 81 bp paired-end reads. After the sequencing run, demultiplexing was performed with bcl2fastq v2.20.0.422. We additionally performed sequencing using a MiSeq Kit v3 with a 600-cycle sequencing module generating 305 bp paired-end reads. These longer reads were used to improve confidence in our assembly.

Assembly and annotation

Trimmomatic (version 0.36)83 was used to trim reads and to remove adapter sequences and low-quality reads. The transcriptomes were assembled using Trinity (version 2.6.5) and SPAdes (version 3.11.1) with a minimum contig length of 100. The SPAdes assemblies were performed with k-mer sizes of 35, 55, and 71. The different assemblies were then compared and merged using the EvidentialGene pipeline (http://arthropods.eugenes.org/EvidentialGene/trassembly.html). The final merged assembly consisted of 207,739 transcripts. These transcripts were then searched against several reference genomes, including Homo sapiens, Gallus gallus, Parus major, Serinus canaria, Sturnus vulgaris, Ficedula albicollis, Zonotrichia albicollis, Taeniopygia guttata, Geospiza fortis, and Chaetura pelagica, to identify potential homologous proteins using NCBI BLAST (version 2.2.26), resulting in 144,119 transcripts with significant BLAST hits (E-value < 1e-10). The peptide alignment information was used to identify discrete protein-coding segments with at least 70% identity that also covered at least 50% of the best matching full-length protein (n = 22,825 transcripts). Putatively unspliced introns were removed from the putative protein-coding segments. Finally, largely redundant transcripts were removed using cd-hit-est (version 4.6.8)84,85. This resulted in a set of high confidence protein-coding segments (n = 14,717) that could be used both for phylogenetic comparisons and for abundance measurements. Reads from each tissue (NextSeq and MiSeq) were mapped against the protein-coding portions of the transcriptome using bwa mem (version 0.7.17). We assessed assembly quality and accuracy using TransRate44 (including backmapping rate), and we assessed assembly completeness using BUSCO45 (lineage dataset = aves_odb9; 40 species; 4915 conserved genes).
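The homology filter described above (E-value below 1e-10, at least 70% identity, and coverage of at least 50% of the best-matching protein) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the `Hit` record, its field names, and the example values are hypothetical, and coverage is assumed to be aligned length divided by the length of the full reference protein.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One best BLAST hit for a transcript (hypothetical record layout)."""
    transcript_id: str
    evalue: float
    percent_identity: float   # 0-100
    aligned_length: int       # residues of the reference protein covered
    subject_length: int       # full length of the reference protein


def is_high_confidence(hit, max_evalue=1e-10, min_identity=70.0, min_coverage=0.5):
    """Apply the three thresholds stated in the text to a single hit."""
    coverage = hit.aligned_length / hit.subject_length  # assumed definition
    return (
        hit.evalue < max_evalue
        and hit.percent_identity >= min_identity
        and coverage >= min_coverage
    )


# Made-up example hits, for illustration only.
hits = [
    Hit("t1", 1e-50, 92.0, 300, 400),   # passes all three thresholds
    Hit("t2", 1e-50, 65.0, 300, 400),   # identity too low
    Hit("t3", 1e-50, 92.0, 100, 400),   # covers only 25% of the protein
    Hit("t4", 1e-5, 92.0, 300, 400),    # E-value not significant
]
kept = [h.transcript_id for h in hits if is_high_confidence(h)]
print(kept)
```

Intron removal and redundancy reduction with cd-hit-est are separate downstream steps and are not represented in this sketch.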

Transcripts were converted to their associated gene name using bioDBnet (https://biodbnet-abcc.ncifcrf.gov/db/db2db.php) and functional annotations were retrieved from the GO database (version 1.2). Transcripts with TPM values of less than 1 were considered absent from a tissue (n = 14,623 transcripts had a TPM ≥ 1 in at least one tissue) and, following Uhlen et al.86, we further characterized gene expression levels as low (TPM < 10), medium (10 < TPM < 50), or high (TPM > 50).
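As a small illustration of the expression categories above, the sketch below assigns a TPM value to a class. The function name is hypothetical, and because the text does not say how values of exactly 10 or 50 were handled, this sketch assumes they fall in the medium class.

```python
def expression_class(tpm):
    """Bin a TPM value using the thresholds stated in the text.

    Assumption: boundary values of exactly 10 and 50 are treated as medium,
    since the source leaves them unspecified.
    """
    if tpm < 1:
        return "absent"
    if tpm < 10:
        return "low"
    if tpm <= 50:
        return "medium"
    return "high"


# Made-up TPM values for illustration.
for value in (0.4, 3.0, 25.0, 180.0):
    print(value, expression_class(value))
```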

Index of tissue specificity

We calculated an index of tissue specificity of gene expression (τ)54, using the methods presented in Mank et al.87. In order to reduce the effect of sampling stochasticity from genes with low expression, TPM was set to 2 to account for tissues with no detected expression. The range of τ for a gene is between 0 and 1; highly tissue-specific transcripts have values close to one (τ > 0.8) and widely expressed transcripts (e.g., housekeeping genes) have lower values (τ < 0.3)54. We explored the accuracy of these τ cut-offs by comparing the distribution of the most abundant and unique gene in each tissue (Supplementary Table S3) with well-known housekeeping genes, including SDHA, UBC, GAPDH, RPL4, HMBS, and ACTB, many of which are commonly used in birds88,89. The distributions were significantly different (Kolmogorov-Smirnov test: p = 0.002) as the unique genes ranged from τ = 0.80–0.89 and the housekeeping genes ranged from τ = 0.15–0.30, supporting the proposed cut-offs.
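A minimal sketch of a τ calculation is given below. It assumes the log-scaled form of the index, τ = Σ(1 − log x_i / log x_max) / (N − 1), where x_i is the TPM in tissue i and N is the number of tissues, and it assumes that the value of 2 acts as a floor for any tissue with TPM below 2. Both points are assumptions made for illustration; the exact formula and handling should be checked against the cited methods.

```python
import math


def tau(tpm_by_tissue, floor=2.0):
    """Tissue-specificity index on log-scaled expression.

    Assumptions (not confirmed by the text): natural-log scaling and a
    floor of 2 TPM applied to every tissue below that value.
    """
    logs = [math.log(max(value, floor)) for value in tpm_by_tissue]
    log_max = max(logs)            # > 0 because the floor is above 1
    n_tissues = len(logs)
    return sum(1.0 - x / log_max for x in logs) / (n_tissues - 1)


# Made-up profiles across six tissues.
tissue_specific = [100.0, 0.0, 0.0, 0.0, 0.0, 0.0]
housekeeping = [50.0, 50.0, 50.0, 50.0, 50.0, 50.0]
print(round(tau(tissue_specific), 3))
print(round(tau(housekeeping), 3))
```

With the floor in place, a gene detected in a single tissue cannot reach τ = 1 exactly, which dampens the influence of low-expression noise on the upper end of the distribution.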

Characterizing tissue expression profiles

To characterize patterns of gene expression unique to each tissue, we compared transcript presence/absence across all tissues using the UpSetR package90 and the 10 most abundant, unique genes were identified in each tissue. To clarify processes unique to each tissue, transcripts expressed in only a single tissue were subjected to a GO overrepresentation analysis (see below). We additionally identified steroidogenic enzymes (StAR, P450scc, CYP17, 3βHSD1, SRD5A1, and AROM) and receptors (androgen and estrogen receptors) expressed in each tissue to evaluate tissue-specific steroidogenic capacity and sensitivity. We also analyzed the multiple 17βHSD isoforms by clustering them based on tissue expression using Euclidean distance.
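The distance step behind the isoform clustering can be sketched as below. The isoform labels and expression values are made up, and the text does not state whether expression was transformed or which linkage method was applied, so this sketch only computes pairwise Euclidean distances on raw values and reports the closest pair.

```python
from itertools import combinations
from math import dist

# Hypothetical expression profiles (one value per tissue); not study data.
profiles = {
    "isoform_A": [0.0, 0.0, 40.0, 0.0, 1.0, 0.0],
    "isoform_B": [0.0, 1.0, 35.0, 0.0, 0.0, 0.0],
    "isoform_C": [12.0, 10.0, 11.0, 9.0, 13.0, 10.0],
}


def pairwise_distances(expr):
    """Euclidean distance between every pair of expression profiles."""
    return {
        (a, b): dist(expr[a], expr[b])
        for a, b in combinations(sorted(expr), 2)
    }


def closest_pair(expr):
    """Pair of isoforms with the most similar tissue expression."""
    distances = pairwise_distances(expr)
    return min(distances, key=distances.get)


print(closest_pair(profiles))
```

A hierarchical clustering would repeatedly merge the closest pair under a chosen linkage rule; that choice is left open here because the source does not specify it.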

Positive selection analysis

To compare patterns of protein coding sequence evolution along the lineage leading to the tree swallow, we downloaded the peptides and coding nucleotide sequences on July 27, 2018 for the 7 other avian species belonging to the order Passeriformes on the NCBI database (white-throated sparrow, Zonotrichia albicollis; American crow, Corvus brachyrhynchos; great tit, Parus major; collared flycatcher, Ficedula albicollis; zebra finch, Taeniopygia guttata; blue tit, Cyanistes caeruleus; and European starling, Sturnus vulgaris), along with the chicken (Gallus gallus) to serve as an outgroup (Supplementary Table S9). Sequences were filtered such that only the longest isoform of each gene was retained. The filtered protein sequences from these 8 species plus the predicted peptides for the tree swallow generated here were clustered into orthologous groups using an all-vs-all BLAST search91 that generated e-values used to inform the main clustering criterion for the MCL program92. Of the resulting groups of orthologous transcripts, we identified 3,015 single-copy peptide groups that have exactly one copy of the gene present in each species. We aligned these with two codon alignment programs (PRANK93 and MACSE94), allowing us to look for overlapping genes in the final list of genes under positive selection as a means to eliminate possible errors due to poor alignments. Alignments were also masked with GBlocks95 to remove poorly aligned or gap-ridden positions. We reconstructed gene trees with RAxML96, which were used to infer a species tree with ASTRAL (Supplementary Fig. S4)97. To identify genes evolving under positive selection along the tree swallow lineage (orange branch in Supplementary Fig. S4), we set that branch of the phylogeny as the foreground branch in a PAML46 branch-site test47. However, because we observed high amounts of discordance between individual gene trees and the inferred species tree (incomplete lineage sorting; ILS) in our phylogeny, we ran the branch-site test on the coding sequence of each gene using the gene tree inferred from the coding sequence of that locus, rather than the inferred species tree for all loci. This helps minimize the effect of substitutions produced by ILS98. For detailed methods and a discussion of the phylogenetic methods, see supplementary information (Supplementary Methods: Positive Selection Analysis). To better characterize genes under positive selection, we performed a GO overrepresentation analysis (see below) and explored relationships between positive selection, gene expression levels, and tissue specificity.
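The single-copy criterion described above (exactly one gene from every species in a group) can be illustrated with the sketch below. The data layout, species labels, and gene identifiers are hypothetical; the clustering itself (BLAST followed by MCL) is assumed to have been run already.

```python
from collections import Counter


def single_copy_groups(groups, species):
    """Keep orthologous groups with exactly one member from every species.

    `groups` is a list of groups; each group is a list of
    (species, gene_id) tuples. This layout is an assumption for illustration.
    """
    required = set(species)
    kept = []
    for group in groups:
        counts = Counter(sp for sp, _gene in group)
        if set(counts) == required and all(c == 1 for c in counts.values()):
            kept.append(group)
    return kept


# Toy example with three species instead of the nine used in the study.
species = ["tree_swallow", "zebra_finch", "chicken"]
groups = [
    [("tree_swallow", "g1"), ("zebra_finch", "g2"), ("chicken", "g3")],
    [("tree_swallow", "g4"), ("tree_swallow", "g5"),
     ("zebra_finch", "g6"), ("chicken", "g7")],          # duplicated gene
    [("tree_swallow", "g8"), ("chicken", "g9")],          # species missing
]
print(len(single_copy_groups(groups, species)))
```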

GO overrepresentation analyses

To investigate the functions of genes uniquely expressed in each tissue and genes under positive selection, we performed GO overrepresentation analyses99,100. We assessed the complete biological processes, molecular functions, and cellular components using Gallus gallus as the background reference and a Fisher’s Exact test with FDR correction. Only GO terms with ≥3 genes were considered.
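The statistics behind an overrepresentation test of this kind can be sketched as follows. This is not the tool used in the study: it assumes a one-sided Fisher's exact test (equivalent to the upper tail of the hypergeometric distribution) and the Benjamini-Hochberg procedure for FDR correction, neither of which is specified in the text. Function names and example numbers are illustrative.

```python
from math import comb


def overrepresentation_p(k, annotated, drawn, population):
    """One-sided Fisher's exact p-value, P(X >= k).

    k          genes in the test list carrying the GO term
    annotated  genes in the background carrying the GO term
    drawn      size of the test list
    population size of the background
    """
    total = comb(population, drawn)
    upper = min(annotated, drawn)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from largest p to smallest
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up example: 3 of 3 listed genes carry a term found in 3 of 10 genes.
print(overrepresentation_p(3, 3, 3, 10))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A minimum-size filter such as the ≥3 gene rule would be applied to the terms before or after testing, depending on the tool.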

Acknowledgements

This work was supported by the National Science Foundation (grant IOS-1656109) and the Indiana University Clinical and Translational Sciences Institute. We would like to thank EM George and SE Lipshutz for feedback on the manuscript. We also acknowledge the Indiana University Research and Teaching Preserve and the Indiana Department of Natural Resources for access to field sites, and the Indiana University Center for Genomics and Bioinformatics.

Author contributions

A.B.B. and K.A.R. designed the study; K.A.R. collected the samples; A.B.B. performed RNA extractions and statistical analyses; and G.W.C.T. and D.B.R. performed bioinformatic analyses. A.B.B. and K.A.R. drafted the manuscript. All authors read, approved, and contributed to the final manuscript.

Data availability

Raw sequence reads can be obtained from the Gene Expression Omnibus database (GEO accession number GSE126210). The transcriptome has been submitted to the Transcriptome Shotgun Assembly project and deposited at DDBJ/EMBL/GenBank under the accession GHGE00000000. The version described in this paper is the first version, GHGE01000000.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information is available for this paper at 10.1038/s41598-019-52312-4.

References

  • 1. Aubin-Horth N, Renn SCP. Genomic reaction norms: using integrative biology to understand molecular mechanisms of phenotypic plasticity. Mol. Ecol. 2009;18:3763–3780. doi: 10.1111/j.1365-294X.2009.04313.x.
  • 2. Zuk M, Balenger SL. Behavioral ecology and genomics: new directions, or just a more detailed map? Behav. Ecol. 2014;25:1277–1282. doi: 10.1093/beheco/aru172.
  • 3. Ekblom R, Galindo J. Applications of next generation sequencing in molecular ecology of non-model organisms. Heredity. 2011;107:1. doi: 10.1038/hdy.2010.152.
  • 4. Jax E, Wink M, Kraus RH. Avian transcriptomics: opportunities and challenges. J. Ornithol. 2018;159:599–629. doi: 10.1007/s10336-018-1532-5.
  • 5. Winkler, D. W. et al. Tree swallow (Tachycineta bicolor). The Birds of North America Online (2011).
  • 6. Dunn PO, Winkler DW. Climate change has affected the breeding date of tree swallows throughout North America. Proc. R. Soc. Lond. B. 1999;266:2487–2490. doi: 10.1098/rspb.1999.0950.
  • 7. McCarty JP. Use of tree swallows in studies of environmental stress. Rev. Toxicol. 2001;4:61–104.
  • 8. Ardia DR, Schat KA, Winkler DW. Reproductive effort reduces long-term immune function in breeding tree swallows (Tachycineta bicolor). Proc. R. Soc. Lond. B. 2003;270:1679–1683. doi: 10.1098/rspb.2003.2424.
  • 9. Rosvall KA. Sexual selection on aggressiveness in females: evidence from an experimental test with tree swallows. Anim. Behav. 2008;75:1603–1610. doi: 10.1016/j.anbehav.2007.09.038.
  • 10. Vitousek MN, et al. The lingering impact of stress: brief acute glucocorticoid exposure has sustained, dose-dependent effects on reproduction. Proc. R. Soc. Lond. B. 2018;285:20180722. doi: 10.1098/rspb.2018.0722.
  • 11. Jones J. Tree swallows (Tachycineta bicolor): a new model organism? Auk. 2003;120:591–599. doi: 10.1642/0004-8038(2003)120[0591:TSTBAN]2.0.CO;2.
  • 12. Whittingham LA, Slikas B, Winkler DW, Sheldon FH. Phylogeny of the tree swallow genus, Tachycineta (Aves: Hirundinidae), by Bayesian analysis of mitochondrial DNA sequences. Mol. Phylogenet. Evol. 2002;22:430–441. doi: 10.1006/mpev.2001.1073.
  • 13. Ouyang JQ, et al. Do hormones, telomere lengths, and oxidative stress form an integrated phenotype? A case study in free-living tree swallows. Integr. Comp. Biol. 2016;56:138–145. doi: 10.1093/icb/icw044.
  • 14. Schmitt C, et al. Patterns of diversity and spatial variability of β-defensin innate immune genes in a declining wild population of tree swallows. J. Hered. 2017;108:262–269. doi: 10.1093/jhered/esx005.
  • 15. Bentz AB, Dossey EK, Rosvall KA. Tissue-specific gene regulation corresponds with seasonal plasticity in female testosterone. Gen. Comp. Endocrinol. 2019;270:26–34. doi: 10.1016/j.ygcen.2018.10.001.
  • 16. Virgin EE, Rosvall KA. Endocrine-immune signaling as a predictor of survival: A prospective study in developing songbird chicks. Gen. Comp. Endocrinol. 2018;267:193–201. doi: 10.1016/j.ygcen.2018.08.008.
  • 17. Taff, C. C., Campagna, L. & Vitousek, M. N. Genome-wide variation in DNA methylation is associated with stress resilience and plumage brightness in a wild bird. Mol. Ecol. in press (2019).
  • 18. Ellegren H, Sheldon BC. Genetic basis of fitness differences in natural populations. Nature. 2008;452:169–175. doi: 10.1038/nature06737.
  • 19. Zhang J, Yang J-R. Determinants of the rate of protein sequence evolution. Nat. Rev. Genet. 2015;16:409. doi: 10.1038/nrg3950.
  • 20. Axelsson E, et al. Natural selection in avian protein‐coding genes expressed in brain. Mol. Ecol. 2008;17:3008–3017. doi: 10.1111/j.1365-294X.2008.03795.x.
  • 21. Ekblom R, French L, Slate J, Burke T. Evolutionary analysis and expression profiling of zebra finch immune genes. Genome Biol. Evol. 2010;2:781–790. doi: 10.1093/gbe/evq061.
  • 22. Hao Y, et al. Comparative transcriptomics of 3 high-altitude passerine birds and their low-altitude relatives. Proc. Natl. Acad. Sci. 2019;116:11851–11856. doi: 10.1073/pnas.1819657116.
  • 23. Backström N, Zhang Q, Edwards SV. Evidence from a house finch (Haemorhous mexicanus) spleen transcriptome for adaptive evolution and biased gene conversion in passerine birds. Mol. Biol. Evol. 2013;30:1046–1050. doi: 10.1093/molbev/mst033.
  • 24. Balakrishnan CN, Chapus C, Brewer MS, Clayton DF. Brain transcriptome of the violet-eared waxbill Uraeginthus granatina and recent evolution in the songbird genome. Open Biol. 2013;3:130063. doi: 10.1098/rsob.130063.
  • 25. Künstner A, et al. Comparative genomics based on massive parallel transcriptome sequencing reveals patterns of substitution and selection across 10 bird species. Mol. Ecol. 2010;19:266–276. doi: 10.1111/j.1365-294X.2009.04487.x.
  • 26. Wang Q, et al. Comparative transcriptomics in three Passerida species provides insights into the evolution of avian mitochondrial complex I. Comp. Biochem. Physiol. D. 2018;28:27–36. doi: 10.1016/j.cbd.2018.06.002.
  • 27. Custer, C. M. Swallows as a sentinel species for contaminant exposure and effect studies in Wildlife Ecotoxicology (eds Elliott, J. E., Bishop, C. A. & Morrissey, C. A.) 45–91 (Springer, 2011).
  • 28. Hawley DM, Hallinger KK, Cristol DA. Compromised immune competence in free-living tree swallows exposed to mercury. Ecotoxicology. 2009;18:499–503. doi: 10.1007/s10646-009-0307-4.
  • 29. Brasso RL, Cristol DA. Effects of mercury exposure on the reproductive success of tree swallows (Tachycineta bicolor). Ecotoxicology. 2008;17:133–141. doi: 10.1007/s10646-007-0163-z.
  • 30. Stutchbury BJ, Robertson RJ. Floating populations of female tree swallows. Auk. 1985;102:651–654. doi: 10.1093/auk/102.3.651.
  • 31. Kempenaers B, et al. Extra-pair paternity and the reproductive role of male floaters in the tree swallow (Tachycineta bicolor). Behav. Ecol. Sociobiol. 2001;49:251–259. doi: 10.1007/s002650000305.
  • 32. Dunn PO, Whittingham LA, Lifjeld JT, Robertson RJ, Boag PT. Effects of breeding density, synchrony, and experience on extrapair paternity in tree swallows. Behav. Ecol. 1994;5:123–129. doi: 10.1093/beheco/5.2.123.
  • 33. Barber CA, Robertson RJ, Boag PT. The high frequency of extra-pair paternity in tree swallows is not an artifact of nestboxes. Behav. Ecol. Sociobiol. 1996;38:425–430. doi: 10.1007/s002650050260.
  • 34. Ekblom R, Balakrishnan CN, Burke T, Slate J. Digital gene expression analysis of the zebra finch genome. BMC Genomics. 2010;11:219. doi: 10.1186/1471-2164-11-219.
  • 35. MacManes MD, et al. Widespread patterns of sexually dimorphic gene expression in an avian hypothalamic–pituitary–gonadal (HPG) axis. Sci. Rep. 2017;7:45125. doi: 10.1038/srep45125.
  • 36. Lipshutz, S. E., George, E. M., Bentz, A. B. & Rosvall, K. A. Evaluating testosterone as a phenotypic integrator: From tissues to individuals to species. Mol. Cell. Endocrinol. in press (2019).
  • 37. Ketterson ED, Atwell JW, McGlothlin JW. Phenotypic integration and independence: hormones, performance, and response to environmental change. Integr. Comp. Biol. 2009;49:365–379. doi: 10.1093/icb/icp057.
  • 38. Ball GF, Balthazart J. Individual variation and the endocrine regulation of behaviour and physiology in birds: a cellular/molecular perspective. Phil. Trans. R. Soc. B. 2007;363:1699–1710. doi: 10.1098/rstb.2007.0010.
  • 39. Leffelaar D, Robertson RJ. Nest usurpation and female competition for breeding opportunities by tree swallows. Wilson Bull. 1985;97:221–224.
  • 40. Stutchbury BJ, Robertson RJ. Behavioral tactics of subadult female floaters in the tree swallow. Behav. Ecol. Sociobiol. 1987;20:413–419. doi: 10.1007/BF00302984.
  • 41. Rosvall KA. Life history trade-offs and behavioral sensitivity to testosterone: an experimental test when female aggression and maternal care co-occur. PLoS One. 2013;8:e54120. doi: 10.1371/journal.pone.0054120.
  • 42. Whittingham LA, Schwabl H. Maternal testosterone in tree swallow eggs varies with female aggression. Anim. Behav. 2002;63:63–67. doi: 10.1006/anbe.2001.1889.
  • 43. Bentz AB, Navara KJ, Siefferman L. Phenotypic plasticity in response to breeding density in tree swallows: an adaptive maternal effect? Horm. Behav. 2013;64:729–736. doi: 10.1016/j.yhbeh.2013.08.009.
  • 44. Smith-Unna R, Boursnell C, Patro R, Hibberd JM, Kelly S. TransRate: reference-free quality assessment of de novo transcriptome assemblies. Genome Res. 2016;26:1134–1144. doi: 10.1101/gr.196469.115.
  • 45. Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015;31:3210–3212. doi: 10.1093/bioinformatics/btv351.
  • 46. Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 2007;24:1586–1591. doi: 10.1093/molbev/msm088.
  • 47. Zhang J, Nielsen R, Yang Z. Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Mol. Biol. Evol. 2005;22:2472–2479. doi: 10.1093/molbev/msi237.
  • 48. London SE, Clayton DF. Genomic and neural analysis of the estradiol-synthetic pathway in the zebra finch. BMC Genomics. 2010;11:46. doi: 10.1186/1471-2164-11-46.
  • 49. Mindnich R, Möller G, Adamski J. The role of 17 beta-hydroxysteroid dehydrogenases. Mol. Cell. Endocrinol. 2004;218:7–20. doi: 10.1016/j.mce.2003.12.006.
  • 50. Richardson MF, Sherwin WB, Rollins LA. De novo assembly of the liver transcriptome of the European starling, Sturnus vulgaris. J. Genomics. 2017;5:54–57. doi: 10.7150/jgen.19504.
  • 51. Enard W, et al. Intra-and interspecific variation in primate gene expression patterns. Science. 2002;296:340–343. doi: 10.1126/science.1068996.
  • 52. Cáceres M, et al. Elevated gene expression levels distinguish human from non-human primate brains. Proc. Natl. Acad. Sci. 2003;100:13030–13035. doi: 10.1073/pnas.2135499100.
  • 53. Khaitovich P, Enard W, Lachmann M, Pääbo S. Evolution of primate gene expression. Nat. Rev. Genet. 2006;7:693. doi: 10.1038/nrg1940.
  • 54. Yanai I, et al. Genome-wide midrange transcription profiles reveal expression level relationships in human tissue specification. Bioinformatics. 2004;21:650–659. doi: 10.1093/bioinformatics/bti042.
  • 55. Formenti G, et al. SMRT long reads and direct label and stain optical maps allow the generation of a high-quality genome assembly for the European barn swallow (Hirundo rustica rustica). GigaScience. 2019;8:giy142. doi: 10.1093/gigascience/giy142.
  • 56. Arai E, et al. Physiological conditions and genetic controls of phaeomelanin pigmentation in nestling barn swallows. Behav. Ecol. 2017;28:706–716. doi: 10.1093/beheco/arx012.
  • 57. Looi QH, Amin H, Aini I, Zuki M, Omar A. De novo transcriptome analysis shows differential expression of genes in salivary glands of edible bird’s nest producing swiftlets. BMC Genomics. 2017;18:504. doi: 10.1186/s12864-017-3861-9.
  • 58. O’Connor RJ. Structure in avian growth patterns: a multivariate study of passerine development. J. Zool. 1978;185:147–172. doi: 10.1111/j.1469-7998.1978.tb03319.x.
  • 59. Emlen ST, Wrege PH, Demong NJ, Hegner RE. Flexible growth rates in nestling white-fronted bee-eaters: a possible adaptation to short-term food shortage. Condor. 1991;93:591–597. doi: 10.2307/1368191.
  • 60. Brzęk P, Konarzewski M. Effect of refeeding on growth, development, and behavior of undernourished bank swallow (Riparia riparia) nestlings. Auk. 2004;121:1187–1198. doi: 10.1093/auk/121.4.1187.
  • 61. McCarty JP. Variation in growth of nestling tree swallows across multiple temporal and spatial scales. Auk. 2001;118:176–190. doi: 10.1093/auk/118.1.176.
  • 62. Suwa A, Kurama T, Shimokawa T. Adipocyte hyperplasia and RMI1 in the treatment of obesity. FEBS J. 2011;278:565–569. doi: 10.1111/j.1742-4658.2010.07980.x.
  • 63. Niehaus JL, et al. CB1 cannabinoid receptor activity is modulated by the interacting protein CRIP1a. Mol. Pharmacol. 2007;72:1557–1566. doi: 10.1124/mol.107.039263.
  • 64. Pereira R, Sá R, Barros A, Sousa M. Major regulatory mechanisms involved in sperm motility. Asian J. Androl. 2017;19:5. doi: 10.4103/1008-682X.167716.
  • 65. Moller AP, Briskie JV. Extra-pair paternity, sperm competition and the evolution of testis size in birds. Behav. Ecol. Sociobiol. 1995;36:357–365. doi: 10.1007/BF00167797.
  • 66. Laskemoen T, et al. Sperm quantity and quality effects on fertilization success in a highly promiscuous passerine, the tree swallow Tachycineta bicolor. Behav. Ecol. Sociobiol. 2010;64:1473–1483. doi: 10.1007/s00265-010-0962-8.
  • 67. Birkhead TR, Pizzari T. Evolution of sex: postcopulatory sexual selection. Nat. Rev. Genet. 2002;3:262. doi: 10.1038/nrg774.
  • 68. Nielsen R, et al. A scan for positively selected genes in the genomes of humans and chimpanzees. PLoS Biol. 2005;3:e170. doi: 10.1371/journal.pbio.0030170.
  • 69. Shultz AJ, Sackton TB. Immune genes are hotspots of shared positive selection across birds and mammals. eLife. 2019;8:e41815. doi: 10.7554/eLife.41815.
  • 70. Hau M. Regulation of male traits by testosterone: implications for the evolution of vertebrate life histories. BioEssays. 2007;29:133–144. doi: 10.1002/bies.20524.
  • 71. Fuxjager MJ, Schuppe ER. Androgenic signaling systems and their role in behavioral evolution. J. Steroid Biochem. Mol. Biol. 2018;184:47–56. doi: 10.1016/j.jsbmb.2018.06.004.
  • 72. Staub NL, De Beer M. The role of androgens in female vertebrates. Gen. Comp. Endocrinol. 1997;108:1–24. doi: 10.1006/gcen.1997.6962.
  • 73. Fuxjager MJ, Miles MC, Schlinger BA. Evolution of the androgen-induced male phenotype. J. Comp. Physiol. A. 2018;204:81–92. doi: 10.1007/s00359-017-1215-3.
  • 74. Schmidt KL, et al. Neurosteroids, immunosteroids, and the balkanization of endocrinology. Gen. Comp. Endocrinol. 2008;157:266–274. doi: 10.1016/j.ygcen.2008.03.025.
  • 75. Soma K. Testosterone and aggression: Berthold, birds and beyond. J. Neuroendocrinol. 2006;18:543–551. doi: 10.1111/j.1365-2826.2006.01440.x.
  • 76. Labrie, F. et al. Structure and control of expression of the 3βHSD and 17βHSD genes in classical steroidogenic and peripheral intracrine tissues in Molecular Basis of Reproductive Endocrinology (eds Leung, P. C. K., Hsueh, A. J. W. & Friesen, H. G.) 112–143 (Springer, 1993).
  • 77. Taves MD, Gomez-Sanchez CE, Soma KK. Extra-adrenal glucocorticoids and mineralocorticoids: evidence for local synthesis, regulation, and function. Am. J. Physiol. Endocrinol. Metab. 2011;301:E11–E24. doi: 10.1152/ajpendo.00100.2011.
  • 78. Tsutsui K, Matsunaga M, Ukena K. Biosynthesis and biological actions of neurosteroids in the avian brain. Avian Poult. Biol. Rev. 2003;14:63–78. doi: 10.3184/147020603783641297.
  • 79. London SE, Monks DA, Wade J, Schlinger BA. Widespread capacity for steroid synthesis in the avian brain and song system. Endocrinology. 2006;147:5975–5987. doi: 10.1210/en.2006-0154.
  • 80. Pradhan DS, et al. Aggressive interactions rapidly increase androgen synthesis in the brain during the non-breeding season. Horm. Behav. 2010;57:381–389. doi: 10.1016/j.yhbeh.2010.01.008.
  • 81.Prehn C, Möller G, Adamski J. Recent advances in 17beta-hydroxysteroid dehydrogenases. J. Steroid Biochem. Mol. Bio. 2009;114:72–77. doi: 10.1016/j.jsbmb.2008.12.024. [DOI] [PubMed] [Google Scholar]
  • 82.Nokelainen P, et al. Molecular cloning of mouse 17β‐Hydroxysteroid Dehydrogenase Type 1 and characterization of enzyme activity. Eur. J. Biochem. 1996;236:482–490. doi: 10.1111/j.1432-1033.1996.00482.x. [DOI] [PubMed] [Google Scholar]
  • 83.Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–2120. doi: 10.1093/bioinformatics/btu170. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 84.Fu L, Niu B, Zhu Z, Wu S, Li W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics. 2012;28:3150–3152. doi: 10.1093/bioinformatics/bts565. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 85.Li W, Godzik A. CD-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006;22:1658–1659. doi: 10.1093/bioinformatics/btl158. [DOI] [PubMed] [Google Scholar]
  • 86.Uhlen M, et al. Towards a knowledge-based human protein atlas. Nature. 2010;28:1248. doi: 10.1038/nbt1210-1248. [DOI] [PubMed] [Google Scholar]
  • 87.Mank JE, Hultin-Rosenberg L, Zwahlen M, Ellegren H. Pleiotropic constraint hampers the resolution of sexual antagonism in vertebrate gene expression. Am. Nat. 2008;171:35–43. doi: 10.1086/523954.
  • 88.Olias P, Adam I, Meyer A, Scharff C, Gruber AD. Reference genes for quantitative gene expression studies in multiple avian species. PLoS One. 2014;9:e99678. doi: 10.1371/journal.pone.0099678.
  • 89.Zinzow-Kramer WM, Horton BM, Maney DL. Evaluation of reference genes for quantitative real-time PCR in the brain, pituitary, and gonads of songbirds. Horm. Behav. 2014;66:267–275. doi: 10.1016/j.yhbeh.2014.04.011.
  • 90.Lex A, et al. UpSet: visualization of intersecting sets. IEEE Trans. Vis. Comput. Graph. 2014;20:1983–1992. doi: 10.1109/TVCG.2014.2346248.
  • 91.Altschul SF, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402. doi: 10.1093/nar/25.17.3389.
  • 92.Enright AJ, Van Dongen S, Ouzounis CA. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 2002;30:1575–1584. doi: 10.1093/nar/30.7.1575.
  • 93.Löytynoja, A. Phylogeny-aware alignment with PRANK in Multiple Sequence Alignment Methods (ed. Russell, D. J.) 155–170 (Springer, 2014).
  • 94.Ranwez V, et al. MACSE v2: toolkit for the alignment of coding sequences accounting for frameshifts and stop codons. Mol. Biol. Evol. 2018;35:2582–2584. doi: 10.1093/molbev/msy159.
  • 95.Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 2000;17:540–552. doi: 10.1093/oxfordjournals.molbev.a026334.
  • 96.Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30:1312–1313. doi: 10.1093/bioinformatics/btu033.
  • 97.Sayyari E, Mirarab S. Fast coalescent-based computation of local branch support from quartet frequencies. Mol. Biol. Evol. 2016;33:1654–1668. doi: 10.1093/molbev/msw079.
  • 98.Mendes FK, Hahn MW. Gene tree discordance causes apparent substitution rate variation. Syst. Biol. 2016;65:711–721. doi: 10.1093/sysbio/syw018.
  • 99.Ashburner M, et al. Gene ontology: tool for the unification of biology. Nat. Genet. 2000;25:25–29. doi: 10.1038/75556.
  • 100.Gene Ontology Consortium. Expansion of the Gene Ontology knowledgebase and resources. Nucleic Acids Res. 2016;45:D331–D338. doi: 10.1093/nar/gkw1108.

Associated Data

Data Availability Statement

Raw sequence reads can be obtained from the Gene Expression Omnibus database (GEO accession number GSE126210). The transcriptome has been submitted to the Transcriptome Shotgun Assembly project and deposited at DDBJ/EMBL/GenBank under the accession GHGE00000000. The version described in this paper is the first version, GHGE01000000.


Articles from Scientific Reports are provided here courtesy of Nature Publishing Group