Proceedings of the National Academy of Sciences of the United States of America. 2016 Nov 8;113(47):E7367–E7374. doi: 10.1073/pnas.1605202113

Molecular and physiological evidence of genetic assimilation to high CO2 in the marine nitrogen fixer Trichodesmium

Nathan G Walworth a, Michael D Lee a, Fei-Xue Fu a, David A Hutchins a, Eric A Webb a,1
PMCID: PMC5127367  PMID: 27830646

Significance

The free-living cyanobacterium Trichodesmium is an important nitrogen-fixer in the global oceans, yet virtually nothing is known about its molecular evolution to increased CO2. Here we show that Trichodesmium can fix a plastic, short-term response upon long-term adaptation, potentially through genetic assimilation. We provide transcriptional evidence for molecular mechanisms that parallel the fixation of the plastic phenotype, thereby demonstrating an important evolutionary capability in Trichodesmium CO2 adaptation. Transcriptional shifts involve transposition and other regulatory mechanisms (sigma factors) that control a variety of metabolic pathways, suggesting alterations in upstream regulation to be important under genetic assimilation. Together, these data highlight potential biochemical evidence of genetic assimilation in a keystone marine N2-fixer, with broad implications for microbial evolution and biogeochemistry.

Keywords: diazotroph, evolution, CO2, genetic assimilation, plasticity

Abstract

Most investigations of biogeochemically important microbes have focused on plastic (short-term) phenotypic responses in the absence of genetic change, whereas few have investigated adaptive (long-term) responses. However, no studies to date have investigated the molecular progression underlying the transition from plasticity to adaptation under elevated CO2 for a marine nitrogen-fixer. To address this gap, we cultured the globally important cyanobacterium Trichodesmium at both low and high CO2 for 4.5 y, followed by reciprocal transplantation experiments to test for adaptation. Intriguingly, fitness actually increased in all high-CO2 adapted cell lines in the ancestral environment upon reciprocal transplantation. By leveraging coordinated phenotypic and transcriptomic profiles, we identified expression changes and pathway enrichments that rapidly responded to elevated CO2 and were maintained upon adaptation, providing strong evidence for genetic assimilation. These candidate genes and pathways included those involved in photosystems, transcriptional regulation, cell signaling, carbon/nitrogen storage, and energy metabolism. Conversely, significant changes in specific sigma factor expression were only observed upon adaptation. These data reveal genetic assimilation as a potentially adaptive response of Trichodesmium and importantly elucidate underlying metabolic pathways paralleling the fixation of the plastic phenotype upon adaptation, thereby contributing to the few available data demonstrating genetic assimilation in microbial photoautotrophs. These molecular insights are thus critical for identifying pathways under selection as drivers in plasticity and adaptation.


Warming temperatures and increasing anthropogenic carbon dioxide (CO2) emissions have galvanized investigations of both short- and long-term responses to global change factors in numerous biological systems. Studies assessing responses of both carbon-fixing (primary producers) and nitrogen-fixing (diazotrophs) organisms to ocean acidification have been of particular interest because of their bottom-up control of global biogeochemical cycles and food webs (1). However, attributing observed phenotypic changes to specific environmental perturbations in situ remains an ongoing challenge, particularly when distinguishing between phenotypic plasticity and adaptive evolution (2). Phenotypic plasticity occurs when individuals in a population of a given genotype change their phenotype as part of a rapid response to environmental change, whereas adaptive evolution occurs when the underlying genetic (allelic) composition of a population changes the phenotype as a result of natural selection (3). Population-level phenotypic changes may also ultimately result from environmental stress (2).

Additionally, it has been shown that a range of phenotypic plasticity can exist within a single species (4, 5) and that phenotypic plasticity itself can evolve and aid in adaptation (3, 6, 7). As such, plasticity can potentially affect evolution in opposing ways. It may either facilitate adaptation by having natural selection fix a beneficial plastic trait (phenotype; i.e., genetic assimilation) (8), or it can shield certain genotypes from natural selection if optimal phenotypes may be produced by plasticity alone (3). Hence, these phenomena necessitate investigations into the effects of plasticity on population-level adaptations during periods of environmental pressure. Here, we define genetic assimilation to occur when a trait that originally responded to environmental change loses environmental sensitivity (i.e., plasticity) and ultimately becomes constitutively expressed (i.e., fixed) in a population (8).

Laboratory-based experimental evolution studies enable analysis of organismal and population responses to defined experimental conditions as they transition from plastic to adaptive (7). These insights better inform environmental phenotypic observations and offer more constrained time scales of plasticity vs. adaptation. However, aside from being typically restricted to rapidly dividing microorganisms, the main experimental challenge resides in extrapolating laboratory evolutionary potential to predicting adaptive capacities in natural populations. Thus, comprehensively interpreting in situ genetic and phenotypic datasets remains challenging because of limited knowledge of fundamental biology, gene flow, population sizes, mutation, and recombination rates (3).

One promising approach is to couple molecular techniques with experimental evolution to elucidate how underlying molecular changes coordinate to shape the plastic phenotype, the evolved phenotype/genotype, or both (8, 9). For example, one recent study examining the effect of high CO2 on gene expression changes in the eukaryotic calcifying alga Emiliania huxleyi found that opposing plastic and adaptive phenotypes were also reflected by their corresponding gene expression changes (10). In a preceding study, Lohbeck et al. confirmed adaptation through reciprocal transplantation and observed significant growth rate increases in high-CO2 selected lines relative to those of low-CO2 selected lines under elevated CO2 conditions (11). Reduced growth and calcification in the plastic response correlated with reductions in expression of genes involved in pH regulation, photosynthesis, carbon transport, and calcification, whereas partly restored growth and calcification observed in the adaptive response were associated with the significant recovery of the expression of these genes. Hence, this experiment elucidated an opposing phenotypic relationship between plasticity and adaptation mirrored by underlying gene expression changes in E. huxleyi. However, no studies to date have characterized the molecular progression underlying this transition for marine nitrogen-fixing organisms.

In a prior study, we showed growth and N2 fixation rate increases in response to short-term, elevated CO2 (i.e., plastic response) that became fixed upon long-term adaptation in the biogeochemically important marine diazotroph Trichodesmium (12). High-CO2 selected cell lines exhibited higher fitness in the ancestral (i.e., low) CO2 condition than low-CO2 selected and ancestral cell lines (i.e., positive correlated response), similar to Lohbeck et al. (11). This finding contrasts with other CO2-correlated responses reported in marine phytoplankton (3, 7, 13), although other positive correlated responses have been reported in heterotrophic microbial evolution studies (14). In our study, the observed adaptive response resulted in constitutive growth and N2 fixation rate increases under both CO2 levels, demonstrating an apparent loss of environmental sensitivity to low CO2.

Before our study, field and laboratory studies had characterized only short-term Trichodesmium responses to environmental change, which set the stage for evolutionary investigations of factors like iron (Fe), phosphorus (P), and CO2 (15–19). For example, multiple short-term (weeks to months) laboratory (17, 20–24) and field (1, 25) experiments have demonstrated increases in both growth and N2 fixation in Trichodesmium spp. in response to elevated CO2, revealing a plastic high-CO2 phenotype under specific nutrient conditions. Furthermore, Hutchins et al. (26) observed divergent taxon-specific responses (i.e., reaction norms) to increased CO2 among biogeographically distinct diazotrophic cyanobacteria including Trichodesmium, a result that has also been observed in eukaryotic phytoplankton (5, 27, 28). A reaction norm is defined as a short-term response of a given genotype, which describes trait values such as rate of growth as a function of two or more environments. The shape of a reaction norm will determine which genotypes can most rapidly respond under a changing environment where a greater slope reflects a greater range in plasticity that may increase the competitive advantage for that genotype (29). Hence, genus-level conservation of physiological variability across broad eukaryotic and prokaryotic genera highlights the potential importance of maintaining evolutionary plasticity in certain environmental regimes and suggests differential taxonomic CO2 selection over geological time, which could have ultimately influenced biogeographic distributions. Additionally, Walworth et al. (30) recently validated the conservation of genome architecture and coding potential of the cyanobacterial diazotroph Trichodesmium erythraeum IMS101 (hereafter IMS101) with different Trichodesmium isolates, as well as with natural Trichodesmium populations sampled decades apart. These data help to environmentally contextualize the molecular results described here, which further aid in the extrapolation of laboratory molecular adaptation to the evolutionary potential contained within natural populations.

Leveraging these phenotypic and genetic data, we investigated the global transcriptional underpinnings of long-term CO2 selection of a single IMS101 starting population as its phenotype transitioned from plastic to adaptive. We sequenced biological triplicate transcriptomes of both long-term CO2 treatments after 4.5 y of selection (380-selected and 750-selected), as well as both reciprocal transfers (380s-to-750 and 750s-to-380) after 2 wk in the reciprocal CO2 concentration. One of the most striking insights separating the short- and long-term responses relative to the 380-selected phenotype (i.e., low-CO2 phenotype) was the differential regulation of RNA polymerase sigma factors, which have been shown to induce broad shifts in metabolic pathways in response to carbon and nitrogen fluctuations in other microbial systems (31, 32). Changes in sigma factor expression have also been proposed as mechanisms for the expression of broad gene circuits to undergo canalization (i.e., fixation or loss of low-CO2 plasticity in this case) in genetic assimilation (8), which is evidenced in our data by certain sigma factors and coexpressed genes sharing parallel expression profiles in both plastic and adaptive responses. Additionally, differential expression of transposition was detected in both short- and long-term CO2 responses. Hence, our data suggest that differential regulation of transposition and sigma factor expression may mediate genetic assimilation to long-term CO2 selection, potentially leading to broad, downstream changes in metabolic pathways.

Results and Discussion

One cell line was divided into two CO2 treatments of six biological replicates each and experimentally adapted at both low [380 microatmospheres (µatm)] and high (750 µatm) concentrations for ∼4.5 y (∼570–850 generations, depending on CO2 treatment) with growth rate as a proxy for reproductive fitness (9, 12). At the onset of this incubation, cell lines placed in 750 µatm CO2 (750 ancestral) rapidly increased both growth and N2 fixation rates, whereas cell lines in 380 µatm CO2 (380 ancestral) sustained lower physiological rates (Fig. 1) (12). This immediate fitness increase in response to high CO2 is consistent with the classically observed plastic response of IMS101 (high-CO2 phenotype) as previously shown (see above). After 4.5 y of low and high CO2 selection, no further changes in growth or N2 fixation were observed for either the 380-selected (low-CO2 genotype) or 750-selected (high-CO2 genotype) lines relative to their corresponding 380- and 750-ancestral time points, respectively (Fig. 1). All six replicates in the 750-selected cell lines still maintained significantly higher growth and N2 fixation rates relative to the 380-selected cell lines (Fig. 1; P = 2.4 × 10^-5), but showed no further fitness increase after the initial plastic growth rate response, despite ∼850 subsequent generations of selection at high CO2 (Fig. 1, orange bars). Once subcultures of the 380-selected cell lines were placed in high (750 µatm) CO2 for 2 wk after the 4.5-y incubation at low (380 µatm) CO2 (380s-to-750), both growth (fitness) and N2 fixation rapidly increased, similar to the 750-ancestral response and consistent with the aforementioned experiments (Fig. 1, green bars; P = 1.3 × 10^-3). However, when subcultures of the 750-selected cell lines were reciprocally transplanted back to the ancestral CO2 condition (750s-to-380; correlated response), a 44% fitness increase was observed relative to both the 380-selected and ancestral cell lines (Fig. 1, Lower; blue bars; P = 3.2 × 10^-3), similar to a positive correlated response in one other study (11), but contrasting with most others reported in marine phytoplankton (3, 7, 13). This positive correlated response is corroborated by a nonsignificant selection × assay interaction from the two-way analysis of variance (ANOVA, F = 14.99, P = 0.18; Fig. 1). As such, the 750-selected cell lines after long-term high-CO2 selection were characterized not by steady fitness increases in the selection environment, but by a loss of environmental sensitivity to low CO2 in the measured phenotypic traits (growth and N2 fixation). Hence, because both the 380s-to-750 and 750-selected responses to increased CO2 exhibited the same high-CO2 phenotype in the selection environment, the plastic response appears to have been fixed upon adaptation, suggesting that the low-CO2 genotype underwent genetic assimilation to produce the high-CO2 genotype (8). Additionally, upon graphing the growth rate slopes of the 380-ancestral to 750 switch (380a-to-750), 380s-to-750, and the 750s-to-380 across CO2 regimes, positive slopes are observed for both the 380a-to-750 and the 380s-to-750 going from 380 to 750 µatm CO2 (Fig. S1). In contrast, a negative slope is observed for the 750s-to-380 cell lines, suggesting an evolutionary shift in reaction norms between the low-CO2 genotype treatments (380-ancestral and 380-selected) and the high-CO2 genotype (750-selected): a criterion of genetic assimilation (8). The lack of a positive slope of the 750s-to-380 going from 380 to 750 µatm CO2 provides strong evidence that environmentally responsive traits (here growth and N2 fixation) lose environmental sensitivity by maintaining significantly increased rates in 380 µatm CO2 relative to the 380-selected cell lines in the same CO2 condition (8). The similar growth rate slopes of the 380a-to-750 and 380s-to-750 corroborate that the 380-ancestral and 380-selected cell lines are genetically analogous in terms of CO2, which demonstrates that the 380-ancestral cell lines growing in low CO2 have not evolutionarily shifted from the 380-selected cell lines in low CO2, but have indeed evolutionarily shifted from the 750-selected cell lines under low CO2 (Fig. S1).
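The reaction-norm comparison described above comes down to the sign of a slope between the two assay conditions. The following is a minimal sketch of that calculation, assuming hypothetical growth rates; the function name and all numeric values are illustrative placeholders, not the measurements plotted in Fig. S1.

```python
# Reaction-norm slope between two CO2 assay conditions.
# All growth-rate values below are hypothetical placeholders, NOT data from the study.

LOW_CO2 = 380.0   # uatm (microatmospheres)
HIGH_CO2 = 750.0  # uatm


def reaction_norm_slope(rate_low, rate_high, low=LOW_CO2, high=HIGH_CO2):
    """Change in growth rate per uatm CO2 going from the low to the high condition."""
    return (rate_high - rate_low) / (high - low)


# (growth rate at 380 uatm, growth rate at 750 uatm), per day; illustrative only
hypothetical_lines = {
    "380a-to-750": (0.25, 0.35),
    "380s-to-750": (0.25, 0.35),
    "750s-to-380": (0.45, 0.35),
}

slopes = {
    name: reaction_norm_slope(rate_low, rate_high)
    for name, (rate_low, rate_high) in hypothetical_lines.items()
}

for name, slope in slopes.items():
    direction = "positive" if slope > 0 else "negative or flat"
    print(f"{name}: {slope:.2e} per uatm ({direction})")
```

With these placeholder values, the two low-CO2 genotype treatments give positive slopes and the high-CO2 genotype gives a negative one, which is the qualitative pattern the text attributes to Fig. S1.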

Fig. 1.

Growth (Lower) and N2 fixation (Upper) rates of the ancestral and CO2-selected cultures before and after 4.5 y of selection, respectively. The assay condition is denoted on the x-axis, and the selection condition is denoted by the colors of the bar border. The bar colors denote the different experimental treatments that are indicated above each bar. Assays done after 4.5 y of long-term CO2 selection are denoted after the gray solid vertical line. The yellow background denotes the 380-µatm CO2 assay condition, and the gray background denotes that of the 750-µatm CO2. Statistically significant differences were determined by two-way ANOVAs (selection x assay conditions) followed by Tukey’s HSD post hoc tests. Asterisks denote statistical significance between two respective treatments. **P ≤ 0.01. Error bars are SEs of six biological replicate cultures.

Fig. S1.

Growth rate slopes for three separate, short-term switch experiments where growth rate was measured after ∼5 generations of acclimation to the new environment. The “380-ancestral to 750 switch” and the “380s-to-750” are the low-CO2 selected cell lines, and the “750s-to-380” are the high-CO2 selected cell lines.

Hereafter, we have chosen the terms “plastic response” and “adaptive response” to describe the physiological and transcriptional responses deriving from the low- and high-CO2 genotypes, respectively. The 380s-to-750 is a plastic (i.e., nonadapted) response to high CO2 deriving from the 380-selected cell lines (i.e., low-CO2 genotype), and thus we term both the physiological and transcriptional data of the 380s-to-750 treatment a plastic response. Accordingly, because we demonstrated adaptation to have taken place in the 750-selected cell lines (12), the physiological and transcriptional data of the 750-selected cell lines are an adaptive response of the high-CO2 genotype. Because the 750-selected, the 380s-to-750, and the 750s-to-380 all exhibited the high-CO2 phenotype (Fig. 1), genes sharing parallel expression profiles among all three treatments represent those that both rapidly responded to increased CO2 as part of the plastic response and subsequently maintained these profiles as part of the adaptive response, making them putative candidates for genetic assimilation (8). These changes provide evidence for genes whose expression may have been canalized (i.e., loss of low-CO2 plasticity) (8), as reflected in the 750s-to-380 condition. Because the cell lines in the 750s-to-380 treatment are the 750-selected cell lines (i.e., the same high-CO2 genotype), the transcriptional and physiological data deriving from this 750s-to-380 treatment are a mixture of the 750-selected cell lines' transcriptional plasticity to low CO2 (e.g., the 308 down-regulated genes in the lower blue portion of the 750s-to-380 treatment circle in the Venn diagram in Fig. 2) and genes whose expression is putatively assimilated as a product of adaptation, in which case expression profiles should be analogous to those of the 750-selected treatment regardless of CO2 concentration (i.e., the 45 down-regulated gene portion of the Venn diagram shared by the 750-selected and 750s-to-380 treatments in Fig. 2). Hence, gene expression profiles from the 750-selected and 750s-to-380 treatments derive from the same high-CO2 genotype (i.e., adapted to high CO2), and thus we describe these shared expression profiles as part of an adaptive response reflecting the loss of physiological CO2 plasticity (i.e., genetic assimilation). The transcriptional pool shared among the high-CO2 genotype treatments (750-selected and 750s-to-380) may therefore contain mechanisms that drive the maintenance of the high-CO2 phenotype, even after transfer back to ancestral CO2 levels. Accordingly, we characterize gene expression profiles shared between the 750-selected and 750s-to-380 conditions that parallel this phenotypic maintenance, reflecting the loss of low-CO2 sensitivity after long-term, high-CO2 adaptation.

Fig. 2.

Shown are down-regulated GO-enriched pathways relative to the 380-selected treatment and transcriptional profiles of the sigma factors sigC and sigF, as well as Fur proteins. (A) Down-regulated GO-enriched pathways for all the high-CO2 phenotype treatments. (B) Differential expression of transcriptional regulators, with stars representing statistical significance relative to the 380-selected treatment and error bars being SEs.

To examine transcriptional responses, we gently filtered replicate cultures of each treatment growing semicontinuously during the middle of the photoperiod (∼11:00 AM), followed by flash-freezing in liquid nitrogen and storage until processing (SI Materials and Methods). Libraries were then constructed by using equimolar amounts of RNA per library and sequenced on the Illumina HiSeq 2000 (i.e., RNA-Seq), yielding 50-base pair, single-end reads (SI Materials and Methods). Reads were then quality trimmed and mapped onto the IMS101 reference genome, followed by normalization and differential expression analysis using the edgeR package (see SI Materials and Methods for full statistical description). Briefly, libraries were normalized using the trimmed mean of M-values method. M-values are the library size-adjusted log-ratios of counts between the control RNA-Seq library (380-selected) and the treatment of interest; the most extreme 30% of M-values are trimmed before the trimmed mean is calculated. This method attempts to eliminate systematic differences in the counts between the libraries (e.g., RNA pools between treatments) by assuming that most genes are not differentially expressed. Common dispersion was estimated by fitting a generalized linear model (GLM), and differentially expressed genes were determined by fitting the negative binomial GLM followed by a likelihood ratio test. Finally, genes with a Benjamini–Hochberg false discovery rate (FDR) < 0.05 were deemed differentially expressed. General sequencing and differential expression statistics can be found in Tables S1 and S2, and global expression maps can be examined in Fig. S3.
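The trimmed mean of M-values idea can be illustrated with a short sketch. This is a simplified, unweighted version written for illustration only: it is not the edgeR implementation (which also trims on absolute expression and weights genes), the function name is hypothetical, the toy counts are invented, and treating the trim fraction as applying to each end of the sorted M-values is an assumption of this sketch.

```python
import math


def tmm_scaling_factor(sample, reference, trim_fraction=0.3):
    """Simplified, unweighted TMM-style scaling factor of `sample` vs `reference`.

    sample, reference: raw read counts per gene, in the same gene order.
    trim_fraction: fraction of genes dropped from EACH end of the sorted
                   M-values (an assumption of this sketch).
    """
    n_sample = sum(sample)
    n_reference = sum(reference)

    # M-value: library-size-adjusted log2 ratio; genes with a zero count are skipped.
    m_values = sorted(
        math.log2((s / n_sample) / (r / n_reference))
        for s, r in zip(sample, reference)
        if s > 0 and r > 0
    )

    k = int(len(m_values) * trim_fraction)
    kept = m_values[k:len(m_values) - k]
    # Back-transform the trimmed mean from the log2 scale.
    return 2 ** (sum(kept) / len(kept))


# Toy counts (invented): one highly expressed gene inflates the sample library size.
reference_counts = [100] * 10
sample_counts = [100] * 9 + [1000]
factor = tmm_scaling_factor(sample_counts, reference_counts)
effective_library_size = sum(sample_counts) * factor
```

In this toy case the factor is 1000/1900 (about 0.53), so the effective library size of the sample returns to 1,000, matching the reference. The single extreme gene is trimmed away, which is the sense in which the method assumes that most genes are not differentially expressed.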

Table S1.

RNA-Seq statistics

Treatment | Replicate | Library size | No. of reads mapped to genome | Mapping success, % | Approximate genome coverage
380-selected | 1 | 42,766,345 | 32,459,668 | 75.90 | 211
380-selected | 2 | 37,602,118 | 21,933,592 | 58.33 | 142
380-selected | 3 | 49,559,670 | 31,865,307 | 64.30 | 207
380s-to-750 | 1 | 54,536,007 | 35,335,952 | 64.79 | 229
380s-to-750 | 2 | 51,275,763 | 41,927,359 | 81.77 | 272
380s-to-750 | 3 | 51,877,957 | 41,174,051 | 79.37 | 267
750-selected | 1 | 42,770,001 | 37,265,614 | 87.13 | 242
750-selected | 2 | 49,272,478 | 42,002,066 | 85.24 | 273
750-selected | 3 | 48,817,802 | 32,923,568 | 67.44 | 214
750s-to-380 | 1 | 44,784,621 | 38,290,004 | 85.50 | 249
750s-to-380 | 2 | 57,606,174 | 46,913,499 | 81.44 | 305
750s-to-380 | 3 | 59,270,407 | 50,228,170 | 84.74 | 326
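The mapping-success column appears to be the number of mapped reads divided by the library size. As a quick, hedged check of that reading, the sketch below recomputes the percentage for the three 380-selected replicates using the counts copied from Table S1; the function name is hypothetical.

```python
# (treatment, replicate, library size, reads mapped to genome) copied from Table S1
rows = [
    ("380-selected", 1, 42_766_345, 32_459_668),
    ("380-selected", 2, 37_602_118, 21_933_592),
    ("380-selected", 3, 49_559_670, 31_865_307),
]


def mapping_success(library_size, mapped_reads):
    """Percentage of reads in a library that mapped to the reference genome."""
    return 100.0 * mapped_reads / library_size


# Table S1 reports 75.90, 58.33, and 64.30 for these three rows.
percentages = [round(mapping_success(size, mapped), 2) for _, _, size, mapped in rows]
```

The recomputed values agree with the tabulated percentages for these rows, which supports the assumed definition.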

Table S2.

Number of differentially expressed genes per treatment

Treatment | Up | Down
750-selected | 260 | 122
380s-to-750 | 84 | 29
750s-to-380 | 462 | 398

Down, down-regulated genes; Up, up-regulated genes.

Fig. S3.

Scatter plots of total gene expression per treatment. Red data points denote differentially expressed genes relative to the 380-selected control treatment. Red data points lying above the “0” on the logFC (log fold change) axis represent up-regulated genes, and red data points lying below the “0” represent down-regulated genes. The horizontal “Average logCPM” represents the log counts per million reads mapped to each gene.

We were able to identify processes involved in the short-term (380s-to-750; plastic), long-term (750-selected; adaptive), and correlated (750s-to-380) responses, implicating them in instigating and sustaining the high-CO2 phenotype (i.e., genetic assimilation; Dataset S1). Genes exhibiting significant decreases in expression (down-regulation) in the high-CO2 phenotype (Fig. 2A “Plastic + Adaptive”) relative to low-CO2 phenotype levels were enriched in Gene Ontology (GO) categories spanning broad metabolic processes, most notably sigma factor activity and carbon transport [Fig. 2A, hypergeometric test with Benjamini–Hochberg correction, FDR ≤ 0.1 (33)]. Differential expression of specific transposition types (Fig. 3 and below) was also correlated with the high-CO2 phenotype, suggesting them to be potential targets of CO2 selection or the result of prolonged CO2 exposure. Similarly, genes with increased expression (up-regulation) shared across all three treatments exhibiting the high-CO2 phenotype (380s-to-750, 750-selected, and 750s-to-380) were detected in widespread pathways (Fig. 4 and below), making them candidate genes that may have undergone genetic assimilation in the transition from plasticity to adaptation.
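The Benjamini–Hochberg correction mentioned above can be sketched in a few lines. This is a generic illustration of the procedure, not the authors' pipeline; the pathway names and p-values are hypothetical, and only the 0.1 threshold is taken from the text.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical enrichment p-values for four placeholder pathways.
pathway_p = {
    "pathway_A": 0.001,
    "pathway_B": 0.02,
    "pathway_C": 0.04,
    "pathway_D": 0.30,
}
q_values = dict(zip(pathway_p, benjamini_hochberg(list(pathway_p.values()))))

FDR_THRESHOLD = 0.1  # threshold quoted in the text for GO enrichment
enriched = [name for name, q in q_values.items() if q <= FDR_THRESHOLD]
```

Each adjusted value is the raw p-value scaled by the number of tests over its rank, with a running minimum applied so that a smaller raw p-value never receives a larger adjusted value.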

Fig. 3.

Shown are hierarchical clustering of log2 fold changes of TE centroids differentially expressed in at least one high-CO2 phenotype treatment (A), the distribution of TE copies within different genomic elements (B), and the distribution of TE centroids and their corresponding genome copies (C). (A) Hierarchical clustering of log2 fold changes of differentially expressed TE centroids in at least one high-CO2 phenotype treatment (SI Materials and Methods), resulting in two well-defined clusters: one representing TE centroids with increased average log2 fold changes in the high-CO2 phenotype treatments relative to the 380-selected (blue), and vice versa for the “decreased” cluster (red). Bolded/asterisked labels indicate differentially expressed TE clusters in all high-CO2 phenotype treatments vs. the 380-selected treatment. (B) Pie chart of the distribution of all detected TE copies in the genome. (C) A genome plot of TE centroids and their corresponding copies as well as all detected TE copies in the genome. Going from outside to inside: Track 1 shows TE copies in the genome on the forward strand and is colored according to genomic element. Track 2 is the same, but on the minus strand. Track 3 contains the names and symbols for differentially expressed TE centroids in A relative to the 380-selected treatment. Track 4 shows the distribution of the corresponding copies of differentially expressed TE centroids in track 3 via blue and red colored links. Underlying gray links represent paralogous copies from TE clusters showing no change in expression.

Fig. 4.

Shown are up-regulated GO-enriched pathways for all the high-CO2 phenotype treatments relative to the 380-selected replicates. Significantly up-regulated genes were classified into GO pathways and tested for significant enrichment among the treatments (SI Materials and Methods).

To test for the probability of sharing a given number of differentially expressed genes between treatments by chance alone, pairwise hypergeometric tests were conducted for both up- and down-regulated gene sets. These results revealed very low probabilities that the genes exhibiting parallel expression profiles were shared by chance alone among down-regulated gene sets (Fig. 2A) between the 380s-to-750 and 750s-to-380 (P < 10^-13), the 750-selected and 380s-to-750 (P < 10^-29), and the 750-selected and 750s-to-380 (P < 10^-42). Similarly, low probabilities were also observed for up-regulated gene sets (Fig. 4) between the 380s-to-750 and 750s-to-380 (P < 10^-50), the 750-selected and 380s-to-750 (P < 10^-48), and the 750-selected and 750s-to-380 (P < 10^-79). Because of the low probabilities of sharing this many genes or more between treatments by chance alone, it is likely that some of the downstream effects of these shared expression changes are associated with the plastic and/or adaptive responses to high CO2.
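A pairwise hypergeometric overlap test of this kind can be written directly from the definition of the upper tail. The sketch below is illustrative rather than the authors' code: the function name is hypothetical, the two set sizes are the down-regulated counts from Table S2, and the background gene count and shared-gene count are assumed placeholder values, so the resulting probability is not a result from the study.

```python
from math import comb


def overlap_p_value(population, set_a, set_b, shared):
    """Probability of an overlap of at least `shared` genes when `set_b` genes are
    drawn at random from `population` genes, `set_a` of which are marked
    (upper tail of the hypergeometric distribution)."""
    total = comb(population, set_b)
    upper = min(set_a, set_b)
    tail = sum(
        comb(set_a, k) * comb(population - set_a, set_b - k)
        for k in range(shared, upper + 1)
    )
    return tail / total


BACKGROUND_GENES = 4000   # assumed placeholder, not stated in the text
DOWN_750_SELECTED = 122   # Table S2
DOWN_750S_TO_380 = 398    # Table S2
SHARED_DOWN = 45          # assumed placeholder overlap for illustration

p = overlap_p_value(BACKGROUND_GENES, DOWN_750_SELECTED, DOWN_750S_TO_380, SHARED_DOWN)
print(f"P(overlap >= {SHARED_DOWN}) = {p:.3g}")
```

Exact integer arithmetic with `math.comb` avoids underflow in the individual terms, and the single division at the end converts the tail count into a probability.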

Sigma Succession Underlying Plasticity-Mediated Adaptation.

Our data show that increased CO2 correlates with lower expression of RNA polymerase sigma factors sigC (Tery_1956; group 2) and sigF (Tery_3916; group 3) (Fig. 2B; ref. 34). Differential regulation of sigma factors, “sigma switching,” aids in both stress responses and adaptation via transcriptional initiation of gene sets specific to particular environmental or internal cellular changes (35). For example, sigC transcripts have been shown to increase under short-term nitrogen limitation in diazotrophic cyanobacteria (31), which is consistent with the simultaneous decrease in sigC (ortholog to Anabaena sigC, reciprocal best blast hit, evalue < 1e-10) expression and increase in nitrogen fixation in all high-CO2 phenotypes (Fig. 2B, red bars).

Furthermore, homologs of the ferric uptake regulator protein, Fur, have been shown to bind to the promoter region of sigC in cyanobacteria, implicating sigC to have a connective role in both nitrogen and iron homeostasis, and potentially oxidative stress as well (36). Accordingly, the ferric uptake regulators, furA (Tery_1958) and furB (Tery_1953) genes (orthologs to Anabaena, reciprocal best blast hit, evalue < 1e-10), exhibited parallel decreases in expression with sigC under prolonged high CO2 (Fig. 2B). In contrast, a fur paralog (Tery_3404) showed no changes in expression after prolonged exposure to high CO2 in replete iron. These transcriptional reductions of fur homologs in high CO2 may enhance tetrapyrrole production (Fig. 4 and below), as observed in the cyanobacterial diazotroph Anabaena (37). Furthermore, furA (Tery_1958) and furB slightly increased expression in the 750s-to-380 treatment compared with the 750-selected treatment, suggesting a short-term response to low CO2 exposure relative to their decreased expression seen in the 750-selected under high CO2. Together, these data provide some evidence of the coregulation of sigC and specific fur genes as part of both short-term plastic and long-term adaptive responses. Additionally, the maintenance of sigC down-regulation in the transition from the low- to high-CO2 genotype (Figs. 1 and 2B) provides strong evidence for genetic assimilation of this sigma factor and its targets.

Conversely, sigF (ortholog to Synechocystis sigF, reciprocal best blast hit, evalue < 1e-10) transcription was only significantly decreased in 750-selected and 750s-to-380 treatments (high-CO2 genotype), suggesting significant down-regulation of sigF to be primarily involved in adaptation rather than initiation of the high-CO2 phenotype (Fig. 2B, purple bars). sigF is involved in a variety of cellular processes and has been shown to target other transcriptional regulators such as rsfA in the Gram-positive Bacillus subtilis (38), as well as a phytochrome-like histidine kinase in the cyanobacterium Synechocystis PCC6803 (34). Interestingly, an IMS101 hypothetical protein (Tery_2530), containing an rsfA domain (BLASTx, default settings), as well as a PAS/PAC signal transduction histidine kinase (Tery_4221) containing several overlapping portions of conserved domains including bacteriophytochrome (COG4251), phosphate regulon sensor kinase (PhoR; TIGR02966) (39), and NtrY (COG5000) also exhibited parallel down-regulation with sigF (Dataset S1). Intriguingly, NtrY modulates nifA expression that specifically controls expression of N2-fixing nif genes in the symbiotic diazotroph Azorhizobium caulinodans ORS571 (40). However, no IMS101 nifA homologs to that of ORS571 were detected. PAS-containing histidine kinases have also been shown to bind to a wide array of cofactors and are important signaling modules that monitor changes in light, redox potential, small ligands, and cellular energy (41, 42). Hence, Tery_4221 may regulate several different metabolic functions aiding in increased growth and N2 fixation. Regardless, its significantly decreased expression in conjunction with sigF after long-term high CO2 exposure implicates a role in influencing the high-CO2 genotype/phenotype.

Together, these data suggest that the fixation of long-term, increased growth and N2 fixation in 750-selected cell lines is associated with a short-term, plastic response reflected in sigC and other genes whose expression profiles were similar across the 380s-to-750, 750-selected, and 750s-to-380 treatments. Reduced sigC transcription is associated with the initiation of the high-CO2 phenotype, whereas decreased expression of both sigC and sigF ultimately contributes to its adaptive maintenance.

Transposition Regulation in Plasticity and Adaptation.

In addition to sigma switching, shifts in transposable element (TE) regulation have been shown to be involved in both environmental plasticity and adaptation (43). Numerous partial TE genome sequences are marks of neutral maintenance based on TE deletion bias (43), and IMS101 is indeed enriched in TEs and TE pseudogenes (30). Accordingly, it has been shown that repetitive elements such as TEs can selectively mediate genome plasticity, whereas partial TEs can also act to inhibit transposition, which may be partly why both IMS101 and natural populations of Trichodesmium have retained numerous repetitive elements and TE pseudogenes (30). Hence, the long-term maintenance of TEs and their expression in situ may indicate an important role for both the plasticity and adaptation of Trichodesmium to environmental change in situ. However, reliably quantifying expression of repetitive DNA sequences (repeats) such as TEs (e.g., insertion sequences) remains a significant challenge for next-generation sequencing methods using short-read technology (e.g., Illumina sequencing typically 50–150 base pairs) because of difficulty in mapping repetitive sequences to a single genomic location (e.g., multireads; SI Note 1) (44).

To try to circumvent these challenges, we developed a method to quantify the expression of TE clusters by binning TE sequences with ≥70% identity into clusters (45) and quantifying the expression of each cluster across treatments (see SI Note 1 for detailed methodology). These analyses resulted in several (5 of 16) of the clusters exhibiting significantly different mean expression values between the low-CO2 (i.e., 380-selected) and high-CO2 (i.e., 380s-to-750, 750-selected, and 750s-to-380) phenotypes (Fig. 3 A and C, bolded/asterisk labels). We also identified a cluster (TE_67) that initially responded to high CO2 in the plastic response (380s-to-750) and maintained its expression profile in the adaptive response (750-selected and 750s-to-380), consistent with other genes’ expression (e.g., sigC), showing evidence for genetic assimilation (Fig. S2). Other TE clusters’ mean expression was also different from the 380-selected treatments in other high-CO2 phenotype conditions, suggesting consistent responses to CO2 across conditions (Fig. S2).

Fig. S2.


Bar graph of select TE clusters that shared significantly different (edgeR; SI Materials and Methods) mean expression profiles between at least two high-CO2 phenotype treatments relative to the 380-selected cell lines. Stars represent statistically significant up-regulation and circles down-regulation. Error bars are SEs.

Additionally, 77% (53 of 69) of total TE clusters (n = 69; Dataset S2) showed no differences in expression indicating maintained cluster expression irrespective of phenotype, which suggests widespread TE activity devoid of selection (Fig. 3C, light gray links). Together, these patterns corroborate the maintenance of transposition as a result of neutral processes (43) and/or weak selection (46, 47) where stable coexistence occurs between TEs and the host genome. Furthermore, ∼75% of all detected TE paralogs reside within either genic or pseudogenic bodies (Fig. 3B), suggesting pseudogenization as a mechanism for generating degenerate TE genome copies. Upon plotting the locations of all differentially expressed centroid sequences along with their corresponding paralogs (Fig. 3C), some TE clusters contained numerous copies with widespread distributions across the genome (e.g., TE_30 and TE_56), whereas others contained only one or two copies (e.g., TE_12 and TE_40). The mechanisms involved in the differing degrees of TE cluster proliferation remain unknown, but these potential TE-controlling mechanisms may also contribute to the high genome conservation observed between IMS101, other isolates, and natural populations (30). In summary, the fact that most TE clusters show no changes in expression between the low- and high-CO2 phenotypes suggests that these are being maintained under neutral processes and/or weak selection, whereas the few that did exhibit differences between phenotypes are potential candidates under selection. Hence, these data implicate differential regulation of certain transposition types to be involved in or caused by both plastic and adaptive high-CO2 phenotypes (see SI Note 1 for more discussion).

Functional GO-Enriched Transcription in Plasticity and Adaptation.

The plastic response (i.e., 380s-to-750) exhibited significant GO enrichment of tetrapyrrole biosynthesis driven by the up-regulation of Tery_3684 (Anabaena sp. wa102 homolog, hemB, AA650_24065). These expression profiles are consistent with prior observations of hemB induction via decreases in Fur transcription in the cyanobacterial nitrogen-fixer Anabaena (37). González et al. (37) also showed dual roles of Fur regulation in some heme proteins involved in tetrapyrrole biosynthesis, including transcriptional repression (e.g., hemB, hemC, and ho1) and activation (e.g., hemK and hemH). However, other hem genes showed either broad variability or no changes in expression and seemed to be regulated independently of Fur, leading the authors to suggest each hem gene to be under different regulatory mechanisms. Similar to Anabaena, we detected the up-regulation of hemB, whereas other detected hem genes showed either variable or no changes in expression. Additionally, the significant fraction of shared up-regulated genes between the 380s-to-750 and the 750s-to-380 (hypergeometric test, P < 10^−50) was enriched in GTP and cytochrome b6f complex homologs; the cytochrome b6f complex transfers electrons from PSII to PSI. The enrichment of PSII light-harvesting and electron transport metabolisms was also consistent with previously observed decreases in PSI:PSII ratios under short-term exposure to high CO2 (20) (Fig. 4). Together, shifts in transcription of genes involved in electron flow deriving from PSII seem to be part of a rapid plastic response to general fluctuations in CO2 regardless of sign. Interestingly, all high-CO2 phenotype treatments demonstrated slightly different enrichments of photosynthesis GO subpathways, but they all shared up-regulation of proteins involving electron flow and light reactions, suggesting these transcriptional changes to be important in the genetic assimilation of the high-CO2 genotype/phenotype.

Hutchins et al. (2007) and Levitan et al. (2007) observed no significant changes in either photosynthetic rates or photochemical activity of PSII, respectively, between short-term, low- and high-CO2 treatments in IMS101. These observations led them to suggest that increased growth and N2 fixation are energized from decreased energetic demands in other cellular processes (e.g., alleviation of carbon limitation) rather than increased photosynthetic electron flow (17, 20). However, Levitan et al. (2007) also observed lower PSI:PSII ratios under high CO2, indicating decreased investment in PSI biosynthesis generally consistent with our expression results (see above), and leading them to hypothesize that a reduction in iron-heavy PSI would free up available Fe for nitrogenase (20). Our results generally support these observations through decreases in carbon transport and increases in PSII-associated gene expression (Figs. 2 and 4). It is worth noting that the discrepancy between the increases in PSII-associated gene expression and the lack of observed changes in photosynthetic and PSII activity may be due to several possibilities, including time of sampling and posttranscriptional, posttranslational, and protein degradation regulation (48). Future studies investigating changes in gene expression and/or photosynthesis-related protein abundances involved in photosystem electron flow should include several diel sampling points with simultaneous measurement of photosynthetic rates to determine the specific roles of the photosystem expression and activity associated with increased growth and N2 fixation.

GO-enriched groups in both plastic and adaptive responses (Fig. 4, Center, Plastic + Adaptive) included enhanced energy production (Fig. 4, blue symbols), carbon fixation (orange symbols; consistent with ref. 17), nitrogen storage (orange symbols), and carbon storage (magenta symbols). As such, these enriched metabolisms highlight pathways potentially influenced by CO2 concentrations on short timescales, which appear to have been subsequently fixed upon prolonged CO2 exposure in a stable, nutrient replete environment.

Teasing Apart the Molecular Succession Underlying Potential Genetic Assimilation.

Although the physiological transition from plasticity to adaptation was phenotypically neutral in the selection environment, the same high-CO2 phenotype shared between the low-CO2 (380s-to-750) and high-CO2 (750-selected and 750s-to-380) genotypes enabled identification of expression changes that were initially involved in short-term increased growth and sustained in long-term adaptive maintenance, thereby corroborating gene expression canalization via genetic assimilation. These gene expression changes may cause or be part of other phenotypic changes, so our phenotypically indistinguishable (growth and N2 fixation) low- and high-CO2 genotypes may in fact differ through other expressed traits unmeasured in this study. Future studies analyzing genes exhibiting diel patterns of expression should also include sampling at alternative time points to see how diel expression changes correlate with the observed physiology. However, other portions of the metabolic pool exhibited clear expression differences solely in the high-CO2 genotype (750-selected and 750s-to-380 treatments) relative to the 380-selected, suggesting these gene expression changes to be specific to the long-term maintenance of the high-CO2 phenotype, even in the ancestral environment. Several other lines of evidence also corroborate these molecular and physiological parallels.

First, of all differentially expressed genes, it is unlikely that the number of genes exhibiting consistent expression between the 750-selected and 750s-to-380 conditions was shared by chance alone for both the down-regulated (hypergeometric test, P < 10^−42) and up-regulated (P < 10^−79; Figs. 2 and 4, Adaptive sections) fractions, suggesting their expression to be nonrandomly associated with this genotype. Second, the strong statistical support for the differential down-regulation of specific sigma factors and other metabolic genes (see above) in the plastic vs. adaptive responses suggests differing roles in short- and long-term CO2 phenotypes, respectively. It is worth noting that changes in expression of these sigma factors and other genes may either be part of the mechanisms producing the observed phenotype or secondary effects after upstream metabolic/mechanistic processes responding to high CO2. For example, differential regulation of sigC may either help mediate or be a product of the transition from plasticity to adaptation, whereas differences in sigF expression may be primarily associated with the adaptation (Fig. 2).

Additionally, the plastic response of Trichodesmium to high CO2 in stable light and replete nutrients may initially shield it from adaptation on short timescales because an optimum phenotype is achieved by plasticity alone. However, upon prolonged selective CO2 pressure, initial short-term, plastic responses appear to become fixed if held under constant conditions, which in this case seems to have led to a loss of the low-CO2 phenotype (i.e., genetic assimilation) (Fig. 1, blue bars). Underlying this physiological trend are canalized pathways in both down-regulation (Fig. 2) and numerous up-regulated pathways (Fig. 4). Although our physiological and transcriptional data conform to prior criteria and observations set forth by other independent studies observing genetic assimilation (8, 49, 50), future studies can include various time series assays (e.g., reaction norms and functional genomics), which may further elucidate underlying mechanisms contributing to the adaptive walk of genetic assimilation as these mechanisms currently remain unclear (49). For example, in a theoretical modeling study, Kronholm and Collins (2015) suggest that one potential genetic assimilation mechanism may be that an epigenetic mutation produces an optimal (plastic) phenotype that is then later replaced by a genetic mutation to maintain it (now an adaptive phenotype) (49). This replacement thus results in a trait that is now environmentally robust to the environmental fluctuation that first triggered it (in this case, CO2; Fig. 1).

In summary, the adaptation of IMS101 to high CO2 in stable light and replete nutrients is mediated through an initial plastic response reflected in corresponding changes in both phenotype and gene expression. Our data suggest upstream regulatory elements (e.g., sigma factors) and differential regulation of transposition clusters to influence both short- and long-term CO2 responses. The maintenance of the adaptive phenotype in the ancestral condition may be influenced by both plasticity-derived gene expression as well as canalized gene expression after adaptation. Additionally, increased transcription of photosystem electron flow and its mechanical components [e.g., histidine enrichment (51, 52)], in concert with the differential expression of potential iron and redox sensing regulation, possibly suggests constant light and replete iron to be synergistically acting with enhanced CO2 to initiate and maintain increased growth and N2 fixation. Indeed, the short-term achievement of the plastic high-CO2 phenotype has been demonstrated in natural populations when conditions were appropriate (1, 25), but our observed form of laboratory adaptation, defined by the apparent loss of the low-CO2 phenotype, will likely depend on both genotype (26) and the availability of in situ compensatory environmental factors [e.g., replete phosphorus and iron (53)] to maintain increased growth and N2 fixation. Future efforts will involve alternative DNA-sequencing technologies (i.e., long-read DNA sequencing and optical mapping) to identify potentially adaptive genomic rearrangements under high CO2 to circumvent analysis issues produced by short-read sequencing (see above).

Because the plastic high-CO2 phenotype seemed to have been fixed upon adaptation in IMS101, optimal plastic phenotypes that may have initially shielded adaptive genotypes can be acted on by natural selection to facilitate adaptation upon longer selection. Varying physiological results have been observed in other algal systems in which the plastic response is either maintained during evolution as in this study or ultimately reversed by adaptation (13, 54). These types of data provide environmentally relevant genetic context to physiological adaptation (see SI Note 2 for more discussion) and future efforts examining both genetic and epigenetic effects on adaptation should provide insight into potential mechanisms driving ultimate differences in expression levels between experimental conditions (9, 49). In summary, this study supports both past observed short-term CO2 responses and contributes evolutionary observations corroborating genetic assimilation in the globally distributed and biogeochemically important Trichodesmium.

SI Note 1

Multiread mapping can result in erroneous or biased read counts per repetitive element, which inevitably skews downstream quantification. Several statistical methods have been developed in attempts to more accurately assign multireads to specific genomic locations, but nearly all of them focus on novel isoform detection in eukaryotic genomes (reviewed in ref. 44). Hence, to analyze expression of paralogous TEs associated with the high-CO2 phenotype in the TE-heavy IMS101 genome (30), we developed a method to quantify relative transcription of TE clusters irrespective of genomic position. Briefly, TE sequences were obtained from Walworth et al. (30) and clustered at 70% identity by using USEARCH (59), in which representative sequences (centroids) for each of the 69 clusters detected were identified (Dataset S2). Next, BLASTn (60) was used to search for all paralogous sequences for each centroid sequence (cluster) within the IMS101 genome (SI Materials and Methods) followed by mapping of RNA-Seq reads (55) to all sequences within every cluster. Then, read counts for all sequences within a cluster were summed to produce aggregated read counts per cluster followed by normalization and differential expression analysis using edgeR (56). Finally, any cluster differentially expressed in at least one high-CO2 phenotype treatment (380s-to-750, 750-selected, or 750s-to-380) relative to the 380-selected reference was selected for further analysis, and log-twofold changes were calculated for each high-CO2 phenotype treatment of each cluster relative to the 380-selected condition.

To identify TE-cluster expression correlated to the high-CO2 phenotype, hierarchical clustering with multiscale bootstrap resampling (replicates = 1,000) of log-twofold changes for all high-CO2 phenotype treatments (Dataset S2) was conducted (45), resulting in two well-defined groups of TE clusters whose average high-CO2 phenotype expression was either increased or decreased relative to the 380-selected reference (Fig. 3A). Because of the difficulty in quantifying location-specific TE expression (see above), in addition to the lack of knowledge of TE regulation in IMS101, this method was primarily developed to conservatively correlate TE types (clusters) to increased growth and N2 fixation. It also serves as a hypothesis-generating tool to identify transcriptionally responsive TE types to high CO2 for downstream genome-wide studies correlating TE transcription to actual transposition. To test for the strength of positive or negative associations of specific TE clusters to the high-CO2 phenotype, Welch’s t tests were conducted between the 380-selected biological replicates (n = 3) and all high-CO2 phenotype replicates (n = 9) for each TE cluster exhibiting differential expression in at least one high-CO2 phenotype treatment (Fig. 3A; n = 16 clusters). These results indicate that expression between the high-CO2 phenotype treatments relative to the 380-selected phenotype for these TE clusters was generally consistent in sign as a result of high CO2 exposure, implicating the simultaneous enrichment of specific types of TE elements along with the concurrent repression of others underlying increased growth and N2 fixation. No apparent trends were observed between increased/decreased TE-cluster expression and strand, genic/intergenic body, cluster copy number, or location.

SI Note 2

In situ genomic/transcriptomic/proteomic surveys provide snapshots of microbial metabolic potential and activity, which are then typically correlated to fluctuating physicochemical parameters in the surrounding environment. Hence, these subcellular profiles presumably represent plastic processes interacting with a dynamic environment. Presently, there are very few data connecting plastic mechanisms (from genes to traits) to evolutionary mechanisms (3). In other words, little is known about the genes/pathways/traits that natural selection will act upon in response to global change stressors over time. Because natural microbial populations are currently experiencing changing habitats at different rates depending on geographic location, one goal is to identify natural populations adapting to global change in situ. By conducting experimental evolution studies using global change stressors such as this one, we can identify pathways/traits that respond to high CO2 on both short and long timescales to establish links between plastic parts of the metabolism and how this plasticity may change or stay the same upon long-term exposure to CO2. Our finding that certain gene expression patterns and other traits (e.g., N2 fixation) have the potential to be fixed from plasticity to adaptation provides nuanced details to inform genomic surveys of natural populations experiencing global change factors in situ. Here we are focusing on both those metabolic pathways that have the potential to respond and evolve in the future ocean and those that do not seem to be directly affected by CO2. Importantly, Schaum and Collins (13) showed that plastic traits can reverse in sign upon adaptation in a marine phytoplankton, data that contrast with our findings showing traits being fixed from plasticity to adaptation. We believe that these system-specific differences will be important to determine the “winners and losers” in global ecosystems as they experience global change, which may improve our interpretations of metaomics surveys containing these organisms over time as global change continues to impact populations. For more information, see Collins et al. (3).

SI Materials and Methods

Culturing and Physiology.

T. erythraeum strain IMS101 (IMS101) was originally isolated off the coast of North Carolina in ∼1990 (61) and reisolated as a single clone from the original culture in 2000 at Woods Hole Oceanographic Institution in Massachusetts. After obtaining this clonal culture, we maintained it in a modified Aquil medium (12, 53) containing standard vitamins and trace metals with 500 nM iron and 20 µM phosphate and began bubbling with CO2 in ∼2008. The medium was devoid of fixed nitrogen, and cultures were grown under a light intensity of 120 µmol photons per meter squared per second with a 12:12 light:dark cycle in 26 °C incubators. Semicontinuous culturing methods were used on six replicate cell lines per treatment to allow for the measurement of CO2 effects during acclimated, steady-state growth. Each replicate was diluted individually based on the growth rate calculated for the respective replicate (12, 17, 26), and cultures were kept optically thin to avoid self-shading, nutrient limitation, and perturbations to targeted CO2 levels. Total population size in each biological replicate was ∼7 × 10^5 to 1.1 × 10^6 cells, depending on growth stage, based on microscopic cell counts. Growth rates were calculated from microscopic cell counts for reported values, and in vivo chlorophyll measurements were used for growth rate calculations for semicontinuous dilutions in real time, with no significant differences between the two when compared (12).

Cultures were continuously bubbled with prepared air/CO2 mixtures (Praxair) to maintain stable CO2 concentrations of 380 (380-selected) and 750 (750-selected) µatm CO2, respectively, for 4.5 y (see below for carbonate system measurements). Next, a pair of short-term 2-wk CO2 reciprocal transfer experiments were conducted using these long-term cultures where the 380-selected cell lines were transferred to 750 µatm CO2 and vice versa for the 750-selected cell lines, with experimental conditions and dilution frequencies identical to those of the long-term cultures. At the end of the 2-wk incubation (four to five generations), growth and N2 fixation were measured in the middle of the photoperiod for all treatments as described (12). For all experiments, significant differences were determined by using a two-way ANOVA (selection × assay) followed by pairwise Tukey’s HSD post hoc multiple comparisons tests. The data were determined to follow a normal distribution via inspection of a QQ plot. All physiological analysis was conducted in R (R Core Team 2014).
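As an illustration of how a growth rate from cell counts relates to the number of generations elapsed in a transfer experiment, the following minimal Python sketch assumes simple exponential growth between two counts. The exponential model, the function names, and the example counts are assumptions made for illustration; the source does not state the exact formula used.

```python
import math

def specific_growth_rate(n_start, n_end, days):
    """Specific growth rate (per day), assuming exponential growth
    between two cell counts taken `days` apart (an assumed model)."""
    if n_start <= 0 or n_end <= 0 or days <= 0:
        raise ValueError("counts and elapsed time must be positive")
    return math.log(n_end / n_start) / days

def generations_elapsed(mu, days):
    """Number of doublings over `days` at specific growth rate `mu`."""
    return mu * days / math.log(2)

# Hypothetical counts: a culture that doubles over three days.
mu = specific_growth_rate(7.0e5, 1.4e6, 3.0)
print(mu, generations_elapsed(mu, 14.0))
```

Under this assumed model, four to five generations over a 14-d incubation would correspond to specific growth rates of roughly 0.20 to 0.25 per day.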

Carbonate System Measurements.

The carbonate buffer system was analyzed as described (12) to ensure target CO2 concentrations were achieved throughout the experiment. The seawater carbonate buffer system was analyzed in each replicate periodically throughout the entire experiment including at each sampling time point. pH was measured by using an Orion 5 STAR pH meter (Thermo Fisher Scientific) with a combined glass electrode, and the meter was calibrated with National Bureau of Standards buffer solutions of pH 4, 7, and 10. Dissolved inorganic carbon was measured with CO2 coulometry (model CM 140, UIC). The CO2 partial pressure was calculated from the dissolved inorganic carbon and measured pH values by using the CO2SYS software (12, 17, 26). Photosynthesis and respiration had minimal effects on the seawater carbonate system because of the close control of pCO2 equilibrium via constant bubbling as measured pCO2 was always within ∼5% of the two targeted values.

RNA Isolation and Extraction for Illumina Sequencing.

Samples for RNA were taken concomitantly with growth and N2 fixation measurements in the middle of the photoperiod for each replicate as described (30). Genome coverage and read mapping statistics can be found in Table S1. Briefly, cells were swiftly and gently filtered at 11:00 AM onto 5-µm polycarbonate filters (Whatman), immediately flash-frozen, and stored in liquid nitrogen until RNA extraction. RNA was extracted from three randomly chosen biological replicates per treatment by using the Ambion MirVana miRNA Isolation Kit (Thermo Fisher Scientific) in an RNase-free environment according to the manufacturer’s instructions, followed by two incubations with Ambion’s Turbo DNA-free kit to degrade trace amounts of DNA. Furthermore, multiple chromosomal segments recruited zero reads, indicating that any contaminating DNA was either removed or present at negligible levels. Ribosomal RNA was removed using Epicentre’s Ribo-zero Magnetic Bacteria kit (MRZMB126), the remaining RNA was quantified by using the Qubit RNA BR Assay Kit (Life Technologies), and cDNA and library construction was done according to the USC Epigenome Center online protocols of Illumina’s Tru-Seq kit (epigenome.usc.edu/services/nextgen/making_libraries.html). Multiplexed libraries were sequenced using Illumina Hi-Seq2000 yielding single-end 50-base pair read libraries.

Differential Expression Analysis.

Raw fastq files were quality trimmed and filtered as described (30) and mapped onto IMS101, IMG-called genes (https://img.jgi.doe.gov/) by using Bowtie2 (Version 2.2.6) (55) with default settings followed by differential expression analysis using edgeR (56). Briefly, genes containing >1 read per million reads in at least three samples were retained, followed by library normalization using the trimmed mean of M-values method. M-values are the library size-adjusted log-ratio of counts between the control RNA-Seq library (380-selected) and the treatment of interest (62). The most extreme 30% of the M-values were trimmed, and the mean of remaining M-values was calculated. This trimmed mean is the log-normalization factor between two libraries, which attempts to eliminate systematic differences in the counts between the libraries (or RNA pools between treatments) by assuming that most genes are not differentially expressed. Common dispersion was estimated by fitting a generalized linear model (GLM) using the estimateGLMCommonDisp() function, and differentially expressed genes (except transposases) were determined by fitting the negative binomial GLM using glmFit() followed by likelihood ratio tests with glmLRT(). Genes with a Benjamini–Hochberg FDR < 0.05 were deemed differentially expressed. Global differential expression values and plots can be found in Table S2 and Fig. S3, respectively. These methods used by edgeR were recently shown to be in good agreement with other popular RNA-Seq statistical packages, as well as microarray data, thus supporting the robustness of our statistical methods (63).
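The filtering and normalization steps described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the edgeR implementation: it computes an unweighted trimmed mean of M-values only, whereas edgeR additionally trims on absolute expression and weights the mean. The function names are hypothetical, and interpreting the 30% trim as a fraction removed from each tail is an assumption about the wording.

```python
import math

def passes_cpm_filter(counts, lib_sizes, min_cpm=1.0, min_samples=3):
    """True if a gene exceeds `min_cpm` reads per million in at least
    `min_samples` libraries."""
    n_above = sum(1 for c, size in zip(counts, lib_sizes)
                  if c * 1e6 / size > min_cpm)
    return n_above >= min_samples

def simplified_tmm_factor(treatment, reference, trim=0.30):
    """Unweighted trimmed mean of M-values between two libraries,
    returned on the linear scale. `trim` is the fraction removed from
    EACH tail (an assumption); genes with a zero count are skipped."""
    if not 0 <= trim < 0.5:
        raise ValueError("trim must be in [0, 0.5)")
    total_t, total_r = sum(treatment), sum(reference)
    # M-value: log2 ratio of library-size-adjusted counts.
    m_values = sorted(
        math.log2((t / total_t) / (r / total_r))
        for t, r in zip(treatment, reference)
        if t > 0 and r > 0
    )
    if not m_values:
        raise ValueError("no genes with nonzero counts in both libraries")
    k = int(len(m_values) * trim)
    kept = m_values[k:len(m_values) - k]
    return 2 ** (sum(kept) / len(kept))

# Toy libraries: one highly expressed gene inflates the treatment library size.
reference = [10] * 10
treatment = [10] * 9 + [1000]
print(simplified_tmm_factor(treatment, reference))
```

In the toy example the single outlier gene is trimmed, so the factor reflects the shift shared by the remaining genes, which is the behavior the assumption "most genes are not differentially expressed" is meant to deliver.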

Venn diagrams were produced by using differentially expressed gene lists per treatment (57) to determine both shared and exclusive genes between treatments. The “phyper” function in R (R Core Team 2014) was used to determine the probability of sharing n or more genes (where n is a number within the shared portions of the Venn diagrams; Figs. 2 and 4) by chance between two treatments.
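The probability of two gene lists sharing n or more genes by chance is the upper tail of a hypergeometric distribution. The Python sketch below computes it exactly with integer binomial coefficients; the function name, argument names, and gene counts are hypothetical. In R, the same quantity is commonly obtained with phyper(n - 1, ..., lower.tail = FALSE), although the source does not state the exact call that was used.

```python
from math import comb

def overlap_upper_tail(shared, size_a, size_b, universe):
    """P(overlap >= shared) when a list of `size_b` genes is drawn at
    random from `universe` genes, of which `size_a` belong to list A."""
    if not (0 <= size_a <= universe and 0 <= size_b <= universe):
        raise ValueError("list sizes must lie between 0 and universe")
    max_overlap = min(size_a, size_b)
    numerator = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(max(shared, 0), max_overlap + 1)
    )
    return numerator / comb(universe, size_b)

# Hypothetical numbers: 4,000 expressed genes, two lists of 300 and 250
# differentially expressed genes sharing 120.
print(overlap_upper_tail(120, 300, 250, 4000))
```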

GO Enrichment Analysis.

GO annotations for Trichodesmium were downloaded from the Genome2D web server (pepper.molgenrug.nl/index.php/bacterial-genomes). Next, the “phyper” function in R (R Core Team 2014) was used to test for significant enrichment of GO categories among the treatments and P values were corrected with the Benjamini–Hochberg method (58) using the “p.adjust” function (P ≤ 0.1) (33). Finally, genes in enriched GO categories were manually checked.
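For readers unfamiliar with the Benjamini–Hochberg correction applied to the enrichment P values, a minimal Python sketch of the step-up adjustment follows. It is intended to mirror the behavior of R's p.adjust with method = "BH"; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw P values for four GO categories.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
enriched = [i for i, q in enumerate(adjusted) if q <= 0.1]
print(adjusted, enriched)
```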

TE Expression.

TE sequences were downloaded from Walworth et al. (30) and clustered at 70% identity by using USEARCH (59), which yielded representative centroid sequences for each of the resulting 69 clusters. Next, BLASTn (60) was used to search for all paralogous sequences for each centroid (cluster) within the IMS101 genome with e-value ≤ 1e-5 and a minimum length threshold of ≥ 70% of the original centroid sequence length. Quality trimmed RNA-Seq reads were then mapped (see above) to all paralogous sequences within every cluster, and read counts for all sequences within a cluster were summed to produce aggregated read counts per cluster followed by aforementioned normalization and differential expression (see above). Finally, any cluster differentially expressed in at least one high-CO2 phenotype treatment (380s-to-750, 750-selected, or 750s-to-380) relative to the 380-selected reference was selected for downstream analysis, and log-twofold changes were calculated for each cluster of every high-CO2 phenotype treatment relative to the 380-selected condition. TE clusters exhibiting differential expression in at least two high-CO2 phenotype treatments are graphed in Fig. S2.
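The aggregation step, in which read counts for all paralogous sequences in a cluster are summed before normalization, can be sketched as follows. The data layout, function name, sequence identifiers, and counts are all hypothetical and serve only to show the bookkeeping; clustering, BLAST searches, and read mapping are assumed to have been done upstream.

```python
def aggregate_counts_by_cluster(read_counts, cluster_of):
    """Sum per-sample read counts over all sequences in each cluster.

    read_counts: {sequence_id: [count in sample 1, sample 2, ...]}
    cluster_of:  {sequence_id: cluster_id}
    """
    totals = {}
    for seq_id, counts in read_counts.items():
        cluster = cluster_of[seq_id]
        if cluster not in totals:
            totals[cluster] = [0] * len(counts)
        if len(counts) != len(totals[cluster]):
            raise ValueError("all sequences need the same number of samples")
        totals[cluster] = [a + b for a, b in zip(totals[cluster], counts)]
    return totals

# Made-up example: three paralogs in two clusters, three samples each.
read_counts = {
    "paralog_1": [10, 12, 9],
    "paralog_2": [3, 0, 4],
    "paralog_3": [7, 8, 6],
}
cluster_of = {"paralog_1": "cluster_A", "paralog_2": "cluster_A",
              "paralog_3": "cluster_B"}
print(aggregate_counts_by_cluster(read_counts, cluster_of))
```

Summing within clusters sidesteps the need to assign each multiread to a single genomic copy, at the cost of losing location-specific information.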

Hierarchical clustering with multiscale bootstrap resampling (replicates = 1,000) of log-twofold changes for all high-CO2 phenotype treatments was conducted by using “pvclust” (45) resulting in two well-defined groups of TE clusters whose average high-CO2 phenotype expression was either increased or decreased relative to the 380-selected reference (Fig. 3A). Welch’s t tests assuming heteroscedasticity of high-CO2 phenotype replicates (n = 9) vs. the 380-selected replicates (n = 3) were conducted in Microsoft Excel.
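For reference, the Welch test statistic and its Welch–Satterthwaite degrees of freedom can be computed as in the Python sketch below. The function name and the expression values are hypothetical; only the group sizes (n = 3 vs. n = 9) follow the design described above. Converting the statistic to a P value requires the t distribution, which is omitted here to keep the sketch free of external dependencies.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two samples with possibly unequal variances."""
    if len(group_a) < 2 or len(group_b) < 2:
        raise ValueError("each group needs at least two observations")
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(group_a) / len(group_a)
    se2_b = variance(group_b) / len(group_b)
    t_stat = (mean(group_a) - mean(group_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(group_a) - 1) + se2_b ** 2 / (len(group_b) - 1)
    )
    return t_stat, df

# Made-up expression values for one TE cluster.
low_co2 = [1, 2, 3]                           # n = 3 replicates
high_co2 = [2, 4, 6, 8, 10, 12, 14, 16, 18]   # n = 9 replicates
print(welch_t(low_co2, high_co2))
```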

Materials and Methods

Growth and N2 fixation data were obtained from Hutchins et al. (12). Treatments were analyzed in biological triplicate with both sampling and RNA isolation being conducted as described (30). Raw fastq files were processed as described (30) and mapped onto IMS101, IMG-called genes (https://img.jgi.doe.gov/) using Bowtie2 (Version 2.2.6) (55) with default settings followed by differential expression analysis using edgeR (56).

Venn diagrams were produced by using differentially expressed gene lists per treatment (57). The “phyper” function in R (R Core Team 2014) was used for hypergeometric tests, and P values were corrected with the Benjamini–Hochberg method (58) using the “p.adjust” function (33). Transposable element sequences were downloaded from Walworth et al. (30) and clustered at 70% identity using USEARCH (59) followed by mapping of RNA-Seq reads to each cluster. Hierarchical clustering was conducted with “pvclust” (45), and Welch’s t tests assuming heteroscedasticity were conducted in Microsoft Excel. See SI Materials and Methods for further details.

Supplementary Material

Supplementary File
pnas.1605202113.sd01.xlsx (51.7KB, xlsx)
Supplementary File
pnas.1605202113.sd02.xlsx (38.9KB, xlsx)

Acknowledgments

We thank Ian Ehrenreich and Sinead Collins for insightful discussions. This work was supported by US National Science Foundation Grant OCE 1143760 (to D.A.H., E.A.W., and F.-X.F.).

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Data deposition: The sequences reported in this paper have been deposited in the NCBI Sequence Read Archive database, www.ncbi.nlm.nih.gov/sra (accession no. PRJNA312342).

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1605202113/-/DCSupplemental.

References

  • 1. Hutchins DA, Mulholland MR, Fu F-X. Nutrient cycles and marine microbes in a CO2-enriched ocean. Oceanography (Wash DC). 2009;22(4):128–145.
  • 2. Merilä J, Hendry AP. Climate change, adaptation, and phenotypic plasticity: The problem and the evidence. Evol Appl. 2014;7(1):1–14. doi: 10.1111/eva.12137.
  • 3. Collins S, Rost B, Rynearson TA. Evolutionary potential of marine phytoplankton under ocean acidification. Evol Appl. 2014;7(1):140–155. doi: 10.1111/eva.12120.
  • 4. Boyd PW, et al. Marine phytoplankton temperature versus growth responses from polar to tropical waters: outcome of a scientific community-wide study. PLoS One. 2013;8(5):e63091. doi: 10.1371/journal.pone.0063091.
  • 5. Schaum E, Rost B, Millar AJ, Collins S. Variation in plastic responses of a globally distributed picoplankton species to ocean acidification. Nat Clim Chang. 2012;3(3):298–302.
  • 6. Draghi JA, Whitlock MC. Phenotypic plasticity facilitates mutational variance, genetic variance, and evolvability along the major axis of environmental variation. Evolution. 2012;66(9):2891–2902. doi: 10.1111/j.1558-5646.2012.01649.x.
  • 7. Schaum C-E, Rost B, Collins S. Environmental stability affects phenotypic evolution in a globally distributed marine picoplankton. ISME J. 2016;10(1):75–84. doi: 10.1038/ismej.2015.102.
  • 8. Ehrenreich IM, Pfennig DW. Genetic assimilation: A review of its potential proximate causes and evolutionary consequences. Ann Bot (Lond). 2016;117(5):769–779. doi: 10.1093/aob/mcv130.
  • 9. Elena SF, Lenski RE. Evolution experiments with microorganisms: The dynamics and genetic bases of adaptation. Nat Rev Genet. 2003;4(6):457–469. doi: 10.1038/nrg1088.
  • 10. Lohbeck KT, Riebesell U, Reusch TBH. Gene expression changes in the coccolithophore Emiliania huxleyi after 500 generations of selection to ocean acidification. Proc Biol Sci. 2014;281(1786):20140003. doi: 10.1098/rspb.2014.0003.
  • 11. Lohbeck KT, Riebesell U, Reusch TBH. Adaptive evolution of a key phytoplankton species to ocean acidification. Nat Geosci. 2012;5(5):346–351.
  • 12. Hutchins DA, et al. Irreversibly increased nitrogen fixation in Trichodesmium experimentally adapted to elevated carbon dioxide. Nat Commun. 2015;6:8155. doi: 10.1038/ncomms9155.
  • 13. Schaum CE, Collins S. Plasticity predicts evolution in a marine alga. Proc Biol Sci. 2014;281(1793):20141486. doi: 10.1098/rspb.2014.1486.
  • 14. Travisano M, Vasi F, Lenski RE. Long-term experimental evolution in Escherichia coli. III. Variation among replicate populations in correlated responses to novel environments. Evolution. 1995;49(1):189–200. doi: 10.1111/j.1558-5646.1995.tb05970.x.
  • 15. Sañudo-Wilhelmy SA, et al. Phosphorus limitation of nitrogen fixation by Trichodesmium in the central Atlantic Ocean. Nature. 2001;411(6833):66–69. doi: 10.1038/35075041.
  • 16. Mills MM, Ridame C, Davey M, La Roche J, Geider RJ. Iron and phosphorus co-limit nitrogen fixation in the eastern tropical North Atlantic. Nature. 2004;429(6989):292–294. doi: 10.1038/nature02550.
  • 17. Hutchins DA, et al. CO2 control of Trichodesmium N2 fixation, photosynthesis, growth rates, and elemental ratios. Limnol Oceanogr. 2007;52(4):1293–1304.
  • 18. Sohm JA, Webb EA, Capone DG. Emerging patterns of marine nitrogen fixation. Nat Rev Microbiol. 2011;9(7):499–508. doi: 10.1038/nrmicro2594.
  • 19. Chappell PD, Moffett JW, Hynes AM, Webb EA. Molecular evidence of iron limitation and availability in the global diazotroph Trichodesmium. ISME J. 2012;6(9):1728–1739. doi: 10.1038/ismej.2012.13.
  • 20. Levitan O, et al. Elevated CO2 enhances nitrogen fixation and growth in the marine cyanobacterium Trichodesmium. Glob Change Biol. 2007;13(2):531–538.
  • 21. Barcelos e Ramos J, Biswas H, Schulz KG, LaRoche J, Riebesell U. Effect of rising atmospheric carbon dioxide on the marine nitrogen fixer Trichodesmium. Global Biogeochem Cycles. 2007;21:GB2028.
  • 22. Kranz SA, et al. Combined effects of CO2 and light on the N2-fixing cyanobacterium Trichodesmium IMS101: Physiological responses. Plant Physiol. 2010;154(1):334–345. doi: 10.1104/pp.110.159145.
  • 23. Levitan O, et al. Regulation of nitrogen metabolism in the marine diazotroph Trichodesmium IMS101 under varying temperatures and atmospheric CO2 concentrations. Environ Microbiol. 2010;12(7):1899–1912. doi: 10.1111/j.1462-2920.2010.02195.x.
  • 24. Garcia NS, et al. Interactive effects of irradiance and CO2 on CO2 fixation and N2 fixation in the diazotroph Trichodesmium erythraeum (cyanobacteria). J Phycol. 2011;47(6):1292–1303. doi: 10.1111/j.1529-8817.2011.01078.x.
  • 25. Shetye S, Sudhakar M, Jena B, Mohan R. Occurrence of nitrogen fixing cyanobacterium Trichodesmium under elevated pCO2 conditions in the western Bay of Bengal. Int J Oceanogr. 2013. doi: 10.1155/2013/350465.
  • 26. Hutchins DA, Fu F-X, Webb EA, Walworth N, Tagliabue A. Taxon-specific response of marine nitrogen fixers to elevated carbon dioxide concentrations. Nat Geosci. 2013;6(7):1–6.
  • 27. Langer G, Nehrke G, Probert I, Ly J, Ziveri P. Strain-specific responses of Emiliania huxleyi to changing seawater carbonate chemistry. Biogeosciences. 2009;6:2637–2646.
  • 28. Kremp A, et al. Intraspecific variability in the response of bloom-forming marine microalgae to changed climate conditions. Ecol Evol. 2012;2(6):1195–1207. doi: 10.1002/ece3.245.
  • 29. Reusch TBH, Boyd PW. Experimental evolution meets marine phytoplankton. Evolution. 2013;67(7):1849–1859. doi: 10.1111/evo.12035.
  • 30. Walworth N, et al. Trichodesmium genome maintains abundant, widespread noncoding DNA in situ, despite oligotrophic lifestyle. Proc Natl Acad Sci USA. 2015;112(14):4251–4256. doi: 10.1073/pnas.1422332112.
  • 31. Brahamsha B, Haselkorn R. Identification of multiple RNA polymerase sigma factor homologs in the cyanobacterium Anabaena sp. strain PCC 7120: Cloning, expression, and inactivation of the sigB and sigC genes. J Bacteriol. 1992;174(22):7273–7282. doi: 10.1128/jb.174.22.7273-7282.1992.
  • 32. Caslake LF, Gruber TM, Bryant DA. Expression of two alternative sigma factors of Synechococcus sp. strain PCC 7002 is modulated by carbon and nitrogen stress. Microbiology. 1997;143(Pt 12):3807–3818. doi: 10.1099/00221287-143-12-3807.
  • 33. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):550. doi: 10.1186/s13059-014-0550-8.
  • 34. Imamura S, Asayama M. Sigma factors for cyanobacterial transcription. Gene Regul Syst Bio. 2009;3:65–87. doi: 10.4137/grsb.s2090.
  • 35. Camsund D, Lindblad P. Engineered transcriptional systems for cyanobacterial biotechnology. Front Bioeng Biotechnol. 2014;2(Suppl.):40. doi: 10.3389/fbioe.2014.00040.
  • 36. López-Gomollón S, et al. Cross-talk between iron and nitrogen regulatory networks in Anabaena (Nostoc) sp. PCC 7120: Identification of overlapping genes in FurA and NtcA regulons. J Mol Biol. 2007;374(1):267–281. doi: 10.1016/j.jmb.2007.09.010.
  • 37. González A, Bes MT, Valladares A, Peleato ML, Fillat MF. FurA is the master regulator of iron homeostasis and modulates the expression of tetrapyrrole biosynthesis genes in Anabaena sp. PCC 7120. Environ Microbiol. 2012;14(12):3175–3187. doi: 10.1111/j.1462-2920.2012.02897.x.
  • 38. Wu LJ, Errington J. Identification and characterization of a new prespore-specific regulatory gene, rsfA, of Bacillus subtilis. J Bacteriol. 2000;182(2):418–424. doi: 10.1128/jb.182.2.418-424.2000.
  • 39. Su Z, Olman V, Xu Y. Computational prediction of Pho regulons in cyanobacteria. BMC Genomics. 2007;8(1):156. doi: 10.1186/1471-2164-8-156.
  • 40. Pawlowski K, Klosse U, de Bruijn FJ. Characterization of a novel Azorhizobium caulinodans ORS571 two-component regulatory system, NtrY/NtrX, involved in nitrogen fixation and metabolism. Mol Gen Genet. 1991;231(1):124–138. doi: 10.1007/BF00293830.
  • 41. Narikawa R, Okamoto S, Ikeuchi M, Ohmori M. Molecular evolution of PAS domain-containing proteins of filamentous cyanobacteria through domain shuffling and domain duplication. DNA Res. 2004;11(2):69–81. doi: 10.1093/dnares/11.2.69.
  • 42. Ashby MK, Houmard J. Cyanobacterial two-component proteins: Structure, diversity, distribution, and evolution. Microbiol Mol Biol Rev. 2006;70(2):472–509. doi: 10.1128/MMBR.00046-05.
  • 43. Iranzo J, Gómez MJ, López de Saro FJ, Manrubia S. Large-scale genomic analysis suggests a neutral punctuated dynamics of transposable elements in bacterial genomes. PLOS Comput Biol. 2014;10(6):e1003680. doi: 10.1371/journal.pcbi.1003680.
  • 44. Treangen TJ, Salzberg SL. Repetitive DNA and next-generation sequencing: Computational challenges and solutions. Nat Rev Genet. 2012;13(1):36–46. doi: 10.1038/nrg3117.
  • 45. Suzuki R, Shimodaira H. Pvclust: An R package for assessing the uncertainty in hierarchical clustering. Bioinformatics. 2006;22(12):1540–1542. doi: 10.1093/bioinformatics/btl117.
  • 46. Navarro-Quezada A, Schoen DJ. Sequence evolution and copy number of Ty1-copia retrotransposons in diverse plant genomes. Proc Natl Acad Sci USA. 2002;99(1):268–273. doi: 10.1073/pnas.012422299.
  • 47. Delihas N. Impact of small repeat sequences on bacterial genome evolution. Genome Biol Evol. 2011;3:959–973. doi: 10.1093/gbe/evr077.
  • 48. Vogel C, Marcotte EM. Insights into the regulation of protein abundance from proteomic and transcriptomic analyses. Nat Rev Genet. 2012;13(4):227–232. doi: 10.1038/nrg3185.
  • 49. Kronholm I, Collins S. Epigenetic mutations can both help and hinder adaptive evolution. Mol Ecol. 2016;25(8):1856–1868. doi: 10.1111/mec.13296.
  • 50. Pigliucci M, Murren CJ, Schlichting CD. Phenotypic plasticity and evolution by genetic assimilation. J Exp Biol. 2006;209(Pt 12):2362–2367. doi: 10.1242/jeb.02070.
  • 51. Tang X-S, et al. Identification of histidine at the catalytic site of the photosynthetic oxygen-evolving complex. Proc Natl Acad Sci USA. 1994;91(2):704–708. doi: 10.1073/pnas.91.2.704.
  • 52. Onidas D, Stachnik JM, Brucker S, Krätzig S, Gerwert K. Histidine is involved in coupling proton uptake to electron transfer in photosynthetic proteins. Eur J Cell Biol. 2010;89(12):983–989. doi: 10.1016/j.ejcb.2010.08.007.
  • 53. Walworth NG, et al. Mechanisms of increased Trichodesmium fitness under iron and phosphorus co-limitation in the present and future ocean. Nat Commun. 2016;7:12081. doi: 10.1038/ncomms12081.
  • 54. Reusch TBH. Climate change in the oceans: Evolutionary versus phenotypically plastic responses of marine animals and plants. Evol Appl. 2014;7(1):104–122. doi: 10.1111/eva.12109.
  • 55. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9(4):357–359. doi: 10.1038/nmeth.1923.
  • 56. Robinson MD, McCarthy DJ, Smyth GK. edgeR: A Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26(1):139–140. doi: 10.1093/bioinformatics/btp616.
  • 57. Heberle H, Meirelles GV, da Silva FR, Telles GP, Minghim R. InteractiVenn: a web-based tool for the analysis of sets through Venn diagrams. BMC Bioinformatics. 2015;16(1):169. doi: 10.1186/s12859-015-0611-3.
  • 58. Benjamini Y, Hochberg Y. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J R Stat Soc B. 1995;57(1):289–300.
  • 59. Edgar RC. Search and clustering orders of magnitude faster than BLAST. Bioinformatics. 2010;26(19):2460–2461. doi: 10.1093/bioinformatics/btq461.
  • 60. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–410. doi: 10.1016/S0022-2836(05)80360-2.
  • 61. Prufert-Bebout L, Paerl HW, Lassen C. Growth, nitrogen fixation, and spectral attenuation in cultivated Trichodesmium species. Appl Environ Microbiol. 1993;59(5):1367–1375. doi: 10.1128/aem.59.5.1367-1375.1993.
  • 62. Robinson MD, Oshlack A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 2010;11(3):R25. doi: 10.1186/gb-2010-11-3-r25.
  • 63. Nookaew I, et al. A comprehensive comparison of RNA-Seq-based transcriptome analysis from reads to differential gene expression and cross-comparison with microarrays: a case study in Saccharomyces cerevisiae. Nucleic Acids Res. 2012;40(20):10084–10097. doi: 10.1093/nar/gks804.


Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
