Applied and Environmental Microbiology. 2013 Mar;79(5):1516–1522. doi: 10.1128/AEM.03439-12

Patterns of Nucleotide Diversity of the ldpA Circadian Gene in Closely Related Species of Cyanobacteria from Extreme Cold Deserts

Ka Wai Ng, Stephen B Pointing, Volodymyr Dvornyk
PMCID: PMC3591978  PMID: 23263969

Abstract

In the circadian system of cyanobacteria, the ldpA gene is a component of the input to the clock. We comparatively analyzed nucleotide polymorphism of this gene in populations of two closely related species of cyanobacteria (denoted as Synechococcus species S1 and S2, respectively) from extreme cold deserts in Antarctica, the Canadian Arctic, and Tibet. Although both species manifested similarly high haplotype diversities (0.990 and 0.809, respectively), the nucleotide diversity differed significantly (0.0091 in S1 and 0.0037 in S2). The populations of species S2 were more differentiated (FST = 0.2242) compared to those of species S1 (FST between 0.0296 and 0.1188). An analysis of positive selection with several tests yielded highly significant values (P < 0.01) for both species. On the other hand, these results may be somewhat compromised by fluctuating population sizes of the species. The apparent selection pressure coupled with the pronounced demographic factors, such as population expansion, small effective population size, and genetic drift, may thus result in the observed significant interpopulation differentiation and subsequent speciation of cyanobacteria.

INTRODUCTION

Circadian systems have been identified in many species of animals and plants, and more recently also in microorganisms, particularly in the Cyanobacteria (1). They serve to regulate responses to environmental stimuli, notably the light and dark cycle, which confers a significant adaptive advantage because it allows cells to adjust their metabolism to the light cycle. The Cyanobacteria is an ancient phylum (2), globally distributed and particularly so in arid landscapes that have persisted for millennia (3). This makes cyanobacterial circadian systems of particular importance in evolutionary studies over extended timescales during which the Earth's environment, including light regimen, has changed markedly (4–7). However, the nature of such evolutionary change and its drivers are still poorly understood.

The ldpA gene mediates an input signal to the central oscillator of the circadian system and was first identified in Synechococcus elongatus PCC 7942 (8). It belongs to the family of ferredoxins and, in addition to the core HycB domain, possesses two conserved terminal domains, which are unique to this gene (8, 9). The deduced LdpA protein has two Fe-S motifs, probably 3Fe-4S and 4Fe-4S (8). These motifs were suggested to play an important role in regulating the circadian input through sensing changes in intracellular redox state in response to light intensity (10). The phylogenetic tree of the LdpA proteins in cyanobacteria features two major distinct clades that correspond to the two types of the circadian system: kaiABC, which features all kai genes, and kaiBC, which lacks kaiA (7, 9). The ldpA genes of both clades are under similar evolutionary constraints, which suggests that they have similar functions (9).

Previous studies of circadian genes in cyanobacteria suggested that their evolution was governed by many factors, such as lateral transfers, duplications, and selection (4–7). However, with the exception of our recent studies of the kai and cpmA genes (11, 12), data on the evolutionary mechanisms of other circadian genes in cyanobacteria at the population level are lacking. Here, we sample genetic diversity of the ldpA gene from several populations of two closely related species of cyanobacteria from cold, arid environments around the world.

MATERIALS AND METHODS

Sampling sites and cyanobacterial cultures.

Environmental samples were collected from three locations, which are characterized by extremely arid regimes over extended timescales (2, 13). Three strains were isolated during enrichment culture from lithic substrates in McKelvey Valley, McMurdo Dry Valleys, East Antarctica (global positioning system [GPS] location 77°24.604′S, 161°11.702′E); two strains were from the central Tibetan plateau (GPS location 29°07.943′N, 85°22.508′E) and four strains were from Devon Island in the Canadian Arctic (GPS location 75°20.704′N, 89°45.790′W). All locations are cold deserts covered by snow for several months of the year and are all classified as polar frost climatically (14).

Nine cyanobacterial strains were obtained via enrichment culture and isolation using BG11 growth medium and cultivation under 24-h illumination at 25°C. The cyanobacterial cultures were maintained for 12 weeks to obtain sufficient biomass for DNA extraction, since slow growth rates in vitro are typical for desert cyanobacteria.

DNA extraction, PCR, and sequencing.

Genomic DNA from cultures was extracted using a hot phenol extraction method, optimized for cyanobacteria (15). Template quality and quantity were checked using gel electrophoresis and NanoDrop quantification. The PCR amplification primers were designed using NetPrimer (Premier Biosoft) and the published annotated sequences of the ldpA gene or its 4Fe-4S ferredoxin homologues from other cyanobacteria: Anabaena variabilis ATCC 29413 (GenBank accession no. NC_007413, locus tag Ava_0125), Arthrospira platensis strain Paraca (NZ_ACSK01000254, locus tag AplaP_010100006987), Nodularia spumigena CCY9419 (NZ_AAVW01000087, locus tag N9414_23468), and Synechococcus elongatus PCC 7942 (NC_007604, locus tag Synpcc7942_0624).

The PCR was performed with the forward primer ldpA-F-35 (5′-CTGATTTGCGGCGCGAGCTATC-3′) and reverse primer ldpA-R-297 (5′-GCCAATATCGCCGCTCAT-3′), yielding an ldpA gene fragment of ∼690 bp. A final volume of 25 μl of PCR mixture consisted of 10 ng of DNA template, 1 U of rTaq DNA polymerase (TaKaRa Biotechnology [Dalian] Co., Ltd., Liaoning, China), 10× rTaq PCR buffer (with MgCl2) (TaKaRa Biotechnology Co.), 5 mM deoxynucleoside triphosphates, 1/10 Tween 20, 10 μM concentrations of each primer, and sterile deionized water. PCR was performed in a thermal cycler (model 2700 and 2720; Applied Biosystems, Foster City, CA) using the following profile: an initial denaturing step of 5 min at 94°C, followed by 40 cycles of denaturation at 94°C for 1 min, annealing at 62°C for 1 min 30 s, and extension at 72°C for 1 min, with a final incubation at 72°C for 10 min. The PCR products were visualized on a 1% agarose gel and subsequently purified with the GFX PCR DNA and gel band purification kit (Amersham Biosciences, Piscataway, NJ). The purified PCR products were cloned using a TOPO TA cloning kit (Invitrogen, Carlsbad, CA) and sequenced with M13 primers. The sequence chromatograms were checked for accuracy and trimmed for unresolved bases. The DNA sequences were aligned using CLUSTAL W (16).

Phylogenetic analysis and sequence selection.

The fit of the substitution models was determined with jModelTest 0.1.1 (17) for DNA sequences and with ProtTest 3.0 (18) for proteins. The best-fitting models, based on both Akaike and Bayesian information criteria (AIC and BIC) (19), were F81+G (20) with α = 0.133 and JTT (21), which were thus utilized for the phylogenetic analysis of DNA and amino acid sequences, respectively. The neighbor-joining algorithm (22), as implemented in MEGA5 (23), was used to infer a phylogenetic tree of the ldpA genes. Statistical significance of the tree nodes was evaluated using the bootstrap procedure with 1,000 replications.

A two-step procedure was used to select sequences for analyses. First, a phylogenetic analysis was applied. It grouped the sequences into two distinct clades with nearly 100% bootstrap support and unresolved subtrees within the clades (see Fig. S1 in the supplemental material). Second, two randomly chosen amino acid sequences from each clade were used as probes to conduct a protein BLAST (24) search of the GenBank nonredundant database. This search yielded the highest similarity to the 4Fe-4S binding domain protein of Synechococcus sp. strain PCC 7335 (ZP_05036324), which belongs to the LdpA superfamily. Based on these considerations, we assumed that the above two clades represent two closely related species of cyanobacteria. We termed them Synechococcus species S1 (S1 hereafter) and Synechococcus species S2 (S2), respectively. In total, 59 ldpA sequences of S1 and 27 sequences of S2 were recovered and used in the subsequent analyses. The sequences were grouped in populations and denoted according to the sampling locations of the cyanobacterial strains: ANT for Antarctica, ARC for the Canadian Arctic, and TIB for Tibet.

Analysis of intra- and interpopulation genetic diversity.

The following common genetic diversity parameters were estimated: haplotype diversity (Hd), nucleotide diversity (π), number of polymorphic sites (S) (25), theta (θ) per site from S (25), and the number of pairwise differences (K). DnaSP v5.10 (26) was used for the above computations. Between-population differentiation was computed by FST from the analysis of molecular variance (27) using Arlequin 3.5 (28), and with Jost's D (29) using SPADE (30). The functional domains and motifs were recognized according to the previously described ldpA gene structure (8).
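The diversity parameters listed above can be illustrated with a minimal Python sketch. It is not the DnaSP implementation: it assumes a gap-free alignment of equal-length sequences, treats every differing character as a substitution, and uses the textbook definitions of Hd, π, S, Watterson's θ, and K. The function name `diversity_stats` and the dictionary keys are hypothetical.

```python
from collections import Counter
from itertools import combinations


def diversity_stats(seqs):
    """Basic diversity statistics for aligned, equal-length sequences.

    Assumption: no gaps or ambiguous bases; every mismatch counts as a difference.
    """
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of polymorphic (segregating) sites
    s_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)

    # K: average number of nucleotide differences over all sequence pairs
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # pi: nucleotide diversity per site
    pi = k / length

    # Watterson's theta per site, estimated from S
    a1 = sum(1.0 / i for i in range(1, n))
    theta_w = s_sites / a1 / length

    # Hd: haplotype diversity with the n / (n - 1) sample-size correction
    counts = Counter(seqs)
    hd = n / (n - 1) * (1.0 - sum((c / n) ** 2 for c in counts.values()))

    return {"H": len(counts), "Hd": hd, "pi": pi, "S": s_sites,
            "theta_W": theta_w, "K": k}


if __name__ == "__main__":
    toy_alignment = ["AAAA", "AAAT", "AATT", "AAAA"]  # illustrative data only
    print(diversity_stats(toy_alignment))
```

Real alignments usually contain gaps and sites with more than two variants, which dedicated packages handle explicitly, so values from this sketch may differ slightly from published estimates.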

We estimated various parameters of recombination and linkage disequilibrium (LD): the recombination parameter R per gene and the minimum number of recombination events, Rm (31), the ZnS statistics (32), which shows association between polymorphisms in the sample, and the ZZ statistics (33), which measures the effect of intragenic recombination. We obtained the confidence intervals for the above parameters by coalescent simulations with 10,000 replicates under the assumption of no recombination. DnaSP v5.10 (26) was utilized for all computations. The recombination rate c was computed by using the Bayesian algorithm with the default settings implemented in LAMARC 2.1.6 (34).

We also estimated effective population size from the recombination and LD values. The relationship between these three parameters may be approximated as ZnS ≈ 1/(2Nec + 1), where ZnS measures LD by averaging R2 (35) over all pairwise comparisons of polymorphic sites in a set of sequences, Ne is the effective population size, and c is a recombination rate.
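As a worked illustration, the approximation above can be rearranged to Ne ≈ (1/ZnS − 1)/(2c). The sketch below simply solves the relationship as it is written in the text; the function name and the input values are hypothetical, and the exact scaling used in the published computations is not stated here, so the output should not be expected to match the tabulated estimates.

```python
def effective_population_size(zns, c):
    """Solve ZnS ~ 1 / (2 * Ne * c + 1) for Ne.

    zns: linkage disequilibrium statistic, 0 < zns <= 1
    c:   recombination rate, c > 0
    """
    if not 0.0 < zns <= 1.0:
        raise ValueError("ZnS must lie in (0, 1]")
    if c <= 0.0:
        raise ValueError("recombination rate must be positive")
    return (1.0 / zns - 1.0) / (2.0 * c)


if __name__ == "__main__":
    # Hypothetical inputs chosen only to show the arithmetic
    print(effective_population_size(zns=0.2, c=0.01))
```

Because Ne scales with 1/c, small uncertainties in a low recombination rate translate into large uncertainties in the estimate.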

Neutrality tests.

We applied several tests to examine whether the obtained data correspond to the neutral expectations: Tajima's D (36), Fay and Wu's H (37), Fu's Fs test of selective neutrality (38), Fu and Li's D* and F* (39), and Achaz's Y/Y* (40). These tests were conducted using an online tool (http://wwwabi.snv.jussieu.fr/achaz/neutralitytest.html) and DnaSP v5.10 (26). The presence of positive selection was analyzed with the compound test (41), which is a combination of Tajima's D, Fay and Wu's normalized H, and Ewens-Watterson estimate. The rates of synonymous (dS) and nonsynonymous substitutions (dN) and the dN/dS ratio were computed according to the modified Nei-Gojobori method (42), assuming the transition/transversion bias to be 2.4. This analysis was conducted using MEGA5 (23).
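To make the first of these statistics concrete, the following sketch computes Tajima's D from summary values using the standard formula (the difference between the mean number of pairwise differences and S/a1, divided by its estimated standard deviation). It is an illustration only, not the code used in the study; the function name and argument names are hypothetical, and significance testing (for example by coalescent simulation) is not included.

```python
import math


def tajimas_d(n, s, k):
    """Tajima's D from sample size n, segregating sites s, and mean pairwise differences k."""
    if n < 4:
        raise ValueError("this sketch requires at least four sequences")
    if s < 1:
        raise ValueError("at least one segregating site is required")

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # Numerator: pairwise-difference estimate of theta minus Watterson's estimate
    numerator = k - s / a1
    # Denominator: square root of the estimated variance of the numerator
    return numerator / math.sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    # Species-wide S1 summary values as listed in Table 1
    print(round(tajimas_d(n=59, s=96, k=6.251), 3))
```

With the species-wide S1 values from Table 1, this formula gives a value of roughly −2.4, in the same range as the D reported in Table 5; small differences are expected because dedicated software may treat multi-variant sites and gaps differently.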

Analysis of population history.

We used several parameters to test whether the studied populations have experienced size changes in their history. Tajima's D (36) and R2 statistics (43) are based on mutation frequency distribution, the Fs statistic (38) is computed from the haplotype distribution, and the raggedness statistic r (44) is derived from the pairwise differences between sequences (i.e., the mismatch distribution). These tests were shown to be the most robust for detecting population size changes (43). In addition, we estimated the exponential population growth rate, g, using a Bayesian procedure implemented in LAMARC 2.1.6 (34). Parameter g is derived from the following equation: $\theta_t = \theta_{\text{present time}} \exp(-gt)$, where θ is a scaled time-dependent mutation parameter, and t is time before the present (34).
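The growth equation can be evaluated directly, as in the short sketch below. The function name, argument names, and input values are hypothetical, and the time unit is whatever scaled unit the estimation software uses for t; the point is only that a positive g implies a smaller θ in the past, i.e., population growth toward the present.

```python
import math


def theta_at_time(theta_present, g, t):
    """Evaluate theta_t = theta_present * exp(-g * t) for time t before the present."""
    if theta_present <= 0.0:
        raise ValueError("theta_present must be positive")
    if t < 0.0:
        raise ValueError("t is measured backward from the present and must be >= 0")
    return theta_present * math.exp(-g * t)


if __name__ == "__main__":
    # Hypothetical values: a positive g shrinks theta going back in time
    for t in (0.0, 0.0005, 0.001):
        print(t, theta_at_time(theta_present=0.01, g=900.0, t=t))
```

A value of g equal to zero corresponds to constant size, and a negative g corresponds to a population that was larger in the past.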

Nucleotide sequence accession numbers.

Nucleotide sequences obtained in this study were deposited to GenBank under accession numbers JX846499 to JX846584.

RESULTS

DNA polymorphism, within- and between-population diversity of the ldpA locus in Synechococcus sp.

Both species manifested high haplotype and low nucleotide diversity at both population- and species-wide levels. However, the ldpA gene in species S1 displayed, on average, ∼2.5-fold-higher nucleotide diversity than that in species S2. The average species-wide nucleotide diversities of the gene (π ± the standard error) were 0.0091 ± 0.0014 for S1 and 0.0037 ± 0.0009 for S2 (Table 1). The patterns of the interpopulation diversity were also different: the Arctic and Tibetan populations of species S1 (S1-ARC and S1-TIB) had nearly the same values of π, whereas those of species S2 (S2-ARC and S2-TIB) differed by almost 5-fold. Both species showed similar distribution of nucleotide polymorphism along the sequenced region of the gene, with the Fer4_10 domain (a member of the 4Fe-4S binding domain family) being most conserved, followed by the LdpA_C and the N-terminal domains (Fig. 1 and Table 2). These domain-specific patterns of nucleotide diversity varied slightly in some populations. In particular, the most conserved domain in population S1-TIB was LdpA_C (Table 2). There were also marked differences in domain polymorphism between the Arctic populations of the studied species: all three domains were significantly more variable in S1-ARC than in S2-ARC (Table 2). However, the studied partial sequence of the LdpA protein from both species was highly similar to LdpA of Synechococcus elongatus PCC 7942 (see Fig. S2 in the supplemental material).

Table 1.

Population genetic diversity parameters of the ldpA gene in species S1 and S2

| Parameter | S1-ANT | S1-ARC | S1-TIB | S1 total | S2-ARC | S2-TIB | S2 total |
|---|---|---|---|---|---|---|---|
| Sequences (n) | 29 | 10 | 20 | 59 | 22 | 5 | 27 |
| Haplotypes (H) | 23 | 10 | 17 | 49 | 10 | 5 | 15 |
| Haplotype diversity (Hd) | 0.968 | 1.000 | 0.984 | 0.990 | 0.710 | 1.000 | 0.809 |
| Nucleotide diversity (π) | 0.0061 | 0.0125 | 0.0130 | 0.0100 | 0.0022 | 0.0106 | 0.0039 |
| Polymorphic sites (S) | 46 | 28 | 43 | 96 | 11 | 15 | 24 |
| θ, per site (from S) | 0.0170 | 0.0143 | 0.0176 | 0.0299 | 0.0044 | 0.0104 | 0.0090 |
| Pairwise differences (K) | 3.975 | 7.756 | 7.900 | 6.251 | 1.468 | 6.800 | 2.558 |

Fig 1.


Distribution of nucleotide polymorphism along the 86 partial sequences of the ldpA gene. A sliding window of 100 bp with increments of 10 bp is depicted. The black boxes indicate the functional motifs: 1, putative hydrophobic motif (positions 25 to 90 in the alignment) (9); 2, 3Fe-4S motif (positions 232 to 270); and 3, 4Fe-4S motif (positions 325 to 357). The shaded boxes indicate the putative functional domains (from left to right): N-terminal domain (positions 1 to 216); Fer4_10, 4Fe-4S binding domain (positions 217 to 360); and LdpA_C domain (positions 427 to 690).
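A sliding-window profile of the kind described in this caption can be sketched as follows. This is an illustrative reimplementation under simplifying assumptions (gap-free alignment, every mismatch counted as a difference), not the procedure used to draw the figure; the function name and the toy data are hypothetical, while the defaults mirror the 100-bp window and 10-bp step stated in the caption.

```python
def sliding_window_pi(seqs, window=100, step=10):
    """Nucleotide diversity (pi per site) in sliding windows along an alignment.

    Returns a list of (start, end, pi) tuples with 1-based inclusive coordinates.
    """
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    if window > length:
        raise ValueError("window is longer than the alignment")

    n_pairs = n * (n - 1) // 2

    # For each site, count the sequence pairs that differ at that site
    site_diffs = []
    for column in zip(*seqs):
        counts = {}
        for base in column:
            counts[base] = counts.get(base, 0) + 1
        identical_pairs = sum(c * (c - 1) // 2 for c in counts.values())
        site_diffs.append(n_pairs - identical_pairs)

    profile = []
    for start in range(0, length - window + 1, step):
        diffs = sum(site_diffs[start:start + window])
        profile.append((start + 1, start + window, diffs / n_pairs / window))
    return profile


if __name__ == "__main__":
    toy_alignment = ["AAAAAA", "AAATAA", "AAATTA"]  # illustrative data only
    for row in sliding_window_pi(toy_alignment, window=3, step=1):
        print(row)
```

For a 690-position alignment, the default settings produce 60 overlapping windows, the first covering positions 1 to 100 and the last covering positions 591 to 690.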

Table 2.

Domain-specific polymorphisms in the ldpA gene of species S1 and S2

Domain-specific polymorphism (π) ± SEM

| Species and population | N-terminal | Fer4_10 | LdpA_C | Total |
|---|---|---|---|---|
| S1 | 0.0156 ± 0.0035 | 0.0059 ± 0.0012 | 0.0068 ± 0.0021 | 0.0091 ± 0.0014 |
| S1-ANT | 0.0081 ± 0.0021 | 0.0039 ± 0.0011 | 0.0054 ± 0.0034 | 0.0058 ± 0.0010 |
| S1-ARC | 0.0201 ± 0.0057 | 0.0070 ± 0.0028 | 0.0081 ± 0.0030 | 0.0113 ± 0.0022 |
| S1-TIB | 0.0212 ± 0.0047 | 0.0082 ± 0.0030 | 0.0074 ± 0.0021 | 0.0116 ± 0.0020 |
| S2 | 0.0051 ± 0.0014 | 0.0020 ± 0.0011 | 0.0034 ± 0.0014 | 0.0037 ± 0.0009 |
| S2-ARC | 0.0017 ± 0.0009 | 0.0012 ± 0.0011 | 0.0025 ± 0.0014 | 0.0021 ± 0.0007 |
| S2-TIB | 0.0178 ± 0.0063 | 0.0056 ± 0.0033 | 0.0069 ± 0.0032 | 0.0099 ± 0.0027 |

Differences in DNA diversity between the species were particularly noticeable in the putative functional motifs (Fig. 1). Specifically, S1 had four synonymous and six nonsynonymous substitutions in the putative hydrophobic motif versus only one synonymous substitution in that motif of S2, and it had three synonymous and three nonsynonymous versus no substitutions in the 3Fe-4S motif. The most conserved 4Fe-4S motif acquired one replacement substitution in S2 versus no substitutions in S1. On the other hand, the studied species manifested similar high haplotype diversity (0.990 for S1 and 0.809 for S2) (Table 1).

Interpopulation genetic distances in both species were small (Table 3). They varied from 0.0097 to 0.0131 in species S1 and were slightly larger than that of species S2 (0.0068). Despite the small distances, the populations indicated significant differentiation as measured by pairwise FST and Jost's D values (Table 3). The average interpopulation FST was much lower in S1 (0.0835) than in S2 (0.2312), suggesting significantly higher percentage of variability residing within populations (91.66% versus 76.88%). This ratio even increased when only corresponding populations of the species (ARC and TIB) were taken into account: the percentage of intrapopulation variability in species S1 became ca. 97% (FST = 0.0296) (Table 3).
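The percentages quoted above follow from reading FST as the fraction of total variation that lies among populations, so that the within-population share is 100 × (1 − FST). A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def within_population_percent(fst):
    """Percentage of total variation residing within populations, given FST."""
    if not 0.0 <= fst <= 1.0:
        raise ValueError("this sketch expects FST in [0, 1]")
    return 100.0 * (1.0 - fst)


if __name__ == "__main__":
    for label, fst in (("S1 average", 0.0835),
                       ("S2 average", 0.2312),
                       ("S1 ARC vs TIB", 0.0296)):
        print(label, round(within_population_percent(fst), 2))
```

Using the rounded FST values from the text, the S1 average gives about 91.65%, which differs in the last digit from the 91.66% quoted above, presumably because the published figure was computed from an unrounded FST.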

Table 3.

Genetic differentiation at the ldpA locus between the studied populations^a

| Population | S1-ANT | S1-ARC | S1-TIB | S2-ARC | S2-TIB |
|---|---|---|---|---|---|
| S1-ANT | − | 0.0745*/0.869 | 0.1188*/1.0 | − | − |
| S1-ARC | 0.0097 | − | 0.0296/1.0 | − | − |
| S1-TIB | 0.0107 | 0.0131 | − | − | − |
| S2-ARC | − | − | − | − | 0.2242*/1.0 |
| S2-TIB | − | − | − | 0.0068 | − |

^a Below diagonal, pairwise between-population genetic distance; above diagonal, pairwise FST values/Jost's D. *, P < 0.01; −, not applicable.

Recombination and linkage disequilibrium.

The values of the LD and recombination estimates were different for the studied species (Table 4). All of the parameters but one (ZnS in S1, P = 0.001) were not statistically significantly different from zero. Overall, this indicated that recombination does not contribute significantly to the observed ldpA gene variation (Table 4).

Table 4.

Linkage disequilibrium and recombination parameters in the studied species^a

| Species | ZnS | ZZ | R | Rm | c | Ne |
|---|---|---|---|---|---|---|
| S1 | 0.0257 | –0.0013 | 18.5 | 5 | 0.0007 | ∼13,000 |
| S2 | 0.0816 | 0.0733 | 7.5 | 1 | 0.5937 | ∼6 |

^a ZnS, the ZnS statistics (32); ZZ, the ZZ statistics (33); R, recombination parameter (per gene); Rm, the minimum number of recombination events; c, recombination rate (31); Ne, effective population size.

Neutrality tests.

The results of the tests for neutrality suggested that evolution of the ldpA gene deviates from the neutral model in both studied species (Table 5). The results were more significant in S1, suggesting that selection is more pronounced in this species compared to S2. The compound DHEW test for presence of positive selection (41) returned significant P values for both species. However, while the estimates for selection were significant species-wide, they were not significant in some populations. For example, Fu and Li's D* and F* values suggested neutral evolution in populations S1-ARC, S1-TIB, and S2-TIB but selection operating on both species as a whole (Table 5).

Table 5.

Results of neutrality tests of the studied species^a

| Parameter | S1-ANT | S1-ARC | S1-TIB | S1 total | S2-ARC | S2-TIB | S2 total |
|---|---|---|---|---|---|---|---|
| D | –2.4670 | –1.0360 | –1.3921 | –2.413 | –1.7770 | –0.4062 | –2.1820 |
| P (D) | <0.01 | >0.10 | 0.065 | <0.01 | 0.013 | >0.10 | <0.01 |
| D* | –3.8479 | –1.3020 | –1.5028 | –5.0752 | –2.0926 | –0.4062 | –2.8757 |
| P (D*) | <0.02 | >0.10 | >0.10 | <0.02 | 0.031 | >0.10 | <0.05 |
| F* | –4.0069 | –1.3940 | –1.7127 | –4.8556 | –2.1787 | –0.4328 | –3.1198 |
| P (F*) | <0.02 | >0.10 | >0.10 | <0.02 | 0.028 | >0.10 | <0.05 |
| H | –2.0631 | –0.9441 | –1.8625 | –3.1550 | –1.9363 | 0.6182 | –1.6231 |
| P (H) | 0.032 | 0.123 | 0.042 | 0.008 | 0.041 | >0.10 | 0.058 |
| Fs | –19.1886 | –4.2727 | –6.8852 | –52.9325 | –5.8424 | –0.8745 | –8.7859 |
| P (Fs) | <0.001 | 0.013 | 0.005 | <0.001 | <0.001 | 0.164 | <0.001 |
| Y | –2.5929 | –0.9569 | –1.5034 | –2.2977 | –1.8819 | –0.0550 | –2.0072 |
| P (Y) | 0.001 | >0.10 | 0.068 | 0.003 | 0.027 | >0.10 | 0.016 |
| Y* | –1.9255 | 0.2928 | –0.9933 | –1.6642 | –1.6510 | ND | –1.7821 |
| P (Y*) | 0.015 | >0.10 | 0.165 | 0.030 | 0.041 | | 0.030 |
| DHEW | 0.002 | 1 | 1 | <0.001 | 0.002 | 1 | 0.003 |
| dN | 0.0047 | 0.0102 | 0.0094 | 0.0076 | 0.0015 | 0.0089 | 0.0030 |
| dS | 0.0082 | 0.0136 | 0.0163 | 0.0126 | 0.0037 | 0.0120 | 0.0054 |
| dN/dS | 0.5732 | 0.7500 | 0.5767 | 0.6032 | 0.4054 | 0.7417 | 0.5556 |

^a D, Tajima's D (36); H, Fay and Wu's H (37); Fs, Fu's Fs test of selective neutrality (38); D* and F*, Fu and Li's D* and F* test statistics (39); Y and Y*, Achaz's Y and Y* test statistics (40); E, test for directional selection (P values) (60); DHEW, compound test for positive selection (P values) (41); dS, synonymous changes; dN, nonsynonymous changes; dN/dS, ratio of nonsynonymous to synonymous substitutions. Statistically significant values are bold.

Demographic history of the Synechococcus populations.

The results of the population history analysis rejected a null hypothesis regarding the constant population size in both studied species. All computed parameters, including Tajima's D (36), Fu's Fs (38) (Table 5), raggedness statistics r (44), and R2 (43) (Table 6) were statistically significant and indicated population growth of both Synechococcus species. The Bayesian analysis yielded values of the population growth parameter, g, from 853 to 956 for all populations, which is taken as further strong evidence for population expansion of the studied species.

Table 6.

Results of analysis for population size changes^a

| Species | r | 95% CI (r) | P (r) | R2 | 95% CI (R2) | P (R2) |
|---|---|---|---|---|---|---|
| S1 | 0.0069 | 0.0109–0.1493 | 0.004 | 0.0239 | 0.0516–0.1665 | <0.001 |
| S2 | 0.0174 | 0.0241–0.4956 | 0.008 | 0.0530 | 0.0712–0.2046 | <0.001 |

^a r, raggedness statistics (44); R2, Ramos-Onsins and Rozas's R2 statistics (43); CI, confidence interval.

DISCUSSION

This study is the first to identify species-specific patterns of DNA diversity and evolution of a circadian gene between closely related cyanobacterial species from the same or similar habitats. The strains were all isolated from extreme cold (polar) deserts and experienced growth seasons typified by long (or continuous) daylight followed by inactive winter periods of complete darkness (due to obliquity in polar locations and deep snow cover in Tibet). Circadian pathways are likely key to survival through their influence on the seasonal revival and shutdown of these organisms and on the complex homeostatic regulation that photoautotrophic metabolism probably requires during periods of continuous daylight, in addition to the diel changes more typical of nonextreme locations.

Similar low levels of intraspecific DNA diversity but different patterns of interpopulation differentiation at the ldpA locus of the two Synechococcus species.

Both species manifested overall low levels of nucleotide diversity at the ldpA gene. However, the population diversity within each species was quite different in some cases (Table 1). Recently, we reported no significant differences in the values of intrapopulation nucleotide diversity for another circadian gene, cpmA, from a stress-tolerant cyanobacterium Chroococcidiopsis sp. sampled from similar locations (12). The average DNA diversity of the cpmA gene was 0.0034, which is close to or lower than the respective estimates for ldpA in the present study (Table 1). On the other hand, the data on the two key circadian genes, kaiB and kaiC, of a filamentous cyanobacterium Nostoc linckia from the ecological model microsites, Evolution Canyons I and II (Israel), indicated that the Nostoc strains from the environmentally stressful south-facing slopes had a ∼1,000-fold-higher substitution rate compared to the strains from the temperate north-facing slopes (11). The species-wide diversity of these genes (0.0522 ± 0.0061) was also much higher than that for the cpmA and ldpA genes (11, 12).

It is difficult to compare the obtained estimates of DNA diversity with those of other bacteria due to the limited volume of data available from other studies. Apart from the above-mentioned estimates for some circadian genes in cyanobacteria (11, 12), there is some information about a few housekeeping genes of several bacterial species. The most relevant data are from a cyanobacterium, Microcystis aeruginosa (45). The authors of that study reported a mean diversity of 0.023 for 7 housekeeping genes of this prokaryote, with the lowest of 0.013 for recA and the highest of 0.043 for pgi. The population analysis of three nonhousekeeping genes (fliC, proA, and mompS) from environmental isolates of Legionella pneumophila yielded species-wide estimates of their diversity between 0.0155 and 0.0291 (46). These estimates are close to or higher than the values obtained for the ldpA gene. By its level of polymorphism, the ldpA gene appears to position between cpmA and housekeeping genes, which are believed to be under extreme selective constraints due to their significance for basic functions of an organism (47).

The ldpA gene indicated significant between-population variability in both Synechococcus species (Table 3). This is quite different from what was observed for the cpmA gene in Chroococcidiopsis sp., which manifested virtually no variability between populations (12). Several factors may account for the differences. First, ldpA and cpmA belong to different functional divisions of the circadian system: input and output, respectively. These divisions play very different roles and, therefore, have different evolutionary constraints. Indeed, previous studies of these genes showed that cpmA is slightly more conserved than ldpA at the level of species and above (6, 9). Importantly, this ratio is supported by the data at the population level too (see above). Second, the circadian system has high adaptive significance (48) and each circadian element likely makes certain contributions to that. With respect to ldpA and cpmA, the data presented above may suggest that the less constrained ldpA gene may be more evolutionarily “responsive” to the environmental fluctuations in particular habitats and thus result in more significant interpopulation differentiation. The importance of environmental conditions is shown by significant between-population differences in nucleotide diversity (Table 1).

Another interesting finding of the present study is that, while the overall profiles of domain-specific polymorphism within the ldpA gene were quite similar in both Synechococcus species (Fig. 1), the proportion of interpopulation and interspecific polymorphism levels differed significantly (Table 2). These differences may result from various factors. For example, specific functions of the domains and functional motifs in relation to adaptation may couple with fluctuating environmental conditions in a particular microhabitat of a population/species and produce the observed patterns.

We provide the first direct comparison of the intra- and interspecific DNA diversity of the same circadian gene in closely related species of bacteria. A study of three housekeeping genes (atpD, glnII, and recA) in closely related species of symbiotic bacteria, Bradyrhizobium japonicum and B. canariense (49), identified slightly lower population diversity of these genes in B. canariense but did not determine any between-population differentiation. Similar to ldpA and cpmA, these genes manifested a high number of pairwise nucleotide differences (6.1 to 14.4) and haplotype diversity (0.83 to 1.00). However, in contrast to the circadian genes, the three housekeeping genes showed significant recombination and no signs of selection.

With respect to other available data on the nucleotide diversity of bacterial genes (see, for example, references 50, 51, and 52), ldpA appears to be fairly conserved. The observed conservation of ldpA is in line with the overall high conservation of the circadian system and its elements (7) as key players in maintenance of intracellular homeostasis and adaptation (53). The role in adaptation seems to be particularly appropriate to explain the observed DNA polymorphism patterns of the ldpA gene, since it was suggested to adjust the circadian clock in response to environmental light fluctuations (8). Unlike that of the cpmA gene (12), the biochemical function of ldpA is probably not housekeeping: disruption of the gene does not result in cell death but affects the amplitude of the free-running period (8). The importance of ldpA for adaptation coupled with the higher variability may also underlie another difference of ldpA from cpmA and the housekeeping genes: its significant interpopulation differentiation (Table 3).

The observed high haplotype diversity of bacterial genes, including circadian genes (11, 12; the present study), may result from the adaptation of multiple ecotypes to microenvironmental extremes in specific ecological microniches (45) and, consequently, the partitioning of any single population into micropopulations. This process may lead to sympatric ecological differentiation (54) and, eventually, speciation. The proposed scenario seems plausible, given the environmental extremes and quite pronounced environmental fluctuations (temperature, light, humidity, etc.) at the sampling locations.

Selection at the ldpA gene and population structure of the Synechococcus species.

The ldpA gene belongs to the input pathway of the circadian system in cyanobacteria (8). The LdpA protein interacts with the other circadian proteins from the different divisions of the circadian system, CikA (input division), KaiA (central oscillator), and SasA (output division), and transfers information about a light signal indirectly, through sensing a redox state of the cell (10). Given that the activity of ldpA depends on the light intensity, it is logical to suppose that selection should favor alleles that better account for the light fluctuations.

Studies addressing non-neutral evolution or positive selection of the circadian genes in prokaryotes and eukaryotes are scarce. Previous macroevolutionary studies of the circadian genes in prokaryotes showed that, while some positive selection might occur (e.g., at the ldpA and sasA loci), purifying selection was prevailing at the above-species level (6, 9, 55). However, our previous studies reported episodic positive selection operating on the cpmA gene in populations of stress-tolerant Chroococcidiopsis sp. (12) and for two core circadian genes, kaiB and kaiC, in Nostoc linckia (11). For some genes (e.g., cpmA), the results of the positive selection analyses above and below the species level may be inconsistent. This may occur if selection is weak (which is common) and operates only during relatively short periods of evolutionary time. In such a case, the methods of analysis may fail to detect positive selection at the macrolevel due to their insufficient power. In addition, selection may work only in some populations, as was reported for the period 2 gene in humans (56).

According to the results of the population history analysis, both S1 and S2 species experienced recent population expansion. On the other hand, this expansion was coupled with very small effective population size. Small effective population size (Ne) is usually associated with significant genetic drift and, consequently, a reduced diversity of populations. In turn, genetic drift and small Ne favor interpopulation differentiation and, ultimately, speciation. Indeed, the populations of species S1 and S2 have low diversity and significant differentiation, which is in agreement with these considerations. The populations are significantly differentiated despite the high migration rate among them. This is in contrast to observations for another circadian gene, cpmA, in populations of Chroococcidiopsis sp., which showed no differentiation (12). The observed differentiation of the Synechococcus populations may result from the much smaller effective population size compared to that of Chroococcidiopsis sp. (∼50,000). Ne of the ldpA gene in species S2 is extremely small. This may result from drastic fluctuations of the population size due to the extreme environment. In addition, species S1 may be much less abundant due to the high specialization to a particular ecological microniche.

The departure from demographic equilibrium is common among infectious and symbiotic bacteria: they commonly pass through a population bottleneck after antibiotic therapy or change of hosts, which is followed by population growth (57, 58). Likewise, bacterial populations from extreme habitats may experience similar processes during their recovery after stochastic extremes in environmental conditions.

In view of the above population demography of the Synechococcus species, the significant results of the tests for positive selection should be interpreted with some reservations. Selection may indeed take place; however, its signature may be confounded by other factors, such as strong genetic drift and fluctuations in population size, which may yield false-positive results in some tests for selection (36). Also, in populations with very small Ne, purifying selection (which usually predominates) becomes relaxed; consequently, more nonsynonymous, and therefore potentially selectable, substitutions will behave neutrally than in populations with large Ne. However, the concordant results of all tests for selection applied in the present study support the conclusion that evolution of the ldpA gene under extreme stress is likely non-neutral.
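To make the caveat above concrete, the following minimal sketch computes Tajima's D, the statistic introduced in reference 36, from a set of aligned sequences. It is an illustrative implementation of the published formula, not the software used in this study; the function name and the toy alignments are hypothetical, and the input is assumed to be gap-free and of equal length. The statistic compares two estimates of diversity (mean pairwise differences and the number of segregating sites), so a population expansion that inflates the number of rare variants drives D negative in much the same way as a selective sweep.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length, gap-free sequences."""
    n = len(seqs)
    if n < 4:
        # The variance term is zero for n < 4, so D is undefined
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")

    # S: number of segregating (polymorphic) sites
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of pairwise differences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    # Toy data: every variant is a singleton (excess of rare alleles)
    rare = ["AAAA", "AAAT", "AATA", "ATAA"]
    # Toy data: variants at intermediate frequency
    balanced = ["AA", "AA", "TT", "TT"]
    print(tajimas_d(rare), tajimas_d(balanced))
```

In the toy data, the alignment made up of singletons gives a negative D and the alignment with intermediate-frequency variants gives a positive D. A negative value alone therefore cannot distinguish expansion from selection, which is why the demographic history matters for interpretation.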

Overall, the processes of genetic drift, selection and local adaptation have resulted in the observed population structure of the studied Synechococcus species: multiple small microniche-specific populations (ecotypes) within a habitat. Such a structure was recently reported for thermophilic Synechococcus from microbial mats of the geothermal Mushroom Spring in Yellowstone National Park (59).

Conclusion.

We demonstrate extreme conservation and non-neutral evolution of circadian clock genes at the population level for species of the genus Synechococcus. Some evidence for positive-selection effects suggests that adaptive evolution may occur in microbial circadian systems.

ACKNOWLEDGMENTS

We thank Claus Vogl (University of Veterinary Medicine Vienna) for valuable comments on the manuscript.

This study was supported by grant 10208127 from the University of Hong Kong.

Footnotes

Published ahead of print 21 December 2012

Supplemental material for this article may be found at http://dx.doi.org/10.1128/AEM.03439-12.

REFERENCES

  • 1. Kondo T, Strayer CA, Kulkarni RD, Taylor W, Ishiura M, Golden SS, Johnson CH. 1993. Circadian rhythms in prokaryotes: luciferase as a reporter of circadian gene expression in cyanobacteria. Proc. Natl. Acad. Sci. U. S. A. 90:5672–5676.
  • 2. Bahl J, Lau MC, Smith GJ, Vijaykrishna D, Cary SC, Lacap DC, Lee CK, Papke RT, Warren-Rhodes KA, Wong FK, McKay CP, Pointing SB. 2011. Ancient origins determine global biogeography of hot and cold desert cyanobacteria. Nat. Commun. 2:163.
  • 3. Pointing SB, Belnap J. 2012. Microbial colonization and controls in dryland systems. Nat. Rev. Microbiol. 10:551–562.
  • 4. Baca I, Sprockett D, Dvornyk V. 2010. Circadian input kinases and their homologs in cyanobacteria: evolutionary constraints versus architectural diversification. J. Mol. Evol. 70:453–465.
  • 5. Dvornyk V, Vinogradova ON, Nevo E. 2003. Origin and evolution of circadian clock genes in prokaryotes. Proc. Natl. Acad. Sci. U. S. A. 100:2495–2500.
  • 6. Dvornyk V. 2006. Subfamilies of cpmA, a gene involved in circadian output, have different evolutionary histories in cyanobacteria. Microbiology 152:75–84.
  • 7. Dvornyk V. 2009. The circadian clock gear in cyanobacteria: assembled by evolution, p 241–258 In Ditty JL, Mackey S, Johnson CH. (ed), Bacterial circadian programs. Springer-Verlag, Berlin, Germany.
  • 8. Katayama M, Kondo T, Xiong J, Golden SS. 2003. ldpA encodes an iron-sulfur protein involved in light-dependent modulation of the circadian period in the cyanobacterium Synechococcus elongatus PCC 7942. J. Bacteriol. 185:1415–1422.
  • 9. Dvornyk V. 2005. Molecular evolution of ldpA, a gene mediating circadian input signal in cyanobacteria. J. Mol. Evol. 60:105–112.
  • 10. Ivleva NB, Bramlett MR, Lindahl PA, Golden SS. 2005. LdpA: a component of the circadian clock senses redox state of the cell. EMBO J. 24:1202–1210.
  • 11. Dvornyk V, Vinogradova ON, Nevo E. 2002. Long-term microclimatic stress causes rapid adaptive radiation of kaiABC clock gene family in a cyanobacterium, Nostoc linckia, from the “Evolution Canyons” I and II, Israel. Proc. Natl. Acad. Sci. U. S. A. 99:2082–2087.
  • 12. Dvornyk V, Jahan AS. 2012. Extreme conservation and non-neutral evolution of the cpmA circadian locus in a globally distributed Chroococcidiopsis sp. from naturally stressful habitats. Mol. Biol. Evol. 29:3899–3907.
  • 13. Caruso T, Chan Y, Lacap DC, Lau MC, McKay CP, Pointing SB. 2011. Stochastic and deterministic processes interact in the assembly of desert microbial communities on a global scale. ISME J. 5:1406–1413.
  • 14. Peel MC, Finlayson BL, McMahon TA. 2007. Updated world map of the Koppen-Geiger climate classification. Hydrol. Earth Syst. Sci. 11:1633–1644.
  • 15. Golden SS, Brusslan J, Haselkorn R. 1987. Genetic engineering of the cyanobacterial chromosome. Methods Enzymol. 153:215–231.
  • 16. Thompson JD, Higgins DG, Gibson TJ. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680.
  • 17. Posada D. 2008. jModelTest: phylogenetic model averaging. Mol. Biol. Evol. 25:1253–1256.
  • 18. Abascal F, Zardoya R, Posada D. 2005. ProtTest: selection of best-fit models of protein evolution. Bioinformatics 21:2104–2105.
  • 19. Akaike H. 1974. New look at statistical model identification tree. IEEE T. Automat. Contr. AC 19:716–723.
  • 20. Felsenstein J. 1981. Evolutionary trees from DNA sequences: a maximum-likelihood approach. J. Mol. Evol. 17:368–376.
  • 21. Jones DT, Taylor WR, Thornton JM. 1992. The rapid generation of mutation data matrices from protein sequences. Comput. Appl. Biosci. 8:275–282.
  • 22. Saitou N, Nei M. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406–425.
  • 23. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. 2011. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum-parsimony methods. Mol. Biol. Evol. 28:2731–2739.
  • 24. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389–3402.
  • 25. Watterson GA. 1975. On the number of segregating sites in genetic models without recombination. Theor. Pop. Biol. 7:256–276.
  • 26. Librado P, Rozas J. 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25:1451–1452.
  • 27. Excoffier L, Smouse PE, Quattro JM. 1992. Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131:479–491.
  • 28. Excoffier L, Laval G, Schneider S. 2005. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evol. Bioinform. Online 1:47–50.
  • 29. Jost L. 2008. GST and its relatives do not measure differentiation. Mol. Ecol. 17:4015–4026.
  • 30. Chao A, Shen T. 2010. SPADE (species prediction and diversity estimation): program and user's guide. http://chao.stat.nthu.edu.tw/softwareCE.html
  • 31. Hudson RR. 1987. Estimating the recombination parameter of a finite population model without selection. Genet. Res. 50:245–250.
  • 32. Kelly JK. 1997. A test of neutrality based on interlocus associations. Genetics 146:1197–1206.
  • 33. Rozas J, Gullaud M, Blandin G, Aguade M. 2001. DNA variation at the rp49 gene region of Drosophila simulans: evolutionary inferences from an unusual haplotype structure. Genetics 158:1147–1155.
  • 34. Kuhner MK. 2006. LAMARC 2.0: maximum-likelihood and Bayesian estimation of population parameters. Bioinformatics 22:768–770.
  • 35. Hill WG, Robertson A. 1968. Linkage disequilibrium in finite populations. Theor. Appl. Genet. 38:226–231.
  • 36. Tajima F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585–596.
  • 37. Fay JC, Wu CI. 2000. Hitchhiking under positive Darwinian selection. Genetics 155:1405–1413.
  • 38. Fu YX. 1997. Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics 147:915–925.
  • 39. Fu Y-X, Li W-H. 1993. Statistical tests of neutrality of mutations. Genetics 133:693–709.
  • 40. Achaz G. 2008. Testing for neutrality in samples with sequencing errors. Genetics 179:1409–1424.
  • 41. Zeng K, Shi S, Wu CI. 2007. Compound tests for the detection of hitchhiking under positive selection. Mol. Biol. Evol. 24:1898–1908.
  • 42. Nei M, Gojobori T. 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3:418–426.
  • 43. Ramos-Onsins SE, Rozas J. 2002. Statistical properties of new neutrality tests against population growth. Mol. Biol. Evol. 19:2092–2100.
  • 44. Harpending HC. 1994. Signature of ancient population growth in a low-resolution mitochondrial DNA mismatch distribution. Hum. Biol. 66:591–600.
  • 45. Tanabe Y, Kasai F, Watanabe MM. 2007. Multilocus sequence typing (MLST) reveals high genetic diversity and clonal population structure of the toxic cyanobacterium Microcystis aeruginosa. Microbiology 153:3695–3703.
  • 46. Coscollá M, Gosalbes MJ, Catalán V, González-Candelas F. 2006. Genetic variability in environmental isolates of Legionella pneumophila from Comunidad Valenciana (Spain). Environ. Microbiol. 8:1056–1063.
  • 47. Jordan IK, Rogozin IB, Wolf YI, Koonin EV. 2002. Essential genes are more evolutionarily conserved than are nonessential genes in bacteria. Genome Res. 12:962–968.
  • 48. Woelfle MA, Ouyang Y, Phanvijhitsiri K, Johnson CH. 2004. The adaptive value of circadian clocks: an experimental assessment in cyanobacteria. Curr. Biol. 14:1481–1486.
  • 49. Vinuesa P, Silva C, Werner D, Martinez-Romero E. 2005. Population genetics and phylogenetic inference in bacterial molecular systematics: the roles of migration and recombination in Bradyrhizobium species cohesion and delineation. Mol. Phylogenet. Evol. 34:29–54.
  • 50. Silva C, Vinuesa P, Eguiarte LE, Souza V, Martinez-Romero E. 2005. Evolutionary genetics and biogeographic structure of Rhizobium gallicum sensu lato, a widely distributed bacterial symbiont of diverse legumes. Mol. Ecol. 14:4033–4050.
  • 51. Perrineau MM, Le Roux C, de Faria SM, de Carvalho Balieiro F, Galiana A, Prin Y, Bena G. 2011. Genetic diversity of symbiotic Bradyrhizobium elkanii populations recovered from inoculated and non-inoculated Acacia mangium field trials in Brazil. Syst. Appl. Microbiol. 34:376–384.
  • 52. Rooney AP, Swezey JL, Friedman R, Hecht DW, Maddox CW. 2006. Analysis of core housekeeping and virulence genes reveals cryptic lineages of Clostridium perfringens that are associated with distinct disease presentations. Genetics 172:2081–2092.
  • 53. Johnson CH. 2005. Testing the adaptive value of circadian systems. Methods Enzymol. 393:818–837.
  • 54. Shapiro BJ, Friedman J, Cordero OX, Preheim SP, Timberlake SC, Szabo G, Polz MF, Alm EJ. 2012. Population genomics of early events in the ecological differentiation of bacteria. Science 336:48–51.
  • 55. Dvornyk V, Deng HW, Nevo E. 2004. Structure and molecular phylogeny of sasA genes in cyanobacteria: insights into evolution of the prokaryotic circadian system. Mol. Biol. Evol. 21:1468–1476.
  • 56. Cruciani F, Trombetta B, Labuda D, Modiano D, Torroni A, Costa R, Scozzari R. 2008. Genetic diversity patterns at the human clock gene period 2 are suggestive of population-specific positive selection. Eur. J. Hum. Genet. 16:1526–1534.
  • 57. O'Fallon B. 2008. Population structure, levels of selection, and the evolution of intracellular symbionts. Evolution 62:361–373.
  • 58. Tazi L, Perez-Losada M, Gu W, Yang Y, Xue L, Crandall KA, Viscidi RP. 2010. Population dynamics of Neisseria gonorrhoeae in Shanghai, China: a comparative study. BMC Infect. Dis. 10:13. doi:10.1186/1471-2334-10-13
  • 59. Becraft ED, Cohan FM, Kuhl M, Jensen SI, Ward DM. 2011. Fine-scale distribution patterns of Synechococcus ecological diversity in microbial mats of Mushroom Spring, Yellowstone National Park. Appl. Environ. Microbiol. 77:7689–7697.
  • 60. Zeng K, Fu YX, Shi S, Wu CI. 2006. Statistical tests for detecting positive selection by utilizing high-frequency variants. Genetics 174:1431–1439.
  • 61. Nicholas KB, Nicholas HB, Jr, Deerfield DW, II. 1997. GeneDoc: analysis and visualization of genetic variation. EMBNEW.NEWS 4:14.

Articles from Applied and Environmental Microbiology are provided here courtesy of American Society for Microbiology (ASM)
