Abstract
Background
The Río Tinto (RT) is distinguished from other acid mine drainage systems by its natural and ancient origins. Microbial life from all three domains flourishes in this ecosystem, but bacteria dominate metabolic processes that perpetuate environmental extremes. While the patchy geochemistry of the RT likely influences the dynamics of bacterial populations, demonstrating which environmental variables shape microbial diversity and unveiling the mechanisms underlying observed patterns remain major challenges in microbial ecology. Meeting these challenges requires detailed assessments of community structure coupled with fine-scale measurements of physico-chemical parameters.
Methodology/Principal Findings
By using high-throughput environmental tag sequencing we achieved saturation of richness estimators for the first time in the RT. We found that environmental factors dictate the distribution of the most abundant taxa in this system, but stochastic niche differentiation processes, such as mutation and dispersal, also contribute to observed diversity patterns.
Conclusions/Significance
We predict that studies providing clues to the evolutionary and ecological processes underlying microbial distributions will reconcile the ongoing debate between the Baas Becking vs. Hubbell community assembly hypotheses.
Introduction
Geological and geochemical studies show the Río Tinto to be an acidic river situated at the core of the largest Pyritic Belt on Earth (Fig. 1), whose chemistry has been shaped by the metabolism of chemolithotrophic microbes bioleaching its rich metallic ores for the past 60 My [1]. These microbial activities produce sulfuric acid, resulting in a pH below 3 and high concentrations of heavy metals, very much like acid mine drainage systems but of natural and very ancient origin. The RT has also attracted the interest of astrobiologists because its geochemical characteristics are relevant to Martian hematite sites [1]. Research over the past 15 years shows the river contains predominantly microscopic organisms from the three domains of life. Bacteria outnumber archaea by at least tenfold [2]. Eukaryotes are conspicuous and diverse [3], and phototrophs and fungi comprise the largest biomass [4]. While the patchy geochemistry of the RT likely influences the dynamics of the most abundant bacterial populations [2], [5], demonstrating how environmental factors shape the community structure of low, moderate and high abundance microbes remains a first-order question in microbial ecology research. Environmental tag sequencing methods [6] are ideal for addressing this issue as they allow for deeper sampling of the molecular populations of PCR amplicons. These methods capitalize on the intrinsic phylogenetic information contained in genetically hypervariable regions of the 16S ribosomal RNA gene (rDNA) to simultaneously provide accurate assessments of the relative abundances of all microbial community members and their taxonomic affinities (Text S1). We applied Serial Analysis of Ribosomal Sequence Tags of the V6 hypervariable region (SARST-V6 [7]) to replicate samples from three sites at each of three stations along the RT (Fig. 1). We coupled these data with measurements of physico-chemical parameters to explore how the environment shapes bacterial community structure.
In this study, rather than describing the microbial community of the RT, we concentrate on microbial ecological (alpha and beta) diversity. We first aimed to demonstrate that saturation and replication, despite their scarcity in microbial ecology studies so far, are in fact essential to provide a comprehensive view of natural microbial assemblages. Our second aim was to cluster short tag sequences into ecologically differentiated populations to shed light on the evolutionary ecological processes underlying microbial diversity patterns in the RT.
Results and Discussion
Defining a criterion for clustering sequences in microbial ecology
Clustering sequences into operational taxonomic units (OTUs) is the first step in a molecular study exploring ecological diversity. Microbiologists traditionally use a 97% similarity cut-off value to form OTUs that delineate microbial species [8]. Cohan [9] and Polz et al. [10] recommend an infraspecific taxonomic level to define significant units in microbial ecology and advocate for an evolutionary ecological criterion to identify distinct microbial populations adapted to a given habitat (ecotypes). Recent bacterial diversity studies identified the presence of microdiverse rDNA clusters at the 99% similarity level denoting bacterial populations that probably arose by selective sweeps followed by effectively neutral diversification [11]–[13]. Furthermore, at least for Vibrio spp., these clusters constituted individuals different at the genomic level but whose divergence should be neutral (i.e. with no selective advantage) because of the small spatial scale in which they coexisted [14]. Through environmental sequencing of RT samples we found a total of 1,212 unique ribosomal sequence tags (RSTs) out of 10,529 SARST-V6 tags. RSTs have been deposited in GenBank under accession numbers FJ005322-FJ006533. Most of the microdiversity we observed involved sequences that cluster at >98.5% similarity. The average tag length was 62 bp, but the aligned V6 tag regions spanned 142 bp, so this cut-off corresponds to a 2 bp difference between aligned sequences. The number of clusters at this cut-off was 50% of the maximum possible number of clusters (Fig. 2). Clustering at a 3 bp difference (98%) only decreased the number of clusters by 8.6% (Fig. 2). Until methods more appropriate than similarity cut-off criteria are implemented for defining ecotypes [9], clustering sequences at 99% similarity for rDNA is the best compromise to form cohesive neutral units of diversity.
Linking physicochemical parameters with the resulting genotypic microclusters, however, is still necessary to corroborate that they are differentiated populations that constitute ecologically significant units or ecotypes [9], [10] rather than interoperon heterogeneity within one cell [15].
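The correspondence between base-pair differences and percent similarity quoted above can be checked with a short calculation. The following is a minimal sketch, assuming that similarity is simply one minus the fraction of differing positions over the 142 bp aligned V6 region; the function name is hypothetical and is not part of the analysis pipeline.

```python
def percent_similarity(differences: int, aligned_length: int = 142) -> float:
    """Percent similarity for a given number of differing aligned positions.

    Assumption: similarity = 1 - differences / aligned_length, with the
    142 bp aligned length taken from the text.
    """
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100.0 * (1.0 - differences / aligned_length)


# Similarity implied by 1, 2 and 3 bp differences over the aligned region
for diffs in (1, 2, 3):
    print(diffs, round(percent_similarity(diffs), 1))
```

Under this assumption a 2 bp difference corresponds to roughly 98.6% similarity and a 3 bp difference to roughly 97.9%, in line with the >98.5% and 98% levels mentioned in the text.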
In-depth microbial community composition: the known, the new and the rare
The majority of the 458 OTUs from this study matched sequences previously found in anthropogenically impacted acidic soils or streams but had not previously been detected in the RT using culture-dependent or other culture-independent methods to study microbial diversity. A relatively small number of OTUs dominated all sites (Fig. 3). This pattern is expected in situations where one or a few factors dominate the ecology of an assemblage [16], as in the RT [2]. The most abundant OTUs previously detected in the RT gave a 100% match to phylogenetic ribotypes of Acidithiobacillus ferrooxidans and Leptospirillum ferrooxidans and other relatively less abundant ribotypes (Fig. 3) found with the same prevalence, at the same sites and during the same time of year by colleagues using DGGE and FISH methods (see Figs. 3 and 4 and Table 5 in ref. [2]; the equivalent sites, given as this study/González-Toril et al. [2], are OR1/RT5, OR2/RT2, OR3/RT1, AG/RT6 and BE/RT9). We interpret this observation as evidence that the same bacterial populations reoccur at certain geochemically stable RT locations. We found Acidiphilium sp. related tags to be in lower numbers than in the González-Toril et al. study [2] and attribute this difference to a mismatch in our SARST-V6 primer. Nevertheless, because the bias is consistent across samples it should not invalidate our downstream ecological diversity analyses [17]. Taxa that had escaped detection in this river so far include the second most common OTU in our dataset (1,654 tags), which matches uncultured bacterial clones MPKCSC9 and TrefC11 (Fig. 3) with 100% similarity. These bacteria dominate macroscopic biofilms thriving in two acidic, metal-rich streams from copper and pyrite mines of Wales and are described as novel acidophilic autotrophic iron oxidizers [18] 99.9% similar to uncultured bacterial clones TRA3-20 and Tui3-12 from acid mine drainage areas in California and New Zealand, respectively.
Our OTU also matches these two clones at 100% similarity and has its highest relative abundance at RT sites Anabel's Garden AG1 (175 tags of 539 tags sequenced for this site) and AG3 (783 tags of 1679 tags sequenced), both with similar concentrations of As [19], Fe, S, Zn and pH to the mines where the macroscopic biofilms dominate [18]. Only a few abundant OTUs detected by SARST-V6 (Fig. 3) exhibit low similarity to anything in the databases. In contrast, of the total number of RSTs, 15% differ more than 10% from anything in the databases, and all of them are found at relatively low abundance. This result coincides with previous findings of a “rare biosphere” accompanying the most abundant taxa in microbial communities [20]. Equally remarkable is that a large proportion of less abundant members of the bacterial assemblages in the RT have a 97% similarity to rRNA gene sequences deposited in GenBank whose best match is a sequence of a microbe from an acidic environment. Among these rarer members are bacterial endosymbionts of acidophilic eukaryotes or bacteria previously reported from digestive systems, as well as free-living bacteria observed in metal impacted soils or acid mine drainages.
Measuring ecological diversity and evaluating the importance of saturation and replicate sampling
In order to generate accurate estimates of diversity, our sampling scheme consisted of a replicated sampling design and large area of coverage of the stations [21]. We measured both alpha and beta diversity. Alpha diversity provides assessments of microbial richness in a particular natural environment. Comparisons of alpha diversity are univariate: two samples could have the same species richness or evenness but not share any taxa. In contrast, beta diversity measures (dis)similarity among samples through the use of multivariate methods that compare samples based on taxon composition and relative abundance. It is important to note that taxonomic description is not required for assessment of either of these measures.
Alpha diversity
Each method for estimating richness and comparing alpha diversity between samples has advantages and drawbacks [16], [17], [22]. Rarefaction, a measure of alpha diversity that reflects sample coverage at a site, is a good comparative method of observed microbial richness among samples at the same sampling effort [16]. Statistical differences in rarefaction curves among RT sample replicates from this study emphasize the importance of replication when measuring ecological diversity. In bacterial assemblages, most taxa are rare [23] and therefore rarefaction curves continue to increase with sampling effort and rarely reach an asymptote unless diversity is very low or sampling is very thorough. In contrast to rarefaction, coverage-based non-parametric richness estimators estimate overall species richness and compare alpha diversity of communities provided they reach an asymptote [16]. In this study, non-parametric estimators Chao1 and ACE ranged between 152 and 461 estimated OTUs for all the sites considered (Table 1). Chao1 values leveled off in more samples than ACE and it reached saturation in nearly all sites and in at least one sample per site (Table 1). Representation of Chao1 against sampling effort showed that alpha diversity is not significantly different among sites (P<0.05) at the same sampling effort (Fig. 4). This is effectively true for all sites except AG1 and AG3 because they did not reach saturation (Table 1). The highest OTU richness is found in the less extreme sites, a result that coincides with DGGE analysis [2]. The estimated overall number of OTUs is low in the RT sites compared to other environments with less extreme characteristics like soils or sediments where non-parametric estimators might not perform well (cf. [17]). When dealing with highly diverse samples Hong et al. have suggested a new set of statistical approaches to calculate microbial richness from parametric models [22].
Table 1. Alpha diversity measurements at Río Tinto samples and sites.
Samples | Number of tags | Number of OTUs | ACE | Chao1 (95% CIs) |
AG1.2 | 485 | 77 | 210* | 197*(133–326) |
AG1.3 | 62 | 18 | 38 | 80(29–394) |
AG2.1 | 291 | 44 | 149 | 136*(82–285) |
AG2.2 | 116 | 15 | 45 | 35(22–93) |
AG2.3 | 719 | 43 | 153 | 170*(89–419) |
AG3.1 | 704 | 63 | 140 | 159*(105–294) |
AG3.2 | 478 | 76 | 205* | 220*(145–382) |
AG3.3 | 624 | 101 | 274 | 251*(181–388) |
BE1.1 | 543 | 87 | 332* | 354*(207–688) |
BE1.2 | 308 | 37 | 86 | 80*(55–152) |
BE1.3 | 59 | 17 | 29 | 43*(23–141) |
BE2.1 | 719 | 52 | 196* | 148*(91–294) |
BE2.3 | 462 | 68 | 333* | 252*(151–486) |
BE3.1 | 376 | 53 | 153 | 152*(95–298) |
BE3.3 | 760 | 43 | 132 | 112*(70–232) |
OR1.1 | 672 | 103 | 373 | 356(231–610) |
OR1.2 | 334 | 57 | 115* | 119(91–177) |
OR2.1 | 551 | 72 | 235* | 201*(135–341) |
OR2.2 | 344 | 40 | 96 | 160(80–438) |
OR2.3 | 451 | 76 | 219 | 202(138–345) |
OR3.1 | 776 | 90 | 383 | 308(200–530) |
OR3.3 | 695 | 51 | 123* | 116*(86–175) |
Sites | | | | |
AG1 | 547 | 83 | 231 | 211(141–363) |
AG2 | 1126 | 81 | 399 | 268*(164–500) |
AG3 | 1806 | 167 | 401 | 432(315–642) |
BE1 | 879 | 112 | 461* | 312*(216–496) |
BE2 | 910 | 99 | 275* | 277*(186–462) |
BE3 | 1136 | 78 | 217* | 152*(112–237) |
OR1 | 1006 | 129 | 370* | 310*(227–462) |
OR2 | 1246 | 130 | 346 | 328*(234–505) |
OR3 | 1471 | 110 | 222* | 254*(184–391) |
Number of tags, number of OTUs, and ACE and Chao1 non-parametric richness estimators arranged for each sample (top) and for pooled samples by site (bottom). The asterisk indicates estimators that plateau at a given number of OTUs.
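To make the Chao1 values in Table 1 more concrete, the sketch below computes a Chao1 richness estimate from a vector of per-OTU tag counts. It assumes the bias-corrected form of the estimator, S_obs + F1(F1 − 1) / (2(F2 + 1)), where F1 and F2 are the numbers of OTUs seen exactly once and twice; whether this is the exact variant reported in Table 1 is an assumption, and the counts used are a hypothetical toy example rather than RT data.

```python
from collections import Counter


def chao1(abundances):
    """Bias-corrected Chao1 richness estimate from per-OTU tag counts."""
    counts = [n for n in abundances if n > 0]
    s_obs = len(counts)                 # observed number of OTUs
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]           # singletons and doubletons
    return s_obs + f1 * (f1 - 1) / (2.0 * (f2 + 1))


# Hypothetical toy sample: 8 observed OTUs, 4 singletons, 2 doubletons
toy_counts = [10, 5, 2, 2, 1, 1, 1, 1]
print(chao1(toy_counts))
```

The estimate exceeds the observed richness only when singletons are present, which is why samples dominated by rare OTUs tend to yield estimates well above the number of OTUs actually observed.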
Beta diversity
To evaluate relationships among samples based on shared OTU relative abundance, we present results using the Morisita-Horn pairwise similarity coefficient. This index is widely used because it is less influenced by species richness and sample size than other (dis)similarity measures of quantitative data [16] and showed the best agreement between all methods employed to compare beta diversity in RT samples (see Materials & Methods). Non-metric Multi-Dimensional Scaling (MDS) ordination in conjunction with clustering analysis with the Unweighted Pair Group Method with Arithmetic mean (UPGMA) and ANalysis Of SIMilarities (ANOSIM) indicated a high similarity between Berrocal (BE) and AG2 samples (Fig. 5). Two other groups emerged from these analyses: one group included AG1 and AG3 samples and the other group Origin (OR) samples that further split at 85% similarity. Because assemblages vary in composition over space and time for stochastic reasons, sampling replication as well as saturation of alpha diversity help to capture the randomness of OTU recovery in microbial communities providing a more accurate estimate of beta diversity. For instance, OR2 samples were spread out in the 2-D plot (Fig. 5). This indicated poor replication in OTU composition of these samples, which is in agreement with the high OR2 site alpha diversity (Fig. 4). Only through replication do we obtain a better representation of the metapopulation at this site (Fig. 5 inset and see next section). Furthermore, samples that displayed unsaturated non-parametric alpha richness (Fig. 4) did not plot in the same position as when pooled by site (compare MDS plot of Fig. 5 with Fig. 5 inset) nor when comparing their distribution using environmental variables (see next section).
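As an illustration of the Morisita-Horn coefficient used above, the following sketch computes it for two samples given as aligned vectors of OTU abundances. It assumes the standard form of the index, 2 Σ x_i y_i / ((D_x + D_y) X Y), with D_x = Σ x_i² / X² and X the total count of a sample; the function name and the toy vectors are hypothetical, and this is not the EstimateS implementation.

```python
def morisita_horn(x, y):
    """Morisita-Horn similarity between two OTU abundance vectors."""
    if len(x) != len(y):
        raise ValueError("abundance vectors must list the same OTUs")
    total_x, total_y = sum(x), sum(y)
    if total_x == 0 or total_y == 0:
        raise ValueError("each sample needs at least one tag")
    d_x = sum(a * a for a in x) / total_x ** 2
    d_y = sum(b * b for b in y) / total_y ** 2
    cross = sum(a * b for a, b in zip(x, y))
    return 2.0 * cross / ((d_x + d_y) * total_x * total_y)


# Same relative abundances at different sampling depths: similarity near 1
print(morisita_horn([10, 5, 0], [20, 10, 0]))
# Equal richness but no shared OTUs: similarity 0
print(morisita_horn([3, 4, 0, 0], [0, 0, 6, 1]))
```

The second call mirrors the earlier point that two samples can have the same richness yet share no taxa, a difference that only a beta diversity measure captures.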
Linking community attributes to environmental variables
Amils et al. [24] proposed a geomicrobiological model for the RT controlled by iron and based on the geochemistry and the metabolism of the most abundant bacteria and archaea. The major non-photosynthetic primary producers A. ferrooxidans and L. ferrooxidans obtain their energy from pyrite (FeS2) and the oxidized metabolites can in turn be mineralized by heterotrophic microbes like Acidiphilium spp., Ferromicrobium or sulfate reducers. Ferric iron buffers the pH at or below pH 3. In our study, of the 22 environmental parameters measured, seven best explained the variation in the data (As, Fe, Mn, Sr, Zn, pH, and redox) (see Materials & Methods). We used these variables to perform Canonical Correspondence Analysis (CCA) using OTUs at a 99% similarity cut-off with both samples and sites. The CCA plot for samples, sites or OTUs with respect to environmental variables showed a strong correlation of the canonical axes with the variables chosen (Fig. 6). Monte Carlo permutation tests for the first and all axes for samples and sites were highly significant (P = 0.002), indicating that these environmental parameters are important in explaining community diversity. For instance, AG2 is more similar in geochemistry and relative abundance of OTUs to BE (∼30 km away) than to AG1 and AG3, only meters away (Fig. 1). OTUs that plotted near BE and AG2 sites may therefore be better adapted to relatively higher concentrations of Zn and lower concentrations of As than OTUs with a higher relative abundance at other sites (Fig. 1 and 5). Furthermore, we observed that several OTUs had exactly the same match in GenBank and occupied the same position in the ordination plot (Fig. 6). We infer they are members of the same subspecific unit or ecotype that is better adapted to particular environmental characteristics.
Implications for microbial community dynamics and biogeography
Stochastic vs. deterministic community assembly hypotheses are being tested in parallel to explain the distributional patterns of organisms in natural environments [25], [26]. In microbiology, the debate over the causes of niche apportionment started early in the nineteenth century. Baas Becking pointed towards a deterministic composition of the microbial communities with the hypothesis of “everything is everywhere, but, the environment selects” to explain his recurring observation of resuscitating microbial forms in enrichment cultures. This idea has generated much debate in recent years [27]. Hubbell's neutral theory of biodiversity and biogeography [28] examines the consequences of assuming a per capita ecological equivalence of trophically similar individuals of all sympatric species in a given community when shaped by ecological drift, random migration and random speciation. He concludes that these mechanisms decouple niche differentiation from control of species richness and relative species abundance in ecological communities. Sloan et al. [26] corroborate that immigration and chance are important processes shaping microbial communities demonstrating that stochastic neutral community models can describe the assemblage patterns of microorganisms. If we equate immigration with dispersal in the microbial world, dispersal and mutation are important processes driving bacterial population diversity patterns in the RT. Regarding mutation, we found that the most abundant OTUs are generally composed of a unique RST with the highest numbers of tags characterized by exact matches to sequences in GenBank (100% if it is a known species), and a few other unique RSTs with lower numbers of tags and correspondingly lower matches to sequences in GenBank. This pattern of within-OTU microdiversity cannot be explained by standard Taq error rates [12] alone and is best explained by high mutation rates in bacterial populations not yet being purged by selection. 
Genetic variation from mutation is an important process that might play a significant role in the population dynamics of asexual organisms [29]–[31]. In the long term, the ecosystem as a whole benefits from high biodiversity levels as it assures a good response to environmental variation. Yachi and Loreau [32] have referred to this as the “insurance effect”. With respect to dispersal, Hubbell's neutral model predicts species abundances to follow a log series distribution when immigration is unlimited if point mutation is the dominant form of speciation [28]; all RT samples from this study follow this model of species abundance when singletons are eliminated (p>0.05). Departure from the log series distribution at larger sampling efforts in RT samples might be explained by the high dispersal rate typical of bacteria (cf. [33]), which would make the tail of less abundant OTUs longer than expected for migration rates typical of macroorganisms. These less abundant taxa constitute allochthonous microbiota that arrive by dispersal and if they survive in RT extreme conditions they do so by competing for the leftovers from the dominant ecotypes. Our data agree with bacterial assemblages composed of “core” taxa and a “seed bank” [23]. The first are the most abundant and active organisms using relevant chemical elements as electron donors or acceptors and therefore adapted to fine changes in those elements in the system. The second are the “occasional taxa” derived from mutation and migration that might constitute a reservoir of diversity to respond to environmental changes (cf. [34]). For instance, the high abundance of the OTU that matches uncultured bacterial clones MPKCSC9 and TrefC11 in AG3 and AG1 sites (see above) contrasts with its low prevalence in AG2 (45 tags of 989 tags sequenced).
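For readers unfamiliar with the log series distribution mentioned above, the sketch below fits Fisher's log series to a total number of tags N and a number of OTUs S, then returns the expected number of OTUs represented by exactly k tags. It assumes the classical parameterization S = α ln(1 + N/α) with expected counts α x^k / k and x = N/(N + α). The inputs are the pooled AG3 totals from Table 1, used only as an illustration; the sketch does not reproduce the goodness-of-fit test or the removal of singletons described in the text, and the function names are hypothetical.

```python
import math


def fit_fisher_alpha(n_individuals, n_species):
    """Solve S = alpha * ln(1 + N / alpha) for alpha by bisection."""
    if not 0 < n_species < n_individuals:
        raise ValueError("need 0 < S < N")
    lo, hi = 1e-9, 1e9
    for _ in range(200):
        mid = (lo + hi) / 2.0
        # alpha * ln(1 + N / alpha) increases monotonically with alpha
        if mid * math.log1p(n_individuals / mid) < n_species:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def expected_otus(alpha, n_individuals, k):
    """Expected number of OTUs represented by exactly k tags."""
    x = n_individuals / (n_individuals + alpha)
    return alpha * x ** k / k


alpha = fit_fisher_alpha(1806, 167)   # pooled AG3: 1,806 tags, 167 OTUs
for k in (1, 2, 3):
    print(k, round(expected_otus(alpha, 1806, k), 1))
```

Summing the expected counts over all k recovers S, and weighting them by k recovers N, which provides a quick consistency check on the fitted α.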
On the other hand, the presence of endemic ecotypes that correlate with particular environmental factors seems to contradict Baas Becking's ideas on global bacterial distribution [35], [36]. Our findings show that these ideas are not contradictory. A large number of OTUs match at 100% similarity those in geographically distant environments with similar physico-chemical characteristics (see above). Another interesting case is an OTU that had a 100% match to a sequence from a symbiont originally described from an amoeba in Iron Mountain (California), Candidatus Captivus acidiprotistae [37]. This suggests a global distribution of its eukaryotic host via adaptation to environments with low pH and high Fe, As, Sr and Mn. Because we are dealing with a highly variable region of the small-subunit rRNA gene, as confirmed by the high mutation rate within each OTU, the presence of these highly similar sequences across the globe can only be explained if they are part of the same genetic pool. Coincidentally, the dominance of the best competitors in a given environment is predicted by simulation when dispersal is not limited [38]. Thus, our results favor a scenario in which high immigration rates allow the global dispersion of ecotypes better adapted to certain environmental conditions, which prevail over less adapted units that emerge locally. Pommier et al. (2007) and Ramette et al. (2007) have suggested a similar pattern of global deterministic ecotype adaptation [39], [40]. Whether we consider this cosmopolitanism of ecotypes or local adaptation at a global scale is a question of lexical taste. Under this hypothesis, dispersal rather than niche differentiation is therefore the process ultimately responsible for the observed deterministic pattern of the most abundant members of the communities, which reconciles neutral and deterministic models of microbial community assembly.
Perspectives
Seasonal sampling that integrates bacterial, archaeal and eukaryal components of the microbial community is the necessary next step to understanding whether interactions among all trophic levels in the RT confirm or reject our scenario of the global distribution of adapted ecotypes.
Materials and Methods
Sampling sites, sample collection and DNA extraction
Our study included three stations in the RT that have distinct physico-chemical parameters and biology [2], [3], [41]: 1) the river's Origin (OR), 2) Anabel's Garden (AG) and 3) Berrocal (BE) (Fig. 1). At the OR station (N 37° 43.32′×W 6° 33.06′) we sampled three sites a few meters apart, including OR3, which has some of the most extreme conditions along the river. The AG station (N 37° 43.49′×W 6° 33.62′) contains abundant and distinct biofilms. AG sampling sites are in a small stream and in a small ephemeral pool fed by seeps along the stream bank. The geochemical characteristics change over a relatively small spatial scale at AG. Higher water flow at the BE station (N 37° 35.58′×W 6° 33.04′) produces a well-mixed water column, making it our most homogeneous station. In October 2002, we sampled surface water in triplicate from three different sites at each of the three stations. We designated our samples using the following naming convention: station abbreviation, site number, sample replicate number; e.g. AG1.2 is the second replicate sample from site 1 at the AG station. We rinsed 4 L plastic buckets three times with water from each site immediately before each replicate sample collection. We filtered 1–2 L from each sample by hand on site through 0.22 µm Sterivex filters (Millipore, Billerica, MA USA) and post-washed filters with 2 mL sterile acid water (pH 1.8).
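The sample naming convention lends itself to simple parsing when pooling replicates by site. The sketch below is a hypothetical helper, not part of the original analysis scripts: it splits a name such as AG1.2 into station, site and replicate, and sums tag counts per site. The two counts in the usage example are the AG1 replicate totals listed in Table 1.

```python
import re
from collections import defaultdict

# Station abbreviation (OR, AG or BE), site number, dot, replicate number
SAMPLE_RE = re.compile(r"^(?P<station>OR|AG|BE)(?P<site>\d+)\.(?P<replicate>\d+)$")


def parse_sample(name):
    """Split a sample name like 'AG1.2' into (station, site, replicate)."""
    match = SAMPLE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized sample name: {name!r}")
    return match["station"], int(match["site"]), int(match["replicate"])


def pool_by_site(tag_counts):
    """Sum per-sample tag counts into per-site totals."""
    pooled = defaultdict(int)
    for name, n_tags in tag_counts.items():
        station, site, _ = parse_sample(name)
        pooled[f"{station}{site}"] += n_tags
    return dict(pooled)


print(parse_sample("AG1.2"))
print(pool_by_site({"AG1.2": 485, "AG1.3": 62}))
```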
We added Cell Lysis Solution from the Puregene® DNA extraction kit (Gentra Systems, Inc, Minneapolis, MN USA) directly to the Sterivex filter using a 3 cc syringe, sealed the filter, and placed it into a liquid nitrogen dry shipper (Model SC14/2V, Custom BioGenic Systems, Shelby Township, MI). We extracted total DNA within one week of collection using the Puregene Bacteria DNA purification procedure with the following modifications. We added lysozyme (67 µL of 50 mg/mL solution) and proteinase K (10 µL at 20 mg/mL) consecutively and directly to the Sterivex filters and incubated these enzymes with agitation as indicated in the protocol. We then transferred incubated samples to three 2-mL microfuge tubes to proceed with the protein precipitation step. Nucleic acid precipitation occurred in 1 volume of isopropanol. Finally, we resuspended DNA pellets in 30 µL Puregene® DNA Hydration Solution per sample and stored them at −20°C until further processing.
Physicochemical measurements
For each sample, both filtered and unfiltered 15 mL water aliquots were analyzed using Total Reflection X-ray Fluorescence (TXRF) at the Universidad Autónoma de Madrid (UAM Scientific Service, Spain) to determine the concentration of 22 chemical elements in the water samples examined. Because the filtered and unfiltered geochemistries were not significantly different, we report results for unfiltered samples.
We measured redox potential and pH (using a Crison 506 pH/Eh meter) and conductivity (using an Orion-122 conductivity meter) at the time of water collection from 15 mL aliquots. Oxygen concentration and water temperature were measured using an Orion-810 oxymeter in situ in the river at the time of water collection. These two parameters varied with time of day as the sun rose and heated the river, so they were not included in our analyses.
SARST-V6 amplification, sequencing, sequence analysis and Operational Taxonomic Unit (OTU) determination
SARST-V6 produces sequences of large concatemers of PCR-amplified ribosomal sequence tags (RSTs) from homologous V6 hypervariable regions. We performed amplification and purification of the V6 region of bacteria following [7], except that we used Accuprime™ Supermix (Invitrogen Inc., Carlsbad CA, USA) at a later phase of this project. PCR products were then ligated into concatemers, cloned and sequenced as previously described. A single sequence product contains information on multiple bacteria present in the DNA sample in the form of RSTs. The pipeline for SARST-V6 sequence analysis [42] parses concatemers into single RSTs, purges artifacts and pools RSTs into unique tag sequences. A combination of BLAST against the GenBank database (http://www.ncbi.nlm.nih.gov/GeneBank) and RDPQuery [43] against the RDPII database [44] guided taxonomic assignments of tags. A quality control step served to remove tags that hit non-ribosomal sequences, phage, virus, plasmid, chloroplast or vector sequences in GenBank. We then imported unique RSTs into ARB [45] along with the sequences of top GenBank and RDPII matches not already in ARB to generate a multiple sequence alignment used to pare down tags that violated secondary structure in the V6 stem. The Clusterer program version 1.1.20060314 [46] served to group aligned sequences into OTUs. Because average and single linkage clustering algorithms are considered to be less conservative and more dependent on sampling intensity [47], we used the complete linkage algorithm with default parameters, except that we collapsed subsequent gaps to avoid overestimating distances from the rapidly diverging V6 region. We employed customized Perl scripts to construct abundance matrices accounting for the number of unique tag sequences per sample for each particular OTU and for each BLAST top hit GenBank gi number. Names of OTUs follow the first BLAST hit that matches the most abundant RST of the cluster, regardless of the OTU's identity to other hits.
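The clustering step can be illustrated with a small sketch. It assumes that "collapsing subsequent gaps" means counting each run of consecutive gap columns as a single difference, and it implements a naive complete linkage agglomeration in which two clusters merge only if every pair of their members differs by at most a chosen number of positions. This is not the Clusterer program's code; the function names and the toy aligned sequences are hypothetical.

```python
def gap_collapsed_distance(a, b):
    """Differences between two aligned sequences, counting each run of
    consecutive gap columns (gap in exactly one sequence) only once."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = 0
    in_gap_run = False
    for ca, cb in zip(a, b):
        if (ca == "-") != (cb == "-"):      # gap in exactly one sequence
            if not in_gap_run:
                diffs += 1
            in_gap_run = True
        else:
            in_gap_run = False
            if ca != cb:                    # ordinary substitution
                diffs += 1
    return diffs


def complete_linkage(seqs, max_diff):
    """Merge clusters while the largest pairwise distance between their
    members stays within max_diff; returns lists of sequence indices."""
    clusters = [[i] for i in range(len(seqs))]
    while True:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                link = max(gap_collapsed_distance(seqs[i], seqs[j])
                           for i in clusters[p] for j in clusters[q])
                if link <= max_diff and (best is None or link < best[0]):
                    best = (link, p, q)
        if best is None:
            return clusters
        _, p, q = best
        clusters[p] += clusters[q]
        del clusters[q]


toy = ["ACGTACGTAC", "ACGTACGTAT", "ACG---GTAC", "TTTTACGTAC"]
print(complete_linkage(toy, max_diff=2))
```

With complete linkage, a sequence joins a cluster only if it is close to every member, which is why this criterion is more conservative than single or average linkage.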
Ecological diversity measurements
Alpha or inventory diversity
We first compared diversity between samples by representing relative abundances of OTUs in a rank/abundance plot. We then tested whether the data fit one of four statistical models of species abundance distributions: the geometric series, log normal, log series, or broken stick models. We transformed total abundance data into 0/1 matrices as input into the program EstimateS [48] to compute rarefaction curves, non-parametric richness estimators and several indices of alpha diversity.
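Rarefaction can be sketched analytically. The example below assumes individual-based rarefaction with the hypergeometric formula E[S_n] = Σ_i [1 − C(N − N_i, n) / C(N, n)], where N_i is the number of tags in OTU i and N the total number of tags. This is an illustrative alternative and not necessarily the procedure EstimateS applied to the 0/1 matrices described above; the function name and counts are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def rarefied_richness(abundances, n):
    """Expected number of OTUs in a random subsample of n tags."""
    counts = [c for c in abundances if c > 0]
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("subsample size must lie between 0 and the total")
    denom = comb(total, n)
    # comb(total - c, n) is 0 when the subsample cannot miss OTU i entirely
    return sum(1 - comb(total - c, n) / denom for c in counts)


toy_counts = [50, 30, 10, 5, 3, 1, 1]      # hypothetical per-OTU tag counts
for n in (10, 50, 100):
    print(n, round(rarefied_richness(toy_counts, n), 2))
```

Evaluating the function at a common n for several samples gives the comparison "at the same sampling effort" that rarefaction curves are used for.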
Beta or differentiation diversity
We also used EstimateS to calculate Morisita-Horn, and the newly developed Chao-Jaccard and Chao-Sørensen abundance based beta similarity estimators [16], [49]. We further calculated Bray-Curtis similarities as similarity coefficients normalized by sample size using the PRIMER-E Ltd [50] software package. This software was used to perform non-metric multi-dimensional scaling (MDS) in conjunction with clustering analysis with the Unweighted Pair Group Method with Arithmetic mean (UPGMA). MDS was performed with 100 restarts at different random positions of samples to avoid local minima. To test the null hypothesis that there were no differences in community composition among sites we used ANalysis Of SIMilarities (ANOSIM) with the software PRIMER-E. ANOSIM is a simple non-parametric test better than the classical multivariate analysis of variance (MANOVA) [51] for this purpose [50]. It is based on the calculation of the R statistic over the rank similarities between samples, whose values can lie between −1 and 1. ANOSIM uses a Mantel permutation procedure combined with a randomization approach to generate significance levels (Monte Carlo tests).
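The R statistic can be illustrated as follows. This sketch assumes the usual definition R = (r_B − r_W) / (M/2), where r_B and r_W are the mean ranks of between-group and within-group dissimilarities and M = n(n − 1)/2 is the number of sample pairs; similarities are assumed to have been converted to dissimilarities (for example as 1 − similarity). It is not the PRIMER-E implementation, it omits the permutation step that yields the significance level, and the toy matrix is hypothetical.

```python
from itertools import combinations


def average_ranks(values):
    """Ranks starting at 1; tied values share their mean rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(dissim, groups):
    """ANOSIM R from a square dissimilarity matrix and group labels."""
    pairs = list(combinations(range(len(groups)), 2))
    ranks = average_ranks([dissim[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    if not within or not between:
        raise ValueError("need both within- and between-group pairs")
    mean_b = sum(between) / len(between)
    mean_w = sum(within) / len(within)
    return (mean_b - mean_w) / (len(pairs) / 2)


toy = [[0.00, 0.10, 0.80, 0.90],
       [0.10, 0.00, 0.70, 0.85],
       [0.80, 0.70, 0.00, 0.20],
       [0.90, 0.85, 0.20, 0.00]]
print(anosim_r(toy, ["A", "A", "B", "B"]))
```

In this toy case every within-group dissimilarity is smaller than every between-group one, so R reaches its upper bound; values near zero would indicate no separation between groups.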
We can consider our samples as real (independent) replicates of the studied sites because repetitive sampling was done independently from one sample to the next and the biological system under consideration is dynamic (the river water flows so the actual sampling space will never be the same from one moment to the next). This assumption not only validates ANOSIM analysis [50] but also allowed us to perform CCA with samples separately as replicates of the particular physico-chemical characteristics of a site to determine how samples behaved independently, and also because a larger number of samples allows for testing a larger number of environmental variables in constrained ordination. Although this is not always the case and careful consideration to this matter is necessary in microbial studies, this is the maximum level of replication that can be achieved when trying to explain beta diversity through environmental variables in natural environments (cf. [16]).
Using environmental data to explain diversity data: Canonical Correspondence Analysis (CCA)
We used CCA as a constrained ordination direct gradient analysis method to relate RSTs grouped into OTUs to the environmental variables measured. We used CANOCO 4.5 [52] to perform CCA with scaling focused on inter-sample distances for the sample vs. environmental variable biplot and inter-species distances for the species vs. environmental variable biplot. These biplots were then superimposed. When performing constrained ordination it is important to limit the number of explanatory (environmental) variables so that it does not exceed the number of samples; otherwise the analysis becomes unconstrained and no different from indirect gradient analysis techniques such as DCA [52]. To perform CCA we used a combination of CANOCO's manual forward selection feature, Pearson correlation, and knowledge of the ecology of the river to select the environmental variables that could serve as proxies of others. To statistically evaluate the significance of the first canonical axis and of all canonical axes together, we used the Monte Carlo permutation full model test (whenever possible) or reduced model test with 199 unrestricted permutations. The program CANODRAW within the CANOCO package helped to visualize the resulting biplots.
Supporting Information
Acknowledgments
We thank M. Neel, J. H. Martiny, Mitchell Sogin, Frederick Cohan, Martin Polz, David Mark Welch and R. Amann for suggestions, Abby Laatsch for sequencing and Ivica Ceraj and Robert K. Colwell for software support.
Footnotes
Competing Interests: The authors have declared that no competing interests exist.
Funding: This work was supported by grants from NASA Astrobiology Institute (NC-1054 LAZ) and from the Ministry of Science and Education (CGL2006/02534/BOS RA) and support to CP by the Spanish Centro de Astrobiología. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Fernández-Remolar D, Gómez-Elvira J, Sebastian E, Martín J, Manfredi JA, et al. The Tinto River, an extreme acidic environment under control of iron, as an analog of the Terra Meridiani hematite site of Mars. Planet Space Sci. 2004;52:239–248.
- 2. González-Toril E, Llobet-Brossa E, Casamayor EO, Amann R, Amils R. Microbial ecology of an extreme acidic environment, the Tinto River. Appl Environ Microbiol. 2003;69:4853–4865. doi: 10.1128/AEM.69.8.4853-4865.2003.
- 3. Amaral Zettler LA, Gomez F, Zettler ER, Keenan BG, Amils R, et al. Eukaryotic diversity in Spain's River of Fire. Nature. 2002;417:137. doi: 10.1038/417137a.
- 4. Aguilera A, Manrubia SC, Gomez F, Rodriguez N, Amils R. Eukaryotic community distribution and its relationship to water physicochemical parameters in an extreme acidic environment, Rio Tinto (Southwestern Spain). Appl Environ Microbiol. 2006;72:5325–5330. doi: 10.1128/AEM.00513-06.
- 5. Karavaiko GI, Turova TP, Kondrat'eva TF, Lysenko AM, Kolganova TV, et al. Phylogenetic heterogeneity of the species Acidithiobacillus ferrooxidans. Int J Syst Evol Microbiol. 2003;53:113–119. doi: 10.1099/ijs.0.02319-0.
- 6. Green BD, Keller M. Capturing the uncultivated majority. Curr Opin Biotechnol. 2006;17:236–240. doi: 10.1016/j.copbio.2006.05.004.
- 7. Kysela DT, Palacios C, Sogin ML. Serial analysis of V6 ribosomal sequence tags (SARST-V6): a method for efficient, high-throughput analysis of microbial community composition. Environ Microbiol. 2005;7:356–364. doi: 10.1111/j.1462-2920.2004.00712.x.
- 8. Stackebrandt E, Goebel B. Taxonomic note: A place for DNA-DNA reassociation and 16S rRNA sequence analysis in the present species definition in bacteriology. Int J Syst Bacteriol. 1994;44:846–849.
- 9. Cohan FM. Towards a conceptual and operational union of bacterial systematics, ecology, and evolution. Philos Trans R Soc Lond B Biol Sci. 2006;29:1985–1996. doi: 10.1098/rstb.2006.1918.
- 10. Polz M, Hunt D, Preheim S, Weinreich D. Patterns and mechanisms of genetic and phenotypic differentiation in marine microbes. Philos Trans R Soc Lond B Biol Sci. 2006;361:2009–2021. doi: 10.1098/rstb.2006.1928.
- 11. Klepac-Ceraj V, Bahr M, Crump BC, Teske AP, Hobbie JE, et al. High overall diversity and dominance of microdiverse relationships in salt marsh sulphate-reducing bacteria. Environ Microbiol. 2004;6:686–698. doi: 10.1111/j.1462-2920.2004.00600.x.
- 12. Acinas SG, Klepac-Ceraj V, Hunt DE, Pharino C, Ceraj I, et al. Fine-scale phylogenetic architecture of a complex bacterial community. Nature. 2004;430:551–554. doi: 10.1038/nature02649.
- 13. Koeppel A, Perry EB, Sikorski J, Krizanc D, Warner A, et al. Identifying the fundamental units of bacterial diversity: A paradigm shift to incorporate ecology into bacterial systematics. Proc Natl Acad Sci U S A. 2008;105:2504–2509. doi: 10.1073/pnas.0712205105.
- 14. Thompson JR, Pacocha S, Pharino C, Klepac-Ceraj V, Hunt DE, et al. Genotypic diversity within a natural coastal bacterioplankton population. Science. 2005;307:1311–1313. doi: 10.1126/science.1106028.
- 15. Cilia V, Lafay B, Christen R. Sequence heterogeneities among 16S ribosomal RNA sequences, and their effect on phylogenetic analyses at the species level. Mol Biol Evol. 1996;13:451–461. doi: 10.1093/oxfordjournals.molbev.a025606.
- 16. Magurran AE. Measuring biological diversity. Malden: Blackwell Science Ltd; 2004. p. 256.
- 17. Hughes JB, Hellmann JJ, Ricketts TH, Bohannan BJ. Counting the uncountable: statistical approaches to estimating microbial diversity. Appl Environ Microbiol. 2001;67:4399–4406. doi: 10.1128/AEM.67.10.4399-4406.2001.
- 18. Hallberg KB, Coupland K, Kimura S, Johnson DB. Macroscopic streamer growths in acidic, metal-rich mine waters in North Wales consist of novel and remarkably simple bacterial communities. Appl Environ Microbiol. 2006;72:2022–2030. doi: 10.1128/AEM.72.3.2022-2030.2006.
- 19. Slayman C. Proton chemistry and the ubiquity of proton pumps. BioScience. 1985;35:16–17.
- 20. Sogin ML, Morrison HG, Huber JA, Welch DM, Huse SM, et al. Microbial diversity in the deep sea and the underexplored “rare biosphere”. Proc Natl Acad Sci U S A. 2006;103:12115–12120. doi: 10.1073/pnas.0605127103.
- 21. Dobyns JR. Effects of sampling intensity on the collection of spider (Araneae) Species and the estimation of species richness. Environ Entomol. 1997;26:150–162.
- 22. Hong S-H, Bunge J, Jeon S-O, Epstein SS. Predicting microbial species richness. Proc Natl Acad Sci U S A. 2006;103:117–122. doi: 10.1073/pnas.0507245102.
- 23. Pedros-Alio C. Marine microbial diversity: can it be determined? Trends Microbiol. 2006;14:257–263. doi: 10.1016/j.tim.2006.04.007.
- 24. Amils R, Gonzalez-Toril E, Fernández-Remolar D, Gomez F, Aguilera A, et al. Extreme environments as Mars terrestrial analogs: The Rio Tinto case. Planet Space Sci. 2007;55:370–381.
- 25. McGill BJ, Maurer BA, Weiser MD. Empirical evaluation of neutral theory. Ecology. 2006;87:1411–1423. doi: 10.1890/0012-9658(2006)87[1411:eeont]2.0.co;2.
- 26. Sloan WT, Lunn M, Woodcock S, Head IM, Nee S, et al. Quantifying the roles of immigration and chance in shaping prokaryote community structure. Environ Microbiol. 2006;8:732–740. doi: 10.1111/j.1462-2920.2005.00956.x.
- 27. de Wit R, Bouvier T. ‘Everything is everywhere, but, the environment selects’; what did Baas Becking and Beijerinck really say? Environ Microbiol. 2006;8:755–758. doi: 10.1111/j.1462-2920.2006.01017.x.
- 28. Hubbell SP. The Unified Neutral Theory of Biodiversity and Biogeography. Princeton, NJ: Princeton University Press; 2001. p. 448.
- 29. Boles BR, Thoendel M, Singh PK. From the Cover: Self-generated diversity produces “insurance effects” in biofilm communities. Proc Natl Acad Sci U S A. 2004;101:16630–16635. doi: 10.1073/pnas.0407460101.
- 30. de Visser JAGM, Rozen DE. Clonal Interference and the Periodic Selection of New Beneficial Mutations in Escherichia coli. Genetics. 2006;172:2093–2100. doi: 10.1534/genetics.105.052373.
- 31. Yooseph S, Sutton G, Rusch DB, Halpern AL, Williamson SJ, et al. The Sorcerer II Global Ocean Sampling expedition: expanding the universe of protein families. PLoS Biol. 2007;5:e16. doi: 10.1371/journal.pbio.0050016.
- 32. Yachi S, Loreau M. Biodiversity and ecosystem productivity in a fluctuating environment: the insurance hypothesis. Proc Natl Acad Sci U S A. 1999;96:1463–1468. doi: 10.1073/pnas.96.4.1463.
- 33. Magurran AE, Henderson PA. Explaining the excess of rare species in natural species abundance distributions. Nature. 2003;422:714–716. doi: 10.1038/nature01547.
- 34. Falkowski PG, Godfrey LV. Electrons, life and the evolution of Earth's oxygen cycle. Philos Trans R Soc Lond B Biol Sci. 2008;363:2705–2716. doi: 10.1098/rstb.2008.0054.
- 35. Whitaker RJ, Grogan DW, Taylor JW. Geographic barriers isolate endemic populations of hyperthermophilic archaea. Science. 2003;301:976–978. doi: 10.1126/science.1086909.
- 36. Pommier T, Pinhassi J, Hagström Å. Biogeographic analysis of ribosomal RNA clusters from marine bacterioplankton. Aquat Microb Ecol. 2005;41:79–89.
- 37. Baker BJ, Hugenholtz P, Dawson SC, Banfield JF. Extremely acidophilic protists from acid mine drainage host Rickettsiales-lineage endosymbionts that have intervening sequences in their 16S rRNA genes. Appl Environ Microbiol. 2003;69:5512–5518. doi: 10.1128/AEM.69.9.5512-5518.2003.
- 38. Hurtt GC, Pacala SW. The consequences of recruitment limitation: Reconciling chance, history, and competitive differences between plants. J Theor Biol. 1995;176:1–12.
- 39. Pommier T, Canback B, Riemann L, Bostrom KH, Simu K, et al. Global patterns of diversity and community structure in marine bacterioplankton. Mol Ecol. 2007;16:867–880. doi: 10.1111/j.1365-294X.2006.03189.x.
- 40. Ramette A, Tiedje JM. Biogeography: an emerging cornerstone for understanding prokaryotic diversity, ecology, and evolution. Microb Ecol. 2007;53:197–207. doi: 10.1007/s00248-005-5010-2.
- 41. Lopez-Archilla AI, Marin I, Amils R. Microbial Community Composition and Ecology of an Acidic Aquatic Environment: The Tinto River, Spain. Microb Ecol. 2001;41:20–35. doi: 10.1007/s002480000044.
- 42. Palacios C, Olsson B, Lebaron P, Sogin ML. New high-throughput biotechnologies for sampling the microbial ecological diversity of the oceans: the informatics challenge. 2006. Proceedings ‘Ocean Biodiversity Informatics’–International Conference on Marine Biodiversity Data Management, Hamburg, Germany, 29 November–1 December 2004 IOC.
- 43. Dyszynski G, Sheldon WM. RDPquery: A Java program from the Sapelo Program Microbial Observatory for automatic classification of bacterial 16S rRNA sequences based on Ribosomal Database Project taxonomy and Smith-Waterman alignment. (http://simo.marsci.uga.edu/public_db/rdp_query.htm, [version used])
- 44. Cole JR, Chai B, Farris R, Wang Q, Kulam SA, et al. The Ribosomal Database Project (RDP-II): sequences and tools for high-throughput rRNA analysis. Nucleic Acids Res. 2005;33:1362–4962. doi: 10.1093/nar/gki038.
- 45. Ludwig W, Strunk O, Westram R, Richter L, Meier H, et al. ARB: a software environment for sequence data. Nucleic Acids Res. 2004;32:1363–1371. doi: 10.1093/nar/gkh293.
- 46. Klepac-Ceraj V, Ceraj I, Polz MF. Clusterer: extendable java application for sequence grouping and cluster analyses. Online J Bioinformatics. 2006;7:15–21.
- 47. Schloss PD, Handelsman J. Introducing DOTUR, a Computer Program for Defining Operational Taxonomic Units and Estimating Species Richness. Appl Environ Microbiol. 2005;71:1501–1506. doi: 10.1128/AEM.71.3.1501-1506.2005.
- 48. Colwell RK. EstimateS: Statistical estimation of species richness and shared species from samples. 2005. Version 7.5. User's Guide and application published at: http://purl.oclc.org/estimates.
- 49. Chao A, Chazdon RL, Colwell RK, Shen T-J. A new statistical approach for assessing compositional similarity based on incidence and abundance data. Ecol Lett. 2005;8:148–159.
- 50. Clarke KR, Warwick RW. Change in marine communities: an approach to statistical analysis and interpretation, 2nd edition. Plymouth: Primer-E; 2001.
- 51. Legendre P, Legendre L. Numerical ecology. Amsterdam, The Netherlands: Elsevier; 1998. p. 445.
- 52. ter Braak CJF, Šmilauer P. CANOCO Reference Manual and CanoDraw for Windows User's Guide: Software for Canonical Community Ordination (version 4.5). Ithaca NY, USA: Microcomputer Power; 2002. p. 500.