Abstract

Oil in subsurface reservoirs is biodegraded by resident microbial communities. Water-mediated, anaerobic conversion of hydrocarbons to methane and CO2, catalyzed by syntrophic bacteria and methanogenic archaea, is thought to be one of the dominant processes. We compared 160 microbial community compositions in ten hydrocarbon resource environments (HREs) and sequenced twelve metagenomes to characterize their metabolic potential. Although anaerobic communities were common, cores from oil sands and coal beds had unexpectedly high proportions of aerobic hydrocarbon-degrading bacteria. Likewise, most metagenomes had high proportions of genes for enzymes involved in aerobic hydrocarbon metabolism. Hence, although HREs may have been strictly anaerobic and typically methanogenic for much of their history, this may not hold today for coal beds and for the Alberta oil sands, one of the largest remaining oil reservoirs in the world. This finding may influence strategies to recover energy or chemicals from these HREs by in situ microbial processes.
Introduction
Microbial communities degrading hydrocarbon in oil fields with resident temperatures below 80–90 °C are thought to consist primarily of strictly anaerobic taxa, which convert light oil into heavy oil and then into bitumen by targeting low molecular weight components.1−5 As a result, the Alberta oil sands, spanning 140 000 km² and estimated to hold two trillion barrels of bitumen, may once have been charged with twice this volume of lighter oil, which biodegraded over geological time scales.6 Although, in many cases, the current activities of microbes in oil sands, coal beds, and other hydrocarbon resource environments (HREs) are still poorly understood, it is known that microorganisms have substantial positive and negative impacts on these resources and the energy extraction processes.3,5 The negative effects include souring, corrosion, and biofouling. These can increase the cost of production and negatively impact the environment. However, microorganisms from HREs are also useful for bioremediation, bioconversion, and enhanced energy recovery.3,5 Clearly, more information is needed to model, predict, and harness in situ activities with a view to reducing the environmental impact of fossil fuel production.
To acquire this information we have launched the Hydrocarbon Metagenomics Project (http://hydrocarbonmetagenomics.com/) and report here on the characterization of microbial communities in 160 samples from diverse HREs in North America by sequencing 16S rRNA genes. Multivariate analysis of microbial composition was used to select twelve samples for more extensive metagenomic analysis. The environmental impact of fossil fuel production and use has become a major issue for society, and a survey conducted as part of our project has indicated that reducing this impact (“greening”) should receive high priority and would positively influence public opinion of the fossil fuel industry (Supporting Information (SI), Table S1). Hence, there are multiple drivers for undertaking a comparative analysis of microbial communities in HREs.
Experimental Section
Sample Collection
Oil sands cores from steam-assisted gravity drainage (SAGD) exploratory drills were frozen at the well site immediately after collection by being placed outdoors. While still frozen, they were transported to a company laboratory, where each core was cut longitudinally and a V-notch was cut from the flat face for bitumen content measurement. Frozen half-cores, stored at −80 °C, were divided into 5-cm subsamples by cutting with a sterile rock saw. Subsequently, the outside surfaces of the cores were aseptically removed and the interior core material was used for DNA analysis.
Tailings pond samples were collected from Suncor ponds 5 (TP5; UTM 467138E 6318316N) and 6 (TP6; UTM 466358E 6319838N and 466418.9E 6320256.5N), at depths ranging from 2 to 29 m below the surface (mbs) in 2008, 2010, and 2011. Samples were collected into sterile 1-L Nalgene bottles filled to the top to limit air exposure. Pond access and sampling procedures have been described elsewhere.7−9 Upon arrival in the lab, samples were immediately placed in an anaerobic chamber containing 90% N2 and 10% CO2. Sub-samples for biodiversity studies were stored at −80 °C. The samples had a solids content from 30–60% (w/w) and an average pH of 7.5. Samples (1 L) from three sites in the Mildred Lake Settling Basin (MLSB), operated by Syncrude Canada Ltd (UTM 461400E 6325200N, 461169E 6325679N and 460613E 6326695N) were obtained at depths of 1.1 to 35.8 mbs. These had solids contents from 20 to 70% (w/w) and a bulk pH of 8 to 8.7. Surface waters (0–10 cm) were obtained from Suncor tailings ponds and from MLSB, as well as from Syncrude’s West In-Pit pond (Table S2), as described elsewhere.10
Produced water samples were collected from 11 production wells of the Medicine Hat Glauconitic C (MHGC) field (UTM 523092E 5543313N) near Medicine Hat, Alberta, which is a shallow (850 mbs), low temperature (30 °C) field from which water and heavy oil with an API gravity of 16° are produced by water injection.11,12 Samples were collected in sterile 1-L Nalgene bottles filled completely and transported to the lab within 5 h of collection. The 1-L bottles were then transferred into an anaerobic hood (10% CO2, 90% N2), where 100 mL of sample were used for DNA extraction.
Coalbed methane (CBM) samples were either cores, cuttings, or produced waters.13,14 CBM cuttings (pieces of core obtained by rotary drilling) were grouped into cuttings from less than 1000 mbs (CBM_cuts) and deep cuttings from greater than 1000 mbs (CBM_dcuts). At the well site, cores were cut into approximately 15-cm lengths and placed in sterile vacuum bags. Cuttings were collected into sterile PVC containers (35 by 15 cm) and sealed. Produced water samples were collected in sterile 4-L fuel cans to overflowing. All samples were transported to the lab within 24 h of collection. Once in the laboratory the cores were placed in an anaerobic hood (5% H2, 95% N2) and sections from the inner core were removed for DNA extraction. Cuttings were also handled in the anaerobic hood and subsamples were taken for DNA extraction. The water samples were filtered through a 0.2-μm Sterivex filter to collect biomass for DNA extraction.
DNA Extraction
The reproducibility of DNA extraction and amplification procedures was verified (Figure S1). DNA was extracted from Suncor tailings samples according to the bead-beating method of Ramos-Padrón et al.9 (Method 3 in Figure S1) and from 5_TP_MLSB samples as described by Foght et al.15 (Method 5). Alternatively, tailings were suspended in 0.1 M pyrophosphate buffer pH 7 to dislodge cells, which were then lysed by a Marmur-type procedure (Method 2). Surface tailings ponds waters were filtered (200 mL through a 0.22-μm filter) to concentrate bacteria and suspended solids. The filters were then extracted using the FastDNA Extraction Kit for Soil (MP Biomedicals), using procedures described by the manufacturer (Method 1). DNA was extracted from 1–2 g of oil sands core sections using the same FastDNA Extraction Kit according to the manufacturer’s protocol with minor changes (Method 4). Pellets on the catch tube were washed with 600–2400 μL of 5.5 M guanidine thiocyanate to remove humic acids and other contaminants before releasing the DNA with the SEWS-M wash. DNA was extracted from CBM cores, cuttings, and produced water samples according to the bead-beating method of Foght et al.15 (Method 5). CBM waters were filtered through 0.2-μm Millipore or Sterivex filters to collect biomass. Three aliquots of 0.5 g of cores and cuttings were extracted separately before pooling the final DNA extracts prior to PCR.
Community Analysis by 16S Amplicon Pyrosequencing
The 16S rRNA genes were amplified from extracted DNA by PCR using primers 454T_RA_X and 454T_FwB, which have universal 16S primer sequences 926Fw (AAACTYAAAKGAATTGRCGG) and 1392R (ACGGGCGGTGTGTRC) at their 3′-ends. The former was a modification of 926F (AAACTYAAAKGAATTGACGG) designed to improve coverage of methanogenic taxa. Primer 454T_RA_X has a 25 nucleotide A-adaptor (CGTATCGCCTCCCTCGCGCCATCAG) and a 10 nucleotide multiplex identifier barcode sequence, whereas primer 454T_FwB has a 25 nucleotide B-adaptor sequence (CTATGCGCCTTGCCAGCCCGCTCAG). Reactions (50 μL) contained 20 pmol/μL each of forward and reverse primers in 2 μL, 25 μL of 2 x PCR master mix [containing 0.05 units/μL Taq DNA polymerase, reaction buffer, 4 mM MgCl2 and 0.4 mM of each dNTP (Fermentas)], 21 μL of nuclease-free water, and 2 μL of DNA template (10–100 ng). PCR was performed with an initial denaturation of 3 min at 95 °C, followed by 25–50 cycles of 30 s at 95 °C, 45 s at 55 °C, and 1.5 min at 72 °C, and a final elongation of 10 min at 72 °C. The PCR product was quality-checked on a 0.7% agarose gel, purified with a QIAquick PCR Purification Kit (Qiagen), and its concentration was determined by a Qubit Fluorometer (Invitrogen), using a Quant-iT dsDNA HS Assay Kit (Invitrogen) or using a NanoDrop spectrophotometer (Fisher Scientific). Amplicons (typically 25 μL of 5 ng/μL) were subjected to pyrosequencing at the McGill University and Genome Quebec Innovation Centre, Montreal, Quebec, using a Genome Sequencer FLX Instrument and GS FLX Titanium Series Kit XLR70 (Roche Diagnostics Corporation). Typically, 60 samples were multiplexed on a single analysis chip, yielding 5000 to 15 000 reads per sample.
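The effect of the primer modification described above can be illustrated by expanding the degenerate positions. The following is a minimal sketch, not part of the original workflow; it assumes the standard IUPAC nucleotide codes (R = A/G, Y = C/T, K = G/T), and the function name `expand` is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide codes; only those used by these primers are listed.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT",
}

def expand(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]

# Primer sequences as given in the text.
PRIMER_926FW = "AAACTYAAAKGAATTGRCGG"  # modified forward primer
PRIMER_926F = "AAACTYAAAKGAATTGACGG"   # original forward primer
PRIMER_1392R = "ACGGGCGGTGTGTRC"       # reverse primer

for name, seq in [("926Fw", PRIMER_926FW),
                  ("926F", PRIMER_926F),
                  ("1392R", PRIMER_1392R)]:
    print(name, len(seq), "nt,", len(expand(seq)), "variants")

# Variants matched only by the modified primer (G instead of A at position 17).
extra = set(expand(PRIMER_926FW)) - set(expand(PRIMER_926F))
print(len(extra), "additional forward-primer variants")
```

Under these codes, replacing A with R at a single position doubles the number of forward-primer variants while retaining every variant of the original 926F primer.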
Analysis of 16S Amplicon Pyrosequencing Data
The analyses were conducted using Phoenix 2.16 The raw 16S rRNA sequence data are available from the Sequence Read Archive (SRA) at NCBI under accession numbers listed in Table S2. From these reads, operational taxonomic units (OTUs) were generated using average neighbor clustering at 5% distance cutoffs. The OTUs were assigned to taxa using the RDP classifier on the SILVA Small Subunit rRNA Database Release 108 (SSU ref NR 108; http://www.arb-silva.de/no_cache/download/archive/release_108/Exports). A bootstrap value cutoff of 60% was used for the assignments, which resulted in the taxa being at the genus or a higher taxonomic rank and the number of taxa being smaller than that of OTUs. Samples were compared via clustering and ordination methods in reduced space, using Mothur.17 The Bray–Curtis index was used as a measure of dissimilarity between communities. Communities were clustered into Newick-formatted trees using the UPGMA algorithm implemented in Mothur. The sample relation tree in Newick format was visualized using MEGA.18 Differences and similarities among amplicon libraries were also explored with the Non-Metric Multidimensional Scaling (NMDS) ordination method in Mothur using the majorization algorithm from Borg and Groenen.19 Sample OTUs were subsequently grouped by environment types identified in the NMDS to facilitate the comparison of alpha diversity between environment types. To determine if the spatial separations of the groups observed in the NMDS plot were statistically significant, the Analysis of Molecular Variance (AMOVA) and weighted Unifrac tools in Mothur were used. Mothur was also used to determine the pairwise correlation between each possible pair of OTUs identified in all of the 160 samples.
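The dissimilarity and clustering steps described above can be sketched in a few lines. This is an illustrative stand-in written in plain Python, not the Mothur implementation used in the study; the sample names and OTU counts are hypothetical, and branch lengths are omitted from the Newick output for brevity.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two OTU count vectors."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / total

def upgma(names, counts):
    """Cluster samples by UPGMA; return a Newick string (topology only)."""
    # Each cluster id maps to (Newick text, number of samples it contains).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    dist = {(i, j): bray_curtis(counts[i], counts[j])
            for i in clusters for j in clusters if i < j}
    next_id = len(names)
    while len(clusters) > 1:
        i, j = min(dist, key=dist.get)          # closest pair of clusters
        (ni, si), (nj, sj) = clusters.pop(i), clusters.pop(j)
        merged = {}
        for k in clusters:
            dik = dist[(min(i, k), max(i, k))]
            djk = dist[(min(j, k), max(j, k))]
            # UPGMA: size-weighted average of the two old distances.
            merged[(k, next_id)] = (si * dik + sj * djk) / (si + sj)
        dist = {p: d for p, d in dist.items() if i not in p and j not in p}
        dist.update(merged)
        clusters[next_id] = (f"({ni},{nj})", si + sj)
        next_id += 1
    (newick, _), = clusters.values()
    return newick + ";"

# Hypothetical OTU counts for four samples (columns are OTUs).
samples = ["anaerobic_1", "anaerobic_2", "aerobic_1", "aerobic_2"]
counts = [
    [90, 10, 0, 0],
    [80, 20, 0, 0],
    [0, 5, 60, 35],
    [0, 0, 75, 25],
]
tree = upgma(samples, counts)
print(tree)
```

In this toy input the two anaerobic-like samples and the two aerobic-like samples each join first, which is the kind of grouping a UPGMA tree on Bray–Curtis distances is meant to expose.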
Co-Occurrence Analysis
From the taxonomic classification results from the RDP classifier with the Silva training data set, 41 orders with more than 0.1% abundance in at least one of the samples were selected for further analysis. The correlation values among the orders were then calculated using the relative abundance percentages of each order across all the 160 samples. For this calculation, we used the “otu.association” function of the Mothur package version 1.27, with the Spearman correlation, which outputs a positive or negative correlation value for each pair of orders compared. In the current order-level analysis, C (41, 2) = 820 pairs of orders exist, each of which has a correlation value. We then formed networks of orders using thresholds on the correlation values, which can range from −1 to 1. To extract those orders with positive correlation among them, we used a threshold of 0.5, resulting in two separate networks, Network A and Network B. Likewise, a threshold of −0.5 was used to extract negatively correlated orders. For each of the extracted set of orders, we used Cytoscape version 2.8.3 to visualize the corresponding network. In each of the networks shown, the nodes represent orders, and an edge between two nodes indicates that there is a strong correlation (as determined by the threshold), either positively or negatively. The node sizes represent the numbers of samples in which the orders are observed, and the edge widths represent the correlation magnitudes.
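The thresholding procedure described above can be sketched as follows. This is a plain-Python illustration rather than the Mothur `otu.association` code; the abundance values are hypothetical, and treating the thresholds as inclusive (correlation ≥ 0.5 or ≤ −0.5) is an assumption, since the text does not say whether the cutoffs are strict.

```python
from itertools import combinations
from math import comb, sqrt

def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            result[order[pos]] = mean_rank
        start = end + 1
    return result

def spearman(x, y):
    """Spearman correlation, computed as Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no rank information
    return cov / sqrt(var_x * var_y)

def correlation_edges(abundance, threshold=0.5):
    """Split order pairs into positive and negative network edges."""
    positive, negative = [], []
    for a, b in combinations(sorted(abundance), 2):
        rho = spearman(abundance[a], abundance[b])
        if rho >= threshold:
            positive.append((a, b, rho))
        elif rho <= -threshold:
            negative.append((a, b, rho))
    return positive, negative

# Number of order pairs in the study: C(41, 2).
print(comb(41, 2))

# Hypothetical relative abundances (%) of three orders in five samples.
abundance = {
    "Methanosarcinales": [30, 25, 2, 1, 0],
    "Syntrophobacterales": [12, 10, 1, 2, 0],
    "Burkholderiales": [1, 2, 40, 35, 50],
}
positive, negative = correlation_edges(abundance)
print(positive)
print(negative)
```

The edge lists produced this way correspond to what is loaded into a network viewer, with the correlation magnitude available as an edge weight.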
Metagenome Sequencing, Assembly, and Analysis
As described in detail in the SI, ten single-end and two paired-end shotgun DNA libraries were constructed and sequenced with 454 and Illumina technology, respectively. Following quality control and assembly,20−22 assembled contigs and singletons longer than 200 bp were submitted to the Integrated Microbial Genomes and Metagenomics (IMG/M) system23 for annotation. Functional profiling focused on the presence of genes for O2-independent and O2-dependent degradation of aromatic hydrocarbons, as well as of genes for methanogenesis and methane oxidation.
Results
Richness and Diversity
One hundred and sixty samples from 10 HREs, including Athabasca oil sands, oil sands tailings ponds,8−11 a conventional oil field,11,12 and CBM fields,13,14 were analyzed via 16S rRNA gene pyrotag sequencing analysis (Table 1, Table S2). These mostly represented environments with low in situ temperature (10–40 °C) and low salinity (1–10 g of NaCl/L). A total of 972 802 quality-controlled 16S rRNA gene reads, distributed over 13 865 operational taxonomic units (OTUs) were obtained (Table 1). Differences in community compositions are summarized in Figure 1. Community compositions clustered distinctly according to HRE, except for tailings pond surface water samples (TP_surface) and the CBM samples, which clustered together. In agreement with this HRE-dependent clustering, statistical analyses indicated that the variation in community compositions within each of the 10 HREs was smaller than the variation between HREs (Table S3).
Table 1. Survey of Hydrocarbon Resource Environments (HREs) Sampled in This Studyᵃ
| HREᵇ | depth (mbs)ᶜ | t (°C) | number of samples | number of reads | OTUs (95% identity) | estimated total OTUs (Chao)ᵈ | Shannon evenness indexᵈ | clades (Figure 1) | number of taxaᵈ | archaeal taxa (%) | bacterial taxa (%) | Ravᵉ | σRᵉ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| oil_sands | 295–299 | 10 | 30 | 134,158 | 2,833 | 4,940 | 0.52 | 4 | 528 | 17 | 83 | –0.32 | 0.46 |
| TP5 | 2–29 | 15–20 | 13 | 55,216 | 3,369 | 6,097 | 0.66 | 3 | 327 | 27 | 73 | 0.11 | 0.17 |
| TP6 | 3–18 | 15–20 | 17 | 111,249 | 3,116 | 5,815 | 0.56 | 4 | 531 | 49 | 50 | 0.44 | 0.26 |
| TP_MLSB | 1–36 | 15–20 | 11 | 70,929 | 2,409 | 4,816 | 0.51 | 3 | 410 | 64 | 36 | 0.56 | 0.42 |
| TP_surface | 0–0.1 | 20 | 15 | 125,461 | 2,995 | 5,589 | 0.58 | 8 | 505 | 1 | 97 | –0.17 | 0.23 |
| oil_field | 850 | 30 | 30 | 231,194 | 1,936 | 3,255 | 0.46 | 6 | 407 | 69 | 31 | 0.74 | 0.18 |
| CBM_cores | 315–585 | 20 | 8 | 18,598 | 906 | 1,909 | 0.50 | 5 | 242 | 3 | 86 | –0.33 | 0.22 |
| CBM_cuts | 140–327 | 10–20 | 11 | 33,134 | 1,387 | 2,630 | 0.48 | 4 | 260 | 1 | 98 | –0.32 | 0.11 |
| CBM_dcuts | 1042–1610 | 40–50 | 10 | 75,916 | 1,057 | 2,081 | 0.42 | 3 | 302 | 0 | 100 | 0.13 | 0.30 |
| CBM_water | 300–500 | 30 | 15 | 116,947 | 1,507 | 2,750 | 0.50 | 11 | 359 | 18 | 82 | 0.02 | 0.40 |
| Sum, total, or averageᶠ | | | 160 | 972,802 | **13,865** | | | | **1,020** | *25* | *74* | | |

ᵃ The number of samples, and the total number of reads and of derived OTUs obtained for these samples, are shown for each HRE. The number of taxa represented by these OTUs and their distribution (%) over Archaea or Bacteria, as well as the average R-score (Rav) and its standard deviation (σR) for all samples of an HRE are also indicated. The number of reads and other information, including the R-score for individual samples, is provided in Table S2.
ᵇ Oil_sands, samples of oil sand cores; TP5 and TP6, samples of Suncor tailings pond (TP) 5 and 6; TP_MLSB, samples from Syncrude’s Mildred Lake Settling Basin; TP_surface, samples of surface water of TPs; Oil_field, samples of produced water from the Medicine Hat Glauconitic C field; CBM_cores, samples of coalbed methane (CBM) cores; CBM_cuts, samples of CBM cuttings; CBM_dcuts, samples of deep CBM cuttings; CBM_water, samples of CBM produced waters.
ᶜ Meters below the surface.
ᵈ Identified or calculated as described in the text.
ᵉ Average R-score, indicating whether communities contain anaerobic (R = 1.00) or aerobic (R = −1.00) taxa; σR represents the standard deviation from the mean.
ᶠ Sum, total (bold), or average (italics).
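The richness and evenness columns of Table 1 can be illustrated with a short sketch. It assumes the bias-corrected Chao1 estimator and Pielou's form of Shannon evenness (H / ln S); the text does not state which variants were used, so both choices, the function names, and the example counts are assumptions.

```python
from math import log

def chao1(otu_counts):
    """Bias-corrected Chao1 richness: S_obs + F1(F1 - 1) / (2(F2 + 1))."""
    observed = [c for c in otu_counts if c > 0]
    f1 = sum(1 for c in observed if c == 1)  # OTUs seen exactly once
    f2 = sum(1 for c in observed if c == 2)  # OTUs seen exactly twice
    return len(observed) + f1 * (f1 - 1) / (2 * (f2 + 1))

def shannon_evenness(otu_counts):
    """Shannon index divided by its maximum, ln(number of observed OTUs)."""
    observed = [c for c in otu_counts if c > 0]
    if len(observed) < 2:
        return 0.0  # evenness is undefined for a single OTU; report 0
    n = sum(observed)
    h = -sum((c / n) * log(c / n) for c in observed)
    return h / log(len(observed))

# Hypothetical read counts per OTU for one sample.
example = [50, 30, 10, 5, 2, 2, 1, 1, 1]
print(chao1(example))
print(round(shannon_evenness(example), 2))
```

A Chao estimate well above the observed OTU count, as in every row of Table 1, indicates that many rare OTUs were likely missed at the achieved sequencing depth.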
Figure 1.
(I) Dendrogram for 160 amplicon libraries from 10 HREs (Table 1), generated using the UPGMA algorithm with the distance between communities calculated using the Bray–Curtis coefficient in the Mothur software package. Libraries with more than 72% sequence similarity were collapsed into the same clade; the number is indicated in parentheses. Clades with samples used for metagenome analysis are indicated (red solid star). Note that three samples in clade 18 were used for metagenome sequencing. (II) Presence of orders from Network A (red, predominantly anaerobic), Network B (blue, predominantly aerobic), and Group C (green), indicated in Figure 2. The R-score indicates the degree to which communities in each clade vary from strictly anaerobic (R = 1.00) to strictly aerobic (R = −1.00).
Nature of Prokaryotic Communities in HREs
Taxonomic assignment of OTU consensus sequences resulted in identification of 242–531 taxa for each HRE and 1020 taxa for all HREs (Table 1). The fractions of taxa belonging to the Domain Archaea or Bacteria are indicated in Table 1 for each HRE. Because the Archaea consisted mostly (97.3%) of methanogens, their presence served as an initial indicator of whether HREs harbored predominantly anaerobic communities. For example, amplicons from TP_MLSB and oil_field had on average 64% and 69% sequences from methanogenic Archaea (Table 1), whereas CBM_cores, CBM_cuts, and CBM_dcuts had only 0–3% sequences from this group.
Because the data represent the distribution of taxa in a large number of samples from diverse HREs, we used co-occurrence analysis24 to further characterize the structure of microbial communities. Analysis of all 13 865 OTUs indicated two main networks, Network A and Network B, identified at the taxonomic level of order in Figure 2. In addition, the 109 most prevalent genera (Table S4A) were characterized based on the ability of all validated species within these genera to grow by aerobic respiration. A total of 1955 species in these 109 genera, listed in the List of Prokaryotic Names with Standing in Nomenclature (LPSN; http://www.bacterio.cict.fr), were classified as anaerobic, facultative, aerobic, or unassigned through inspection of the literature (Table S4B). A ratio R = (N_an − N_ar)/(N_an + N_fac + N_ar) was calculated, where N_an, N_fac, and N_ar are the numbers of anaerobic, facultative, and aerobic species of a genus. Genera with anaerobic and facultative or with aerobic and facultative species are referred to as anaerobic (0 < R < 1) or aerobic (−1 < R < 0), e.g. Propionibacterium (R = 0.57; 8 strictly anaerobic and 6 facultative species) and Flavobacterium (R = −0.81; 91 strictly aerobic and 22 facultative species), respectively. Methanobacterium, Aeromonas, and Methylobacterium are examples of strictly anaerobic, strictly facultative, and strictly aerobic genera with R = 1, R = 0, and R = −1, respectively. The R-scores of the type strain were used for genera occurring less frequently than those listed in Table S4B. R-scores for samples or groups of samples, or for higher phylogenetic units, were calculated as R = Σ f_i R_i, where f_i and R_i are the fraction and R-score of each genus (i) in these groups, respectively. Average R-scores (Rav) for all samples from the ten HREs and their distribution (σR) are indicated in Table 1.
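The R-score definitions above can be checked with a short sketch. The species counts for Propionibacterium and Flavobacterium are those quoted in the text; the counts used for the strictly anaerobic and strictly facultative genera and the community fractions are hypothetical placeholders, as are the function names.

```python
def genus_r(n_an, n_fac, n_ar):
    """R-score of a genus from its anaerobic, facultative, and aerobic
    species counts: (N_an - N_ar) / (N_an + N_fac + N_ar)."""
    total = n_an + n_fac + n_ar
    if total == 0:
        raise ValueError("genus has no classified species")
    return (n_an - n_ar) / total

def community_r(fractions, r_scores):
    """R-score of a community: sum over genera of fraction * genus R-score.
    The fractions are expected to sum to 1."""
    return sum(f * r_scores[genus] for genus, f in fractions.items())

r_scores = {
    "Propionibacterium": genus_r(8, 6, 0),    # counts from the text
    "Flavobacterium": genus_r(0, 22, 91),     # counts from the text
    "Methanobacterium": genus_r(10, 0, 0),    # hypothetical count, all anaerobic
    "Aeromonas": genus_r(0, 10, 0),           # hypothetical count, all facultative
}
for genus, r in r_scores.items():
    print(genus, round(r, 2))

# Hypothetical community composition (fractions of reads per genus).
fractions = {"Methanobacterium": 0.5, "Flavobacterium": 0.3, "Aeromonas": 0.2}
sample_r = community_r(fractions, r_scores)
print(round(sample_r, 2))
```

Because facultative species count only in the denominator, a genus with many facultative members is pulled toward R = 0 regardless of whether its remaining species are aerobic or anaerobic.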
Figure 2.

Positive co-occurrence analysis of taxonomic orders present as ≥0.1% of total pyrosequencing reads using a threshold of 0.5 for the Spearman coefficient in the OTU association function of Mothur. The corresponding networks were visualized with Cytoscape. The size of a circle (node) is proportional to the number of samples in which the order was observed. The thickness of a connection (edge) is proportional to the correlation value. Network A consists of 19 orders with mostly strictly anaerobic genera. Network B consists of 11 orders with genera/species which are either strictly aerobic or facultative. Group C consists of 11 non-co-occurring orders in the α-, β-, and γ-Proteobacteria, Bacteroidetes, Firmicutes, Fusobacteria, and Planctomycetes, listed in Table S5.
Using this approach for phylogenetic orders representing >0.1% of pyrosequencing reads, we found that the 19 orders in Network A were almost entirely anaerobic (R = 1), whereas the 11 orders in Network B were predominantly aerobic (−1 < R < 0). The 11 remaining orders of Group C did not form a co-occurring network and were also predominantly aerobic. Network A contains orders potentially involved in anaerobic, methanogenic hydrocarbon metabolism,25−27 whereas many of the orders in Network B are capable of aerobic hydrocarbon metabolism. Negative co-occurrence of Clostridiales and of Burkholderiales, Caulobacterales, and Rhizobiales (Figure S2) indicates that anaerobic and aerobic hydrocarbon metabolisms by these taxa are mutually exclusive in situ. The fractions of reads corresponding to each network are indicated in Figure 1 and in Table S2 together with the calculated R-scores. Tailings ponds and the oil field housed mostly anaerobic communities (0 < R < 1) with high proportions of Network A, whereas oil sands and tailings pond surface waters housed mostly aerobic communities (−1 < R < 0) with high proportions of Network B. The proportions of each network were linearly correlated with the R-score of the samples (Figure S3A and B). Communities in samples from CBM sites had high proportions of Group C and/or of Network B (Figure 1 and Figure S3C).
Metagenomic Analysis
Twelve samples selected for metagenomic analysis (Tables S5 and S6) gave 0.6–3.1 × 10⁸ and 3.7–7.2 × 10¹⁰ of total sequenced bases, when sequenced with single-end 454 and paired-end Illumina technology, respectively (Table S6). The total length of assembled bases and the median contig length of the assembly have also been indicated in Table S6. The 16S rRNA genes, detected in the twelve metagenomic libraries, matched well to the 16S rRNA genes obtained from PCR amplicons of the same samples, except that the amplicons underrepresented Epsilonproteobacteria (order Campylobacterales) and overrepresented Euryarchaeota (orders Methanosarcinales, Methanomicrobiales, and Methanobacteriales) at the expense of other taxa (Table S5). This reflects known biases in the primer pair used for PCR amplification of 16S rRNA genes.
Metagenome analysis focused on the presence of genes for O2-independent (anaerobic) and O2-dependent (aerobic) degradation of aromatic hydrocarbons, as well as of genes for methanogenesis and methane oxidation (Table S6). Relative to metagenomes CG8, 10PW, and CG13 from CBM water and oil field water samples with R-scores from 0.96 to 0.59, metagenomes PDSYNTPW5, CO182, and CO183 for samples from tailings pond surface water and CBM cuttings samples with R-scores from −0.36 to −0.49 had a higher frequency of genes for aerobic and a lower frequency of genes for anaerobic aromatic hydrocarbon degradation, respectively. The latter metagenomes also exhibited a lower frequency of genes encoding acetotrophic or hydrogenotrophic methanogenesis pathway components, while having a higher frequency of genes for methane oxidation (Table S6). Statistical significance of increased counts for genes involved in aerobic hydrocarbon degradation and methane oxidation and of decreased counts for genes involved in anaerobic hydrocarbon degradation and methanogenesis in metagenomes PDSYNTPW5, CO182, and CO183, as compared to metagenomes CG8, CG13, and 10PW, is indicated in Figure S4. Hence, the results of both 16S rRNA gene surveys (Figure 1) and of metagenomic sequencing provide evidence for the existence of aerobic taxa and genes for aerobic hydrocarbon degradation in some HREs. This is not an artifact of sample retrieval by drilling, as communities present on the outside of cores were similar to those on the inside (results not shown) and aerobic taxa could also dominate inner core communities (Figure 3).
Figure 3.
Microbial community variation in oil sands core FB11 as a function of vertical distance. (I) Depiction of the core and its location in the subsurface (m below surface); the position of samples, representing 5-cm sections, is indicated. (II) Proportions of the microbial community that were in Network A (predominantly anaerobic), Network B (predominantly aerobic), indicated in Figure 2, and of Group C (predominantly aerobic). (III) Genus-level diversity of methanogenic Archaea in the indicated core sections. The bars represent fractions (%) of sequence reads. The R-score indicates the degree to which communities are strictly anaerobic (R = 1.00) or strictly aerobic (R = −1.00).
Microbial Communities and Metabolic Potential of Specific HREs
Oil sands samples can be obtained from surface mining or from in situ operations, targeting a pay zone at 50–100 mbs or in deeper sections at 250–500 mbs, respectively. For the latter, exploratory wells are drilled to localize the bitumen-containing layers prior to placement of horizontal wells. FB11 and two other cores from these drills at 295–299 mbs gave 30 subsamples, representing 5-cm sections of which the inner part was used for DNA isolation (Figure 3). Of these, 26 subsamples were aerobic (Figure 1: clades 1, 2, and 3, dominated by Network B), whereas 4 were anaerobic (Figure 1: clade 11, dominated by Network A). Microbial communities dominated by aerobic or anaerobic taxa were found in close proximity in FB11, together with communities intermediate between these two (Figure 3). The latter had high proportions of the methanogen Methanosarcina, which thrives in intermittently oxygenated environments.28 The observed pattern suggests limited oxygen ingress in specific regions of this core. A survey of bacterial taxa in these core sections is provided in Table S7. Network B aerobes included the orders/genera Rhizobiales/Rhizobium (R = −0.85), Burkholderiales/Cupriavidus (R = −0.85), Caulobacterales/Brevundimonas (R = −0.65), Burkholderiales/Delftia (R = −1), and Rhizobiales/Methylobacterium (R = −1). The first four of these have the potential for oxygen-dependent hydrocarbon degradation, whereas the obligately aerobic Methylobacterium oxidizes methanol.29−31 Thus extant communities in cores from these deep oil sands formations are characterized by high proportions of aerobic taxa interspersed with pockets of anaerobic communities.
Hot water extraction of mined bitumen produces one barrel (159 L) of bitumen per ton of oil sands ore, as well as 3 m³ of tailings, consisting of water, sand, clays, residual bitumen, and some diluent hydrocarbon. These are stored in tailings ponds, where anaerobic, methanogenic degradation of residual hydrocarbon causes methane emissions of up to 40 000 m³/day.7−9 Microbial communities at 3–60 mbs in Suncor TP5 and TP6 and in the Mildred Lake Settling Basin grouped together (Figure 1: clades 4 to 10) and had high fractions of anaerobic Network A with methanogenic Archaea of the order/genus Methanosarcinales/Methanosaeta, Methanomicrobiales/Methanoregula, and Methanomicrobiales/Methanolinea being most prominent. Other anaerobic taxa from Network A included the Rhodocyclales and the syntrophs Syntrophobacterales/Syntrophus and Syntrophobacterales/Smithella, which may also contribute to methanogenic hydrocarbon degradation. Functional gene analysis confirmed the presence of genes for anaerobic aromatic hydrocarbon degradation and methanogenesis, but also indicated high numbers of genes for aerobic aromatic hydrocarbon degradation and methane oxidation, indicating the potential for aerobic activity in situ (Table S6).
An estimated 20% of the methane formed in the deeper layers of tailings ponds is oxidized by methanotrophs in surface waters.10 These were mostly dominated by taxa from aerobic Network B (Figure 1: clades 35–41), including many genera of the order Burkholderiales. Strictly aerobic genera in this environment included the methanotrophic Methylococcales/Methylocaldum (Table S4A). Relative to those from other HREs, the metagenome from this environment had the highest counts of genes for aerobic aromatic hydrocarbon degradation and aerobic methane oxidation and the lowest counts of genes for anaerobic aromatic hydrocarbon degradation and methanogenesis (Table S6).
In contrast to production of highly viscous bitumen from deep oil sands by steam injection, lower viscosity crude oil is often recovered by water injection and produced as a mixture of oil and water. Samples of produced water were obtained from the Medicine Hat Glauconitic C (MHGC) field, located near Medicine Hat, Alberta, which produces oil from a depth of 850 mbs.11,12 The microbial communities in produced waters (Figure 1: clades 12–17) were dominated by taxa from anaerobic Network A, including methanogens of the orders Methanomicrobiales and Methanosarcinales and anaerobic fermenters and syntrophs of the orders Clostridiales and Syntrophobacterales (Table S4A), indicating anaerobic methanogenic oil degradation. Samples from this environment had the strongest anaerobic signature of all those surveyed (Figure 1: 0.37 < R < 0.92), although this may be exaggerated by PCR bias, which underestimated the Epsilonproteobacteria present in this HRE (Table S5). However, metagenomic functional gene comparison (Table S6) also indicated this environment to be among the most anaerobic surveyed with lower counts of genes involved in aerobic, as compared to anaerobic, hydrocarbon degradation and absence of genes for methane oxidation.
Coal is the most abundant fossil fuel on earth (bp.com/statistical review, 2012) with deep unminable coal seams having potential for recovery of coal bed methane (CBM). Microbial communities in coal deposits may generate new methane from coal under favorable growth conditions.13,14 Samples were collected from either active CBM sites or sites being explored for future CBM activities in the Western Canadian Sedimentary Basin (Alberta) and in the San Juan Basin (USA), as CBM waters, cores, or cuttings. Communities in samples from CBM sites were among the most diverse of those surveyed, often containing large fractions of orders from Group C (Figure 1), including Pseudomonadales/Pseudomonas (Table S6: R = −0.60). This genus was also prominent in other coal and coal bed produced water samples15,32,33 and may degrade coal-associated kerogen and associated solvent-extractable material. Communities from CBM_dcuts harbored unique anaerobic taxa not found in other HREs, such as Clostridiales/Acetobacterium known to generate acetic acid from H2 and CO2, and Fusobacteriales/Ilyobacter known to ferment aromatic compounds (Table S4A).34 Sulfate-reducing Desulfomicrobium species were also most prominent in deep CBM cuttings but methanogens were absent, which was confirmed by low counts of genes for enzymes involved in methanogenesis in the CBM_dcuts metagenome T_1560D (Table S6). Hence, although not methanogenic, the microbial communities in CBM_dcuts were dominated by more anaerobic functional genes than those from the more shallow CBM_cuts, which had a more aerobic functional gene count (Table S6: CO182 and CO183). Produced waters from CBM fields harbored very diverse communities, as indicated both by pyrosequencing of 16S rRNA gene amplicons (Figure 1: clades 18–25 and 42–45) and functional gene counts (Table S6). Anaerobic communities of CBM_water had high fractions of methanogenic Archaea and of the acetogenic Clostridiales/Sporomusa. These lacked genes for aerobic hydrocarbon degradation or methane oxidation (Table S6: CG8). However, aerobe-enriched CBM water communities were also found, dominated by Pseudomonadales/Pseudomonas, Rhizobiales/Rhizobium, and Alteromonadales/Marinobacter with a high gene count for aerobic hydrocarbon degradation and methane oxidation (Table S6: CG7). With the exception of some CBM_water samples, samples from CBM fields appeared to have low fractions of methanogens. The absence of methanogens from other coal samples and their presence in some produced waters has been reported elsewhere.32,35
Discussion
Microbial degradation of oil was long considered to be an aerobic process because oxygen was judged to be critical for hydrocarbon activation. This view was overthrown in the late 1980s, when bacteria capable of growth on hydrocarbons using alternate electron acceptors were isolated36 and consortia capable of water-mediated conversion of hydrocarbons to methane and CO2 were described.1,25−27 Although the gas geochemistry of heavy oils and oil sands points to past anaerobic biodegradation,6 the possibility of mixed aerobic and anaerobic oil degradation in a Brazilian oil reservoir was considered recently.37 Our finding that microbial communities in HREs are not universally dominated by anaerobes indicates that some HREs may have available oxygen at a concentration determined by convective and diffusive fluxes and by the rate with which oxygen reacts with inorganic (e.g., sulfide or pyrite) or organic targets. Convective flow of oxygen into oil-sands-containing strata involves the influx of oxygenated, precipitation-derived (meteoric) waters, which are accommodated by current upward expansion following the melting of a 2-km-thick icecap overlying this region 10 000 years ago.38 Slow oxygen-mediated degradation of oil sands bitumen may be explained by the fact that bitumen is highly viscous (10⁶ cP) under in situ conditions (10 °C) and consists largely of refractory, structurally diverse, high molecular weight molecules, such as asphaltenes and resins.39 These processes support only low numbers of microbes, as judged from the recovery of only approximately 1 ng of DNA per gram of core from this HRE. This corresponds to on the order of 10⁵ microbes per g of oil sands core, assuming a single 3 Mb genome per microbe. With the exception of CBM cores, other HREs yielded 10- to 100-fold more DNA per g or per mL, indicating correspondingly larger microbial populations.
Oxygen ingress in coal beds may similarly be promoted by poor degradability of coal-associated hydrocarbons and their slow diffusion from the solid coal matrix.13,40 Also, coal groups in the Alberta Basin are quite fractured, allowing the influx of fresh meteoric water from recharged areas and outcrop boundaries.41
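The conversion from DNA yield to cell numbers quoted above for oil sands cores can be checked with a short calculation. The following is a minimal sketch, not the authors' procedure: it assumes an average mass of about 650 g/mol per double-stranded base pair (a common textbook approximation that is not stated in the text), one 3 Mb genome per cell as in the text, and complete DNA recovery. The function names are illustrative only.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MASS_G_PER_MOL = 650.0    # assumed average mass of one double-stranded base pair


def genome_mass_g(genome_size_bp: float) -> float:
    """Mass in grams of a single genome of the given size in base pairs."""
    return genome_size_bp * BP_MASS_G_PER_MOL / AVOGADRO


def cells_per_gram(dna_ng_per_g: float, genome_size_bp: float = 3e6) -> float:
    """Estimate cells per gram of sample from the DNA yield in ng per gram.

    Assumes one genome copy per cell and 100% extraction efficiency.
    """
    dna_g_per_g = dna_ng_per_g * 1e-9
    return dna_g_per_g / genome_mass_g(genome_size_bp)


# About 1 ng of DNA per gram of oil sands core, as reported in the text.
oil_sands_estimate = cells_per_gram(1.0)

# Other HREs yielded 10- to 100-fold more DNA per g or per mL.
other_hre_low = cells_per_gram(10.0)
other_hre_high = cells_per_gram(100.0)

print(f"oil sands core: {oil_sands_estimate:.1e} cells per g")
print(f"other HREs: {other_hre_low:.1e} to {other_hre_high:.1e} cells per g or mL")
```

Under these assumptions the estimate for oil sands cores falls between 10⁵ and 10⁶ cells per gram, consistent with the order of magnitude given in the text. Incomplete DNA recovery or multiple genome copies per cell would shift the result.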
Although we have shown that, in addition to surface waters of tailings ponds,10 cores from oil sands and CBM fields also harbored communities dominated by aerobic and facultative taxa, we wish to emphasize that this conclusion applies to current in situ conditions and the extant communities sampled. Geochemical markers indicate that conversion of oil sands hydrocarbons from light oil to heavy oil to bitumen was a largely anaerobic process.1,6 Moreover, successional changes in HRE community structures over shorter time intervals associated with natural hydrocarbon conversion or human production processes cannot be ruled out. We do not draw any conclusions about the current overall hydrocarbon degradation rates by aerobic versus anaerobic communities. Understanding the nature of microbial communities presently inhabiting HREs is important for making informed choices on the potential for clean energy biotechnologies, the development of which is deemed desirable by an as-yet skeptical public (Table S1).
Acknowledgments
This work was funded by Genome Canada, Genome Alberta, the Government of Alberta, and Genome BC. Industry partners ARC Resources, Baker Hughes, ConocoPhillips, Encana, Profero Energy Inc., Shell, Suncor, Syncrude, Trident Exploration Corp., and Quicksilver are thanked for assistance in sampling, information, and comments. G.V. was supported by a Natural Sciences and Engineering Research Council of Canada (NSERC) Industrial Research Chair, and S.J.H. was supported through the Canadian Institute for Advanced Research (CIFAR). S.J.H. and S.L. were supported through the Canada Research Chairs program.
Supporting Information Available
Detailed description of the public opinion survey. Figures S1–S4: results of negative co-occurrence analysis, the R-score versus fractions of sequences in networks A and B or group C, the statistical validity of reported gene counts, and the reproducibility of DNA extraction methods used. Tables S1–S7: results of the public perception survey, additional sample information, pairwise comparison of community structures, the 22 most prevalent genera in each HRE, derivation of R-scores for 109 genera, comparison of community compositions by 16S rRNA amplicon sequencing and metagenomics, a characterization of 12 metagenomes, and a listing of bacterial taxa in the core sections of Figure 3. This material is available free of charge via the Internet at http://pubs.acs.org.
Accession Codes
16S rRNA gene sequences for the 160 samples were deposited in the Sequence Read Archive under SRA accession numbers 90658, 516417-8, 573715-8, 573725, 573727, 573736-41, 573756-63, 573768-70, 573772-73, 573775-6, 573778, 573780, 573782-4, 573786-826, 573828-30, 573832-3, 573835-6, 573839-40, 573843, 573845-9, 573851-2, 573855, 573857-71, 573874-80, 573882, 573886-902, 617109-11, 617113-8, 617126, 617130-1, 619006-7, 629337, 631208, 631211-2, 631807, 631809-10, 631815-6, 631822, 631896, and 631901 (Table S2). The metagenome sequence data were deposited under accession numbers SRX210867–SRX210872, SRX210875, SRX210880, SRX210884, SRX210886, SRX211003, and SRX211004 (Table S6). IMG/M accession numbers are also provided in Table S6.
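The 16S accession numbers above use an abbreviated range notation. A reader who wants the individual numbers could expand it as sketched below. This sketch assumes that a suffix after the hyphen replaces the same number of trailing digits of the starting number (so that 573715-8 would mean 573715 through 573718). That reading is an interpretation, not something the text states. The function names are hypothetical.

```python
from typing import List


def expand_accession_range(token: str) -> List[int]:
    """Expand one token such as '573715-8' or '90658' into a list of integers.

    Assumption: the suffix after the hyphen replaces the trailing digits of
    the starting number, e.g. '573786-826' means 573786 through 573826.
    """
    token = token.strip()
    if "-" not in token:
        return [int(token)]
    start_text, end_suffix = token.split("-", 1)
    if len(end_suffix) >= len(start_text):
        # The end is written out in full.
        end_text = end_suffix
    else:
        end_text = start_text[: len(start_text) - len(end_suffix)] + end_suffix
    start, end = int(start_text), int(end_text)
    if end < start:
        raise ValueError(f"range runs backwards: {token!r}")
    return list(range(start, end + 1))


def expand_accession_list(text: str) -> List[int]:
    """Expand a comma-separated list of numbers and abbreviated ranges."""
    ids: List[int] = []
    for token in text.split(","):
        ids.extend(expand_accession_range(token))
    return ids


# A few tokens copied from the list above.
example = expand_accession_list("90658, 516417-8, 573715-8, 573772-73")
print(example)
```

Table S2 remains the authoritative mapping between samples and accession numbers, so any expanded list should be checked against it.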
Author Contributions
D.A., S.M.C., J.S., and G.V. were primarily responsible for curating and analyzing the data and for writing the paper. All authors contributed to collecting and/or interpreting the data. K.B., X.D., P.F.D., J.F., L.M.G., S.J.H., C.L.N., A.P., S.L., C.W.S., and G.W. also contributed to the writing of the paper.
The authors declare no competing financial interest.
References
- Head I. M.; Jones D. M.; Larter S. R. Biological activity in the deep subsurface and the origin of heavy oil. Nature 2003, 426, 344–352.
- Grabowski A.; Nercessian O.; Fayolle F.; Blanchet D.; Jeanthon C. Microbial diversity in production waters of a low-temperature biodegraded oil reservoir. FEMS Microbiol. Ecol. 2005, 54 (3), 427–443.
- Ollivier B., Magot B., Eds. Petroleum Microbiology; ASM Press: Washington, DC, 2005.
- Pham V. D.; Hnatow L. L.; Zhang S.; Fallon R. D.; Jackson S. C.; Tomb J. F.; DeLong E. F.; Keeler S. J. Characterizing microbial diversity in production water from an Alaskan mesothermic petroleum reservoir with two independent molecular methods. Environ. Microbiol. 2009, 11, 176–187.
- Youssef N.; Elshahed M. S.; McInerney M. J. Microbial processes in oil fields: Culprits, problems and opportunities. In Advances in Applied Microbiology; Laskin A. I., Sariaslani S., Gadd G. M., Eds.; Academic Press: Burlington, MA, 2009; Vol. 66, pp 141–251.
- Adams J. J.; Larter S. R.; Bennett B.; Huang H.; Westrich J.; van Kruisdijk C. The dynamic interplay of oil mixing, charge timing, and biodegradation in forming the Alberta oil sands: Insights from geologic modeling and biogeochemistry. In Heavy-Oil and Oil-Sand Petroleum Systems in Alberta and Beyond; Hein F. J., Leckie D., Larter S. R., Suter J., Eds.; AAPG Studies in Geology; AAPG: Tulsa, OK, 2012; Vol. 64, pp 1–80.
- Holowenko F. M.; MacKinnon M. D.; Fedorak P. M. Methanogens and sulfate-reducing bacteria in oil sands fine tailings waste. Can. J. Microbiol. 2000, 46, 927–937. DOI: 10.1139/w00-081.
- Siddique T.; Fedorak P. M.; MacKinnon M. D.; Foght J. M. Metabolism of BTEX and naphtha compounds to methane in oil sands tailings. Environ. Sci. Technol. 2007, 41, 2350–2356.
- Ramos-Padron E.; Bordenave S.; Lin S. P.; Bhaskar I. M.; Dong X. L.; Sensen C. W.; Fournier J.; Voordouw G.; Gieg L. M. Carbon and sulfur cycling by microbial communities in a gypsum-treated oil sands tailings pond. Environ. Sci. Technol. 2011, 45, 439–446.
- Saidi-Mehrabad A.; He Z.; Tamas I.; Sharp C. E.; Brady A. L.; Rochman F.; Bodrossy L.; Abell G. C. J.; Penner T.; Dong X.; Sensen C. W.; Dunfield P. F. Methanotrophic bacteria in oilsands tailings ponds of northern Alberta. ISME J. 2013, 7, 908–921.
- Voordouw G.; Grigoryan A. A.; Lambo A.; Lin S.; Park H. S.; Jack T. R.; Coombe D.; Clay B.; Zhang F.; Ertmoed R.; Miner K.; Arensdorf J. J. Sulfide remediation by pulsed injection of nitrate into a low temperature Canadian heavy oil reservoir. Environ. Sci. Technol. 2009, 43 (24), 9512–9518.
- Agrawal A.; Park H. S.; Nathoo S.; Gieg L. M.; Jack T. R.; Miner K.; Ertmoed R.; Benko A.; Voordouw G. Toluene depletion in produced oil contributes to souring control in a field subjected to nitrate injection. Environ. Sci. Technol. 2012, 46 (2), 1285–1292.
- Strapoć D.; Mastalerz M.; Dawson K.; Macaladay J. L.; Callaghan A. V.; Wawrik B.; Turich C.; Ashby M. Biogeochemistry of microbial coal-bed methane. Annu. Rev. Earth Planet. Sci. 2011, 39, 617–656.
- Penner T. J.; Foght J. M.; Budwill K. Microbial diversity of western Canadian subsurface coal beds and methanogenic coal enrichment cultures. Int. J. Coal Geol. 2010, 82, 81–93.
- Foght J.; Aislabie J.; Turner S.; Brown C. E.; Ryburn J.; Saul D. J.; Lawson W. Culturable bacteria in subglacial sediments and ice from two Southern Hemisphere glaciers. Microb. Ecol. 2004, 47, 329–340.
- Soh J.; Dong X.; Caffrey S. M.; Voordouw G.; Sensen C. W. Phoenix 2: A locally installable large-scale 16S rRNA gene sequence analysis pipeline with Web interface. J. Biotechnol. 2013, in press. DOI: 10.1016/j.jbiotec.2013.07.004.
- Schloss P. D.; Westcott S. L.; Ryabin T.; Hall J. R.; Hartmann M.; Hollister E. B.; Lesniewski R. A.; Oakley B. B.; Parks D. H.; Robinson C. J.; Sahl J. W.; Stres B.; Thallinger G. G.; Van Horn D. J.; Weber C. F. Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl. Environ. Microbiol. 2009, 75, 7537–7541.
- Tamura K.; Peterson D.; Peterson N.; Stecher G.; Nei M.; Kumar S. MEGA5: Molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 2011, 28, 2731–2739.
- Borg I.; Groenen P. Modern Multidimensional Scaling: Theory and Applications; Springer: New York, 2010.
- Hess M.; Sczyrba A.; Egan R.; Kim T.-W.; Chokhawala H.; Schroth G.; Luo S.; Clark D. S.; Chen F.; Zhang T.; Mackie R. I.; Pennacchio L. A.; Tringe S. G.; Visel A.; Woyke T.; Wang Z.; Rubin E. M. Metagenomic discovery of biomass-degrading genes and genomes from cow rumen. Science 2011, 331, 463–467.
- Surget-Groba Y.; Montoya-Burgos J. I. Optimization of de novo transcriptome assembly from next-generation sequencing data. Genome Res. 2010, 20, 1432–1440.
- Sommer D. D.; Delcher A. L.; Salzberg S. L.; Pop M. Minimus: A fast, lightweight genome assembler. BMC Bioinform. 2007, 8, 64.
- Markowitz V. M.; Chen I. M.; Chu K.; Szeto E.; Palaniappan K.; Grechkin Y.; Ratner A.; Jacob B.; Pati A.; Huntemann M.; Liolios K.; Pagani I.; Anderson I.; Mavromatis K.; Ivanova N. N.; Kyrpides N. C. IMG/M: The integrated metagenome data management and comparative analysis system. Nucleic Acids Res. 2012, 40, D123–D129.
- Barbaran A.; Bates S. T.; Casamayor E. O.; Fierer N. Using network analysis to explore co-occurrence patterns in soil microbial communities. ISME J. 2012, 6, 343–351.
- Zengler K.; Richnow H. H.; Rossello-Mora R.; Michaelis W.; Widdel F. Methane formation from long-chain alkanes by anaerobic microorganisms. Nature 1999, 401, 266–269.
- Jones D. M.; Head I. M.; Gray N. D.; Adams J. J.; Rowan A. K.; Aitken C. M.; Bennett B.; Huang H.; Brown A.; Bowler B. F. J.; Oldenburg T. B. P.; Erdmann M.; Larter S. R. Crude-oil biodegradation via methanogenesis in subsurface petroleum reservoirs. Nature 2008, 451, 176–180.
- Gieg L. M.; Duncan K. E.; Suflita J. M. Bioenergy production via microbial conversion of residual oil to natural gas. Appl. Environ. Microbiol. 2008, 74, 3022–3029.
- Angel R.; Claus P.; Conrad R. Methanogenic archaea are globally ubiquitous in aerated soils and become active under wet anoxic conditions. ISME J. 2012, 6, 847–862.
- Fischer J.; Kappelmeyer U.; Kastner M.; Schauer R.; Heipieper H. J. The degradation of bisphenol A by the newly isolated bacterium Cupriavidus basilensis JF1 can be enhanced by biostimulation with phenol. Int. Biodeterior. Biodeg. 2010, 64, 324–330.
- Pérez-Pantoja D.; De La Iglesia R.; Pieper D. H.; González B. Metabolic reconstruction of aromatic compounds degradation from the genome of the amazing pollutant-degrading bacterium Cupriavidus necator JMP134. FEMS Microbiol. Rev. 2008, 32, 736–794.
- Šmejkalová H.; Erb T. J.; Fuchs G. Methanol assimilation in Methylobacterium extorquens AM1: Demonstration of all enzymes and their regulation. PLoS ONE 2010, 5, art. no. e13001.
- Li D.; Hendry P.; Faiz M. A survey of the microbial populations in some Australian coalbed methane reservoirs. Int. J. Coal Geol. 2008, 76, 14–24.
- Tang Y.-Q.; Ji P.; Lai G.-L.; Chi C.-Q.; Lui Z.-S.; Wu X.-L. Diverse microbial community from the coalbeds of the Ordos Basin, China. Int. J. Coal Geol. 2012, 90–91, 21–33.
- Brune A.; Evers S.; Kaim G.; Ludwig W.; Schink B. Ilyobacter insuetus sp. nov., a fermentative bacterium specialized in the degradation of hydroaromatic compounds. Int. J. Syst. Evol. Microbiol. 2002, 52, 429–432.
- Singh D. N.; Kumar A.; Sarbhai M. P.; Tripathi A. K. Cultivation-independent analysis of archaeal and bacterial communities of the formation water in an Indian coal bed to enhance biotransformation of coal into methane. Appl. Microbiol. Biotechnol. 2012, 93, 1337–1350.
- Widdel F.; Rabus R. Anaerobic biodegradation of saturated and aromatic hydrocarbons. Curr. Opin. Biotechnol. 2001, 12, 259–276.
- da Cruz G. F.; de Vasconcellos S. P.; Angolini C. F.; Dellagnezze B. M.; Garcia I. N.; de Oliveira V. M.; Dos Santos Neto E. V.; Marsaioli A. J. Could petroleum biodegradation be a joint achievement of aerobic and anaerobic microrganisms in deep sea reservoirs? AMB Express 2011, 1, 47.
- Andriashek L. D.; Atkinson N. Buried channels and glacial drift aquifers in the Fort McMurray Region. In Earth Sciences Report 2007-1; Alberta Geological Survey, Alberta Energy and Utilities Board: Edmonton, AB, 2007.
- Bennett B.; Adams J. J.; Gray N. D.; Sherry A.; Oldenburg T. B. P.; Huang H.; Larter S. R.; Head I. The controls on the composition of biodegraded oils in the deep subsurface – Part 3. The impact of microorganism distribution on petroleum geochemical gradients in biodegraded petroleum reservoirs. Org. Geochem. 2013, 56, 94–105.
- Catcheside D. E. A.; Ralph J. P. Biological processing of coal. Appl. Microbiol. Biotechnol. 1999, 52, 16–24.
- Bachu S.; Michael K. Possible controls of hydrogeological and stress regimes on the producibility of coalbed methane in Upper Cretaceous-Tertiary strata of the Alberta Basin, Canada. AAPG Bull. 2003, 87, 1729–1754.