Abstract
The use and validation of a strategy that allows a universal set of bar-coded sequencing primers to be appended to an amplified PCR product is described. The strategy allows a modular approach, in that the same bar code can be used with two or more target-specific primer sets, even simultaneously.
TEXT
High-throughput sequencing technologies can produce as many as a billion sequence reads from a sample (8), making the sequencing of PCR amplicons produced from metagenomic DNA samples achievable. Bar-coding strategies are also widely employed to reduce per-sample costs and/or to analyze many samples and replicates to obtain biological insights (10). This strategy tags the PCR amplicons from each sample by using an oligonucleotide primer containing a specific sequence-based tag, which supports the later binning of the sequence data produced for each sample. Bar-coding the PCR amplicons produced from hypervariable regions of the 16S rRNA genes has now become a method of choice for the analysis of complex microbial communities (11), allowing cost-effective and greatly improved characterization of complex communities and their dynamics.
A common feature of such data is that the most predominant amplicon species often obscure the reads derived from the less abundant species. This issue can be overcome in two ways: (i) by employing, where possible, group-specific primers or (ii) by greatly increasing the sequencing depth (i.e., the number of sequences per sample). However, both strategies are associated with a cost increase due to the need to acquire new bar-coded group-specific primers for each new group to be analyzed and/or additional sequencing costs.
In an attempt to circumvent these issues, a two-step strategy was developed that allows a single set of bar-coded sequencing primers to be used with any set of specific primers (Fig. 1). In the first step, one of the target-specific primers is modified to include a linker sequence. After amplification, a second primer consisting of the bar code and linker is used to tag the amplicon. This strategy circumvents the need to purchase bar-coded primer sets for each targeted group. Furthermore, each bar-coded sequencing primer can be shared by many group-specific reactions at the same time, since the final tag is composed of the bar-code sequence and the group-specific primer.
The validity of this approach was assessed in three ways. First, the total bacterial profiles generated by this approach were compared with the profiles produced by a standard 454 “pyrotagging” approach using individual bar-coded primers, to determine whether the extra elongation step introduced any bias. Second, to illustrate the flexibility of the method, two group-specific primer sets were used. Third, to demonstrate the modularity of the approach, the primers targeting all three taxa were tagged with the same bar code. These tests were performed using 8 different metagenomic DNA samples.
The DNA samples used here were prepared from matched cecal and fecal samples from 4 mice. These samples were chosen because of their relevance to ongoing studies in the research group and because sample type was expected to be the main factor affecting community structure. Each initial PCR mixture contained 250 ng of template DNA, 5 pmol of each primer, and 12.5 μl of 2× iProof high-fidelity DNA polymerase (Bio-Rad, Hercules, CA). Each cycle consisted of 30 s at 94°C, 45 s at the desired annealing temperature, and 45 s for elongation. The primers designed and employed here are described in Table 1. For the standard pyrotagging approach, primers 8F15B and 515R14AM were employed in triplicate for each sample. Primers 8F15B and LbacteriaR were used for the first step of the modular approach, using an annealing temperature of 55°C and 25 amplification cycles. For the Desulfovibrio group, primers BdesulfoF and LdesulfoR were employed with an annealing temperature of 61°C and 35 amplification cycles. The Enterobacteriaceae amplicons were produced using BenteroF and LenteroR as primers with an annealing temperature of 58°C and 35 amplification cycles. The quality of the resulting products was checked on an agarose gel, and their concentrations were measured using a Quant-iT PicoGreen kit (Invitrogen, Switzerland).
Table 1.
Primer name | Sequence (5′–3′) | Specificity and reference |
---|---|---|
8F15B | CCTATCCCCTGTGTGCCTTGGCAGTCTCAGCAACAGCTAGAGTTTGATCCTGG | Bacteria (5) |
515R14AM | CCATCTCATCCCTGCGTGTCTCCGACTCAG-bc-TTACCGCGGCTGCT | Bacteria (6) |
BenteroF | CCTATCCCCTGTGTGCCTTGGCAGTCTCAGCAACAGCTATGGCTGTCGTCAGCTCGT | Enterobacteria (7) |
BdesulfoF | CCTATCCCCTGTGTGCCTTGGCAGTCTCAGCAACAGCTGRGYCYGCGTYYCATTAGC | Desulfovibrio (2) |
LbacteriaR | CGATTCATTAAAGCAGATCTCGATCCCTTACCGCGGCTGCT | Bacteria (6) |
LenteroR | CGATTCATTAAAGCAGATCTCGATCCCCCTACTTCTTTTGCAACCCACTC | Enterobacteria (12) |
LdesulfoR | CGATTCATTAAAGCAGATCTCGATCCCSYCCGRCAYCTAGYRTYCATC | Desulfovibrio (2) |
AbcL | CCATCTCATCCCTGCGTGTCTCCGACTCAG-bc-CGATTCATTAAAGCAGATCTCGATCCC | Linker (9) |
Bold, linker sequence; -bc-, barcode; italics, group-specific sequence; underline, linking sequence; double underline, 454 adaptor A; dotted underline, 454 adaptor B.
The references indicate the origin of the group-specific sequence.
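Several primers in Table 1 contain IUPAC ambiguity codes (R, Y, and S), so each stands for a mixture of sequences. As a minimal sketch, and not part of the original workflow, such a primer can be expanded into a regular expression for matching against reads. The helper names `primer_to_regex` and `degeneracy` are hypothetical; the sequence is the group-specific part of LdesulfoR, i.e., the bases that follow the 27-nt linker.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_to_regex(primer):
    """Compile a degenerate primer into a regex; use .match() to test a read start."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def degeneracy(primer):
    """Count the distinct plain sequences that a degenerate primer represents."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base].strip("[]"))
    return total


# Group-specific part of LdesulfoR (Table 1), without the linker.
LDESULFO_SPECIFIC = "SYCCGRCAYCTAGYRTYCATC"
pattern = primer_to_regex(LDESULFO_SPECIFIC)

if __name__ == "__main__":
    print(pattern.pattern)
    print(degeneracy(LDESULFO_SPECIFIC))
```

Exact matching is used here for brevity; a real pipeline would usually also tolerate a small number of sequencing errors in the primer region.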
For the proposed two-step approach, the first-step primers were removed from 30 ng of each PCR mixture by treatment with 7.5 U (each) of ExoI/CIP (New England BioLabs, Beverly, MA) for 20 min at 37°C, after which the enzymes were heat inactivated by incubation for 20 min at 80°C. The resulting mixtures were then used for the second PCR, which used the same forward primers but an individual bar-coded AbcL primer as the reverse primer (see Table 1). For a given sample, the same bar-coded AbcL primer was used regardless of the specificity of the forward primer. The PCRs were carried out as before but for only 10 amplification cycles. The efficiency of the bar-coded primer attachment was checked by visualizing a gel shift of the amplicons in 3% (wt/vol) LMP agarose gels.
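The modular primer design can be illustrated in silico. The sketch below assembles the first-step reverse primers (linker plus group-specific sequence) and the second-step AbcL primers (adaptor segment plus bar code plus linker) from the sequences in Table 1. The bar code shown and the function names are hypothetical and serve only as an illustration; the bar codes actually used are not listed in the text.

```python
# Segments taken from Table 1.
ADAPTOR_A = "CCATCTCATCCCTGCGTGTCTCCGACTCAG"  # part of AbcL preceding -bc-
LINKER = "CGATTCATTAAAGCAGATCTCGATCCC"        # 27-nt linker shared by all L* primers

# Group-specific 3' ends of the first-step reverse primers (Table 1).
GROUP_SPECIFIC = {
    "LbacteriaR": "TTACCGCGGCTGCT",
    "LenteroR": "CCTACTTCTTTTGCAACCCACTC",
}


def first_step_reverse(specific):
    """First-step reverse primer: linker followed by the group-specific sequence."""
    return LINKER + specific


def abcl_primer(barcode):
    """Second-step reverse primer: adaptor segment, bar code, then linker."""
    return ADAPTOR_A + barcode + LINKER


def final_tag(barcode, specific):
    """Tag expected at the start of a read: bar code, linker, specific primer."""
    return barcode + LINKER + specific


if __name__ == "__main__":
    example_barcode = "ACGAGTGC"  # hypothetical bar code, for illustration only
    for name, specific in GROUP_SPECIFIC.items():
        print(name, first_step_reverse(specific))
        print("  tag:", final_tag(example_barcode, specific))
    print("AbcL:", abcl_primer(example_barcode))
```

Because every AbcL primer ends with the linker and every first-step reverse primer begins with it, one bar-coded AbcL primer can extend any first-step amplicon, whatever its group specificity.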
Equimolar amounts of the products of the standard approach were pooled and gel extracted using a MinElute kit (Qiagen). Since each specific primer set produced various amplicon sizes, the products of the two-step approach were pooled by group specificity. Equimolar amounts of each amplicon were gel extracted as before, the concentration of the products was measured as previously described, and then the products were sequenced using a Roche 454 FLX sequencer with titanium chemistry.
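Equimolar pooling requires converting mass concentrations to molar concentrations, because amplicons of different lengths carry different numbers of molecules per nanogram. The sketch below shows the arithmetic, assuming the commonly used approximation of 660 g/mol per double-stranded base pair. The concentrations, amplicon lengths, and function names are hypothetical and are not values from this study.

```python
BP_MASS = 660.0  # approximate average mass of one double-stranded base pair (g/mol)


def fmol_per_ul(ng_per_ul, length_bp):
    """Convert a dsDNA concentration from ng/ul to fmol/ul."""
    return ng_per_ul * 1e6 / (BP_MASS * length_bp)


def pooling_volumes(amplicons, target_fmol):
    """Volume (ul) of each amplicon that contributes target_fmol to the pool.

    amplicons maps a name to a (ng_per_ul, length_bp) tuple.
    """
    return {
        name: target_fmol / fmol_per_ul(conc, length)
        for name, (conc, length) in amplicons.items()
    }


if __name__ == "__main__":
    # Hypothetical PicoGreen readings and amplicon lengths, for illustration only.
    example = {
        "Bacteria": (20.0, 550),
        "Enterobacteria": (20.0, 400),
    }
    for name, volume in pooling_volumes(example, target_fmol=100.0).items():
        print(f"{name}: {volume:.2f} ul")
```

At equal mass concentration, the longer amplicon needs a larger volume to contribute the same number of molecules.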
The sequences obtained were processed using QIIME software (1). The sequences were first filtered for correct length and quality thresholds. They were then assigned to sample- and group-specific reactions by using the bar code (sample)–linker (common)–primer (specificity) sequence as a specific tag and grouped into operational taxonomic units (OTUs) at a 0.97 similarity threshold.
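The assignment step can be sketched as a prefix match on the composite tag. This is an illustration of the logic only and not the QIIME implementation. The bar codes, sample names, and the function `assign_read` are hypothetical, only the two nondegenerate group-specific primers from Table 1 are included, and matching is exact.

```python
LINKER = "CGATTCATTAAAGCAGATCTCGATCCC"  # common linker (Table 1)

# Group-specific sequences following the linker in LbacteriaR and LenteroR.
GROUP_PRIMERS = {
    "Bacteria": "TTACCGCGGCTGCT",
    "Enterobacteria": "CCTACTTCTTTTGCAACCCACTC",
}

# Hypothetical bar codes and sample names, for illustration only.
BARCODES = {
    "ACGAGTGC": "mouse1_cecum",
    "ACGCTCGA": "mouse1_feces",
}


def assign_read(read):
    """Return (sample, group, trimmed_read), or None if the tag is not recognized.

    The expected read layout is bar code, linker, group-specific primer, insert.
    """
    for barcode, sample in BARCODES.items():
        prefix = barcode + LINKER
        if not read.startswith(prefix):
            continue
        rest = read[len(prefix):]
        for group, primer in GROUP_PRIMERS.items():
            if rest.startswith(primer):
                return sample, group, rest[len(primer):]
        return None  # bar code and linker found, but no known primer follows
    return None


if __name__ == "__main__":
    read = "ACGCTCGA" + LINKER + GROUP_PRIMERS["Bacteria"] + "GGCACGTAGTTAGC"
    print(assign_read(read))
```

Because the group identity comes from the primer and the sample identity from the bar code, the same bar code can label several group-specific reactions from one sample without ambiguity.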
To compare the profiles derived from the two strategies, the resulting OTU tables for Bacteria were first normalized for sample coverage: each profile was rarefied 100 times to the minimum number of sequences obtained for any of the bar codes, and the average counts were kept. Average values of less than 1.01 were set to 0 to reduce statistical noise. The data were then subjected to correspondence analysis using the ade4 R package (3) to explore overall differences between profiles and sample types. The χ2 dissimilarities between the profiles were also obtained. The average between-replicate dissimilarities were compared to the average dissimilarities of the two-step profiles with respect to the replicate profiles of the same sample by the use of a paired Wilcoxon signed-rank test (α = 0.05; n = 8) in R.
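The normalization described above can be sketched as repeated subsampling without replacement, followed by averaging and thresholding. This is a minimal illustration using only the Python standard library, not the code used in the study. The function name `rarefy_mean`, its parameters, and the example counts are hypothetical.

```python
import random


def rarefy_mean(counts, depth, n_iter=100, min_mean=1.01, seed=0):
    """Average OTU counts over repeated rarefactions to a fixed depth.

    counts maps OTU identifiers to integer read counts. Each iteration draws
    `depth` reads without replacement; averages below `min_mean` are set to 0.
    """
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    totals = dict.fromkeys(counts, 0)
    for _ in range(n_iter):
        for otu in rng.sample(pool, depth):
            totals[otu] += 1
    means = {otu: total / n_iter for otu, total in totals.items()}
    return {otu: (m if m >= min_mean else 0.0) for otu, m in means.items()}


if __name__ == "__main__":
    # Hypothetical OTU tables for two samples, for illustration only.
    samples = {
        "cecum": {"otu1": 500, "otu2": 300, "otu3": 1},
        "feces": {"otu1": 200, "otu2": 250, "otu3": 40},
    }
    # Rarefy every sample to the smallest library size.
    depth = min(sum(c.values()) for c in samples.values())
    for name, counts in samples.items():
        print(name, rarefy_mean(counts, depth))
```

Averaging over many draws reduces the sampling noise of a single rarefaction, and the threshold removes OTUs that appear on average about once or less per rarefied sample.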
All of the reactions described above were easily carried out in 96-well microtiter plates, and the extra steps added by the proposed strategy did not substantially increase sample processing time. Although the differences in PCR protocols might have been expected to produce some degree of differential bias between methods, the two approaches generated highly similar profiles (Fig. 2). Moreover, the community profiles produced by the modular tagged PCR method were not significantly different from those produced by the three technical replicates generated via the standard pyrotagging approach. The sample type was found to be the main driver of variation, with cecal samples showing an overall increase in the Bacteroides-to-Firmicutes ratio compared to the fecal samples (0.43 ± 0.14 and 0.33 ± 0.11, respectively), in concordance with previous observations made using human samples (4). When the group-specific PCRs were pooled in equimolar amounts and then sequenced, the proportions of sequences from the different groups were maintained, showing that no sequencing bias had been introduced.
Although the linker used here is 27 nucleotides (nt) long (9), shorter linkers can be readily developed, and the same method can be used with other high-throughput technologies such as Illumina sequencing. As such, the method further improves the efficiency and reduces the per-sample costs of sequencing amplicons from multiple samples at the same time and can be expanded to include functional genes. Provided that there is no overlap between the bar-coded, group-specific primer combinations, amplicons from different studies can be rapidly screened and sequenced together, with substantial reductions in both cost and time.
Acknowledgments
This research has been supported with funds provided by CSIRO's Preventative Health Flagship Research Program, a CSIRO OCE Science Leader award (to M.M.), and funds from CSIRO's Transformational Biology Capability Platform.
We thank Carly Rosewarne and Eline Klaassens for critical reading of the manuscript, as well as Rob Moore and Honglei Chen for the amplicon library preparation and sequencing. D.A.D.C., S.E.D., C.M., and M.M. conceived the experiment; D.A.D.C., S.E.D., and M.M. cowrote the paper. D.A.D.C. designed the validation experiment and carried out the data analysis.
Footnotes
Published ahead of print on 15 July 2011.
REFERENCES
- 1. Caporaso J. G., et al. 2010. QIIME allows analysis of high-throughput community sequencing data. Nat. Methods 7:335–336
- 2. Daly K., Sharp R. J., McCarthy A. J. 2000. Development of oligonucleotide probes and PCR primers for detecting phylogenetic subgroups of sulfate-reducing bacteria. Microbiology 146:1693–1705
- 3. Dray S., Dufour A. B. 2007. The ade4 package: implementing the duality diagram for ecologists. J. Stat. Softw. 22:1–20
- 4. Eckburg P. B., et al. 2005. Diversity of the human intestinal microbial flora. Science 308:1635–1638
- 5. Felske A., Rheims H., Wolterink A., Stackebrandt E., Akkermans A. 1997. Ribosome analysis reveals prominent activity of an uncultured member of the class Actinobacteria in grassland soils. Microbiology 143:2983–2989
- 6. Lane D. 1990. 16S and 23S rRNA sequencing, p. 115–175 In Stackebrandt E., Goodfellow M. (ed.), Nucleic acid techniques in bacterial systematics. John Wiley, New York, NY
- 7. Leser T. D., et al. 2002. Culture-independent analysis of gut bacteria: the pig gastrointestinal tract microbiota revisited. Appl. Environ. Microbiol. 68:673–690
- 8. Metzker M. L. 2010. Sequencing technologies—the next generation. Nat. Rev. Genet. 11:31–46
- 9. Nakano M., et al. 2005. Adaptor PCR for single molecule amplification. J. Biosci. Bioeng. 100:216–218
- 10. Parameswaran P., et al. 2007. A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing. Nucleic Acids Res. 35:e130
- 11. Roh S. W., Abell G. C. J., Kim K.-H., Nam Y.-D., Bae J.-W. 2010. Comparing microarrays and next-generation sequencing technologies for microbial ecology research. Trends Biotechnol. 28:291–299
- 12. Sghir A., et al. 2000. Quantification of bacterial groups within human fecal flora by oligonucleotide probe hybridization. Appl. Environ. Microbiol. 66:2263–2266