Table 1.
Experimental pitfalls and biases | General remarks and potential solutions |
---|---|
Step 1: sample collection | |
Transport and storage conditions can impact DNA yield and DNA quality prior to 16S rRNA gene NGS experiments. | Optimal preservation of microbial samples involves immediate freezing at −20 °C or lower, followed by long-term storage at −80 °C. Repeated freezing and thawing should be avoided. |
Step 2: DNA extraction | |
Different lysis methods can impact the final 16S rRNA gene NGS results. | The most efficient lysis method depends on the sample type and the target microbial species under investigation, which should ideally be determined by the end user. For reproducibility, the same method should be used in all subsequent experiments for this sample type. |
Step 3: PCR amplification | |
No 16S rRNA gene PCR primer pair is truly ‘universal’, and different primer pairs may hybridize to different proportions of ‘conserved’ sequences. The use of multi-template 16S rRNA gene PCRs inevitably generates PCR artifacts, resulting in inaccurate 16S rRNA gene NGS results. | The optimal PCR primer pair should be selected based on its binding capacity to the (expected or most clinically relevant) microbial species present in the investigated sample. The use of a clonal-based amplification methodology helps limit PCR competition-induced biases and the formation of chimeric amplicons. |
Step 4: next-generation sequencing | |
The most widely used current NGS platforms produce sequence reads that span only a few hundred nucleotides, which complicates the reliable assignment of short 16S rRNA gene sequences to reference 16S rRNA gene sequences stored in silico. | Targeting the 16S rRNA gene V4 region allows for a large overlap between the DNA sequences obtained from both ends of the PCR amplicon on Illumina’s MiniSeq/MiSeq NGS platforms. This results in accurate NGS results with negligible error rates, though at the cost of reduced discriminatory power due to the short amplicon size. |
Step 5: Bioinformatics analysis | |
The evaluation of NGS data by different bioinformatics algorithms (and their settings) may lead to different 16S rRNA gene NGS results. Accurate taxonomic identification depends on the quality and completeness of the reference databases used. | Several standardized bioinformatics pipelines are available that allow for automated sequence interpretation without the requirement for advanced bioinformatics skills. Manual evaluation of the main 16S rRNA gene NGS results is encouraged to ensure correct taxonomic identifications. |
Miscellaneous | |
16S rRNA gene NGS results are generally presented as proportional abundances of OTUs, which complicates cross-study comparability. The analysis of 16S rRNA genes is prone to the introduction of contaminating DNA derived from the experimental set-up during sample processing. | The use of protocols that determine the absolute quantity of OTUs improves the standardization of 16S rRNA gene NGS results in different studies. An adequate number of negative (extraction) control samples should be included and analyzed to identify (and remove) any 16S rRNA gene copies originating from contaminating DNA. |
OTU: operational taxonomic unit, a group of very similar sequences
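
The two remarks in the "Miscellaneous" row (absolute rather than proportional OTU abundances, and removal of OTUs traced to negative controls) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the OTU labels, the read counts, the `total_16s_copies` value (imagined here as coming from an independent quantification of the sample, such as a qPCR measurement), and the simple rule of discarding any OTU detected in a negative control. It is not a protocol described in the table, and real contamination handling is usually more nuanced.

```python
from typing import Dict

Counts = Dict[str, int]  # hypothetical structure: OTU label -> read count


def relative_abundance(counts: Counts) -> Dict[str, float]:
    """Convert read counts to proportions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {otu: n / total for otu, n in counts.items()}


def absolute_abundance(counts: Counts, total_16s_copies: float) -> Dict[str, float]:
    """Scale proportions by an independently measured total 16S copy number.

    Assumption: total_16s_copies was measured separately for this sample.
    """
    rel = relative_abundance(counts)
    return {otu: frac * total_16s_copies for otu, frac in rel.items()}


def remove_control_otus(
    sample: Counts, negative_control: Counts, min_control_reads: int = 1
) -> Counts:
    """Drop OTUs seen in the negative control (illustrative heuristic only)."""
    flagged = {
        otu for otu, n in negative_control.items() if n >= min_control_reads
    }
    return {otu: n for otu, n in sample.items() if otu not in flagged}


# Made-up example data, not taken from any study.
sample = {"OTU_1": 600, "OTU_2": 300, "OTU_3": 100}
negative_control = {"OTU_3": 40}

cleaned = remove_control_otus(sample, negative_control)
absolute = absolute_abundance(cleaned, total_16s_copies=9.0e6)

for otu, copies in absolute.items():
    print(f"{otu}: {copies:.0f} estimated 16S copies")
```

Reporting the scaled values alongside the proportions makes it possible to compare samples with different total bacterial loads, which proportions alone cannot distinguish.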