TABLE 1.
Entries under TGS, SMS, and MTS (open format) and under FGAs and PGAs (closed format) describe the characteristic or consideration in the indicated type of analysis; abbreviations are defined below the table.

| Step or parameter | Characteristic or consideration | TGS | SMS | MTS | FGAs | PGAs | Comments |
|---|---|---|---|---|---|---|---|
| Sample preparation and analysis | Sample/target preparation | Complicated | Simple | Very complicated | Simple | Simple | DNA/RNA quality is important for all approaches |
| | Analysis of multiplex samples per assay | Large potential | Medium potential | Medium potential | Low (only one or two) | Low (only one or two) | FGAs and PGAs use 1 or 2 dyes for labeling, and it is difficult to multiplex samples in a single assay |
| | PCR amplification or whole-genome analysis | Yes | No | No | No/yes | Yes/no | Amplification introduces major problems for quantification |
| | Potential uneven hybridization | NA | NA | NA | Yes | Yes | Signal normalization is needed within and between arrays to correct signal differences due to systematic errors |
| Data processing and analysis | Raw data processing | Relatively easy | Difficult | Difficult | Easy | Easy | A major challenge for SMS and MTS with large raw datasets |
| | Phylogeny | Yes | Some | Some | No/yes | Yes | GeoChip uses gyrB for phylogeny |
| | Taxonomic resolution | Strain, species, genus | Strain, species | Strain, species | Strain, species | Genus, family | It depends on molecular markers with high resolution for functional genes |
| | Functional features | No/yes | Yes | Yes | Yes | No | TGS can analyze DNA and RNA for functional genes |
| | Signal threshold | Yes | NA | NA | Yes | Yes | Both PGAs and FGAs require a threshold to call positive signals, which is more or less arbitrary. Thus, some ambiguity exists for positive or negative spots. |
| | Requires a priori knowledge | No/yes | No | No | Yes | Yes | Closed-format technologies are designed based on known sequences |
| | Analysis of α diversity | Very good | Good | Very poor | Fair | Fair | Here, α diversity estimation is based on a single gene |
| | Data comparison across samples | Moderate | Difficult | Difficult | Easy | Easy | Random or undersampling is a major issue for open-format approaches |
| Performance | Coverage/breadth (no. of different genes detected) | Very low | High | High | High | Very low | TGS can analyze phylogenetic or functional genes |
| | Sampling depth (no. of sequences or OTUs per gene) | Very high | Low/medium | Low/medium | Medium | High | The sampling depth for closed-format approaches depends on the number of probes used |
| | Detection of rare species/genes | Medium | Difficult | Difficult | Easy | Easy | Easy for closed format as long as the appropriate probes are present |
| | Quantification | Low | Not known | Not known | High | Low/medium | Not rigorously tested for SMS and MTS; for PhyloChip, if RNA is used instead of DNA (no PCR step), quantification is high |
| | Susceptibility to the artifacts associated with random sampling process | Medium | High | High | Low | Medium/low | A major problem for sequencing approaches; PCR amplification may be involved in PhyloChip |
| | Potential discovery of novel genes/species | Yes | Yes | Yes | No | No | |
| | Results skewed by dominant populations | Yes | Yes | Yes | No | No | |
| | Sensitivity to (host) DNA/RNA contamination | No/yes | Yes | Yes | No | No | Difficult to remove host DNA/RNA contamination |
| Applicability and cost | Most promising applications | In-depth studies of microbial diversity or specific functional groups and discovery of novel genes | Surveys of microbial genetic diversity of unknown communities and discovery of novel genes | Surveys of functional activity of unknown microbial communities and discovery of novel genes | Comparisons of functional diversity and structure of microbial communities across many samples | Comparisons of taxonomic or phylogenetic diversity and structure of microbial communities across many samples | The choice of technology mainly depends on the biological questions and hypotheses to be addressed |
| | Relative cost per assay | Medium | High | High | Low | Low | It is challenging to make general statements of cost because they depend on technology platforms, depth of analysis, and approaches used for processing and analyzing data |
| | Cost per sample ($) | 30–150 | 1200–4000 | 1500–4500 | 150–800 | 150–1000 | This is only based on the cost of materials for target gene amplicon preparations and sequencing. |
| | Cost for bioinformatic analysis | Medium | High | High | Low | Low | |
Because the technologies have different features, a straightforward point-by-point comparison is difficult. We therefore highlight the major differences among the technologies in a general sense, focusing on issues important to microbial ecology in the context of environmental applications and complex microbial communities such as those in soil, rather than listing every difference comprehensively.
TGS, target gene (e.g., 16S rRNA, amoA, nifH) sequencing; SMS, shotgun metagenome sequencing; MTS, metatranscriptome sequencing; FGAs, functional gene arrays (the listed analysis is mostly based on GeoChip); PGAs, phylogenetic gene arrays (the listed analysis is mostly based on PhyloChip); NA, not applicable.
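
As a rough illustration of two rows in the table (α diversity estimated from a single gene, and the undersampling issue that complicates comparisons across open-format samples), the following Python sketch computes a Shannon index from OTU counts and subsamples two samples to a common depth before comparing them. The OTU counts, the function names, and the choice of the Shannon index and of rarefaction are assumptions made for illustration; they are not taken from the table or from any specific analysis pipeline.

```python
import math
import random
from collections import Counter


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over OTUs with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def rarefy(counts, depth, seed=0):
    """Randomly subsample `depth` reads without replacement; returns per-OTU counts."""
    if depth > sum(counts):
        raise ValueError("depth exceeds the number of reads in the sample")
    # Expand counts into one label per read, then draw without replacement.
    reads = [otu for otu, c in enumerate(counts) for _ in range(c)]
    drawn = Counter(random.Random(seed).sample(reads, depth))
    return [drawn.get(otu, 0) for otu in range(len(counts))]


# Hypothetical OTU counts for one marker gene in two samples of unequal depth.
sample_a = [500, 300, 150, 40, 10]
sample_b = [50, 30, 15, 4, 1]

# Compare both samples at the depth of the shallower one.
depth = min(sum(sample_a), sum(sample_b))
rare_a = rarefy(sample_a, depth)
rare_b = rarefy(sample_b, depth)

print(f"depth={depth}")
print(f"H'(A, rarefied)={shannon_index(rare_a):.3f}")
print(f"H'(B, rarefied)={shannon_index(rare_b):.3f}")
```

In this toy case, sample B is already at the common depth, so rarefying leaves it unchanged, while sample A keeps only a tenth of its reads and may lose its rarest OTUs. That is the kind of random-sampling artifact the table flags for sequencing approaches.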