Multiple technologies
|
Next-generation sequencing can be performed on multiple platforms, each with different characteristics and each under constant improvement
|
Difficulty comparing results from different platforms and with those from older techniques
|
Pipelines must be constantly updated to account for new techniques
|
[74,76,92]
|
Universal approach not yet possible
|
Different platforms should be used depending on the question asked
|
|
|
Continuously evolving technology requires skilled workforce rather than established pipelines
|
|
|
Computational resources
|
Our ability to generate DNA sequence data has rapidly outpaced our computational ability to analyze it
|
Significant storage requirements for DNA sequence data
|
Perform analysis using a staged approach
|
[69,93]
|
|
|
Assembling and identifying short reads from next-generation sequencing is computationally intensive
|
Cloud computing
|
|
Suitable reference databases
|
Multiple reference databases are available, which may generate different results depending on the database used
|
Certain features of a metagenomic sample might be missed if the wrong database is used
|
HMP aims to sequence multiple reference genomes associated with the human body
|
[94]
|
|
|
Limited by the diversity represented in each database
|
HMP has so far generated a total of 6,500 reference sequences
|
|
Short read lengths
|
Read lengths depend on sequencing platform used
|
Makes de novo assembly more complicated
|
Read lengths are continually increasing
|
[92,95]
|
|
|
More difficult to identify large-scale genomic variations and repetitive regions
|
Third-generation sequencing platforms promise much longer read lengths
|
|
Causation
|
Finding a pathogen in a disease sample does not imply causation
|
Important to determine causation before changing public health management
|
Follow-up studies are required, for example using animal models or serological or epidemiological methods
|
[11,75,96]
|
|
|
False association can lead to costly, useless or even potentially harmful therapies
|
Results must be independently validated
|
|
Contamination
|
Metagenomics can detect contaminants from cell cultures, reagents and laboratory equipment
|
Contaminants may be incorrectly associated with the disease of interest
|
Negative controls must be used
|
[97]
|
Researchers must consider the plausibility of the findings
|
|
|
|
Results must be independently validated
|
|
Privacy
|
Host nucleic acids are almost always sequenced in metagenomics studies
|
Host genetic sequences are confidential
|
Host DNA is available only to researchers in the HMP
|
[92,98]
|
|
|
Human subjects might be traceable from their DNA sequences
|
Only microbiome data are released to the public
|
|
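The contamination entry above recommends negative controls. As a minimal sketch of how such a control might be applied, the snippet below compares per-taxon read counts in a sample against a negative control and flags taxa whose signal is not clearly above the control. The function name `flag_contaminants`, the `min_ratio` threshold of 10, and the read counts are all hypothetical, chosen only for illustration; they are not taken from the cited studies.

```python
from typing import Dict, List, Tuple


def flag_contaminants(
    sample_counts: Dict[str, int],
    control_counts: Dict[str, int],
    min_ratio: float = 10.0,  # hypothetical threshold, not a published value
) -> Tuple[List[str], List[str]]:
    """Split the taxa found in a sample into plausible hits and suspects.

    A taxon is treated as a likely contaminant when it also appears in the
    negative control and its read count in the sample is less than
    `min_ratio` times its read count in the control.
    """
    plausible: List[str] = []
    suspect: List[str] = []
    for taxon, count in sorted(sample_counts.items()):
        control = control_counts.get(taxon, 0)
        if control > 0 and count < min_ratio * control:
            suspect.append(taxon)
        else:
            plausible.append(taxon)
    return plausible, suspect


# Made-up read counts for a clinical sample and a reagent-only control.
sample = {"Taxon A": 500, "Taxon B": 12, "Taxon C": 40}
control = {"Taxon B": 10, "Taxon C": 1}

plausible, suspect = flag_contaminants(sample, control)
print("plausible:", plausible)
print("suspect:", suspect)
```

A flagged taxon is only a candidate for exclusion; as the table notes, the plausibility of each finding still has to be judged and the result independently validated.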