2013 Sep 20;5(9):81. doi: 10.1186/gm485

Table 4.

Challenges for the integration of metagenomics into public health

| Challenge | Description | Relevance | Solution | Reference(s) |
| --- | --- | --- | --- | --- |
| Multiple technologies | Next-generation sequencing can be performed on multiple platforms, each with different characteristics and each constantly under improvement | Difficulty comparing results from different platforms and with those from older techniques; Universal approach not yet possible; Continuously evolving technology requires skilled workforce rather than established pipelines | Pipelines must be constantly updated to account for new techniques; Different platforms should be utilized depending on the question asked | [74,76,92] |
| Computational resources | Our ability to generate DNA sequence data has rapidly surpassed our computational ability to analyze the data | Significant requirements for storage of DNA sequence; Assembling and identifying short reads from next-generation sequencing is computationally intensive | Perform analysis using a staged approach; Cloud computing | [69,93] |
| Suitable reference databases | Multiple reference databases are available, which may generate different results depending on the database used | Certain features of a metagenomic sample might be missed if the wrong database is used; Limited by the diversity represented in each database | HMP aims to sequence multiple reference genomes associated with the human body; HMP currently has a total of 6,500 reference sequences generated | [94] |
| Short read lengths | Read lengths depend on sequencing platform used | Makes de novo assembly more complicated; More difficult to identify large-scale genomic variations and repetitive regions | Read lengths are continually increasing; Third-generation sequencing platforms promise much longer read lengths | [92,95] |
| Causation | Finding a pathogen in a disease sample does not imply causation | Important to determine causation before changing public health management; False association can lead to costly, useless or even potentially harmful therapies | Follow-up studies are required, for example using animal models or serological or epidemiological methods; Results must be independently validated | [11,75,96] |
| Contamination | Metagenomics can detect contaminants from cell cultures, reagents and laboratory equipment | Contaminants may be incorrectly associated with the disease of interest; Researchers must consider the plausibility of the findings | Negative controls must be used; Results must be independently validated | [97] |
| Privacy | Host nucleic acids are almost always sequenced in metagenomics studies | Host genetic sequences are confidential; Human subjects might be traceable from their DNA sequences | Host DNA to be available only to researchers in HMP; Only microbiome data are released to the public | [92,98] |
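
The Contamination entry states that negative controls must be used. As a rough illustration of how a control might feed into an analysis, the sketch below flags taxa whose relative abundance in a sample is not clearly higher than in a negative control. The function name `flag_likely_contaminants`, the `min_fold` threshold of 10, and the taxon labels are all hypothetical choices made for this example; they are not taken from the article or from any specific pipeline.

```python
from typing import Dict, List


def flag_likely_contaminants(
    sample_counts: Dict[str, int],
    control_counts: Dict[str, int],
    min_fold: float = 10.0,  # assumed threshold, chosen only for illustration
) -> List[str]:
    """Return taxa whose sample fraction is below min_fold times the control fraction."""
    sample_total = sum(sample_counts.values())
    control_total = sum(control_counts.values())
    if sample_total == 0 or control_total == 0:
        return []  # nothing to compare against

    flagged = []
    for taxon, count in sample_counts.items():
        control = control_counts.get(taxon, 0)
        if control == 0:
            continue  # absent from the control, so this rule does not flag it
        sample_frac = count / sample_total
        control_frac = control / control_total
        if sample_frac < min_fold * control_frac:
            flagged.append(taxon)
    return sorted(flagged)


# Hypothetical read counts per taxon in a clinical sample and a reagent-only control.
sample = {"Taxon A": 900, "Taxon B": 100}
control = {"Taxon B": 50, "Taxon C": 50}
print(flag_likely_contaminants(sample, control))  # ['Taxon B']
```

A flag produced this way is only a prompt for closer inspection. As the table notes, the plausibility of each finding still has to be considered and results independently validated.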