2017 Nov 21;46(Database issue):D700–D707. doi: 10.1093/nar/gkx1124

Table 1. Overlaps in phages within data-sources.

| Data source | # clusters | % overlap * | Notes |
|---|---|---|---|
| ‘Earth's virome’ project (44) | 5412 | 57.4% | Over 3000 samples were sequenced; most are environmental samples |
| Predicted prophages in human gut (1,42) | 1505 | 18.67% | ∼1700 fecal samples from two gut metagenomic studies (1,42) |
| Predicted viral and prophage sequences from complete and draft genomes (36) | 7117 | 18.07% | |
| Predicted prophages from NCBI complete genomes (40) | 6964 | 15.4% | All available complete prokaryotic genomes (as of May 2017) |
| NCBI reference viral genome database (39) | 776 | 0.64% | |
| Predicted prophages from EMBL proGenomes database (41) | 3275 | 0.61% | Representative complete prokaryotic genomes (as of May 2017) |
| ICTV | 668 | 0 | Data obtained from the International Committee on Taxonomy of Viruses (https://talk.ictvonline.org; ICTV) |

* Within each data source, the overlap ratio is defined as the proportion of phage clusters containing multiple sequences from that data source, out of all phage clusters containing at least one sequence from the same data source.
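
The overlap ratio defined in the footnote can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `overlap_ratio`, the dictionary layout (cluster ID mapped to a list of data-source labels, one per member sequence), and the toy clusters are all hypothetical, and "multiple" is assumed to mean two or more sequences.

```python
def overlap_ratio(clusters, source):
    """Fraction of clusters with >= 2 sequences from `source`,
    among clusters with >= 1 sequence from `source`.

    clusters: dict mapping a cluster ID to a list of data-source
    labels, one label per member sequence (hypothetical layout).
    """
    with_any = 0       # clusters containing at least one sequence from `source`
    with_multiple = 0  # clusters containing two or more sequences from `source`
    for labels in clusters.values():
        n = labels.count(source)
        if n >= 1:
            with_any += 1
        if n >= 2:
            with_multiple += 1
    # A source that appears in no cluster has no defined ratio; return 0.0 here.
    return with_multiple / with_any if with_any else 0.0


# Toy clusters, invented for illustration only (not data from Table 1).
toy_clusters = {
    "c1": ["A", "A", "B"],
    "c2": ["A"],
    "c3": ["B", "B"],
    "c4": ["A", "A", "A"],
}

ratio_a = overlap_ratio(toy_clusters, "A")  # 2 of the 3 clusters containing "A"
ratio_b = overlap_ratio(toy_clusters, "B")  # 1 of the 2 clusters containing "B"
```

Under this reading, a high ratio for a data source means that many of its sequences fall into clusters shared with other sequences from the same source, while a ratio of 0 means that every cluster holds at most one sequence from that source.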