2020 Dec 21;39(5):578–585. doi: 10.1038/s41587-020-00774-7

Fig. 2. An expanded reference database of environmentally diverse complete viral genomes.

a–c, A total of 76,262 complete viral genomes were identified from publicly available metagenomes, metatranscriptomes and viromes based on the presence of a direct terminal repeat (DTR) and were clustered into 39,117 nonredundant genomes at 95% average nucleotide identity (ANI). a, The nonredundant genomes are derived from diverse human-associated and environmental habitats. Habitats are based on the Genomes OnLine Database ontology (ref. 47), and the visualization was created using RAWgraphs (ref. 48). b, The 39,117 genomes were taxonomically annotated based on clade-specific marker genes from the VOG database. c, Comparison of sequence length between GenBank genomes and DTR contigs. For box plots, the middle line denotes the median, the box denotes the interquartile range (IQR) and the whiskers denote 1.5× IQR. NCLDV, nucleocytoplasmic large DNA virus.
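
As a rough illustration of the DTR criterion mentioned in the caption, the following Python sketch checks whether a contig begins and ends with the same sequence. The function name `find_dtr` and the 20-bp minimum repeat length are assumptions made for this example; they are not taken from the caption and should not be read as the authors' pipeline.

```python
def find_dtr(seq: str, min_len: int = 20) -> str:
    """Return the longest direct terminal repeat of a contig.

    A direct terminal repeat is taken here to be a stretch of sequence
    that appears identically at the start and at the end of the contig.
    Returns an empty string if no repeat of at least `min_len` bases exists.
    `min_len` is an illustrative threshold, not a published parameter.
    """
    seq = seq.upper()
    # The two copies must not overlap, so the repeat can span
    # at most half of the contig.
    max_len = len(seq) // 2
    # Try the longest candidate first so the first hit is the longest repeat.
    for k in range(max_len, min_len - 1, -1):
        if seq[:k] == seq[-k:]:
            return seq[:k]
    return ""


# Toy contig: a 22-bp repeat flanking an unrelated middle region.
repeat = "ACGTTGCAAGCTTAGGCTAACG"
middle = "T" * 10 + "C" * 10 + "G" * 10
contig = repeat + middle + repeat

dtr = find_dtr(contig)
print(len(dtr), dtr)
```

The simple prefix/suffix comparison is quadratic in the worst case, which is acceptable for a sketch; a production tool would likely cap the repeat length or use an indexed search.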