. 2022 Jul 5;13:3863. doi: 10.1038/s41467-022-31502-1

Table 1.

Main characteristics of the datasets used to construct the WIS, UNITN, and UHGG reference sets.

| N | | WIS | UNITN | UHGG |
|---|---|---|---|---|
| Samples | Samples | 51,052 | 9428 | Did not generate new assemblies from samples |
| | Body sites | 1: Gut (100%) | 5: Gut (85%), oral (8.5%), skin (5.4%), vagina (1%), and maternal milk (0.1%) | NA |
| | Countries | 2: Israel (90%) and USA (10%) | 31: USA (15%), China (14%), Israel (10%), Sweden (6%), and Denmark (6%)* | NA |
| | Age | Adults (99%) and children (<1%) | Adults (81%) and children (19%) | NA |
| | Gender | Female (61%) and male (39%) | Not specified | NA |
| Assemblies | Assemblies from samples before filtration criteria | 483,192 | 345,654 | 0 |
| | Assemblies from samples after filtration criteria# | 142,912 (30%) | 154,723 (45%) | 0 (0%) |
| | External assemblies after filtration criteria# | 98,206 (88% Passoli et al.7) | 80,990 | 286,997 (48% Passoli et al.7) |
| | Total assemblies used | 241,118 (36% Passoli et al.7) | 154,723 | 286,997 (48% Passoli et al.7) |
| Clusters | Species | 3594 | 4930 | 4644 |
| | Genera | 2365 | 2640 | Not specified |
| | Families | 627 | 778 | Not specified |

*Five most represented countries.

#Each set of assemblies went through different filtration criteria (“Methods”).
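
The percentages in the Assemblies rows appear to follow from the counts in the table. The sketch below is a quick arithmetic check, not part of the original analysis. It assumes that the percentage next to each "after filtration" count is the share of pre-filtration assemblies retained, and that the 36% in the WIS total is the 88% of external assemblies attributed to reference 7, expressed as a share of all WIS assemblies. All variable names are illustrative.

```python
# Counts copied from Table 1 (WIS and UNITN columns only).
before_filtration = {"WIS": 483_192, "UNITN": 345_654}
after_filtration = {"WIS": 142_912, "UNITN": 154_723}
wis_external = 98_206
wis_external_ref7_share = 0.88  # assumed meaning of "88% Passoli et al."


def pct(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Assumption: the bracketed percentage is the fraction retained after filtering.
retained = {k: pct(after_filtration[k], before_filtration[k]) for k in before_filtration}

# WIS total = filtered assemblies from samples + filtered external assemblies.
wis_total = after_filtration["WIS"] + wis_external

# Assumption: share of the WIS total that comes from reference 7.
wis_ref7_pct = pct(wis_external_ref7_share * wis_external, wis_total)

print(retained, wis_total, wis_ref7_pct)
```

Under these assumptions the computed values match the WIS and UNITN retention percentages, the WIS total, and the WIS reference 7 share reported in the table.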