Appendix 1—table 5. Environmental (metagenomic) dataset description.
(A) Number of samples and sites per metagenomic project.

Dataset | Reference | Samples | Sites | Contigs |
---|---|---|---|---|
TARA | Sunagawa et al., 2015 | 242 | 141 | 62,404,654 |
Malaspina | Duarte, 2015 | 116 | 30 | 9,330,293 |
OSD | Kopf et al., 2015 | 145 | 139 | 4,127,095 |
HMP | Lloyd-Price et al., 2017 | 1,246 | 18 | 80,560,927 |

Dataset | Reference | Samples | Sites | Reads |
---|---|---|---|---|
GOS | Rusch et al., 2007 | 80 | 70 | 12,672,518 |

(B) Number of predicted genes per completeness category.

Total | "00" | "10" | "01" | "11" |
---|---|---|---|---|
322,248,552 | 118,717,690 | 106,031,163 | 102,966,482 | 75,694,123 |
Note: "00" = complete gene, with both start and stop codons identified; "10" = left boundary incomplete; "01" = right boundary incomplete; "11" = both left and right boundaries incomplete.
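
The two-character codes in the note can be read as a pair of flags, one per gene boundary. The following is a minimal sketch of that reading, assuming the first character describes the left boundary and the second the right boundary, with "0" meaning the boundary was identified and "1" meaning it is incomplete. The names `GeneCompleteness` and `decode_completeness` are hypothetical and introduced only for illustration.

```python
from typing import NamedTuple


class GeneCompleteness(NamedTuple):
    """Boundary status of a predicted gene (hypothetical helper type)."""

    left_complete: bool
    right_complete: bool

    @property
    def is_complete(self) -> bool:
        # A gene counts as complete only when both boundaries were identified.
        return self.left_complete and self.right_complete


def decode_completeness(code: str) -> GeneCompleteness:
    """Decode a two-character completeness code such as "00" or "10".

    Assumption: first character = left boundary, second = right boundary;
    "0" = boundary identified, "1" = boundary incomplete.
    """
    if len(code) != 2 or any(ch not in "01" for ch in code):
        raise ValueError(f"expected a two-character code of 0s and 1s, got {code!r}")
    left, right = code
    return GeneCompleteness(left_complete=(left == "0"), right_complete=(right == "0"))


if __name__ == "__main__":
    # Walk through the four categories in the column order of panel (B).
    for code in ("00", "10", "01", "11"):
        status = decode_completeness(code)
        print(code, status, "complete" if status.is_complete else "partial")
```

Representing the code as two booleans rather than a string makes it straightforward to filter a gene set by either boundary independently, for example to keep every gene whose left boundary was identified regardless of the right one.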