Table 1. Phylogenomic data sources. The “Human proteins” column gives the number of genes for which the human ortholog was included in the final dataset.
Numbers in parentheses give the number of human orthologs recovered using our hmmsearch approach (see Methods). “Total kept percent after trimming” is the percentage of sites that remained in the final dataset relative to the sum of the lengths of all untrimmed human orthologs.
| Dataset name | Genes | Taxa | Sites | Average coverage (by gene) | Average coverage (by site) | Human proteins | Total kept percent after trimming | Reference |
|---|---|---|---|---|---|---|---|---|
| Philippe 2009 | 128 | 55 | 30257 | 82% | 73.1% | 0 (128) | 86.0% | Philippe et al. (2009) |
| Ryan 2013 EST | 406 | 70 | 88384 | 50% | 41.6% | 0 (396) | 62.5% | Ryan et al. (2013) |
| Whelan 2015 D1 | 251 | 76 | 81006 | 75% | 59.6% | 248 | 56.6% | Whelan et al. (2015) |
| Borowiec 2015 | 1080 | 36 | 384981 | 87% | 75.8% | 1056 | 64.7% | Borowiec et al. (2015) |
| Cannon 2016 | 212 | 78 | 44896 | 80% | 69.0% | 212 | 46.1% | Cannon et al. (2016) |
| Simion 2017 | 1719 | 97 | 401632 | 74% | 60.7% | 1499 | 40.0% | Simion et al. (2017) |
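
As a reading aid, the kept-percent definition in the caption can be written as a formula. This is a sketch of that definition only: the symbols $S_{\text{final}}$, $L_g$ and $G$ are notation introduced here, not taken from the original. It assumes that $S_{\text{final}}$ corresponds to the "Sites" column, and that $L_g$ is the untrimmed length of the human ortholog for gene $g$, including orthologs recovered by hmmsearch where the human sequence was not in the original dataset.

```latex
\[
\text{kept percent} \;=\; 100 \times \frac{S_{\text{final}}}{\sum_{g=1}^{G} L_g}
\]
```

Under these assumptions, the Philippe 2009 row (30257 sites, 86.0% kept) would imply a summed untrimmed ortholog length of roughly $30257 / 0.860 \approx 35{,}200$ residues. This is a back-calculation from the table, not a reported value.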