Table 4.
Accuracy of supervised population structure inference with supplied allele frequencies on simulations
| Dataset type | Base dataset | k | n | m | Supervised RMSE | Supervised JSD | Unsupervised RMSE | Unsupervised JSD |
|---|---|---|---|---|---|---|---|---|
| PSD | HGDP | 6 | 10,000 | 10,000 | 2.9∗ | 1.5∗ | 5.6 | 3.6 |
| PSD | TGP | 6 | 10,000 | 10,000 | 2.0∗ | 0.9∗ | 3.2 | 1.9 |
| PSD | TGP | 6 | 10,000 | 1,000,000 | 0.2∗ | 0.1∗ | 0.3 | 0.2 |
| PSD | TGP | 6 | 100,000 | 1,000,000 | 0.2∗ | 0.1∗ | 0.4 | 0.2 |
| PSD | TGP | 6 | 1,000,000 | 1,000,000 | 0.2∗ | 0.1∗ | 0.5 | 0.2 |
| Spatial | HGDP | 6 | 10,000 | 10,000 | 2.4∗ | 0.6∗ | 6.5 | 2.6 |
| Spatial | TGP | 6 | 10,000 | 10,000 | 1.7∗ | 0.3∗ | 7.3 | 3.3 |
| Spatial | TGP | 10 | 10,000 | 100,000 | 0.6∗ | 0.3∗ | 6.7 | 5.6 |
| Spatial | TGP | 10 | 10,000 | 1,000,000 | 0.3∗ | 0.1∗ | 8.2 | 7.2 |
True allele frequencies were supplied to SCOPE for supervised population structure inference. Root-mean-square error (RMSE) and Jensen-Shannon divergence (JSD) were computed against the true admixture proportions. Estimated proportions of 0 were set to a small positive value for JSD calculations (see subjects and methods). Values are shown as percentages, rounded to one decimal place. An asterisk marks the better of the supervised and unsupervised values for each dataset and metric.
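
As an illustration of the two metrics reported in the table, the following is a minimal sketch of how RMSE and JSD between true and estimated admixture proportions might be computed. It is not SCOPE's own evaluation code. The function names `rmse` and `mean_jsd`, the averaging of JSD across individuals, the base-2 logarithm, and the floor `eps` are all assumptions made for this example. The value of `eps` in particular is a hypothetical placeholder, not the value used in the study.

```python
import numpy as np


def rmse(q_true, q_est):
    """Root-mean-square error over all entries of two (n, k) proportion matrices."""
    q_true = np.asarray(q_true, dtype=float)
    q_est = np.asarray(q_est, dtype=float)
    return float(np.sqrt(np.mean((q_true - q_est) ** 2)))


def mean_jsd(q_true, q_est, eps=1e-9):
    """Jensen-Shannon divergence per individual (row), averaged over individuals.

    Base-2 logarithms are used, so each per-individual value lies in [0, 1].
    `eps` is a hypothetical floor: zero entries in either matrix are raised
    to `eps` here so that every logarithm is defined.
    """
    p = np.clip(np.asarray(q_true, dtype=float), eps, None)
    q = np.clip(np.asarray(q_est, dtype=float), eps, None)
    m = 0.5 * (p + q)  # mixture distribution for each individual
    kl_pm = np.sum(p * np.log2(p / m), axis=1)
    kl_qm = np.sum(q * np.log2(q / m), axis=1)
    return float(np.mean(0.5 * (kl_pm + kl_qm)))


if __name__ == "__main__":
    # Toy data only: 5 individuals, k = 3 populations, rows sum to 1.
    rng = np.random.default_rng(0)
    q_true = rng.dirichlet(np.ones(3), size=5)
    q_est = rng.dirichlet(np.ones(3), size=5)
    # Multiply by 100 to report percentages, as in the table.
    print(f"RMSE: {100 * rmse(q_true, q_est):.1f}%")
    print(f"JSD:  {100 * mean_jsd(q_true, q_est):.1f}%")
```

In practice the columns of the estimated matrix would first need to be matched to the true populations, since unsupervised inference recovers populations in arbitrary order. That step is omitted here.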