Table 1.
Summary statistic | Description | Summarizes | Citation |
---|---|---|---|
iHS | Integrated haplotype score | Haplotype structure | (Voight et al. 2006) |
nSL | Number of segregating sites by length | Haplotype structure | (Ferrer-Admetlla et al. 2014) |
DIND | Derived intra-allelic nucleotide diversity | Diversity on derived background | (Barreiro et al. 2009) |
iSAFE<sup>a</sup> | Integrated selection of allele favored by evolution | Haplotype structure | (Akbari et al. 2018) |
HAF<sup>b</sup> | Haplotype allele frequency | Haplotype structure | (Ronen et al. 2015) |
H12 | Frequencies of first and second most common haplotypes, modified to use 80% identity threshold | Haplotype structure | (Garud et al. 2015) |
hapDAF-o | Haplotype-derived allele frequency (old) | SFS | This paper |
hapDAF-s | Haplotype-derived allele frequency (standing) | SFS | This paper |
Sratio | Segregating sites ratio | Diversity on derived background | This paper |
lowfreq | Low-frequency alleles on derived background | Diversity on derived background | This paper |
highfreq | High-frequency alleles on derived background | Diversity on derived background | This paper |
<sup>a</sup> When a window included fewer than 300 SNPs, the SAFE statistic was used in place of iSAFE, following the recommendation of Akbari et al. (2018) to use SAFE when there are few segregating sites. This substitution was applied when calculating all feature vectors, both from the training data and from data to be classified.
<sup>b</sup> Only the top 10% of HAF values are used, as this provides a better signal for incomplete sweeps.
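
The two rules in the table notes above (the SAFE fallback below 300 SNPs and the top-10% HAF filter) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `choose_safe_variant` and `top_haf_values` are hypothetical, and rounding the 10% cutoff up while keeping at least one value is an assumption, since the notes do not say how the cutoff is rounded.

```python
MIN_SNPS_FOR_ISAFE = 300  # threshold stated in the table note on iSAFE


def choose_safe_variant(n_snps):
    """Return the statistic to compute for a window with n_snps SNPs.

    Windows with fewer than 300 SNPs fall back to SAFE; all others use iSAFE.
    """
    return "iSAFE" if n_snps >= MIN_SNPS_FOR_ISAFE else "SAFE"


def top_haf_values(haf_scores, percent=10):
    """Keep the highest `percent` percent of HAF scores, largest first.

    Assumption (not stated in the source): the cutoff is rounded up and
    at least one value is kept when the input is non-empty.
    """
    if not haf_scores:
        return []
    # Integer ceiling of len * percent / 100, avoiding floating-point error.
    k = max(1, (len(haf_scores) * percent + 99) // 100)
    return sorted(haf_scores, reverse=True)[:k]
```

Applying the same `choose_safe_variant` rule to both training windows and windows to be classified keeps the feature vectors comparable, which is the point of the iSAFE note.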