2019 Mar 7;20:116. doi: 10.1186/s12859-019-2680-1

Table 1.

F1 scores from 10 times repeated 5-fold cross-validation for the five 1000 Genomes Project superpopulations, using PCA, SVM or GTM

| Ancestry | 1000G code | PCA 8-NN | SVM 10 PCs | GTM 3 PCs | GTM 10 PCs |
| --- | --- | --- | --- | --- | --- |
| Africans | AFR | 1.00±0.00 | 1.00±0.00 | 1.00±0.00 | 1.00±0.00 |
| Admixed Americans | AMR | 0.93±0.00 | 1.00±0.00 | 1.00±0.00 | 1.00±0.00 |
| East Asians | EAS | 1.00±0.00 | 1.00±0.00 | 1.00±0.00 | 1.00±0.00 |
| Europeans | EUR | 0.99±0.00 | 1.00±0.00 | 1.00±0.00 | 1.00±0.00 |
| South Asians | SAS | 0.93±0.01 | 1.00±0.00 | 1.00±0.00 | 1.00±0.00 |
| Overall F1 score | | 0.98±0.00 | 1.00±0.00 | 1.00±0.00 | 1.00±0.00 |

SVM10 = support vector machine classification model using 10 principal components; PCA = k-nearest neighbours model based on a 2D PCA map (k = 7); GTM{3,10,100} = Bayesian classification model based on generative topographic mapping using 3, 10 or 100 principal components. Each value is an average with its 95% confidence interval.
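The evaluation scheme named in the caption (5-fold cross-validation repeated 10 times, with a per-class F1 score reported as an average with a 95% confidence interval) can be sketched roughly as follows. This is a minimal standard-library illustration, not the authors' pipeline: the synthetic 2D points from the hypothetical `make_toy_map` stand in for a 2D PCA map, the cluster centres are arbitrary, the classifier is a plain k-nearest-neighbours vote with k = 7, and the interval uses a normal approximation (1.96 standard errors), which is an assumption because the table does not state how the intervals were computed.

```python
import math
import random
import statistics
from collections import Counter, defaultdict


def make_toy_map(n_per_class=40, seed=0):
    """Synthetic 2D points standing in for a 2D PCA map (not 1000 Genomes data)."""
    rng = random.Random(seed)
    # Arbitrary cluster centres, chosen only for illustration.
    centres = {"AFR": (0, 0), "AMR": (4, 1), "EAS": (8, 0), "EUR": (4, 5), "SAS": (6, 3)}
    points, labels = [], []
    for label, (cx, cy) in centres.items():
        for _ in range(n_per_class):
            points.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
            labels.append(label)
    return points, labels


def stratified_folds(labels, n_splits, rng):
    """Deal the shuffled indices of each class round-robin into n_splits folds."""
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for j, i in enumerate(indices):
            folds[j % n_splits].append(i)
    return folds


def knn_predict(train_pts, train_lbls, query, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(range(len(train_pts)), key=lambda i: math.dist(train_pts[i], query))[:k]
    return Counter(train_lbls[i] for i in nearest).most_common(1)[0][0]


def f1_per_class(y_true, y_pred, classes):
    """F1 = 2TP / (2TP + FP + FN) for each class; 0.0 if the denominator is zero."""
    scores = {}
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def repeated_cv_f1(points, labels, k=7, n_splits=5, n_repeats=10, seed=0):
    """Return {class: (mean F1, 95% CI half-width)} over n_repeats * n_splits folds."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    per_fold = defaultdict(list)
    for _ in range(n_repeats):
        for test_idx in stratified_folds(labels, n_splits, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(points)) if i not in held_out]
            train_pts = [points[i] for i in train_idx]
            train_lbls = [labels[i] for i in train_idx]
            y_true = [labels[i] for i in test_idx]
            y_pred = [knn_predict(train_pts, train_lbls, points[i], k) for i in test_idx]
            for c, score in f1_per_class(y_true, y_pred, classes).items():
                per_fold[c].append(score)
    summary = {}
    for c, scores in per_fold.items():
        # Normal-approximation interval: an assumption, not the paper's stated method.
        half_width = 1.96 * statistics.stdev(scores) / math.sqrt(len(scores))
        summary[c] = (statistics.mean(scores), half_width)
    return summary


if __name__ == "__main__":
    pts, lbls = make_toy_map()
    for code, (mean, ci) in repeated_cv_f1(pts, lbls).items():
        print(f"{code}: {mean:.2f}±{ci:.2f}")
```

Stratifying the folds keeps every superpopulation represented in each test split, so a per-class F1 score is defined in every fold. The numbers printed for the toy data are not comparable to the values in the table.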