PLoS One. 2020 Oct 29;15(10):e0240429. doi: 10.1371/journal.pone.0240429

Table 5. Genotype inference task.

We evaluated a single method (PCA with clustering) on the genotype inference task (which inversion genotype does a sample have?) using two benchmark test cases (positive from a single population and positive from multiple populations). Note that the two association-testing methods are not able to infer genotypes. For each chromosome arm used, we indicate the known inversions, how many genotypes are present in the data set, and the balanced accuracy calculated from the cluster predictions. The D. melanogaster 3R chromosome arm has three mutually exclusive inversions, which we list separately.

| Test Case | Chrom. | Inversion | Present Genotypes | Clusters | Balanced Accuracy |
|---|---|---|---|---|---|
| Single | D. mel. 2L | In(2L)t | 3 | 3 | 93.3% |
| Single | D. mel. 2R | In(2R)NS | 3 | 3 | 94.4% |
| Single | D. mel. 3R | In(3R)Mo | 3 | | 60.7% |
| Single | D. mel. 3R | In(3R)p | 3 | | 43.3% |
| Single | D. mel. 3R | In(3R)K | 3 | | 55.0% |
| Multiple | 150 An. gam. and col. 2L | 2La | 2 | 3 | 66.7% |
| Multiple | 81 An. gam. 2L | 2La | 2 | 2 | 100.0% |
| Multiple | 34 An. gam. and col. 2L | 2La | 3 | 4 | 100.0% |

We evaluated clustering by how accurately it inferred inversion genotypes. Inversion genotypes were retrieved from the original papers describing the data [17, 37–39]. Association of the known genotypes with the cluster labels was measured using balanced accuracy.

*Could not resolve multiple, mutually exclusive inversions.
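
As an illustration of how balanced accuracy can be computed from cluster labels, the following is a minimal sketch and not necessarily the procedure used to produce the table. It assumes each cluster is assigned the most common known genotype among its samples (a majority-vote mapping, which is our assumption), and then averages the per-genotype recall. The function names and the genotype labels (`std/std`, `std/inv`, `inv/inv`) are hypothetical.

```python
from collections import Counter, defaultdict


def map_clusters_to_genotypes(genotypes, clusters):
    """Assign each cluster the most common known genotype among its samples.

    Majority-vote mapping is an assumption made for this sketch.
    """
    members = defaultdict(Counter)
    for genotype, cluster in zip(genotypes, clusters):
        members[cluster][genotype] += 1
    return {c: counts.most_common(1)[0][0] for c, counts in members.items()}


def balanced_accuracy(genotypes, clusters):
    """Mean per-genotype recall after mapping clusters to genotypes."""
    mapping = map_clusters_to_genotypes(genotypes, clusters)
    predicted = [mapping[c] for c in clusters]
    total = Counter(genotypes)
    correct = Counter(g for g, p in zip(genotypes, predicted) if g == p)
    # Recall for each known genotype, then the unweighted mean across genotypes.
    recalls = [correct[g] / total[g] for g in total]
    return sum(recalls) / len(recalls)


# Toy data: 10 samples, three known genotypes, three clusters.
known = ["std/std"] * 4 + ["std/inv"] * 4 + ["inv/inv"] * 2
labels = [0, 0, 0, 1, 1, 1, 1, 1, 2, 2]
print(f"{balanced_accuracy(known, labels):.1%}")
```

Because every genotype contributes equally to the mean, a rare genotype that is clustered poorly lowers the score as much as a common one would, which is why balanced accuracy suits data sets where inversion genotypes occur at unequal frequencies.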