eLife. 2021 May 14;10:e67855. doi: 10.7554/eLife.67855

Appendix 1—table 2. Comparison of feature sets on the downstream task of predicting mouse strain.

(C57 vs. DBA) given acoustic features of single syllables. Classification accuracy, in percent, averaged over five disjoint, class-balanced splits of the data is reported, with the empirical standard deviation in parentheses. Euclidean distance is used for the nearest-neighbor classifiers. Each MUPET and DeepSqueak acoustic feature is independently z-scored as a preprocessing step. The latent feature dimension is truncated once >99% of the feature variance is explained. Random forest (RF) classifiers use 100 trees and the Gini impurity criterion. The multilayer perceptron (MLP) classifiers are two-layer networks with a hidden layer of 100 units, ReLU activations, and an L2 weight regularization parameter α, trained with Adam optimization at a learning rate of 10^−3 for 200 epochs. D denotes the dimension of each feature set; Gaussian random projections are used to reduce the dimension of the spectrograms.

Predicting mouse strain (Figure 4d–e)
                  Spectrogram                         MUPET       DeepSqueak  Latent
                  D = 10      D = 30      D = 100     D = 9       D = 10      D = 7
k-NN (k = 3)      68.1 (0.2)  76.4 (0.3)  82.3 (0.5)  86.1 (0.2)  79.0 (0.3)  89.8 (0.2)
k-NN (k = 10)     71.0 (0.3)  78.2 (0.1)  82.7 (0.6)  87.0 (0.1)  80.7 (0.3)  90.7 (0.4)
k-NN (k = 30)     72.8 (0.3)  78.5 (0.2)  81.3 (0.5)  86.8 (0.2)  81.0 (0.2)  90.3 (0.4)
RF (depth = 10)   72.8 (0.2)  76.6 (0.2)  79.1 (0.3)  87.4 (0.5)  81.2 (0.4)  88.1 (0.5)
RF (depth = 15)   73.1 (0.3)  78.0 (0.3)  80.5 (0.2)  87.9 (0.4)  82.1 (0.3)  89.6 (0.4)
RF (depth = 20)   73.2 (0.2)  78.3 (0.2)  80.7 (0.3)  87.9 (0.4)  81.9 (0.3)  89.6 (0.4)
MLP (α = 0.1)     72.4 (0.3)  79.1 (0.4)  84.5 (0.3)  87.8 (0.2)  82.1 (0.4)  90.1 (0.3)
MLP (α = 0.01)    72.3 (0.4)  78.6 (0.3)  82.9 (0.4)  88.1 (0.3)  82.4 (0.4)  90.0 (0.4)
MLP (α = 0.001)   72.4 (0.4)  78.5 (0.8)  82.8 (0.1)  87.9 (0.2)  82.4 (0.3)  90.4 (0.3)
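The evaluation protocol described in the caption can be sketched with scikit-learn. This is a minimal illustration, not the authors' code: the synthetic features from make_classification are a hypothetical stand-in for the real syllable features, and z-scoring is applied uniformly here even though the paper applies it only to the MUPET and DeepSqueak features.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

# Hypothetical stand-in data: 2000 "syllables", 9 features, 2 "strains".
X, y = make_classification(n_samples=2000, n_features=9, n_informative=6,
                           n_classes=2, random_state=0)

# Classifier settings from the caption: Euclidean k-NN; 100-tree,
# Gini-criterion random forests; one-hidden-layer (100-unit, ReLU) MLPs
# trained with Adam at learning rate 1e-3 for up to 200 epochs.
classifiers = {
    "k-NN (k=10)": KNeighborsClassifier(n_neighbors=10, metric="euclidean"),
    "RF (depth=15)": RandomForestClassifier(n_estimators=100, criterion="gini",
                                            max_depth=15, random_state=0),
    "MLP (alpha=0.01)": MLPClassifier(hidden_layer_sizes=(100,),
                                      activation="relu", alpha=0.01,
                                      solver="adam", learning_rate_init=1e-3,
                                      max_iter=200, random_state=0),
}

# Five disjoint, class-balanced splits; features are z-scored per split
# (the scaler is fit on each training fold only).
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for name, clf in classifiers.items():
    accs = []
    for train, test in cv.split(X, y):
        pipe = make_pipeline(StandardScaler(), clf)
        pipe.fit(X[train], y[train])
        accs.append(pipe.score(X[test], y[test]))
    # Report mean accuracy in percent with empirical standard deviation,
    # matching the table's "mean (std)" format.
    print(f"{name}: {100 * np.mean(accs):.1f} ({100 * np.std(accs):.1f})")
```

For the spectrogram columns, a sklearn.random_projection.GaussianRandomProjection step with n_components set to 10, 30, or 100 could be prepended to the pipeline to mimic the dimension reduction described in the caption.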