Extended Data Fig. 4. Investigating the performance of the network model for prediction from keypoints.
a–e, We varied individual components of the deep network model that predicts neural activity from keypoints and computed the normalized variance explained across neurons; the star marks the architecture we chose. Pink represents the average across visual recordings (n = 10 recordings, 7 mice), and purple represents the average across sensorimotor recordings (n = 6 recordings, 5 mice). a, Varying the number of units in the deep behavioral features layer, which is the last fully connected layer before the output layer. Star denotes 256 units. b, Varying the number of core layers, which are the layers up to and including the deep behavioral features layer. Star denotes 2 layers. c, Varying the number of readout layers, which are the layers after the deep behavioral features layer. Star denotes 1 layer. d, Performance when removing the first linear layer in the network, the ReLU nonlinearity in the convolution layer, or the ReLU nonlinearity in the deep behavioral features layer. e, Varying the number of one-dimensional convolution filters. Star denotes 10 filters. f, Prediction by the network from all keypoints, or from all keypoints except those of one face region: eye, whisker or nose. Error bars represent s.e.m.: in visual areas, n = 10 recordings in 7 mice; and in sensorimotor areas, n = 6 recordings in 5 mice.
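The components varied in a–e can be pictured with a minimal NumPy sketch. This is one plausible reading of the caption, not the authors' implementation. The starred values (256 units, 2 core layers, 1 readout layer, 10 filters) come from the caption. Everything else is an assumption made for illustration: the function names `init_network` and `forward`, the width of the first linear layer (`n_latents`), the filter length (`kernel_size`), the random initialization, and the exact ordering of the layers.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def init_network(rng, n_inputs, n_neurons, n_units=256, n_core=2, n_readout=1,
                 n_filters=10, first_linear=True, n_latents=16, kernel_size=5):
    """Random weights for the sketch. n_latents and kernel_size are hypothetical."""
    def dense(n_in, n_out):
        return rng.standard_normal((n_in, n_out)) / np.sqrt(n_in)

    p = {}
    n_chan = n_inputs
    if first_linear:                      # panel d: this layer can be removed
        p["linear"] = dense(n_inputs, n_latents)
        n_chan = n_latents
    # Panel e: number of one-dimensional (temporal) convolution filters.
    p["filters"] = rng.standard_normal((n_filters, kernel_size)) / np.sqrt(kernel_size)
    # Panels a, b: core layers end in the deep behavioral features layer.
    widths = [n_chan * n_filters] + [n_units] * n_core
    p["core"] = [dense(a, b) for a, b in zip(widths[:-1], widths[1:])]
    # Panel c: readout layers map deep behavioral features to neurons.
    r_widths = [n_units] * n_readout + [n_neurons]
    p["readout"] = [dense(a, b) for a, b in zip(r_widths[:-1], r_widths[1:])]
    return p

def forward(p, keypoints, conv_relu=True, deep_relu=True):
    """keypoints: (timepoints, n_inputs). Returns (prediction, deep features)."""
    h = keypoints @ p["linear"] if "linear" in p else keypoints
    # Apply every temporal filter to every channel, keeping the same length.
    conv = np.array([[np.convolve(h[:, c], f, mode="same") for f in p["filters"]]
                     for c in range(h.shape[1])])          # (channels, filters, T)
    h = conv.reshape(-1, conv.shape[-1]).T                  # (T, channels * filters)
    if conv_relu:                                           # panel d ablation
        h = relu(h)
    for i, W in enumerate(p["core"]):
        h = h @ W
        is_deep = i == len(p["core"]) - 1
        if not is_deep or deep_relu:                        # panel d ablation
            h = relu(h)
    deep = h
    for i, W in enumerate(p["readout"]):
        h = h @ W
        if i < len(p["readout"]) - 1:
            h = relu(h)
    return h, deep
```

With these defaults, `init_network(rng, n_inputs, n_neurons)` builds the starred configuration, and each panel corresponds to changing one argument, for example `n_units` for a, `n_core` for b, `n_readout` for c, `n_filters` for e, and `first_linear`, `conv_relu` or `deep_relu` for d. The exclusions in f would amount to dropping the input columns of one face region before calling `forward`.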