eLife. 2022 Jun 16;11:e76218. doi: 10.7554/eLife.76218

Figure 1. The framework of Selfee (Self-supervised Features Extraction) and its downstream applications.

(A) A live-frame is composed of three tandem frames placed in the R, G, and B channels, respectively, so that a single image captures the dynamics of animal behaviors. (B) Live-frames are used to train Selfee, which adopts a ResNet-50 backbone. (C, D, and E) Representations produced by Selfee could be used for anomaly detection, which identifies unusual animal postures in a query video compared with reference videos (C); AR-HMM (autoregressive hidden Markov model), which models the local temporal characteristics of behaviors, clusters frames into modules (states), and calculates state usages of different genotypes (D); DTW (dynamic time warping), which aligns behavior videos to reveal differences in long-term dynamics (E); and other potential tasks, including behavior classification, forecasting, or even image segmentation and pose estimation after appropriate modification and fine-tuning of the neural networks.
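The live-frame construction in (A) is straightforward to reproduce. Below is a minimal Python/NumPy sketch, assuming the video is stored as a grayscale (T, H, W) array; the channel order (R = t, G = t + 1, B = t + 2) and the use of immediately consecutive frames are assumptions, not settings confirmed by the paper.

```python
import numpy as np

def make_live_frame(frames: np.ndarray, t: int) -> np.ndarray:
    """Stack three consecutive grayscale frames into the R, G, and B
    channels of one 'live-frame', so that short-term motion is encoded
    as color (a hypothetical sketch of Figure 1A).

    frames: grayscale video, shape (T, H, W)
    t:      index of the first frame of the triplet
    """
    r, g, b = frames[t], frames[t + 1], frames[t + 2]
    # Channel order is an assumption; any fixed order works as long as
    # it is consistent between training and inference.
    return np.stack([r, g, b], axis=-1)  # shape (H, W, 3)
```

In this encoding, a stationary animal appears gray (all three channels identical), while movement produces colored fringes that the network can exploit.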


Figure 1—figure supplement 1. Bedding and backgrounds that affect training and inference of Selfee (Self-supervised Features Extraction).


(A) Textures on the damp filter paper could mislead Selfee into outputting features similar to copulation rather than to wing extension (the ground truth). The left example showed a background that did not affect the Selfee neural network, and the right example showed a background that strongly degraded classification accuracy. (B) Background inconsistency affected the training process when Selfee was applied to mouse behavior data. Therefore, backgrounds were removed from all frames to avoid such artifacts, and illumination normalization was applied after background removal.
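The background removal and illumination normalization described in (B) are not specified in detail here; the sketch below shows one common implementation. The median-frame background estimate, the sampling stride, and the mean-based rescaling are assumptions rather than the paper's exact procedure.

```python
import numpy as np

def preprocess(frames: np.ndarray, sample_stride: int = 50) -> np.ndarray:
    """Remove a static background, then normalize illumination.

    frames: uint8 grayscale video, shape (T, H, W)
    """
    # Estimate the static background as the per-pixel median of sampled frames.
    background = np.median(frames[::sample_stride].astype(np.float32), axis=0)
    cleaned = np.clip(frames.astype(np.float32) - background, 0, 255)
    # Illumination normalization: rescale each frame to the global mean intensity.
    target = cleaned.mean()
    per_frame_mean = cleaned.mean(axis=(1, 2), keepdims=True)
    normalized = cleaned * (target / np.maximum(per_frame_mean, 1e-6))
    return np.clip(normalized, 0, 255).astype(np.uint8)
```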
Figure 1—figure supplement 2. t-SNE visualization of pose-estimation-derived features.


(A) Visualization of fly courtship live-frames with t-SNE dimension reduction of the distances between key points, including the head, tail, thorax, and wings. (B) Visualization of fly courtship live-frames with t-SNE dimension reduction of human-engineered features, including the male-head to female-tail distance, the male-body to female-body distance, the male wing angle, the female wing angle, and the angle between the male and female body axes. In both panels, each dot was colored based on human annotations: points representing non-interactive behaviors (‘others’), chasing, wing extension, copulation attempt, and copulation were colored red, yellow, green, blue, and violet, respectively.
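The key-point distance features in (A) can be computed directly from tracking output and embedded with off-the-shelf t-SNE. Below is a minimal sketch using scikit-learn, assuming key points are stored as an (n_frames, n_points, 2) array; the array layout and the t-SNE parameters are assumptions.

```python
import numpy as np
from sklearn.manifold import TSNE

def tsne_of_keypoint_distances(keypoints: np.ndarray) -> np.ndarray:
    """Embed frames by pairwise distances between tracked key points
    (e.g., head, tail, thorax, and wings).

    keypoints: shape (n_frames, n_points, 2)
    returns:   2D embedding, shape (n_frames, 2)
    """
    n_frames, n_points, _ = keypoints.shape
    # Pairwise Euclidean distances between all key points in each frame.
    diffs = keypoints[:, :, None, :] - keypoints[:, None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)  # (n_frames, n_points, n_points)
    iu = np.triu_indices(n_points, k=1)
    features = dists[:, iu[0], iu[1]]       # upper triangle as feature vector
    return TSNE(n_components=2, perplexity=30).fit_transform(features)
```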
Figure 1—figure supplement 3. Animal tracking with DLC, FlyTracker, and SLEAP.


(A) Visualization of DLC tracking results on intensive interactions between mice during mating behavior. The nose, ears, body center, hips, and bottom were labeled. DLC tended to detect the two animals as one due to occlusion. (B) Visualization of FlyTracker tracking results provided in the Fly-vs-Fly dataset. The head, tail, center, and wings were colored red, blue, purple, and green, respectively. When the two flies were close, the wings became difficult to detect correctly. (C) Visualization of SLEAP tracking results on close interactions during fly courtship behavior. Five body parts (the head, tail, thorax, and two wings) were marked with points, and the head-to-tail and thorax-to-wing connections were drawn as lines. The female was labeled in red and the male in blue. Body parts were wrongly assigned when the two animals were close.
Figure 1—video 1. Visualization of DLC tracking results on intensive interactions between mice during mating behavior.
The nose, ears, body center, hips, and bottom were labeled. DLC worked well when the two animals were separated, but it tended to detect them as one during mounting or intromission due to occlusion.
Figure 1—video 2. A tracking example of FlyTracker on the Fly-vs-Fly dataset.
Visualization of FlyTracker tracking results provided in the Fly-vs-Fly dataset. The head, tail, center, and wings were colored red, blue, purple, and green, respectively. The tracking results were highly competitive, even compared with deep learning-based methods. However, when the two flies were close, the wings, and even the bodies, became difficult to detect correctly.
Figure 1—video 3. A tracking example of SLEAP on fly courtship behavior.
Visualization of SLEAP tracking results on fly courtship behavior. Five body parts (the head, tail, thorax, and two wings) were marked with points, and the head-to-tail and thorax-to-wing connections were drawn as lines. The female was labeled in red and the male in blue. In general, the tracking results were good; however, body parts were wrongly assigned when the two animals were close, and performance was significantly impaired during copulation attempts and copulation.