This is a preprint.

It has not yet been peer reviewed by a journal.

The National Library of Medicine is running a pilot to include preprints that result from research funded by NIH in PMC and PubMed.

[Preprint]. 2026 Apr 1:2026.03.29.715170. [Version 1] doi: 10.64898/2026.03.29.715170

Optimizing Intermediate Representations: A Framework for Low-Cost, High-Accuracy Behavior Quantification

Jessica D Choi, Brian Geuther, Vivek Kumar
PMCID: PMC13060303  PMID: 41959072

Abstract

Quantitative measurement of animal behavior is a cornerstone of neuroscience, genetics, and ethology. While modern computer vision has democratized automated analysis, the field has coalesced around pose estimation as the standard intermediate representation. This reliance imposes a significant bottleneck: researchers must often train custom pose models using large, labor-intensive datasets. Furthermore, the assumption that denser anatomical tracking yields better classification remains largely unverified. Here, we benchmark intermediate representations for supervised mouse behavior classification to determine the optimal trade-off between annotation cost and model performance. We systematically evaluate the sensitivity of classification to keypoint density, the impact of temporal feature engineering, and the viability of segmentation-derived shape descriptors as a low-cost alternative. We find that classifier performance is remarkably robust to keypoint variation; increasing keypoint density yields negligible gains, particularly when behavior training sets are sufficiently large. In contrast, augmenting models with temporal features (specifically FFT-based signal processing) consistently drives performance improvements. Crucially, we demonstrate that whole-body segmentation achieves performance parity with explicit pose estimation across most behaviors. These findings challenge the "more is better" intuition in pose tracking and suggest a paradigm shift: efficient pipelines should prioritize behavioral dataset volume and temporal dynamics over complex anatomical keypoints.
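To make the idea of FFT-based temporal features concrete, here is a minimal sketch under stated assumptions. It is not the authors' implementation: the abstract does not specify window length, frequency bands, or input signals, so the function names (`power_spectrum`, `band_power_features`), the band edges, and the synthetic per-frame signal are all hypothetical. The sketch slides a fixed window over one per-frame scalar feature, takes a naive discrete Fourier transform of each mean-centered window, and sums spectral power within each frequency band.

```python
import cmath
import math


def power_spectrum(samples):
    """Return the one-sided DFT power spectrum of a mean-centered window."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    spectrum = []
    for k in range(n // 2 + 1):
        # Naive O(n^2) DFT; adequate for short windows in an illustration.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def band_power_features(signal, fps, window, bands):
    """Sum spectral power per frequency band for each sliding window.

    signal: per-frame scalar feature (hypothetical, e.g. centroid speed)
    fps:    video frame rate in frames per second
    window: window length in frames
    bands:  list of (low_hz, high_hz) pairs; low inclusive, high exclusive
    """
    features = []
    for start in range(len(signal) - window + 1):
        spectrum = power_spectrum(signal[start:start + window])
        row = []
        for low, high in bands:
            # Bin k of a length-`window` DFT corresponds to k * fps / window Hz.
            row.append(sum(
                p for k, p in enumerate(spectrum)
                if low <= k * fps / window < high
            ))
        features.append(row)
    return features


# Synthetic example (assumed values): a 5 Hz oscillation at 30 frames per second.
fps = 30
signal = [math.sin(2 * math.pi * 5 * t / fps) for t in range(60)]
bands = [(0, 3), (3, 8), (8, 16)]  # hypothetical band edges in Hz
features = band_power_features(signal, fps, window=30, bands=bands)
```

Each row of `features` could be appended to the per-frame inputs of a behavior classifier. In this synthetic case nearly all power falls in the middle band, since the oscillation sits at 5 Hz. A real pipeline would more likely use an optimized FFT routine and apply the transform to several signals at once.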


Articles from bioRxiv are provided here courtesy of Cold Spring Harbor Laboratory Preprints
