For CaImAn (an algorithm for analyzing calcium imaging data), we independently characterized differences in the spatial and temporal components recovered by the model. Differences in spatial components were measured as the average Jaccard distance over pairs of spatial components; a Jaccard distance of 0 corresponds to two spatial components that overlap perfectly. Differences in temporal components were calculated as the average root mean squared error (RMSE) over paired time series of component activity. For Ensemble DeepGraphPose (an algorithm for tracking animal body parts in behavioral video), we compared multiple sets of outputs from a single pretrained model. Here RMSE is in units of pixels, so differences of order 1e-8 are not relevant for behavioral quantification. For both analyses, we fixed a single dataset, configuration file, and blueprint across runs. See Figures S3 and S4 for details.
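The two metrics described above can be illustrated with a minimal sketch. It is not the analysis code used here, and it rests on several assumptions: spatial components have already been binarized into boolean masks, components from the two runs have already been matched and appear in the same order, and RMSE is computed per pair and then averaged. The function names (`jaccard_distance`, `mean_jaccard_distance`, `mean_rmse`) are hypothetical.

```python
import numpy as np


def jaccard_distance(mask_a, mask_b):
    """Jaccard distance, 1 - |A and B| / |A or B|, between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        # Assumed convention: two empty masks are treated as identical.
        return 0.0
    return float(1.0 - np.logical_and(a, b).sum() / union)


def mean_jaccard_distance(masks_run1, masks_run2):
    """Average Jaccard distance over already-paired spatial components."""
    distances = [jaccard_distance(a, b) for a, b in zip(masks_run1, masks_run2)]
    return float(np.mean(distances))


def mean_rmse(traces_run1, traces_run2):
    """Average RMSE over paired time series.

    Both inputs have shape (n_components, n_timepoints), with row i of one
    run assumed to be paired with row i of the other.
    """
    x = np.asarray(traces_run1, dtype=float)
    y = np.asarray(traces_run2, dtype=float)
    per_pair = np.sqrt(np.mean((x - y) ** 2, axis=1))  # one RMSE per pair
    return float(per_pair.mean())
```

With these definitions, two runs that recover identical components give a mean Jaccard distance of 0 and a mean RMSE of 0, while spatial components that do not overlap at all give a Jaccard distance of 1.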