eNeuro. 2024 Feb 9;11(2):ENEURO.0352-23.2024. doi: 10.1523/ENEURO.0352-23.2024

Figure 3.

SAND outperformed other pipelines on low numbers of ground-truth labels using the Neurofinder datasets. A, Example segmentations from Neurofinder video 4.00. Masks generated by SAND were more accurate than those of other methods, even when SAND was trained on only 10 frames. Yellow boxes indicate the regions enlarged in panel B. Scale bar, 50 µm. See Extended Data Figure 2-1B and Table 2-3 for more details. B, Example neurons magnified from the boxed regions in panel A. When trained on only 10 frames, SAND correctly identified more masks than CaImAn and Suite2p. Scale bar, 25 µm. C, SAND generally achieved higher accuracy than other methods when trained on a low number of ground-truth labels. Dots represent the average F1 score of each model on the test video(s). Lines represent mean F1 scores averaged over bins grouped by the number of training labels; bins spanned 0–50 labels, 50–100 labels, and so on. Shaded regions represent the standard error. Horizontal lines show the average F1 scores of Suite2p and CaImAn. Because more than half of the Neurofinder videos did not contain >250 neurons, trials with >250 training labels were excluded from the comparisons and binned results. The red line (SAND) represents ensemble learning with FLHO hyperparameter optimization. The blue line represents single-model supervised learning with FLHO hyperparameter optimization. The orange line (SUNS) represents single-model supervised learning with grid-search hyperparameter optimization. See Extended Data Figure 3-1 and Table 3-1 for more details.
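
The binned-average procedure described for panel C can be sketched in a few lines of Python. The snippet below is an illustration, not the authors' code: it assumes per-trial F1 scores paired with their training-label counts, groups trials into 50-label-wide bins up to the 250-label cutoff, and returns each bin's mean F1 and standard error of the mean (sample standard deviation divided by the square root of the bin size). All function and variable names and the example values are hypothetical.

    # Sketch of the binned mean +/- SEM computation in panel C
    # (hypothetical code; illustrates the described procedure only).
    import numpy as np

    def bin_f1_scores(n_labels, f1_scores, bin_width=50, max_labels=250):
        """Group per-trial F1 scores into bins of `bin_width` training
        labels, dropping trials with more than `max_labels` labels.
        Returns (bin_centers, means, sems)."""
        n_labels = np.asarray(n_labels, dtype=float)
        f1_scores = np.asarray(f1_scores, dtype=float)

        # Exclude trials above the label cutoff, as in the figure.
        keep = n_labels <= max_labels
        n_labels, f1_scores = n_labels[keep], f1_scores[keep]

        edges = np.arange(0, max_labels + bin_width, bin_width)  # 0, 50, ..., 250
        centers, means, sems = [], [], []
        for lo, hi in zip(edges[:-1], edges[1:]):
            in_bin = f1_scores[(n_labels > lo) & (n_labels <= hi)]
            if in_bin.size == 0:
                continue  # skip empty bins
            centers.append((lo + hi) / 2)
            means.append(in_bin.mean())
            # SEM = sample std / sqrt(n); undefined for a single trial,
            # reported here as 0.
            if in_bin.size > 1:
                sems.append(in_bin.std(ddof=1) / np.sqrt(in_bin.size))
            else:
                sems.append(0.0)
        return np.array(centers), np.array(means), np.array(sems)

    # Hypothetical example: six trials of one model.
    centers, means, sems = bin_f1_scores(
        n_labels=[10, 40, 80, 120, 180, 240],
        f1_scores=[0.62, 0.68, 0.71, 0.74, 0.75, 0.77],
    )
    print(centers, means, sems)

The means and SEMs returned per bin correspond to the lines and shaded regions in panel C; plotting them against the bin centers reproduces the style of the figure.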