Segmentation model evaluation on the test set images
(A) Comparing the performance of DeepSea, Cellpose, StarDist, and 2D-UNET using the standard average precision at different intersection-over-union (IoU) matching thresholds.
(B) Measuring each model’s per-image latency to compare the efficiency of DeepSea with that of the other models.
(C and D) Comparing the models’ performance in segmenting easy (low cell density) and hard (high cell density) test images using average precision; error bars show one standard error of the mean.
(E) Comparing the models’ performance in segmenting the different cell types in the DeepSea dataset.
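
For readers unfamiliar with the metrics named in panels A, C, and D, the following is a minimal sketch of how average precision at an IoU threshold and the standard error of the mean are commonly computed. It assumes that "standard average precision" means the definition widely used in cell segmentation benchmarks, AP = TP / (TP + FP + FN), which the caption does not spell out. The function names (`iou_matrix`, `average_precision`, `mean_and_sem`) and the toy label images are hypothetical, and the simple counting used for matching is only valid for thresholds of 0.5 or higher with a strict comparison; it is not the evaluation code used to produce the figure.

```python
import numpy as np


def iou_matrix(gt, pred):
    """Pairwise IoU between objects of two integer label images (0 = background)."""
    gt_ids = np.unique(gt[gt > 0])
    pred_ids = np.unique(pred[pred > 0])
    iou = np.zeros((len(gt_ids), len(pred_ids)))
    for i, g in enumerate(gt_ids):
        g_mask = gt == g
        for j, p in enumerate(pred_ids):
            p_mask = pred == p
            inter = np.logical_and(g_mask, p_mask).sum()
            union = np.logical_or(g_mask, p_mask).sum()
            iou[i, j] = inter / union
    return iou


def average_precision(gt, pred, threshold=0.5):
    """AP = TP / (TP + FP + FN) at one IoU threshold (assumed definition).

    Assumption: threshold >= 0.5 with a strict comparison, so each object
    can exceed the threshold with at most one partner and simple counting
    yields a one-to-one matching.
    """
    iou = iou_matrix(gt, pred)
    n_gt, n_pred = iou.shape
    tp = int((iou > threshold).sum())  # matched object pairs
    fp = n_pred - tp                   # predictions without a match
    fn = n_gt - tp                     # ground-truth objects without a match
    denom = tp + fp + fn
    # Undefined when both images are empty; returned as NaN in this sketch.
    return tp / denom if denom else float("nan")


def mean_and_sem(values):
    """Mean and standard error of the mean (sample std / sqrt(n))."""
    values = np.asarray(values, dtype=float)
    return values.mean(), values.std(ddof=1) / np.sqrt(len(values))


# Toy example: two ground-truth cells; the second prediction misses one pixel.
gt = np.zeros((4, 4), dtype=int)
gt[0:2, 0:2] = 1
gt[2:4, 2:4] = 2
pred = gt.copy()
pred[3, 3] = 0  # IoU of cell 2 drops to 3/4

ap_per_threshold = [average_precision(gt, pred, t) for t in (0.5, 0.75, 0.9)]
```

In this toy case both cells count as matches at a threshold of 0.5, while only the perfectly segmented cell remains a match at 0.75 and 0.9. Applying `mean_and_sem` to the per-image AP values of a test subset would give the bar height and error bar for one model at one threshold.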