2024 Mar 18;14:6482. doi: 10.1038/s41598-024-56081-7

Table 2.

Five-fold testing: quantitative performance evaluation of the cell segmentation module (DL/IDL) compared to state-of-the-art methods.

Column groups: detection metrics (mIoU, F1, Accuracy, Precision, Recall); semantic metric (Dice); speed metrics (Train epochs, Inference).

| Method | mIoU (%) | F1 (%) | Accuracy (%) | Precision (%) | Recall (%) | Dice (%) | Train epochs | Inference (s) |
|---|---|---|---|---|---|---|---|---|
| Cellpose [17] | 63.85±1.39 | 83.35±1.41 | 72.22±1.99 | 91.30±1.75 | 77.09±1.24 | 84.51±1.16 | 500 | 1.032±0.0238 |
| Stardist [16] | 66.21±0.96 | 85.82±0.93 | 76.04±1.31 | **94.80±0.15** | 78.95±1.47 | 87.86±0.52 | 400 | 0.222±0.0173 |
| Ours: Att-UNet+LSTM | 71.17±2.77 | 86.12±2.52 | 76.73±3.54 | 92.20±3.20 | 81.33±2.72 | 91.87±2.6 | 40 | 0.369±0.0062 |
| Ours: Att-UNet (XAI) | **73.55±1.41** | <u>86.53±1.64</u> | <u>77.00±2.47</u> | 89.00±2.63 | **84.53±1.55** | <u>93.77±0.34</u> | 20 | <u>0.157±0.0021</u> |
| Ours: UNet+LSTM | 72.23±1.95 | **86.90±1.80** | **77.72±2.53** | <u>93.47±1.61</u> | 81.66±2.20 | **94.04±0.28** | 40 | 0.369±0.0062 |
| Ours: UNet (XAI) | <u>72.29±2.6</u> | 85.44±2.12 | 75.72±2.92 | 87.95±2.87 | <u>83.56±2.46</u> | 93.18±1.18 | 20 | **0.136±0.0069** |
| Ours: BiONet+LSTM | 63.94±6.48 | 80.25±5.41 | 68.44±7.01 | 90.83±2.88 | 72.73±7.44 | 92.2±2.78 | 40 | 0.306±0.0062 |
| Ours: BiONet (XAI) | 65.94±5.31 | 81.43±4.13 | 70.09±5.33 | 88.67±3.67 | 75.99±5.95 | 91.24±2.66 | 20 | 0.203±0.0042 |

The reported results are the mean ± standard deviation, computed over 5-fold testing. The best metrics (per column) are highlighted in bold, and the second-best metrics are underlined. Instance-level segmentation (detection) metrics were computed per cell mask. The mIoU (mean Intersection over Union) is the sum of the IoU values (cell mask-wise) of the predicted cell masks divided by the ground-truth cell count. To count true positives (TP), false positives (FP), and false negatives (FN), we used an IoU threshold of 50% between ground-truth and predicted masks. The F1 score is defined as $\mathrm{F1} = \frac{2\,TP}{2\,TP + FP + FN}$, while $\mathrm{Accuracy} = \frac{TP}{TP + FP + FN}$, $\mathrm{Precision} = \frac{TP}{TP + FP}$, and $\mathrm{Recall} = \frac{TP}{TP + FN}$. We used a semantic segmentation metric (the Dice coefficient) to quantify the pixel-wise foreground/background separation, defined as $\mathrm{Dice} = \frac{2\,|gt \cap pred|}{|gt| + |pred|}$, where $gt$ is the ground-truth mask and $pred$ is the predicted mask (background = 0, foreground = 1). The training epochs refer to the number of epochs needed to complete the training phase. Inference time per image (on the test set) was measured on the following hardware: an 8-core i7 9700K CPU, 16 GB RAM, and an NVIDIA MSI 2080 GPU.
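The metric definitions above can be illustrated with a minimal NumPy sketch. It is not the authors' evaluation code: the function names, the greedy one-to-one matching, the use of `>=` at the 50% threshold, and the choice to sum only the IoU of matched masks for mIoU are all assumptions made for illustration. Masks are taken to be integer label images in which 0 is background and each cell has its own positive id.

```python
import numpy as np


def instance_ids(labels):
    """Return the non-zero instance ids of an integer label image."""
    ids = np.unique(labels)
    return ids[ids != 0]


def iou_matrix(gt_labels, pred_labels):
    """Pairwise IoU between ground-truth and predicted instances."""
    gt_ids, pred_ids = instance_ids(gt_labels), instance_ids(pred_labels)
    iou = np.zeros((len(gt_ids), len(pred_ids)))
    for i, g in enumerate(gt_ids):
        g_mask = gt_labels == g
        for j, p in enumerate(pred_ids):
            p_mask = pred_labels == p
            inter = np.logical_and(g_mask, p_mask).sum()
            union = np.logical_or(g_mask, p_mask).sum()
            iou[i, j] = inter / union  # union > 0: both masks are non-empty
    return iou


def safe_div(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den > 0 else 0.0


def detection_metrics(gt_labels, pred_labels, thr=0.5):
    """Instance-level metrics at an IoU threshold (assumed: IoU >= thr)."""
    iou = iou_matrix(gt_labels, pred_labels)
    n_gt, n_pred = iou.shape
    # Assumption: greedy one-to-one matching, highest IoU first.
    candidates = sorted(
        ((iou[i, j], i, j)
         for i in range(n_gt) for j in range(n_pred) if iou[i, j] >= thr),
        reverse=True,
    )
    used_gt, used_pred, matched_iou = set(), set(), []
    for value, i, j in candidates:
        if i not in used_gt and j not in used_pred:
            used_gt.add(i)
            used_pred.add(j)
            matched_iou.append(value)
    tp = len(matched_iou)
    fp, fn = n_pred - tp, n_gt - tp
    return {
        "TP": tp, "FP": fp, "FN": fn,
        # Assumption: only matched masks contribute IoU to the sum.
        "mIoU": safe_div(sum(matched_iou), n_gt),
        "F1": safe_div(2 * tp, 2 * tp + fp + fn),
        "Accuracy": safe_div(tp, tp + fp + fn),
        "Precision": safe_div(tp, tp + fp),
        "Recall": safe_div(tp, tp + fn),
    }


def dice_coefficient(gt_labels, pred_labels):
    """Pixel-wise Dice on binarized masks (background=0, foreground=1)."""
    gt_fg, pred_fg = gt_labels > 0, pred_labels > 0
    inter = np.logical_and(gt_fg, pred_fg).sum()
    return safe_div(2.0 * inter, gt_fg.sum() + pred_fg.sum())


# Toy example: two ground-truth cells, two predicted cells.
gt_demo = np.array([[1, 1, 0, 0],
                    [1, 1, 0, 0],
                    [0, 0, 2, 2],
                    [0, 0, 2, 2]])
pred_demo = np.array([[1, 1, 0, 0],
                      [1, 0, 0, 0],
                      [0, 0, 0, 0],
                      [0, 0, 0, 3]])
print(detection_metrics(gt_demo, pred_demo))
print(dice_coefficient(gt_demo, pred_demo))
```

In the toy example, the first predicted cell overlaps its ground-truth cell with IoU 0.75 and counts as a TP, while the second reaches only 0.25, producing one FP and one FN. This shows how a prediction can contribute foreground pixels to Dice without counting as a detected cell, which is one reason the Dice column in the table can sit well above the detection metrics.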