Table 5.
Summary of the evaluation and selected information to help choose a method.
| Method | Ranking on research data | Ranking on routine data | Robustness to artifacts | Robustness to different scanners | Sequences needed | Needs training data | Limitations/requirements | Processing time |
|---|---|---|---|---|---|---|---|---|
| LPA | 2 | 1* | – | – | FLAIR | No | Matlab | 1 min |
| LGA | 4 | 5 | – | FLAIR/ T1w | No | Matlab | 6 min | |
| BIANCA | 4 | 1 | – | FLAIR/ T1w | Yes | Need mask of WM | 17 min¹ | |
| SLS | 2 | 1 | – | FLAIR/ T1w | No | Matlab | 8 min | |
| W2MHS | 8 | 8 | – | FLAIR/ T1w | No | Matlab | 5 min | |
| nicMSlesion (original) | 4 | 7 | – | FLAIR/ T1w | No | GPU | 10 min²,³ | |
| nicMSlesion (retrained) | 1* | 5 | – | – | FLAIR/ T1w | Yes | GPU | 10 min²,³ (23.5 h²,⁴) |
| UBO | 4 | 4 | – | – | FLAIR/ T1w | No | Matlab | 9 min |
Ranking was performed using t-test comparisons on the primary criterion (DSC) (see Supplementary Tables 7 and 9 for details). We started with the method with the best DSC; all methods not significantly different from it were given the same rank. The same procedure was then applied to the remaining methods, and so on.
Processing times were evaluated on a MacBook Pro laptop with a 2.2 GHz Intel Core i7 (2018) CPU and 16 GB RAM, without a graphics processing unit (GPU), except for nicMSlesion, for which we used a GPU-equipped computer, namely a Linux workstation with an Intel Xeon E5-2699 CPU @ 2.30 GHz, an NVIDIA Quadro M4000 GPU, and 256 GB RAM.
– indicates that the DSC is sensitive to artifacts or scanner type at p < 0.05, uncorrected for multiple comparisons, on the routine dataset.
-- indicates that the DSC is sensitive to artifacts or scanner type after correction for multiple testing, on the routine dataset.
* Best DSC in our evaluation (though not necessarily significantly better, which explains the equal first ranks).
¹ 2 min for segmentation and 15 min for generation of the exclusion mask.
² With a graphics processing unit (GPU, NVIDIA Quadro M4000).
³ 3.5 min for segmentation and 6.5 min for preprocessing.
⁴ Retraining time.
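The ranking procedure described in the table notes can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `rank_methods`, the `differs` callback (standing in for the outcome of a t-test between two methods), and the toy DSC values are all hypothetical. The rank numbering, where the next group's rank is one more than the number of methods already ranked, is inferred from the rank values in the table (for instance 1, 2, 2, 4, 4, 4, 4, 8 on research data).

```python
from typing import Callable, Dict


def rank_methods(mean_dsc: Dict[str, float],
                 differs: Callable[[str, str], bool]) -> Dict[str, int]:
    """Rank methods by DSC, sharing a rank among methods that are not
    significantly different from the best remaining one.

    `differs(a, b)` is a hypothetical callback returning True when methods
    a and b differ significantly (e.g. according to a t-test).
    """
    # Sort methods from best to worst mean DSC.
    remaining = sorted(mean_dsc, key=mean_dsc.get, reverse=True)
    ranks: Dict[str, int] = {}
    while remaining:
        leader = remaining[0]
        # The leader plus every remaining method not significantly different from it.
        group = [leader] + [m for m in remaining[1:] if not differs(leader, m)]
        # Assumed numbering: rank = number of methods already ranked + 1.
        rank = len(ranks) + 1
        for method in group:
            ranks[method] = rank
        remaining = [m for m in remaining if m not in group]
    return ranks


# Toy data, made up for illustration only (not values from the study).
toy_dsc = {"A": 0.70, "B": 0.68, "C": 0.60, "D": 0.50}
not_different = {frozenset(("A", "B"))}  # pairs assumed NOT significantly different
toy_ranks = rank_methods(toy_dsc, lambda a, b: frozenset((a, b)) not in not_different)
print(toy_ranks)  # {'A': 1, 'B': 1, 'C': 3, 'D': 4}
```

In practice, `differs` could wrap a paired t-test on per-subject DSC values (for example with `scipy.stats.ttest_rel`) at a chosen significance threshold. That choice is an assumption here; the exact test setup is given in Supplementary Tables 7 and 9.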