Table 3.
Comparison of two standard single-branch models and a dual-task model on binary classification accuracy and absolute ruler score error. The models are evaluated on two standard test sets (noise and motion artifacts) and two OOD test sets (noise and motion). Better results are in bold.
| Metric | Dataset | Single-task | Dual-task |
|---|---|---|---|
| Accuracy | Motion-standard | 87.80% | **89.83%** |
| | Motion-OOD | 85.00% | **87.50%** |
| | Noise-standard | **91.06%** | 89.79% |
| | Noise-OOD | 86.43% | **89.29%** |
| Score error | Noise-standard | 1.066 | **1.034** |
| | Noise-OOD | 1.236 | **1.143** |