2020 Jun 23;8:27. doi: 10.1186/s40462-020-00214-w

Table 2.

Dataset parameters and accuracy metrics

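| Dataset | Annotations | Rate (Hz) | Resolution (px) | Coverage (%) | Metric | Reconstruction (cm) | Reprojection (px) | Tracking (cm) |
|---|---|---|---|---|---|---|---|---|
| single | 171 | 30 | 2.7k | 97.79 | median | 0.30 | 9.65 | NA |
| | | | | | RMSE | 1.28 | 16.30 | NA |
| w/ sv | as above | as above | as above | 100.00 | as above | as above | as above | as above |
| mixed | 80 | 30 | 4k | 69.60 | median | 0.44 | 3.77 | NA |
| | | | | | RMSE | 1.09 | 7.77 | NA |
| school | 160 | 60 | 2.7k | 78.38 | median | 0.06 | 2.57 | NA |
| | | | | | RMSE | 0.30 | 3.78 | NA |
| w/ sv | as above | as above | as above | 94.02 | as above | as above | as above | as above |
| accuracy | 73 | 30 | 4k | 80.64 ± 16.73 | median | -0.14 ± 0.06 | 3.53 ± 1.96 | 0.14 ± 0.33 |
| | | | | | RMSE | 1.34 ± 0.79 | 8.56 ± 5.21 | 1.09 ± 0.47 |
| w/ sv | as above | as above | as above | 97.29 ± 2.20 | median | as above | as above | 0.28 ± 0.32 |
| | | | | | RMSE | as above | as above | 2.12 ± 1.37 |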

‘w/ sv’ indicates that trajectory points were also estimated from single-view projections at an interpolated depth component. Annotations is the number of frames annotated for training Mask R-CNN; Rate is the frame rate of each video set in frames per second, i.e. the temporal tracking resolution. Resolution is the video resolution (2.7k: 2704 × 1520 px; 4k: 3840 × 2160 px). Coverage is the mean coverage of all individual trajectories in a dataset. Reconstruction metrics refer to the deviation of reconstructed camera-to-camera distances from the actual distance, Reprojection metrics to the reprojection of triangulated 3D tracks onto the original video pixel coordinates, and Tracking metrics to the deviation of the tracked calibration wand length from its actual length. For the ‘accuracy’ dataset, results are listed as the mean ± standard deviation of the four repeated trials. NA: not applicable.
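
As an illustration of how the median and RMSE rows might be derived from a set of distance measurements, the following minimal Python sketch computes both from signed deviations (measured minus actual). The use of signed deviations is an assumption, suggested only by the negative median reported for the ‘accuracy’ dataset; the table does not give the exact formulas. The function name `error_metrics` and all input values are hypothetical and are not taken from the study.

```python
import math
import statistics


def error_metrics(measured, actual):
    """Return (median, RMSE) of the signed deviations measured - actual.

    Assumption: deviations are signed, so the median can be negative,
    while the RMSE is always non-negative.
    """
    deviations = [m - actual for m in measured]
    median = statistics.median(deviations)
    rmse = math.sqrt(sum(d * d for d in deviations) / len(deviations))
    return median, rmse


# Hypothetical reconstructed camera-to-camera distances (cm)
# against a made-up true distance of 60 cm.
measured = [59.0, 60.0, 61.0, 62.0]
median, rmse = error_metrics(measured, 60.0)
print(f"median = {median:.2f} cm, RMSE = {rmse:.2f} cm")
```

The median summarises the typical bias of the measurements, whereas the RMSE is dominated by the larger deviations, which is one reason the two can differ considerably within a single dataset.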