Table 2.
Inclusion criteria on challenge level
# | Criterion | Number of affected tasks/challenges |
---|---|---|
1 | If a challenge task has an on-site and an off-site part, the results of the part with the most participating algorithms are used. | 1/1 |
2 | If multiple reference annotations are provided for a challenge task and no merged annotation is available, the results derived from the second annotator are used. In one challenge, the first annotator produced radically different annotations from all other observers; we therefore used the second annotator in all challenges. | 2/2 |
3 | If multiple reference annotations are provided for a challenge task and a merged annotation is available, the results derived from the merged annotation are used. | 1/1 |
4 | If an algorithm produced invalid values for a metric in all test cases of a challenge task, this algorithm is omitted from the ranking. | 1/1 |
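
The four criteria above can be read as simple selection rules. The following is a minimal sketch under stated assumptions, not the procedure actually used for the analysis: the function names, the label `"merged"`, the convention that annotations are listed in annotator order, and the treatment of `None`/NaN as "invalid" are all hypothetical choices made for illustration.

```python
import math


def select_part(parts):
    """Criterion 1: choose the part with the most participating algorithms.

    `parts` is a hypothetical mapping: part name -> number of algorithms.
    """
    return max(parts, key=parts.get)


def select_reference(annotations):
    """Criteria 2 and 3: choose which reference annotation to use.

    `annotations` is a hypothetical list of labels in annotator order,
    optionally containing the label "merged".
    """
    if len(annotations) == 1:
        return annotations[0]
    if "merged" in annotations:
        return "merged"        # criterion 3: prefer the merged annotation
    return annotations[1]      # criterion 2: fall back to the second annotator


def is_invalid(value):
    """Assumption: a metric value is invalid if it is None or NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def drop_invalid_algorithms(results):
    """Criterion 4: omit algorithms whose metric values are invalid in all test cases.

    `results` is a hypothetical mapping: algorithm name -> list of metric values.
    """
    return {
        name: values
        for name, values in results.items()
        if not all(is_invalid(v) for v in values)
    }
```

An algorithm with at least one valid metric value is kept under this sketch; only algorithms that are invalid in every test case are removed, which matches the wording of criterion 4.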