Figure 2:
Inference from multi-rater datasets. The purpose of this step was to infer nucleus locations and classifications from multi-rater data. A. The first step was agglomerative hierarchical clustering of bounding boxes, using intersection-over-union (IOU) as the similarity measure. During clustering, we imposed a constraint that prevented merging annotations when a single participant had annotated overlapping nuclei. Participant intention was preserved by demoting annotations from the same participant to the next node (Step 5, arrow). After clustering was complete, an IOU threshold was applied to obtain the final clusters (Step 5, black nodes). Within each cluster, the medoid bounding box was chosen as the anchor proposal. The result was a set of anchors, each with its corresponding clustered annotations. When a participant had no annotation matching an anchor, this was considered a conscious decision not to annotate a nucleus at that location. B. Once anchors were obtained, an expectation-maximization procedure was used to estimate (i) which anchors represent actual nuclei and (ii) which classes to assign to these anchors. The procedure estimates and accounts for the reliability of each participant for each classification. Expectation-maximization was performed separately for NPs and pathologists. C. Grouping of nucleus classes. Consistent with standard practice in object detection, nuclei were grouped, on the basis of clinical reasoning, into 5 classes and 3 super-classes.
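The clustering in panel A can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several assumptions: boxes are `(xmin, ymin, xmax, ymax)` tuples, linkage is average IOU between clusters, merging simply stops once no permitted pair reaches the threshold (a simplification of building the full tree and then cutting it), and the same-participant constraint is enforced by forbidding the merge rather than by demoting annotations to the next node. The function names and the threshold value of 0.25 are hypothetical.

```python
from itertools import combinations


def iou(a, b):
    """Intersection-over-union of two boxes given as (xmin, ymin, xmax, ymax)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def cluster_boxes(boxes, raters, iou_threshold=0.25):
    """Greedy average-linkage clustering with a same-participant constraint.

    boxes[i] was drawn by raters[i]. Two clusters are never merged if they
    share a participant, so one participant's overlapping nuclei stay apart.
    Returns a list of clusters, each a list of indices into `boxes`.
    """
    clusters = [[i] for i in range(len(boxes))]
    while True:
        best_sim, best_pair = iou_threshold, None
        for p, q in combinations(range(len(clusters)), 2):
            raters_p = {raters[i] for i in clusters[p]}
            raters_q = {raters[j] for j in clusters[q]}
            if raters_p & raters_q:
                continue  # assumed form of the constraint: forbid the merge
            sims = [iou(boxes[i], boxes[j])
                    for i in clusters[p] for j in clusters[q]]
            sim = sum(sims) / len(sims)
            if sim >= best_sim:
                best_sim, best_pair = sim, (p, q)
        if best_pair is None:
            return clusters  # no permitted pair reaches the threshold
        p, q = best_pair  # p < q, so deleting q leaves index p valid
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]


def medoid(cluster, boxes):
    """Index of the box with the highest total IOU to its cluster members."""
    return max(cluster,
               key=lambda i: sum(iou(boxes[i], boxes[j]) for j in cluster))


# Hypothetical toy data: participant "a" drew two overlapping boxes (0 and 3).
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (0, 0, 10, 11),
         (2, 2, 12, 12), (50, 50, 60, 60)]
raters = ["a", "b", "c", "a", "b"]
clusters = cluster_boxes(boxes, raters)
anchors = [medoid(c, boxes) for c in clusters]
```

In this toy example, boxes 0 and 3 overlap but come from the same participant, so they end up in different clusters. A participant with no box in a given cluster would then be treated as having chosen not to annotate that anchor.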