Fig. 2. Change in human performance over time.
Because all raters were shown the same notes in the same order, the figure also allows a comparison of how much each group improved over time. In particular, the nonclinicians improved substantially after receiving feedback, with an absolute increase in accuracy of ~10% between their initial baseline performance and their final set of predictions after training. The performance of the expert annotators (i.e., psychiatrists), on the other hand, did not improve notably over time.