Figure 2.

Analysis of training corpora and domains. (a) Number of articles using each type of training data. Percentages are calculated out of 82 articles; because a single paper may use multiple corpora, the percentages do not sum to 100%.
(b) Subcategorization of textual training data. Abbreviation: EHR, electronic health record.