Author manuscript; available in PMC: 2014 Dec 1.
Published in final edited form as: J Biomed Inform. 2013 Aug 15;46(6):10.1016/j.jbi.2013.08.004. doi: 10.1016/j.jbi.2013.08.004

Table 2.

Domain representations for entity classes in the i2b2 and GENIA corpora (ST: semantic type; SG: semantic group; C: concept).

| Dataset | Class | Domain representation | # Seed terms |
|---------|-------|-----------------------|--------------|
| i2b2 | Problem | Disorders (SG) | 398,725 |
| i2b2 | Treatment | Therapeutic or Preventive Procedure (ST) + Clinical Drug (ST) | 153,084 |
| i2b2 | Test | Laboratory Procedure (ST) + Laboratory or Test Result (ST) + Diagnostic Procedure (ST) | 66,015 |
| GENIA | protein | Amino Acid, Peptide, or Protein (ST) | 35,351 |
| GENIA | DNA | C0012854 (C) | 45,671 |
| GENIA | RNA | C0035668 (C) | 1,029 |
| GENIA | cell type | C0007600 (C) | 423 |
| GENIA | cell line | C0449475 (C) | 264,729 |
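
For readers who want to work with these mappings programmatically, the table can be written down as a small lookup structure. The following is only a minimal sketch: the names `DomainPart`, `DOMAIN_REPRESENTATIONS`, and `seed_term_total` are hypothetical and do not come from the paper, and the code only transcribes the table above. It does not query any terminology resource or reproduce how the seed terms were actually collected.

```python
from typing import NamedTuple


class DomainPart(NamedTuple):
    """One component of a domain representation (hypothetical helper type)."""
    name: str  # semantic type/group name, or concept identifier
    kind: str  # "ST" (semantic type), "SG" (semantic group), or "C" (concept)


# (dataset, class) -> (domain representation parts, number of seed terms).
# Values are transcribed from Table 2; nothing here is computed or looked up.
DOMAIN_REPRESENTATIONS = {
    ("i2b2", "Problem"): ([DomainPart("Disorders", "SG")], 398_725),
    ("i2b2", "Treatment"): (
        [
            DomainPart("Therapeutic or Preventive Procedure", "ST"),
            DomainPart("Clinical Drug", "ST"),
        ],
        153_084,
    ),
    ("i2b2", "Test"): (
        [
            DomainPart("Laboratory Procedure", "ST"),
            DomainPart("Laboratory or Test Result", "ST"),
            DomainPart("Diagnostic Procedure", "ST"),
        ],
        66_015,
    ),
    ("GENIA", "protein"): (
        [DomainPart("Amino Acid, Peptide, or Protein", "ST")],
        35_351,
    ),
    ("GENIA", "DNA"): ([DomainPart("C0012854", "C")], 45_671),
    ("GENIA", "RNA"): ([DomainPart("C0035668", "C")], 1_029),
    ("GENIA", "cell type"): ([DomainPart("C0007600", "C")], 423),
    ("GENIA", "cell line"): ([DomainPart("C0449475", "C")], 264_729),
}


def seed_term_total(dataset: str) -> int:
    """Sum the seed-term counts over all classes of one dataset."""
    return sum(
        count
        for (ds, _cls), (_parts, count) in DOMAIN_REPRESENTATIONS.items()
        if ds == dataset
    )


if __name__ == "__main__":
    for ds in ("i2b2", "GENIA"):
        print(ds, seed_term_total(ds))
```

Keying on the (dataset, class) pair keeps the two corpora separate while still allowing per-dataset aggregation, as in `seed_term_total`.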