2014 Jun 14;15:187. doi: 10.1186/1471-2105-15-187

Table 1.

Four medical informatics datasets used in experiments

| # | Dataset | # of concepts | # of terms | Size in kilobytes |
|---|---------|---------------|------------|-------------------|
| D1 | The UMLS most frequent concepts from multiple sources | 100 | 4,979 | 369 |
| D2 | The SNOMED CT most frequent concepts | 155 | 5,000 | 281 |
| D3 | The UMLS concepts with longest terms (“longest concepts”) | 3,337 | 5,000 | 1,693 |
| D4 | The SNOMED CT longest concepts | 1,805 | 5,000 | 903 |