Table 2. Distribution of Documents and Disease-Drug Associations
| Disease | Source/Annotation | Total Documents | Disease-Drug Associations | “True” Disease-Drug Associations |
|---|---|---|---|---|
| Acquired immunodeficiency syndrome | RCT/MeSH | 270 | 75 | 20 (22) |
| | RCT/UMLS | 270 | 106 | 20 |
| | DSUM/UMLS 2003 | 685 | 724 | 35 |
| | DSUM/UMLS 2004 | 805 | 755 | 37 |
| Asthma | RCT/MeSH | 3,349 | 215 | 13 (18) |
| | RCT/UMLS | 3,349 | 425 | 29 |
| | DSUM/UMLS 2003 | 1,332 | 889 | 23 |
| | DSUM/UMLS 2004 | 1,457 | 956 | 18 |
| Breast neoplasms | RCT/MeSH | 1,931 | 191 | 8 (8) |
| | RCT/UMLS | 1,931 | 210 | 2 |
| | DSUM/UMLS 2003 | 350 | 610 | 4 |
| | DSUM/UMLS 2004 | 391 | 679 | 8 |
| Congestive heart failure | RCT/MeSH | 1,521 | 246 | 10 (11) |
| | RCT/UMLS | 1,521 | 433 | 16 |
| | DSUM/UMLS 2003 | 1,817 | 1,157 | 13 |
| | DSUM/UMLS 2004 | 1,916 | 1,212 | 22 |
| Diabetes mellitus | RCT/MeSH | 2,202 | 172 | 26 (27) |
| | RCT/UMLS | 2,202 | 241 | 47 |
| | DSUM/UMLS 2003 | 3,926 | 874 | 4 |
| | DSUM/UMLS 2004 | 4,407 | 894 | 7 |
| Parkinson’s disease | RCT/MeSH | 494 | 80 | 10 (11) |
| | RCT/UMLS | 494 | 135 | 5 |
| | DSUM/UMLS 2003 | 211 | 450 | 5 |
| | DSUM/UMLS 2004 | 275 | 525 | 11 |
| Pneumonia | RCT/MeSH | 273 | 116 | 37 (40) |
| | RCT/UMLS | 273 | 198 | 105 |
| | DSUM/UMLS 2003 | 1,610 | 962 | 31 |
| | DSUM/UMLS 2004 | 1,794 | 1,036 | 30 |
| Schizophrenia | RCT/MeSH | 1,098 | 186 | 8 (10) |
| | RCT/UMLS | 1,098 | 241 | 10 |
| | DSUM/UMLS 2003 | 213 | 479 | 23 |
| | DSUM/UMLS 2004 | 232 | 463 | 24 |
This table presents summary statistics for each text source and annotation method across the eight diseases under investigation. “Total Documents” is the number of disease-specific documents, “Disease-Drug Associations” is the number of 2×2 tables generated for each disease paired with a generic-name drug, and “True Disease-Drug Associations” is the number of associations above the identified cutoff used for comparison.
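To make the "2×2 table" counted in the “Disease-Drug Associations” column concrete, the following is a minimal sketch of how one such table could be built for a single disease-drug pair from document-level concept annotations. The function names, the representation of a document as a set of concept labels, and the use of an uncorrected chi-square statistic are all assumptions made for illustration; the table itself does not state which association statistic or cutoff was applied.

```python
from typing import Iterable, NamedTuple, Set


class TwoByTwo(NamedTuple):
    """Document counts for one disease-drug pair (hypothetical layout)."""
    both: int          # documents annotated with disease and drug
    disease_only: int  # disease without the drug
    drug_only: int     # drug without the disease
    neither: int       # neither concept


def two_by_two(docs: Iterable[Set[str]], disease: str, drug: str) -> TwoByTwo:
    """Count co-occurrence of a disease and a drug across annotated documents."""
    both = disease_only = drug_only = neither = 0
    for concepts in docs:
        has_disease = disease in concepts
        has_drug = drug in concepts
        if has_disease and has_drug:
            both += 1
        elif has_disease:
            disease_only += 1
        elif has_drug:
            drug_only += 1
        else:
            neither += 1
    return TwoByTwo(both, disease_only, drug_only, neither)


def chi_square(t: TwoByTwo) -> float:
    """Uncorrected chi-square for a 2x2 table (assumed statistic, not from the source)."""
    a, b, c, d = t
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a zero margin leaves the statistic undefined; treat as no association
    return n * (a * d - b * c) ** 2 / denom
```

Under these assumptions, a pair would count as a "true" association when its statistic exceeds the chosen cutoff, so the last column of the table is a filtered subset of the 2×2 tables counted in the column before it.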
The same set of documents was used for RCT/MeSH and RCT/UMLS.
For comparison, entities in RCT/MeSH represented by MeSH identifiers were mapped to UMLS concepts. Because some mappings are one-to-many, the “True Disease-Drug Associations” column gives two numbers for RCT/MeSH: the count before mapping, followed in parentheses by the count after mapping.
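The effect of one-to-many mapping on the counts (for example, 20 before mapping and 22 after for acquired immunodeficiency syndrome) can be illustrated with a short sketch. The identifiers and the mapping dictionary below are placeholders invented for this example, not real MeSH or UMLS identifiers, and the function name is hypothetical.

```python
from typing import Dict, Iterable, Set


def map_to_umls(mesh_ids: Iterable[str], mesh_to_umls: Dict[str, Set[str]]) -> Set[str]:
    """Expand MeSH identifiers to the set of UMLS concepts they map to.

    A MeSH identifier with several UMLS targets contributes several concepts,
    so the mapped set can be larger than the input set.
    """
    mapped: Set[str] = set()
    for mesh_id in mesh_ids:
        # Identifiers with no mapping are silently dropped in this sketch.
        mapped.update(mesh_to_umls.get(mesh_id, set()))
    return mapped


# Placeholder identifiers, for illustration only.
true_drugs_mesh = {"MESH:A", "MESH:B", "MESH:C"}
mapping = {
    "MESH:A": {"UMLS:A1"},
    "MESH:B": {"UMLS:B1", "UMLS:B2"},  # one-to-many mapping
    "MESH:C": {"UMLS:C1"},
}
true_drugs_umls = map_to_umls(true_drugs_mesh, mapping)
print(len(true_drugs_mesh), len(true_drugs_umls))  # 3 before mapping, 4 after
```

When every identifier maps to exactly one concept, the two counts coincide, which is consistent with the breast neoplasms row, where both numbers are 8.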