Author manuscript; available in PMC: 2019 Jul 25.

Published in final edited form as: J Biomed Inform. 2017 Jul 8;72:85–95. doi: 10.1016/j.jbi.2017.07.006

Table 1.

Distribution of relation classes in the training and test datasets. Both counts and percentages are shown. “PP None” indicates a None relation between medical problems. “TeP None” indicates a None relation between tests and medical problems. “TrP None” indicates a None relation between treatments and medical problems.

Relation Type	Training	Training %	Test	Test %	Effective Training*
PIP	1239	38.4%	1986	61.6%	1123
PP None	7349	39.64%	11190	60.36%	4453
TeCP	303	34.0%	588	66.0%	271
TeRP	1734	36.4%	3033	63.6%	1564
TeP None	1535	38.50%	2452	61.50%	1379
TrAP	1423	36.4%	2487	63.6%	1284
TrCP	296	40.0%	444	60.0%	270
TrIP	107	35.1%	198	64.9%	100
TrNAP	106	35.7%	191	64.3%	101
TrWP	56	28.1%	143	71.9%	48
TrP None	2329	40.05%	3486	59.95%	2081

* Effective Training denotes the number of samples used to train each class. It is smaller than the number of samples in the training dataset because 10% of the training dataset was randomly allocated as a validation set for all relations, and because PP relations were down-sampled. We refer the reader to the Experiments and Results section for more detail.
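The percentage columns appear to give each class's share of its combined training and test samples, which is consistent with the counts in the table. The following minimal Python sketch recomputes those shares for three rows copied from Table 1. The names `COUNTS`, `summarize`, and `retained_pct` are hypothetical and introduced only for illustration. In particular, `retained_pct` (Effective Training divided by Training) is not a quantity reported in the table; it is derived here to show how much of each class remains after the validation split and, for PP relations, down-sampling.

```python
# Counts copied from Table 1: (training, test, effective training).
COUNTS = {
    "PIP": (1239, 1986, 1123),
    "PP None": (7349, 11190, 4453),
    "TrWP": (56, 143, 48),
}


def summarize(train, test, effective):
    """Return percentage shares for one relation class.

    Assumption: "Training %" and "Test %" are shares of the class's
    combined training + test count. "retained_pct" is an illustrative
    derived value, not a column of the original table.
    """
    total = train + test
    return {
        "train_pct": 100 * train / total,
        "test_pct": 100 * test / total,
        "retained_pct": 100 * effective / train,
    }


for name, (train, test, effective) in COUNTS.items():
    s = summarize(train, test, effective)
    print(
        f"{name}: train {s['train_pct']:.2f}%, "
        f"test {s['test_pct']:.2f}%, "
        f"retained {s['retained_pct']:.1f}%"
    )
```

Because the validation samples are drawn at random from the whole training set, the retained share of an individual class need not be exactly 90%. The down-sampled PP None class retains a much smaller share.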