Author manuscript; available in PMC: 2018 Nov 1.

Published in final edited form as: J Biomed Inform. 2017 Jun 7;75 Suppl:S112–S119. doi: 10.1016/j.jbi.2017.06.007

Table 1.

Results for the different models, both during 10-fold cross-validation (CV) and on the test run. Models marked in bold were submitted to the challenge.

System	MAE (development, 10-fold CV)	F-measure (development, 10-fold CV)	MAE (testing)	F-measure (testing)
Random baseline	50.96% (SD 5.32)	24.44% (SD 5.26)	57.28%	30.90%
Bag of words (baseline)	74.65% (SD 5.05)	43.53% (SD 9.15)	75.91%	54.31%
UMLS concepts (baseline)	72.76% (SD 4.42)	42.88% (SD 7.95)	72.88%	47.37%
UMLS concepts with context	75.49% (SD 3.73)	51.21% (SD 8.60)	79.41%	62.42%
DSM with context	72.82% (SD 3.22)	47.13% (SD 6.32)	71.91%	49.95%
DSM+1 with context	78.30% (SD 2.65)	57.05% (SD 4.58)	79.52%	61.15%
DSM+1 with context, bootstrapping, outlier removal	78.77% (SD 3.61)	56.70% (SD 6.66)	80.64%	63.67%
DSM+2 with context	76.81% (SD 3.73)	53.62% (SD 5.52)	79.78%	61.09%
SNOMED+1 with context	75.97% (SD 3.50)	52.69% (SD 6.06)	79.73%	61.38%
Question sets	78.01% (SD 2.63)	53.57% (SD 5.24)	79.34%	60.46%
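
The development-phase cells above report a mean over the 10 cross-validation folds together with a standard deviation (SD). As a minimal sketch of how such a cell could be produced from per-fold scores, the snippet below formats a list of fold scores in the same "mean% (SD x)" style. The function name `summarize_folds` and the fold values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption; the table does not state which variant was used.

```python
from statistics import mean, stdev


def summarize_folds(scores):
    """Format per-fold scores (fractions in [0, 1]) as 'mean% (SD x.xx)'.

    Assumption: SD is the sample standard deviation (n - 1 denominator).
    """
    pct = [100.0 * s for s in scores]
    return f"{mean(pct):.2f}% (SD {stdev(pct):.2f})"


# Hypothetical per-fold scores for illustration only; not taken from the paper.
example_folds = [0.70, 0.80, 0.75, 0.85, 0.80, 0.75, 0.70, 0.80, 0.85, 0.80]
summary = summarize_folds(example_folds)
print(summary)
```

Swapping `stdev` for `statistics.pstdev` would give the population standard deviation instead, which yields slightly smaller values for 10 folds.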