2010 Sep-Oct;17(5):528–531. doi: 10.1136/jamia.2010.003855

Table 1.

Evaluation results of Vanderbilt's system for the 2009 i2b2 challenge

			Exact			Inexact
Analysis	Level	Type	F-measure	Precision	Recall	F-measure	Precision	Recall
Horizontal	System-level	System	0.821	0.839	0.803	0.822	0.866	0.782
Horizontal	Patient-level	System	0.810	0.840	0.792	0.807	0.863	0.770
Vertical	System-level	Dosage	0.855	0.895	0.818	0.880	0.930	0.835
Vertical	Patient-level	Dosage	0.830	0.878	0.802	0.857	0.915	0.823
Vertical	System-level	Frequency	0.868	0.879	0.858	0.859	0.902	0.820
Vertical	Patient-level	Frequency	0.860	0.881	0.852	0.855	0.900	0.834
Vertical	System-level	Mode	0.887	0.918	0.858	0.882	0.926	0.841
Vertical	Patient-level	Mode	0.842	0.883	0.820	0.839	0.888	0.811
Vertical	System-level	Medication	0.856	0.842	0.871	0.893	0.895	0.891
Vertical	Patient-level	Medication	0.855	0.849	0.870	0.884	0.892	0.886
Vertical	System-level	Reason	0.360	0.459	0.296	0.367	0.517	0.285
Vertical	Patient-level	Reason	0.344	0.455	0.319	0.360	0.522	0.335
Vertical	System-level	Duration	0.361	0.364	0.358	0.405	0.458	0.364
Vertical	Patient-level	Duration	0.369	0.405	0.395	0.423	0.491	0.451

‘Exact’ and ‘inexact’ matching are two ways of determining whether an extracted textual finding is correct.

Standard precision, recall, and F-measure were reported for each individual type, such as medication name, dosage, and frequency (termed the ‘vertical’ analysis), as well as for all outputs regardless of type (termed the ‘horizontal’ analysis).
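
As a rough illustration of the two analyses, the following Python sketch computes the metrics once per type and once over the pooled output. The counts, the `prf` helper, and the choice to return 0.0 when a ratio is undefined are assumptions made for this example; they are not taken from the challenge's evaluation script.

```python
def prf(tp, fp, fn):
    """Return (precision, recall, F-measure) from raw counts.

    Returns 0.0 for any ratio whose denominator is zero (an assumption
    of this sketch, not a rule stated in the source).
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical counts per type:
# (true positives, false positives, false negatives)
counts = {
    "medication": (8, 2, 2),
    "dosage": (6, 1, 3),
    "frequency": (3, 1, 1),
}

# 'Vertical' analysis: one set of metrics per individual type.
vertical = {field: prf(*c) for field, c in counts.items()}

# 'Horizontal' analysis: pool all outputs regardless of type.
total_tp = sum(c[0] for c in counts.values())
total_fp = sum(c[1] for c in counts.values())
total_fn = sum(c[2] for c in counts.values())
horizontal = prf(total_tp, total_fp, total_fn)

for field, (p, r, f) in vertical.items():
    print(f"vertical   {field:<10} P={p:.3f} R={r:.3f} F={f:.3f}")
print("horizontal all        P={:.3f} R={:.3f} F={:.3f}".format(*horizontal))
```

In this sketch the horizontal figures are dominated by the types with the most entries, which is one reason per-type figures are worth reporting alongside them.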

These measures were also calculated at two levels: the patient level and the system level.

The patient-level analysis calculated precision, recall, and F-measure for each note and reported the averages across all notes, whereas the system-level analysis calculated them over all entries pooled from all notes.
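
The difference between the two levels can be sketched in Python as follows. The per-note counts and the `prf` helper are hypothetical and serve only to show how averaging per-note scores (patient level) differs from scoring the pooled entries (system level); the actual evaluation script may handle edge cases differently.

```python
def prf(tp, fp, fn):
    """Return (precision, recall, F-measure); 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical counts for three notes:
# (true positives, false positives, false negatives)
notes = [(9, 1, 1), (1, 1, 3), (4, 0, 2)]

# System level: pool the counts from all notes, then score once.
system = prf(
    sum(n[0] for n in notes),
    sum(n[1] for n in notes),
    sum(n[2] for n in notes),
)

# Patient level: score each note separately, then average each metric.
per_note = [prf(*n) for n in notes]
patient = tuple(sum(metric) / len(notes) for metric in zip(*per_note))

print("system : P={:.3f} R={:.3f} F={:.3f}".format(*system))
print("patient: P={:.3f} R={:.3f} F={:.3f}".format(*patient))
```

Because every note carries equal weight at the patient level, a short note with poor results lowers the patient-level average more than it lowers the system-level score, so the two levels need not agree.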