Author manuscript; available in PMC: 2023 Jun 26.

Published in final edited form as: J Biomed Inform. 2023 Feb 6;139:104302. doi: 10.1016/j.jbi.2023.104302

Table 5.

Event and attribute classification performance with gold standard medication mentions.

Model	Metric	MedSingleTask		MedMultiTask		MedSpan		MedIdentifiers		MedIDTyped
		Macro	Micro	Macro	Micro	Macro	Micro	Macro	Micro	Macro	Micro
event	P	0.771	0.884	0.774	0.894	0.830	0.917	0.848	0.931	0.798	0.913
	R	0.697	0.884	0.739	0.894	0.818	0.917	0.844	0.931	0.851	0.913
	F1	0.729	0.884	0.755	0.894	0.824	0.917^†	0.846	0.931^‡	0.815	0.913^†

action	P	0.704	0.728	0.789	0.741	0.856	0.808	0.825	0.797	0.884	0.821
	R	0.482	0.479	0.570	0.616	0.671	0.726	0.685	0.739	0.706	0.762
	F1	0.568	0.578	0.646	0.673^δ	0.739	0.765^†	0.742	0.767^†	0.775	0.791^‡

actor	P	0.614	0.761	0.675	0.833	0.711	0.857	0.755	0.865	0.721	0.876
	R	0.513	0.707	0.528	0.684	0.552	0.782	0.623	0.811	0.564	0.805
	F1	0.554	0.733	0.592	0.751	0.611	0.818^†	0.677	0.837^†	0.622	0.839^†

temporality	P	0.707	0.729	0.768	0.802	0.727	0.785	0.724	0.804	0.743	0.812
	R	0.570	0.622	0.592	0.645	0.631	0.691	0.655	0.749	0.651	0.746
	F1	0.629	0.671	0.665	0.715^δ	0.675	0.735^δ	0.687	0.776^†	0.691	0.778^†

certainty	P	0.580	0.730	0.670	0.806	0.748	0.846	0.760	0.856	0.737	0.851
	R	0.656	0.713	0.555	0.635	0.684	0.749	0.782	0.795	0.718	0.779
	F1	0.611	0.722	0.598	0.710	0.701	0.795^†	0.766	0.824^†	0.711	0.813^†

Overall^*	P	0.675	0.766	0.735	0.814	0.774	0.843	0.782	0.851	0.777	0.855
	R	0.584	0.681	0.597	0.694	0.671	0.773	0.718	0.805	0.698	0.801
	F1	0.618	0.718	0.651	0.748^δ	0.710	0.806^†	0.744	0.827^†	0.723	0.827^†

^* Overall score is an unweighted average of the event and attribute scores.

^‡ indicates a statistically significant improvement ($p < 0.05$) over all other models.

^† indicates a statistically significant improvement ($p < 0.05$) over MedSingleTask and MedMultiTask.

^δ indicates a statistically significant improvement over MedSingleTask.
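As a quick check on how the Overall rows are derived, the sketch below recomputes the Overall F1 for MedSingleTask as the unweighted mean of the event and four attribute F1 scores. The values are copied from the MedSingleTask columns of the table; the dictionary and function names are illustrative assumptions, not part of the authors' code.

```python
# F1 scores (macro, micro) copied from the MedSingleTask columns of Table 5.
# The names below are hypothetical and chosen only for this illustration.
med_single_task_f1 = {
    "event": (0.729, 0.884),
    "action": (0.568, 0.578),
    "actor": (0.554, 0.733),
    "temporality": (0.629, 0.671),
    "certainty": (0.611, 0.722),
}


def overall_f1(scores):
    """Return the unweighted mean of the macro and micro F1 scores, to 3 decimals."""
    n = len(scores)
    macro = sum(macro_f1 for macro_f1, _ in scores.values()) / n
    micro = sum(micro_f1 for _, micro_f1 in scores.values()) / n
    return round(macro, 3), round(micro, 3)


print(overall_f1(med_single_task_f1))  # (0.618, 0.718), matching the Overall F1 row
```

The same averaging applied to the other model columns should reproduce their Overall F1 entries, up to rounding of the reported per-label values.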