2021 Apr 22;16(8):1287–1295. doi: 10.1007/s11548-021-02370-9

Table 1.

Classification performance of the tested methods

Method	Procedural P	Procedural R	Procedural F1	Non-Procedural P	Non-Procedural R	Non-Procedural F1	Accuracy	Macro P	Macro R	Macro F1	Weighted P	Weighted R	Weighted F1
RandomForest	0.738	0.913	0.816	0.747	0.443	0.556	0.740	0.743	0.678	0.686	0.741	0.740	0.721
MultinomialNaïveBayes	0.717	0.965	0.823	0.852	0.344	0.491	0.737	0.785	0.655	0.657	0.767	0.737	0.701
LinearSVM	0.706	0.964	0.815	0.835	0.308	0.450	0.723	0.770	0.636	0.633	0.753	0.723	0.681
LogisticRegression	0.678	0.981	0.802	0.861	0.199	0.323	0.694	0.770	0.590	0.562	0.745	0.694	0.626
FastText	0.821	0.846	0.833	0.720	0.683	0.701	0.786	0.771	0.765	0.767	0.784	0.786	0.785
FastText[bal]	0.824	0.846	0.835	0.722	0.689	0.705	0.788	0.773	0.767	0.770	0.786	0.788	0.787
1D-CNN	0.889	0.834	0.861	0.742	0.821	0.780	0.829	0.816	0.828	0.820	0.835	0.829	0.831
1D-CNN[bal]	0.881	0.851	0.866	0.758	0.803	0.780	0.833	0.819	0.827	0.823	0.836	0.833	0.834
BiLSTM	0.894	0.896	0.895	0.820	0.817	0.818	0.867	0.857	0.856	0.857	0.867	0.867	0.867
BiLSTM[bal]	0.887	0.910	0.898	0.837	0.801	0.819	0.870	0.862	0.855	0.859	0.869	0.870	0.869
BERT	0.875	0.916	0.895	0.843	0.775	0.808	0.864	0.859	0.845	0.851	0.863	0.864	0.863
BERT[bal]	0.867	0.922	0.894	0.850	0.757	0.801	0.862	0.859	0.840	0.847	0.861	0.862	0.860
ClinicalBERT	0.886	0.915	0.900	0.845	0.797	0.821	0.872	0.866	0.856	0.860	0.871	0.871	0.871
ClinicalBERT[bal]	0.874	0.922	0.897	0.851	0.771	0.809	0.866	0.862	0.846	0.853	0.865	0.866	0.865

“[bal]” indicates training on a dataset balanced 50–50 between the two classes by upsampling.

Bold values indicate the highest values of the Macro-F1 and Weighted-F1 for each category of classification method considered
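
The relationship between the per-class columns and the macro columns can be checked with a short calculation. The sketch below is not taken from the paper: it assumes the standard definitions (F1 as the harmonic mean of precision and recall, macro scores as the unweighted mean over the two classes), and the helper names `f1_score` and `macro_average` are hypothetical. It uses the BiLSTM row as input. Because the table values are rounded to three decimals, recomputed values may differ from the reported ones in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(values) -> float:
    """Unweighted mean of a per-class metric."""
    values = list(values)
    return sum(values) / len(values)


# Per-class precision and recall copied from the BiLSTM row of Table 1.
procedural = {"P": 0.894, "R": 0.896}
non_procedural = {"P": 0.820, "R": 0.817}

# Per-class F1, recomputed from the rounded P and R values.
f1_procedural = f1_score(procedural["P"], procedural["R"])
f1_non_procedural = f1_score(non_procedural["P"], non_procedural["R"])

# Macro scores: plain averages over the two classes (assumed definition).
macro_p = macro_average([procedural["P"], non_procedural["P"]])
macro_r = macro_average([procedural["R"], non_procedural["R"]])
macro_f1 = macro_average([f1_procedural, f1_non_procedural])

print(f"F1 (Procedural):     {f1_procedural:.3f}")
print(f"F1 (Non-Procedural): {f1_non_procedural:.3f}")
print(f"Macro P/R/F1:        {macro_p:.3f} / {macro_r:.3f} / {macro_f1:.3f}")
```

The weighted columns cannot be recomputed the same way from the table alone, because they depend on the number of test sentences per class, which the table does not report.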