Author manuscript; available in PMC: 2022 Oct 19.

Published in final edited form as: Proc Int Conf Comput Ling. 2022 Oct;2022:2979–2991.

Table 2:

ROUGE-L F-score (RL-F), sentence embedding cosine similarity (Sent.θ), BERTScore (BS), and CUI F-score (CUI) for T5 and BART fine-tuned on two input settings: Assessment (Assmt) and Assessment with Subjective sections (A+Subj). ++ denotes training with data augmentation.

Model	Setting	Explicit Mentions				Direct Problems				Indirect Problems				All Problems
		RL-F	Sent.θ	BS	CUI	RL-F	Sent.θ	BS	CUI	RL-F	Sent.θ	BS	CUI	RL-F	Sent.θ	BS	CUI
Rule-based	Assmt	34.45	58.81	59.80	38.97	12.31	55.33	40.13	34.23	9.49	55.58	44.46	33.16	13.45	68.61	50.32	43.93
T5	Assmt	32.77	59.57	57.75	41.73	13.68	53.44	39.72	36.10	10.40	54.76	44.16	35.08	14.82	67.49	49.89	44.51
T5	Assmt ++	31.76	58.74	57.12	42.19	13.78	53.65	40.30	35.84	10.55	54.10	43.48	35.20	15.00	67.32	50.36	44.55
T5	A+Subj	20.24	50.04	47.55	33.44	9.52	51.91	39.72	30.43	7.10	54.14	43.87	30.29	10.89	64.63	49.75	39.02
T5	A+Subj ++	20.72	59.64	57.97	33.56	9.46	53.55	39.52	18.76	7.35	54.69	44.36	14.40	10.93	67.19	50.42	24.83
BART	Assmt	25.70	54.98	52.99	32.49	10.00	53.66	39.08	29.41	8.04	54.66	43.12	29.04	11.56	66.86	48.48	38.36
BART	Assmt ++	28.22	57.04	55.16	32.28	10.33	53.40	39.21	30.75	8.29	54.48	44.01	32.08	11.65	66.67	49.23	40.69
BART	A+Subj	18.80	49.19	46.77	26.96	7.04	51.70	38.24	25.30	6.00	54.29	43.71	26.01	9.25	64.95	48.19	34.02
BART	A+Subj ++	20.23	57.91	54.68	32.91	7.88	53.85	40.21	30.09	6.85	54.61	43.15	30.12	9.84	67.00	49.70	38.72
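
As a reading aid for the RL-F and CUI columns, the following is a minimal sketch of how a ROUGE-L F-score and a set-overlap F-score can be computed. It is not the evaluation code behind the table. It assumes lowercased whitespace tokenization, the balanced F1 form of ROUGE-L, and CUIs that have already been extracted by some concept mapper. The function names and the example CUI strings are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def rouge_l_f(candidate, reference):
    """ROUGE-L F1 between two strings.

    Assumption: lowercased whitespace tokens and balanced F1; published
    ROUGE implementations may tokenize or weight recall differently.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    return f1(lcs / len(cand), lcs / len(ref))


def cui_f(predicted_cuis, gold_cuis):
    """F1 over two collections of concept identifiers, compared as sets.

    Assumption: the CUIs were produced beforehand by a concept extractor,
    which is outside the scope of this sketch.
    """
    predicted, gold = set(predicted_cuis), set(gold_cuis)
    if not predicted or not gold:
        return 0.0
    overlap = len(predicted & gold)
    return f1(overlap / len(predicted), overlap / len(gold))


if __name__ == "__main__":
    # Two of three tokens are shared in order, so precision = recall = 2/3.
    print(round(100 * rouge_l_f("acute kidney injury", "acute renal injury"), 2))
    # Placeholder identifiers, not real UMLS CUIs.
    print(round(100 * cui_f(["C0000001", "C0000002"],
                            ["C0000002", "C0000003", "C0000004"]), 2))
```

Both functions return values in [0, 1]; the table reports scores on a 0 to 100 scale, which is why the usage lines multiply by 100.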