PeerJ Comput Sci. 2024 Oct 22;10:e2293. doi: 10.7717/peerj-cs.2293

Table 6. IAA and concordance between the model and humans regarding arguments identified in sentences.

| Argument | Annotator 1 (N) | Annotator 1 (Argument) | Annotator 2 (N) | Annotator 2 (Argument) | Neural network (N) | Neural network (Argument) | Precision | Recall | F1 score | K (Index ± SE) |
|---|---|---|---|---|---|---|---|---|---|---|
| PAR_RDNS | 6,207 (74%) | 297 (4%) | 277 (3%) | 1,633 (19%) | | | 0.8550 | 0.8461 | 0.8505 | 0.8063 ± 0.0078 |
| CHILD_OPIN | 6,207 (74%) | 277 (3%) | | | | | | | | |
| PSY_REP | 7,835 (93%) | 90 (1%) | 114 (1%) | 375 (4%) | 2,177 (65%) | 128 (4%) | 0.8698 | 0.8358 | 0.8524 | 0.7888 ± 0.0117 |
| PAR_RELAT | 7,951 (94%) | 92 (1%) | 90 (1%) | 281 (3%) | 3,195 (96%) | 18 (1%) | 0.7574 | 0.7534 | 0.7554 | 0.7441 ± 0.0182 |
| BEST_INT | 5,974 (71%) | 563 (7%) | 299 (4%) | 1,578 (19%) | 3,184 (96%) | 32 (1%) | 0.8407 | 0.7370 | 0.7855 | 0.7186 ± 0.0090 |
| PAR_DED | 7,435 (88%) | 158 (2%) | 270 (3%) | 551 (7%) | 2,064 (78%) | 104 (3%) | 0.6711 | 0.7772 | 0.7203 | 0.6924 ± 0.0140 |
| | 3,004 (90%) | 62 (2%) | 78 (2%) | 184 (4%) | | | | | | |
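
The summary columns of Table 6 can be related to the raw counts with a small calculation. The sketch below is not taken from the article: it assumes that K is Cohen's kappa with the simple large-sample standard error, and that the four counts in the PAR_RDNS row form a 2×2 agreement table in the order both-negative, two disagreement cells, both-positive. Which disagreement cell counts as a false negative and which as a false positive is also an assumption (kappa does not depend on it). The function name `agreement_metrics` and its argument names are hypothetical. Under these assumptions the PAR_RDNS counts reproduce that row's reported precision, recall, F1 score, and K ± SE to four decimals.

```python
import math


def agreement_metrics(both_neg, false_neg, false_pos, both_pos):
    """Cohen's kappa (with a simple large-sample SE), precision, recall and F1
    from a 2x2 agreement table. One rater is treated as the reference; that
    choice only affects which of precision/recall is which, not kappa."""
    n = both_neg + false_neg + false_pos + both_pos

    # Observed agreement: share of sentences where both raters gave the same label.
    p_o = (both_neg + both_pos) / n

    # Chance agreement from the marginal totals of each rater.
    ref_pos = both_pos + false_neg      # reference rater labelled "argument"
    other_pos = both_pos + false_pos    # other rater labelled "argument"
    p_e = (ref_pos * other_pos + (n - ref_pos) * (n - other_pos)) / n**2

    kappa = (p_o - p_e) / (1 - p_e)
    # Assumed SE formula: sqrt(p_o (1 - p_o) / (n (1 - p_e)^2)).
    kappa_se = math.sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2))

    precision = both_pos / (both_pos + false_pos)
    recall = both_pos / (both_pos + false_neg)
    f1 = 2 * precision * recall / (precision + recall)

    return {
        "kappa": kappa,
        "kappa_se": kappa_se,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Counts from the PAR_RDNS row; the cell order is an assumption (see above).
par_rdns = agreement_metrics(both_neg=6207, false_neg=297, false_pos=277, both_pos=1633)

for name, value in par_rdns.items():
    print(f"{name}: {value:.4f}")
```

The other rows cannot be checked the same way from the text as extracted, because their neural-network counts are incomplete.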