Table 3.
Named entity recognition model performance for concepts related to suicide predictors.
Variable | Manually annotated | Correctly identified | Spurious | Missed | Precision | Recall | F1 |
---|---|---|---|---|---|---|---|
History of violence | 80 | 60 | 22 | 20 | 0.73 | 0.75 | 0.74 |
History of self-harm | 90 | 78 | 26 | 12 | 0.75 | 0.87 | 0.80 |
Formal education | 29 | 24 | 32 | 5 | 0.43 | 0.83 | 0.56 |
Medication | 719 | 692 | 128 | 27 | 0.84 | 0.96 | 0.90 |
Benefits recipient | 44 | 35 | 15 | 9 | 0.70 | 0.80 | 0.74 |
Drug/alcohol use disorder | 28 | 17 | 13 | 11 | 0.57 | 0.61 | 0.59 |
(Parental) suicide | 12 | 11 | 19 | 1 | 0.37 | 0.92 | 0.52 |
Psychiatric admission | 53 | 36 | 28 | 17 | 0.56 | 0.68 | 0.62 |
Overall micro-average | 1055 | 953 | 283 | 102 | 0.77 | 0.90 | 0.83 |
Counts in the manually annotated, correctly identified, spurious, and missed columns are absolute numbers of text spans related to each concept in the sample of free-text EHR documents used to assess the model. Spurious results are text spans identified by the model that were not annotated by the researcher (false positives); missed results are annotated text spans that the model did not identify (false negatives). Precision is correctly identified / (correctly identified + spurious), and recall is correctly identified / manually annotated. F1 is the harmonic mean of precision and recall. The overall micro-average is calculated from the counts pooled across all concepts. EHR, electronic health records.
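
The derived columns can be reproduced from the span counts. The following is a minimal Python sketch, not the study's evaluation code: the names `Counts`, `prf`, `micro_average`, and `ROWS` are illustrative assumptions, and only three rows of the table are copied in for brevity. It assumes the standard definitions of precision, recall, and F1, and that every denominator is non-zero.

```python
from typing import Dict, NamedTuple, Tuple


class Counts(NamedTuple):
    """Span counts for one concept (hypothetical container, not from the study)."""
    correct: int   # correctly identified spans (true positives)
    spurious: int  # spans the model found but the annotator did not (false positives)
    missed: int    # annotated spans the model did not find (false negatives)


def prf(c: Counts) -> Tuple[float, float, float]:
    """Return (precision, recall, F1); assumes non-zero denominators."""
    precision = c.correct / (c.correct + c.spurious)
    recall = c.correct / (c.correct + c.missed)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def micro_average(rows: Dict[str, Counts]) -> Tuple[float, float, float]:
    """Pool the counts across all concepts, then compute the metrics once."""
    pooled = Counts(
        correct=sum(c.correct for c in rows.values()),
        spurious=sum(c.spurious for c in rows.values()),
        missed=sum(c.missed for c in rows.values()),
    )
    return prf(pooled)


# Three rows copied from Table 3; the remaining rows follow the same pattern.
ROWS = {
    "History of violence": Counts(correct=60, spurious=22, missed=20),
    "Formal education": Counts(correct=24, spurious=32, missed=5),
    "Medication": Counts(correct=692, spurious=128, missed=27),
}

if __name__ == "__main__":
    for name, counts in ROWS.items():
        p, r, f = prf(counts)
        print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

Because micro-averaging pools the counts before the metrics are computed, high-volume concepts such as Medication (719 of the 1055 annotated spans) dominate the overall figures.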