2023 Apr 26;30(8):1418–1428. doi: 10.1093/jamia/ocad080

Table 4.

External validation results of the performance of conventional machine learning, deep learning, and transformer-based classifiers of preoperative cannabis use status documentation in unstructured narrative clinical notes using 1258 matching notes from 500 random patients in the MIMIC-III database

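| Class | LR P^a | LR R^b | LR F^c | L-SVM P^a | L-SVM R^b | L-SVM F^c | W2V CNN P^a | W2V CNN R^b | W2V CNN F^c | BERT-Base P^a | BERT-Base R^b | BERT-Base F^c | Bio_ClinicalBERT P^a | Bio_ClinicalBERT R^b | Bio_ClinicalBERT F^c | Support (N)^d |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Not a true cannabis mention | 1.00 | 0.99 | 1.00 | 1.00 | 0.99 | 1.00 | 0.97 | 0.99 | 0.98 | 0.99 | 1.00 | 0.99 | 1.00 | 0.99 | 1.00 | 135 |
| Positive current use | 0.85 | 0.82 | 0.84 | 0.83 | 0.82 | 0.83 | 0.84 | 0.81 | 0.82 | 0.87 | 0.90 | 0.88 | 0.87 | 0.91 | 0.89 | 67 |
| Positive past use | 0.73 | 0.80 | 0.77 | 0.73 | 0.78 | 0.75 | 0.77 | 0.83 | 0.80 | 0.89 | 0.83 | 0.86 | 0.89 | 0.78 | 0.83 | 41 |
| Negative current use | 0.78 | 0.70 | 0.74 | 0.78 | 0.70 | 0.74 | 0.75 | 0.60 | 0.67 | 0.89 | 0.80 | 0.84 | 0.62 | 0.80 | 0.70 | 10 |
| Weighted average | 0.91 | 0.91 | 0.91 | 0.90 | 0.90 | 0.90 | 0.90 | 0.90 | 0.90 | 0.94 | 0.94 | 0.94 | 0.93 | 0.93 | 0.93 | 253 |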

BERT: Bidirectional Encoder Representations from Transformers; CNN: convolutional neural networks; LR: logistic regression; L-SVM: linear support vector machines.

a. Precision/positive predictive value.

b. Recall/sensitivity.

c. F score = 2×([Precision×Recall]/[Precision+Recall]).

d. Number of snippets included in model evaluation.
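
The F score formula in footnote c can be checked against the table with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the support-weighted form of the "Weighted average" row is an assumption (a common convention), not something the table states.

```python
def f_score(precision: float, recall: float) -> float:
    """F score as defined in footnote c: 2 x (P x R) / (P + R)."""
    if precision + recall == 0:
        return 0.0  # assumed convention to avoid division by zero
    return 2 * (precision * recall) / (precision + recall)


def weighted_average(values, supports):
    """Per-class values averaged with class support as weights (assumed convention)."""
    return sum(v * n for v, n in zip(values, supports)) / sum(supports)


# BERT-Base, "Positive current use" row: P = 0.87, R = 0.90
bert_current_use_f = f_score(0.87, 0.90)

# LR per-class F scores and supports, copied from the table
lr_f_scores = [1.00, 0.84, 0.77, 0.74]
supports = [135, 67, 41, 10]
lr_weighted_f = weighted_average(lr_f_scores, supports)

print(round(bert_current_use_f, 2))  # compare with the BERT-Base F column
print(round(lr_weighted_f, 2))       # compare with the LR weighted-average F
```

Because the table reports values rounded to two decimals, recomputing from the rounded inputs may differ from a published cell by 0.01.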