Author manuscript; available in PMC: 2024 Nov 1.
Published in final edited form as: J Biomed Inform. 2023 Sep 29;147:104507. doi: 10.1016/j.jbi.2023.104507

Table 3.

Performance of TGD identification algorithms on Dataset I (development set)

| Category | Algorithm | F1 | Sensitivity | Specificity | Precision | NPV | Accuracy | AUROC (95% CI) | P value | AUPRC (95% CI) |
|---|---|---|---|---|---|---|---|---|---|---|
| Rule-based | Exact Match | 0.586 | 0.980 | 0.962 | 0.730 | 0.728 | 0.796 | | | |
| Rule-based | Augmented Match | 0.857 | 0.883 | 0.882 | 0.858 | 0.869 | 0.870 | | | |
| Rule-based | Guo et al. (single)¹* | 0.816 | 0.723 | 0.952 | 0.936 | 0.777 | 0.838 | | | |
| Rule-based | Guo et al. (combined)²* | 0.843 | 0.766 | 0.951 | 0.939 | 0.804 | 0.859 | | | |
| Machine Learning | Random Forest | 0.892 | 0.832 | 0.976 | 0.972 | 0.860 | 0.904 | 0.903 (0.879, 0.926) | 0.002 | 0.942 (0.925, 0.959) |
| Machine Learning | Support Vector Machine | 0.886 | 0.808 | 0.993 | 0.991 | 0.844 | 0.900 | 0.901 (0.877, 0.924) | <0.001 | 0.944 (0.930, 0.955) |
| Machine Learning | Linear Regression | 0.882 | 0.799 | 0.994 | 0.991 | 0.837 | 0.896 | 0.898 (0.874, 0.921) | <0.001 | 0.947 (0.932, 0.959) |
| Machine Learning | XGBoost | 0.892 | 0.828 | 0.978 | 0.975 | 0.858 | 0.903 | 0.900 (0.876, 0.923) | 0.002 | 0.946 (0.927, 0.963) |
| Deep Learning | ClinicalBERT_TGD | 0.917 | 0.854 | 0.983 | 0.980 | 0.865 | 0.912 | 0.923 (0.902, 0.945) | | 0.958 (0.945, 0.973) |

P values were calculated to compare the AUROC between ClinicalBERT_TGD and other machine learning baselines using the two-sided DeLong test.
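As an illustration of the comparison described in the note above, the following is a minimal sketch of a two-sided DeLong test for two correlated AUROCs, written in plain Python. It is not the authors' code: the function names (`delong_test`, `_placements`, `_sample_var`) and the toy scores are assumptions made for this example, and it uses the straightforward pairwise form of the estimator rather than an optimized one.

```python
import math


def _placements(pos, neg):
    """Placement values: one per positive case (V10) and one per negative case (V01)."""
    def psi(x, y):
        # 1 if the positive outranks the negative, 0.5 on ties, 0 otherwise.
        return 1.0 if x > y else (0.5 if x == y else 0.0)
    v10 = [sum(psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01


def _sample_var(values):
    """Unbiased sample variance (requires at least two values)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def delong_test(y_true, scores_a, scores_b):
    """Return (auc_a, auc_b, two-sided p) for two models scored on the same cases."""
    pos_a = [s for s, y in zip(scores_a, y_true) if y == 1]
    neg_a = [s for s, y in zip(scores_a, y_true) if y == 0]
    pos_b = [s for s, y in zip(scores_b, y_true) if y == 1]
    neg_b = [s for s, y in zip(scores_b, y_true) if y == 0]

    v10_a, v01_a = _placements(pos_a, neg_a)
    v10_b, v01_b = _placements(pos_b, neg_b)
    auc_a = sum(v10_a) / len(v10_a)
    auc_b = sum(v10_b) / len(v10_b)

    # Variance of the AUC difference from the paired placement-value differences.
    d10 = [a - b for a, b in zip(v10_a, v10_b)]
    d01 = [a - b for a, b in zip(v01_a, v01_b)]
    var = _sample_var(d10) / len(d10) + _sample_var(d01) / len(d01)

    if var == 0.0:
        # Degenerate case (e.g. identical scores): no evidence of a difference.
        return auc_a, auc_b, 1.0
    z = (auc_a - auc_b) / math.sqrt(var)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return auc_a, auc_b, p_two_sided


# Toy data (hypothetical, not from the study): 3 positives, 3 negatives.
y = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
model_b = [0.9, 0.2, 0.7, 0.3, 0.8, 0.1]
auc_a, auc_b, p = delong_test(y, model_a, model_b)
```

The test is paired: both models must be scored on the same cases, which is why the variance is built from per-case differences in placement values rather than from two independent variance estimates.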

¹ Best single-rule algorithm was based on ≥2 diagnosis codes and ≥1 keyword.

² Best combined rule was either a gender field indicating transgender, or ≥1 diagnosis code plus ≥1 TGD keyword.

* Codes and keywords can be found in the paper by Guo et al. [17].
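The two Guo et al. rules summarized in footnotes 1 and 2 can be read as simple Boolean predicates. The sketch below is only an illustration of that logic: the `PatientRecord` class and its field names are hypothetical, and the actual diagnosis codes and keywords are those listed by Guo et al. [17], which are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    """Hypothetical per-patient summary; field names are illustrative only."""
    n_diagnosis_codes: int           # count of TGD-related diagnosis codes
    n_keywords: int                  # count of TGD keyword matches in notes
    gender_field_transgender: bool   # structured gender field indicates transgender


def best_single_rule(rec: PatientRecord) -> bool:
    # Footnote 1: >=2 diagnosis codes AND >=1 keyword.
    return rec.n_diagnosis_codes >= 2 and rec.n_keywords >= 1


def best_combined_rule(rec: PatientRecord) -> bool:
    # Footnote 2: gender field indicates transgender,
    # OR (>=1 diagnosis code AND >=1 TGD keyword).
    return rec.gender_field_transgender or (
        rec.n_diagnosis_codes >= 1 and rec.n_keywords >= 1
    )


# Example: one diagnosis code and one keyword, no structured gender flag.
example = PatientRecord(n_diagnosis_codes=1, n_keywords=1,
                        gender_field_transgender=False)
single_hit = best_single_rule(example)
combined_hit = best_combined_rule(example)
```

In this example the record satisfies the combined rule but not the single rule, because the single rule requires at least two diagnosis codes. The combined rule is the looser of the two, in line with its higher sensitivity in Table 3 (0.766 versus 0.723).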