Table 2. Performance of models inferring SBDH labels, evaluated using five-fold cross-validation.
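SBDH label | +/− | Structured EHR features: F1 | Structured EHR features: P | Structured EHR features: R | Text only: F1 | Text only: P | Text only: R | Structured EHR + text: F1 | Structured EHR + text: P | Structured EHR + text: R |
---|---|---|---|---|---|---|---|---|---|---|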
Sexual history | ||||||||||
LGBT status | 263/796 | 54.4 ± 5.1 | 55.9 ± 4.9 | 58.1 ± 5.7 | 79.2 ± 4.3 | 84.8 ± 5.3 | 74.7 ± 5.7 | 82.7 ± 4.0 | 86.1 ± 4.9 | 80.0 ± 5.7 |
History of STIs | 204/603 | 32.3 ± 6.4 | 30.1 ± 7.5 | 48.2 ± 8.0 | 48.9 ± 6.3 | 50.1 ± 7.7 | 56.7 ± 7.7 | 54.0 ± 6.7 | 54.2 ± 7.9 | 53.7 ± 8.0 |
Unsafe sex | 160/647 | 21.1 ± 6.4 | 21.3 ± 7.5 | 35.0 ± 7.4 | 43.8 ± 6.3 | 52.1 ± 7.7 | 38.9 ± 7.7 | 38.5 ± 6.5 | 46.0 ± 7.5 | 35.8 ± 8.0 |
Alcohol use | ||||||||||
Social alcohol use | 252/940 | 27.9 ± 5.6 | 35.0 ± 7.3 | 23.8 ± 5.3 | 39.2 ± 6.7 | 49.4 ± 8.8 | 32.7 ± 6.7 | 40.1 ± 6.5 | 51.6 ± 8.6 | 33.2 ± 6.7 |
Alcoholism | 165/1,027 | 33.4 ± 8.6 | 49.9 ± 11.5 | 42.4 ± 8.3 | 50.0 ± 7.9 | 61.2 ± 10.3 | 42.4 ± 8.3 | 52.0 ± 7.9 | 62.8 ± 10.3 | 44.8 ± 8.5 |
Substance use | ||||||||||
Marijuana use | 210/1,052 | 29.0 ± 7.4 | 52.5 ± 11.1 | 21.4 ± 6.4 | 49.8 ± 6.8 | 51.7 ± 7.8 | 49.0 ± 8.3 | 56.4 ± 6.8 | 57.8 ± 7.8 | 55.7 ± 8.6 |
Cocaine abuse | 274/988 | 56.2 ± 5.6 | 70.2 ± 7.3 | 47.0 ± 6.3 | 62.1 ± 5.5 | 67.2 ± 7.3 | 58.4 ± 6.3 | 65.1 ± 5.1 | 66.0 ± 6.2 | 64.6 ± 7.0 |
Opioid abuse | 99/1,163 | 30.9 ± 9.9 | 48.8 ± 16.6 | 23.2 ± 8.5 | 37.9 ± 10.7 | 48.7 ± 15.1 | 23.2 ± 10.3 | 40.0 ± 11.8 | 48.3 ± 14.7 | 34.4 ± 12.0 |
Intravenous drug abuse | 65/1,197 | 13.8 ± 9.6 | 19.9 ± 14.2 | 10.8 ± 10.0 | 27.3 ± 11.5 | 43.4 ± 19.6 | 21.5 ± 10.2 | 28.5 ± 12.3 | 38.3 ± 22.0 | 23.1 ± 10.1 |
Amphetamine abuse | 36/1,226 | 33.6 ± 16.3 | 55.4 ± 36.7 | 27.5 ± 17.8 | 47.0 ± 19.5 | 68.4 ± 31.1 | 42.5 ± 18.4 | 51.1 ± 17.1 | 51.4 ± 19.7 | 53.5 ± 21.9 |
Housing status | ||||||||||
Unstable housing | 262/978 | 27.4 ± 5.6 | 35.0 ± 6.0 | 23.6 ± 6.4 | 49.3 ± 6.4 | 59.4 ± 7.8 | 42.3 ± 7.5 | 53.1 ± 6.4 | 62.2 ± 5.8 | 46.9 ± 7.2 |
Abbreviations: EHR, electronic health record; LGBT, lesbian, gay, bisexual, transgender; P, precision; R, recall; SBDH, social and behavioral determinants of health; STI, sexually transmitted infection. Values following ± are standard deviations estimated using the bootstrap method.
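
The table does not describe how the bootstrap was carried out, so the following is only a minimal sketch of one common approach: resample the evaluated examples with replacement, recompute precision, recall, and F1 on each resample, and take the standard deviation of each metric across resamples. The function names, the number of resamples, and the seed are illustrative assumptions, not details from the study.

```python
import random
import statistics


def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels, treating 1 as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def bootstrap_sd(y_true, y_pred, n_resamples=1000, seed=0):
    """Estimate the SD of (P, R, F1) by resampling examples with replacement.

    n_resamples and seed are assumed values chosen for illustration.
    """
    rng = random.Random(seed)
    n = len(y_true)
    samples = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        samples.append(
            precision_recall_f1([y_true[i] for i in idx], [y_pred[i] for i in idx])
        )
    # Transpose into one list per metric, then take the SD of each list.
    return tuple(statistics.stdev(metric) for metric in zip(*samples))


if __name__ == "__main__":
    # Toy labels, not data from the study.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
    print(precision_recall_f1(y_true, y_pred))
    print(bootstrap_sd(y_true, y_pred))
```

With few positive examples, as in the amphetamine and intravenous drug abuse rows, resamples differ widely in how many positives they contain, which is consistent with the larger ± values reported for those labels.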