2020 Nov 5;28(1):104–112. doi: 10.1093/jamia/ocaa220

Table 4.

Selection of matched characteristics with natural language processing (NLP) phrases associated with LTFU and Retention (comprehensive table in the Supplementary Material)

| Matched characteristics relatively associated with LTFU | Likely expanded phrase | Domain | Rank | Weight |
|---|---|---|---|---|
| KS | Kaposi’s Sarcoma | Opportunistic infection | 23 | 3.655 |
| syphilis | | STI | 25 | 3.613 |
| Burkitt | Burkitt’s lymphoma | Opportunistic infection | 41 | 3.348 |
| HCV | Hepatitis C virus | Comorbidities | 61 | 3.116 |
| MSM | Man who has sex with men | Sexual and gender minorities | 111 | 2.705 |
| CMV retinitis | | Opportunistic infection | 155 | 2.439 |
| K103N | | HIV genotype mutation | 177 | 2.319 |
| substance abuse | | Substance use disorder | 188 | 2.282 |
| chlamydia | | STI | 223 | 2.182 |
| tearful | | Mental illness | 232 | 2.149 |
| Human papillomavirus | | STI | 259 | 2.053 |
| HBV | Hepatitis B virus | Comorbidities | 271 | 2.039 |
| cocaine | | Substance use disorder | 294 | 1.967 |
| influenza vaccine | | Preventive health services | 298 | 1.963 |
| m184v | | HIV genotype mutation | 401 | 1.74 |
| Bipolar | | Mental illness | 407 | 1.717 |
| heroin | | Substance use disorder | 432 | 1.673 |
| Not use condom | | Condomless sex | 552 | 1.484 |
| unemployed | | Life stressors and markers of socioeconomic status | 747 | 1.273 |

| Matched characteristics relatively associated with Retention | Likely expanded phrase | Domain | Rank | Weight |
|---|---|---|---|---|
| well on art | | Good adherence | 3 | −4.698 |
| Depression | | Mental illness | 13 | −3.845 |
| Marijuana | | Substance use disorder | 71 | −2.562 |
| HSV | Herpes simplex virus | STI | 91 | −2.329 |
| toxo | toxoplasmosis | Opportunistic infection | 109 | −2.229 |
| Alcohol abuse | | Substance use disorder | 117 | −2.191 |
| cryptococcal | | Opportunistic infection | 131 | −2.091 |
| congenital HIV | | Congenital HIV | 146 | −2.043 |
| Pap | Papanicolaou smear | Preventive health services | 158 | −2.008 |
| Cesarean section | | Pregnancy | 321 | −1.529 |
| schizophrenia | | Mental illness | 680 | −1.037 |
| Excellent adherence | | Good adherence | 948 | −0.778 |
| Pregnancy | | Pregnancy | 1011 | −0.736 |

Note: LTFU = lost to follow-up. Weights and ranks show the relative importance of the feature that matched each characteristic, sorted as illustrated in Figure 2. Positive weights are relatively associated with LTFU; negative weights are relatively associated with Retention. Weight ranks start at 0 (zero). The results are from the classifier trained on the main experiment setup. The model intercept was −1.155.
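
To make the sign convention in the note concrete, the following minimal sketch shows how an intercept and per-phrase weights could combine into a single score in a linear classifier. It rests on assumptions the table does not state: that features are binary indicators (phrase present or absent) and that a logistic link turns the score into a probability. The function names `linear_score` and `sigmoid` are hypothetical, and only four weights from Table 4 are included, so the numbers are illustrative rather than a reproduction of the study's model.

```python
import math

# Intercept from the table note; weights copied from Table 4.
INTERCEPT = -1.155
WEIGHTS = {
    "KS": 3.655,            # associated with LTFU (positive)
    "cocaine": 1.967,       # associated with LTFU (positive)
    "well on art": -4.698,  # associated with Retention (negative)
    "Depression": -3.845,   # associated with Retention (negative)
}


def linear_score(present_phrases):
    """Intercept plus the weights of the phrases present.

    Assumption: each feature is a 0/1 indicator. The study's actual
    feature encoding may differ.
    """
    return INTERCEPT + sum(WEIGHTS[p] for p in present_phrases)


def sigmoid(z):
    """Logistic link (an assumption) mapping a score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# A record mentioning two LTFU-leaning phrases scores above zero.
ltfu_leaning = linear_score(["KS", "cocaine"])

# A record mentioning one Retention-leaning phrase scores below zero.
retention_leaning = linear_score(["well on art"])

print(ltfu_leaning, sigmoid(ltfu_leaning))
print(retention_leaning, sigmoid(retention_leaning))
```

Under these assumptions, a score above zero leans toward LTFU and a score below zero leans toward Retention, which mirrors the positive and negative weights described in the note.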