Author manuscript; available in PMC: 2021 Jul 1.
Published in final edited form as: Matern Child Health J. 2020 Jul;24(7):901–910. doi: 10.1007/s10995-020-02942-2

Table 2.

Characteristics of supervised learning models compared to dichotomized risk categories, by data source

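| Supervised Learning Models | Dichotomized Risk Category (a) | | | | | Model Characteristics (b) | |
| --- | --- | --- | --- | --- | --- | --- | --- |
| | High | Not High | Subtotal | Undetermined (c) | Total (c) | | |
| Briggs & Freeman (N = 1,141 medications) | | | | | | | |
| SVM (d) | | | | | | | |
| High | 155 | 30 | 185 | 15 | 200 | SEN 51% | NPV 84% |
| Not high | 147 | 746 | 893 | 48 | 941 | SPEC 96% | Accur 84% |
| Total | 302 | 776 | 1078 | 63 | 1141 | PPV 84% | |
| Sentiment | | | | | | | |
| High | 173 | 24 | 197 | 1 | 198 | SEN 57% | NPV 85% |
| Not high | 129 | 752 | 881 | 62 | 943 | SPEC 97% | Accur 86% |
| Total | 302 | 776 | 1078 | 63 | 1141 | PPV 88% | |
| TERIS (N = 1,703 medications) | | | | | | | |
| SVM (d) | | | | | | | |
| High | 47 | 2 | 49 | 29 | 78 | SEN 57% | NPV 90% |
| Not high | 35 | 308 | 343 | 1282 | 1625 | SPEC 99% | Accur 91% |
| Total | 82 | 310 | 392 | 1311 | 1703 | PPV 96% | |
| Sentiment | | | | | | | |
| High | 51 | 7 | 58 | 6 | 64 | SEN 62% | NPV 91% |
| Not high | 31 | 303 | 334 | 1305 | 1639 | SPEC 98% | Accur 90% |
| Total | 82 | 310 | 392 | 1311 | 1703 | PPV 88% | |
| Drug Labels (N = 2,106 medications) | | | | | | | |
| SVM (d) | | | | | | | |
| High | 310 | 13 | 323 | 327 | 650 | SEN 99% | NPV 99% |
| Not high | 4 | 367 | 371 | 1085 | 1456 | SPEC 97% | Accur 98% |
| Total | 314 | 380 | 694 | 1412 | 2106 | PPV 96% | |
| Sentiment | | | | | | | |
| High | 188 | 8 | 196 | 30 | 226 | SEN 60% | NPV 75% |
| Not high | 126 | 372 | 498 | 1382 | 1880 | SPEC 98% | Accur 81% |
| Total | 314 | 380 | 694 | 1412 | 2106 | PPV 96% | |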

Accur, Accuracy; NPV, Negative Predictive Value; PPV, Positive Predictive Value; SEN, Sensitivity; SPEC, Specificity; SVM, Support Vector Machine model

(a) See Table 1 for details

(b) Calculated using the dichotomized risk categories of ‘high’ and ‘not high’ as the gold standard. Medications that the sources categorized as undetermined risk (see Table 1) were excluded from the model characteristic calculations

(c) Provided for informational purposes; these data were not used to calculate the model characteristics

(d) Categorized as ‘High’ only if all 500 of the 500 SVM models categorized the medication as high; otherwise, the medication was categorized as ‘Not High’. See Methods for more details
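The model characteristics described in footnote b can be recomputed from the four cells of each 2×2 block, and the unanimity rule in footnote d reduces to a simple check. The following is a minimal Python sketch under stated assumptions, not the authors' code: the names `ConfusionCounts`, `pct`, `characteristics`, and `unanimous_label` are hypothetical, and rounding to whole percentages is assumed in order to match how the table displays values. Undetermined medications are left out, as footnote b specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 counts for one model; undetermined medications are left out."""
    tp: int  # model 'High', source 'High'
    fp: int  # model 'High', source 'Not High'
    fn: int  # model 'Not high', source 'High'
    tn: int  # model 'Not high', source 'Not High'


def pct(numerator: int, denominator: int) -> int:
    """Whole-number percentage (rounding is an assumption about the table)."""
    return round(100 * numerator / denominator)


def characteristics(c: ConfusionCounts) -> dict:
    """SEN, SPEC, PPV, NPV and accuracy against the dichotomized gold standard."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "SEN": pct(c.tp, c.tp + c.fn),    # true High among source High
        "SPEC": pct(c.tn, c.tn + c.fp),   # true Not High among source Not High
        "PPV": pct(c.tp, c.tp + c.fp),    # source High among model High
        "NPV": pct(c.tn, c.tn + c.fn),    # source Not High among model Not high
        "Accur": pct(c.tp + c.tn, total),
    }


def unanimous_label(votes) -> str:
    """Footnote d rule: 'High' only if every SVM model voted high."""
    votes = list(votes)
    return "High" if votes and all(votes) else "Not high"


# Briggs & Freeman, SVM block of Table 2
bf_svm = ConfusionCounts(tp=155, fp=30, fn=147, tn=746)
print(characteristics(bf_svm))
# {'SEN': 51, 'SPEC': 96, 'PPV': 84, 'NPV': 84, 'Accur': 84}
```

Passing the cells of any other block, for example `ConfusionCounts(tp=310, fp=13, fn=4, tn=367)` for the Drug Labels SVM rows, reproduces the percentages shown beside that block.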