Author manuscript; available in PMC: 2017 Apr 1.
Published in final edited form as: Inj Prev. 2016 Jan 4;22(Suppl 1):i34–i42. doi: 10.1136/injuryprev-2015-041813

Table 2. The Accuracy of the Human-Machine Classification System: Implementation of a Strategic Filter[a] Based on Agreement Between Two Naïve Bayes Algorithms.

Adapted from Accident Analysis and Prevention. Marucci-Wellman, Lehto, Corns. A practical tool for public health surveillance: Semi-automated coding of short injury narratives from large administrative databases using Naïve Bayes algorithms. 2015

| BLS OIICS 2-Digit Event Code | Gold Standard[c] (n) | Human-Machine System Coding of all Narratives[d]: npred[e] | %pred[f,g] | Sen[h] | 95% CI | PPV[i] | 95% CI | % Agreement Between 2 Manual Coders[j] | Fleiss Kappa[k], manual coders |
|---|---|---|---|---|---|---|---|---|---|
| 1* Violence and other injuries by persons or animals | | | | | | | | | |
| 11 Intentional injury by person | 159 | 132 | 0.9 | 0.81 | 0.75, 0.87 | 0.98 | 0.95, 1.00 | 81%–97% | 0.85 |
| 2* Transportation incidents | | | | | | | | | |
| 24 Pedestrian vehicular incidents | 120 | 117 | 0.8 | 0.78 | 0.71, 0.86 | 0.80 | 0.73, 0.88 | 57%–78% | 0.65 |
| 26 Roadway incidents motorized land vehicle | 650 | 672 | 4.5 | 0.98 | 0.97, 0.99 | 0.95 | 0.93, 0.97 | 93%–96% | 0.94 |
| 27 Nonroadway incidents motorized land vehicle | 136 | 122 | 0.8 | 0.80 | 0.73, 0.87 | 0.89 | 0.84, 0.95 | 52%–84% | 0.62 |
| 4* Falls, slips, trips | | | | | | | | | |
| 41 Slip or trip without fall | 806 | 658 | 4.4 | 0.70 | 0.67, 0.73 | 0.86 | 0.83, 0.89 | 66%–89% | 0.71 |
| 42 Falls on same level | 2,148 | 2,386 | 15.9 | 0.92 | 0.91, 0.93 | 0.83 | 0.81, 0.84 | 85%–93% | 0.86 |
| 43 Falls to lower level | 1,065 | 1,176 | 7.8 | 0.89 | 0.87, 0.91 | 0.81 | 0.79, 0.83 | 78%–92% | 0.81 |
| 5* Exposure to harmful substances or environments | | | | | | | | | |
| 53 Exposure to temperature extremes | 141 | 130 | 0.9 | 0.86 | 0.80, 0.92 | 0.93 | 0.89, 0.97 | 82%–98% | 0.88 |
| 55 Exposure to other harmful substances | 175 | 165 | 1.1 | 0.83 | 0.77, 0.88 | 0.88 | 0.83, 0.93 | 81%–96% | 0.87 |
| 6* Contact with objects and equipment | | | | | | | | | |
| 62 Struck by object or equipment | 1,651 | 1,749 | 11.7 | 0.90 | 0.89, 0.92 | 0.85 | 0.83, 0.87 | 82%–90% | 0.82 |
| 63 Struck against object or equipment | 466 | 397 | 2.6 | 0.74 | 0.70, 0.78 | 0.87 | 0.84, 0.91 | 66%–83% | 0.68 |
| 64 Caught in or compressed by equipment | 505 | 532 | 3.5 | 0.90 | 0.87, 0.93 | 0.86 | 0.83, 0.89 | 72%–83% | 0.75 |
| 7* Overexertion and bodily reaction | | | | | | | | | |
| 70 Overexertion and bodily reaction, unspecified | 188 | 151 | 1.0 | 0.59 | 0.51, 0.66 | 0.73 | 0.66, 0.80 | 6%–48% | 0.19 |
| 71 Overexertion involving outside sources | 4,189 | 4,334 | 28.9 | 0.95 | 0.95, 0.96 | 0.92 | 0.91, 0.93 | 87%–95% | 0.87 |
| 72 Repetitive motions involving micro tasks | 484 | 537 | 3.6 | 0.90 | 0.87, 0.92 | 0.81 | 0.77, 0.84 | 71%–83% | 0.75 |
| 73 Other exertions or bodily reactions | 916 | 827 | 5.5 | 0.79 | 0.76, 0.82 | 0.88 | 0.85, 0.90 | 56%–85% | 0.64 |
| X* All other classifiables (n<100) in training dataset | | | | | | | | | |
| xx Other small (n<100 cases) classifiable categories[b] | 632 | 467 | 3.1 | 0.68 | 0.64, 0.72 | 0.92 | 0.89, 0.94 | - | - |
| Nonclassifiable | | | | | | | | | |
| 9999 Nonclassifiable | 569 | 448 | 3.0 | 0.70 | 0.66, 0.74 | 0.89 | 0.86, 0.92 | 69%–84% | 0.72 |
| Overall | 15,000 | 15,000 | 100.0 | 0.87 | 0.87, 0.88 | 0.87 | 0.87, 0.88 | 77%–90% | 0.78 |

a

A filter is a technique to decide which narratives the computer should classify vs. which should be left for a human to read and classify.

b

Two-digit categories with <100 cases.

c

Gold Standard codes were assigned to each narrative by expert manual coders.

d

Human-Machine system: the computer assigns codes to the narratives for which the two algorithms agreed on the classification (68% of the dataset), and the remainder (32% of the dataset) are manually coded.
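
The agreement filter described in this note can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `agreement_filter`, `toy_single_word`, and `toy_sequence` are hypothetical names, and the two toy keyword rules merely stand in for the trained single-word and sequence-word Naïve Bayes models, which are not reproduced here.

```python
from typing import Callable, List, Optional, Sequence


def agreement_filter(
    narratives: Sequence[str],
    predict_a: Callable[[str], str],
    predict_b: Callable[[str], str],
) -> List[Optional[str]]:
    """Keep the machine code where both classifiers agree; otherwise
    return None to mark the narrative for manual coding."""
    codes: List[Optional[str]] = []
    for text in narratives:
        code_a, code_b = predict_a(text), predict_b(text)
        codes.append(code_a if code_a == code_b else None)
    return codes


# Hypothetical stand-ins for the two Naive Bayes models (assumption:
# simple keyword rules, used only to make the sketch runnable).
def toy_single_word(text: str) -> str:
    return "42" if "fell" in text else "62"


def toy_sequence(text: str) -> str:
    return "42" if "fell on" in text else "71"


narratives = ["fell on wet floor", "fell lifting box", "struck by pallet"]
machine_codes = agreement_filter(narratives, toy_single_word, toy_sequence)

# Narratives on which the models disagree go to the human coders.
manual_queue = [t for t, c in zip(narratives, machine_codes) if c is None]
```

In this toy run only the first narrative is machine-coded; the other two land in `manual_queue`, mirroring the split between computer-coded and manually coded narratives.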

e

npred = number predicted into category.

f

%pred = percent of cases in whole dataset predicted into category.

g

The distribution of two-digit classifications will be skewed towards categories with high sensitivity, biasing the final distribution of the coded dataset.
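
One way to see this skew in the table's own numbers: the true positives in a category equal both n × Sen and npred × PPV, so npred ≈ n × Sen / PPV. The sketch below checks that relationship on two rows copied from Table 2; the variable names are illustrative, and small differences from the reported npred are expected because Sen and PPV are rounded to two decimals.

```python
# (gold-standard n, reported npred, Sen, PPV) copied from two rows of Table 2.
rows = {
    "41 Slip or trip without fall": (806, 658, 0.70, 0.86),
    "42 Falls on same level": (2148, 2386, 0.92, 0.83),
}

expected_npred = {}
for name, (n_gold, n_pred, sen, ppv) in rows.items():
    # True positives written two ways: n_gold * Sen == n_pred * PPV.
    expected_npred[name] = n_gold * sen / ppv
    direction = "over" if n_pred > n_gold else "under"
    print(
        f"{name}: reported npred={n_pred}, "
        f"n*Sen/PPV={expected_npred[name]:.0f} ({direction}-represented)"
    )
```

A category whose sensitivity exceeds its PPV (code 42) ends up with more predicted cases than gold-standard cases, and the reverse holds for code 41.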

h

Sen = Sensitivity: the percentage of narratives coded by the experts into a given category that were also assigned to that category by the algorithm (true positives divided by all gold-standard cases in the category).

i

PPV = Positive Predictive Value: the percentage of narratives correctly coded into a specific category out of all narratives placed into that category by the algorithm.
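
As a minimal sketch of how sensitivity and PPV follow from paired gold-standard and predicted codes, the function below computes both per category. The function name and the six toy labels are assumptions made for illustration; they are not data from the study.

```python
from collections import Counter


def sensitivity_and_ppv(gold, predicted):
    """Return {code: (sensitivity, ppv)}; a value is None when its
    denominator is zero (no gold cases or no predictions for that code)."""
    n_gold = Counter(gold)
    n_pred = Counter(predicted)
    true_pos = Counter(g for g, p in zip(gold, predicted) if g == p)
    return {
        code: (
            true_pos[code] / n_gold[code] if n_gold[code] else None,
            true_pos[code] / n_pred[code] if n_pred[code] else None,
        )
        for code in sorted(set(gold) | set(predicted))
    }


# Toy labels (hypothetical): expert codes vs. system codes.
gold = ["42", "42", "42", "43", "43", "62"]
predicted = ["42", "42", "43", "43", "42", "42"]

metrics = sensitivity_and_ppv(gold, predicted)
for code, (sen, ppv) in metrics.items():
    print(code, sen, ppv)
```

Here code 62 is never predicted, so its sensitivity is 0 and its PPV is undefined, which the sketch reports as `None`.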

j

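Two-coder agreement: agreement between pairs of manual coders. With four coders there are 6 pairwise comparisons in total: coder 1 compared to coders 2, 3, and 4; coder 2 compared to coders 3 and 4; and coder 3 compared to coder 4.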
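
The six pairwise comparisons can be enumerated with `itertools.combinations`. The sketch below uses hypothetical coder names and toy codes, and reports the lowest and highest pairwise agreement on the assumption that the ranges in the table (for example 81%–97%) are the minimum and maximum across the six pairs.

```python
from itertools import combinations


def pairwise_agreement(codings):
    """codings maps coder name -> list of codes, one per narrative,
    with narratives in the same order for every coder."""
    result = {}
    for (name_a, codes_a), (name_b, codes_b) in combinations(codings.items(), 2):
        matches = sum(a == b for a, b in zip(codes_a, codes_b))
        result[(name_a, name_b)] = matches / len(codes_a)
    return result


# Toy data (hypothetical): four coders, four narratives.
codings = {
    "coder1": ["42", "43", "62", "71"],
    "coder2": ["42", "43", "62", "73"],
    "coder3": ["42", "42", "62", "71"],
    "coder4": ["42", "43", "63", "71"],
}

agreement = pairwise_agreement(codings)
low, high = min(agreement.values()), max(agreement.values())
print(f"{len(agreement)} comparisons, agreement {low:.0%} to {high:.0%}")
```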

k

Fleiss Kappa ranges between 0 and 1; > 0.6 is considered good agreement and > 0.8 is considered very good agreement.
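
For readers unfamiliar with the statistic, the following is a minimal sketch of the standard Fleiss' kappa formula for a fixed number of raters per item. The function name and toy ratings are assumptions for illustration only, and the sketch does not handle the degenerate case where every rating falls in a single category (chance agreement of 1).

```python
from collections import Counter


def fleiss_kappa(ratings):
    """ratings: list of items, each a list with one code per rater.
    Every item must have the same number of raters (at least 2)."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    category_totals = Counter()
    observed = 0.0
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Proportion of agreeing rater pairs for this item.
        observed += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )
    p_bar = observed / n_items  # mean observed agreement
    # Agreement expected by chance from the overall category proportions.
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in category_totals.values())
    return (p_bar - p_e) / (1 - p_e)


# Toy ratings (hypothetical): four narratives, four coders each.
ratings = [
    ["42", "42", "42", "42"],
    ["43", "43", "43", "43"],
    ["42", "42", "43", "43"],
    ["62", "62", "62", "62"],
]
kappa = fleiss_kappa(ratings)
perfect = fleiss_kappa([["42"] * 4, ["43"] * 4])
```

With these toy ratings `kappa` is roughly 0.75, which would fall in the "good agreement" band defined above, while complete agreement yields exactly 1.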

Naive_sw = Naïve Bayes Single Word Algorithm. Naive_seq = Naïve Bayes Sequence Word Algorithm.