2020 Oct 21;5(6):e853. doi: 10.1097/PR9.0000000000000853

Table 2.

Reliability, recall, precision, and accuracy of OpenFace automated coding, based on manual FACS coding of 50 pain-categorized and 50 non–pain-categorized images.

| Action unit | Reliability (κ) | Recall | Precision | Accuracy | Presence in pain expressions (manual FACS) | Presence in nonpain expressions (manual FACS) | Presence in pain expressions (OpenFace) | Presence in nonpain expressions (OpenFace) |
|---|---|---|---|---|---|---|---|---|
| AU4* | 0.451 | 0.812 | 0.958 | 0.810 | 0.98a | 0.72b | 0.94a | 0.50b |
| AU6* | 0.270 | 1.000 | 0.500 | 0.590 | 0.60a | 0.20b | 0.94a | 0.68b |
| AU7* | 0.357 | 0.934 | 0.845 | 0.800 | 0.86a | 0.68b | 0.94a | 0.76b |
| AU9* | 0.459 | 0.891 | 0.710 | 0.740 | 0.82a | 0.28b | 0.82a | 0.56b |
| AU10 | 0.064 | 0.828 | 0.320 | 0.430 | 0.34a | 0.24a | 0.86a | 0.66b |
| AU12 | 0.512 | 0.811 | 0.652 | 0.760 | 0.40a | 0.36a | 0.52a | 0.40a |
| AU20 | −0.005 | 0.400 | 0.098 | 0.570 | 0.14a | 0.06a | 0.42a | 0.40a |
| AU25 | 0.899 | 0.945 | 1.000 | 0.950 | 0.56a | 0.56a | 0.56a | 0.50a |
| AU26 | 0.485 | 0.763 | 0.552 | 0.800 | 0.12b | 0.32a | 0.18b | 0.42a |
| AU45* | 0.358 | 0.870 | 0.671 | 0.690 | 0.92a | 0.16b | 0.86a | 0.56b |
| Average | 0.385 | 0.825 | 0.631 | 0.714 | 0.574a | 0.358b | 0.704a | 0.544b |
| Average in selected AUs | 0.379 | 0.901 | 0.737 | 0.726 | 0.836a | 0.408b | 0.900a | 0.612b |

Asterisks indicate AUs determined to be reliable and pain relevant based on these data. Reliability is measured as Cohen's kappa. Recall (ie, sensitivity) was calculated as the number of true positives divided by the sum of true positives and false negatives. Precision (ie, positive predictive value) was calculated as the number of true positives divided by the sum of true positives and false positives. The last four columns present the proportion of expressions in which a given AU was present in pain-categorized and non–pain-categorized expressions, split by manual and automated coding. Values within a coding set with different subscripts are significantly different from each other (P < 0.05; a > b).
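The recall and precision definitions above, together with Cohen's kappa, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the toy labels are invented, and accuracy is assumed to follow the conventional definition (correct codes divided by all codes), which the footnote does not spell out. Manual FACS coding is treated as the reference standard.

```python
from typing import Sequence, Tuple


def confusion_counts(manual: Sequence[int], automated: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for one AU, with manual coding as the reference."""
    tp = sum(1 for m, a in zip(manual, automated) if m and a)
    fp = sum(1 for m, a in zip(manual, automated) if not m and a)
    fn = sum(1 for m, a in zip(manual, automated) if m and not a)
    tn = sum(1 for m, a in zip(manual, automated) if not m and not a)
    return tp, fp, fn, tn


def recall(tp: int, fn: int) -> float:
    # True positives / (true positives + false negatives), as in the footnote.
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    # True positives / (true positives + false positives), as in the footnote.
    return tp / (tp + fp)


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    # Assumed conventional definition: proportion of images coded identically.
    return (tp + tn) / (tp + fp + fn + tn)


def cohen_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    # Observed agreement corrected for the agreement expected by chance.
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    p_both_present = ((tp + fn) / n) * ((tp + fp) / n)
    p_both_absent = ((tn + fp) / n) * ((tn + fn) / n)
    p_expected = p_both_present + p_both_absent
    return (p_observed - p_expected) / (1 - p_expected)


# Invented toy data: 1 = AU coded as present, 0 = absent, for 10 images.
manual_codes = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
automated_codes = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(manual_codes, automated_codes)
print(recall(tp, fn), precision(tp, fp), accuracy(tp, fp, fn, tn), cohen_kappa(tp, fp, fn, tn))
```

The sketch does not guard against empty denominators (for example, an AU that the manual coder never marks as present), so real data would need that check.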

AU, action unit; FACS, Facial Action Coding System.
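As a check on the table's arithmetic, the "Average in selected AUs" row can be reproduced from the five starred rows. The sketch below copies the first four numeric columns from Table 2; the variable names are illustrative only.

```python
from statistics import mean

# Reliability, recall, precision, accuracy for the starred AUs in Table 2.
selected = {
    "AU4": (0.451, 0.812, 0.958, 0.810),
    "AU6": (0.270, 1.000, 0.500, 0.590),
    "AU7": (0.357, 0.934, 0.845, 0.800),
    "AU9": (0.459, 0.891, 0.710, 0.740),
    "AU45": (0.358, 0.870, 0.671, 0.690),
}
columns = ("reliability", "recall", "precision", "accuracy")

# Column-wise mean over the selected AUs, rounded to three decimals as in the table.
selected_means = {
    name: round(mean(row[i] for row in selected.values()), 3)
    for i, name in enumerate(columns)
}
print(selected_means)
```

The rounded means come out to 0.379, 0.901, 0.737, and 0.726, matching the reported row.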