2020 Oct 13;7:347. doi: 10.1038/s41597-020-00680-2

Table 3.

Movie word and face annotation information.

| Movie | Words: Matched/Similar (%) | Words: Estimated, Continuous (%) | Words: Estimated, Partial (%) | Words: Estimated, Full (%) | Words: Truncated | Words: N | Faces: >95% (%) | Faces: Time (%) |
|---|---|---|---|---|---|---|---|---|
| 500 Days of Summer | 65.46 | 4.57 | 21.84 | 8.12 | 4.13 | 8,286.00 | 93.15 | 80.83 |
| Citizenfour | 80.68 | 3.82 | 14.62 | 0.88 | 1.32 | 13,936.00 | 93.04 | 70.79 |
| 12 Years a Slave | 67.41 | 6.06 | 19.66 | 6.86 | 3.64 | 7,984.00 | 88.48 | 77.54 |
| Back to the Future | 72.52 | 4.40 | 17.32 | 5.77 | 2.35 | 8,634.00 | 89.85 | 71.21 |
| Little Miss Sunshine | 72.48 | 3.18 | 22.47 | 1.87 | 3.12 | 8,555.00 | 87.96 | 79.17 |
| The Prestige | 77.22 | 4.69 | 15.19 | 2.89 | 2.39 | 10,954.00 | 88.84 | 77.09 |
| Pulp Fiction | 73.14 | 4.13 | 18.35 | 4.38 | 2.77 | 16,155.00 | 88.88 | 79.63 |
| The Shawshank Redemption | 81.62 | 4.92 | 10.86 | 2.60 | 2.12 | 11,779.00 | 85.30 | 78.55 |
| Split | 82.21 | 4.34 | 8.58 | 4.88 | 2.09 | 7,032.00 | 96.27 | 70.13 |
| The Usual Suspects | 84.80 | 3.36 | 10.61 | 1.23 | 1.27 | 9,913.00 | 94.94 | 74.12 |
| Mean | 75.75 | 4.35 | 15.95 | 3.95 | 2.52 | 10,322.80 | 90.67 | 75.91 |
| SD | 6.57 | 0.82 | 4.83 | 2.46 | 0.92 | 2,909.40 | 3.49 | 4.01 |

Word onsets and offsets were obtained from machine learning-based speech-to-text transcriptions, which were aligned to the subtitles using dynamic time warping. If the words in a subtitle page ‘Matched’ or were ‘Similar’ to words in the transcript, they received the transcript timing; otherwise their timing was estimated. ‘Continuous’ estimations are single subtitle words that inherit their start and end times from the end of the prior and the start of the next transcribed word. ‘Partial’ estimations are similar but involve two or more missing words between transcribed words. ‘Full’ estimations occurred when no words were transcribed, so word timings were estimated from the start and end time of the subtitle page. Words with unreasonable lengths were ‘Truncated’. This procedure resulted in an average (‘N’) of >10,000 words per movie. Face onsets and offsets were also obtained with a machine learning-based approach. The final two columns give the average percentage of face labels with >95% confidence and the percentage of time that faces were on screen.
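
The estimation categories in the table note can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Word` class and `estimate_page` function are hypothetical names, the even split of a gap across several missing words is an assumption (the note does not say how time was divided), and using the subtitle page edges when a gap touches the start or end of a page is likewise assumed. Truncation of unreasonable word lengths is not covered, since no threshold is given.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    text: str
    onset: Optional[float] = None   # seconds; None if no transcript match
    offset: Optional[float] = None
    label: str = "matched"


def estimate_page(words: List[Word], page_start: float, page_end: float) -> List[Word]:
    """Fill in missing word timings for one subtitle page (illustrative only)."""
    timed = [w.onset is not None for w in words]
    any_timed = any(timed)
    i = 0
    while i < len(words):
        if timed[i]:
            i += 1
            continue
        # Find the run of consecutive untimed words, indices i..j-1.
        j = i
        while j < len(words) and not timed[j]:
            j += 1
        # Gap bounds: neighboring transcribed words, else the page edges (assumption).
        left = words[i - 1].offset if i > 0 else page_start
        right = words[j].onset if j < len(words) else page_end
        if not any_timed:
            label = "full"        # nothing on this page was transcribed
        elif j - i == 1:
            label = "continuous"  # a single missing word
        else:
            label = "partial"     # two or more missing words
        step = (right - left) / (j - i)  # assumption: split the gap evenly
        for k in range(i, j):
            words[k].onset = left + (k - i) * step
            words[k].offset = left + (k - i + 1) * step
            words[k].label = label
        i = j
    return words


if __name__ == "__main__":
    page = [Word("I", 1.0, 1.2), Word("really"), Word("like", 1.6, 1.9),
            Word("this"), Word("movie")]
    for w in estimate_page(page, page_start=1.0, page_end=3.0):
        print(w.text, w.label, w.onset, w.offset)
```

In this made-up example, "really" sits alone between two transcribed words and is labeled continuous, while "this" and "movie" form a two-word gap and are labeled partial. A page with no transcribed words at all would have every word labeled full.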