Fig. 1.
Timing in natural AV speech. Articulatory movements of the face naturally precede the onset of the audio speech signal by a few tens of milliseconds. The first detectable motion frame marks the aspiration that precedes the consonantal burst in natural speech. Values shown are for the stimuli used in this study. The consonantal burst in the audio portion is the “audio onset” and corresponds to the onset, or “index zero,” in all figures and text unless otherwise indicated. VOT, voice onset time.