Each panel illustrates the mean spectrotemporal difference between stimulus pairs across
talkers in the consonant conditions, aligned to the onset of voicing in each stimulus.
Darker shading indicates a greater absolute difference between stimuli. In the low-ambiguity
condition, /saɪ/ and /baɪ/ differ in terms of manner, place, and voicing,
and corresponding spectrotemporal differences can be seen in the high frequency energy
associated with the frication of /s/ and differences in the formant frequencies at the
onset of voicing. In the medium-ambiguity condition, /tʰaɪ/ and
/baɪ/ differ in terms of place and voicing, and the corresponding acoustic-phonetic
differences appear in aspiration and onset formant frequencies. In the
high-ambiguity condition, /pʰaɪ/ and /baɪ/ differ only in terms
of voicing, as evident in the energy differences related to aspiration during voice onset
time.
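
As a rough illustration of the quantity plotted in each panel, the sketch below computes a talker-averaged absolute difference map with NumPy. It assumes the spectrograms have already been computed and aligned so that the same frame index marks voicing onset in every stimulus. The function name `mean_abs_difference`, the array layout, and the choice to take the absolute value before averaging are assumptions made for illustration; the caption does not specify the exact procedure, and the toy values are not real data.

```python
import numpy as np


def mean_abs_difference(spec_a, spec_b):
    """Average the absolute spectrogram difference across talkers.

    spec_a, spec_b: arrays of shape (n_talkers, n_freqs, n_frames), assumed
    to be aligned to voicing onset, with talkers in the same order.
    Returns an array of shape (n_freqs, n_frames).
    """
    spec_a = np.asarray(spec_a, dtype=float)
    spec_b = np.asarray(spec_b, dtype=float)
    if spec_a.shape != spec_b.shape:
        raise ValueError("spectrogram stacks must have the same shape")
    # Absolute difference per talker, then the mean over the talker axis.
    return np.abs(spec_a - spec_b).mean(axis=0)


# Toy stacks: 2 talkers, 2 frequency bands (low, high), 3 time frames.
# Only the high band differs, and only before the final frame.
stim_a = [[[0, 0, 1], [4, 2, 1]],
          [[0, 0, 1], [2, 2, 1]]]
stim_b = [[[0, 0, 1], [0, 0, 1]],
          [[0, 0, 1], [0, 2, 1]]]

diff_map = mean_abs_difference(stim_a, stim_b)
print(diff_map)
```

Larger entries in `diff_map` would correspond to darker shading in a panel.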