(A) In the crossmodal condition (face primes), activation in the right pSTS and in the right angular gyrus was significantly higher in response to person‐incongruent S2 stimuli than to person‐congruent S2 stimuli. (B) In the unimodal condition (voice primes), activation in the bilateral IFG was significantly higher in response to person‐incongruent S2 stimuli than to person‐congruent S2 stimuli. The mean percent signal change of the peak voxel is plotted for each condition. Error bars indicate the standard error of the mean. Crossmodal = face prime, unimodal = voice prime, person‐congruent = S1 and S2 same speaker, person‐incongruent = S1 and S2 different speakers, L = left, R = right.