2024 Nov 11;20(11):e1012537. doi: 10.1371/journal.pcbi.1012537

Fig 5. To explore how Whisper’s accurate EEG predictions compared with those of self-supervised speech models trained without direct access to language (on identifying masked speech sounds), we also repeated the analyses with Wav2Vec2 and HuBERT.


To enable cross-referencing with comparative fMRI studies [30,31], we performed this comparison on models comprising 12 layers. Wav2Vec2 and HuBERT both yielded highly accurate predictions but, unlike Whisper, their inner layers (L7 and L9, respectively) rather than the last layer were most accurate. Further tests reported in the main text suggest that Whisper predicted some components of the EEG signal that differed from those predicted by Wav2Vec2.
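
As a rough illustration of what a layer-wise comparison of this kind involves, the sketch below scores each of 12 layers by how well a linear mapping from that layer's features predicts a held-out signal, then reports the best layer. It is not the authors' pipeline: the data are synthetic stand-ins, and the closed-form ridge regression, the single train/test split, the Pearson correlation score and all names (`ridge_fit_predict`, `layer_accuracies`, `layer_features`) are assumptions made for the example. The synthetic layers are deliberately built so that layer 7 carries the cleanest signal.

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Closed-form ridge regression; returns predictions for X_test."""
    n_features = X_train.shape[1]
    # Solve (X'X + alpha * I) w = X'y for the weight vector w.
    gram = X_train.T @ X_train + alpha * np.eye(n_features)
    w = np.linalg.solve(gram, X_train.T @ y_train)
    return X_test @ w


def layer_accuracies(layer_features, eeg, n_train, alpha=1.0):
    """Pearson r between predicted and held-out EEG, one score per layer."""
    scores = []
    for X in layer_features:
        pred = ridge_fit_predict(X[:n_train], eeg[:n_train], X[n_train:], alpha)
        scores.append(np.corrcoef(pred, eeg[n_train:])[0, 1])
    return np.array(scores)


# Synthetic stand-in data (assumption: real analyses would use model
# activations and recorded EEG instead).
rng = np.random.default_rng(0)
n_samples, n_dims, n_layers = 600, 8, 12
latent = rng.standard_normal((n_samples, n_dims))
eeg = latent @ rng.standard_normal(n_dims) + 0.5 * rng.standard_normal(n_samples)

# Each "layer" is a noisy copy of the latent signal; noise is lowest at
# layer 7 (index 6) and grows with distance from it.
layer_features = []
for layer in range(n_layers):
    noise_scale = 0.2 + 0.5 * abs(layer - 6)
    noise = noise_scale * rng.standard_normal((n_samples, n_dims))
    layer_features.append(latent + noise)

scores = layer_accuracies(layer_features, eeg, n_train=400)
best_layer = int(np.argmax(scores)) + 1  # report layers as 1-indexed
print("Per-layer r:", np.round(scores, 3))
print("Most accurate layer: L%d" % best_layer)
```

With real data, the same loop would run once per model, so that the most accurate layer of each model could be compared.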