2024 Oct 30;15:9383. doi: 10.1038/s41467-024-53147-y

Fig. 1. Overview of our approach.

A The brain region of focus is occipitotemporal cortex (OTC), shown here for an example subject. Color indicates the voxel-wise noise-ceiling signal-to-noise ratio (NCSNR). B A large set of models was gathered, schematized here by repository and colored by the main experiments to which they contribute. C Brain-linking methods. The left plot depicts the target representational geometry of OTC for 1000 COCO images, plotted along the first three principal components of the voxel space. Each dot reflects the encoding of one natural image; a subset of these images is depicted below, each outlined in the corresponding color. The middle plot shows a DNN representational geometry (here, the final embedding of a CLIP-ResNet50), plotted along its first three principal components. Classical RSA involves directly estimating the emergent similarity between the representational geometries of the brain target and the model layer. The right plot shows the same DNN layer representation after the voxel-wise encoding procedure (veRSA), which first re-weights the DNN features to maximize voxel-wise encoding accuracy and then estimates the similarity between the target voxel representations and the model-predicted voxel representations. (Note: Images in C are copyright-free images gathered from Pixabay.com using query terms from the COCO captions for 100 of the original NSD1000 images. We are grateful to the original creators for the use of these images.)
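The contrast between the two brain-linking methods in panel C can be illustrated with a minimal sketch. It is not the authors' pipeline; it rests on several assumptions made only for illustration: the data are synthetic, dissimilarity is 1 minus Pearson correlation, two geometries are compared by the Pearson correlation of their dissimilarity matrices, and the re-weighting step is a closed-form ridge regression with a fixed penalty `alpha`. All function and variable names (`rdm`, `rdm_similarity`, `classical_rsa`, `ve_rsa`) are hypothetical.

```python
import numpy as np


def rdm(responses):
    """Dissimilarity matrix over images: 1 - Pearson r between rows (assumed metric)."""
    return 1.0 - np.corrcoef(responses)


def rdm_similarity(a, b):
    """Pearson correlation between the upper triangles of two RDMs (assumed metric)."""
    iu = np.triu_indices_from(a, k=1)
    return np.corrcoef(a[iu], b[iu])[0, 1]


def classical_rsa(features, voxels):
    """Compare the model geometry to the brain geometry directly, with no fitting."""
    return rdm_similarity(rdm(features), rdm(voxels))


def ve_rsa(features, voxels, train, test, alpha=1.0):
    """Re-weight features to predict each voxel, then compare geometries on held-out images."""
    # Center using training statistics only, so the test images stay untouched.
    mu_x = features[train].mean(axis=0)
    mu_y = voxels[train].mean(axis=0)
    x = features[train] - mu_x
    y = voxels[train] - mu_y
    # Closed-form ridge regression (an assumption; any regularized encoder could stand in).
    weights = np.linalg.solve(x.T @ x + alpha * np.eye(x.shape[1]), x.T @ y)
    predicted = (features[test] - mu_x) @ weights + mu_y
    return rdm_similarity(rdm(predicted), rdm(voxels[test]))


# Synthetic stand-in data: voxels are a noisy linear mixture of the model features.
rng = np.random.default_rng(0)
n_images, n_features, n_voxels = 200, 50, 80
features = rng.standard_normal((n_images, n_features))
mixing = rng.standard_normal((n_features, n_voxels))
voxels = features @ mixing + 0.5 * rng.standard_normal((n_images, n_voxels))

train, test = np.arange(0, 100), np.arange(100, 200)
crsa_score = classical_rsa(features[test], voxels[test])
versa_score = ve_rsa(features, voxels, train, test)
print(f"classical RSA: {crsa_score:.3f}, veRSA: {versa_score:.3f}")
```

Both scores are computed on the same held-out images, so the only difference between them is the fitted re-weighting of the features. In a real analysis the penalty would typically be chosen by cross-validation rather than fixed.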