Figure 4. Computational approaches to understanding scene perception in the brain.
A) Multivoxel fMRI patterns in PPA were obtained for 30 scene categories, and the resulting representational dissimilarity matrix (RDM) was compared to RDMs for three candidate models of scene processing. Dissimilarity in the Objects model was based on the objects present within each scene; dissimilarity in the DNN features model was based on activations in a deep neural network trained on object classification; and dissimilarity in the Functions model was based on the types of actions (e.g., walking, vacuuming) that could be carried out in each scene. Categories (e.g., bus depot, putting green, volcano, pier) were chosen to maximally differentiate between the three models. The middle panel shows that the RDMs for all three models correlate with the PPA RDM, with the strongest correlation for the DNN features model. The right panel shows the results of variance partitioning: much of the PPA variance explained by the Objects and Functions models is shared with the DNN features model, which explains the most unique variance. Together, the three models accounted for 14.8% of the total response variance.
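The RDM comparison and variance partitioning described in panel A can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the analysis pipeline used in the study: the RDMs are synthetic random data, Pearson correlation stands in for whichever similarity measure was actually used, and the helper names (`upper_triangle`, `r_squared`, `unique_variance`) are hypothetical. Unique variance is taken here as the drop in R² when one model is left out of a joint least-squares fit.

```python
import numpy as np

def upper_triangle(rdm):
    """Return the off-diagonal upper-triangle entries of a square RDM as a vector."""
    rdm = np.asarray(rdm, dtype=float)
    i, j = np.triu_indices(rdm.shape[0], k=1)
    return rdm[i, j]

def r_squared(y, predictors):
    """R^2 of an ordinary least-squares fit of y on the predictors plus an intercept."""
    X = np.column_stack([np.ones_like(y)] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residual = y - X @ beta
    return 1.0 - np.sum(residual ** 2) / np.sum((y - y.mean()) ** 2)

def unique_variance(y, models):
    """Return (full R^2, {model name: R^2 lost when that model is removed})."""
    full = r_squared(y, list(models.values()))
    unique = {}
    for name in models:
        others = [vec for key, vec in models.items() if key != name]
        unique[name] = full - r_squared(y, others)
    return full, unique

def random_rdm(rng, n):
    """Synthetic symmetric RDM with a zero diagonal (placeholder data only)."""
    m = rng.random((n, n))
    m = (m + m.T) / 2.0
    np.fill_diagonal(m, 0.0)
    return m

# Synthetic stand-ins for the three model RDMs over 30 scene categories.
rng = np.random.default_rng(0)
model_rdms = {name: random_rdm(rng, 30) for name in ("objects", "dnn", "functions")}
model_vecs = {name: upper_triangle(rdm) for name, rdm in model_rdms.items()}

# Synthetic "PPA" dissimilarities: a noisy mixture of the models (assumption).
ppa_vec = (0.5 * model_vecs["dnn"] + 0.2 * model_vecs["objects"]
           + rng.normal(scale=0.3, size=model_vecs["dnn"].shape))

for name, vec in model_vecs.items():
    print(name, "correlation with PPA:", np.corrcoef(vec, ppa_vec)[0, 1])

full_r2, unique = unique_variance(ppa_vec, model_vecs)
print("total variance explained:", full_r2)
print("unique variance per model:", unique)
```

Variance that is explained by the full fit but not uniquely attributable to any single model corresponds to the shared portions shown in a variance-partitioning diagram.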
B) In an in silico experiment, the response profiles of individual DNN units were assessed by comparing the response to an unaltered image with the response to the same image overlaid with a small occluder. Varying the location of the occluder yielded a discrepancy map showing the portion of the image to which the unit responds. On the right are discrepancy maps of three scenes for two DNN units that were previously shown to convey information about navigational affordances. The top unit appears to respond to features related to doorways; the bottom unit appears to respond to open spaces along the ground plane.
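The occlusion procedure in panel B can be illustrated with a short sketch. The assumptions here are loud ones: `toy_unit` is a hypothetical stand-in for a real DNN unit, the occluder is a constant-valued square, and discrepancy is taken as the absolute change in response; the patch size, stride, and fill value are illustrative choices, not those of the original experiment.

```python
import numpy as np

def discrepancy_map(image, unit_response, patch=4, stride=4, fill=0.0):
    """Slide a square occluder over the image and record, for each position,
    how much the unit's response changes relative to the unaltered image."""
    image = np.asarray(image, dtype=float)
    baseline = unit_response(image)
    height, width = image.shape[:2]
    rows = range(0, height - patch + 1, stride)
    cols = range(0, width - patch + 1, stride)
    dmap = np.zeros((len(rows), len(cols)))
    for a, r in enumerate(rows):
        for b, c in enumerate(cols):
            occluded = image.copy()
            occluded[r:r + patch, c:c + patch] = fill  # overlay the occluder
            dmap[a, b] = abs(baseline - unit_response(occluded))
    return dmap

def toy_unit(img):
    """Hypothetical unit: sums intensity in the lower-left quadrant,
    loosely mimicking sensitivity to part of the ground plane."""
    height, width = img.shape[:2]
    return float(img[height // 2:, :width // 2].sum())

# Uniform synthetic 16x16 "scene"; a real experiment would use photographs.
scene = np.ones((16, 16))
dmap = discrepancy_map(scene, toy_unit)
print(dmap)
```

Large values in the map mark occluder positions that change the unit's response, so the map is nonzero only where the occluder overlaps the image region the unit depends on.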