PLOS One. 2021 Sep 1;16(9):e0256849. doi: 10.1371/journal.pone.0256849

Holistic processing only? The role of the right fusiform face area in radiological expertise

Ellen M Kok 1,2,*, Bettina Sorger 3, Koos van Geel 1, Andreas Gegenfurtner 1,4, Jeroen J G van Merriënboer 1, Simon G F Robben 1,5, Anique B H de Bruin 1
Editor: Niels Bergsland
PMCID: PMC8409609  PMID: 34469467

Abstract

Radiologists can visually detect abnormalities on radiographs within 2s, a process that resembles holistic visual processing of faces. Interestingly, there is empirical evidence using functional magnetic resonance imaging (fMRI) for the involvement of the right fusiform face area (FFA) in visual-expertise tasks such as radiological image interpretation. The speed with which stimuli (e.g., faces, abnormalities) are recognized is an important characteristic of holistic processing. However, evidence for the involvement of the right FFA in holistic processing in radiology comes mostly from short or artificial tasks in which the quick, ‘holistic’ mode of diagnostic processing is not contrasted with the slower ‘search-to-find’ mode. In our fMRI study, we hypothesized that the right FFA responds selectively to the ‘holistic’ mode of diagnostic processing and less so to the ‘search-to-find’ mode. Eleven laypeople and 17 radiologists in training diagnosed 66 radiographs in 2s each (holistic mode) and subsequently checked their diagnosis in an extended (10-s) period (search-to-find mode). During data analysis, we first identified individual regions of interest (ROIs) for the right FFA using a localizer task. Then we employed ROI-based ANOVAs and obtained tentative support for the hypothesis that the right FFA shows more activation for radiologists in training versus laypeople, in particular in the holistic mode (i.e., during 2-s trials), and less so in the search-to-find mode (i.e., during 10-s trials). No significant correlation was found between diagnostic performance (diagnostic accuracy) and brain-activation level within the right FFA for both short-presentation and long-presentation diagnostic trials. Our results provide tentative evidence from a diagnostic-reasoning task that the FFA supports the holistic processing of visual stimuli in participants’ expertise domain.

Introduction

Radiologists have the mind-blowing ability to detect abnormalities in radiographs within 2s or even less [1]. Whereas medical students might recognize the ribs and heart but little more than that, the ability to detect abnormalities develops dramatically over residency training. The question that arises is how this ability is implemented in the brain. Several studies [2–4] have investigated the neural implementation of visual expertise in radiology and other visual-expertise domains, with a focus mostly on the right fusiform face area (FFA). The exact role of the right FFA in visual-expertise domains, however, is not yet clear. In the current study, we aim to investigate the role of the right FFA by examining its involvement in the fast ‘holistic mode’ of diagnostic processing as compared to a slower, checking, or ‘search-to-find’ mode [5], using functional magnetic resonance imaging (fMRI).

The right fusiform face area and expertise

The FFA has been found to be selectively involved in the processing of faces [6]. For example, it responds more strongly to faces than to everyday objects, and more to intact faces than to scrambled faces. Furthermore, lesions in this region have been found to cause prosopagnosia, which is the inability to recognize faces [7]. Although some researchers still argue that the FFA is uniquely dedicated to faces [8–10], there is now ample evidence to suggest that the right FFA plays a crucial role not only in face perception but more broadly in visual expertise [e.g., 3, 4, 11–15], as voiced in the general expertise hypothesis. Gauthier and colleagues [4] were among the first to show this effect. They found that car experts and bird experts show increased right FFA activation when looking at stimuli from their expertise domain but not from the other groups’ domain. They also trained participants to recognize novel objects called greebles and found increasing activation of the right FFA with increasing expertise [16]. In participants with FFA lesions resulting in prosopagnosia, high expertise in car recognition as measured with a verbal test did not result in an equally high ability to visually recognize cars, whereas those two variables were highly correlated in healthy controls. This suggests that patients with prosopagnosia also have trouble visually individuating highly similar objects (e.g., recognizing the model, manufacturer, and decade of make of cars) [17], and, likewise, that the FFA plays a role in this visual individuation of highly similar stimuli. After the classic studies of Gauthier and colleagues, expertise effects in the right FFA were established in a large number of studies with various objects of expertise [15], such as cars [18–20], birds and minerals [21], and butterflies and moths [22]. Since it has been argued that those objects have a face or face-like structure, other investigations focused on less face-like objects such as chess boards [3,14].
Again, expertise-related activation in the right FFA was found. For example, Bilalić and colleagues [3] found that the FFA is differently engaged in experts versus novices when chess positions were presented, but not when single chess pieces were presented.

Radiographs are another example of stimuli that do not resemble faces. Bilalić and colleagues found increased sensitivity of the right FFA for radiographs in experienced radiologists in comparison to medical students [23]. Similarly, Harley and colleagues found a significant correlation between diagnostic performance and right FFA-activation level in a group of expert radiologists and radiologists in training with different levels of experience [2].

Although there is evidence for the involvement of the right FFA in processing non-face visual-expertise stimuli such as radiographs, it is unclear what the function of the right FFA is in these tasks. It has been argued that the main role of the right FFA is in the holistic processing of faces and expertise-related stimuli [23], i.e., processing a face as a whole rather than as a set of separate, distinct features that do not interact to form a single percept. Evidence for holistic processing often relies on the face-inversion effect: Face perception is more difficult for inverted than upright faces because inversion disrupts holistic processing [24]. Indeed, Bilalić showed an inversion effect in radiology: the right FFA of experts in radiology could distinguish upright from inverted radiographs, while the right FFA of novices could not [23].

Holistic processing is a central aspect of theories of visual expertise in general, and visual expertise in radiology in particular [25]. Visual-expertise research in radiology typically assumes a two-phase diagnostic process, consisting of a first, relatively fast, ‘holistic mode’ followed by a slower, ‘search-to-find’ or ‘checking’ mode [5]. The holistic mode entails an initial global analysis of the entire retinal image to distinguish normal from abnormal tissue, which subsequently guides the search to perturbations using foveal vision (the checking mode) [5]. The global impression is a comparison of the contents of the radiograph to an expert’s schema of the visual appearance of normal radiographs. Central in this conceptualization of holistic processing is speed [25–27]: This global impression is developed first in the diagnostic process, and visual experts have been found to develop a global impression or gist of an image within 250–2000 ms [1] or less [5,28,29]. The slower ‘checking’ mode is more feature-based and involves shifting selective attention to potentially relevant areas of radiographs. Given that the right FFA is mostly linked to the holistic processing of faces and objects of expertise, it seems likely that the right FFA is less involved in the feature-based checking mode that generally takes place after the initial holistic mode. However, investigating whether this is indeed the case is only possible in experimental tasks that elicit the full diagnostic process and separate those phases. Complete separation of holistic and checking modes is not possible, but we argue that short presentation of radiographs mostly elicits the holistic mode (since there is no time to enter the checking mode), whereas a longer presentation time in combination with the instruction to check an earlier diagnosis is expected to elicit mostly the checking mode.

Two of the neuroscientific studies that investigated radiological expertise so far aimed to capture the participants’ processing of stimuli, but not specifically the diagnostic processes. They thus asked participants to execute a 1-back task [23] or a manipulation-detection task [30]. In studies that did require participants to detect or diagnose abnormalities, radiographs were presented for relatively short amounts of time only, such as 500 ms [2] or 1500 ms [31]. Tasks aimed at processing but not diagnosing radiographs, as well as very short tasks, are likely to elicit a holistic but not a search-to-find mode. If the right FFA plays a crucial role in the process of holistic perception, it is more likely to differentiate between expertise levels when radiographs are observed for short periods of time and less so when participants engage in the slower search-to-find mode. Thus, there is a need for research that contrasts right FFA activation during longer presentation periods with right FFA activation during shorter task durations, to better understand the specific function of the right FFA in the diagnostic process.

In the current study, we aim to investigate the specific function of the right FFA in visual-expertise tasks in radiologists in training. To do so, we contrast laypeople with radiologists in training (residents and fellows) in a diagnostic-reasoning task. We used an established functional-localizer procedure to identify individual regions of interest (ROIs) for the right FFA; see [2,23] for similar approaches. Localizer tasks are tasks that are known to activate a particular brain area, in this case the FFA. Of course, other brain areas are likely to be involved in visual expertise [32] [see, e.g., 30, 31 for other areas related to radiological expertise]. However, the use of a localizer task for the FFA allows us to focus our analysis on the function of the right FFA. The use of functional ROIs provides more power to detect specific differences and, to some extent, avoids the multiple-testing problem (Bennett et al. 2011). Apart from localizing the right FFA, we used a similar procedure to localize right V1 to rule out attention effects. Finally, as an exploratory analysis, we investigated the lateralization of the expertise effect. After the localizer task, we asked participants to diagnose abnormalities on radiographs after short-presentation (2s) and long-presentation (10s) times and performed ROI-based ANOVAs. Additionally, we investigated the correlation between right FFA activation level and both diagnostic performance and radiological experience, to replicate the findings of Harley and colleagues [2]. It was hypothesized that:

  1. Radiologists in training show a higher diagnostic performance than laypeople for short- and long-presentation trials.

  2. For radiologists in training versus laypeople, the right FFA shows more activation during trials that elicit the holistic mode than during trials that elicit the search-to-find mode.

  3. The activation level within the right FFA is positively correlated with the diagnostic performance for radiologists in training.

  4. The activation level within the right FFA is positively correlated with experience level for radiologists in training.

To anticipate, we find preliminary evidence that the right FFA is selectively involved in the holistic mode.

Materials and methods

Participants

Participants were eleven laypeople with no experience in radiology (two males, nine females), and 17 radiologists in training: residents or fellows (seven males, ten females). An a priori determined sample size of ten residents in their first year and ten residents in year 3 or higher was selected because this sample size seemed maximally feasible given the number of eligible participants at reasonable traveling distance from the MRI facilities, and was in line with earlier studies such as [2,4,16,30,31]. However, given that fewer eligible participants than expected were willing to participate, in combination with limited availability of funding, the a priori determined sample size was not met. Therefore, the two groups of residents were combined into one group (radiologists in training). The average age of the laypeople was 28.4 years (SD = 6.2 years); nine were right-handed and two were left-handed. The average age of the radiologists in training was 29.6 years (SD = 3.5 years); 16 were right-handed and one was left-handed. The experience of radiologists in training is reported in years: In the Netherlands, medical doctors specialize in radiology during a five-year residency training followed by a fellowship (one or two years) to become a subspecialist. Our sample included ten residents in their first year, one in the third year, three in their fourth year, one in the fifth year, and two first-year fellows. The laypeople were matched for age and educational level: They all had a master’s (n = 8) or PhD degree (n = 3) in a non-medical domain. For one participant in the laypeople condition, the behavioral data were corrupted (the score is 0 because no responses were recorded). Behavioral data from this participant were excluded from all analyses, but their fMRI data were included in all analyses. All participants had normal or corrected-to-normal vision. All participants gave written informed consent and received a compensation for study participation.
This research was conducted in accordance with the Declaration of Helsinki. The Ethical Committee of the Maastricht University Medical Center approved the study protocol, file number 154066. The individual pictured in Fig 1 has provided written informed consent (as outlined in the PLOS consent form) to publish their image alongside the manuscript.

Fig 1. Overview of the experimental designs of the FFA-localization and the diagnostic-reasoning runs.


In all runs, the stimuli were presented against a grey background. (A) The FFA-localization procedure used a blocked design in which 45 images of faces and 45 images of objects were shown in blocks of 30s. Between each block, there was a 20s baseline. (B) The diagnostic-reasoning trials consisted of the presentation of the radiograph (2s in short-presentation runs and 10s in long-presentation runs), a scrambled version of this image (mask) presented immediately after for 1s and then the diagnostic question with answer options for 10s. Between trials, there was a 15s baseline period.

Stimulation materials and task

FFA-localization runs

Individual ROIs were determined using an established FFA-localization procedure [7]: Data collection took place in two runs of 5min and 20s each with the same general structure (see Fig 1A). Images of faces (3 blocks of 30s) and objects (3 blocks of 30s) on a grey background were presented with a 20s resting period between each block. Blocks of faces and objects alternated, and the order of conditions was counterbalanced across the two runs. Each block consisted of 45 images that were each presented for 667ms, and participants were required to passively but attentively view the images.

Diagnostic-reasoning runs

For the diagnostic-reasoning runs, 66 radiographs were extracted from an existing teaching file and resized to 1000 × 1000 pixels with a grey background. Each of the 66 radiographs showed at least one abnormality. A total of 21 different pathologies were present (see Table 1). The diagnosis for each radiograph was extracted from the teaching file and checked by a radiologist.

Table 1. List of different abnormalities presented.
Name of disease/abnormality Number of items in the experiment
Atelectasis 4
Cardiomegaly 2
COPD 2
Cystic fibrosis 5
Decompensatio cordis 1
Deviation mediastinal structures 1
Diaphragm ruptured 1
Lung fibrosis 6
Lung metastasis 5
Lung tumor 3
Lymphangitis carcinomatosa 1
Miliary tuberculosis 5
Pleural empyema 1
Pleural effusion 6
Pneumonia 9
Pneumothorax 4
Sarcoidosis 3
Silicosis 1
Broadened mediastinum 3
Enlarged hilus 1
Pleural calcification 2

The diagnostic-reasoning part of the study was split into six runs: three runs of short-presentation (2s) trials (8min and 44s per run) and three runs of long-presentation (10s) trials (11min and 40s per run), and was implemented in an event-related design. Each of the diagnostic-reasoning runs consisted of 22 radiographs. All radiographs were presented twice, first in a short-presentation run and next in a long-presentation run.

Fig 1B provides an overview of the trials. Each run started with a 20s baseline period indicated by the presentation of a black cross on a grey background. After the presentation of the radiograph, the scrambled version of the image was presented for 1s as a mask, followed by a display of the potential diagnosis (e.g., “Pneumonia?”) with answer options (yes/no) for 10s. For half of the radiographs, the presented diagnosis was the correct diagnosis; for the other half, an incorrect diagnosis was presented, which was the correct diagnosis for a randomly selected other image. The yes/no format was used because it eliminated the need for participants to speak, which can cause motion artifacts that degrade fMRI data quality. At the same time, participants were requested to diagnose the image while it was on the screen, rather than afterwards (when the question was on the screen), so that we captured diagnostic-reasoning processes, and not just perceptual processing, at a known point in the trial. Participants were allowed to free-view the images; eye positions were not tracked.

Participants indicated their diagnostic decision by pressing buttons assigned to yes or no on an MRI-compatible button box. A fixation cross was subsequently presented for 15s (so the fMRI signal could return to baseline) before the next image was presented. The order of the radiographs in the three short-presentation runs was randomized for each participant; the long-presentation runs showed the radiographs in the same order as the short-presentation runs, each followed by the same potential diagnosis as in the short-presentation runs. Participants were instructed that radiographs would be repeated in the long-presentation runs and were asked to take another good look at the image, check their diagnosis, and, if necessary, adapt their answer.
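The ordering scheme described above can be sketched as follows (an illustrative sketch, not the actual BrainStim presentation script; the function name and its parameters are hypothetical):

```python
import random

def make_run_orders(n_radiographs=66, n_runs=3, seed=None):
    """Sketch of the ordering scheme: the 66 radiographs are randomly ordered
    per participant and split over three short-presentation runs of 22 trials
    each; the long-presentation runs reuse exactly the same order."""
    rng = random.Random(seed)
    order = list(range(n_radiographs))
    rng.shuffle(order)  # per-participant randomization
    run_len = n_radiographs // n_runs
    short_runs = [order[i * run_len:(i + 1) * run_len] for i in range(n_runs)]
    long_runs = [list(run) for run in short_runs]  # identical order on repeat
    return short_runs, long_runs

short_runs, long_runs = make_run_orders(seed=1)
```

Each radiograph thus appears exactly once in the short-presentation runs and once, in the same position, in the long-presentation runs.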

Stimulus presentation

Visual stimulation was generated by a personal computer (PC) using the BrainStim software (https://github.com/svengijsen/BrainStim) and projected onto a frosted screen located at the end of the scanner bore (at the side of the participant’s head) with a liquid crystal display (LCD) projector. Participants viewed the screen via a mirror mounted to the head coil at an angle of ~45°.

(F)MRI data acquisition

Anatomical and functional brain-imaging data were obtained using a 3-T whole-body MRI scanner (Magnetom Prisma; Siemens Medical Systems, Erlangen, Germany). Participants were placed comfortably in the MRI scanner; their heads were fixated with foam padding to minimize spontaneous or task-related motion.

Anatomical measurements

Each participant underwent a high-resolution T1-weighted anatomical scan using a three-dimensional (3D) magnetization-prepared rapid-acquisition-gradient-echo (MP-RAGE) sequence (192 slices, slice thickness = 1mm, no gap, repetition time [TR] = 2250ms, echo time [TE] = 2.21ms, flip angle [FA] = 9°, field of view [FOV] = 256 × 256mm2, matrix size = 256 × 256, total scan time = 5min and 5s).

Functional measurements

Repeated single-shot echo-planar imaging (EPI) was performed using the BOLD effect as an indirect marker of local neuronal activity [33]. The number of acquisitions differed between runs (FFA-localization runs: 160 volumes; short-presentation diagnostic-reasoning runs: 262 volumes, long-presentation diagnostic-reasoning runs: 350 volumes). Apart from that, identical scanning parameters were used for all functional measurements (TR = 2000ms, TE = 30ms, FA = 77°, FOV = 192 × 192mm2, matrix size = 96 × 96, number of slices = 32, slice thickness = 2mm, no gap, slice order = ascending/interleaved).
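As a consistency check, the reported run lengths follow directly from the volume counts and the 2s TR (a small illustrative computation using only values stated above, not part of the analysis pipeline):

```python
# With TR = 2 s, the number of acquired volumes per run should reproduce the
# run durations reported in the Methods.
TR = 2.0  # repetition time in seconds

runs = {
    "FFA localizer": 160,                  # 320 s = 5 min 20 s
    "short-presentation diagnostic": 262,  # 524 s = 8 min 44 s
    "long-presentation diagnostic": 350,   # 700 s = 11 min 40 s
}

for name, n_volumes in runs.items():
    minutes, seconds = divmod(int(n_volumes * TR), 60)
    print(f"{name}: {n_volumes} volumes x {TR:.0f}s = {minutes}min {seconds}s")
```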

General procedure

Before being placed in the MRI scanner, participants were informed about the study, signed informed consent, and provided information on their sex, date of birth, and year of residency. The session consisted of an anatomical scan, two FFA-localization runs, and six (three short-presentation and three long-presentation) diagnostic-reasoning runs and took 1.5-2h.

Data analysis

Neuroimaging data were analyzed using BrainVoyager (v20.4, BrainInnovation BV, Maastricht, the Netherlands). Behavioral data (obtained via button presses) were extracted from the BrainStim logfiles and analyzed in IBM SPSS (version 22, IBM).

Analysis of anatomical MRI data

Anatomical images were corrected for intensity inhomogeneities and spatially normalized to Montreal Neurological Institute (MNI) space.

Analysis of fMRI data

Pre-processing of functional data included (a) slice-scan time correction, (b) 3D motion correction including intra-session alignment to the first functional volume of the session, (c) temporal high-pass filtering with a threshold of two cycles for the FFA-localization runs and five cycles for diagnostic-reasoning runs, and (d) spatial normalization to MNI space. Gaussian spatial smoothing (kernel: 4mm full-width at half maximum) was applied to the FFA-localization data. After 3D motion-correction was executed, the motion correction parameters were plotted and visually inspected. No runs had to be discarded for excessive, non-correctable motion.

ROI definition

Individual ROIs for the right and left FFA were defined by calculating individual general linear models (GLMs) with 2 (runs) × 2 predictors (face images and object images). The FFA-ROIs were defined by contrasting faces vs. objects. A Bonferroni-corrected statistical threshold of p < .05 was used. Only clusters in the right and left fusiform gyrus were included in the ROI. If clusters were too small, less stringent p-values were chosen until the FFA encompassed at least 20 voxels. This was necessary for nine participants (two laypeople, seven radiologists in training).
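The threshold-relaxation rule can be sketched as follows (a minimal NumPy illustration, not the BrainVoyager implementation; the function and its inputs are hypothetical):

```python
import numpy as np

def define_ffa_roi(t_map, fusiform_mask, thresholds, min_voxels=20):
    """Sketch of the ROI-definition rule described above: start from a
    stringent statistical threshold and relax it until the suprathreshold
    cluster within the fusiform gyrus contains at least `min_voxels` voxels.

    t_map         : voxel-wise t-values for the faces-vs-objects contrast
    fusiform_mask : boolean array marking fusiform-gyrus voxels
    thresholds    : t-value cutoffs, ordered from most to least stringent
    """
    for t_crit in thresholds:
        roi = (t_map > t_crit) & fusiform_mask
        if roi.sum() >= min_voxels:
            return roi, t_crit
    raise ValueError("No threshold yielded a sufficiently large ROI")

# Synthetic example: a 25-voxel cluster survives the second threshold.
t_map = np.zeros((10, 10))
t_map[:5, :5] = 5.0
mask = np.ones((10, 10), dtype=bool)
roi, t_crit = define_ffa_roi(t_map, mask, thresholds=[6.0, 4.0, 3.0])
```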

Additionally, we defined individual ROIs for the right primary visual cortex (V1). These ROIs were defined by a conjunction analysis (faces vs. resting and objects vs. resting). The most significant voxel in V1 was determined and subsequently, less stringent p-values were chosen until the ROI encompassed approximately 100 voxels (varying from 104 to 112 voxels).

Investigating the effect of expertise

Average diagnostic performance was analyzed with a 2 × 2 mixed ANOVA, with the factor expertise level (laypeople and radiologists in training) varied between participants and the factor trial length (2s trials and 10s trials) varied within participants. Partial eta squared (partial η2) was used as an effect size, where 0.02 denotes a small effect, 0.13 a medium effect, and 0.26 a large effect.
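For reference, partial eta squared can be computed directly from an F statistic and its degrees of freedom; the sketch below (illustrative only, not the authors' analysis code) reproduces the effect sizes reported for hypothesis 1 in the Results:

```python
def partial_eta_squared(F, df_effect, df_error):
    """Partial eta squared from an F test:
    eta_p^2 = (F * df_effect) / (F * df_effect + df_error)."""
    return (F * df_effect) / (F * df_effect + df_error)

# Main effect of expertise level, F(1,25) = 142.12:
print(round(partial_eta_squared(142.12, 1, 25), 2))  # 0.85
# Main effect of trial length, F(1,25) = 6.781:
print(round(partial_eta_squared(6.781, 1, 25), 3))   # 0.213
```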

Four ROI-based random-effects ANOVAs were performed (for both the left FFA and the right FFA: one for long-presentation and one for short-presentation runs) that included three predictors (radiograph presentation, scrambled-radiograph presentation, and diagnosis presentation) that were separately contrasted against the baseline. The resulting four individual beta values for radiograph presentation (for both the left FFA and the right FFA: one for short-presentation and one for long-presentation runs) were extracted and further analyzed in IBM SPSS (version 22, IBM). First, a 2 × 2 mixed ANOVA was performed with the factor expertise level (laypeople and radiologists in training) varied between participants, the factor trial length (2s trials and 10s trials) varied within participants, and right FFA activity as the dependent variable. Next, as requested by reviewers, we also ran an exploratory 2 × 2 × 2 ANOVA with factors expertise level (laypeople and radiologists in training), trial length (2s trials and 10s trials), and location (left FFA and right FFA), with beta value as the dependent variable. These ANOVAs were followed up by post-hoc t-tests with the factor expertise (laypeople and radiologists in training), with Cohen’s d as an effect size, where 0.2 denotes a small, 0.5 a medium, and 0.8 a large effect [34].

It can be argued that activity during the short trials is more reflective of holistic processing if the answer is correct. To investigate whether the hypothesized pattern was stronger when only the brain responses to correctly interpreted radiographs were analyzed (i.e., those radiographs for which the participant’s diagnosis was correct), an additional analysis was executed for those trials only: Four additional ROI-based random-effects ANOVAs were performed (for both the left FFA and the right FFA: one for long-presentation and one for short-presentation runs) that included four predictors (correctly diagnosed radiographs, incorrectly diagnosed radiographs, scrambled-radiograph presentation, and diagnosis presentation) that were separately contrasted against the baseline. The resulting four individual beta values for correctly diagnosed radiographs (one for short-presentation and one for long-presentation runs) were extracted and further analyzed in IBM SPSS (version 22, IBM), using the same analyses as described above.

As a check that the pattern of results was not already present in the early visual cortex, and might therefore result from differences in the amount of attention paid to the stimuli (e.g., experts might pay more attention to the stimuli because the stimuli are relevant to them), we repeated the analyses for hypothesis 2 in the right V1 ROI. Also, at reviewer request, we conducted ANCOVAs with V1 activity as a covariate, right FFA activity as the dependent variable, and expertise (laypeople and radiologists in training) as the between-participants factor, for each of the repeated measures.

Finally, the percentage of correct answers was correlated with beta values in the FFA (short-presentation and long-presentation trials separately) using the Pearson correlation coefficient. Additionally, the ordinal measure of radiological experience (in years) was correlated with the beta values in the FFA using the Spearman correlation coefficient (short-presentation and long-presentation trials separately).
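These correlation analyses can be sketched with SciPy as follows (synthetic data for illustration only; none of the variable values are the study's measurements):

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(42)
n = 17  # radiologists in training

# Hypothetical FFA beta values, accuracy scores, and experience in years.
betas_short = rng.normal(1.0, 0.5, n)
accuracy = 70 + 10 * betas_short + rng.normal(0, 5, n)  # % correct (synthetic)
experience_years = rng.integers(1, 7, n)                # ordinal measure

# Pearson correlation for the continuous accuracy measure...
r, p_r = pearsonr(accuracy, betas_short)
# ...and Spearman correlation for the ordinal experience measure.
rho, p_rho = spearmanr(experience_years, betas_short)

print(f"Pearson r = {r:.2f} (p = {p_r:.3f})")
print(f"Spearman rho = {rho:.2f} (p = {p_rho:.3f})")
```

In practice this would be run once for the short-presentation and once for the long-presentation beta values, as described above.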

Results

Hypothesis 1: Radiologists in training show higher diagnostic performance than laypeople for short- and long-presentation trials

Due to technical issues, the behavioral data of one participant in the laypeople group were corrupted and thus excluded here. The average diagnostic performance (percentage of correct diagnoses) for short-presentation and long-presentation trials is shown in Fig 2. Laypeople scored just above chance level in both short trials (M = 53.8%, SD = 8.5) and long trials (M = 57.3%, SD = 7.3). Radiologists in training scored on average 80.5% (SD = 4.1) on short trials and 84.2% (SD = 7.0) on long trials. There was no interaction of trial length with expertise level, F(1,25) = 0.003, p = .96, η2p < 0.001. A main effect of expertise level was found, with all radiologists in training scoring higher than all laypeople, F(1,25) = 142.12, p < .0001, η2p = 0.85. Finally, there was a main effect of trial length, F(1,25) = 6.781, p = .02, η2p = 0.213, with both groups scoring higher on the long-presentation trials than on the short-presentation trials.

Fig 2. Average diagnostic performance (percentage of correct diagnoses) for laypeople and radiologists in training.


Error bars show standard deviations. In both short-presentation trials and long-presentation trials, radiologists in training scored significantly higher than laypeople.

Hypothesis 2: The right FFA is more activated in radiologists in training versus laypeople, during trials that elicit a holistic mode (i.e., during short-presentation trials) and less so in the search-to-find mode (i.e., during long-presentation trials)

A right FFA-ROI of at least 20 voxels could be defined in all participants. A probability map of the selected individual ROIs is displayed in Fig 3. The average size in voxels of the laypeople’s ROIs was 624.5 (SD = 1044.2). The average size for radiologists in training was 534.5, SD = 328.0.

Fig 3. Probability map, showing the locations of the individual right FFA ROIs.


Warmer colors indicate a higher proportion of ROIs in this voxel.

Individual beta values within the FFA across all short-presentation and long-presentation trials separately for the two groups are depicted in Fig 4. A 2 × 2 ANOVA was run with factors expertise level (laypeople and radiologists in training) and trial length (2s trials and 10s trials) and beta value as the dependent variable. The interaction was not significant, F(1,26) = 2.315, p = 0.14, η2p = 0.082 (small-to-medium effect size). There was a trend towards a significant effect of trial length, F(1,26) = 4.149, p = .05, η2p = 0.138 (medium effect size), with lower betas for the long-presentation trials. There was no significant effect of expertise level, F(1,26) = 2.152, p = .15, η2p = 0.076 (small-to-medium effect size).

Fig 4. Density plots (upper panels) and scatter plots (lower panels) for the individual beta values in the FFA for short- and long-presentation trials.


In the upper panels, the dotted line denotes the group mean. In the lower panels, for residents, darker colors depict more experienced residents.

We also ran an exploratory 2 × 2 × 2 ANOVA with factors expertise level (laypeople and radiologists in training), trial length (2s trials and 10s trials), and location (left FFA and right FFA) and beta value as the dependent variable. The three-way interaction between expertise level, trial length and location was significant, F(1,26) = 4.395, p = .046, η2p = 0.145 (medium effect size). Furthermore, the two-way interaction between location and expertise level showed a trend towards significance, F(1,26) = 3.478, p = .07, η2p = 0.118 (medium effect size). The two-way interactions between trial length and expertise, and between location and trial length were not significant, both F’s < 0.6. Finally, the main effect of trial length was significant, F(1,26) = 6.388, p = .02, η2p = 0.197 (medium-to-large effect size), but the main effects of location and expertise were not significant, both F’s < 0.1.

As can be seen in Fig 4, beta values were low for long trials in both the left and the right FFA for both laypeople and residents. In the short trials, beta values were higher for residents than for laypeople in the left FFA but beta values were higher for laypeople than for residents in the right FFA. Post-hoc t-tests show that none of those differences was significant, all t’s < 1.6.

Correctly diagnosed trials only

We additionally analyzed the mean beta value in the FFA for the correctly diagnosed trials only, see Fig 5. There was a trend towards significance for the interaction, F(1,26) = 3.413, p = .08, η2p = .116 (medium effect size), a main effect of trial length, F(1,26) = 5.118, p = .03, η2p = 0.164 (medium effect size), and no main effect of expertise level, F(1,26) = 2.399, p = .13, η2p = .084 (small effect size).

Fig 5. Density plots (upper panels) and scatter plots (lower panels) for the individual beta values in the FFA for short- and long-presentation trials for correctly diagnosed trials only.


In the upper panels, the dotted line denotes the group mean. In the lower panels, for residents, darker colors depict more experienced residents.

We also ran an exploratory 2 × 2 × 2 ANOVA with expertise level (laypeople vs. radiologists in training), trial length (2-s vs. 10-s trials), and location (left vs. right FFA) as factors and beta value as the dependent variable. The three-way interaction between expertise level, trial length, and location showed a trend towards significance, F(1,26) = 3.937, p = .06, η2p = 0.132 (medium effect size). There was also a trend towards significance for the interaction between location and expertise level, F(1,26) = 2.885, p = .10, η2p = .100 (small-to-medium effect size). The interactions between trial length and location, and between trial length and expertise level, were not significant, both F’s < 0.7. The main effect of trial length was significant, F(1,26) = 10.91, p = .003, η2p = 0.296 (large effect size). The main effects of location and expertise level were not, both F’s < 0.3. Fig 5 shows the same pattern of results as when all trials were included. Post-hoc t-tests showed a significant effect for the right FFA in the short-presentation trials, t(26) = 1.80, p = .04 (one-sided t-test), Cohen’s d = 0.71 (medium-to-large effect size); for all other t-tests, t’s < 1.0.
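As a cross-check on the reported post-hoc test, the one-sided p-value and an approximate Cohen's d can be derived from the t statistic and the group sizes (11 laypeople, 17 radiologists in training); a sketch using scipy, illustrating the standard formulas rather than the study's own analysis code:

```python
from math import sqrt
from scipy import stats

t, df = 1.80, 26          # reported: t(26) = 1.80
n1, n2 = 11, 17           # laypeople, radiologists in training

# One-sided p-value: upper-tail probability of the t distribution
p_one_sided = stats.t.sf(t, df)

# Cohen's d for two independent groups, derived from t
d = t * sqrt(1 / n1 + 1 / n2)

print(round(p_one_sided, 2), round(d, 2))  # 0.04 0.7
```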

Analyses of V1 activity

To rule out that overall activation differences between the short- and long-presentation trials (already observable in the early visual cortex) drive these effects, we repeated the analyses in the right V1 ROI. A 2 × 2 ANOVA was run with expertise level (laypeople vs. radiologists in training) and trial length (2-s vs. 10-s trials) as factors and V1 activity as the dependent variable (see Fig 6).

Fig 6. Density plots (upper panels) and scatter plots (lower panels) for the individual beta values in the right V1 for short- and long-presentation trials.


In the upper panels, the dotted line denotes the group mean. In the lower panels, for residents, darker colors depict more experienced residents.

The interaction of expertise with trial length was not significant, F(1,26) = 1.020, p = 0.32, η2p = 0.038 (small effect size). There was a significant effect of trial length, F(1,26) = 15.821, p < .01, η2p = 0.378 (large effect size), with lower betas for the long-presentation trials. There was no significant effect of expertise level, F(1,26) = 0.581, p = .45, η2p = 0.022 (small effect size).

We additionally analyzed the mean beta value in V1 for the correctly diagnosed trials only (see Fig 7). There was no significant interaction, F(1,26) = 1.614, p = .22, η2p = 0.058 (small effect size), a significant main effect of trial length, F(1,26) = 23.752, p < .01, η2p = 0.477 (large effect size), and no significant main effect of expertise level, F(1,26) = 1.128, p = .30, η2p = 0.041 (small effect size).

Fig 7. Density plots (upper panels) and scatter plots (lower panels) for the individual beta values in the right V1 for short- and long-presentation trials, for correctly diagnosed trials only.


In the upper panels, the dotted line denotes the group mean. In the lower panels, for residents, darker colors depict more experienced residents.

Finally, as requested by reviewers, we used ANCOVAs to analyze expertise differences in short-presentation and long-presentation trials, both for all trials and for correctly diagnosed trials only, with the corresponding V1 activity as the covariate. Table 2 shows the F- and p-values for the covariate and for the main effect of expertise.

Table 2. F- and p-values for the ANCOVAs.
                                          Covariate        Main effect of expertise
                                          F       p        F       p
Short-presentation, all trials            1.506   .23      1.376   .18
Long-presentation, all trials             0.935   .34      0.158   .70
Short-presentation, only correct trials   0.834   .37      2.406   .13
Long-presentation, only correct trials    0.401   .53      0.401   .53
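Conceptually, each ANCOVA in Table 2 tests the expertise effect after partialling out V1 activity, which can be framed as a comparison of nested linear models. The sketch below illustrates this logic with made-up data (the variable names and numbers are hypothetical, not the study's):

```python
import numpy as np

def ancova_f(y, group, covariate):
    """F-test for a group effect after adjusting for one covariate,
    via comparison of nested ordinary-least-squares models."""
    n = len(y)
    intercept = np.ones(n)
    # Reduced model: intercept + covariate only
    X_red = np.column_stack([intercept, covariate])
    # Full model: intercept + covariate + group dummy
    X_full = np.column_stack([intercept, covariate, group])

    def rss(X):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return resid @ resid

    rss_red, rss_full = rss(X_red), rss(X_full)
    df_effect, df_error = 1, n - X_full.shape[1]
    return ((rss_red - rss_full) / df_effect) / (rss_full / df_error)

# Hypothetical example: the group raises y by 1.0 on top of a covariate effect
rng = np.random.default_rng(0)
cov = rng.normal(size=28)
grp = np.repeat([0.0, 1.0], [11, 17])        # 11 laypeople, 17 residents
y = 0.5 * cov + 1.0 * grp + 0.2 * rng.normal(size=28)
print(ancova_f(y, grp, cov) > 10.0)  # True: a strong adjusted group effect
```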

Hypothesis 3: The activation level within the right FFA is (positively) correlated with the diagnostic performance for radiologists in training

Correlations between diagnostic performance and the right-FFA activation level were calculated for radiologists in training only (n = 17). There was no significant correlation between the beta values for short-presentation trials and the diagnostic performance for short-presentation trials (r = -.05, p = .86), nor between the beta values for long-presentation trials and the average diagnostic performance for long-presentation trials (r = 0.24, p = .35).

We also added exploratory correlation analyses for the left FFA. There was a trend towards a significant (negative) correlation between the beta values for short-presentation trials and the diagnostic performance for short-presentation trials in the left FFA (r = -0.441, p = .08), but no correlation between the beta values for long-presentation trials and the average diagnostic performance for long-presentation trials (r = -.263, p = .31).

Hypothesis 4: The activation level within the right FFA is (positively) correlated with experience level for radiologists in training

Spearman correlations between the extracted beta values and participants’ radiological experience were calculated to take into account the ordinal nature of the experience measure. Within the radiologists-in-training group, no correlations between radiological experience and beta values were found for the short- (Rho = 0.18, p = .49) or the long-presentation trials (Rho = 0.20, p = .43).

We also added exploratory correlation analyses for the left FFA. There was no significant correlation between radiological experience and beta values in the short-presentation (Rho = -.257, p = .32) and the long-presentation trials (Rho = -.378, p = .13).

Discussion

In the current study, we investigated the function of the right FFA in visual expertise tasks in radiologists in training. First of all, it was hypothesized that radiologists in training show higher diagnostic performance than laypeople. Second, it was hypothesized that the right FFA shows more activation for radiologists in training versus laypeople, in particular in the holistic mode (i.e., during short-presentation trials) and less so in the search-to-find mode (i.e., during long-presentation trials). Finally, it was expected that the activation of the right FFA is correlated with diagnostic performance and experience for radiologists in training for both modes.

In accordance with hypothesis 1, the diagnostic performance of radiologists in training was significantly higher than that of laypeople for short- and long-presentation trials. We found tentative support for hypothesis 2 in the form of a significant three-way interaction between expertise level, trial length, and location. Radiologists in training showed somewhat higher involvement of the right FFA in diagnosing radiographs than laypeople during short-presentation trials but not during long-presentation trials, whereas the opposite pattern was found for the left FFA. However, none of the post-hoc t-tests showed significant differences between laypeople and radiologists in training when all trials were included; when only correctly diagnosed trials were included, there was a significant difference in the right FFA for short-presentation trials. In contrast to hypothesis 3, diagnostic performance did not correlate significantly with the beta values, and in contrast to hypothesis 4, participants’ experience did not correlate significantly with the beta values.

The analyses of V1 activity provide insight into the attention account of expertise effects, which holds that expertise effects in the FFA and other brain regions are simply the effect of greater attentional engagement with the objects of expertise [9,18]. That is, experts’ overall higher level of attention to objects of expertise (because they are more interested in those stimuli) would cause expertise effects not just in the FFA but also in other regions of the visual system, such as V1. Gauthier [35] argues against this account, showing expertise effects in the FFA even with limited attention (e.g., when the object of expertise is irrelevant), and even in the regional grey-matter thickness of the FFA. To explore the attention account for our data, we investigated whether radiologists in training show larger V1 activation (as a result of larger attentional engagement) than laypeople and found no evidence for this. Additionally, as suggested by reviewers, we investigated whether partialling out V1 activation removes the expertise effects, and indeed found that differences between conditions were no longer significant after partialling out V1 activity. This means that we cannot exclude that increased attention to the stimuli by residents compared to laypeople explains the pattern of results in the FFA. On the other hand, V1 activity was not a significant covariate in any of the ANCOVAs.

Most expertise studies concentrate on experts in visual tasks (e.g., radiologists with at least ten years of experience). In comparison, we found tentative evidence that the right FFA already responds to expertise-related stimuli in radiologists in training with only 1–5 years of experience in radiology, although the difference did not reach significance in all ANOVAs. This result suggests that fast holistic processes might play a role in diagnostic reasoning earlier than expected, as early as residency training. Of course, it has to be noted that our task was adapted to the level of the radiologists in training: we used abnormalities that were relatively easy to diagnose. Furthermore, we chose to present radiographs for two seconds, whereas experienced radiologists have been found to detect tumors on mammograms in as little as 250 ms [1]. Still, these results suggest the involvement of the FFA in holistic processing in radiologists in training. While our design focused on a different characteristic of holistic processing (speed rather than the inversion effect), our results corroborate Bilalić’s findings that the right FFA plays a crucial role in holistic processing.

These results provide further evidence for the idea that radiological expertise is reflected in the right-FFA activation level. Results, so far, are not completely consistent. Harley and colleagues [2] found no difference in right-FFA activation levels between expertise groups (1st-year residents versus 4th-year residents and practicing radiologists), but they did find a correlation between right-FFA activation and diagnostic performance. We found the opposite pattern of results: differences in right-FFA activation levels between expertise groups (laypeople versus radiologists in training) but no correlation between right-FFA activation and diagnostic performance or experience. This might be explained by the fact that we included a group of laypeople and a group of radiologists in training, whereas Harley and colleagues included no laypeople in their sample, but residents at two different expertise levels as well as a group of practicing thorax radiologists. Together, these results suggest that the right FFA starts responding to domain-specific stimuli already early in the process of acquiring expertise. Our lack of correlation between diagnostic performance and right-FFA activation might be caused by the relatively high diagnostic performance of our radiologists in training. Participants were only required to indicate whether the potential diagnosis matched the diagnosis that they had in mind for the radiograph (‘forced-choice’ situation), which can be considered easier than diagnosing the abnormality. We instructed participants to execute their diagnostic reasoning during the presentation of the radiograph, and not during the presentation of the potential diagnosis.
While this choice was made to ensure that diagnostic reasoning took place during the presentation of the radiographs, the resulting diagnostic-performance measure was suboptimal: it yielded very high performance, which might explain the lack of correlation.

Findings in the literature are somewhat mixed when it comes to differences in right-FFA activation between expertise groups in radiology. Neither Haller and Radue [30] nor Melo and colleagues [31] found areas with significant activation in the vicinity of the FFA; note that they did not employ an FFA-localizer task but only reported areas with significant activation. Bilalić and colleagues [23] found FFA involvement only to some extent. However, two of those studies [23,30] employed tasks that did not tap into the process of diagnosing radiographs, but instead employed unrelated tasks that required participants only to look at radiographs. Harley and colleagues [2] required participants to detect the presence or absence of cued tumors, and only Melo and colleagues [31] required participants to formulate a diagnosis (but for briefly presented radiographs only). Diagnostic reasoning relies strongly on (structured) expert knowledge and is thus likely to result in different patterns of brain activation than tasks that can easily be conducted by novices, such as a 1-back task [23] or spotting manipulations in radiographs [30]. The role of the right FFA in visual expertise might thus depend on which (expertise-related) task is executed, how much time is spent on it, and other processes such as top-down attention modulation [18].

Related to this is an important limitation of the study: holistic processing is a term notorious for its many definitions and associations [32,36]. A similar concern applies to diagnostic processing, a term that covers many different cognitive tasks and processes [25,31]. The specific design of the task and the instructions are thus very likely to affect which diagnostic processes take place, which in turn affects the results. For example, while our task aimed to separate holistic processing from the search-to-find mode, some holistic processing very likely also took place in the long-presentation trials, in which participants were instructed to rely mostly on the search-to-find mode. In contrast, it is unlikely that the search-to-find mode can be executed within the short time of the short-presentation trials, so these trials likely reflect mostly holistic processing. Still, our aim of tapping into the complexity of diagnostic processes made it relatively difficult to fully separate the two modes. Other researchers have avoided this problem by employing tasks with low similarity to the actual image-interpretation task, such as 1-back tasks or detecting image manipulations. However, we consider it critical to go beyond this type of task and complement the literature with tasks that realistically reflect the diagnostic-reasoning process. Together, all these designs explore the complexity of the diagnostic-reasoning process. Thus, further research should use different types of tasks (including diagnostic-reasoning tasks) and designs to illuminate under which conditions results converge and diverge [cf. 37], in order to understand how visual expertise is represented in the right FFA.
Likewise, the holistic processing of stimuli is only one aspect of visual expertise, and it would be interesting to also investigate the involvement of other brain regions in other aspects of visual expertise, such as areas in the occipital and frontal cortex [21], for example in radiology [2,30,31].

Another limitation of our study, related to this aim, is that participants were required to check their diagnosis from the short-presentation runs during the long-presentation runs to elicit the checking mode. Thus, the order of short- and long-presentation runs could not be counterbalanced, and we could not use a new set of images in the long-presentation runs. This could have caused order effects and/or novelty effects: all pictures in the long-presentation runs had already been inspected in the short-presentation runs, which could have caused disengagement. However, not only was the repetition of stimuli necessary to elicit the checking mode; using the same stimuli in both run types also ensured that the difficulty of the two modes was the same. Furthermore, although the images were presented twice, even a presentation time of 10s is shorter than what residents would normally spend on an image, making disengagement an unlikely explanation of the lower activity during the longer trials. Anecdotally, participants remarked that they would have preferred to see the images even longer. Finally, note that we do not interpret the main effect of trial length, but only the main effect of expertise and the interaction of expertise and trial length. Even so, further research could counterbalance the order of the short- and long-presentation runs to entirely exclude a possible order effect; this would require equally difficult stimuli in the two modes and accordingly adapted instructions in the long-presentation runs.

A final important limitation of our study is the limited number of participants. As a reviewer pointed out, this could also explain why we found the opposite pattern of results to Harley and colleagues [2]. Recently, it has been argued that earlier studies on the expertise account of the FFA with small sample sizes have overestimated the expertise effects [15]. The issue of power in fMRI has received increasing attention [38], with suggestions to run power analyses on pilot data before data collection. This is still seldom done and is complicated in expertise studies, where eligible participant populations are small. In our study, two design choices increased power: the blocked design and the sufficiently high number of trials. Unfortunately, our sample size, loosely based on those earlier findings, might have been too small given our current understanding of the size of the expertise effect. Thus, our study provides tentative evidence that expertise effects can indeed be found in the right FFA in the holistic mode but less so in the search-to-find mode; further research with larger samples is necessary to corroborate these findings.
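To illustrate the power concern, the achieved power of a two-sample comparison of this size can be approximated from the noncentral t distribution; a sketch using scipy (taking d = 0.71 from the post-hoc test above as a rough effect-size estimate; this is an illustrative approximation, not a formal power analysis of the study):

```python
from math import sqrt
from scipy import stats

def power_two_sample_t(d, n1, n2, alpha=0.05):
    """Approximate power of a two-sided independent-samples t-test
    for effect size d, via the noncentral t distribution."""
    df = n1 + n2 - 2
    nc = d * sqrt(n1 * n2 / (n1 + n2))      # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    # P(reject H0) = upper tail above +t_crit plus lower tail below -t_crit
    return stats.nct.sf(t_crit, df, nc) + stats.nct.cdf(-t_crit, df, nc)

# With d = 0.71 and groups of 11 and 17, the design stays well below the
# conventional .80 power threshold
print(power_two_sample_t(0.71, 11, 17) < 0.80)  # True
```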

In conclusion, our data provide tentative support for the general expertise hypothesis of right-FFA functioning in a group of radiologists in training. On top of that, we found tentative support for the hypothesis that the right FFA shows more activation for radiologists in training than for laypeople, in particular in the holistic mode (i.e., during short-presentation trials) and less so in the search-to-find mode (i.e., during long-presentation trials). We did not find significant correlations between diagnostic performance and right-FFA activation. These data provide some evidence for the view that the right FFA supports holistic processing of stimuli in participants’ domain of expertise.

Acknowledgments

We would like to thank Armin Heinecke from BrainInnovation B.V. (Maastricht, the Netherlands) for his help with the BrainVoyager analyses and Sven Gijsen for programming the visual stimulation using the BrainStim software.

Data Availability

The SPSS file (without identifying information) that can be used to replicate the study’s results can be accessed on Dataverse, https://doi.org/10.34894/O0CKVP. Researchers with an interest in the raw data can request access to specific files from the data manager using the contact function in Dataverse.

Funding Statement

This work was supported by an fMRI scanning grant to AdB and EK from the executive board of the Faculty of Health, Medicine and Life Sciences, Maastricht University, the Netherlands. The funding covered scanning costs for this study, but not the salary of the researchers. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Evans KK, Georgian-Smith D, Tambouret R, Birdwell RL, Wolfe JM. The gist of the abnormal: above-chance medical decision making in the blink of an eye. Psychonomic Bulletin & Review. 2013;20(6):1170–5. doi: 10.3758/s13423-013-0459-3 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2.Harley EM, Pope WB, Villablanca JP, Mumford J, Suh R, Mazziotta JC, et al. Engagement of fusiform cortex and disengagement of lateral occipital cortex in the acquisition of radiological expertise. Cerebral Cortex. 2009;19(11):2746–54. doi: 10.1093/cercor/bhp051 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Bilalić M, Langner R, Ulrich R, Grodd W. Many faces of expertise: fusiform face area in chess experts and novices. Journal of Neuroscience. 2011;31(28):10206–14. doi: 10.1523/JNEUROSCI.5727-10.2011 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Gauthier I, Skudlarski P, Gore JC, Anderson AW. Expertise for cars and birds recruits brain areas involved in face recognition. Nature Neuroscience. 2000;3(2):191–7. doi: 10.1038/72140 [DOI] [PubMed] [Google Scholar]
  • 5.Kundel HL, Nodine CF, Conant EF, Weinstein SP. Holistic component of image perception in mammogram interpretation: gaze-tracking study. Radiology. 2007;242(2):396–402. doi: 10.1148/radiol.2422051997 [DOI] [PubMed] [Google Scholar]
  • 6.Kanwisher N, McDermott J, Chun MM. The fusiform face area: A module in human extrastriate cortex specialized for face perception. Journal of Neuroscience. 1997;17(11):4302–11. doi: 10.1523/JNEUROSCI.17-11-04302.1997 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Sorger B, Goebel R, Schiltz C, Rossion B. Understanding the functional neuroanatomy of acquired prosopagnosia. Neuroimage. 2007;35(2):836–52. doi: 10.1016/j.neuroimage.2006.09.051 [DOI] [PubMed] [Google Scholar]
  • 8.Kanwisher N. Domain specificity in face perception. Nature neuroscience. 2000;3(8):759. doi: 10.1038/77664 [DOI] [PubMed] [Google Scholar]
  • 9.Kanwisher N. The quest for the FFA and where it led. The Journal of Neuroscience. 2017;37(5):1056–61. doi: 10.1523/JNEUROSCI.1706-16.2016 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Duchaine B, Yovel G. A revised neural framework for face processing. Annual review of vision science. 2015;1:393–416. doi: 10.1146/annurev-vision-082114-035518 [DOI] [PubMed] [Google Scholar]
  • 11.Xu Y. Revisiting the role of the fusiform face area in visual expertise. Cerebral Cortex. 2005;15(8):1234–42. doi: 10.1093/cercor/bhi006 [DOI] [PubMed] [Google Scholar]
  • 12.McGugin RW, Van Gulick AE, Tamber-Rosenau BJ, Ross DA, Gauthier I. Expertise effects in face-selective areas are robust to clutter and diverted attention, but not to competition. Cerebral Cortex. 2015;25(9):2610–22. doi: 10.1093/cercor/bhu060 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Gegenfurtner A, Kok EM, van Geel K, de Bruin AB, Sorger B. Neural correlates of visual perceptual expertise: Evidence from cognitive neuroscience using functional neuroimaging. Frontline Learning Research. 2017;5(3):95–111. doi: 10.14786/flr.v5i3.259 [DOI] [Google Scholar]
  • 14.Bilalić M. Revisiting the role of the fusiform face area in expertise. Journal of Cognitive Neuroscience. 2016;28(9):1345–57. doi: 10.1162/jocn_a_00974 [DOI] [PubMed] [Google Scholar]
  • 15.Burns E, Arnold T, Bukach C. P-curving the Fusiform Face Area: Meta-Analyses Support the Expertise Hypothesis. Neuroscience & Biobehavioral Reviews. 2019;104. doi: 10.1016/j.neubiorev.2019.07.003 [DOI] [PubMed] [Google Scholar]
  • 16.Gauthier I, Tarr MJ, Anderson AW, Skudlarski P, Gore JC. Activation of the middle fusiform ’face area’ increases with expertise in recognizing novel objects. Nature Neuroscience. 1999;2(6):568–73. doi: 10.1038/9224 [DOI] [PubMed] [Google Scholar]
  • 17.Barton JJS, Hanif H, Ashraf S. Relating visual to verbal semantic knowledge: the evaluation of object recognition in prosopagnosia. Brain. 2009;132(12):3456–66. doi: 10.1093/brain/awp252 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Harel A, Gilaie-Dotan S, Malach R, Bentin S. Top-down engagement modulates the neural expressions of visual expertise. Cerebral Cortex. 2010;20(10):2304–18. doi: 10.1093/cercor/bhp316 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.McGugin RW, Newton AT, Gore JC, Gauthier I. Robust expertise effects in right FFA. Neuropsychologia. 2014;63:135–44. doi: 10.1016/j.neuropsychologia.2014.08.029 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Ross DA, Tamber-Rosenau BJ, Palmeri TJ, Zhang J, Xu Y, Gauthier I. High-resolution functional magnetic resonance imaging reveals configural processing of cars in right anterior fusiform face area of car experts. Journal of cognitive neuroscience. 2018;30(7):973–84. doi: 10.1162/jocn_a_01256 [DOI] [PubMed] [Google Scholar]
  • 21.Martens F, Bulthé J, van Vliet C, de Beeck HO. Domain-general and domain-specific neural changes underlying visual expertise. Neuroimage. 2018;169:80–93. doi: 10.1016/j.neuroimage.2017.12.013 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Rhodes G, Byatt G, Michie PT, Puce A. Is the fusiform face area specialized for faces, individuation, or expert individuation? Journal of cognitive neuroscience. 2004;16(2):189–203. doi: 10.1162/089892904322984508 [DOI] [PubMed] [Google Scholar]
  • 23.Bilalić M, Grottenthaler T, Nägele T, Lindig T. The faces in radiological images: Fusiform face area supports radiological expertise. Cerebral Cortex. 2016;26(3):1004–14. doi: 10.1093/cercor/bhu272 [DOI] [PubMed] [Google Scholar]
  • 24.Farah MJ, Tanaka JW, Drain HM. What causes the face inversion effect. Journal of Experimental Psychology-Human Perception and Performance. 1995;21(3):628–34. doi: 10.1037//0096-1523.21.3.628 [DOI] [PubMed] [Google Scholar]
  • 25.Sheridan H, Reingold EM. The holistic processing account of visual expertise in medical image perception: A review. Frontiers in Psychology. 2017;8:1620. doi: 10.3389/fpsyg.2017.01620 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 26.Gegenfurtner A, Lehtinen E, Säljö R. Expertise differences in the comprehension of visualizations: a meta-analysis of eye-tracking research in professional domains. Educational Psychology Review. 2011;23(4):523–52. doi: 10.1007/s10648-011-9174-7 [DOI] [Google Scholar]
  • 27.Reingold EM, Sheridan H. Eye movements and visual expertise in chess and medicine. In: Leversedge SP, Gilchrist ID, Everling S, editors. The Oxford handbook of eye movements. Oxford (UK): Oxford University Press; 2011. p. 528–50. [Google Scholar]
  • 28.Kundel HL, Nodine CF. Interpreting Chest Radiographs without Visual Search. Radiology. 1975;116(3):527–32. doi: 10.1148/116.3.527 [DOI] [PubMed] [Google Scholar]
  • 29.Bilalić M. The neuroscience of expertise. Cambridge (UK): Cambridge University Press; 2017. doi: 10.3758/s13428-016-0782-5 [DOI] [Google Scholar]
  • 30.Haller S, Radue EW. What is different about a radiologist’s brain? Radiology. 2005;236(3):983–9. [DOI] [PubMed] [Google Scholar]
  • 31.Melo M, Scarpin DJ, Amaro E, Passos RBD, Sato JR, Friston KJ, et al. How doctors generate diagnostic hypotheses: A study of radiological diagnosis with functional magnetic resonance imaging. PLoS One. 2011;6(12):8. doi: 10.1371/journal.pone.0028752 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Harel A, Kravitz D, Baker CI. Beyond perceptual expertise: Revisiting the neural substrates of expert object recognition. Frontiers in Human Neuroscience. 2013;7:1–12. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.Ogawa S, Lee T-M, Kay AR, Tank DW. Brain magnetic resonance imaging with contrast dependent on blood oxygenation. Proceedings of the National Academy of Sciences. 1990;87(24):9868–72. doi: 10.1073/pnas.87.24.9868 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 34.Cohen J. Statistical power analysis for the behavioral sciences. Hillsdale (NJ): Lawrence Earlbaum Associates.; 1988. [Google Scholar]
  • 35.Gauthier I. Re: The Quest for the FFA led to the Expertise Account of its Specialization. arXiv:1702.07038. 2017.
  • 36.Richler JJ, Palmeri TJ, Gauthier I. Meanings, mechanisms, and measures of holistic processing. Frontiers in Psychology. 2012;3:553. doi: 10.3389/fpsyg.2012.00553 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 37.de Bruin ABH. The potential of neuroscience for health sciences education: towards convergence of evidence and resisting seductive allure. Adv Health Sci Educ. 2016:1–8. doi: 10.1007/s10459-016-9733-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.Szucs D, Ioannidis JPA. Sample size evolution in neuroimaging research: An evaluation of highly-cited studies (1990–2012) and of latest practices (2017–2018) in high-impact journals. NeuroImage. 2020;221:117164. doi: 10.1016/j.neuroimage.2020.117164 [DOI] [PubMed] [Google Scholar]

Decision Letter 0

Niels Bergsland

7 Sep 2020

PONE-D-20-23315

Holistic processing only? The role of the fusiform face area in radiological expertise

PLOS ONE

Dear Dr. Kok,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

There are a fair number of comments to be addressed by the Reviewers but they all seem highly pertinent to improve the overall quality of the manuscript. In addition, please carefully proofread the manuscript as there are some typos throughout (e.g. "Them," instead of "Then," in the abstract). Also, I recognize that recruiting additional subjects for the study, as suggested, may not be feasible. However, please do your best to discuss this aspect within your manuscript. Please also be careful about the use of the term "marginally significant". While you can discuss these results, the use of the term "trend" is more appropriate (e.g. lines 319-320).

Please submit your revised manuscript by Oct 22 2020 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Niels Bergsland

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For more information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

[Note: HTML markup is below. Please do not edit.]

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors test whether the right FFA exhibits experience related responses to radiographs in radiology students vs a set of controls. The authors replicate the expertise-rFFA link like many papers before it, but only in the blocks where participants have to respond quickly. They do not find similar expertise related differences in V1, which the authors claim rejects the attention hypothesis of FFA expertise effects.

While it’s always reassuring to see a finding in the literature replicated, I think the authors could extend the literature in a more novel way by localising and examining these effects in other brain regions. I also have some comments on the authors’ interpretations of their data, analysis and methodology. The study is low powered and the authors should acknowledge this or remedy it with more data (although I appreciate this latter solution may be impossible due to lack of funding). I should mention I am not an expert on the technical minutiae of fMRI data recording or processing. This paper should therefore also be reviewed by an fMRI expert, one who can assess the technical aspects carefully. However, I will comment on these as best I can. Having read all the fMRI expertise papers, I am much more familiar with the paradigms used and the evidence that supports and rejects the expertise hypothesis, which I feel will help the authors in my review.

Major points

1. While replicating non-face FFA effects is interesting, it has now been shown in around 20-30 papers. One thing that is less clear is whether the left FFA or either OFAs are responsive to object expertise. When you examine the literature, it is only a tiny minority of papers that show effects in these regions (e.g., Harley et al., 2009; McGugin, Newton, Gore, & Gauthier, 2014; McGugin, Van Gulick, Tamber-Rosenau, Ross, & Gauthier, 2014; Ross et al., 2018). Why do the authors not try and remedy this by localising these other regions and exploring whether they are also responsive to expertise? Or, if they can’t, the authors should report why; i.e., maybe they can’t localise these regions, which has been mentioned in some papers. The same is also true of effects in the LOC too.

I note the authors mention that they only looked at the FFA to avoid employing multiple comparisons with other areas that would inflate the risk of a Type 2 error, but this is not a good reason to not perform these additional analyses. Burns, Arnold, & Bukach (2019, Footnote 4) found that conservatively speaking, expertise studies should test 56 participants even when engaging one-tailed analyses. That would place this study as underpowered with n = 28, or 17 experts (I should mention only one expertise study does not suffer this low power problem: Martens et al., 2018). It does not therefore make sense to perform corrections for multiple comparisons when power is low (Nakagawa, 2004). Instead, it would be more interesting to run the analyses on the other localised regions, and simply report that due to low power, the p-values will not be corrected. That way, we can see if there are expertise effects in other regions (which are not often reported in the literature outside of the rFFA). Also, were the reported analyses two-tailed?
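[Editor's illustration: the power arithmetic underlying this point can be sketched with the standard Fisher-z approximation for detecting a correlation. This is a minimal sketch; the function name and the example effect sizes are hypothetical choices, not figures taken from Burns et al.]

```python
import math
from statistics import NormalDist  # Python 3.8+


def n_for_correlation(r, alpha=0.05, power=0.80, tails=1):
    """Approximate sample size needed to detect a true correlation r,
    via the Fisher z approximation: n = ((z_alpha + z_beta) / atanh(r))^2 + 3."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha / tails)  # critical z for the chosen alpha
    z_beta = nd.inv_cdf(power)               # z corresponding to the desired power
    return math.ceil(((z_alpha + z_beta) / math.atanh(r)) ** 2 + 3)
```

Under these assumptions, a one-tailed test at α = .05 with 80% power needs roughly 56 participants to detect a true r of about .33, but well over 150 for r = .2, which illustrates why modest expertise correlations demand far larger samples than n = 28.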

2. Relatedly, how did the authors determine sample sizes? Did they analyse their data as each participant was recruited then stop when they found predicted results? Or was there an a priori number? It seems the sample size is similar to that suggested in the post 2010 Gauthier lab papers, but the authors do not specify this.

3. The authors say there is a need for a non-passive study, but this doesn’t sound right. When I double-checked, citation 2 (Harley et al.) already ran a diagnostic fMRI task in radiologists. While their localizer scan was passive, which is common, in the actual experiment the participants tried to identify the nodules (quoting from the paper):

“The diagnosis scans were event-related scans in which participants viewed intact and coarsely scrambled radiographs and judged whether a cued region in each radiograph contained a lung nodule”

This means that what the authors say has not been done actually has been. The more novel aspect of their study is therefore the 2s vs 10s difference. I think the authors may want to reread their other citations to make sure there aren’t other errors in describing the literature.

4. I can’t think of a single fMRI expertise study that has made their raw data publicly available, even though there are repositories available for this very purpose (D’Esposito, 2000). Why do the authors not make their data available? It would help set a precedent, improve transparency and replicability, and future researchers could reuse the data with other sets to increase power. Or do the authors have a particular reason for not sharing their data?

5. Regarding the correlations between expertise and neural activity, I think it would help if they were actually plotted (they could go into the supplementary materials if the authors prefer). That way the reader could see individual data points. A table of the within-group and across-group correlation coefficients and p-values within each condition/location would also be helpful. Gauthier and colleagues in their most recent papers recruit participants with a broad range of expertise (i.e., they’re not all experts) and find correlations between expertise and FFA activity, so analysing all participants together would help increase power.

6. It is common for researchers to find the voxel in the FFA that activates the most to faces, and then test whether this is also the peak for expertise, did the authors do this here? I can’t see mention of it.

Minor points

Page 3 line 60, I’m not sure there is a general consensus. I still meet countless people at conferences/talks who claim the expertise hypothesis is not real. This is reflected in the literature where Kanwisher (2017) still argues against these effects, with echoes of this present in another recent review (Duchaine & Yovel, 2015).

The authors touch upon prosopagnosia, so it may be worth noting that there are three separate studies from Jason Barton’s lab of groups of patients with FFA lesions that exhibit impaired object expertise (the citations are in the recent Burns et al 2019 expertise meta-analysis paper). One of these papers doesn’t specify the patients have FFA lesions, but if you check their earlier papers, you can identify that the cases actually do. I think when combined with the fMRI data, the expertise hypothesis becomes very compelling. In the intro the authors ask what the FFA is doing in expertise, and I think this data strongly suggests it contributes to the experience based individuation of highly similar object exemplars.

The authors refer to chest pieces, but I think they mean chess.

Page 4

The authors don’t really define holistic processing (or many other terms they introduce). What is it? Does inversion disrupt it because we are no longer using experience-based pathways that would typically recognise such objects in their commonly seen upright configuration? Where’s the evidence that speed is a more central feature of holistic perception than stimulus configuration? What is search-to-find? What are the conceptual similarities pointed out by Bilalic? It’s also not explained clearly why short versus long presentation durations should differentiate experts in their neural activity.

Why is the FFA the most theoretically relevant? How do we know it’s the most relevant if the other regions (e.g., OFA, lFFA) have not been as well studied?

Page 14

I presume the authors look at V1 because others have suggested FFA expertise activity is an artefact of attention, however, the authors do not explain this, nor cite the previous literature positing this (Harel et al, 2010; Kanwisher, 2017). Prior work has tried to address this issue (McGugin et al., 2015; McGugin et al., 2016) and there are other problems with this hypothesis (Burns et al., 2019).

Change level should be chance level?

Page 16

The marginally significant effect was not significant and so should be reported as such (p = .052). Nor is the later interaction significant (p = .08). Despite this latter result, I’m guessing the subsidiary analyses were preplanned based upon the authors’ hypotheses and prior literature, hence the motivation to perform them despite the interaction not being significant? If so, this should be explained.

When performing the correlations between FFA activity and performance, would it not be interesting to partial out the V1 activation? If the authors argue this activity is attention related, then taking it into account may yield a non-attention-related relationship between FFA and expertise. The authors could note these analyses were exploratory as they were not planned. The logic behind this type of analysis is detailed in DeGutis, Wilmer, Mercado & Cohen (2013) and has been used in many of the McGugin FFA expertise papers. While they regressed out different types of behavioural performance, and I haven't seen it used to regress out brain activity, I believe the same logic could be applied and be useful here.
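[Editor's illustration: the residual-based logic the reviewer describes, regressing V1 activity out of both measures before correlating them, can be sketched as a generic partial-correlation helper. The variable names are made up for illustration; this is not the authors' pipeline.]

```python
import numpy as np


def partial_corr(x, y, covariate):
    """Correlation between x and y after regressing `covariate`
    (e.g., V1 activation) out of both via least squares."""
    def residuals(v, c):
        design = np.column_stack([np.ones_like(c), c])  # intercept + covariate
        beta, *_ = np.linalg.lstsq(design, v, rcond=None)
        return v - design @ beta

    rx = residuals(np.asarray(x, float), np.asarray(covariate, float))
    ry = residuals(np.asarray(y, float), np.asarray(covariate, float))
    return float(np.corrcoef(rx, ry)[0, 1])
```

If the FFA-performance relationship survived after the shared V1 component was removed, an attention-only account of the FFA effect would be harder to maintain.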

The correlation coefficients of around .2 are actually at the most conservative end of the predicted expertise-FFA relationships when the broad literature is taken into account (Burns et al., 2019; Footnote 4). Again, this hints that the effect may be there; it’s just that insufficient numbers were tested. As mentioned earlier, why not run the correlations across groups (and plot them)? It would increase power and give a broader range of expertise.

I find it strange the authors did not compute dprime as this was what was correlated with the FFA in the Harley paper.

Page 20

A simpler explanation for the different patterns of significant results in this and the Harley paper is that both were underpowered (Burns et al., 2019). If both studies have only 50% power and run two different analyses testing for a real effect, then it’s not surprising if only one analysis is significant and the other is not. If power were extremely high (>99%), we would expect both to almost always be significant. This, in my mind, is the most likely explanation for why analyses of the link between FFA and expertise can be inconsistent even within a paper, with authors giving quite complicated explanations for incongruent results (and this occurs in almost all expertise papers). Instead, low power is the simplest and most likely explanation.
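[Editor's illustration: the arithmetic behind this point is worth spelling out. This is a toy calculation with an assumed 50% power figure, used only to show why conflicting results are expected.]

```python
# Two independent studies each test a real effect with 50% power,
# i.e. each has probability 0.5 of returning a significant result.
power = 0.5

p_both_significant = power * power        # both studies 'replicate' the effect
p_exactly_one = 2 * power * (1 - power)   # the 'conflicting' significant/non-significant pattern
p_neither = (1 - power) ** 2              # both studies miss the effect
```

So with 50% power, a conflicting significant/non-significant pair is the single most likely outcome (probability .5), whereas at 99% power both studies would be significant about 98% of the time.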

Page 21

The Haller and Melo papers did not test the FFA (i.e., no localiser task) so it is not accurate to report this as such.

Reviewer #2: The authors describe their work on an interesting question: whether the FFA is responsible only for holistic processing of expertise images and not for a slower processing mode, a new angle that has received little research. They used a new task that required subjects to make a diagnosis based on a picture, which better represents the real-life processing of these images, engaging attention, serial search, and other reasoning.

The results seem solid, though with large variation between subjects; the error bars are huge. Recruiting more subjects and excluding subjects with large movement artifacts would improve this considerably. The correlation analysis might also improve if more subjects were included. It would also be good to show individual beta values in Figure 4.

There is one issue with the design, though. It is, understandably, always more difficult to support a negative conclusion. The authors found that the FFA response in the expert group was lower during the 10-s task than the 2-s task, and attributed this difference to the FFA’s selectivity for short presentations. However, the difference could also be due to: 1. A novelty effect/task difficulty: all pictures in the 10-s task had already been seen in the 2-s task, so subjects were less engaged and less attentive. 2. Longer exposure to the same image adapting the visual system and weakening the BOLD signal. Such general mechanisms could contribute to the difference. The V1 response also showed a similar trend (10-s trials showed a lower response than 2-s trials), which again indicates that this difference is not unique to FFA processing but could be attributed to the task or to BOLD characteristics.

If the authors made the 10-s task more challenging, I suspect the FFA might show a larger response even in a slow processing mode. For example, if the 10-s task used a new set of images (not previously seen ones), and the subjects had to find as many abnormalities as possible (rather than answering a yes/no question), or if it used a new set of more difficult images (rather than easy, typical educational examples), the subjects might engage more in the task.

The authors didn’t mention whether they tracked eye position, nor whether the subjects were required to fixate or could view freely. I assume that the subjects made more saccades during the 10-s task, and that may introduce more variation into the overall signal.

Other typos:

Line 68, 69: chess, not chest

Line 315: Figure 4 (right)

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Haoran Xu

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Sep 1;16(9):e0256849. doi: 10.1371/journal.pone.0256849.r002

Author response to Decision Letter 0


5 Nov 2020

Dear editor,

Please find the response to reviewer and editor comments in the response to reviewers letter.

Kind regards, Ellen Kok

Attachment

Submitted filename: Response to reviewers.docx

Decision Letter 1

Niels Bergsland

22 Dec 2020

PONE-D-20-23315R1

Holistic processing only? The role of the fusiform face area in radiological expertise

PLOS ONE

Dear Dr. Kok,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Upon review of the revised version, the Reviewers and myself continue to have concerns about the current manuscript. I agree with both Reviewers that the V1 results should not have been removed from the revised manuscript. Please pay careful attention to respond to each of the points that have been raised. Although there is a non-negligible amount of work to do, I trust that the detailed reviews and feedback from the Reviewers will help facilitate your revision.

Please submit your revised manuscript by Feb 05 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Niels Bergsland

Academic Editor

PLOS ONE

[Note: HTML markup is below. Please do not edit.]

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors have edited their manuscript to correct some issues, but some of their responses need further clarification, while others haven’t addressed my original comments. I realise my major points are quite long this time, but this is just because I want to ensure I’m clear enough that we avoid any misinterpretations. I hope the authors don’t think I’m being intentionally difficult or disingenuous; it certainly isn’t my intention. I just want to ensure the decisions the authors have made in their methods and analyses are transparent to future readers, and that there is no risk of readers misinterpreting how the literature currently stands with respect to power considerations.

Major points

1. The authors state that they can’t look at the lFFA or OFAs because they never intended to study them in the first place. Yet they have now removed a priori research questions. See the authors’ two responses to reviewers:

“Although your suggestion to localize and analyze other regions is certainly interesting, it was not the original goal of the project”

“in line with their argumentation, we have decided to remove the V1 analyses from the paper completely.”

I agree it’s perfectly fine to focus your manuscript on the original research questions, and these should be reported as a priori hypotheses. What is problematic is that these claims are at odds with the authors removing their original V1 hypotheses and analyses after the first review. These have completely vanished, as if they never existed, yet this was exactly what the authors planned. The replication crisis has shown us that removing or altering hypotheses post hoc to fit the data promotes false positives (this seems a form of HARKing, Kerr, 1998, where authors drop a priori hypotheses). At the most fundamental level, the V1 data must be reinstated in the manuscript.

Following this train of thought, why can’t they at least attempt analyses on the lFFA and/or the OFAs and report these as exploratory (or report an inability to localise them)? You’ve spent a lot of time, effort, and expense acquiring these data, so why not report them? It would also add to the novelty, provide effect sizes, or illustrate localisation issues, which are common in the literature, for other researchers. Especially when the raw data are not immediately available for others to answer these questions themselves.

I’m not trying to be difficult, but these competing arguments over original research questions and post-hoc questions do not accord with one another.

2. I don’t think it will affect the importance of the manuscript if the V1 data are linked to the FFA effect; it is better to be transparent so readers can make up their own minds. This is why performing correlations between V1 and FFA activity is interesting (the same is true for the partialling-out method). As I mentioned previously, even if there is a link between V1 and FFA effects, this is still consistent with the expertise hypothesis: see Gauthier’s recent commentary on Kanwisher’s paper on the history of FFA research, and Lohse et al. (2016) on this link to face regions. If the face-processing link is expertise based, it makes sense that there would be a link due to non-face expertise. You should report these analyses as exploratory.

3. Can’t the authors ‘deface’ the raw fMRI data to remove the risk of identification? Wouldn’t that mean they could then make the data publicly available? Also, I can’t access the data file the authors provided to replicate their analyses.

4. The description in the Methods of how the authors decided their sample size still doesn’t appear to be based on any objective criteria. Saying you decided to test 10 in one group and 10 in another is not the same as using power estimates from prior work to decide sample sizes. I am mindful that pushing this issue further may make the decision appear post hoc, but why were these numbers decided on (the authors mention lines 168-170, but I don’t see this in my pdf; maybe they mean 147-149)? Was power ever considered when designing your study? The decision to stop at 17 experts ‘for practical reasons’ is again too vague. What were these reasons? Funding spent elsewhere? Funding expiring? The scanner was no longer available to the researchers? You asked all your radiologists and the others said no? Or you stopped because you found a significant effect? It should be stated.

5. The following points all touch upon issues of power that the authors discuss.

The authors cite Gegenfurtner et al. in their discussion, which is a review of eye-tracking effect sizes, but the current study is fMRI. I don’t think this is appropriate, and it should be removed; it reminds me of the Cow-Canary problem, where effects can be massively different between two different methods (Capitani et al., 1999). The authors’ own behavioural data here show a huge effect size (which would be expected between experts and novices), yet only a medium one in the fMRI data. Why discuss eye-tracking effect sizes when you had 20-30 fMRI expertise studies to base your decision on? It doesn’t make any sense in this context and should be removed, as it just confuses the issue.

In the same place in the Discussion, the authors seem to tacitly argue that two fMRI expertise studies that are 20 years old are an acceptable guide for sample sizes (Gauthier et al., 1999, which they don’t cite here but describe, and Gauthier et al., 2000). This ignores the recent studies that used larger sample sizes (peaking with the recent Martens et al. paper, which is not cited, nor are most other studies from the last 10 years). Even if two 20-year-old papers were the basis for their sample-size decisions, this should be mentioned in the Methods and not the Discussion (although, again, a question remains as to why the authors selected such old papers to base their sample sizes on rather than more recent work).

From the above, I hope it’s clear why the current Discussion paragraph on effect sizes from eye tracking and two 20-year-old fMRI papers does not work. I don’t want future researchers reading this paper to think these sample sizes would yield sufficient power when designing their own fMRI expertise studies. I think it would be more helpful to explicitly remind readers that we now know this study and Harley’s were likely underpowered, that classic papers from 20 years ago were underpowered, and that the significant effects here were found using liberal alphas. It’s perfectly fine to accept the limitations of this manuscript, but they need to be explained to readers so they’re aware of the issues.

6. Also related to the previous points. Maybe I was not clear enough in my first review, but when I referred to page 20 and the different patterns of results, I meant that you failed to find a correlation but Harley and colleagues did, whereas you found a group experience difference that Harley didn’t. The simplest explanation for this disparity is that both studies are underpowered. If you have two studies that are underpowered (maybe 50-ish%), then it’s likely one (Harley et al.) will find a significant effect in one analysis but not the other, while the second paper (the current manuscript) finds the opposite pattern of significant/non-significant results with the same analyses. You haven’t discussed this in this paragraph, nor in the later paragraph where you argue your study is not underpowered based on two studies from 20 years ago. You should explain this account of the conflicting results explicitly when discussing the differences between Harley and yourselves.

I should add, the discussion of the 2-s vs 10-s trials is distinct from the point I’m trying to make here (although a similar logic could be applied, I think the correlation/group-difference disparity needs to be explained separately).

Minor points

1. The authors uploaded two different versions of their manuscript, one with tracked changes indicated and another without, but switched between them when referring to them in their response to reviewers (and they sometimes referred to line numbers that didn’t correspond with the text). This made checking between the different texts confusing and overly time-consuming to review (and apologies if I’ve now referred to one text rather than the other). I’d recommend that on the next submission they simply change the font colour of the text they’ve edited and upload a single manuscript (unless it is a journal-specific requirement to upload two?). The same is true of having the figure legends next to the actual figures.

2. I think the authors have misinterpreted signal detection theory by claiming they can’t compute d-prime. They employed a 2AFC task with stimulus pairs that were same or different. This yields the categories of hits, misses, correct rejections, and false alarms. It is quite easy to compute d-prime to test whether this behavioural measure is more strongly correlated with FFA activity, as it was in Harley et al. I’m convinced other readers would want to see this to compare the results between the current study and Harley et al.
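
As an illustration of the reviewer’s suggestion, d-prime is a one-liner once the four response categories are counted. The sketch below uses hypothetical counts and a log-linear correction for extreme rates; it is not code or data from the manuscript.

```python
from statistics import NormalDist

def dprime(hits, misses, false_alarms, correct_rejections):
    """d' from raw counts; the +0.5/+1 (log-linear) correction keeps
    the z-transform finite when a rate would be exactly 0 or 1."""
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Made-up counts for illustration only
print(round(dprime(40, 10, 8, 42), 2))  # ≈ 1.79
```

The per-participant d-prime values produced this way could then be correlated with FFA beta estimates, as in Harley et al.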

3. Line 84: “processing a face as a whole, and not as a combination of features”

I think referring to ‘combination of features’ implies a holistic relational percept. Maybe something like “as separate, distinct features that do not interact to form a single percept” is clearer?

4. Removing the original graphs makes the group differences/similarities across the different conditions harder to compare, so they should be reinstated. I like the individual plots, but trendlines would be a useful addition too.

5. Line 490 still reads as though prior studies analysed an area that included the FFA (i.e., you mention overlap), but we can’t say this, as they did not localise it. It could be in the vicinity of the FFA but not actually overlap it. Related to this, there are many expertise FFA studies that did localise the FFA and are not cited in this manuscript, which seem more relevant (particularly the Martens et al. study, which should be the gold standard going forward). If only to generate interest in the current paper, the authors would surely want to reference these, as more citations lead to more exposure.

Reviewer #2: I appreciate the authors modified and responded to each of the points I raised in the review.

My remaining concern for the paper is obviously the non-significance of the difference in rFFA activation between experts and laypeople for the 2-s trials. The statistics showed that for all trials the t-test gave p = 0.06, for correct trials p = 0.04, and the ANOVA results for experts vs. laypeople are all non-significant. Such a small effect could be driven by the two data points shown in Figs 4 and 5, which show beta values of around 2 and 3 despite those participants’ low diagnostic performance and short training backgrounds. Given that all the main conclusions and novelties of this paper rest on this small “trend”, and there seems to be no way the authors could recruit more subjects to change that, I am not confident in what has been concluded in the paper.

Reviewer #1 pointed out a paper by Burns, Arnold and Bukach (2019, Neuroscience and Biobehavioral Reviews), in which the authors used meta-analysis to argue that p-values close to 0.05 could still reflect genuine effects. I am, however, not convinced that such a meta-analysis across different tasks can justify the current small “trend” seen in the paper under review.

I also don’t like that the authors removed the V1 results. As reviewer #1 suggested, I think the authors should use V1 to normalize the rFFA response, to control for possible attention/engagement bias across subject groups and expertise levels. The literature has shown that early visual areas, including LGN, V1 and V4, all show attentional modulation in fMRI studies.
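
A normalization of the kind the reviewer proposes could be sketched as follows. All variable names and per-subject beta values below are invented for illustration; they are not taken from the study.

```python
# Hypothetical per-subject beta estimates (invented values)
rffa_betas = [2.1, 1.4, 3.0, 0.8]
v1_betas = [1.9, 1.5, 2.4, 1.1]

# Ratio normalization expresses the rFFA response relative to early
# visual drive, discounting global attention/engagement differences
rffa_over_v1 = [f / v for f, v in zip(rffa_betas, v1_betas)]

# A difference score is a common alternative that avoids instability
# when V1 betas approach zero
rffa_minus_v1 = [f - v for f, v in zip(rffa_betas, v1_betas)]
print([round(x, 2) for x in rffa_over_v1])
```

Either normalized score could then replace the raw rFFA betas in the group comparisons and correlations.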

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: PLOSONE 2.docx

PLoS One. 2021 Sep 1;16(9):e0256849. doi: 10.1371/journal.pone.0256849.r004

Author response to Decision Letter 1


15 Apr 2021

Dear editor and reviewers,

Please find our responses to your comments in the file 'Response to reviewers'.

Kind regards, Ellen Kok

Attachment

Submitted filename: Response to reviewers.docx

Decision Letter 2

Niels Bergsland

21 Jun 2021

PONE-D-20-23315R2

Holistic processing only? The role of the fusiform face area in radiological expertise

PLOS ONE

Dear Dr. Kok,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

I would also like to thank you for your patience during the review process as I recognize it has been quite lengthy.

Please submit your revised manuscript by Aug 05 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Niels Bergsland

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.


Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: I think the manuscript reads better now. I only have one main comment:

In the previous drafts, the authors ran a one-tailed t-test on the right FFA short run which showed an effect of expertise, but for some reason they are no longer reporting this as one-tailed, but two-tailed now on line 401 (I think, the t-value is identical and the p-value is now double what it was previously in the deleted section), which renders the effect of expertise non-significant. I can't work out why this has been changed? I think it would be beneficial if this was changed back to the one-tailed test as that was what the authors initially intended and corroborates what they're actually claiming, i.e., an effect of expertise in the right FFA. There may need to be some minor text adjustments because of this (e.g., paragraph at line 466).

Lines 481-483 need citation/s.

There were a couple minor typos so a proofread may be good (e.g., line 77 gap before comma).

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No


PLoS One. 2021 Sep 1;16(9):e0256849. doi: 10.1371/journal.pone.0256849.r006

Author response to Decision Letter 2


28 Jul 2021

Dear editor,

Thank you for providing us with the opportunity to revise the manuscript. Below, we detail how we have adapted the manuscript in response to the comments of the reviewers. Additionally, the reference list was checked for completeness and correctness.

Kind regards on behalf of all authors,

Ellen Kok

1. In the previous drafts, the authors ran a one-tailed t-test on the right FFA short run which showed an effect of expertise, but for some reason they are no longer reporting this as one-tailed, but two-tailed now on line 401 (I think, the t-value is identical and the p-value is now double what it was previously in the deleted section), which renders the effect of expertise non-significant. I can't work out why this has been changed? I think it would be beneficial if this was changed back to the one-tailed test as that was what the authors initially intended and corroborates what they're actually claiming, i.e., an effect of expertise in the right FFA. There may need to be some minor text adjustments because of this (e.g., paragraph at line 466).

We had adapted the analysis to a two-sided test because all other tests were two-sided. Since we originally had a one-sided test (because we had a specific hypothesis), we have now changed the sentence to read:

“Post-hoc t-tests show a significant effect for the right FFA short runs, t(26) = 1.80, p = .04 (one-sided t-test), Cohen’s d = 0.71 (medium-to-large effect); for the other t-tests all t’s <1.0.” (lines 401-402 in the version without tracked changes).

Additionally, we changed ‘trend towards significance for the difference’ to ‘significant difference’ in line 471 in the version without tracked changes.
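
The doubling relationship behind this change (for the same t statistic, the two-tailed p is exactly twice the one-tailed p) can be verified with a small stdlib-only computation using the reported t(26) = 1.80. The integration routine below is an illustrative sketch, not part of the analysis pipeline.

```python
from math import gamma, sqrt, pi

def t_upper_tail(t_val, df, steps=200000):
    """P(T > t_val) for Student's t with df degrees of freedom,
    via midpoint-rule integration of the density from 0 to t_val."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    dx = t_val / steps
    area = sum(
        c * (1 + ((i + 0.5) * dx) ** 2 / df) ** (-(df + 1) / 2) * dx
        for i in range(steps)
    )
    return 0.5 - area

p_one = t_upper_tail(1.80, 26)  # one-sided p, rounds to .04
p_two = 2 * p_one               # two-sided p, rounds to .08
```

This reproduces the reviewer’s observation that the identical t-value yields p ≈ .04 one-sided but p ≈ .08 two-sided, which is why the choice of tails determines significance here.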

2. Lines 481-483 need citation/s.

We have now added the citation that was accidentally left out for line 481-482 (reference 35).

3. There were a couple minor typos so a proofread may be good (e.g., line 77 gap before comma).

An additional thorough proofread was conducted, fixing the typo in line 77 and several others.

Attachment

Submitted filename: ReponsetothereviewersR3.docx

Decision Letter 3

Niels Bergsland

18 Aug 2021

Holistic processing only? The role of the right fusiform face area in radiological expertise

PONE-D-20-23315R3

Dear Dr. Kok,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Niels Bergsland

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Acceptance letter

Niels Bergsland

23 Aug 2021

PONE-D-20-23315R3

Holistic processing only? The role of the right fusiform face area in radiological expertise

Dear Dr. Kok:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Niels Bergsland

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    Attachment

    Submitted filename: Response to reviewers.docx

    Attachment

    Submitted filename: PLOSONE 2.docx

    Attachment

    Submitted filename: Response to reviewers.docx

    Attachment

    Submitted filename: ReponsetothereviewersR3.docx

    Data Availability Statement

    The SPSS file (without identifying information) that can be used to replicate the study’s results can be accessed on Dataverse, https://doi.org/10.34894/O0CKVP. Researchers with an interest in the raw data can request access to specific files from the data manager using the contact function in Dataverse.

