Author manuscript; available in PMC: 2011 Dec 1.
Published in final edited form as: Cogn Affect Behav Neurosci. 2010 Dec 1;10(4):541–551. doi: 10.3758/CABN.10.4.541

Holistic processing of musical notation: Dissociating failures of selective attention in experts and novices

Yetta Kwailing Wong 1,*, Isabel Gauthier 1
PMCID: PMC3044322  NIHMSID: NIHMS240107  PMID: 21098813

Abstract

Holistic processing, i.e. the tendency to process objects as wholes, is associated with face perception and also with expertise individuating novel objects. Surprisingly, recent work also reveals holistic effects in novice observers. It is unclear whether the same mechanisms support holistic effects in experts and in novices. Here, we measured holistic processing of music sequences using a selective attention task in participants who vary in music reading expertise. We found that holistic effects were strategic in novices but relatively automatic in experts. Correlational analyses revealed that individual holistic effects were predicted by both individual music reading ability and neural responses for musical notation in the right fusiform face area (rFFA), but in opposite directions for experts and novices, suggesting that holistic effects in the two groups may be of different nature. To characterize expert perception, it is important to measure not only the tendency to process objects as wholes but to test whether this effect is dependent on task constraints.

Keywords: holistic processing, music reading, perceptual expertise, Chinese character, rFFA


Holistic processing, the tendency to process objects as wholes rather than as parts, is regarded as a hallmark of face recognition (Farah, Wilson, Drain, & Tanaka, 1998; Maurer, Grand, & Mondloch, 2002; Young, Hellawell, & Hay, 1987). Holistic processing has different meanings in psychology, sometimes conceptualized as better recognition performance for object parts in the context of a whole object (the part-whole effect; Tanaka & Farah, 1993) or as evidence that patterns are not represented by a set of structural units (Chen, Allport & Marshall, 1996). Here, we are concerned with a specific definition of holistic processing whereby observers are shown to be unable to selectively attend to part of an object (as in the composite effect; Young et al., 1987). Such failures of selective attention are associated with perceptual expertise for various non-face object categories including cars (Gauthier, Curran, Curby, & Collins, 2003), fingerprints (Busey & Vanderkolk, 2005) and novel objects such as Greebles and Ziggerins (Gauthier, Williams, Tarr, & Tanaka, 1998; Gauthier & Tarr, 2002; Wong, Palmeri, & Gauthier, 2009). Holistic processing effects are also stronger for those faces with which we have the most experience, such as faces of one's own race (Michel, Rossion, Han, Chung, & Caldara, 2006; Tanaka, Kiefer & Bukach, 2004) or one's own age (de Heering & Rossion, 2008).

Although holistic processing is found to increase with perceptual experience, other work suggests that holistic effects can be found with unfamiliar objects. For example, holistic effects were obtained with novel objects (Greebles) when they were presented in the context of faces, or if they were first encoded as two misaligned halves rather than as a whole object (Richler, Bukach & Gauthier, 2009). In another study, people with no expertise with Chinese characters processed them holistically, while experts did not (Hsiao & Cottrell, 2009). Holistic effects in novices appear inconsistent with an expertise account of holistic processing. However, are the causes of these holistic effects in novices similar to those in perceptual experts?

Here, we take a first empirical step towards addressing this question, as a more theoretical approach would be difficult without a consensus regarding the mechanisms underlying holistic processing, even in the case of face perception (Richler, Gauthier, Wenger & Palmeri, 2008; Jacques & Rossion, 2009)1. We defined holistic processing as a failure of selective attention because this specific definition has been used both in studies where holistic effects depend on expertise (e.g., Wong et al., 2009) and in others where the effect was obtained in novices (Richler et al., 2009; Hsiao & Cottrell, 2009). We asked participants to selectively attend to part of an object (either the top or bottom half) and ignore the irrelevant part. Holistic processing is thus indexed as the difference between congruent and incongruent trials, in which the irrelevant part either led to an identical or conflicting response to that for the target part respectively (the congruency effect, CE; Cheung, Richler, Palmeri & Gauthier, 2008; Richler, Gauthier, Wenger & Palmeri, 2008; Richler, Tanaka, Brown & Gauthier, 2008; Wong, Palmeri & Gauthier, 2009).

To compare holistic effects in novices and in experts, it is important to note that failures of selective attention occur in many different circumstances and tasks, but that not all of the CEs indicate a perceptual tendency of processing objects as wholes (i.e., holistic processing). For example, one can obtain a CE when Greebles are viewed in the context of aligned faces but not for misaligned faces (Richler et al., 2009), suggesting that the CE for Greebles was induced under a specific context rather than reflecting a stable perceptual tendency. Also, with any selective attention paradigm, a congruency effect may simply arise because of competing responses, similar to the classic Stroop effect where a reading response interferes with the response to ink color (MacLeod, 1991), which, again, is unrelated to perceptual tendency. In contrast, holistic effects obtained in perceptual experts, at least in the case of expert face perception, appear to be more automatic and remain relatively stable across various contexts. First, holistic effects for faces are relatively unaffected by task constraints that should influence the top-down deployment of attention, for instance whether participants are asked to attend to one part of the face throughout the study (e.g. Michel et al., 2006) or asked to switch attention unpredictably to either the top or bottom part of a face (e.g. Richler et al., 2009). Second, holistic effects for faces cannot be explained by Stroop-like response interference but rather by perceptual interference (Richler, Cheung, Wong & Gauthier, 2009). Overall, this led us to hypothesize that holistic effects in experts are more automatic and more stable across various conditions, while holistic effects in novices can be obtained sometimes but not others, depending on the specific contexts, strategies and task constraints.

To test this hypothesis, we sought an object category for which both experts and novices might produce holistic effects (since in prior studies, novices or experts showed holistic effects on the same task but with different material). We chose to explore the domain of musical notation for a number of reasons. We thought holistic processing may be engaged in expert music readers as it is for faces, because of the critical importance of spatial relationships between notes (Sloboda, 1978). Short sequences of musical notation are also roughly comparable in visual complexity to the Chinese characters that produced holistic effects in novices in Hsiao & Cottrell (2009) and like these stimuli, they are composed of parts that each hold limited meaning for novices. Fluency with musical notation varies greatly across individuals (even among experts), allowing us to examine how holistic effects vary as a function of music reading skill.

Here, we tested the ability to selectively attend to part of a four-note sequence in music reading experts and novices with a sequential matching task, and whether it is stable across contexts prompting for different attention strategies. On each trial, two four-note sequences were presented sequentially, and participants were asked to judge whether the target note (one of the four notes cued by two arrows in the second sequence) was the same or different from the equivalent note in the first sequence. Similar to previous studies (Cheung, Richler, Palmeri & Gauthier, 2008; Richler, Gauthier, Wenger & Palmeri, 2008; Richler, Tanaka, Brown & Gauthier, 2008; Wong, Palmeri & Gauthier, 2009), congruency was manipulated by shifting an irrelevant part (a note adjacent to the target note) which led to a response congruent or incongruent to the correct response, and we used the congruency effect (the CE) to index the degree of holistic processing.

In two experiments, we created different contexts by manipulating target position (central or peripheral in the sequence) and target distribution (mostly at center, mostly in periphery, or evenly distributed in four positions), and thus encouraged participants to pay more attention to some notes than others. This manipulation serves to reveal the extent to which the obtained holistic effects are dependent on various attention strategies. In particular, if the cause of holistic effects is strategic, the magnitude of the CE should be different as a function of target position and target distributions. Specifically, the CE should be small when targets appear in attended locations (encouraged by higher target distribution in those locations), because the distractors then fall into locations that are relatively ignored. When targets appear in locations that are relatively ignored, however, the distractors fall in areas that “receive” more attention and are thus more likely to interfere. Therefore, a contextual CE would be stronger when targets appear in relatively infrequent locations. In contrast, if holistic effects are relatively automatic, the magnitude of the CE should be relatively similar across target positions and target distributions, even though different attentional strategies are expected. To preview our results, we found that holistic effects can be obtained with music sequences in both experts and novices. Across two experiments, we found that holistic effects for novices were highly strategic, while they were more stable across contexts in experts.

Experiment 1

Method

Participants

Sixty-six participants were recruited from Vanderbilt University and the Nashville community (age 18-40) for course credits or cash payment. All participants reported amount of experience in music reading and rated their music-reading ability (1 = do not read music at all; 10 = expert in music reading). They were subsequently divided into an expert group (N=25, 6 males and 19 females, with years of experience ≥ 8 and a self-rating score ≥ 8), a novice group (N=22, 7 males and 15 females, participants reported that they could not read music), and an intermediate group (N=19, 8 males and 11 females, participants who read music but did not meet the expert criteria). All reported normal or corrected-to-normal vision and gave informed consent according to the guidelines of the institutional review board of Vanderbilt University.

Stimuli & Design

The experiment was conducted on Power Mac G4 computers using Matlab (Natick, MA) with the Psychophysics Toolbox extension (Brainard, 1997; Pelli, 1997). Forty sets of four-note music sequences were generated with ‘Finale notepad’ (http://www.finalemusic.com/notepad/), and the notes were linked up with a straight line (Fig. 1). The pitch of the notes was randomly generated, under the constraint that pairs of adjacent notes were at least two steps apart, so that changes would not alter the global contour of the music sequences. The notes ranged from the note below the bottom line (a ‘D’) to the note above the top line (a ‘G’). All stimuli were black and shown on a white background at approximately 4.8° × 4.8° of visual angle.

Figure 1.

The sequential matching paradigm for Experiment 1, in which (a) illustrates a different congruent trial (both the cued target note and an irrelevant note were shifted in the 2nd sequence); and (b) shows a same incongruent trial (the cued target note did not change but an irrelevant note was shifted in the 2nd sequence). (c-d) are examples of the music sequences used in Experiment 2, in which the notes are either (c) connected or (d) disconnected.

A sequential matching paradigm was used. On each trial, a fixation cross was shown for 500ms, a first note sequence for 750ms, a mask for 500ms, and a second note sequence for 2500ms (Fig. 1 a-b). A target note in the 2nd sequence was cued with two arrows. Participants were asked to judge whether the target note was the same or different from the equivalent note in the first sequence. Half of the trials were ‘same’ trials with the target note unchanged, the other half were ‘different’ trials with the target note shifted one step up or down. Participants were instructed to respond only according to the matching status of the target note by key press. Both speed and accuracy were emphasized and responses were required within 2500ms after the onset of the 2nd sequence, or were counted as errors (< 1% of the trials).

Three factors were manipulated, with Group as a between-subject factor (experts / intermediates / novices), and Congruency (congruent / incongruent) and Target Position (center / periphery) as two within-subject factors. Targets appeared in the two center positions of the sequence (the 2nd or the 3rd note) on 75% of the trials, and in the peripheral positions (the 1st or the 4th note) on 25% of the trials. This context was intended to encourage relatively more attention to notes in the central positions than to those in the periphery. A note adjacent to the target was considered the “distractor” (left or right counterbalanced if the target was one of the central two notes). In the 2nd sequence, the distractor note could be shifted one step up or down, resulting in different congruency conditions. Specifically, on congruent trials, the distractor note remained unchanged (compared to the 1st sequence) on ‘same’ trials while it changed on ‘different’ trials. For incongruent trials, the distractor note changed on ‘same’ trials and remained unchanged on ‘different’ trials. Dependent measures were sensitivity (d’) and response time (RT) for correct responses. Holistic processing was defined as the congruency effect (the CE), using the difference in performance (d’ or RT) between congruent trials and incongruent trials. If the CE for novices is due to context-dependent strategies, the CE should be modulated by target position, while the CE for experts should be minimally modulated if it is relatively automatic. There were a total of 256 trials, with 64 trials for each of the four within-subject conditions. Ten practice trials with feedback were included, followed by test trials without feedback.
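The congruency rule described above can be written out as a small function. This is only an illustrative sketch in Python (the study itself was run in Matlab); the function and argument names are hypothetical and do not come from the original experiment code.

```python
def congruency(target_changed: bool, distractor_changed: bool) -> str:
    """Label a trial by whether the distractor supports the correct response.

    A trial is congruent when the distractor's status matches the target's:
    unchanged on 'same' trials, changed on 'different' trials.
    """
    return "congruent" if target_changed == distractor_changed else "incongruent"


# The four trial types implied by crossing target status with distractor status.
trial_types = {
    ("same", "distractor unchanged"): congruency(False, False),
    ("same", "distractor changed"): congruency(False, True),
    ("different", "distractor changed"): congruency(True, True),
    ("different", "distractor unchanged"): congruency(True, False),
}

for (target, distractor), label in trial_types.items():
    print(f"{target:9s} | {distractor:20s} -> {label}")
```

Under this rule, a ‘same’ trial with a shifted distractor (as in Fig. 1b) is incongruent, and a ‘different’ trial with a shifted distractor (as in Fig. 1a) is congruent.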

Results and Discussion

Data for two novices and one intermediate music reader were excluded because their overall accuracy was lower than 60%. Examination of the stimuli post-experiment revealed that shifting the position of the target or distractor note in the 2nd sequence caused a change of the angle of the line (connecting the four notes) between the two sequences in 27 of the 256 trials. To eliminate the possible confound of the angle change with our manipulations, these trials were excluded from data analysis.

For each condition, the CE was computed using delta d’ (congruent d’ – incongruent d’) or delta RT (incongruent RT – congruent RT). A 3×2 ANOVA (Group × Position) on delta d’ revealed a main effect of Target Position, F (1,60) = 8.18, p = .006, with a larger CE for periphery-target trials. The interaction between Group and Target Position was significant, F (2,60) = 4.17, p = .020 (Fig. 2). Scheffé tests (p < .05) revealed that the delta d’ was larger for the periphery-target trials than the center-target trials only for novices. Also, the delta d’ for the periphery-target trials was larger for novices than for the intermediate group (p<.05), and marginally larger than for the expert group (p = .088). No main effect or interaction reached significance for delta RT.
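As a rough illustration of how the two CE measures are formed, the sketch below computes d’ from response counts and then takes the differences defined above. Several details are assumptions rather than the authors' procedure: the text does not say how hit or false-alarm rates of 0 or 1 were handled, so a log-linear correction is used here, and all counts and RTs are made-up example values. Whether a "hit" is defined on ‘same’ or on ‘different’ trials does not change the resulting d’.

```python
from statistics import NormalDist


def d_prime(hits: int, n_signal: int, false_alarms: int, n_noise: int) -> float:
    """Sensitivity d' = z(hit rate) - z(false-alarm rate).

    Assumption: a log-linear correction (add 0.5 to counts, 1 to totals)
    keeps the rates away from 0 and 1; the paper does not specify one.
    """
    hit_rate = (hits + 0.5) / (n_signal + 1)
    fa_rate = (false_alarms + 0.5) / (n_noise + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


def delta_d_prime(d_congruent: float, d_incongruent: float) -> float:
    """CE on sensitivity: congruent d' minus incongruent d'."""
    return d_congruent - d_incongruent


def delta_rt(rt_congruent: float, rt_incongruent: float) -> float:
    """CE on response time: incongruent RT minus congruent RT."""
    return rt_incongruent - rt_congruent


# Hypothetical counts for one participant in one condition.
d_cong = d_prime(hits=28, n_signal=32, false_alarms=4, n_noise=32)
d_incong = d_prime(hits=24, n_signal=32, false_alarms=8, n_noise=32)
print("delta d':", delta_d_prime(d_cong, d_incong))
print("delta RT (ms):", delta_rt(rt_congruent=600.0, rt_incongruent=650.0))
```

Both differences are signed so that a positive value indicates interference from the irrelevant note.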

Figure 2.

The mean delta d’ (a) and delta RT (b) for Experiment 1. Error bars show the 95% CI for the within-subject effects for the Group × Position interaction.

Planned t-tests were performed for each condition to see if the delta d’ or delta RT was reliably different from zero, i.e., whether the CE was significant in each condition. Results revealed that only the delta d’ for novices and experts in periphery positions were significant (p < .05).

Our results suggest that the CE can be found with music sequences for participants with all levels of expertise. Critically, target position affected the CE for novices but not that for experts, supporting our hypothesis that the CE was more strategic in novices than in experts.

Experiment 2

In Experiment 2, target distribution was manipulated parametrically to further test the hypothesis that the CE is more influenced by context in novices than in experts. Furthermore, we tested whether connecting the notes influences the CE by facilitating the spread of attention between notes within sequences (Egly, Driver, & Rafal, 1994). We also assessed whether the magnitude of the CE is predicted by music reading ability (perceptual fluency with musical notation measured in both experts and novices) and years of experience in music reading (in experts only). Finally, we studied the neural correlates of the CE for music sequences by analyzing the results for 9 music reading experts and 9 novices who participated in the present study as well as in a previous fMRI experiment with musical notation (Wong & Gauthier, 2010). We focused our analyses on the right fusiform face-selective area (rFFA), in which increased activity is related to holistic processing of faces (Rotshtein, Geng, Driver & Dolan, 2007; Schiltz & Rossion, 2006) and other objects of expertise (Gauthier & Tarr, 2002; Wong, Palmeri, Rogers, Gore, & Gauthier, 2009).

Method

Participants

Nineteen participants who had more than 10 years of music reading experience and/or considered themselves music reading experts were recruited as experts (8 males and 11 females; 18-32 years old) and 19 participants who reported being unable to read music were recruited as novices (4 males and 15 females, 18-33 years old) for cash payment. Similar to Experiment 1, experts had more years of experience in music reading (on average 13.8 years for experts and 0.42 year for novices) and higher self-rating ability (on average 8.84 for experts and 1.21 for novices on a 10-point scale). All reported normal or corrected-to-normal vision and gave informed consent according to the guidelines of the institutional review board of Vanderbilt University.

Stimuli and Design

The experiment was conducted on Power Mac G4 computers using Matlab (Natick, MA) with the Psychophysics Toolbox extension (Brainard, 1997; Pelli, 1997). A total of 768 four-note music sequences were randomly generated using Matlab under the constraint that all the notes would have the same valid pointing direction before and after any shift. All stimuli were black on a white background and subtended a visual angle of about 7.2° × 4.8°.

A sequential matching paradigm was used as in Experiment 1, except that the second sequence in each trial was presented for 2000ms instead of 2500ms. Five factors were manipulated, with Group as a between-subject factor (experts / novices), and Congruency (congruent / incongruent), Target Position (center / periphery), Target Distribution (25p75c / 50p50c / 75p25c; see below) and Connection (connected / disconnected) as four within-subject factors. The manipulation of Congruency and Target Position were identical to Experiment 1. Target Distribution had three levels and was manipulated across blocks of trials. Within each block, targets were distributed either 25% on periphery and 75% at center (25p75c; the distribution used in Experiment 1), 50% on periphery and 50% at center (50p50c), or 75% on periphery and 25% at center (75p25c). The order of the three target distributions was counterbalanced across participants within each group. Participants were told about the target distribution immediately before each block. The four notes were either connected with a horizontal line parallel to the staff or not (Fig. 1c & 1d). Participants were asked to ignore the irrelevant non-target notes that might interfere with their judgment. Dependent measures were sensitivity (d’) and response time (RT) for correct responses. There were a total of 768 trials, with 32 trials for each of the four within-subject conditions. Ten practice trials with feedback were included before the test trials (without feedback).

Measure of perceptual fluency

We quantified perceptual fluency with music sequences in a sequential matching paradigm with 4-note music sequences (Wong & Gauthier, 2010). On each trial, a fixation cross was presented at the center of the screen for 200ms, followed by a 500ms pre-mask, a target four-note sequence for a varied duration and, after a 500ms post-mask, two four-note sequences appeared side-by-side, one identical to the first sequence, and the other with one of the notes shifted by one step (randomly chosen out of the four notes, with the up/down shifts counterbalanced). The task was to select the matching sequence by key press. We estimated a perceptual threshold using QUEST (Psychtoolbox; Watson & Pelli, 1983) as the duration of the target sequence required to keep performance at 80% accuracy. Sequences were randomly generated, using notes ranging from the note below the bottom line (a ‘D’ note) to the note above the top line (a ‘G’ note). Contrast for all the stimuli was lowered by about 60% to avoid a ceiling effect. The threshold was measured twice, each with 100 trials, and the two thresholds were averaged.

To control for individual differences not specifically tied to expertise with notes, perceptual fluency for four-letter strings was also measured in an identical procedure. The four-letter strings were randomly generated with 11 letters: b,d,f,g,h,j,k,p,q,t,y. These letters were selected because they contain parts extending upward or downward, similar to musical notation. To create the distractor string, one of the four letters was chosen (counterbalanced across stimuli) and replaced by a different letter randomly drawn from the set. The string stimuli were also shown with the same lowered contrast as the note sequences.

Results and Discussion

One novice was excluded from further analysis because of an overall accuracy less than 60%. Similar to Experiment 1, the CE was measured with delta d’ (congruent d’ – incongruent d’) or delta RT (incongruent RT – congruent RT). Examination of the data revealed no speed-accuracy trade-off.

A 2×3×2×2 ANOVA (Group × Distribution × Target Position × Connection) was performed on both delta d’ and delta RT. None of the main effects or interactions involving Connection reached significance (all ps > .05) and therefore, we summarize here the results of 2×3×2 ANOVAs (Group × Distribution × Target Position) on delta d’ and delta RT.

In terms of delta d’, the main effect of Group approached significance, F (1,35) = 3.83, p = .058, with the CE for novices larger than for experts. Also, a main effect of Distribution was obtained, F (2,70) = 4.49, p = .015, which interacted with Group, F (2,70) = 3.57, p = .033, with the CE for experts unaffected by target distribution, while that for novices was higher for 25p75c than 75p25c (Fig. 3a).

Figure 3.

The mean delta d’ (a) and delta RT (b) for Experiment 2. Error bars show the 95% CI for the within-subject effects for the Group × Position interaction.

For delta RT, a main effect of Distribution was significant, F (2,70) = 4.57, p = .014, which interacted with Target Position, F (2,70) = 4.16, p = .020. More interestingly, the 3-way interaction of Group × Distribution × Target Position was significant for delta RT, F (2,70) = 5.96, p = .004 (Fig. 3b). We subsequently performed the 3×2 ANOVA (Distribution × Target Position) separately for the two groups. For novices, the interaction between Distribution and Target Position was significant, F (2,34) = 7.51, p = .002 (Fig. 3b). Scheffé tests (p<.05) revealed that the CE for novices was modulated by target likelihood, as the CE increased when the target appeared in relatively unexpected positions (for 75p25c, the CE was larger for center-target trials than periphery-target trials; for 25p75c, the CE was larger for periphery-target trials than center-target trials). Also, the CE for periphery-target trials was larger for 25p75c than 50p50c and 75p25c. For experts, the CE was affected by Distribution, F (2,36) = 3.41, p = .04 (Fig. 3b). Post-hoc tests (LSD, p<.05) revealed that the CE for 25p75c was larger than 75p25c. No other main effects or interactions reached significance.

Planned t-tests were performed for each condition to see if the delta d’ or delta RT was reliably different from zero, i.e., whether the CE was significant in each condition. Results revealed that novices produced significant CEs (p < .05) in various conditions on both delta d’ (center-target trials in 25p75c, and periphery-target trials in 50p50c and 75p25c) and delta RT (center-target trials in all target distributions, and periphery-target trials in 25p75c). For experts, the CE was significant in some conditions on delta RT (periphery-target trials in 25p75c, and center-target trials in 25p75c and 50p50c) (p < .05).

Our results replicated the findings of Experiment 1 in that significant CEs were found for both experts and novices, with a stronger CE for novices than for experts. Both experiments converged to suggest that the CE in novices was modulated by target position. However, the effect was obtained with delta d’ in Experiment 1 and with delta RT in Experiment 2. This discrepancy could arise from differences in the design of the two experiments: (1) the target distribution was implicitly manipulated in Experiment 1, while participants in Experiment 2 were explicitly informed about the target distribution manipulations; or (2) only one target distribution was used in Experiment 1, while three different target distributions (manipulated in different blocks) were used in Experiment 2, which might have led novices to adjust their strategies.

Both experts and novices were influenced by target distribution, with larger CEs in blocks where targets were more frequent in the center than periphery of the sequence (this effect was found in delta d’ in novices and delta RT in experts). This could be due to the fact that the distractor note was always adjacent to the target note. When periphery targets were more frequent, distractors never appeared in any of the attended periphery positions. However, in blocks where central targets were more frequent, distractors appeared in one of the more attended central positions for half of the trials, possibly leading to a larger CE for all participants.

More importantly, delta RT revealed that the CE for novices depended on whether the target appeared in an expected position (determined by both target distribution and target position), supporting our hypothesis that the novice CE is caused by context-dependent attentional strategies. In contrast, the absence of this target likelihood effect for experts is consistent with our hypothesis that the CE for experts is more automatic.

Slope analysis

To quantify the degree to which holistic effects are dependent on context, we examined the extent to which the CE was modulated across different target distributions in each group. For each participant, the slope of delta d’ (slopeDeltaD’) and delta RT (slopeDeltaRT) for either the center or periphery positions was calculated across the distributions of 25p75c, 50p50c and 75p25c using the least-squares method, with linear weights of [-1, 0, 1] for the three distributions respectively.
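The slope computation can be sketched as follows. With weights of -1, 0 and 1 (which sum to zero), the least-squares slope reduces to the weighted sum of the CEs divided by the sum of squared weights, i.e. half the difference between the 75p25c and 25p75c values. The function name and the example CE values are hypothetical; this is an illustration of the calculation, not the authors' analysis script.

```python
# Linear weights for the three target distributions, in the order
# 25p75c, 50p50c, 75p25c (as given in the text).
WEIGHTS = (-1.0, 0.0, 1.0)


def ce_slope(ce_by_distribution):
    """Least-squares slope of the CE across the three target distributions.

    Because the weights are centered on zero, the slope of the best-fitting
    line is sum(w * ce) / sum(w ** 2).
    """
    if len(ce_by_distribution) != len(WEIGHTS):
        raise ValueError("expected one CE value per target distribution")
    numerator = sum(w * ce for w, ce in zip(WEIGHTS, ce_by_distribution))
    denominator = sum(w * w for w in WEIGHTS)
    return numerator / denominator


# Hypothetical delta RT values (ms) for one participant, center-target trials.
example_delta_rt = [10.0, 35.0, 70.0]
print("slopeDeltaRT:", ce_slope(example_delta_rt))
```

A slope near zero means the CE is stable across target distributions; a large positive or negative slope means the CE changes systematically with the distribution.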

A 2×2 ANOVA (Group × Target Position) was performed on slopeDeltaD’ & slopeDeltaRT separately. The main effect of Group was significant for slopeDeltaD’, F (1,35) = 6.37, p = .016, such that the magnitude of the slopeDeltaD’ was larger for novices than experts (Fig. 4a). For slopeDeltaRT, the main effect of Target Position was significant, F(1,35) = 7.31, p = .01, and the effect of Target Position interacted with Group, F(1,35) = 13.2, p < .001 (Fig. 4b). Scheffé tests (p < .05) revealed that the slopeDeltaRT was larger for center-target trials than periphery-target trials for novices, but it stayed similar across target positions for experts.

Figure 4.

The slope of delta d’ (a) and the slope of delta RT (b) across the distributions of 25p75c, 50p50c and 75p25c generated using the least-squares method. Error bars show the 95% CI for the within-subject effects of the Group × Position interaction.

This further supports our hypothesis that the CE for novices is context-dependent, while that for experts is more stable across different contexts.

Correlation analyses

I. Conditions included in the correlation analyses

Our goal was to test the relationship between the CE and (1) music reading ability measured with perceptual fluency; (2) years of music reading experience; and (3) the neural activity in the rFFA. In our design, there were 12 conditions from which we could extract the CE data for correlation analyses (2 Target Position × 3 Target Distribution, on delta d’ or delta RT). To keep the number of statistical comparisons to a minimum, we reduced the number of contrasts by focusing on 5 conditions that reflected the largest group differences in the CE (based on the above behavioral findings). First, we focused on the CE measured with delta RT as it expressed the most important contextual differences between the two groups. To capture the effects for experts (modulated by target distribution irrespective of target positions), we included the 3 target distribution conditions with the CE for the two target positions averaged. To capture the effects for novices (modulated by target likelihood), we recoded our conditions into 3 levels of target likelihood (unlikely / equal / likely), in which ‘unlikely’ trials refer to periphery-target trials for 25p75c and center-target trials for 75p25c, ‘likely’ trials refer to center-target trials for 25p75c and periphery-target trials for 75p25c, and ‘equal’ trials refer to 50p50c for both center- and periphery-target trials. Note that the condition of 50p50c is in fact identical to the ‘equal’ condition.
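The recoding from target distribution and target position into target likelihood can be expressed as a lookup table. In this sketch the mapping follows the text, but the way the two contributing cells are combined (a simple average of their CEs) is an assumption, as are the variable names and the example values.

```python
# (target distribution, target position) -> target likelihood, as described in the text.
LIKELIHOOD = {
    ("25p75c", "periphery"): "unlikely",
    ("75p25c", "center"): "unlikely",
    ("25p75c", "center"): "likely",
    ("75p25c", "periphery"): "likely",
    ("50p50c", "center"): "equal",
    ("50p50c", "periphery"): "equal",
}


def recode_by_likelihood(ce_by_cell):
    """Collapse the six distribution-by-position cells into three likelihood levels.

    Assumption: the CEs of the two cells feeding each level are averaged.
    """
    grouped = {}
    for cell, ce in ce_by_cell.items():
        grouped.setdefault(LIKELIHOOD[cell], []).append(ce)
    return {level: sum(values) / len(values) for level, values in grouped.items()}


# Hypothetical delta RT values (ms) for one participant.
example_ce = {
    ("25p75c", "periphery"): 40.0,
    ("75p25c", "center"): 60.0,
    ("25p75c", "center"): 10.0,
    ("75p25c", "periphery"): 20.0,
    ("50p50c", "center"): 30.0,
    ("50p50c", "periphery"): 50.0,
}
print(recode_by_likelihood(example_ce))
```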

To summarize, we included the CE measured with delta RT in 5 different experimental conditions (25p75c / 50p50c or equal / 75p25c / unlikely / likely) in the correlation analyses between the CE and (1) perceptual fluency with musical notation (in both experts and novices); (2) years of experience in music reading (in experts only); and (3) neural activity in the rFFA.

II. Predicting the CE with perceptual fluency for music sequences

Two novices failed to participate in this part of the experiment. One novice and two experts were excluded because they had a perceptual threshold more than 3 s.d. away from the mean of the rest of their group. Therefore 17 experts and 16 novices were included in the correlational analyses.

As expected, experts could match music sequences faster than novices (mean = 239.3ms for experts; 868.4ms for novices), confirmed by a significant 1-way ANOVA for Group on the perceptual threshold for music sequences, F (1,31) = 102.5, p ≤ .0001. In contrast, the perceptual threshold for letter strings did not differ between groups (p > .1).

We considered the relationship between delta RT in each of five diagnostic conditions and the difference between the perceptual threshold for notes relative to that for letters (N-L). For experts, N-L was negatively correlated with delta RT for 75p25c (r = -.566, p = .018) (Fig. 5a). For novices, in contrast, N-L was positively correlated with delta RT in the same condition (75p25c; r = .699, p = .003) (Fig. 5a), as well as the target-likely condition (r = .647, p = .007). Thus, the CE decreases with perceptual fluency in novices but it increases with perceptual fluency in experts.
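To make the direction of these correlations concrete, the sketch below forms the N-L difference score and correlates it with delta RT using a Pearson coefficient. All data values are invented for illustration and the function names are hypothetical. Note that a lower N-L score means higher fluency with notes, so a negative correlation (as in experts) means the CE grows with fluency.

```python
from math import sqrt


def notes_minus_letters(note_thresholds_ms, letter_thresholds_ms):
    """N-L: perceptual threshold for notes minus that for letters, per participant."""
    return [n - l for n, l in zip(note_thresholds_ms, letter_thresholds_ms)]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data for four participants (not values from the study).
note_thresholds = [200.0, 250.0, 300.0, 350.0]
letter_thresholds = [150.0, 150.0, 150.0, 150.0]
delta_rt_75p25c = [40.0, 30.0, 20.0, 10.0]

n_minus_l = notes_minus_letters(note_thresholds, letter_thresholds)
print("r =", pearson_r(n_minus_l, delta_rt_75p25c))
```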

Figure 5.

Correlations between holistic processing and (a) perceptual fluency for experts (black dots and solid line) and novices (open circles and dotted line) in 75p25c; (b) neural selectivity for music sequences in the rFFA for experts in 75p25c; and (c) neural selectivity for music sequences in the rFFA for novices in the ‘unlikely’ condition.

III. Predicting the CE with years of music reading experience

To test whether the CE pattern in experts relates to musical training, the CE was correlated with years of experience in music reading (ranging from 7 to 21 years) for the expert group only. More years of experience were associated with a larger CE, although the correlation was only marginally significant (r = .473, p = .055). This is consistent with the possibility that the CE in experts is due, at least in part, to musical training (rather than a priori individual differences).
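The marginal status of this correlation can be checked by converting r to a t statistic, t = r·sqrt(n − 2)/sqrt(1 − r²). The sketch below assumes that the 17 experts retained for the correlational analyses contributed to this test (the text does not restate n here); the function name is hypothetical.

```python
from math import sqrt

def t_from_r(r, n):
    """t statistic (df = n - 2) for testing a Pearson correlation against zero."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

# r from the text; n = 17 is an assumption based on the retained expert sample.
t_experience = t_from_r(0.473, 17)

# Standard two-tailed critical value of t for alpha = .05 with df = 15.
T_CRIT_DF15 = 2.131
is_significant = abs(t_experience) > T_CRIT_DF15
```

Under this assumption t is about 2.08, just below the critical value, which agrees with the reported p = .055.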

IV. Predicting the CE with visual selectivity in the rFFA

To test whether holistic processing of music sequences is related to neural activity in the rFFA, as found for faces (Rotshtein et al., 2007; Schiltz & Rossion, 2006) and other non-face object categories (Gauthier & Tarr, 2002; Wong et al., 2009), we analyzed the relationship between the CE in the present study and the neural selectivity for musical sequences investigated in an fMRI study, in which 9 experts and 9 novices from the present study also participated. Details of the fMRI methods can be found in Wong & Gauthier (2010). Briefly, the neural selectivity for music sequences (five-note music sequences vs. five-letter strings and five-symbol strings) was measured within the group-defined rFFA (with separate localizer runs) while participants performed simple visual judgments on the stimuli (either detecting immediate repeats of images or a gap on the five-line background that existed in all stimuli). The results were combined across tasks, both in the original fMRI study and here, ensuring some degree of task-general selectivity in the fMRI measure.

For experts, delta RT was negatively correlated with neural selectivity for music sequences in the rFFA for 75p25c (r = -.942, p < .001; Fig. 5b), and a similar negative correlation approached significance for 2 other conditions (50p50c or ‘equal’ trials, r = -.648, p = .059; ‘unlikely’ trials, r = -.605, p = .084). For novices, in contrast, delta RT was positively correlated with neural selectivity for music sequences for ‘unlikely’ trials (r = .902, p = .001; Fig. 5c) and 25p75c (r = .669, p = .049). These results are surprising in many ways. Despite the fact that the rFFA was not selective for musical notation even in individuals with musical experience (Wong & Gauthier, 2010), activity in this region predicts holistic effects with musical notation. In addition, this relationship goes in opposite directions for novices and experts.

Discussion

In two experiments, we demonstrated that holistic effects can be obtained both in novices and in experts with the same task and object category. While the magnitude of holistic effects in at least some of our experimental conditions was predicted by music reading ability and by neural activity for music sequences in the rFFA, the relationship ran in opposite directions for experts and novices. Our findings suggest that, even though holistic effects in novices can reach a comparable (or even larger) magnitude than in experts, the holistic effects for the two groups are of different nature. Congruency effects in novices are highly dependent on experimental context with effects that appear more strategic, while in experts they appear much more stable and automatic.

In novices, holistic effects may be caused by relatively slow processing of music sequences, forcing participants to attend to notes that correspond to the most probable target positions. This is suggested by the fact that the CE for novices increased when targets appeared in unexpected positions and by the fact that perceptual fluency for music sequences is inversely related to the CE in novices (this strategy would be less necessary for novices who process notes more efficiently). In other words and paradoxically, holistic effects for novices may be caused by higher selective attention to part of the objects, instead of a failure of selective attention, as is the case in face perception (Richler, Tanaka, Brown & Gauthier, 2008).

In music reading experts, we observed relatively automatic holistic processing, which increases with perceptual fluency for musical notes and with the number of years of musical training. To speculate, this may reflect an increased tendency to process relative positions of notes in music sequences as one gains in proficiency with musical notation. Because our congruency manipulations, i.e., the shift of task-irrelevant notes, always change the relative positions of notes, these changes may be particularly salient to advanced music readers. Intuitively, music reading experts may have developed increased sensitivity to the configuration of notes to allow faster recognition of familiar patterns (e.g. scales or melodic lines of a piece), to enhance prediction of what different note sequences sound like (by processing the intervals between notes) and to enable effective planning of motor execution. In other words, music reading experts may have developed the ability to identify both individual parts and the configural information within a sequence, not unlike what is observed in face recognition (Young et al., 1987; Hayward, Rhodes & Schwaninger, 2008).

It is important to realize that the present attempt to relate holistic effects to neural responses for notes was primarily exploratory, taking advantage of the same people participating in our behavioral and fMRI studies, and that it mainly focused on the ROI associated with holistic processing in prior work. Nonetheless, the correlations with activity in the rFFA are interesting for a number of reasons. First, they are consistent with correlations with perceptual fluency (and the rest of our behavioral results) in suggesting a difference in the nature of holistic effects for novices and experts. Second, they reveal that activity in the rFFA is not necessarily positively correlated with holistic processing. One possibility to explain the negative correlation of rFFA activity and holistic processing in experts is that for expert readers of musical notation, the many codes (visual, auditory, motor) available for musical notation reduce reliance on this extrastriate area. After all, in our previous work, the FFA did not prove to be an area selective for musical notation in expert music readers (Wong & Gauthier, 2010), consistent with the idea that different training experiences lead to different areas being recruited in the visual system (Wong et al., 2009). In addition to revealing many areas recruited outside the visual system by expertise with notes, our prior fMRI work also uncovered expertise effects for notes in retinotopic cortex. Changes in early visual cortex could be related to the need for rapid processing of musical notation in both foveal and parafoveal visual regions, or to other similarities between music reading and perceptual learning tasks that recruit retinotopic cortex (Furmanski et al., 2004; Schiltz et al., 1999; Schwartz et al., 2002; Sigman et al., 2005). Further studies are needed to investigate the specific role of the FFA in holistic processing across a variety of domains.

Our findings address an interesting paradox in the study of holistic processing. Holistic effects have been found to increase with perceptual experience (Gauthier & Tarr, 2002; Wong et al., 2009) but have also been obtained in novices (Hsiao & Cottrell, 2009; Richler et al., 2009). We show that observing holistic effects is not sufficient evidence of underlying holistic processing, since holistic effects for notes in novices were shown to be context-dependent and quite different from the automatic holistic processing that is considered the hallmark of object and face expertise. It is possible that holistic effects in novices for Chinese characters (Hsiao & Cottrell, 2009) were likewise strategic, but this would need to be tested by varying different aspects of the task. In fact, we suggest that in investigations of holistic processing across various domains, including developmental studies, comparisons of different populations on face recognition skills, or training studies with objects, it is important to examine both the magnitude of holistic processing and whether it varies across task and contextual manipulations.

Acknowledgments

This research was supported by grants to the Temporal Dynamics of Learning Center (NSF Science of Learning Center SBE-0542013), the James S. McDonnell Foundation to the Perceptual Expertise Network, the Vanderbilt Vision Research Center (P30-EY008126), and by the National Eye Institute (2 R01 EY013441-06A2).

Footnotes

1

There is as yet no consensus regarding whether holistic effects arise due to representational factors (e.g., parts not being explicitly represented) or to more decisional effects (e.g., dependencies in the decisions that use individually represented parts).

References

  1. Brainard DH. The psychophysics toolbox. Spatial Vision. 1997;10:433–436.
  2. Busey T, Vanderkolk J. Behavioral and electrophysiological evidence for configural processing in fingerprint experts. Vision Research. 2005;45(4):431–448. doi: 10.1016/j.visres.2004.08.021.
  3. Chen YP, Allport DA, Marshall JC. What are the functional orthographic units in Chinese word recognition: The stroke or the stroke pattern? The Quarterly Journal of Experimental Psychology. 1996;49A(4):1024–1043.
  4. Cheung OS, Richler JJ, Palmeri T, Gauthier I. Revisiting the role of spatial frequencies in the holistic processing of faces. J Exp Psychol Hum Percept Perform. 2008;34(6):1327–1336. doi: 10.1037/a0011752.
  5. Egly R, Driver J, Rafal RD. Shifting visual attention between objects and locations: Evidence from normal and parietal lesion subjects. J Exp Psychol Gen. 1994;123(2):161–177. doi: 10.1037//0096-3445.123.2.161.
  6. Farah MJ, Wilson KD, Drain M, Tanaka JN. What is “Special” about Face Perception? Psychological Review. 1998;105(3):482–498. doi: 10.1037/0033-295x.105.3.482.
  7. Furmanski CS, Schluppeck D, Engel SA. Learning strengthens the response of primary visual cortex to simple patterns. Current Biology. 2004;14(7):573–578. doi: 10.1016/j.cub.2004.03.032.
  8. Gauthier I, Curby KM, Skudlarski P, Epstein RA. Individual differences in FFA activity suggest independent processing at different spatial scales. Cognitive, Affective, & Behavioral Neuroscience. 2005;5(2):222–234. doi: 10.3758/cabn.5.2.222.
  9. Gauthier I, Curran T, Curby KM, Collins D. Perceptual interference supports a non-modular account of face processing. Nat Neurosci. 2003;6(4):428–432. doi: 10.1038/nn1029.
  10. Gauthier I, Tarr MJ. Unraveling mechanisms for expert object recognition: Bridging brain activity and behavior. Journal of Experimental Psychology: Human Perception and Performance. 2002;28(2):431–446. doi: 10.1037//0096-1523.28.2.431.
  11. Gauthier I, Williams P, Tarr MJ, Tanaka J. Training “Greeble” experts: A framework for studying expert object recognition processes. Vision Research. 1998;38(15/16):2401–2428. doi: 10.1016/s0042-6989(97)00442-2.
  12. Ge L, Wang Z, McCleery J, Lee K. Activation of face expertise and the inversion effect. Psychological Science. 2006;17(1):12. doi: 10.1111/j.1467-9280.2005.01658.x.
  13. Hayward WG, Rhodes G, Schwaninger A. An own-race advantage for components as well as configurations in face recognition. Cognition. 2008;106:1017–27. doi: 10.1016/j.cognition.2007.04.002.
  14. Hsiao JH, Cottrell GW. Not all visual expertise is holistic, but it may be leftist: The case of Chinese character recognition. Psychological Science. 2009;20(4):455–463. doi: 10.1111/j.1467-9280.2009.02315.x.
  15. Levy-Agresti J, Sperry RW. Differential perceptual capacities in major and minor hemispheres. Proceedings of the National Academy of Sciences. 1968;61:1151.
  16. MacLeod CM. Half a century of research on the Stroop effect: An integrative review. Psychological Bulletin. 1991;109:163–203. doi: 10.1037/0033-2909.109.2.163.
  17. Maurer D, Grand RL, Mondloch CJ. The many faces of configural processing. Trends in Cognitive Sciences. 2002;6(6):255–260. doi: 10.1016/s1364-6613(02)01903-4.
  18. Michel C, Rossion B, Han J, Chung CS, Caldara R. Holistic processing is finely tuned for faces of one's own race. Psychological Science. 2006;17(7):608–615. doi: 10.1111/j.1467-9280.2006.01752.x.
  19. Patterson KE, Bradshaw JL. Differential hemispheric mediation of nonverbal visual stimuli. J Exp Psychol Hum Percept Perform. 1975;1:246–252. doi: 10.1037//0096-1523.1.3.246.
  20. Pelli DG. The videotoolbox software for visual psychophysics: transforming numbers into movies. Spatial Vision. 1997;10:437–442.
  21. Richler JJ, Bukach CM, Gauthier I. Context influences holistic processing of nonface objects in the composite task. Attention, Perception & Psychophysics. 2009;71(3):530–540. doi: 10.3758/APP.71.3.530.
  22. Richler JJ, Cheung OS, Wong AC-N, Gauthier I. Does response interference contribute to face composite effects? Psychonomic Bulletin & Review. 2009;16(2):258–263. doi: 10.3758/PBR.16.2.258.
  23. Rotshtein P, Geng JJ, Driver J, Dolan RJ. Role of features and second-order spatial relations in face discrimination, face recognition, and individual face skills: behavioral and functional magnetic resonance imaging data. Journal of Cognitive Neuroscience. 2007;19(9):1435–52. doi: 10.1162/jocn.2007.19.9.1435.
  24. Schiltz C, Bodart JM, Dubois S, Dejardin S, Michel C, Roucoux A, et al. Neuronal mechanisms of perceptual learning: changes in human brain activity with training in orientation discrimination. Neuroimage. 1999;9(1):46–62. doi: 10.1006/nimg.1998.0394.
  25. Schiltz C, Rossion B. Faces are represented holistically in the human occipito-temporal cortex. Neuroimage. 2006;32(3):1385–94. doi: 10.1016/j.neuroimage.2006.05.037.
  26. Schwartz S, Maquet P, Frith C. Neural correlates of perceptual learning: a functional MRI study of visual texture discrimination. Proc Natl Acad Sci USA. 2002;99(26):17137–17142. doi: 10.1073/pnas.242414599.
  27. Sigman M, Pan H, Yang Y, Stern E, Silbersweig D, Gilbert CD. Top-down reorganization of activity in the visual pathway after learning a shape identification task. Neuron. 2005;46(5):823–835. doi: 10.1016/j.neuron.2005.05.014.
  28. Sloboda JA. Perception of contour in music reading. Perception. 1978;7(3):323–331. doi: 10.1068/p070323.
  29. Waters AJ, Underwood G, Findlay JM. Studying expertise in music reading: use of a pattern-matching paradigm. Perception & Psychophysics. 1997;59(4):477–488. doi: 10.3758/bf03211857.
  30. Watson AB, Pelli DG. QUEST: A Bayesian adaptive psychometric method. Perception & Psychophysics. 1983;33:113–120. doi: 10.3758/bf03202828.
  31. Wong AC-N, Palmeri T, Gauthier I. Conditions for face-like expertise with objects: Becoming a Ziggerin expert - but which type? Psychological Science. 2009;20(9):1108–17. doi: 10.1111/j.1467-9280.2009.02430.x.
  32. Wong AC-N, Palmeri T, Rogers BP, Gore JC, Gauthier I. Beyond shape: How you learn about objects affects how they are represented in visual cortex. PLoS ONE. 2009;4(12):e8405. doi: 10.1371/journal.pone.0008405.
  33. Wong YK, Gauthier I. A multimodal neural network recruited by expertise with musical notation. Journal of Cognitive Neuroscience. 2010;22(4):695–713. doi: 10.1162/jocn.2009.21229.
  34. Young AW, Hellawell D, Hay D. Configural information in face perception. Perception. 1987;16:747–759. doi: 10.1068/p160747.
