Human Brain Mapping. 2004 May 10;22(3):216–228. doi: 10.1002/hbm.20029

Nonlinearity of FMRI responses in human auditory cortex

Thomas M Talavage 1,2,3,4, Whitney B Edmister 3,4,5
PMCID: PMC6872043  PMID: 15195288

INTRODUCTION

The hostile acoustic environment of functional magnetic resonance imaging (fMRI) has been shown to reduce the efficacy of experimentation using auditory stimuli [e.g., Shah et al., 1999]. Flexion of the gradient coils during echo‐planar image (EPI) data acquisition can produce intense noise at acoustic frequencies [e.g., Bandettini et al., 1998; Ravicz et al., 2000]. This “acoustic imaging noise” produces activity in the auditory cortex that competes with responses to a presented stimulus. Exposure to a relatively long total duration of acoustic imaging noise within a fixed unit of time (e.g., 4 sec in an 8‐sec interval) has also been shown to produce signal changes in auditory cortex on the order of 1% or greater [Talavage et al., 1999]. The simultaneous presence of two cortical responses lowers the contrast‐to‐noise ratio for the desired response, necessitating longer duration stimulus paradigms (i.e., more sample points) to achieve statistically significant results.

Temporal clustering of the acoustic imaging noise via a clustered volume acquisition [CVA; Edmister et al., 1999], also known as sparse sampling [Hall et al., 1999], can increase the measured amplitude and spatial extent of auditory cortex responses to a desired stimulus at a minimal cost to the rate of image acquisition and experiment duration. Improved blood oxygenation level‐dependent (BOLD) [Ogawa et al., 1990] responses result from extending the period of time in which the stimulus is presented without interruption by the acoustic imaging noise. In effect, clustering the acoustic imaging noise can reduce the degree (duration and amplitude) of interaction between the hemodynamic response to the acoustic imaging noise and the response to the desired stimulus, improving the ability to detect the latter response [e.g., Amaro et al., 2002; Belin et al., 1999; Di Salle et al., 2001; Eden et al., 1999; Hall et al., 2000].

We proposed previously [Edmister et al., 1999; Talavage et al., 1999] that the reduction in observed activation is a result of acoustic imaging noise raising image signal levels during periods in which the desired stimulus is absent (i.e., off periods) to levels approaching those observed during periods in which the desired auditory stimulus is present (i.e., on periods). The observation of a reduction in statistical significance under conditions of increasing duration of acoustic imaging noise per unit of imaging time (e.g., the repetition time [TR]) indicates that the hemodynamic responses to the desired and undesired stimuli are not adding in a linear fashion. Consequently, the use of traditional statistical analyses (e.g., t‐test, cross‐correlation) would be expected to result in detected differences of lesser significance.

This study was undertaken to establish that nonlinear addition of BOLD responses is the dominant source of reductions in the measurement of both amplitude and spatial extent of auditory cortex activation. Having demonstrated variations in signal changes associated with multiple levels of acoustic stimulation and acoustic noise across separate experiments, we felt that it was important to compare multiple states of acoustic stimulus presentation in a single experiment, thus providing a direct test of our hypothesis. To this end, we implemented a two‐stimulus, four‐condition paradigm that permitted examination of responses to both music and acoustic imaging noise in isolation and combination. We observed that addition of acoustic imaging noise to a music paradigm reduces detection of the cortical response to the music without necessarily producing a detectable response to the acoustic imaging noise. The current study therefore demonstrates that responses in the auditory cortex can combine in a nonlinear fashion, indicating that subtractive paradigms should be used with caution and that extensive validation of rapid‐presentation event‐related auditory fMRI experiments is warranted.

SUBJECTS AND METHODS

Imaging Parameters

All imaging experiments were conducted on a 1.5‐T General Electric (Waukesha, WI) Signa imager retrofitted for EPI (ANMR, Inc.). A single surface coil combined with headphones was placed over the left auditory cortex, as described in Talavage et al. [2000]. An asymmetric spin‐echo EPI pulse sequence (τ = −25 msec, TE = 70 msec, field of view [FOV] = 20 cm, 64 × 64 acquisition matrix) was used to acquire 48 images for each of 12 slices (thickness = 4 mm, in‐plane resolution = 3.125 mm × 3.125 mm). All experiments were conducted with a fixed repetition time (TR = 8 sec). The slices were acquired in an oblique plane, parallel to the Sylvian fissure such that the superior temporal plane, containing auditory cortex [e.g., Galaburda and Sanides, 1980], was centered in the volume. Only the central four slices of the 12‐slice volume were used for analysis. In Talavage et al. [1999], eddy currents were found to produce variable imaging volumes when a CVA was preceded by variable numbers of additional gradient readout sequences used to provide acoustic stimulation. The use of a reverse‐centric slice acquisition order (i.e., central slices acquired last) permitted acquisition of a constant imaging volume beginning with the ninth acquired slice.

Subjects

Five adult subjects (4 men, 1 woman; ages 26–29 years; all right‐handed) were imaged after providing informed consent. All five subjects were native speakers of English and had no history of hearing problems. Each subject participated in a single imaging session of 2 hours duration.

Acoustic Stimuli

The two acoustic stimuli presented to the subjects were: (1) a 24‐sec segment of instrumental music; and (2) variable duration (1.5, 2, 3, or 4 sec) of acoustic imaging noise, occurring at the end (final 1.5, 2, 3, or 4 sec, respectively) of each 8‐sec TR period. The instrumental music segment, from Prokofiev's “Romeo and Juliet” performed by Emerson, Lake, and Palmer on Black Moon (Rhino Records, 1992), was selected because it contains dynamic components exhibiting moderate amplitude modulations up to 8 kHz (Fig. 1). This piece has been used previously by the authors to produce robust activation, both as a stimulus [Edmister et al., 1999] and as a control [Talavage et al., 2000]. The spectrum of the acoustic imaging noise associated with EPI acquisition of a single image on the GE‐ANMR system is shown in Figure 2 [see also Ravicz et al., 2000]. The music stimulus contained appreciable signal energy out to 8 kHz, providing significant spectral overlap with the acoustic imaging noise.

Figure 1.


Spectrogram of the music stimulus used in these experiments. The music stimulus was the opening 24 sec of Prokofiev's “Romeo and Juliet” (Emerson, Lake, and Palmer, Black Moon; Rhino Records, 1992). The intensity is given in dB relative to the peak component in the spectrogram (i.e., the maximum signal value was 0 dB). Low‐pass cutoff (half‐power) frequency of the acoustic transducers was approximately 6 kHz.

Figure 2.


Spectrum of the acoustic imaging noise (32 msec duration) associated with blipped EPI acquisition of a single image on the 1.5‐T GE‐ANMR imager. The intensity of the harmonic at 8 kHz was approximately 40 dB down from the peak at 1 kHz. (Data courtesy of M.E. Ravicz)

Stimulus Presentation

The music was presented from a digital source, amplified, and input to acoustic transducers located in the imaging room. The output of the acoustic transducers passed through flexible tubing (3 m length) to and through earmuffs, finally being delivered to the subject just lateral to the ear canal. The presentation level was chosen by the subject to be comfortable in conditions of quiet (i.e., no acoustic imaging noise). The earmuffs attenuated the acoustic imaging noise by approximately 30 dB [Ravicz and Melcher, 1998]; therefore, the acoustic imaging noise level incident on the ear canal was approximately 85 dB SPL (sound pressure level) at 1 kHz.

Experimental Design

In each experimental run, the music and acoustic imaging noise stimuli were combined to achieve equal numbers of four condition blocks (Fig. 3a). Each condition block was three TR periods (24 sec) in length. To refer to each of the four condition blocks, we have used an ordered pair that indicates, first, the state of music presentation, and, second, the state of the acoustic imaging noise duration. For the first element, music was either (M) presented or (N) not presented, with each presentation persisting throughout the entire 24 sec duration of the corresponding condition block (Fig. 3a, top row). For the second element, the temporal duration of the acoustic imaging noise in a given condition block was classified as either (S) short, or (E) extended (Fig. 3a, bottom row). The four condition blocks (Fig. 3b) may thus be represented as NS (no music, short duration acoustic imaging noise), NE (no music, extended duration acoustic imaging noise), MS (music, short duration acoustic imaging noise), and ME (music, extended duration acoustic imaging noise).
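The block structure described above implies a fixed run layout, which the following minimal sketch works out. It assumes that the 48 images acquired per slice (see Imaging Parameters) constitute a single experimental run; the names `run_layout`, `VOLUMES_PER_RUN`, and the dictionary keys are illustrative and do not come from the original analysis.

```python
TR_SEC = 8             # repetition time stated in the text
VOLUMES_PER_RUN = 48   # assumption: the 48 images per slice form one run
VOLUMES_PER_BLOCK = 3  # each condition block spans three TR periods
CONDITIONS = ("NS", "NE", "MS", "ME")

def run_layout(volumes=VOLUMES_PER_RUN, per_block=VOLUMES_PER_BLOCK, tr=TR_SEC):
    """Derive block and run timing from the acquisition parameters."""
    blocks = volumes // per_block
    return {
        "block_sec": per_block * tr,                         # length of one condition block
        "blocks_per_run": blocks,                            # total condition blocks in a run
        "blocks_per_condition": blocks // len(CONDITIONS),   # equal numbers of each block
        "run_sec": volumes * tr,                             # total run duration
    }

layout = run_layout()
print(layout)
```

Under that assumption, each run would contain 16 condition blocks (four of each type) and last 384 sec.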

Figure 3.


a: Temporal order of presentation of the four condition blocks in these experiments. Top row: presence (M) or absence (N) of music throughout a condition block. Bottom row: duration of the acoustic imaging noise associated with all volume acquisitions within the condition block, either short (S; 1.5 sec) or extended (E; 1.5 sec in the control run, or 2, 3, or 4 sec in the remaining runs). b: The four condition blocks (top to bottom). NS, no music, short duration acoustic imaging noise; NE, no music, extended duration acoustic imaging noise; MS, music, short duration acoustic imaging noise; ME, music, extended duration acoustic imaging noise. Each condition block was 24 sec (three volume acquisitions with TR = 8 sec) in duration.

The xS condition blocks were always conducted using a base (12‐slice) 1.5‐sec duration CVA, whereas the xE condition blocks were conducted using a single test (extended) duration CVA in any given experimental run: a 1.5‐sec control or an experimental condition of 2, 3, or 4 sec. The extended acoustic imaging noise durations of 2, 3, and 4 sec were obtained by appending a predetermined number of gradient readout sequences (duration = 32 msec; rate = 8/sec) to the beginning of the base 12‐slice (1.5‐sec) volume acquisition. In particular, acoustic imaging noise durations of 2, 3, and 4 sec were achieved by appending 4 (0.5‐sec duration), 12 (1.5‐sec) and 20 (2.5‐sec) gradient readout sequences, respectively. These three volume acquisition durations were chosen due to the observation that most fMRI experiments use an integer TR, and to permit comparison of results with Talavage et al. [1999]. The additional gradient readout sequences were EPI acquisition sequences carried out with no radio frequency transmission (i.e., no alteration of the magnetization in the imaged volume) and no recording of the signal (which should comprise only noise) received at the coil.
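The relation between the number of appended gradient readout sequences and the total noise duration can be checked with a short sketch. The constants are taken from the text (1.5-sec base acquisition, readout rate of 8/sec, TR of 8 sec); the function names are illustrative only.

```python
BASE_SEC = 1.5        # base 12-slice clustered volume acquisition
READOUT_RATE_HZ = 8   # appended gradient readouts per second
TR_SEC = 8            # repetition time

def noise_duration(extra_readouts):
    """Total acoustic imaging noise per TR, in seconds."""
    return BASE_SEC + extra_readouts / READOUT_RATE_HZ

def noise_fraction(extra_readouts):
    """Fraction of each TR period occupied by acoustic imaging noise."""
    return noise_duration(extra_readouts) / TR_SEC

durations = {n: noise_duration(n) for n in (0, 4, 12, 20)}
print(durations)  # 0, 4, 12 and 20 appended readouts
```

The 12 slices of the base acquisition at 8 readouts per second likewise give 1.5 sec, which is consistent with the stated base duration.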

Seven experimental runs were conducted in each imaging session. The first run was always conducted using the control duration volume acquisition (i.e., 1.5 sec) during the xE condition blocks. Within a subject this run served as the gold standard for cortical activation in response to the music stimulus. Further, this run served to familiarize the subject with the music stimulus in an effort to prevent habituation from noticeably altering the strength or spatial extent of the response to the music in subsequent experimental runs. The remaining six runs were conducted in a pseudorandom order, with two runs being conducted for each of the extended durations of acoustic imaging noise (2, 3, and 4 sec) that differed from the base duration (1.5 sec). The order of presentation of these six experimental runs was varied across subjects.

To assist in the maintenance of arousal and attention in each experimental run, subjects were asked to focus on either the presented music or the acoustic imaging noise. During the two runs made for a given extended duration of acoustic imaging noise, attention was directed at the music in one run, and at the acoustic imaging noise in the other. Subjects provided verbal feedback after each experimental run. All subjects reported that the acoustic imaging noise dominated perception over the music, regardless of the focus of attention.

The four condition blocks were selected to permit six comparisons regarding the interaction of the response to acoustic imaging noise and the music stimulus. These comparisons are referred to by the ordered pairs indicating the two condition blocks being compared (e.g., MS/NS), with the activation condition block listed first, and the baseline condition block second. The six comparisons used in this analysis are described below.

  1. Music versus no music in the presence of constant short duration acoustic imaging noise (MS/NS). This comparison measured the cortical response to the music stimulus in the presence of the control duration of acoustic imaging noise (1.5 sec; 12‐slice volume). This response served as a control, and was expected to be constant across the experimental runs due to the use of the 1.5‐sec acoustic imaging noise duration in all xS condition blocks.

  2. Extended duration acoustic imaging noise versus short duration acoustic imaging noise, both with constant no music (NE/NS). This comparison measured the response to the various durations of acoustic imaging noise, in isolation from the music. These data were expected to follow the results of Talavage et al. [1999] and provide a second control, exhibiting increased signal changes with increasing duration of acoustic imaging noise.

  3. Music versus no music in the presence of constant extended duration acoustic imaging noise (ME/NE). This comparison measured the cortical response to the music stimulus during the possible presence of an extended duration volume acquisition in the xE condition blocks (1.5, 2, 3, or 4 sec of noise in each 8‐sec TR period). This assessed how the response to music was affected by duration of acoustic imaging noise. Based on our previous work [Edmister et al., 1999; Talavage et al., 1999] this response was expected to decrease as the duration of the acoustic imaging noise increased.

  4. Extended duration acoustic imaging noise versus short duration acoustic imaging noise, both with constant music (ME/MS). This comparison measured the response to the various durations of acoustic imaging noise in the presence of the music stimulus. If the BOLD response was linear, the growth of this response would be similar to that observed in the NE/NS comparison.

  5. Music with extended duration acoustic imaging noise versus no music with short duration acoustic imaging noise (ME/NS). This cross‐stimulus comparison allowed quantification of the response to simultaneous presentation of two acoustic stimuli with similar spectral content. This comparison should be evaluated with respect to the signal changes observed for the MS/NS and NE/NS comparisons to assess whether superposition of responses had taken place. Two possible non‐ideal responses may be observed. First, a greater response could arise due to increased acoustic energy in the instantaneous stimulus providing greater input to the auditory nerve. Conversely, a lesser response could be observed either if acoustic masking were to reduce the actual perceived spectral content, or if nonlinear interaction of physiologic function (e.g., nonlinear increases in regional cerebral blood flow [rCBF] or regional cerebral blood volume) were to reduce the signal increase resulting from the BOLD effect. In either case, an observed response that does not comprise the sum of the two individual responses would argue for nonlinearity in the observed auditory cortex response and have implications for the success of rapid presentation event‐related experiments using acoustic stimuli.

  6. Music with short duration acoustic imaging noise versus no music with extended duration acoustic imaging noise (MS/NE). This cross‐stimulus comparison was designed to quantify the change in the measured magnitude of the response to the desired stimulus (e.g., the music) resulting from the presence of an acoustic confound (e.g., the acoustic imaging noise) in the control condition. In this case, the condition in which the music was presented in a near‐optimal background (short duration acoustic imaging noise) was compared to the worst‐case control condition (extended duration acoustic imaging noise) to obtain an upper bound, for a given xE duration, on the reduction in the measured stimulus‐induced activation. The detected response to the music was expected to decrease with increasing acoustic imaging noise duration as the baseline condition was increasingly corrupted by the expected increase in the NE/NS comparison. Such a reduction would be consistent with the decrease observed by Edmister et al. [1999] in the cortical response to music stimulation as the total duration of acoustic imaging noise per TR period was increased.

Data Analysis

Using SPM99, the seven experimental runs in each session were aligned to the first volume of the first experimental run. Analysis of the six comparisons described previously used a functionally defined region‐of‐interest (ROI) within auditory cortex, generated on an individual basis for each subject. Auditory cortex was defined for these purposes as the superior temporal plane in the vicinity of Heschl's gyrus, the site of primary auditory cortex [e.g., Galaburda and Sanides, 1980]. We were interested primarily in identifying all locations that responded to the music stimulus in the most‐optimal conditions presented; therefore, we did not require that the ROI comprise a contiguous grouping of voxels.

The ROI for a given subject was generated from the results of the first experimental run, during which all durations of acoustic imaging noise were 1.5 sec (i.e., xS and xE condition blocks used identical CVA durations). In this case those volumes acquired during the presentation of music (both MS and ME condition blocks in Fig. 3a) were compared to those volumes acquired when no music was presented (both NS and NE condition blocks in Fig. 3a). The signal change between these two sets of acquisitions should be the optimal music response that we may obtain, due to minimization of acoustic imaging noise. All voxels responding to the music at a significance level (unpaired t‐test) of P < 0.01 were identified within the central four slices of the acquired volume and used to define the ROI for the given subject. Table I (top row) shows the number of voxels in the functionally defined ROI for each subject. The resulting ROI indicates where responses to music are most likely to be observed, and thus provides the best set of voxels in which to evaluate the effect of additional acoustic imaging noise on the music response. The ROI contains functionally heterogeneous voxels from multiple auditory areas and possibly from outside auditory cortex; therefore, analysis of the ROI will likely underestimate any effect that the acoustic imaging noise has on the music response.
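A functionally defined ROI of this kind can be sketched as a voxel-wise unpaired t-test followed by thresholding. The sketch below assumes a pooled-variance Student's t statistic and a one-sided selection of voxels whose signal is higher during music; the source does not state either choice, and the function names, the toy signal values, and the critical value are all illustrative. The critical value 3.707 corresponds to a two-tailed P < 0.01 at 6 degrees of freedom (the toy sample size), not to the degrees of freedom of the actual experiment.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Unpaired Student's t statistic with pooled variance (a minus b)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))

def select_roi(music, no_music, t_crit):
    """Indices of voxels whose music-versus-no-music t exceeds t_crit.

    music[v] and no_music[v] hold the signal samples for voxel v.
    """
    return [v for v in range(len(music))
            if pooled_t(music[v], no_music[v]) > t_crit]

# Toy data: voxel 0 responds to music, voxel 1 does not.
music = [[104, 105, 103, 106], [100, 101, 99, 100]]
no_music = [[100, 101, 99, 100], [100, 99, 101, 100]]

roi = select_roi(music, no_music, t_crit=3.707)
print(roi)
```

Because each voxel is tested independently, the selected voxels need not be contiguous, which matches the ROI definition used here.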

Table I.

Individual subject mean signal change, c(T), and standard deviation as a function of extended acoustic imaging noise duration

Mean and standard deviation of signal change, c(T), in each subject (%)

| c(T) | Comparison | Subject 1 (34 voxels) | Subject 2 (83 voxels) | Subject 3 (34 voxels) | Subject 4 (66 voxels) | Subject 5 (41 voxels) |
|---|---|---|---|---|---|---|
| c(1.5 sec) | MS/NS | 4.8 ± 4.0 | 3.3 ± 2.2 | 1.7 ± 1.1 | 4.8 ± 3.7 | 4.0 ± 2.6 |
| | ME/NE | 3.3 ± 2.9 | 3.1 ± 2.2 | 2.0 ± 1.1 | 5.6 ± 4.1 | 3.6 ± 1.8 |
| | NE/NS | 2.0 ± 2.2 | 0.4 ± 0.9 | −0.1 ± 1.0 | −0.5 ± 1.3 | −0.1 ± 1.2 |
| | ME/MS | 0.4 ± 0.7 | 0.2 ± 0.7 | 0.2 ± 0.6 | 0.4 ± 1.0 | −0.5 ± 1.1 |
| | ME/NS | 5.2 ± 4.3 | 3.5 ± 2.4 | 1.9 ± 1.0 | 5.2 ± 4.2 | 3.5 ± 1.9 |
| | MS/NE | 2.9 ± 2.7 | 2.9 ± 2.0 | 1.8 ± 1.0 | 5.2 ± 3.5 | 4.2 ± 2.3 |
| c(2 sec) | MS/NS | 4.2 ± 3.3 | 2.1 ± 1.7 | 1.4 ± 1.2 | 4.6 ± 3.8 | 2.7 ± 1.8 |
| | ME/NE | 5.0 ± 4.0 | 2.3 ± 1.7 | 1.5 ± 1.1 | 4.2 ± 3.4 | 2.4 ± 1.6 |
| | NE/NS | −1.0 ± 1.1 | 0.0 ± 0.4 | 0.3 ± 0.8 | 0.1 ± 0.7 | 0.2 ± 0.5 |
| | ME/MS | 0.2 ± 1.1 | 0.3 ± 0.4 | 0.4 ± 0.6 | −0.3 ± 0.7 | −0.1 ± 0.6 |
| | ME/NS | 4.0 ± 3.3 | 2.4 ± 1.6 | 1.8 ± 1.4 | 4.3 ± 3.4 | 2.6 ± 1.8 |
| | MS/NE | 3.7 ± 4.0 | 2.1 ± 1.8 | 1.1 ± 0.8 | 4.5 ± 3.7 | 2.4 ± 1.5 |
| c(3 sec) | MS/NS | 4.3 ± 3.2 | 1.6 ± 1.2 | 1.5 ± 1.0 | 4.5 ± 3.7 | 2.7 ± 2.0 |
| | ME/NE | 4.8 ± 3.5 | 1.7 ± 1.4 | 1.5 ± 1.1 | 4.2 ± 3.7 | 1.5 ± 1.4 |
| | NE/NS | −0.1 ± 0.8 | 0.2 ± 0.5 | −0.4 ± 0.9 | 1.0 ± 1.7 | 0.5 ± 0.6 |
| | ME/MS | 1.0 ± 0.9 | 0.4 ± 0.4 | −0.4 ± 0.5 | 0.7 ± 1.1 | −0.6 ± 1.1 |
| | ME/NS | 4.8 ± 3.2 | 1.9 ± 1.3 | 1.1 ± 0.8 | 5.2 ± 4.1 | 2.0 ± 1.5 |
| | MS/NE | 3.8 ± 3.4 | 1.4 ± 1.3 | 1.9 ± 1.4 | 3.5 ± 3.4 | 2.2 ± 1.7 |
| c(4 sec) | MS/NS | 4.0 ± 3.0 | 2.0 ± 1.6 | 1.9 ± 1.5 | 4.0 ± 3.4 | 2.0 ± 1.3 |
| | ME/NE | 2.7 ± 2.3 | 1.4 ± 1.1 | 0.5 ± 0.7 | 3.1 ± 2.8 | 1.2 ± 1.1 |
| | NE/NS | 2.3 ± 1.3 | 0.6 ± 0.8 | 0.7 ± 0.8 | 1.7 ± 1.8 | 0.9 ± 0.8 |
| | ME/MS | −0.3 ± 0.9 | 0.0 ± 0.4 | −0.7 ± 1.0 | 0.9 ± 1.0 | 0.1 ± 0.6 |
| | ME/NS | 4.9 ± 3.4 | 2.0 ± 1.5 | 1.2 ± 0.9 | 4.8 ± 4.1 | 2.1 ± 1.4 |
| | MS/NE | 2.9 ± 2.9 | 1.4 ± 1.2 | 1.2 ± 0.9 | 2.2 ± 2.3 | 1.1 ± 1.0 |

MS, music, short duration acoustic imaging noise; NS, no music, short duration acoustic imaging noise; ME, music, extended duration acoustic imaging noise; NE, no music, extended duration acoustic imaging noise.

Before further analysis, the pairs of runs made at each of the 2‐, 3‐, and 4‐sec extended durations of acoustic imaging noise were analyzed using the average signal change in the ROI to determine if the requested focus of attention affected results. When no significant differences were observed (paired Student's t‐test), the two runs made for each of the three extended durations of acoustic imaging noise were averaged to improve the signal‐to‐noise ratio. The seven experimental runs and three averaged runs comprised ten data sets to be examined. The 10 sets were analyzed (using in‐house software) with a Student's t‐test to detect changes in mean signal level. The use of a t‐statistic was justified by the long TR (8 sec) and the sustained nature of the music stimulus presentation; the sustained presentation was expected to produce an almost steady‐state BOLD response at each of the three sampling times (8‐, 16‐, and 24‐sec post‐stimulus onset) within a condition block.

In each subject, the mean percent signal change, c(T), was computed over the functionally defined ROI for each of the 10 individual and averaged runs. T refers to the duration of the acoustic imaging noise in extended duration (xE) condition blocks (T = 1.5, 2, 3, or 4 sec). All six comparisons described above were applied to compute a percent signal change between the activation and baseline condition blocks for each voxel. Population statistics (mean and standard deviation) of the percent signal change were computed over the voxels within the ROI.
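The per-voxel computation described above can be sketched as follows. This is a minimal illustration rather than the in-house software: it assumes that percent signal change is expressed relative to the baseline condition mean and that the standard deviation is the sample standard deviation, neither of which is stated in the text. The function names and the toy signal values are hypothetical.

```python
from statistics import mean, stdev

# (activation, baseline) pairs for the six comparisons described in the text.
COMPARISONS = [("MS", "NS"), ("NE", "NS"), ("ME", "NE"),
               ("ME", "MS"), ("ME", "NS"), ("MS", "NE")]

def percent_change(activation, baseline):
    """Percent signal change relative to the baseline mean (assumed convention)."""
    return 100.0 * (activation - baseline) / baseline

def roi_stats(cond_means):
    """cond_means maps a condition label to a list of per-voxel mean signals.

    Returns {comparison: (mean, standard deviation)} over the ROI voxels.
    """
    stats = {}
    for act, base in COMPARISONS:
        changes = [percent_change(a, b)
                   for a, b in zip(cond_means[act], cond_means[base])]
        stats[f"{act}/{base}"] = (mean(changes), stdev(changes))
    return stats

# Toy ROI of three voxels (arbitrary signal units).
toy = {
    "NS": [100.0, 200.0, 400.0],
    "NE": [101.0, 202.0, 404.0],
    "MS": [104.0, 206.0, 408.0],
    "ME": [104.0, 206.0, 408.0],
}
stats = roi_stats(toy)
print(stats["MS/NS"])
```

In this toy ROI the ME and MS signals are identical, so ME/MS is zero even though NE/NS is not, which is the kind of pattern the six comparisons are designed to expose.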

Assessment of trends in each of the six comparisons, as a function of acoustic imaging noise duration, was conducted both on an individual subject basis and using a single set of pooled data, comprising all voxels in the subject ROIs. Population statistics for each comparison at each value of T were computed above for the individual subjects (Table I). For the set of pooled data, the mean and standard error were likewise computed for each comparison at each value of T. For each subject and the pooled data, signal changes at T = 3 sec and T = 4 sec were normalized by computing the change in c(T) relative to the response observed during the T = 2 sec run (i.e., Δc[T] = c[T] − c[2 sec]). The subtraction to effect normalization was motivated by the relatively large range of signal changes observed in the MS/NS comparisons (e.g., 1.7–4.8% for T = 1.5 sec; Table I, top) and the desire to use only those extended durations for which two experimental runs were carried out. Under the null hypothesis that the variation in duration of acoustic imaging noise has no effect on the response to the music, the Δc(T) values will, as a function of T, be independent, identically‐distributed, zero‐mean random variables.
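The normalization Δc(T) = c(T) − c(2 sec) can be illustrated with two rows of the pooled data. The values below are copied from Table II; the function name `normalize` is illustrative. Because the published values are rounded to one decimal place, recomputing Δc(T) from them can differ by 0.1 from the normalized values reported for some other comparisons.

```python
# Pooled mean signal changes c(T) in percent, copied from Table II,
# keyed by extended noise duration T in seconds.
POOLED = {
    "ME/NE": {2: 3.1, 3: 2.7, 4: 1.9},
    "ME/NS": {2: 3.0, 3: 3.0, 4: 3.0},
}

def normalize(c, ref=2):
    """Return delta c(T) = c(T) - c(ref) for every T other than ref."""
    return {T: round(c[T] - c[ref], 1) for T in c if T != ref}

delta = {name: normalize(c) for name, c in POOLED.items()}
print(delta)
```

For these two comparisons the recomputed values agree with the normalized pooled values in Table IV.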

RESULTS

The raw mean signal changes, c(T), within the music response ROI are listed for each comparison in Table I, grouped by subject and duration of acoustic imaging noise used during the xE condition blocks (T = 1.5, 2, 3, and 4 sec). The pooled mean signal changes are listed in Table II, also grouped by the acoustic imaging noise duration in the xE condition blocks. The normalized mean signal changes (i.e., Δc[T]) are listed for individual subjects in Table III, and for the pooled data in Table IV. Plots that depict the individual and pooled trends are shown in Figure 4.

Table II.

Pooled mean signal change, c(T), and standard error (258 voxels) as a function of extended acoustic imaging noise duration

Mean and standard error of signal change (%)

| Condition | c(1.5 sec) | c(2 sec) | c(3 sec) | c(4 sec) |
|---|---|---|---|---|
| MS/NS | 3.8 ± 0.2 | 3.0 ± 0.2 | 2.8 ± 0.2 | 2.8 ± 0.2 |
| ME/NE | 3.7 ± 0.2 | 3.1 ± 0.2 | 2.7 ± 0.2 | 1.9 ± 0.1 |
| NE/NS | 0.2 ± 0.1 | 0.0 ± 0.0 | 0.3 ± 0.1 | 1.2 ± 0.1 |
| ME/MS | 0.2 ± 0.1 | 0.1 ± 0.0 | 0.3 ± 0.1 | 0.1 ± 0.1 |
| ME/NS | 3.9 ± 0.2 | 3.0 ± 0.2 | 3.0 ± 0.2 | 3.0 ± 0.2 |
| MS/NE | 3.5 ± 0.2 | 2.9 ± 0.2 | 2.4 ± 0.2 | 1.7 ± 0.1 |

MS, music, short duration acoustic imaging noise; NS, no music, short duration acoustic imaging noise; ME, music, extended duration acoustic imaging noise; NE, no music, extended duration acoustic imaging noise.

Table III.

Normalized individual subject signal change

Normalized individual signal change, Δc(T)* (%)

| Δc(T)* | Comparison | Subject 1 | Subject 2 | Subject 3 | Subject 4 | Subject 5 |
|---|---|---|---|---|---|---|
| Δc(3 sec) | MS/NS | 0.0 | −0.5 | 0.1 | −0.2 | 0.0 |
| | ME/NE | −0.2 | −0.6 | 0.0 | 0.0 | −0.8 |
| | NE/NS | 0.9 | 0.2 | −0.7 | 0.9 | 0.3 |
| | ME/MS | 0.8 | 0.1 | −0.8 | 1.1 | −0.6 |
| | ME/NS | 0.8 | −0.5 | −0.7 | 0.9 | −0.6 |
| | MS/NE | 0.1 | −0.7 | 0.8 | −1.0 | −0.2 |
| Δc(4 sec) | MS/NS | −0.3 | −0.1 | 0.5 | −0.7 | −0.7 |
| | ME/NE | −2.3 | −0.9 | −1.0 | −1.1 | −1.2 |
| | NE/NS | 3.3 | 0.5 | 0.4 | 1.6 | 0.7 |
| | ME/MS | −0.4 | −0.3 | −1.1 | 1.2 | 0.1 |
| | ME/NS | 1.0 | −0.4 | −0.6 | 0.5 | −0.5 |
| | MS/NE | −0.8 | −0.6 | 0.1 | −2.3 | −1.4 |

* Δc(T) = c(T) − c(2 sec) at T = 3 sec and T = 4 sec.

MS, music, short duration acoustic imaging noise; NS, no music, short duration acoustic imaging noise; ME, music, extended duration acoustic imaging noise; NE, no music, extended duration acoustic imaging noise.

Table IV.

Normalized pooled mean signal change

Normalized pooled mean signal change, Δc(T)* (%)

| Condition | Δc(3 sec) | Δc(4 sec) |
|---|---|---|
| MS/NS | −0.2 | −0.3 |
| ME/NE | −0.4 | −1.2 |
| NE/NS | 0.4 | 1.2 |
| ME/MS | 0.2 | 0.0 |
| ME/NS | 0.0 | 0.0 |
| MS/NE | −0.4 | −1.1 |

* Δc(T) = c(T) − c(2 sec) at T = 3 sec and T = 4 sec.

MS, music, short duration acoustic imaging noise; NS, no music, short duration acoustic imaging noise; ME, music, extended duration acoustic imaging noise; NE, no music, extended duration acoustic imaging noise.

Figure 4.


Plots for each of the six comparisons of the normalized mean signal changes in those runs for which xE condition blocks used the three extended durations of acoustic imaging noise (T = 2, 3, and 4 sec) rather than the control duration (T = 1.5 sec). See Figure 3 for explanation of condition block labels NS, NE, MS and ME. Data for individual subjects (see Table III) are indicated by plot symbols (diamond, Subject 1; square, Subject 2; circle, Subject 3; triangle, Subject 4; X, Subject 5). Solid lines in each plot indicate normalized pooled mean signal change, with error bars indicating ± two standard errors (see Tables II and IV).

Trends in Δc(T) observed for three of six comparisons as a function of acoustic imaging noise duration were found to be statistically significant at the P < 0.06 level. An increase (paired t‐test: P < 0.04; ANOVA: P = 0.052, F = 7.51) was observed for NE/NS with increasing acoustic imaging noise duration. Decreases were observed for the ME/NE (paired t‐test: P < 0.03; ANOVA: P = 0.042, F = 8.70) and MS/NE (paired t‐test: P < 0.04; ANOVA: P = 0.031, F = 10.60) comparisons as the acoustic imaging noise duration was increased. No statistically significant trends (ANOVA: P > 0.5) were observed as a function of the acoustic imaging noise duration for the MS/NS, ME/MS, and ME/NS comparisons.

Large signal increases were observed in the NE/NS comparison when the 4 sec duration of acoustic imaging noise was compared to the signal change observed for T = 3 sec (paired t‐test: P < 0.03) and T = 2 sec (paired t‐test: P < 0.04). The results of the pooled analyses (signal increase of 0.4% after 3 sec of noise, 1.2% after 4 sec; Table IV) are consistent with the results of Talavage et al. [1999]. All five subjects exhibited increases in Δc(T) with increasing acoustic imaging noise duration (Table III), even though, at T = 4 sec, one or more individual voxels achieved the P < 0.01 level (unpaired t‐statistic) in only three subjects (shown in Fig. 5).

Figure 5.


Statistically significant activation (P < 0.01; unpaired t‐test) was observed at T = 4 sec in three subjects (1, 2 and 4; inferior‐most slice at left) in response to the extended duration acoustic imaging noise, relative to the short acoustic imaging noise duration (NE/NS). The observed activation for an acoustic imaging noise duration of 4 sec (left) was consistent with activation observed in Talavage et al. [1999], and may be contrasted with the lack of statistically significant activation when the duration of acoustic imaging noise was 2 sec (right).

The decrease in the ME/NE comparison was consistent with the results of Edmister and associates [1999]. On a pair‐wise basis, the decrease was statistically significant when the acoustic imaging noise duration was extended to 4 sec from T = 3 sec (paired t‐test: P < 0.03) and from T = 2 sec (paired t‐test: P < 0.01), but the response at T = 3 sec was not significantly different from the response at T = 2 sec (paired t‐test: P = 0.14). The decreased percent signal changes at T = 4 sec led to readily observable reductions in the spatial extent of the music response for four of the five subjects (Fig. 6).

Figure 6.


Comparison of statistically significant activation (P < 0.01; unpaired t‐test) in response to on/off music presentation during condition blocks with (left) short duration acoustic imaging noise, MS/NS, and (right) extended duration acoustic imaging noise, ME/NE. The duration of the extended duration acoustic imaging noise varies from 2–4 sec, top to bottom. As expected, the response to on/off music presentation during short duration acoustic imaging noise condition blocks (MS/NS) did not exhibit a significant change as the extended acoustic imaging noise duration was increased. Conversely, the response to on/off music presentation during extended acoustic imaging noise duration condition blocks (ME/NE) exhibited a decrease in both spatial extent and statistical significance as the duration of acoustic imaging noise was increased, particularly in the probable location of primary auditory cortex. Data are from Subject 4, with the inferior‐most slice at left.

The decrease in the MS/NE comparison was also observed for the pair‐wise comparison of the 4‐sec acoustic imaging noise duration to T = 3 sec (paired t‐test: P < 0.02) and T = 2 sec (paired t‐test: P < 0.04). The decrease of the percent signal change as a function of increased acoustic imaging noise duration was reflected by a reduction in the spatial extent of observed activation (Fig. 7).

Figure 7.


Comparison of statistically significant activation (P < 0.01; unpaired t‐test) in response to music, demonstrating that much of the reduction in activation during extended duration acoustic imaging noise (xE) condition blocks arose from increased signal levels measured during the off (no music stimulus; Nx) condition blocks. The spatial extent of the response to the peak acoustic stimulation relative to the short acoustic imaging noise duration baseline (ME/NS, left) remained essentially constant even as the duration of acoustic imaging noise increased. In contrast, significant decreases in both spatial extent and statistical significance were observed when attempting to detect the response to music (presented in short acoustic imaging noise duration condition blocks) using the extended acoustic imaging noise duration condition blocks as a baseline (MS/NE, right). Data are from Subject 4, with the inferior‐most slice at left.

DISCUSSION

The BOLD responses observed in auditory cortex during simultaneous presentation of acoustic imaging noise and a desired acoustic stimulus combined in a nonlinear fashion, relative to the responses that arose from isolated presentations of the individual stimuli. As expected, the MS/NS comparison remained essentially constant as a function of the extended volume acquisition duration in the xE condition blocks (Fig. 6, left). Also as expected, signal changes increased in the NE/NS comparison as the xE condition block acoustic imaging noise duration was increased (i.e., T = 2, 3, and 4 sec) relative to the base (1.5 sec) duration (Fig. 5). In contrast, the signal change observed between the maximal and minimal acoustic stimulation conditions (ME/NS) was not found to increase as a function of the acoustic imaging noise duration in the xE condition blocks (Fig. 7, left). If the addition of responses were linear, the growth of the NE/NS comparison with the duration of acoustic imaging noise would have produced an ME/NS comparison that likewise grew with increasing duration. Also supporting the argument for nonlinear addition of responses was the comparison of music‐induced (on vs. off) responses in constant (extended) duration acoustic imaging noise (ME/NE). In this comparison, responses were found to decrease as the acoustic imaging noise duration increased. If the BOLD responses to the music and the acoustic imaging noise were to add in a linear fashion, the ME/NE comparison (theoretically the difference between the ME/NS and NE/NS conditions) would be constant.
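The linearity argument above can be made concrete with a short sketch. The function and variable names are hypothetical, and the numbers are illustrative placeholders rather than values measured in this study; the sketch only encodes the arithmetic implied by linear superposition.

```python
def linear_predictions(ms_ns, ne_ns):
    """Predict ME/NS and ME/NE percent signal changes under linear superposition.

    ms_ns: response to music in short-duration noise (MS/NS), in percent.
    ne_ns: response to extended noise alone (NE/NS), in percent.
    """
    me_ns = ms_ns + ne_ns  # under linearity the two responses simply add
    me_ne = me_ns - ne_ns  # music response measured against the raised baseline
    return me_ns, me_ne


# Illustrative (not measured) values: a fixed music response and a noise
# response that grows with the acoustic imaging noise duration T.
music_response = 3.0
noise_responses = {2: 0.4, 3: 0.8, 4: 1.2}  # hypothetical NE/NS by T (sec)

predictions = {
    t: linear_predictions(music_response, noise)
    for t, noise in noise_responses.items()
}
for t, (me_ns, me_ne) in predictions.items():
    print(f"T = {t} sec: predicted ME/NS = {me_ns:.1f}%, ME/NE = {me_ne:.1f}%")
```

Under these assumptions the predicted ME/NS grows with T while the predicted ME/NE stays constant. The data showed the opposite pattern: an essentially constant ME/NS and a decreasing ME/NE.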

Evaluation of Source of Nonlinearity in Auditory Cortex BOLD Response

The observed nonlinearity may arise from some level of acoustic masking, saturation of the BOLD response, or inherent nonlinearity of spatially coincident BOLD responses. The explicit role that acoustic masking may play in limiting the peripheral or central response to the music stimulus (due to simultaneous presence of the acoustic imaging noise and the presented music) cannot be evaluated from the results of this study. Future investigations utilizing spectrally similar but non‐overlapping stimuli may permit this possible source of nonlinearity to be examined.

The observed nonlinearity of the responses was not a consequence of saturation of the BOLD response. The MS/NS response during the first experimental run in each session, conducted using T = 1.5 sec, produced a pooled mean signal change of 3.8% (Table II). The MS/NS responses computed for T = 2, 3, and 4 sec produced pooled mean signal changes no larger than 3.0%. Because the acoustic imaging noise conditions were constant for all MS/NS comparisons, we may conclude that the dynamic range in the ROIs was at least as large as the change observed in the initial (T = 1.5 sec) run. Given that subsequent experimental runs did not approach this empirically defined upper bound, we may conclude that they also did not exhibit saturation of the BOLD response.
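The bound argument above reduces to a simple comparison, sketched here. The function name is hypothetical; the 3.8% and 3.0% figures are the pooled means quoted in the text, with 3.0% used as the stated upper limit for each later run rather than as a per-run measurement.

```python
def stays_below_ceiling(reference_change, later_upper_bounds):
    """Return True if every later response is below the reference response.

    reference_change: largest response observed under identical noise
        conditions; it gives a lower bound on the available dynamic range.
    later_upper_bounds: upper limits on the responses in later runs.
    """
    return all(bound < reference_change for bound in later_upper_bounds)


reference_ms_ns = 3.8  # pooled mean MS/NS change at T = 1.5 sec (percent)
later_ms_ns_limits = {2: 3.0, 3: 3.0, 4: 3.0}  # "no larger than 3.0%" per T

not_saturated = stays_below_ceiling(reference_ms_ns, later_ms_ns_limits.values())
print("Later runs below empirical ceiling:", not_saturated)
```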

The nonlinearity of the BOLD response with increasing total acoustic energy per TR period was likely due to overlap of the acoustic spectra of the desired and undesired stimuli. Neuronal activity at an acoustic frequency common to both stimuli may not exhibit a significant increase due to the added presence of the second stimulus. This lack of a detectable increase may arise from limited dynamic range or saturation, in which case the activity at the neuronal and hemodynamic levels may not represent the sum of responses to the two stimuli in isolation. Having observed greater percent signal changes under circumstances of lesser total acoustic energy (i.e., T = 1.5 sec), we have demonstrated that the BOLD response does not achieve saturation, even in the presence of increasing acoustic energy (i.e., T = 2, 3, and 4 sec). We hypothesize that the source of the nonlinearity lies in the mechanism by which BOLD responses in auditory cortex combine. This possible effect should be considered distinct from acoustic masking, because it is due to an inability to distinguish whether cortical activation at a given acoustic frequency arises from the desired or undesired acoustic stimulus, rather than from an inability on the part of the subject to perceive the presence of both stimuli.

Effects on Cortical Activation of Nonlinear Interaction of BOLD Responses

Previous results suggest that nonlinearity in the BOLD response is likely to be the primary mechanism leading to a reduction in activation that correlates with increased background acoustic imaging noise. In Edmister et al. [1999], cortical responses were not found to increase with an increase in the total acoustic energy present in the environment. Rather, lesser cortical responses were observed when slice acquisitions were conducted more frequently. The nonlinearity observed in percent signal changes as a function of acoustically different acquisitions (i.e., volume acquisition plus variable lengths of additional acoustic imaging noise) in this study was therefore likely due to nonlinear combination of the response induced by the presented stimulus and the short‐term effects of the loud acoustic imaging noise [e.g., Bandettini et al., 1998; Talavage et al., 1999].

We hypothesize that reduction of auditory cortical activation in the presence of acoustic imaging noise (e.g., Fig. 8) was due both to the observed nonlinear addition of BOLD responses and to an increase in the baseline signal level to which the activation condition was compared. The nonlinear addition of the two BOLD responses resulted in a maximal signal change that did not vary as a function of acoustic imaging noise duration, despite the use of a fixed TR (ME/NS; Fig. 4, bottom left). The response to the acoustic imaging noise in isolation increased as a function of duration (NE/NS; Fig. 4, middle left). The combination of these two responses in an extended acoustic imaging noise duration experiment (i.e., ME/NE) therefore resulted in a reduced difference between signal means and produced less significant rejection of the null hypothesis when using a distribution‐based test.
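The effect of a raised baseline on a distribution‐based test can be illustrated with a small sketch. All signal values are hypothetical, and the pooled‐variance form of the unpaired t statistic is an assumption, since the exact variant used in the analysis is not specified here.

```python
import math
import statistics


def unpaired_t(sample_a, sample_b):
    """Student's two-sample t statistic with pooled variance (assumed variant)."""
    n_a, n_b = len(sample_a), len(sample_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(sample_a)
        + (n_b - 1) * statistics.variance(sample_b)
    ) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_err


# Hypothetical signal values (arbitrary units) for a single voxel.
music_on = [103.1, 102.8, 103.4, 102.9, 103.3]      # peak response (ME-like)
baseline_short = [100.0, 99.8, 100.3, 100.1, 99.9]  # NS-like baseline
# Same baseline shifted upward by a hypothetical noise-induced response.
baseline_extended = [value + 1.5 for value in baseline_short]

t_short = unpaired_t(music_on, baseline_short)
t_extended = unpaired_t(music_on, baseline_extended)
print(f"t vs. short-noise baseline:    {t_short:.2f}")
print(f"t vs. extended-noise baseline: {t_extended:.2f}")
```

Because the peak signal is unchanged while the baseline mean rises, the difference between means shrinks and the t statistic falls, even though the variability of the samples is the same in both comparisons.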

Figure 8.

Example of the reduction in spatial extent of the activation in auditory cortex for extended acoustic imaging noise duration of 4 sec. Locations that exhibited the most significant responses to the acoustic imaging noise (NE/NS; middle) generally detracted from the control response to the desired music stimulus (MS/NS; top) and resulted in a response that was both smaller in spatial extent and lesser in statistical significance (ME/NE; bottom). All depicted activation was statistically significant at the P < 0.01 level (unpaired t‐test). Data are from Subject 1, with the inferior‐most slice at left.

Spatial Dependence of Nonlinear Interaction

It is probable that the nonlinearity in the auditory cortex varies by cortical area. In Edmister et al. [1999] and in the present work (e.g., Fig. 5, ME/NE comparison, third slice), the medial portions of auditory cortex, especially Heschl's gyrus, exhibited the most marked decrease in activation as a function of the duration of acoustic imaging noise. The current study did not permit effective analysis of this issue because of the large, functionally heterogeneous ROI used to evaluate variations in percent signal change. A more detailed study that combines variation in acoustic imaging noise with identification of functional auditory areas [e.g., Schönwiesner et al., 2002; Talavage et al., 2004] could address this issue.

Implications for Auditory fMRI Experiments

Block paradigms

The hypothesis that nonlinear addition of BOLD responses in the auditory cortex underlies the reduction in activation has significant implications for all auditory fMRI experiments, but particularly those involving block, or subtractive, paradigms (i.e., alternating between two or more stimulus conditions). Block paradigm BOLD fMRI experiments typically rely on distribution statistics or correlation measures to detect changes in signal levels as a function of time. The use of such analytical procedures assumes that any confound present in the stimulation (on) and baseline (off) conditions will not affect assessment of activation due to the stimulus. This assumption imposes two conditions on the properties of the BOLD response. First, the signal changes associated with the BOLD response in cortex must exhibit the principle of superposition (i.e., multiple responses must combine in a linear fashion). Second, the BOLD response must possess sufficient dynamic range of percent signal change such that stimulus‐ and confound‐induced responses may add without saturation. Our findings demonstrate that the first of these conditions will not always hold, potentially affecting the accuracy of the results of auditory block paradigm experiments (e.g., higher false negative rates).

Event‐related paradigms

The lack of linearity in the combination of responses to simultaneous presentation of multiple stimuli also has implications for auditory event‐related paradigms. Given that event‐related paradigms typically yield lesser signal‐to‐noise ratios while attempting to detect and estimate phenomena of lesser magnitude than block paradigms, the reduction in the measured response by the presence of the acoustic imaging noise may preclude acquisition of meaningful results from some experiments. In general, we expect that the short stimuli of event‐related fMRI will produce cortical responses that combine in a roughly linear fashion [e.g., Dale and Buckner, 1997], potentially overcoming much of this confound. One caveat is that the ME/MS comparison exhibited no observable response to the acoustic imaging noise when the duration of this noise was altered in the constant presence of the music stimulus. This lack of response was observed despite all subjects reporting being able to hear the acoustic imaging noise as dominant over the presented music stimulus. This has implications for our ability to conduct linearity assessments in the auditory cortex, because stimuli that exhibit significant spectral overlap (or at the least, significant overlap in the location of cortical response) may not be additive, despite the increase in total delivered acoustic energy per TR period.

CONCLUSIONS

The auditory cortex does not produce linear BOLD responses when a (desired) robust music stimulus is augmented by additional (undesired) acoustic imaging noise. Replicating previous work [Edmister et al., 1999; Talavage et al., 1999], we were able to quantify increases in cortical activity arising from increases in the total duration of acoustic imaging noise presented in a given time interval (i.e., TR period). In contrast, increases in the total duration of acoustic imaging noise in the same time interval did not lead to increased cortical activation (measured by either spatial extent or percent signal change) in the presence of music stimulation, despite the fact that the BOLD response had not saturated. Analyses conducted in the presence of greater acoustic imaging noise may therefore be determined to be less significant due to the raised mean of the baseline condition rather than due to an actual decrease in the activation produced by the presented stimulus.

Acknowledgements

Work was carried out at the Massachusetts General Hospital NMR Center, Charlestown, MA. We thank M.E. Ravicz for the acoustic imaging noise measurement data, Dr. C.I. Moore and Dr. K.K. Kwong for their assistance in carrying out and completing this work, and G.G. Tamer, Jr., and S.J. Kisner for their comments and editorial assistance.

REFERENCES

  1. Amaro E Jr, Williams SC, Shergill SS, Fu CH, MacSweeney M, Picchioni MM, Brammer MJ, McGuire PK (2002): Acoustic noise and functional magnetic resonance imaging: current strategies and future prospects. J Magn Reson Imaging 16: 497–510.
  2. Bandettini PA, Jesmanowicz A, Van Kylen J, Birn RM, Hyde JS (1998): Functional MRI of brain activation induced by scanner acoustic noise. Magn Reson Med 39: 410–416.
  3. Belin P, Zatorre RJ, Hoge R, Evans AC, Pike B (1999): Event‐related fMRI of the auditory cortex. Neuroimage 10: 417–429.
  4. Dale AM, Buckner RL (1997): Selective averaging of rapidly presented individual trials using fMRI. Hum Brain Mapp 5: 329–340.
  5. Di Salle F, Formisano E, Seifritz E, Linden DE, Scheffler K, Saulino C, Tedeschi G, Zanella FE, Pepino A, Goebel R, Marciano E (2001): Functional fields in human auditory cortex revealed by time‐resolved fMRI without interference of EPI noise. Neuroimage 13: 328–338.
  6. Eden GF, Joseph JE, Brown HE, Brown CP, Zeffiro TA (1999): Utilizing hemodynamic delay and dispersion to detect fMRI signal change without auditory interference: the behavior interleaved gradients technique. Magn Reson Med 41: 13–20.
  7. Edmister WB, Talavage TM, Ledden PJ, Weisskoff RM (1999): Improved auditory cortex imaging using clustered volume acquisitions. Hum Brain Mapp 7: 89–97.
  8. Galaburda A, Sanides F (1980): Cytoarchitectonic organization of the human auditory cortex. J Comp Neurol 190: 597–610.
  9. Hall DA, Haggard MP, Akeroyd MA, Palmer AR, Summerfield AQ, Elliott MR, Gurney EM, Bowtell RW (1999): “Sparse” temporal sampling in auditory fMRI. Hum Brain Mapp 7: 213–223.
  10. Hall DA, Summerfield AQ, Goncalves MS, Foster JR, Palmer AR, Bowtell RW (2000): Time‐course of the auditory BOLD response to scanner noise. Magn Reson Med 43: 601–606.
  11. Ogawa S, Lee TM, Kay AR, Tank DW (1990): Brain magnetic resonance imaging with contrast dependent on blood oxygenation. Proc Natl Acad Sci USA 87: 9868–9872.
  12. Ravicz ME, Melcher JR (1998): Imager noise and noise attenuation during fMRI. Neuroimage 7(Suppl): 556.
  13. Ravicz ME, Melcher JR, Kiang NY (2000): Acoustic noise during functional magnetic resonance imaging. J Acoust Soc Am 108: 1683–1696.
  14. Schönwiesner M, von Cramon DY, Rübsamen R (2002): Is it tonotopy after all? Neuroimage 17: 1144–1161.
  15. Shah NJ, Jancke L, Grosse‐Ruyken ML, Muller‐Gartner HW (1999): Influence of acoustic masking noise in fMRI of the auditory cortex during phonetic discrimination. J Magn Reson Imaging 9: 19–25.
  16. Talavage TM, Edmister WB, Ledden PJ, Weisskoff RM (1999): Quantitative assessment of auditory cortex responses induced by imager acoustic noise. Hum Brain Mapp 7: 79–88.
  17. Talavage TM, Ledden PJ, Benson RR, Rosen BR, Melcher JR (2000): Frequency‐dependent responses exhibited by multiple regions in human auditory cortex. Hear Res 150: 225–244.
  18. Talavage TM, Sereno MI, Melcher JR, Ledden PJ, Rosen BR, Dale AM (2004): Tonotopic organization in human auditory cortex revealed by progressions of frequency sensitivity. J Neurophysiol 91: 1282–1296.

Articles from Human Brain Mapping are provided here courtesy of Wiley
