Abstract
Multisensory calibration is fundamental for proficient interaction within a changing environment. Initial studies suggested a visual-dominant mechanism. More recently, a cue-reliability-based model, similar to optimal cue integration, has been proposed. However, a more general, reliability-independent model of fixed-ratio adaptation (of which visual dominance is a subcase) has never been tested. Here, we studied behavior of both humans and monkeys performing a heading-discrimination task. Subjects were presented with either visual (optic-flow), vestibular (motion-platform), or combined (visual–vestibular) stimuli and required to report whether self-motion was to the right/left of straight ahead. A systematic heading discrepancy was introduced between the visual and vestibular cues, without external feedback. Cue calibration was measured by the resulting sensory adaptation. Both visual and vestibular cues significantly adapted in the direction required to reduce cue conflict. However, unlike multisensory cue integration, cue calibration was not reliability based. Rather, a model of fixed-ratio adaptation best described the data, whereby vestibular adaptation was greater than visual adaptation, regardless of relative cue reliability. The average ratio of vestibular to visual adaptation was 1.75 and 2.30 for the human and monkey data, respectively. Furthermore, only through modeling fixed-ratio adaptation (using the ratio extracted from the data) were we able to account for reliability-based cue integration during the adaptation process. The finding that cue calibration does not depend on cue reliability is consistent with the notion that it follows an underlying estimate of cue accuracy. Cue accuracy is generally independent of cue reliability, and its estimate may change with a much slower time constant. Thus, greater vestibular versus visual (fixed-ratio) adaptation suggests lower vestibular versus visual cue accuracy.
Introduction
Integration of input from multiple sensory sources is required for coherent perception and adept interaction with the environment. However, inherent noise and the probabilistic nature of our senses make this task particularly challenging (Knill and Pouget, 2004). Hence, the brain requires a proficient strategy for multisensory integration. Optimal integration schemes predict that multiple sensory cues should be weighted in accordance with their relative reliabilities, such that reliable cues are more influential than less reliable cues (Yuille and Bülthoff, 1996; Jacobs, 1999; Landy and Kojima, 2001). Indeed, a number of paradigms have demonstrated reliability-based cue combination (RBCC) when integrating multisensory input (Ernst and Banks, 2002; van Beers et al., 2002; Alais and Burr, 2004).
Reliability-based cue combination is considered optimal in that it maximizes precision (synonymous here with reliability; defined by the inverse variance). Nonetheless, it may not account for accuracy, agreement between perception and the environment. If, for example, a sensory cue is biased, reliability-based cue combination may result in a biased perception (Watt et al., 2005). Although there are circumstances in which reliability-based cue combination may still be beneficial even with biased cues (Scarfe and Hibbard, 2011), cue calibration would enhance the accuracy of multisensory perception. However, without external feedback, it may not be possible to assess the accuracy of a sensory system. In this case, the best the brain can do is detect the relative bias between different sensory systems. Given a systematic discrepancy, sensory cues may undergo mutual calibration to achieve “internal consistency” (not to be confused with “external accuracy”; Burge et al., 2010).
Several groups have suggested a reliability-based model for multisensory calibration, whereby the extent of adaptation is determined by the relative reliability of each cue (Ghahramani et al., 1997; Witten and Knudsen, 2005; Burge et al., 2010). Reliability-based adaptation (RBA) seems a logical extension of reliability-based cue combination, particularly because the traditional view of visual-dominant adaptation (VDA) (Rock and Victor, 1964) has been challenged by accounts of visual recalibration (Lewald, 2002; Atkins et al., 2003). However, the most reliable cue might not always be the most accurate (Ernst and Di Luca, 2011). Furthermore, there is still evidence for visual-dominant effects (Knudsen, 2002; Spence, 2009). In fact, visual-dominant adaptation is only a subcase of a generalized fixed-ratio adaptation (FRA) model, whereby cues adapt toward one another at a fixed ratio regardless of cue reliability (possibly according to the underlying estimates of cue accuracy). Yet a model of fixed-ratio adaptation has never been quantitatively compared with reliability-based adaptation. Hence, the manner, principles, and extent to which multiple sensory systems adapt to one another remain poorly understood.
Perception of self-motion and heading direction relies particularly on visual and vestibular input (Guedry, 1974; Warren and Hannon, 1988; Ohmi, 1996). We have shown previously in a heading discrimination task that visual–vestibular integration primarily follows reliability-based cue combination for both humans and monkeys (Gu et al., 2008; Fetsch et al., 2009). In this study, we probe the nature of visual–vestibular calibration by introducing a systematic discrepancy between visual and vestibular cues, in the same heading discrimination task, and assessing the resulting perceptual adaptation of the individual cues.
Materials and Methods
Separate experiments were performed on humans and monkeys. Details of the apparatus, stimuli, and basic task design, published previously for both humans (Fetsch et al., 2009; Gu et al., 2010; MacNeilage et al., 2010) and monkeys (Gu et al., 2007, 2008, 2010; Fetsch et al., 2009), are briefly summarized below together with the methods specific for this study. For additional details, see previous studies.
Human experiment
Nine human subjects (four male) participated in this study. All signed informed consent, and the study was approved by the Washington University Human Research Protection Office. Subjects were seated comfortably in a cockpit-style chair and restrained safely with a five-point racing harness. Each subject wore a custom-made thermoplastic mask that was attached to the back of the chair for head stabilization. The chair, a three-chip DLP projector (Galaxy 6; Barco), and a large projection screen were all mounted on a motion platform (6DOF2000E; Moog) to provide synchronized visual and vestibular input. The projection screen (149 × 127 cm) was located ∼70 cm in front of the eyes, subtending a visual angle of ∼94° × 84°. Subjects were enclosed in a black aluminum superstructure, such that only the display screen was visible in a darkened room, and wore active three-dimensional glasses (CrystalEyes 3; RealD) to provide stereoscopic depth cues. The field of view through the glasses was ∼70° × 90°.
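The reported visual angles can be checked from the screen dimensions and viewing distance. The sketch below assumes a flat screen centered on, and perpendicular to, the line of sight; the function name is illustrative and not part of the original methods.

```python
import math

def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Full angle (degrees) subtended by one screen dimension.

    Assumes a flat screen centered on the line of sight.
    """
    return math.degrees(2.0 * math.atan((size_cm / 2.0) / distance_cm))

# Human setup: 149 x 127 cm screen viewed from about 70 cm.
human_width_deg = visual_angle_deg(149, 70)
human_height_deg = visual_angle_deg(127, 70)
print(round(human_width_deg), round(human_height_deg))
```

Under these assumptions, the two values come out close to the ∼94° × 84° quoted above.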
The subjects' task was to discriminate heading direction (two-alternative forced choice, right or left of straight ahead), after presentation of a single-interval stimulus. Subjects were instructed to focus on a central fixation point throughout the duration of the trial. Trials were initiated, and choice selection was reported via button press on a handheld unit. Subjects received trial timing-related feedback through headphones. However, no feedback about correct or incorrect choices was provided.
The stimulus presented was either vestibular-only, visual-only, or simultaneously combined vestibular and visual cues. The vestibular-only stimulus was inertial motion of the platform in darkness (no optic-flow cues). The visual-only stimulus was optic-flow motion simulation, without inertial motion of the platform. The combined vestibular and visual stimulus comprised inertial motion in conjunction with synchronized optic flow. Although additional cues, such as proprioception and somatosensation, could also be present during inertial motion, we refer to this condition as “vestibular” because primate performance depends strongly on intact vestibular labyrinths (Gu et al., 2007). The stimulus velocity followed a 4 σ Gaussian profile with a duration of 1 s and a total displacement of 13 cm. Peak velocity was 0.35 m/s, and peak acceleration was 1.4 m/s².
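The stated displacement, peak velocity, and peak acceleration are mutually consistent for a Gaussian velocity profile, as the following sketch illustrates. It assumes v(t) = v_peak · exp(−t²/(2σ²)) with essentially all of the displacement falling inside the 1 s window; the temporal σ is derived here from the displacement and peak velocity, and the "4 σ" parameterization itself is not modeled.

```python
import math

def gaussian_profile_params(displacement_m: float, peak_velocity_m_s: float):
    """Derive sigma (s) and peak acceleration (m/s^2) of a Gaussian velocity profile.

    Assumption: the integral of v(t) over all time equals the displacement,
    i.e. displacement = v_peak * sigma * sqrt(2 * pi).
    """
    sigma = displacement_m / (peak_velocity_m_s * math.sqrt(2.0 * math.pi))
    # |dv/dt| is maximal at t = +/- sigma, where it equals v_peak / (sigma * sqrt(e)).
    peak_acceleration = peak_velocity_m_s / (sigma * math.sqrt(math.e))
    return sigma, peak_acceleration

sigma_s, peak_acc = gaussian_profile_params(0.13, 0.35)
print(f"sigma = {sigma_s:.3f} s, peak acceleration = {peak_acc:.2f} m/s^2")
```

With 13 cm and 0.35 m/s, the derived peak acceleration is close to the 1.4 m/s² reported above.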
The optic flow simulated self-motion of the subject through a random-dot cloud. Visual cue reliability was varied by manipulating the motion coherence of the optic-flow pattern, i.e., percentage of dots moving coherently. Three levels of coherence were used: high (100% coherence), medium (50% coherence), and low. The latter was subject specific, determined such that the subject's visual threshold was larger than the subject's vestibular threshold. Vestibular reliability was fixed throughout the trials. For each session, the actual reliability ratio (RR) of the visual/vestibular cues, extracted from single-cue psychophysical data, was used for analysis (see below, Data analysis). Subjects were given several practice sessions before data collection to familiarize themselves with the experiment and to extract subject-specific coherence levels for low visual reliability.
Seven heading directions were tested: straight ahead and three to each side. Heading angles were varied in small steps, spaced approximately logarithmically around straight ahead, and presented using the method of constant stimuli. The eccentricities of the heading angles were changed in accordance with the level of visual coherence, such that low visual coherence trials had larger heading direction angles. This was necessary to span the psychometric region of interest adequately, because lower coherences yield a wider psychometric curve.
Each experimental session consisted of three consecutive blocks, as follows.
Pre-adaptation block.
A pre-adaptation block comprised cues from only a single (visual-only/vestibular-only) modality, interleaved. The resulting psychometric curves were used to deduce the baseline bias and individual reliability of each modality for the subjects. This block comprised [10 repetitions] × [2 stimuli (visual-only/vestibular-only)] × [7 heading angles] = 140 trials.
Adaptation block.
In this block, only combined visual–vestibular cues were presented. Across the trials, a discrepancy between the visual and vestibular cues was introduced incrementally: Δ = ±2°, 4°, 6°, 8°, 10°, and then held at ±10° for the remainder of the block. The discrepancy was introduced incrementally so that subjects would not become aware of it. By convention, the sign of Δ indicated the orientation of discrepancy: positive Δ represented an offset of the vestibular cue to the right and visual cue to the left; negative Δ indicated the reverse. Only one discrepancy orientation, positive or negative, was used per session. Eight repetitions were run for each Δ increment, and an additional seven repetitions were run for maximum Δ (±10°), resulting in [(8 × 5 + 7) repetitions] × [7 heading angles] = 329 trials.
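The discrepancy schedule of the adaptation block can be laid out trial by trial, as sketched below. The function and argument names are illustrative; the ordering of heading angles within each Δ step is not specified in the text and is not modeled here.

```python
def adaptation_schedule(sign=+1, steps=(2, 4, 6, 8, 10),
                        reps_per_step=8, extra_reps_at_max=7, n_headings=7):
    """Return the discrepancy (deg) applied on each trial of the adaptation block."""
    schedule = []
    for delta in steps:
        # Each increment is held for reps_per_step repetitions of every heading.
        schedule.extend([sign * delta] * (reps_per_step * n_headings))
    # Additional repetitions at the maximum discrepancy.
    schedule.extend([sign * steps[-1]] * (extra_reps_at_max * n_headings))
    return schedule

schedule = adaptation_schedule(sign=+1)
print(len(schedule))  # (8 * 5 + 7) * 7 trials
```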
Post-adaptation block.
During this block, adaptation of the individual (visual/vestibular) cues was measured by single-cue trials, similar to the pre-adaptation block. The single-cue trials were interleaved with combined-cue trials (with Δ = ±10°, as in the end of the adaptation block) to maintain the adaptation while it was measured. This block comprised [10 repetitions] × [3 stimuli (visual-only/vestibular-only/combined)] × [7 heading angles] = 210 trials.
In total, there were 679 trials in a session, which typically lasted ∼60 min. Subjects participated in 12 such sessions: [2 session repeats] × [3 coherences (high/med/low)] × [2 deltas (positive or negative)]. For one subject, low-coherence data were not collected; for another, medium-coherence data were not collected. Hence, these two subjects participated in eight sessions. Sessions were sorted by low, medium, and high reliability ratio using the actual cue thresholds (see below, Data analysis). To test whether there was any influence of measuring both cues in the same session, three subjects participated in eight additional sessions during which cues were tested individually: [2 cues (visual/vestibular)] × [2 coherences (high/low)] × [2 Δ (positive or negative)]. The single-cue protocol was identical to the standard protocol, but all single-cue trials of the other (nontested) cue were removed. Single-cue trials of the cue being tested and combined-cue trials remained unchanged. There were no observable differences between the results of single-cue sessions and standard sessions. The data presented here are from the standard sessions only. In total, n = 100 experimental sessions were analyzed from the nine human subjects in this study.
Monkey experiment
Four male rhesus monkeys (Macaca mulatta) participated in the study. All procedures were approved by the Animal Studies Committee at Washington University. Monkeys were head fixed and seated in a primate chair that was anchored to a motion platform, identical to the platform used in the human experiment. Also mounted on the platform were a stereoscopic projector (Mirage 2000; Christie Digital Systems), a rear-projection screen, and a magnetic field coil (CNC Engineering) for measuring eye movements (Judge et al., 1980). The projection screen (60 × 60 cm) was located ∼30 cm in front of the eyes, subtending a visual angle of ∼90° × 90°. Monkeys wore custom stereo glasses made from Wratten filters (red #29 and green #61; Eastman Kodak), which enabled rendering of the visual stimulus in three dimensions as red–green anaglyphs.
As in the human experiment, the monkeys' task was to discriminate heading direction (two-alternative forced choice, right or left of straight ahead), after presentation of a single-interval stimulus. The monkeys were required to fixate on a central target during the stimulus and then report their choice by making a saccade to one of two choice targets (right/left) illuminated at the end of the trial. As in the human experiment, the stimulus presented was either vestibular-only, visual-only, or simultaneously combined vestibular and visual stimuli. The stimulus velocity followed a 4 σ Gaussian profile with the same parameters as the human experiment: duration of 1 s; total displacement of 13 cm; peak velocity of 0.35 m/s; and peak acceleration of 1.4 m/s². Ten heading directions were tested: five to each side. Heading angles were varied in small, logarithmically spaced steps around straight ahead and presented using the method of constant stimuli.
The optic flow simulated self-motion of the monkey through a random-dot cloud. Visual cue reliability was varied by manipulating the motion coherence of the optic-flow pattern, i.e., percentage of dots moving coherently. Two levels of coherence were used: high (100% coherence) and low. Low coherence was monkey specific, determined such that the monkey's visual threshold was larger than its vestibular threshold. Vestibular reliability was fixed throughout the trials. For each session, the actual reliability ratio of the visual/vestibular cues extracted from the data was used for analysis (see below, Data analysis).
The monkey experimental session was similar to that of the humans, also comprising pre-adaptation, adaptation, and post-adaptation blocks. However, at the end of a trial, monkeys were either rewarded or not rewarded with a portion of water/juice. Reward strategy was manipulated so as not to interfere with the adaptation, as described below.
Pre-adaptation block.
The pre-adaptation block comprised visual-only/vestibular-only/combined cues, interleaved. For some sessions, the combined stimulus was excluded. The monkey was rewarded for correct choices 95% of the time and not rewarded for incorrect choices. The 95% correct reward rate was used to accustom the monkey to not getting rewarded all of the time, as was the case in the post-adaptation block described below. This block was used to deduce the baseline bias and individual reliability (psychometric curve) of each modality for the monkeys. It comprised [10 repetitions] × [3 stimuli (visual-only/vestibular-only/combined)] × [10 heading angles] = 300 trials. When the combined stimulus was excluded, the block comprised 200 trials.
Adaptation block.
In the adaptation block, only combined visual–vestibular cues were presented. A discrepancy of Δ = ±10° was introduced between the visual and vestibular cues for the entire duration of the block. As in the human experiment, the sign of Δ indicated the orientation of discrepancy: positive Δ represented an offset of the vestibular cue to the right and visual cue to the left; negative Δ indicated the reverse. Only one discrepancy orientation was used per session. During this block, the monkey did not make direction choices. Rather, it was rewarded only for keeping its eyes fixated on the central target for the duration of the trial. Choice targets were not presented at the end of the trial. That way, the reward did not generate any bias for visual/vestibular cues, and perceptual adaptation reflects the mutual influence of the modalities on one another. This block typically comprised [50 repetitions] × [10 heading angles] = 500 trials.
Post-adaptation block.
During the post-adaptation block, the adaptation of the individual (visual/vestibular) modalities was measured by single-cue trials, interleaved with the combined-cue trials (with Δ = ±10° as in the adaptation block). The combined-cue trials were run in the same way as in the adaptation block: rewarded for eye fixation alone, with no heading direction choice. They were included to retain adaptation. The probability of reward for single-cue trials differed slightly from that in the pre-adaptation block so as not to perturb the adaptation. When the single-cue trial was at a heading angle such that, had it been a combined-cue trial, the other modality would have been on the same side (right/left), the monkey was rewarded as in the pre-adaptation block (95% probability of reward for correct choices; no reward for incorrect choices). If, however, the other modality would have been on the opposite side, a reward was given probabilistically (70%, regardless of the choice). This value was chosen because it approximately represents the correct choice rate in a normal heading discrimination task. This block typically comprised [20 repetitions] × [3 stimuli (visual-only/vestibular-only/combined)] × [10 heading angles] = 600 trials. At least 10 repetitions were required in this block for the session to be included in the study.
A typical session comprised ∼1400 trials (∼2.5 h in total) and was run at either high/low coherence, with either positive/negative Δ. Sessions were sorted by low, medium, and high visual to vestibular RR using the actual cue thresholds (see below, Data analysis). For each monkey, there were at least four repetitions for each discrepancy orientation (positive and negative Δ) at both high and low RR. Medium RR data were of a similar quantity. Monkey C was missing three of the eight high RR sessions because 100% visual coherence did not reliably result in high RR, and coherence cannot be increased >100%. In total, n = 108 experimental sessions were analyzed from the four monkeys in this study. One medium RR data point had a large vestibular shift (>10°), so it was excluded from Figures 4 and 5. However, it was in line with the results and was included in all data analyses and regressions.
Monkeys were able to perform “fixation-only” trials (as presented in the second and third block) interchangeably with the standard “heading-discrimination” trials (as presented in the first and third block) without any difficulty, and they did not need to be cued in advance which type of trial they were doing. This is because they were required to fixate during the stimulus regardless of the type of trial. After the stimulus, either the monkey was rewarded for fixating up until this point (fixation-only trials) or the central fixation point would disappear and choice targets would appear on the screen (heading-discrimination trials) cueing the monkey that it has to make a selection.
Data analysis
Data analysis was performed with custom software using Matlab R2006a (MathWorks) and the psignifit toolbox for Matlab (version 2.5.6; Wichmann and Hill, 2001a,b). Psychometric plots were defined as the proportion of rightward choices as a function of heading angle and calculated by fitting the data with a cumulative Gaussian distribution function. For each experimental session, separate psychometric functions were constructed for visual and vestibular cues before and after adaptation. The psychophysical threshold and point of subjective equality (PSE) were the SD (σ) and mean (μ), respectively, deduced from the fitted distribution function. The PSE represents the heading angle of equal right/left choice proportion, i.e., perceived straight ahead, also known as the bias. Visual/vestibular adaptation was measured as the difference between the pre- and post-adaptation PSEs.
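The fitting step can be illustrated with a minimal maximum-likelihood sketch. This is not the psignifit procedure used in the study: it is a plain grid search over μ (PSE) and σ (threshold) of a cumulative Gaussian, written with the standard library only, and the grid ranges and example counts are hypothetical.

```python
import math

def cum_gauss(x, mu, sigma):
    """Cumulative Gaussian: predicted proportion of rightward choices at heading x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def log_likelihood(mu, sigma, headings, n_right, n_total):
    """Binomial log-likelihood of the choice counts (constant terms omitted)."""
    ll = 0.0
    for x, k, n in zip(headings, n_right, n_total):
        p = min(max(cum_gauss(x, mu, sigma), 1e-9), 1.0 - 1e-9)  # avoid log(0)
        ll += k * math.log(p) + (n - k) * math.log(1.0 - p)
    return ll

def fit_psychometric(headings, n_right, n_total):
    """Grid-search estimate of (PSE, threshold) in degrees, at 0.1 deg resolution."""
    best = (-math.inf, 0.0, 1.0)
    for i in range(-100, 101):          # mu from -10 to +10 deg
        mu = i * 0.1
        for j in range(1, 201):         # sigma from 0.1 to 20 deg
            sigma = j * 0.1
            ll = log_likelihood(mu, sigma, headings, n_right, n_total)
            if ll > best[0]:
                best = (ll, mu, sigma)
    return best[1], best[2]

# Hypothetical single-cue data: 7 headings, 10 repetitions each.
headings = [-6.0, -3.0, -1.5, 0.0, 1.5, 3.0, 6.0]
pre_pse, pre_thr = fit_psychometric(headings, [0, 1, 3, 5, 8, 9, 10], [10] * 7)
post_pse, post_thr = fit_psychometric(headings, [0, 0, 1, 3, 5, 8, 10], [10] * 7)
adaptation = post_pse - pre_pse  # PSE shift, as defined in the text
```

In this sketch, adaptation is simply the difference between the fitted post- and pre-adaptation PSEs, matching the definition given above.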
For <2% of the post-adaptation distribution functions, there was only one data point that was not 0 or 1 (there were no distribution functions in which all data points were either 0 or 1). This resulted from a large PSE shift to a region where the curve was sparsely sampled. For these sessions, it was difficult to determine the SD of the cumulative Gaussian. Hence, the pre-adaptation SD was used as a Bayesian “prior” for fitting the post-adaptation psychometric plot. The prior was a raised cosine function that touched 0 at the 95% confidence limits of the pre-adaptation SD.
The RR was defined as the ratio of the visual/vestibular reliabilities and calculated for each session individually. Cue reliability was computed by taking the inverse of the threshold squared, using the geometric mean of the pre-adaptation and post-adaptation thresholds extracted from the fitted psychometric curves. For both human and monkey experiments, the data were divided into three RRs: low RR (RR ≤ 2.5⁻¹), medium RR (2.5⁻¹ < RR < 2.5), and high RR (RR ≥ 2.5). Because behavioral performance could change over time as a result of a “practice” effect, we did not assume that RRs were equal across sessions of the same subject with the same coherence. We therefore calculated the RR for each session individually.
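The reliability-ratio computation and the three-way sorting described above can be sketched as follows. Function names and the example thresholds are illustrative.

```python
import math

def reliability(threshold_pre: float, threshold_post: float) -> float:
    """Inverse squared threshold, using the geometric mean of pre/post thresholds."""
    threshold = math.sqrt(threshold_pre * threshold_post)
    return 1.0 / threshold ** 2

def reliability_ratio(vis_pre, vis_post, vest_pre, vest_post) -> float:
    """RR = visual reliability / vestibular reliability for one session."""
    return reliability(vis_pre, vis_post) / reliability(vest_pre, vest_post)

def rr_category(rr: float, cutoff: float = 2.5) -> str:
    """Sort a session into low, medium, or high RR using the 2.5 cutoff."""
    if rr <= 1.0 / cutoff:
        return "low"
    if rr >= cutoff:
        return "high"
    return "medium"

# Hypothetical thresholds (deg): visual 1.0 and 1.0, vestibular 2.0 and 2.0.
rr = reliability_ratio(1.0, 1.0, 2.0, 2.0)
print(rr, rr_category(rr))
```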
When calculating a linear regression between two dependent variables, each containing uncertainty/noise in their measurements, a type II regression was used (the perpendicular distances between the data points and the regression line were minimized).
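Minimizing perpendicular distances corresponds to major-axis (orthogonal) regression, which has a closed-form slope. The sketch below uses that closed form; it is one common variant of type II regression and may differ in detail from the routine used for the analyses here.

```python
import math

def major_axis_regression(x, y):
    """Slope and intercept of the line minimizing summed squared perpendicular distances."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxy == 0:
        raise ValueError("slope is undefined when x and y are uncorrelated")
    # Direction of the first principal axis of the (x, y) scatter.
    slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = major_axis_regression([0, 1, 2, 3], [1, 3, 5, 7])
```

Unlike ordinary least squares, this fit is symmetric in the two variables: swapping x and y yields the reciprocal slope.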
Results
Behavioral data of the nine human subjects who participated in the human experiment and four monkeys from the monkey experiment were analyzed and are presented in this study. Before presenting the data, we first introduce a brief theoretical background on which data analysis was based, followed by simulation of the experimental outcome according to the theoretical predictions.
Theoretical framework for multisensory adaptation in the absence of external feedback
When presented with an environmental stimulus S, individual sensors (e.g., A and B) will estimate S based on their individual estimator functions: ŜA = fA(S) and ŜB = fB(S), where ŜA and ŜB represent their respective estimates. The “internal consistency” hypothesis predicts that, if the expected values of the estimates, E(ŜA) and E(ŜB), are not equal, then (in the absence of external feedback) estimator functions will adapt toward one another to achieve equality and internal consistency. Cue-adaptation direction will be dependent on the discrepancy, Δ, of the cues:
\[ \Delta = E(\hat{S}_B) - E(\hat{S}_A) \tag{1} \]
such that fA will adapt proportional to +Δ and fB will adapt proportional to −Δ. However, the rate and extent of adaptation may be different for the cues. How the two estimator functions adapt toward one another can follow either one of the following principles.
Bayesian prediction of multisensory adaptation
The model of RBA predicts that the extent to which each estimator function adapts is dependent on the reliabilities of the cues. This was described by Ghahramani et al. (1997) as the weighted Δ rule:
\[ \delta f_A = \eta\, w_B\, \Delta, \qquad \delta f_B = -\eta\, w_A\, \Delta \tag{2} \]
where δfA and δfB are the additive changes to the cue estimators, η is the adaptation rate (small positive constant), and wA and wB are the weights calculated by the cues' relative reliabilities:
\[ w_A = \frac{r_A}{r_A + r_B}, \qquad w_B = \frac{r_B}{r_A + r_B} \tag{3} \]
and cue reliability (rA and rB) is defined as the inverse variance:
\[ r_A = \frac{1}{\sigma_A^2}, \qquad r_B = \frac{1}{\sigma_B^2} \tag{4} \]
Note that wA + wB = 1 and that the weights for multisensory adaptation are complementary to those used for multisensory integration. Namely, for cue integration, wA is used to weight the estimate of cue A and wB is used to weight the estimate for cue B. However, for cue adaptation, the extent of the adaptation of cue A is determined by wB, and the extent of adaptation of cue B is determined by wA. This is because the more reliable a cue, the higher it will be weighted during integration and the less it will undergo calibration (and vice versa).
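The complementary use of the weights can be illustrated with a small iterative sketch of the weighted Δ rule. It assumes that, on each step, cue A shifts by η·wB·Δ and cue B by −η·wA·Δ, where Δ is the remaining discrepancy; the step count, η, and example σ values are illustrative choices, not fitted quantities.

```python
def rba_weights(sigma_a: float, sigma_b: float):
    """Reliability-based weights, with reliability defined as inverse variance."""
    r_a, r_b = 1.0 / sigma_a ** 2, 1.0 / sigma_b ** 2
    w_a = r_a / (r_a + r_b)
    return w_a, 1.0 - w_a

def simulate_rba(delta0: float, sigma_a: float, sigma_b: float,
                 eta: float = 0.05, n_steps: int = 500):
    """Iterate the weighted delta rule; return the accumulated shifts of cues A and B."""
    w_a, w_b = rba_weights(sigma_a, sigma_b)
    shift_a = shift_b = 0.0
    for _ in range(n_steps):
        delta = delta0 - shift_a + shift_b   # remaining discrepancy between the cues
        shift_a += eta * w_b * delta         # A adapts in proportion to B's weight
        shift_b -= eta * w_a * delta         # B adapts in proportion to A's weight
    return shift_a, shift_b

# A = vestibular, B = visual; visual five times more reliable (RR = 5).
vest_shift, vis_shift = simulate_rba(10.0, sigma_a=5 ** 0.5, sigma_b=1.0)
```

In this example the less reliable (vestibular) cue absorbs five-sixths of the 10° discrepancy, and the two shifts together cancel it.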
Fixed-ratio prediction of multisensory adaptation
FRA predicts that cues adapt at a fixed ratio regardless of relative reliability. Thereby, estimator functions will adapt according to the following:
where C is a constant. A subcase of FRA is VDA, in which the ratio of non-visual cue adaptation to visual adaptation tends to infinity (only the non-visual cue adapts).
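One way to picture fixed-ratio adaptation is to split a given total shift between the two cues at a constant vestibular-to-visual ratio. The parameterization below (a single ratio argument, with an infinite ratio standing for VDA) is an assumption made for illustration and is not necessarily the form of the model equation; the total shift of 10° is hypothetical.

```python
import math

def fixed_ratio_shifts(total_shift: float, ratio: float):
    """Split total_shift into (vestibular, visual) magnitudes with vestibular/visual = ratio.

    Magnitudes only: the two cues shift in opposite directions, toward each other.
    """
    if math.isinf(ratio):
        # Visual-dominant limit: only the non-visual cue adapts.
        return total_shift, 0.0
    vestibular = total_shift * ratio / (1.0 + ratio)
    visual = total_shift / (1.0 + ratio)
    return vestibular, visual

# Ratio of 1.75 is the human average reported in the Abstract.
vest_mag, vis_mag = fixed_ratio_shifts(10.0, 1.75)
vda_vest, vda_vis = fixed_ratio_shifts(10.0, math.inf)
```

The split depends only on the ratio, so changing cue reliability leaves it unchanged, in contrast to the reliability-based rule.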
Visual–vestibular adaptation: simulation
To explain the expected outcome of visual–vestibular adaptation, we simulated performance for the task used in this study according to the theoretical framework presented above. Two models of adaptation were simulated: RBA and VDA. The simulation paradigm was a replica of the human experimental protocol, which was similar to the monkey protocol except for a few slight differences as detailed in Materials and Methods. Psychophysical discrimination of heading direction was simulated in a two-alternative forced-choice task (right or left of straight ahead). As in the actual experiments, the stimulus comprised either vestibular-only (inertial-motion), visual-only (optic-flow), or simultaneously combined vestibular and visual cues. Simulated psychometric curves depict the proportion of rightward choices as a function of stimulus heading direction (Fig. 1, middle three columns). The PSE was extracted from the fitted Gaussian cumulative distribution function and represents the estimate of a cue for straight ahead.
All baseline (pre-adaptation) simulations were generated with PSE = 0, i.e., no heading-direction bias. Precision of individual psychometric curves was controlled by setting the variance of the underlying Gaussian functions (Eq. 4). To induce a cue conflict, a discrepancy between the visual and vestibular headings was introduced incrementally: Δ = 2°, 4°, 6°, 8°, 10°. Cue adaptation was simulated according to RBA and VDA and measured by the resulting shift in the visual–vestibular PSEs. For these simulations, multisensory adaptation achieved internal consistency, namely, visual–vestibular PSEs shifted a combined 10° to cancel out the introduced Δ. We did not simulate the actual time course of adaptation but rather the endpoint (from pre-adaptation to post-adaptation) according to the ratio of vestibular/visual adaptation predicted by the models. The simulated shifts are presented in Figure 1: vestibular pre-adaptation and post-adaptation psychometric curves are presented in blue and cyan, respectively, and visual pre-adaptation and post-adaptation psychometric curves are presented in red and magenta, respectively.
For RBA (Fig. 1A), both visual and vestibular cues shifted toward one another. The extent to which each cue shifted was dependent on RR (ratio of visual to vestibular reliability). Visual and vestibular reliabilities were calculated according to Equation 4, using the variances extracted from the fitted cumulative Gaussian functions. Three different RRs were simulated: when RR = 5, the vestibular cue shifted five times more versus visual (top row); when RR = 1, both cues shifted equally (second row); and when RR = ⅕, the visual cue shifted five times more versus vestibular (third row). For VDA (Fig. 1B), only the vestibular curve shifted, whereas the visual curve did not shift. This happens regardless of RR.
The combined-cue responses during the adaptation block can provide insight into the type of adaptation. However, analysis of these data first requires definition of the reference frame because there is no absolute heading direction for the combined cue when individual cues are discrepant. In this study, the combined-cue axis (0) was artificially defined by the heading midway between the visual and vestibular cues. Hence, during the simulated adaptation block, the visual and vestibular heading angles were +Δ/2 and −Δ/2, respectively. As Δ increased, the headings of both cues became more eccentric but remained symmetric around the combined cue 0. For each value of Δ, the combined psychometric curve was calculated (Fig. 1, second column from the right). The combined-cue PSE was then extracted and plotted as a function of Δ (Fig. 1, rightmost column). Positive or negative combined-cue PSE values therefore indicate higher visual or vestibular weighting, respectively. PSE = 0 indicates equal weighting.
The combined plots presented here should therefore not be mistaken to represent a psychometric shift. In fact, according to both model simulations (RBA and VDA), the combined-cue response does not shift, in world coordinates, during adaptation. Rather, the observed changes in combined PSE are attributable to increasing Δ and the concurrent increase in heading eccentricity of the individual cues. Why do both RBA and VDA not predict a shift in the combined cue? If cue combination and cue adaptation follow the same model, then the combined cue will not shift. This is because, during cue adaptation, cues converge on the initial cue combination. For example, (1) visual-dominant cue combination always aligns the combined response with the visual cue; during VDA, only the vestibular cue adapts until it is aligned with the visual cue. Hence, the combined cue will not change during VDA. (2) During RBA, visual and vestibular cues shift according to the weights of Equation 3, the same weights used for RBCC. Hence, the combined cue response will remain unchanged during RBA. If, however, different models were used for cue combination and cue adaptation, the combined response would shift during adaptation. This issue is further discussed in the last section of Results.
Therefore, in this simulation, the slope of combined PSE versus Δ (Fig. 1, rightmost column) is simply an indication of cue weighting: a positive versus negative slope indicates higher visual versus vestibular weighting, respectively. A 0 slope indicates equal weighting. For RBA, when RR = 5, the combined response was weighted more by the visual cue (positive slope; top row), when RR = ⅕, it was weighted more by the vestibular cue (negative slope; third row), and when RR = 1, there was equal weighting (0 slope; second row). For VDA, the combined response always aligns with the visual cue (at +Δ/2). Hence, the combined PSE versus Δ always demonstrates the maximum slope, ½ (bottom row).
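The slopes quoted above follow from a weighted average of the two cue headings. The sketch below adopts the convention of the simulation (visual at +Δ/2, vestibular at −Δ/2, combined PSE plotted so that positive values indicate higher visual weighting); function names are illustrative.

```python
def rbcc_visual_weight(rr: float) -> float:
    """Visual weight under reliability-based cue combination; rr = r_vis / r_vest."""
    return rr / (1.0 + rr)

def combined_pse(delta: float, w_vis: float) -> float:
    """Combined-cue PSE for discrepancy delta, on the axis midway between the cues."""
    return w_vis * (delta / 2.0) + (1.0 - w_vis) * (-delta / 2.0)

for rr in (5.0, 1.0, 0.2):
    slope = combined_pse(10.0, rbcc_visual_weight(rr)) / 10.0
    print(f"RR = {rr}: slope = {slope:+.3f}")

# Visual dominance: the combined response sits on the visual cue (w_vis = 1).
vda_slope = combined_pse(10.0, 1.0) / 10.0
```

The slope is (w_vis − w_vest)/2, so it is bounded by ±½, with the upper bound reached only under full visual dominance.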
Visual–vestibular adaptation: examples
In the simulation presented in Figure 1, a cue conflict of Δ = +10° was used (vestibular cue offset to the right; visual to the left). The predictions for Δ = −10° (vestibular cue offset to the left; visual to the right) are the same but with shifts in the reverse direction. In the actual experiments, both orientations of Δ were used. Hence, when analyzing the data, the predicted cue shifts for positive versus negative Δ are equal but opposite in sign/direction.
Figure 2 shows representative psychophysical data from one human subject. Two experimental sessions are presented, for both possible orientations of cue conflict (Δ = ±10°). Motion dot coherence was 100%, and RR > 1 for both examples (cue reliabilities were calculated using Eq. 4 and the fitted Gaussian functions). Cue shifts are presented here with 95% confidence intervals for the PSE, calculated by bootstrapping the psychometric curve: for Δ = +10° (Fig. 2A), the vestibular PSE shifted from −0.3° [−0.9, 0.4] to 2.4° [1.6, 3.2], and the visual PSE shifted from 0.9° [0.4, 1.5] to 0.1° [−0.4, 0.6]. For Δ = −10° (Fig. 2B), the vestibular PSE shifted from −1.0° [−1.8, −0.2] to −3.0° [−4.4, −1.9], and the visual PSE shifted from 0.3° [−0.3, 1.0] to 1.8° [1.0, 2.8].
For both examples, visual and vestibular psychometric curves shifted in the direction required to reduce cue conflict (similar to the RBA simulation; Fig. 1A). However, unlike the simulations, internal consistency was not achieved: vestibular + visual cues shifted only a combined 2.7° + 0.8° = 3.5° for the example in Figure 2A and 2.0° + 1.5° = 3.5° for the example in Figure 2B, less than the introduced discrepancy of 10°. The (absolute) ratio of vestibular/visual PSE adaptation was 2.7°/0.8° = 3.4 for the example in Figure 2A and 2.0°/1.5° = 1.3 for the example in Figure 2B, compared with the RR predictions of 1.9 and 1.1, respectively. The positive (Fig. 2A) and negative (Fig. 2B) slopes of the combined PSE versus Δ indicate higher weighting of the visual cue, as expected for RR > 1.
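The arithmetic behind these summed shifts and ratios can be reproduced from the PSE values quoted for Figure 2. The snippet is only a sketch: the function name is hypothetical, and the inputs are the rounded PSEs given in the text, so the results agree with the reported figures only to rounding precision.

```python
def adaptation_summary(ves_pre, ves_post, vis_pre, vis_post):
    """Return (vestibular shift, visual shift, summed |shift|, |vestibular/visual| ratio)."""
    ves_shift = ves_post - ves_pre
    vis_shift = vis_post - vis_pre
    total = abs(ves_shift) + abs(vis_shift)
    ratio = abs(ves_shift) / abs(vis_shift)
    return ves_shift, vis_shift, total, ratio


# Rounded PSE values (degrees) quoted in the text for the human examples.
fig2a = adaptation_summary(-0.3, 2.4, 0.9, 0.1)   # delta = +10 deg
fig2b = adaptation_summary(-1.0, -3.0, 0.3, 1.8)  # delta = -10 deg

for name, result in (("Fig. 2A", fig2a), ("Fig. 2B", fig2b)):
    print(name, [round(value, 2) for value in result])
```

In both sessions the summed shift comes to about 3.5°, well short of the 10° needed for internal consistency, and the shifts have opposite signs, as expected for cues adapting toward one another.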
Similarly, Figure 3 shows representative psychophysical data of two experimental sessions from one monkey (Δ = ±10°). Motion dot coherence was 100% and RR > 1 for both examples (cue reliabilities were calculated using Eq. 4 and the fitted Gaussian functions). Cue shifts are presented here with 95% confidence intervals for the PSE, calculated by bootstrapping the psychometric curve: for Δ = +10° (Fig. 3A), the vestibular PSE shifted from 0.5° [−1.4, 2.3] to 4.0° [2.7, 5.7], and the visual PSE shifted from −0.3° [−0.9, 0.7] to −0.5° [−1.1, 0.0]. For Δ = −10° (Fig. 3B), the vestibular PSE shifted from 0.6° [−0.9, 2.2] to −6.6° [−8.7, −4.9], and the visual PSE shifted from 0.8° [−0.3, 1.7] to 2.3° [1.8, 3.0].
Similar to the human examples, visual and vestibular psychometric curves shifted in the direction required to reduce cue conflict but did not achieve internal consistency: the combined vestibular + visual shift was 3.5° + 0.3° = 3.8° for the example in Figure 3A and 7.2° + 1.6° = 8.8° for the example in Figure 3B. The (absolute) ratio of vestibular to visual adaptation was 3.5°/0.3° = 11.7 for the example in Figure 3A and 7.2°/1.6° = 4.5 for the example in Figure 3B, compared with the RR predictions of 15.9 and 5.7, respectively. There were no behavioral responses during the adaptation block of the monkey experiments because of experimental constraints (the monkeys were rewarded for fixation only during the adaptation block and therefore did not make heading selections; see Materials and Methods).
Cue adaptation ratio changes with cue reliability
To quantify the results across experimental sessions, we first used an analysis based on cue adaptation ratio, as previously done by Burge et al. (2010). Because visual and vestibular cues are expected to adapt toward one another, their psychometric shifts should always be opposite in direction. Hence, the ratio of vestibular/visual shift should always be negative. Also, shift directions should reverse for positive versus negative Δ. Therefore, a single data point, plotted as the vestibular versus visual psychometric curve shift, is expected to lie in quadrant II of the Cartesian plane (top left: positive vestibular shift, negative visual shift) for positive Δ and in quadrant IV (bottom right: negative vestibular shift, positive visual shift) for negative Δ. Hence, a regression line of pooled vestibular versus visual shifts is expected to have negative slope, with two possible extremes: a vertical line would indicate only vestibular (and no visual) shift, and a horizontal line would indicate only visual (and no vestibular) shift.
RBA predicts that the magnitude of the visual to vestibular shift ratio should be dependent on RR. Specifically, for high RR, the vestibular shift is expected to be larger in magnitude than the visual shift, i.e., a near-vertical regression line. For low RR, the visual shift is expected to be larger in magnitude than the vestibular shift, i.e., a near-horizontal regression line. Finally, for medium RR, comparable magnitudes are expected, i.e., a regression line with a slope of approximately −1. In contrast, FRA predicts that the slope will be fixed according to a constant adaptation ratio, such that the same slope would be seen for low, medium, and high RR. A special case of FRA is VDA, which predicts a vertical regression line (only vestibular adaptation), regardless of RR.
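These model predictions can be expressed with a single parameter, the vestibular/visual adaptation ratio. The sketch below is illustrative only: it assumes that under RBA this ratio equals RR (as implied by the "RR predictions" quoted for the example sessions), that under FRA it is a constant, and that under VDA it is infinite. The function name and the `extent` parameter (fraction of the conflict resolved) are hypothetical.

```python
import math


def predicted_shifts(ves_to_vis_ratio, delta, extent=1.0):
    """Return (visual PSE shift, vestibular PSE shift) for a conflict of size delta.

    ves_to_vis_ratio: RR for RBA, a constant for FRA, math.inf for VDA (assumptions).
    extent: fraction of the discrepancy that has been adapted away (0 to 1).
    """
    if math.isinf(ves_to_vis_ratio):
        p_ves, p_vis = 1.0, 0.0
    else:
        p_ves = ves_to_vis_ratio / (1.0 + ves_to_vis_ratio)
        p_vis = 1.0 - p_ves
    # For positive delta the vestibular PSE shifts positively and the visual
    # PSE negatively (quadrant II of the vestibular-versus-visual plot).
    return (-extent * delta * p_vis, extent * delta * p_ves)


print(predicted_shifts(5.0, 10.0))       # RBA with high RR: mostly vestibular shift
print(predicted_shifts(0.2, 10.0))       # RBA with low RR: mostly visual shift
print(predicted_shifts(1.75, 10.0))      # FRA with a fixed ratio, independent of RR
print(predicted_shifts(math.inf, 10.0))  # VDA: vestibular shift only
```

Because the ratio of the two shifts does not depend on `extent`, partial adaptation leaves the predicted regression slope unchanged in this sketch.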
To test which model best depicts cue adaptation, the data were sorted by low, medium, and high RR, and the vestibular versus visual shifts were plotted (Fig. 4). The results clearly contradicted the VDA model, because visual shifts were observed. In fact, for both the human (Fig. 4A) and the monkey (Fig. 4B) data, there seemed to be an influence of RR on cue adaptation, suggestive of RBA and not FRA. As predicted by RBA, high RR data approach a vertical line (Fig. 4, third column; especially for the monkey data) versus low RR data, which approach a horizontal line (Fig. 4, first column). Such conclusions would be similar to those of Burge et al. (2010), who concluded that visual–haptic cue adaptation follows the RBA model. However, these results need to be treated with caution because of the large changes in variability observed for different RRs. Specifically, variability of visual PSE shifts increased with decreasing RR, as discerned by the distribution of data along the x-axis (Fig. 4; these changes in variability are quantified below and in Fig. 5, rightmost column). In fact, data variability itself can strongly influence the orientation of the regression lines (we elaborate on this point in Discussion). Hence, these plots are not adequate to conclude that cue adaptation follows RBA.
If cue adaptation were to achieve internal consistency, then the absolute sum of cue shifts would equal 10°. When summing the absolute visual and vestibular PSE shifts (histograms in Fig. 4), it is very apparent that cue adaptation does not reach internal consistency. In fact, for almost all of the data, the sum of PSE shifts was <10°. This is not surprising given the limited length of our experiments. Visual inspection of the cue-shift plots from Burge et al. (2010) indicates the same to be true for visual–haptic adaptation.
Complete adaptation (internal consistency) is actually not a requirement to test the models. This is because, according to the theoretical framework presented above, the ratio of cue adaptation would be the same after partial adaptation as for complete adaptation. However, because of the large variability in cue shifts (as described above), the adaptation ratio may not adequately represent individual sessions. Furthermore, cues sometimes shifted in the “unexpected” direction (marked by the gray regions in Fig. 4; especially the visual cue at low RR). Therefore, to gain additional insight into the effects of cue reliability on adaptation, we next analyzed adaptation magnitude separately for visual and vestibular cues.
Cue adaptation magnitude does not correlate with relative cue reliability
The magnitudes of individual-cue PSE shifts were plotted as a function of RR (Fig. 5, leftmost column). Blue and red circles represent vestibular and visual shifts, respectively; filled circles indicate significant shifts. For both the human (Fig. 5A) and monkey (Fig. 5B) data, linear regressions were calculated separately for the visual and vestibular cues. Dependence of cue adaptation magnitude on RR was assessed by testing whether the p value of Pearson's correlation coefficient was <0.05. Strikingly, there were no significant correlations between vestibular or visual PSE shifts and RR: in the human data, r = 0.16 (p = 0.11) and r = 0.11 (p = 0.29) for the vestibular and visual correlations, respectively. This is in contrast to the RBA prediction of a positive correlation for vestibular PSE shifts and a negative correlation for visual PSE shifts. In the monkey data, a small tendency for opposite dependence was seen, but the slopes were not significantly different from 0; r = 0.12 (p = 0.23) and r = −0.13 (p = 0.19) for the monkey vestibular and visual correlations, respectively.
The vestibular cue adapts more than the visual cue, regardless of reliability ratio
The distributions of vestibular and visual PSE shifts (for all RRs pooled) are presented by blue and red histograms in Figure 5 (middle column), respectively. The filled sections in the bars represent significant shifts. Cue shift distributions were analyzed statistically under the null hypothesis of no shift, and p values were calculated using t tests and the Bonferroni correction for multiple comparisons. The mean vestibular and visual shifts (blue and red dotted lines superimposed on the histograms) were significantly positive, i.e., in the “expected” direction, for both the human (Fig. 5A) and monkey (Fig. 5B) data (p < 0.0001 for all four comparisons). Furthermore, the average vestibular shift was significantly greater than the average visual shift, in both the human (p < 0.01) and monkey (p < 0.0001) data. The ratio of vestibular to visual PSE shift was ∼2:1 (1.75 for humans and 2.30 for monkeys).
When grouping the data by low, medium, and high RR (rightmost column), the mean vestibular PSE shift (blue) was always greater than the mean visual shift (red), even for low RR. This, too, is demonstrated for both the human and monkey data and is contrary to the RBA prediction. Furthermore, comparing the mean PSE shift for low RR versus high RR (separately for visual and vestibular cues) revealed no significant differences for either the human or monkey data (p > 0.2 for all four comparisons, using a t test). However, we did find that the SD for visual PSE shifts (red vertical lines) was significantly greater at low RR than high RR, for both the human and monkey data (p < 0.0001, using a χ2 test and the Bonferroni correction for multiple comparisons). In contrast, the SDs for vestibular PSE shifts (blue vertical lines) remained unchanged for both the human and monkey data (p > 0.1). Hence, the only factor dependent on RR was visual PSE shift variability.
The finding that the vestibular cue shifts more and the visual cue less than expected by RBA is also demonstrated when analyzing proportional cue shifts. For this analysis, proportional PSE shifts were calculated as follows:
It should be noted that, when cues shift in the unexpected direction, calculating the proportional PSE shift may be an ill-posed problem.
Expected proportional PSE shifts were predicted by the relative cue reliabilities according to RBA. In Figure 6, the proportional shift data are plotted versus the expected proportional shifts. Blue and red lines represent regressions for the vestibular and visual data, respectively, and shaded regions represent 95% confidence bands of the regressions based on 1000 bootstrapped datasets. The diagonal (dashed) lines represent the expected regression if proportional shifts were to follow prediction. Significance was judged by whether or not the 95% confidence band included the diagonal. In the human data (Fig. 6A), the relative vestibular shift was significantly greater than expected, and the relative visual shift was significantly less than expected. In the monkey data (Fig. 6B), the relative vestibular shift was greater (albeit not significantly) than expected, and the relative visual shift was significantly less than expected.
A model of fixed-ratio adaptation better accounts for visual and vestibular PSE shifts
Our findings that cue adaptation magnitude does not correlate with relative cue reliability and that the vestibular cue adapts more than the visual cue, regardless of reliability ratio (Fig. 5), strongly implicate an FRA model. Furthermore, by dividing the average vestibular shift by the average visual shift, we found that the ratio representative of the FRA model was ∼2:1. However, cue adaptation-ratio analysis on a session-by-session basis, as described by Burge et al. (2010) and in our comparable Figure 4, seemed to suggest that there might also be a reliability-based component. Hence, we performed additional analysis to see whether RBA or FRA would best describe the individual session adaptation ratios.
The adaptation ratio of each session can be described by the angle it subtends on the Cartesian plane (as seen in Fig. 4). To compare the various models, we calculated the difference between the angle of each data point and the predicted angles of the models. A 0° difference would indicate perfect alignment between the data and model prediction; a 180° difference would indicate that the data behaved diametrically opposite to the prediction of the model. We compared four models: FRA, VDA, RBA, and vestibular-dominant adaptation (VeA). VeA, which predicts only visual and no vestibular adaptation, is not a viable model. It was introduced here only for the sake of comparison because it is the complementary form of VDA and hence also a subcase of FRA, at the other extreme to VDA. For FRA, the actual vestibular/visual adaptation ratio extracted from the mean PSE shifts (Fig. 5) was used: 1.75 and 2.30 for humans and monkeys, respectively.
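The angular comparison described above can be sketched as follows. This is an illustrative reading of the procedure, not the authors' code: each model is summarized by its vestibular/visual adaptation ratio (FRA: the constant from the data; RBA: assumed equal to the session's RR; VDA: infinite; VeA: zero), the predicted direction is placed in quadrant II for positive Δ and quadrant IV for negative Δ, and all function names are hypothetical.

```python
import math


def model_angle(ves_to_vis_ratio, delta_sign):
    """Predicted direction (radians) in the plane x = visual shift, y = vestibular shift."""
    if math.isinf(ves_to_vis_ratio):
        p_ves, p_vis = 1.0, 0.0
    else:
        p_ves = ves_to_vis_ratio / (1.0 + ves_to_vis_ratio)
        p_vis = 1.0 - p_ves
    return math.atan2(delta_sign * p_ves, -delta_sign * p_vis)


def angular_deviation_deg(vis_shift, ves_shift, ves_to_vis_ratio, delta_sign):
    """Absolute angle (0 to 180 degrees) between a session's shifts and a model prediction."""
    data_angle = math.atan2(ves_shift, vis_shift)
    diff = abs(data_angle - model_angle(ves_to_vis_ratio, delta_sign)) % (2.0 * math.pi)
    if diff > math.pi:
        diff = 2.0 * math.pi - diff
    return math.degrees(diff)


# Shifts from the human example with positive delta (visual -0.8 deg, vestibular +2.7 deg);
# the RBA ratio of 1.9 is the RR prediction quoted for that session.
models = {"FRA": 1.75, "RBA": 1.9, "VDA": math.inf, "VeA": 0.0}
for name, ratio in models.items():
    print(name, round(angular_deviation_deg(-0.8, 2.7, ratio, +1), 1))
```

A deviation of 0° means the session's shift direction matches the model exactly, and 180° means it points in the diametrically opposite direction.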
The cumulative distribution of the angular deviation of the data from each model is presented in Figure 7. Model fits were quantified by taking the mean of the cumulative distribution. These are annotated on the plots. Essentially, the mean of the cumulative distribution is equivalent to the normalized area under the curve. For a perfect model, all angular deviations would be 0. Hence, the cumulative distribution, and mean thereof, would be 1. A random model would produce, on average, a diagonal cumulative distribution (Fig. 7, solid black lines) with a mean value of 0.5. The cumulative distribution means were used to compare the models statistically using t tests. To minimize the number of comparisons, VeA was not compared statistically because it was a priori not considered a viable model. Two model comparisons were made: RBA versus FRA and RBA versus VDA.
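The equivalence between the mean of the cumulative distribution and the normalized area under the curve can be checked with a short sketch. It assumes the cumulative distribution is evaluated over angular deviations from 0° to 180°, in which case its mean equals 1 − (mean deviation)/180. The function names and the example deviations are hypothetical.

```python
def cdf_mean_score(deviations_deg):
    """Closed form: normalized area under the empirical CDF on [0, 180] degrees."""
    return 1.0 - sum(deviations_deg) / (180.0 * len(deviations_deg))


def cdf_mean_numeric(deviations_deg, steps=18000):
    """Same quantity, obtained by averaging the empirical CDF over a fine grid."""
    n = len(deviations_deg)
    total = 0.0
    for i in range(steps):
        x = 180.0 * (i + 0.5) / steps  # midpoint of the i-th grid cell
        total += sum(1 for d in deviations_deg if d <= x) / n
    return total / steps


example = [10.0, 20.0, 30.0]  # made-up deviations, for illustration only
print(cdf_mean_score(example), cdf_mean_numeric(example))
print(cdf_mean_score([0.0, 0.0, 0.0]))     # perfect model
print(cdf_mean_score([0.0, 90.0, 180.0]))  # deviations spread evenly over the range
```

In this formulation a perfect model scores 1, and deviations spread uniformly over 0° to 180° score 0.5, matching the two reference cases described in the text.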
The results indicate that, even for session-by-session shift-ratio analysis (which suggested an element of RBA in Fig. 4), FRA best describes the data for both humans (Fig. 7A) and monkeys (Fig. 7B). FRA was significantly better than RBA for the human (p = 0.012, after the Bonferroni correction for multiple comparisons), but not monkey, data (p > 0.2). VDA and RBA were statistically indistinguishable for both the human and monkey data (p > 0.8 for both). As expected, VeA provided the worst model fit.
Fixed ratio adaptation accounts for reliability based cue combination
We have shown previously, with the same heading-discrimination task used in this study, that visual and vestibular cues follow RBCC (Gu et al., 2008; Fetsch et al., 2009). Knowing that RBCC is used, we can study the behavior of the combined visual and vestibular cues, during and after adaptation, in light of RBCC. This could provide additional insight into the type of adaptation that the cues underwent. According to RBCC, the combined heading perception is a weighted sum of the individual cues. Hence, the combined-cue heading perception will depend on how the individual cues themselves have adapted. The predictions for combined-cue heading perception are very different for RBA versus FRA. Combined-cue heading behavior can therefore indicate whether the underlying adaptation followed RBA or FRA.
To explain the different predictions of RBA and FRA for the combined-cue estimate, we return to a concept that we introduced in Results (Visual–vestibular adaptation: simulation). If cue combination and cue adaptation follow the same model, then the combined-cue response should not shift during adaptation. This can be understood intuitively as follows: when individual cues are disparate, without external feedback, they will adapt to converge on their combined-cue estimate. For example, for visual dominance, the initial combined-cue estimate is the visual cue, which is itself the endpoint of convergence according to VDA; for a reliability-based model, the initial combined-cue estimate is the weighted average, which is itself the endpoint of convergence according to RBA. Hence, the combined-cue estimate should remain unchanged throughout the process of adaptation. If, however, cue adaptation uses a different model from cue combination, then combined-cue perception should change during adaptation.
For this analysis, we used the combined-cue behavior during adaptation, available in the human (but not monkey) data. The combined-cue heading bias in relation to the visual and vestibular cues was extracted from the data by finding the slope of the combined-cue PSE versus Δ. As explained above and presented in Figures 1 and 2, the slope of the combined-cue PSE versus Δ ranges from −½ to ½. A slope of ½ would indicate complete visual dominance, and a slope of −½ would indicate complete vestibular dominance. A slope of 0 would indicate equal weighting of visual and vestibular cues. RBCC could therefore be quantified by the slope of the combined-cue PSE versus Δ.
In Figure 8A, we demonstrate, through simulation, the predicted responses of the RBA and FRA models to an introduced heading discrepancy (Δ) between the cues. Like the simulation for Figure 1, the visual cue was presented at −Δ/2 and the vestibular cue at +Δ/2. Pre-adaptation, combined-cue PSE slopes that would result from visual or vestibular cue dominance are represented by the red and blue solid lines, respectively, and the RBCC is represented by the dark green line (Fig. 8A; same for FRA and RBA). Pre-adaptation curves can be understood as follows: visual dominance would result in a combined-cue PSE versus Δ slope of ½ (red line), and vestibular dominance would result in a slope of −½ (blue line). For RR ≫ 1, RBCC asymptotes to visual dominance, whereas for RR ≈ 0, it asymptotes to vestibular dominance.
As we demonstrated above, internal consistency was not achieved in our data. Hence, we did not constrain the models to complete adaptation; rather, the extent of adaptation was a parameter that ranged from 0 (pre-adaptation) to 1 (internal consistency). In this simulation, an adaptation extent of 0.65 was used. The vestibular/visual adaptation ratio used for the FRA simulation was 1.75, because this was the actual ratio extracted from the data (Fig. 5A). After adaptation, cyan and magenta lines represent the combined PSE slope that would result from visual and vestibular dominance, respectively. Similar to pre-adaptation, post-adaptation RBCC asymptotes to these curves. For RBA (Fig. 8A, left), the cues adapted according to the same weights as RBCC. Hence, the RBCC remained unchanged even after adaptation (superimposed on the dark green curve). In contrast, according to FRA, the cues adapted at a fixed ratio regardless of RR (Fig. 8A, right). Because adaptation followed different weights from those of cue combination, the RBCC response changed during adaptation (light green curve).
For FRA, two major changes are evident in the RBCC curve: (1) the entire curve shifted vertically, as seen by the y-intercept, and (2) the curve narrowed, as seen by the reduced y-amplitude. The former resulted directly from the ratio of vestibular/visual adaptation, and the latter resulted directly from the extent of adaptation. Hence, according to FRA, the combined-cue response can be predicted based on two model parameters: (1) the ratio of vestibular/visual adaptation and (2) the extent of adaptation. These predictions are very different from RBA, which predicts no change to cue combination during adaptation.
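A minimal linear sketch can illustrate why the combined-cue curve is unchanged under RBA but shifts and narrows under FRA. It is not the authors' simulation: it assumes reliability weights w_vis = RR/(1 + RR), cues offset by ∓Δ/2, and single-cue percepts that move toward each other by a fraction `extent` of Δ, split according to the vestibular/visual adaptation ratio. All names are hypothetical.

```python
def combined_slope_after_adaptation(rr, ves_to_vis_ratio, extent):
    """Slope of combined-cue PSE versus delta after partial adaptation (sketch)."""
    w_vis = rr / (1.0 + rr)
    w_ves = 1.0 - w_vis
    p_ves = ves_to_vis_ratio / (1.0 + ves_to_vis_ratio)
    p_vis = 1.0 - p_ves
    # Perceived offsets per unit delta, after each cue has moved toward the other.
    vis_offset = -0.5 + extent * p_vis
    ves_offset = 0.5 - extent * p_ves
    # The combined percept is h + delta * (weighted offset); the PSE cancels it.
    return -(w_vis * vis_offset + w_ves * ves_offset)


for rr in (0.2, 1.0, 5.0):
    pre = combined_slope_after_adaptation(rr, 1.75, 0.0)   # before adaptation
    rba = combined_slope_after_adaptation(rr, rr, 0.65)    # RBA: ratio assumed equal to RR
    fra = combined_slope_after_adaptation(rr, 1.75, 0.65)  # FRA: fixed ratio of 1.75
    print(rr, round(pre, 3), round(rba, 3), round(fra, 3))
```

Under these assumptions the slope can be written as (1 − extent) · ½(RR − 1)/(RR + 1) + (extent/2)(k − 1)/(k + 1), where k is the adaptation ratio: the first term narrows the curve with increasing extent, and the second shifts it vertically whenever k differs from 1. When k equals RR (the RBA case), the expression reduces to ½(RR − 1)/(RR + 1) for any extent, in line with the qualitative description above.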
The actual RBCC data are presented in Figure 8B (circles). Data were fit by the following function:
where RR is the visual/vestibular reliability ratio, and A and B are parameters used for optimization. A and B have a 1:1 relationship with the extent and ratio of adaptation as follows:
Hence, each combination of the two parameters (adaptation ratio and adaptation extent) represents a specific RBCC curve. The goodness of fit (R2) was calculated for each curve according to the standard formula:
Because the curves were defined externally to span the parametric range (and not fitted to the data), some curves provided a worse fit than the data mean. For these curves, Equation 9 would result in a negative value. Hence, R2 was truncated at 0.
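The truncation step can be sketched as below, assuming that the goodness of fit is the usual coefficient of determination, 1 − SS_res/SS_tot (the source describes it only as the standard formula). The function name and the example values are hypothetical.

```python
def truncated_r2(observed, predicted):
    """Coefficient of determination for an externally defined curve, floored at 0."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    # A curve that fits worse than the data mean gives a negative raw value.
    return max(0.0, 1.0 - ss_res / ss_tot)


observed = [0.1, 0.2, 0.3, 0.4]                        # made-up slopes, for illustration
print(truncated_r2(observed, [0.1, 0.2, 0.3, 0.4]))    # perfect curve
print(truncated_r2(observed, [0.25, 0.25, 0.25, 0.25]))  # no better than the mean
print(truncated_r2(observed, [0.4, 0.3, 0.2, 0.1]))    # worse than the mean, truncated
```

Because the candidate curves are not fitted to the data, nothing guarantees that the residual sum of squares stays below the total sum of squares, which is why the floor at 0 is needed.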
Figure 8C presents the R2 values for all possible combinations of adaptation ratio and adaptation extent. The optimal fit is represented by a dashed line in Figure 8B and a black dot in Figure 8C. According to the FRA model, we should be able to predict the RBCC fit based on the adaptation ratio and adaptation extent. Although the adaptation extent may be unknown (these data are taken during the course of adaptation), the adaptation ratio should follow the same ratio extracted from the cue shifts, presented above (Fig. 5). The actual adaptation ratio extracted from the data is represented by the solid white line in Figure 8C (the dashed white line represents the ratio from the monkey data). Fitting the function according to the actual adaptation ratio (with only one free parameter, adaptation extent) resulted in an RBCC fit almost identical to the optimal fit. This is presented by the light green curve in Figure 8B and the light green dot in Figure 8C. In contrast, the RBA prediction was worse than the data mean. This can be seen by the dark blue region at the bottom of Figure 8C, because the RBA curve (which does not change during adaptation) is essentially identical to an FRA curve with adaptation extent of 0 (as seen in Fig. 8A). The very finding that the combined response undergoes adaptation indicates that cue combination and cue adaptation cannot be using the same model/weights. The finding that FRA can account for RBCC provides additional support for the FRA model.
Finally, the deviance of the actual combined-cue PSE values (as a function of Δ) from the FRA and RBA predictions is presented in Figure 8D. The actual combined-cue PSE values were closer to the initial (pre-adaptation) visual cue for low RR and initial vestibular cue for high RR than predicted by RBA (dark green lines). These data indicate that the visual and vestibular cues did not shift according to RBA. In contrast, FRA adaptation predictions, using the parameters from Figure 8, were indistinguishable from the actual combined-cue PSE values, for all RR (light green curves).
Discussion
In this study, we probed the nature of multisensory cue calibration in the absence of external feedback. We found that, given a heading-direction discrepancy between visual and vestibular cues, both cues underwent mutual adaptation toward one another. Quantitatively, the extent of individual cue adaptation followed a fixed ratio, regardless of relative cue reliability. Specifically, the ratio of vestibular/visual adaptation was ∼2:1 for both humans and monkeys. This finding is particularly striking because, during cue integration, visual and vestibular cues are weighted according to their relative reliabilities (Gu et al., 2008; Fetsch et al., 2009; Butler et al., 2010). Our results therefore indicate that multisensory cue integration and cue calibration follow different mechanisms/principles: cue integration is reliability based, whereas cue calibration follows a fixed ratio.
Reliability-based cue integration has been demonstrated in a number of multisensory paradigms (Jacobs, 1999; van Beers et al., 1999, 2002; Landy and Kojima, 2001; Ernst and Banks, 2002; Gepshtein and Banks, 2003; Knill and Saunders, 2003; Alais and Burr, 2004; Jürgens and Becker, 2006; Gu et al., 2008; Fetsch et al., 2009; Butler et al., 2010). However, quantitative testing of the nature of multisensory cue calibration is lagging. Burge et al. (2010) recently pioneered a paradigm to quantitatively test the reliability-based model for visual–haptic calibration. They reported that visual–haptic cues follow a model of reliability-based adaptation. In our study, we emulated their paradigm but with visual–vestibular calibration. When using similar methods of analysis, our data also suggested an influence of relative reliability on cue adaptation. However, additional analysis revealed that visual–vestibular calibration is much better accounted for by a model of fixed-ratio adaptation than reliability-based adaptation.
To explain why analyzing the data according to the previous methods (Fig. 4) could suggest reliability-based adaptation even if adaptation was not reliability based, we present a simulation in Figure 9: Each subplot displays a simulated probability density function corresponding to the subplots of Figure 4. The probability density functions were plotted by color scale, in which red represents high probabilities and blue represents low probabilities. All probability density functions were generated as the combination of two bivariate Gaussians: “two,” one for positive and one for negative Δ; and “bivariate,” for visual and vestibular shifts. The Gaussian means for visual and vestibular shifts were the same across all plots: (−1°, 1°) for positive Δ and (1°, −1°) for negative Δ. Because the means were of equal magnitude and the same for all RRs, the simulated probability density functions did not represent reliability-based adaptation.
The Gaussian variances were taken from the actual data (the same variances were used for positive and negative Δ Gaussians). As we described previously (Fig. 5, rightmost column), vestibular PSE shift variability was unchanged across RRs, whereas visual PSE shift variability was strongly dependent on RR. (In itself, this result is not surprising because RR was controlled in the experiment by manipulating visual motion coherence only.) Therefore, for the Gaussians, we used the overall vestibular PSE shift variance for all RRs and specific visual PSE shift variances for low, medium, and high RR. Hence, the only difference between the RRs (columns in Fig. 9) was the variability of visual PSE shifts.
A type II regression line of 2000 data points, generated from each probability density function in Figure 9, is displayed in white. The regression lines are strikingly similar to those of the actual data (Fig. 4). They, too, approach the horizontal for low RR and vertical for high RR. However, their differences in orientation exist exclusively because of visual PSE shift variance, because there were no other differences between the simulations. Hence, changes in variability alone can cause changes in the orientation of the regression lines, similar to those seen in the data. Therefore, regression line orientation may not accurately represent the ratio of cue adaptation. Visual inspection of the data by Burge et al. (2010) seems to indicate that there, too, visual PSE shift variability could change with RR. Hence, additional analysis may be required to verify the model of adaptation.
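The effect of variability on regression-line orientation can be reproduced with a small simulation. The sketch below follows the structure described above (two Gaussians with means (−1°, 1°) and (1°, −1°), 2000 points), but the SD values are hypothetical placeholders rather than the variances from the data, and major-axis regression is assumed as the type II method. Function names are likewise hypothetical.

```python
import math
import random


def simulate_shifts(sd_vis, sd_ves, n=2000, seed=0):
    """Draw (visual shift, vestibular shift) pairs; means are identical for every sd."""
    rng = random.Random(seed)
    points = []
    for i in range(n):
        sign = 1.0 if i % 2 == 0 else -1.0  # alternate positive and negative delta
        vis = rng.gauss(-1.0 * sign, sd_vis)
        ves = rng.gauss(1.0 * sign, sd_ves)
        points.append((vis, ves))
    return points


def major_axis_slope(points):
    """Slope of the major-axis (orthogonal, type II) regression line."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)  # orientation of the major axis
    return math.tan(theta)


# Hypothetical SDs: large visual variability (as at low RR) versus small (as at high RR).
slope_large_vis_sd = major_axis_slope(simulate_shifts(sd_vis=3.0, sd_ves=1.0))
slope_small_vis_sd = major_axis_slope(simulate_shifts(sd_vis=0.5, sd_ves=1.0))
print(slope_large_vis_sd, slope_small_vis_sd)
```

Although the simulated adaptation ratio is 1:1 in both cases, the fitted line is shallower than −1 when visual variability is large and steeper than −1 when it is small, so the orientation tracks the variances rather than the means.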
The analyses in our paper strongly argue that visual–vestibular calibration follows fixed-ratio and not reliability-based adaptation. In particular, we found that neither visual nor vestibular adaptation correlated with RR. Rather, cue shifts remained constant regardless of whether the visual or the vestibular cue was more reliable, indicating that cue calibration does not follow the same mechanism as cue integration. The fact that our findings were consistent for both humans and monkeys consolidates these results. Even the vestibular/visual shift ratio (∼2:1) was similar for humans and monkeys. Finally, only through modeling fixed-ratio adaptation (using the ratio extracted from the data) were we able to account for reliability-based cue integration during the adaptation process.
Ernst and Di Luca (2011) suggest that using relative cue reliability for multisensory calibration may be a suboptimal strategy because variance of a cue does not necessarily determine the probability of it being biased. This brings us back to the difference between reliability and accuracy: reliability is the inverse variance of the probability distribution that describes the contribution of a sensory signal to the perceptual estimation. In contrast, accuracy is the probability that the sensory signal truly represents the real-world physical property. Accuracy and reliability are therefore different properties. Hence, relative cue reliability may not be a good indication as to which cue most likely requires calibration.
If the goal of multisensory cue integration is optimal perception (in the sense of improving precision), a reliability-based model makes sense. In contrast, assuming that the goal of cue calibration is improvement of cue accuracy, an accuracy-based mechanism would seem more appropriate. The brain's estimate of cue accuracy probably has a much longer time constant than its estimate of cue reliability. Accordingly, it is possible that our finding of fixed-ratio adaptation is only relatively fixed, and that a change in cue accuracy could change the ratio of adaptation. Our finding of higher vestibular versus visual adaptation suggests that the visual cue is more accurate than the vestibular cue in heading discrimination. In this study, we manipulated cue reliability; however, we did not manipulate cue accuracy. To test the hypothesis that cue calibration follows relative accuracy, cue accuracy would need to be manipulated.
The proposal that cue accuracy is more important than cue reliability for multisensory calibration is in line with Gori et al. (2008, 2010) who found that, during development, children do not integrate visual and haptic cues optimally, but rather, touch dominates discrimination of size, and vision dominates discrimination of orientation, even in conditions in which the dominant sense is far less precise than the other. They propose that the sensory dominance may reflect cross-modal calibration, in which the more accurate sense calibrates the other (Burr and Gori, 2011).
Visual dominance has been a prevalent theory (Rock and Victor, 1964; Brainard and Knudsen, 1993). Hence, it has been used as a standard by which to compare reliability-based models (van Beers et al., 2002; Burge et al., 2010). However, visual-dominant adaptation does not represent the general reliability-independent alternative. In fact, it is only a specific subcase of fixed-ratio adaptation (in which the ratio of non-visual/visual adaptation tends to infinity). In this study, we present fixed-ratio adaptation as a novel, yet simple, model and recommend using it as a reliability-independent standard by which to compare reliability-based models. Our finding that visual–vestibular calibration actually follows the fixed-ratio adaptation model clearly highlights its relevance.
In conclusion, our results indicate that visual–vestibular cue calibration does not follow the same mechanism as cue integration. Cue integration is reliability based, whereas cue calibration follows a fixed ratio. It is possible that fixed-ratio adaptation may be only relatively fixed and that the ratio of adaptation may change with cue accuracy. To test this, multisensory cue adaptation needs to be tested as a function of relative cue accuracy.
Footnotes
This work was supported by NIH Grants EY019087, DC007620, and 5-T32-EY13360-10, and by the Edmond and Lily Safra Center for Brain Sciences at the Hebrew University of Jerusalem. We thank Chris Fetsch for his help with pilot experiments and Jason Arand and Heide Schoknecht for their help with data collection.
The authors declare no competing financial interests.
References
- Alais D, Burr D. The ventriloquist effect results from near-optimal bimodal integration. Curr Biol. 2004;14:257–262. doi: 10.1016/j.cub.2004.01.029.
- Atkins JE, Jacobs RA, Knill DC. Experience-dependent visual cue recalibration based on discrepancies between visual and haptic percepts. Vision Res. 2003;43:2603–2613. doi: 10.1016/s0042-6989(03)00470-x.
- Brainard MS, Knudsen EI. Experience-dependent plasticity in the inferior colliculus: a site for visual calibration of the neural representation of auditory space in the barn owl. J Neurosci. 1993;13:4589–4608. doi: 10.1523/JNEUROSCI.13-11-04589.1993.
- Burge J, Girshick AR, Banks MS. Visual-haptic adaptation is determined by relative reliability. J Neurosci. 2010;30:7714–7721. doi: 10.1523/JNEUROSCI.6427-09.2010.
- Burr D, Gori M. Multisensory integration develops late in humans. In: Murray MM, Wallace MT, editors. The neural bases of multisensory processes. Boca Raton: CRC; 2011. pp. 345–362.
- Butler JS, Smith ST, Campos JL, Bulthoff HH. Bayesian integration of visual and vestibular signals for heading. J Vis. 2010;10:23. doi: 10.1167/10.11.23.
- Ernst MO, Banks MS. Humans integrate visual and haptic information in a statistically optimal fashion. Nature. 2002;415:429–433. doi: 10.1038/415429a.
- Ernst MO, Di Luca M. Multisensory perception: from integration to remapping. In: Trommershauser J, Kording K, Landy MS, editors. Sensory cue integration. New York: Oxford UP; 2011. pp. 224–250.
- Fetsch CR, Turner AH, DeAngelis GC, Angelaki DE. Dynamic reweighting of visual and vestibular cues during self-motion perception. J Neurosci. 2009;29:15601–15612. doi: 10.1523/JNEUROSCI.2574-09.2009.
- Gepshtein S, Banks MS. Viewing geometry determines how vision and haptics combine in size perception. Curr Biol. 2003;13:483–488. doi: 10.1016/s0960-9822(03)00133-7.
- Ghahramani Z, Wolpert DM, Jordan MI. Computational models of sensorimotor integration. In: Morasso PG, Sanguineti V, editors. Self-organization, computational maps and motor control. Amsterdam: Elsevier; 1997. pp. 117–147.
- Gori M, Del Viva M, Sandini G, Burr DC. Young children do not integrate visual and haptic form information. Curr Biol. 2008;18:694–698. doi: 10.1016/j.cub.2008.04.036.
- Gori M, Sandini G, Martinoli C, Burr D. Poor haptic orientation discrimination in nonsighted children may reflect disruption of cross-sensory calibration. Curr Biol. 2010;20:223–225. doi: 10.1016/j.cub.2009.11.069.
- Gu Y, DeAngelis GC, Angelaki DE. A functional link between area MSTd and heading perception based on vestibular signals. Nat Neurosci. 2007;10:1038–1047. doi: 10.1038/nn1935.
- Gu Y, Angelaki DE, DeAngelis GC. Neural correlates of multisensory cue integration in macaque MSTd. Nat Neurosci. 2008;11:1201–1210. doi: 10.1038/nn.2191.
- Gu Y, Fetsch CR, Adeyemo B, DeAngelis GC, Angelaki DE. Decoding of MSTd population activity accounts for variations in the precision of heading perception. Neuron. 2010;66:596–609. doi: 10.1016/j.neuron.2010.04.026.
- Guedry FE. Psychophysics of vestibular sensation. In: Kornhuber HH, editor. The vestibular system. New York: Springer; 1974. pp. 1–154.
- Jacobs RA. Optimal integration of texture and motion cues to depth. Vision Res. 1999;39:3621–3629. doi: 10.1016/s0042-6989(99)00088-7.
- Judge SJ, Richmond BJ, Chu FC. Implantation of magnetic search coils for measurement of eye position: an improved method. Vision Res. 1980;20:535–538. doi: 10.1016/0042-6989(80)90128-5.
- Jürgens R, Becker W. Perception of angular displacement without landmarks: evidence for Bayesian fusion of vestibular, optokinetic, podokinesthetic, and cognitive information. Exp Brain Res. 2006;174:528–543. doi: 10.1007/s00221-006-0486-7.
- Knill DC, Pouget A. The Bayesian brain: the role of uncertainty in neural coding and computation. Trends Neurosci. 2004;27:712–719. doi: 10.1016/j.tins.2004.10.007.
- Knill DC, Saunders JA. Do humans optimally integrate stereo and texture information for judgments of surface slant? Vision Res. 2003;43:2539–2558. doi: 10.1016/s0042-6989(03)00458-9.
- Knudsen EI. Instructed learning in the auditory localization pathway of the barn owl. Nature. 2002;417:322–328. doi: 10.1038/417322a.
- Landy MS, Kojima H. Ideal cue combination for localizing texture-defined edges. J Opt Soc Am A Opt Image Sci Vis. 2001;18:2307–2320. doi: 10.1364/josaa.18.002307.
- Lewald J. Rapid adaptation to auditory-visual spatial disparity. Learn Mem. 2002;9:268–278. doi: 10.1101/lm.51402.
- MacNeilage PR, Banks MS, DeAngelis GC, Angelaki DE. Vestibular heading discrimination and sensitivity to linear acceleration in head and world coordinates. J Neurosci. 2010;30:9084–9094. doi: 10.1523/JNEUROSCI.1304-10.2010.
- Ohmi M. Egocentric perception through interaction among many sensory systems. Brain Res Cogn Brain Res. 1996;5:87–96. doi: 10.1016/s0926-6410(96)00044-4.
- Rock I, Victor J. Vision and touch: an experimentally created conflict between the two senses. Science. 1964;143:594–596. doi: 10.1126/science.143.3606.594.
- Scarfe P, Hibbard PB. Statistically optimal integration of biased sensory estimates. J Vis. 2011;11:pii:12. doi: 10.1167/11.7.12.
- Spence C. Explaining the Colavita visual dominance effect. Prog Brain Res. 2009;176:245–258. doi: 10.1016/S0079-6123(09)17615-X.
- van Beers RJ, Sittig AC, Gon JJ. Integration of proprioceptive and visual position-information: An experimentally supported model. J Neurophysiol. 1999;81:1355–1364. doi: 10.1152/jn.1999.81.3.1355.
- van Beers RJ, Wolpert DM, Haggard P. When feeling is more important than seeing in sensorimotor adaptation. Curr Biol. 2002;12:834–837. doi: 10.1016/s0960-9822(02)00836-9.
- Warren WH, Hannon DJ. Direction of self-motion is perceived from optical-flow. Nature. 1988;336:162–163.
- Watt SJ, Akeley K, Ernst MO, Banks MS. Focus cues affect perceived depth. J Vis. 2005;5:834–862. doi: 10.1167/5.10.7.
- Wichmann FA, Hill NJ. The psychometric function. I. Fitting, sampling, and goodness of fit. Percept Psychophys. 2001a;63:1293–1313. doi: 10.3758/bf03194544.
- Wichmann FA, Hill NJ. The psychometric function. II. Bootstrap-based confidence intervals and sampling. Percept Psychophys. 2001b;63:1314–1329. doi: 10.3758/bf03194545.
- Witten IB, Knudsen EI. Why seeing is believing: merging auditory and visual worlds. Neuron. 2005;48:489–496. doi: 10.1016/j.neuron.2005.10.020.
- Yuille AL, Bülthoff HH. Bayesian decision theory and psychophysics. In: Knill DC, Richards W, editors. Perception as Bayesian inference. New York: Cambridge UP; 1996.