Author manuscript; available in PMC: 2017 Sep 7.
Published in final edited form as: Neuron. 2016 Sep 1;91(5):1110–1123. doi: 10.1016/j.neuron.2016.08.007

Reading out olfactory receptors: Feedforward circuits detect odors in mixtures without demixing

Alexander Mathis 1,2,4, Dan Rokni 1,4, Vikrant Kapoor 1, Matthias Bethge 2,3, Venkatesh N Murthy 1,*
PMCID: PMC5035545  NIHMSID: NIHMS814296  PMID: 27593177

Abstract

The olfactory system, like other sensory systems, can detect specific stimuli of interest amidst complex, varying backgrounds. To gain insight into the neural mechanisms underlying this ability, we imaged responses of mouse olfactory bulb glomeruli to mixtures. We used this data to build a model of mixture responses that incorporated nonlinear interactions and trial-to-trial variability and explored potential decoding mechanisms that can mimic mouse performance when given glomerular responses as input. We find that a linear decoder with sparse weights could match mouse performance using just a small subset of the glomeruli (~15). However, when such a decoder is trained only with single odors, it generalizes poorly to mixture stimuli due to nonlinear mixture responses. We show that mice similarly fail to generalize, suggesting that they learn this segregation task discriminatively by adjusting task-specific decision boundaries without taking advantage of a demixed representation of odors.

Introduction

In natural environments, the olfactory system must be able to detect behaviorally relevant odors against varying background smells. This task resembles the cocktail party problem in audition where the signals of different sound sources also arrive as a linear mixture at the sensory organ. Similar to our remarkable ability of disentangling sound sources, mice have been shown to excel at the olfactory equivalent (Rokni et al., 2014). Yet, with increasing number of mixture components this task can get difficult due to the overlapping, combinatorial representation of odorants by receptor neurons (Duchamp-Viret et al., 1999, Koulakov et al., 2007, Rospars et al., 2008, Shen et al., 2013) and sparse, distributed representations in higher olfactory areas (Stettler and Axel, 2009, Wilson and Sullivan, 2011). These representations have been described as synthetic, in the sense that combinations of odorants are thought to be encoded rather than individual components (Gottfried, 2010, Wilson and Sullivan, 2011).

In the main olfactory system of the mouse, each sensory neuron expresses only one out of a family of ~1000 types of olfactory receptor proteins (Godfrey et al., 2004). Each odor activates a subset of these receptor types and is identified by this subset; and each receptor type may be activated by many odors, leading to a potentially large overlap in the representation of different odorants (Duchamp-Viret et al., 1999, Malnic et al., Shen et al., 2013). Furthermore, mixtures may be represented as nonlinear functions of their parts (Rospars et al., 2008; Shen et al., 2013). In a previous study, we found that mice can be trained to perform a target-odor detection task and can do so remarkably well even in the presence of a large number of background odors (Rokni et al., 2014). These behavioral results led to the question of how mice solve this task given the nonlinear, noisy and high dimensional input that the glomeruli provide.

Various attempts have been made previously to provide a theoretical understanding of how odors can be detected against a background. Several studies have suggested algorithms that rely on the temporal structure of OB activity to provide a concentration- and background-invariant code for odor identity (Brody and Hopfield, 2003, Galán et al., 2006, Hiratani and Fukai, 2015; Hopfield, 1991, Li 1990). Detection of a specific component can also be achieved using Bayesian inference if one assumes prior knowledge of the receptor activation levels for ‘all odors’. Such prior knowledge could be implemented in the brain as a ‘generative model’, which could be formed by unsupervised learning (Mumford, 1994, Hinton and Sejnowski, 1999). Unsupervised learning could yield ‘demixed’, atomic representations of odors in higher olfactory areas – akin to Barlow’s idea that a lot of knowledge about the world is imprinted into the representations of sensory systems (Barlow, 1997). Based on a linear encoding model, Bayesian inference has been shown to work well in conditions where the typical mixture contains few components (Grabska-Barwinska et al., 2013). More generally, many algorithms for blind source separation and inference could be applied to such a generative model (Bell and Sejnowski, 1995, Otazu and Leibold, 2011, Tootoonian and Lengyel, 2014, Cuevas Rivera et al., 2015). Alternatively, supervised methods, such as linear classifiers, could be used for discriminative learning (Galán et al., 2006, Berens et al., 2012, Shen et al., 2013). None of these studies, however, attempted to identify the decoding mechanism that mimics mouse psychophysical performance – this is the focus of the current work.

In this study, we sought to find the simplest decoding mechanism that uses experimentally constrained neuronal input and matches the ability of mice to detect target odorants (Rokni et al., 2014). We measured representations of single odorants and hundreds of mixtures by olfactory receptors to derive a quantitative model of mixture responses, which also included saturation and experimentally measured trial-to-trial variability. We tested whether simple, biologically plausible readout mechanisms can solve the behavioral task based on inputs that were generated by the mixture model. We find that the simplest multi-glomerular approach, which sums up the weighted activity of different glomeruli and compares this sum to a threshold (a linear classifier), is sufficient to mimic the behavior of mice. Several glomeruli have to be combined, which argues against a strict labeled-line encoding model. The quality of the mixture samples used in training the model was important, since a model trained only on single odors generalizes poorly to mixtures. We tested this prediction in mice and found similar behavior.

Results

In the behavioral task, mice were required to report the presence of one of two target odors in random mixtures containing up to 14 odors (Fig. 1A). After learning, mice achieved accuracies close to 100% for few distractors, and performance declined with increasing number of background odors (Rokni et al., 2014). This is a seemingly remarkable feat given that a) mice had to generalize from around 1000 training trials to mixtures that they had never smelled before (more than 60% are novel in test phase, ~50,000 possible mixtures), b) glomerular patterns are highly overlapping, c) mixture responses arise from nonlinear interactions of single odors, and d) odor responses are highly variable from trial-to-trial. To assess how challenging this task actually is, we started by quantifying all the parameters of the input at the level of receptors and then probed the ability of decoders to use glomerular activity patterns to extract the presence of particular target odors.

Figure 1. Variability and summation of glomerular odor responses.

(A) Task structure. In each trial an odor mixture is randomly generated from a pool of 16 odors. The task is to categorize based on the presence of one of two targets (green). Filled squares denote present odors and empty squares denote absent odors. (B) The classifier reads the glomerular responses as inputs and attempts to report the presence of the targets. (C) Raw, temporal average fluorescence in one imaged area. (D) Superposition of the dF/F responses to the 16 odors for same area as in C. Ellipses show putative glomeruli. (E) Average response to ethyl propionate. Time course of fluorescence dF/F changes shown for three glomeruli at right. (F) Responses to four presentations of methyl tiglate. Each panel shows one trial for all glomeruli. (G) These four responses to methyl tiglate are plotted against the mean response. Colors represent trials as indicated by colored dots in F. Lines are linear fits. (H) The trial-by-trial standard deviation after removal of correlated variability (see Methods) is plotted against the mean response (gray dots) for each glomerulus-odor pair. Red dots are the median of mean-response-binned data. Red line is a linear fit to the data (forced through origin). (I) An example of linear summation of mixture components. Shown are the average responses to individual components (blue), the response to the mixture (black), and the linear sum of the responses to the individual components (red). (J) An example of sub-linear summation of mixture components. Colors as in I. (K) All mixture responses are plotted against the linear sum of the responses to individual components for one single glomerulus. Colored data points show the examples in I and J. Red line depicts the best sigmoid function fit (Eqn. (5), see Methods). (L) Same as K with data pooled across glomeruli and mice. Both the linear sum and the experimental data were normalized for each glomerulus to the saturation value of the fitted sigmoid function.

Estimating an encoding model for mixtures using empirical data

In earlier experiments (Rokni et al., 2014), we had measured the average glomerular responses to the 16 individual odors in anesthetized mice expressing the Ca++ indicator GCaMP3 in all olfactory receptor neurons (glomeruli were from the dorsal surface of the OB). The measured signals arise from the collection of sensory axons converging on glomeruli. The number of glomeruli that were responsive to at least one odor in each OB ranged between 63 and 76. Due to the large number of possible odor mixtures, measuring the neural responses to the whole stimulus set is intractable (> 10,000 mixtures). Therefore, we estimated a statistical model that describes the glomerular responses given a particular odor mixture.

We imaged responses of glomeruli in the dorsal OB to all single components as well as a sample of ~ 500 mixtures in 5 additional OMP-GCaMP3 mice (Isogai et al., 2011, Fig. 1C–L). To estimate the variability of responses each single odorant was presented several times per experiment (7.7 ± 1.7, mean ± SD, Fig. 1F). Over all odor-glomerulus pairs, the average coefficient of variation (CV) was 0.37 ± 0.07 (mean ± SD). However, much of the variability was correlated across glomeruli, probably reflecting causes that are either irrelevant to behaving mice (e.g. anesthesia level and breathing parameters; Blauvelt et al., 2013) or could be easily factored out by neural operations such as normalization. This correlated variability is evident when the responses of all glomeruli to each trial of a specific odor were plotted against the mean response to that odor across trials (Fig. 1G). To estimate the non-correlated variability, we first subtracted the correlated variability from all responses. This was achieved by subtracting from each response pattern its best linear fit to the mean response pattern of the same odor (solid lines in Fig. 1G, see Eqn. (4) in Methods). The timescale of changes in the linear fit parameters was on the order of 10 minutes (Fig. S1), suggesting factors such as variation in anesthesia. For each glomerulus i and odor j, we then calculated the uncorrelated component of the standard deviation (Eqn. (4)) across trials t, and plotted it against its mean response amplitude Oij. The non-correlated CV for each experiment was calculated as the slope of the linear relationship between the non-correlated SD and response amplitude (Fig. 1H). Non-correlated CVs ranged between 0.08 and 0.13 (0.099 ± 0.019, mean ± SD, n=5 mice). This value is higher than expected based on the CVs measured in single rat receptor neurons in vivo (Duchamp-Viret et al., 2005) indicating that our measure of variability is probably conservative.
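The removal of correlated variability described above can be illustrated with a minimal sketch. It assumes that each trial is fit to the mean pattern with a gain and an offset (the exact form of Eqn. (4) is given in Methods and may differ), it handles repeated trials of a single odor only, and it runs on synthetic data. The function name `noncorrelated_cv` and all demo values are hypothetical.

```python
import numpy as np

def noncorrelated_cv(responses):
    """Estimate the non-correlated CV from repeated trials of ONE odor.

    responses: array of shape (n_trials, n_glomeruli).
    ASSUMPTION: each trial is fit to the across-trial mean pattern with a
    gain and an offset; the paper's Eqn. (4) may differ in detail.
    """
    mean_pattern = responses.mean(axis=0)
    design = np.column_stack([mean_pattern, np.ones_like(mean_pattern)])
    residuals = np.empty_like(responses, dtype=float)
    for t, trial in enumerate(responses):
        # Best linear fit of this trial to the mean pattern (correlated part).
        coef, *_ = np.linalg.lstsq(design, trial, rcond=None)
        residuals[t] = trial - design @ coef
    # Residual SD across trials, one value per glomerulus.
    sd = residuals.std(axis=0, ddof=1)
    # Slope of SD versus mean response for a line forced through the origin.
    return float(mean_pattern @ sd / (mean_pattern @ mean_pattern))

# Synthetic demo: a shared per-trial gain plus 10% independent noise.
rng = np.random.default_rng(0)
true_pattern = rng.uniform(0.2, 1.0, size=70)
gains = np.linspace(0.7, 1.3, 8)
noise = rng.normal(0.0, 0.1, size=(8, 70))
demo = gains[:, None] * true_pattern * (1.0 + noise)

raw_cv = float((demo.std(axis=0, ddof=1) / demo.mean(axis=0)).mean())
print("raw CV:", raw_cv, "non-correlated CV:", noncorrelated_cv(demo))
```

In this toy setting the shared gain inflates the raw CV, while the residual-based estimate stays close to the independent noise level that was put in.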

To estimate how glomerular responses to mixtures relate to their responses to mixture components, we imaged a sample of ~100 mixtures per experiment (98 ± 39, mean ± SD, n=5) and compared the responses with the sum of the responses to individual components of the mixture (Fig. 1I–L). Most mixture responses could be estimated rather well by the linear sum of individual component responses (Fig. 1I). However, beyond a certain value, mixture responses saturated (Fig. 1J–L). Saturation in odor responses may be due to several factors such as odor-receptor interaction, receptor neuron firing, and saturation of the Ca++ indicator. We could not disambiguate which of these contribute to saturation in our data set. Since receptor-ligand interactions are well known to saturate with increasing ligand concentration (Firestein et al., 1993), we conservatively assume that saturation is a real characteristic of olfactory receptor neurons’ odor responses and use a saturating curve in our model for mixture responses.

These mixture responses can thus be summarized by the following encoding model that describes the activity vector R of glomerular responses in trial t (see Methods):

$$R(t) = \lambda R_0(t) + (1 - \lambda)\,\sigma\big(R_0(t)\big) \qquad (1)$$

Here, λ ∈ [0,1] interpolates between the fully linear model R0(t) (λ = 1) and the fully saturated model σ(R0(t)) (λ = 0), where σ denotes the saturating nonlinearity (Eqn. (6)). The linear encoding part is given by

$$R_0(t) = \sum_{j=1}^{16} c_j(t)\,\big(O_j + \eta_j\big), \qquad \eta_j \sim \mathcal{N}\big(0,\ \alpha^2 O_j^2\big) \qquad (2)$$

where Oj is the glomerular response pattern for the jth odor, cj(t) is a binary variable denoting the presence of odor j in trial t, and the noise term ηj is normally distributed with a mean of zero and a variance of α²Oj². The factor α controls the coefficient of variation, and our empirical estimate for it is ~ 0.1. We will refer to the sequence of cj(t) during an experiment as the trial statistics.

In addition to introducing nonlinear mixing, parameterized by λ, we also systematically varied levels of trial-to-trial variability to study its effect on decoding performance.
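To make the encoding model of Eqns. (1) and (2) concrete, the following minimal sketch draws one glomerular response vector for a given mixture. The saturating function `saturate` is a hypothetical stand-in (a scaled tanh), because the fitted nonlinearity of Eqn. (6) is specified in Methods and not reproduced here; the odor patterns are random placeholders rather than imaging data.

```python
import numpy as np

def saturate(x, r_max=1.0):
    # HYPOTHETICAL saturating curve standing in for the sigma of Eqn. (6).
    return r_max * np.tanh(x / r_max)

def sample_response(O, c, lam, alpha, rng):
    """Draw one response vector R(t) following Eqns. (1) and (2).

    O:     (n_odors, n_glomeruli) mean single-odor patterns O_j
    c:     (n_odors,) binary presence vector c_j(t)
    lam:   1 gives the fully linear model, 0 the fully saturated one
    alpha: coefficient of variation of the trial-to-trial noise
    """
    # eta_j ~ N(0, alpha^2 O_j^2), drawn independently per odor and glomerulus.
    eta = alpha * O * rng.normal(size=O.shape)
    R0 = (c[:, None] * (O + eta)).sum(axis=0)             # Eqn. (2)
    return lam * R0 + (1.0 - lam) * saturate(R0)          # Eqn. (1)

rng = np.random.default_rng(1)
O = rng.uniform(0.0, 1.0, size=(16, 72))   # placeholder patterns, not real data
c = np.zeros(16)
c[[0, 3, 7]] = 1                           # a three-component mixture

r_lin = sample_response(O, c, lam=1.0, alpha=0.0, rng=rng)
r_sat = sample_response(O, c, lam=0.0, alpha=0.1, rng=rng)
print(r_lin[:3], r_sat[:3])
```

With λ = 1 and α = 0 the sample reduces to the plain sum of the component patterns, which is a convenient sanity check.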

Performance of optimal linear decoder for measured mixture model

The simplest decoder at the population level is a linear readout, similar to a single perceptron (Rosenblatt, 1958). Such a readout weighs the activity of individual glomeruli Ri(t) by synaptic weights wi and predicts the presence or absence of a target when the weighted sum is larger or smaller than a threshold θ, respectively. This linear-nonlinear cascade can be interpreted as a single readout neuron. Mathematically, the output is given by

$$y = H\big(w^{T} R(t) - \theta\big), \qquad (3)$$

where y is the binary output, H the Heaviside function, which is 1 for positive arguments and 0 otherwise and T denotes the transpose of the weight vector w.

In our published work (Rokni et al., 2014), there were thirteen mice and 1,500–3,500 trials per mouse (2,308 ± 625, mean ± SD), with the number of odors per trial equally distributed between 1 and 14. For each mouse this segregation task can be summarized by the trial statistics cj(t) and the ideal behavior r(t). For each animal’s trial statistics, we used a random training set of 80% of the trials to calculate the optimal linear weights that minimize the mean squared deviation between the ideal behavior r(t) and the output of the linear readout (see Eqn. (8)), as well as the optimal decision threshold θ that maximizes performance on samples drawn from the encoding model (Eqn. (1)). This decoder, which we call the OLE (for optimal linear estimator), was then tested on the remaining 20% of trials (Methods). We repeated this procedure 20 times and report the average performance on the test sets.
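The training procedure can be sketched as follows, under several stated assumptions: weights come from ordinary least squares with a constant offset column (Eqn. (8) may differ), the threshold is chosen by scanning midpoints between training projections, and the data are a synthetic linear-mixing task with a single target odor rather than the experimental trial statistics. The names `simulate`, `fit_ole` and `predict` are hypothetical.

```python
import numpy as np

def simulate(O, n_trials, alpha, rng):
    """Synthetic trials with linear mixing (Eqn. (2)); odor 0 is the target."""
    n_odors = O.shape[0]
    c = (rng.random((n_trials, n_odors)) < 0.3).astype(float)
    c[:, 0] = rng.random(n_trials) < 0.5         # target on about half the trials
    eta = alpha * O * rng.normal(size=(n_trials,) + O.shape)
    R = np.einsum("tj,tjg->tg", c, O + eta)
    return R, c[:, 0].astype(int)

def fit_ole(R, r):
    """Least-squares weights, then the threshold maximizing training accuracy."""
    X = np.column_stack([R, np.ones(len(R))])    # ASSUMPTION: offset term in the fit
    coef, *_ = np.linalg.lstsq(X, r.astype(float), rcond=None)
    w = coef[:-1]
    scores = R @ w
    s = np.unique(scores)                        # sorted, duplicates removed
    candidates = np.concatenate([[s[0] - 1.0], (s[:-1] + s[1:]) / 2.0, [s[-1] + 1.0]])
    acc = [np.mean((scores > th) == r) for th in candidates]
    return w, candidates[int(np.argmax(acc))]

def predict(R, w, theta):
    """Eqn. (3): Heaviside of the thresholded weighted sum."""
    return (R @ w - theta > 0).astype(int)

rng = np.random.default_rng(0)
O = rng.uniform(0, 1, (16, 72)) * (rng.random((16, 72)) < 0.3)  # placeholder patterns
R, r = simulate(O, 400, alpha=0.1, rng=rng)

# Random 80/20 split into training and test trials.
idx = rng.permutation(len(R))
train, test = idx[:320], idx[320:]
w, theta = fit_ole(R[train], r[train])
test_accuracy = float(np.mean(predict(R[test], w, theta) == r[test]))
print("test accuracy:", test_accuracy)
```

Repeating the split with different permutations and averaging the test accuracy mirrors the 20-fold repetition described in the text.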

Performance was perfect without saturation (λ=1) and trial-to-trial variability (α = 0) (Fig. 2B). Increasing α gradually decreased performance, which reached ~ 65% for α = 1 (Fig. 2A and B). Similar trends were obtained using the glomerular patterns from different imaging datasets (Fig. 2A) and target odors (Fig. 2B). For a noise level similar to our experimental estimate (~ 0.1), most linear readouts performed better than corresponding mice detecting the same targets (Fig. 2B). Thus, considering only trial-to-trial variability and mixing of odors without saturation, a simple linear decoder is sufficient to match or exceed mouse performance, although it had access only to the time-averaged response of a limited set of dorsal glomeruli. Since the specific choice of data had little effect on classifier performance, we focused on a particular imaging set with 72 glomeruli.

Figure 2. Decoding performance of linear readout and SVM for mixtures with trial-to-trial variability and nonlinear mixtures.

(A) Performance of optimal linear readout in detecting one target pair (isobutyl propionate and allyl butyrate) is plotted as a function of trial-to-trial variability (α). The curves show the performance of one decoder based on each of the six measured glomerular maps and their average ± SD. The mouse performance for these targets is shown as horizontal line. The estimated range of the glomerular noise is highlighted by the shaded, red region. (B) Average decoding performance for all 13 pairs of target odors. The average performance of all mice is indicated by a horizontal line and the distribution of experimental performances is shown to the right. (C) Average performance for data from all mice of optimal linear decoder for varying noise levels and nonlinearities from λ=1 (linear case) to λ=0 (experimentally measured saturation). (D) Average performance for data from all mice, using SVM with radial basis function for mixture data as in panel C. (E) Performance of OLE and SVM as a function of the number of odors in the mixture. Performance curves for 13 mice and their average in red. The average performance curves for the SVMs and OLEs are shown for the same 13 target odor pairs for linear mixture model (λ=1) with α=0.25. Those are the conditions for which the experimental and model performance curves are best correlated (Figures S4C–D). (F) Same as (E) but for mixtures with saturation (λ=0).

Next, we varied glomerular saturation by varying the parameter λ in Eqn. (1). As expected, the average performance of the optimal linear decoder for all target odors declined with both increasing nonlinearity and noise (Fig. 2C, single odor pair Fig. S2A). When noise levels are low (<0.1) and responses are not fully saturating (λ>0), the mean decoder performance is above 98%. When considering saturating glomeruli (λ=0) and the experimental estimate of noise (α=0.1), performance drops to 91.6%, which is comparable to the average mouse performance of 90.3%. This demonstrates that the ‘cocktail party problem’ is almost linearly separable at the level of glomeruli.

Performance of support vector machine (SVM) for noisy, nonlinear mixtures

Since the glomerular activity patterns in the linear, noisy mixture case R(t) are given by a superposition of multiple Gaussian random vectors, the optimal decision boundary might be highly nonlinear and not merely a single hyperplane. When the odor patterns are nonlinearly mixed, the geometry of the decision boundaries becomes even more intricate. A decoder that is able to flexibly adjust its decision boundaries without constraining them to be a single hyperplane may therefore perform considerably better than the simple linear readout. One powerful class of decoders with such properties is the SVM with radial basis functions as kernel (see Methods). The decision boundaries for such SVMs are formed by sums of Gaussians and can therefore in the limit approximate almost any decision boundary (Schoelkopf and Smola, 2002).
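For reference, the decision function of an SVM with a radial basis function kernel has the standard textbook form below, which makes the "sum of Gaussians" structure explicit. The symbols a_i (non-negative coefficients), t_i ∈ {−1, +1} (training labels), R_i (support patterns), γ (kernel width parameter) and b (offset) are generic SVM notation and are not quantities reported in this paper; H is the Heaviside function of Eqn. (3).

```latex
f(R) = \sum_{i} a_i\, t_i \exp\!\left(-\gamma\, \lVert R - R_i \rVert^{2}\right) + b,
\qquad
y = H\big(f(R)\big)
```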

We trained SVM decoders with the same cross validation procedure as described for the optimal linear decoder. As expected from their versatility, SVMs were more accurate than the linear readout. Fig. 2D shows the performance of the SVM for the same target odors as the optimal linear decoder in Fig. 2C. When noise levels are low (α < 0.1) and responses are not fully saturating (λ > 0), the SVM performance was above 99.5%. When considering saturating glomeruli (λ = 0) and the experimental estimate of noise (α = 0.1) performance drops to 94.7%, outperforming mice and making about 40% fewer mistakes than the optimal linear decoder. Similarly, for other conditions, SVMs make fewer errors than the optimal linear decoder and have substantially better performance for low noise levels irrespective of λ (Fig. 2C–D).

Given that the overall performance of both optimal linear decoder and SVM broadly resembled that of mice, we investigated whether they make similar errors. Specifically, we asked whether the dependence of performance on the number of mixture components is similar to the monotonic decline seen in mice (Fig. 2E and F). We refer to this relationship as the performance curve. We calculated the average correlation coefficient (r) between the performance curves of the classifiers for specific odor targets and their corresponding mice. The average correlation was maximal for the fully saturated mixture model (λ=0) with a noise level of α = 0.25 for both the optimal linear decoder (r = 0.42 ± 0.01, mean ± sem, n = 13 mice) and SVM (r = 0.40 ± 0.01, mean ± sem, n = 13 mice; Fig. S2C and D). When scrutinizing the shape of the performance curves, it is important to note that the number of possible mixtures varies with the number of components in the mixture. For instance, there are many more possible mixtures of 7 components than there are of 1 or 14 components. Therefore, specific stimuli on the edges of the performance curve repeat much more, offering more training samples for the classifier. For this reason, one may expect that classifier (and mouse) performance curves will not be monotonic, but could have a minimum at the midrange of mixtures. We find, however, that despite the different computational capabilities of SVMs and the OLE, performance for both decoders decreases as the number of components in a mixture increases, for realistic estimates of noise and nonlinearity. This suggests that the overlapping and nonlinear mixing of noisy glomerular patterns, rather than limitations of the readout mechanism, is a fundamental bottleneck for this task.
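The combinatorial point about mixture counts can be checked directly. The short sketch below counts the distinct odor subsets of each size drawn from the pool of 16 odors; it assumes plain subsets and ignores any constraint on how targets are balanced across trials.

```python
from math import comb

N_ODORS = 16          # size of the odor pool used in the task
MAX_COMPONENTS = 14   # largest mixture presented

# Number of distinct odor subsets of each size k (simplifying assumption:
# no constraint on target balancing is applied here).
counts = {k: comb(N_ODORS, k) for k in range(1, MAX_COMPONENTS + 1)}

for k in (1, 7, 14):
    print(k, counts[k])
# prints:
# 1 16
# 7 11440
# 14 120
```

Mixtures near the middle of the range are therefore far less likely to repeat during training than single odors or 14-component mixtures.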

Taken together, although the linear decoder is not the optimal decoder, its performance is comparable to mice for our conservative estimates of noise and saturation levels. We therefore used this linear decoder, rather than the SVM, for further analysis. Nevertheless, it is conceivable that nonlinear decoding schemes such as the SVM could be good models for certain olfactory computations, but for this task, nonlinear decision boundaries are not necessary.

Performance of sparse, linear decoder for noisy, nonlinear mixtures

The linear decoders considered so far had no restrictions on the number of glomeruli used. Inspired by the observation that the projections from the OB to the piriform cortex are sparse and dispersed (Ghosh et al., 2011, Miyamichi et al., 2011, Sosulski et al., 2011), we wondered how well sparse, linear readouts would perform the task. We turned to logistic regression, which is an efficient, binary classification algorithm that can be readily interpreted from a neuronal point of view as a linear readout neuron with nonlinear firing (Berens et al., 2012). Similar to the OLE, the optimal logistic regression (OLR) is found by minimizing the likelihood of misclassifications on a training set. Additionally, we employed a regularization term that punishes non-zero readout weights by adding the sum of all absolute values of the weights to the number of misclassifications (L1-regularization, Eqn. (10)). By weighing this regularization term with a constant 1/C that can be systematically varied, one can bias the OLR to exhibit varying degrees of sparseness. This regularization can be thought of as an additional cost term for neuronal wiring: the OLR attempts to maximize performance while minimizing wiring.
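The effect of the 1/C weighting on sparseness can be illustrated with a minimal sketch of L1-regularized logistic regression. It assumes the objective "mean log-loss + (1/C) · sum of |w_i|" (the scaling in Eqn. (10) may differ) and solves it by proximal gradient descent, which is a swapped-in generic solver and not necessarily the one used in the study. The inputs are random placeholders and `fit_sparse_logistic` is a hypothetical name.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the L1 norm: shrinks toward zero and clips at zero.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fit_sparse_logistic(X, y, C, n_iter=2000):
    """Minimize mean log-loss + (1/C) * ||w||_1 by proximal gradient descent.

    ASSUMPTION: this objective scaling and solver are illustrative only.
    The offset b is not penalized.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    # Step size from an upper bound on the gradient's Lipschitz constant.
    step = 4.0 * n / (np.linalg.norm(X, 2) ** 2 + n)
    for _ in range(n_iter):
        z = np.clip(X @ w + b, -30.0, 30.0)
        p = 1.0 / (1.0 + np.exp(-z))             # predicted probability
        grad_w = X.T @ (p - y) / n
        grad_b = np.mean(p - y)
        w = soft_threshold(w - step * grad_w, step / C)
        b -= step * grad_b
    return w, b

rng = np.random.default_rng(0)
X = rng.random((300, 40))                        # placeholder non-negative inputs
y = (X[:, 0] + X[:, 1] > 1.0).astype(float)      # label depends on two inputs only

for C in (1e-3, 10.0, 100.0, 1e3, 1e6):
    w, b = fit_sparse_logistic(X, y, C)
    print("C =", C, "non-zero weights:", np.count_nonzero(w))
```

Small C makes wiring expensive and drives all weights to zero (chance-level readout), whereas large C approaches an unregularized, dense readout.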

When the cost for wiring dominates, the performance of the OLR drops to chance level, and at the other extreme performance asymptotes to levels similar to the optimal linear decoder. Between these two extremes performance strongly depends on the number of nonzero readout weights (Fig. 3A). However, the overall effect of readout sparseness on performance was mild; this is not a simple consequence of the glomerular activity itself being sparse, since most glomeruli are activated by at least one odor. For our estimated noise level of 0.1, only 10–20 glomeruli are sufficient to match the performance of mice. To reach 90% of the asymptotic performance, only 15.7 ± 0.9 glomeruli (mean ± sem, n = 13 mice) were necessary (Fig. 3A). The OLR classifier is trained merely based on the sampled mixture vectors R(t) and the ideal response for the fraction of training trials, but has no explicit knowledge of the identity of the target odor or the glomerular representations of the individual components. Due to the large number of possible mixtures, even mice that were trained on the same targets were exposed to different trial statistics. Despite these differences, classifiers trained on trial statistics from different mice with the same target odors converged to similar weights in both the sparse and the dense case (Fig. 3B – see odor pairs F, G & H). When mice had different targets, the readout weights were also distinct (Fig. 3B). These weights are not simply a reflection of the glomerular patterns of the target odors, but also take into account glomerular patterns of distractors (Fig. S3A). Using the target-odor patterns as readout weights results in poor performance; even for just one target odor such a template matching algorithm is about 20% worse (Fig. S3B–C).

Figure 3. Performance of sparse linear readout.

(A) Performance versus number of non-zero weights of the OLR (λ=0). Each faded line is for data from one animal and solid line depicts the average. The horizontal, dashed lines indicate asymptotic performance and the vertical dashed lines indicate the corresponding minimal number of glomeruli to achieve 90% of that performance per animal. Their average is highlighted as solid vertical line. Inset: Average performance vs. number of non-zero weights of the OLR for varying noise levels (α=0, 0.05, 0.1, 0.25, 0.5, 0.75, 1, from top to bottom). (B) Example readout weights obtained for 90% of asymptotic performance (left) and asymptotic performance (right). Several of these 13 mice had the same target odor pairs, as indicated by grouping under the same letter. OLR learns similar readout weights from different trial statistics for the same target odors. Across different target odors, the readout weights vary substantially – but the weights are similar for the same target odors. (C) Performance of OLR versus perturbation level of weights for linear and nonlinear mixing with α=0.1. Each thin line is the average for 20 random perturbations per mouse, and the thick line is the average across mice. The horizontal lines indicate 90% of the unperturbed weight performance for either condition (λ = 0, 1). The vertical lines indicate the perturbation level where the performance drops below 90% of the unperturbed weight performance.

Saturation also affects which and how many glomeruli are read out by the OLR (Fig. S4A–B). For linear mixing (λ = 1), only ~ 10 glomeruli are necessary to achieve 90% performance (10.6 ± 0.5 mean ± sem, n = 13 mice). These glomeruli are not just a subset of the ones used in the fully saturated case.

Multiple glomeruli are necessary for detecting the target odors from noisy and saturating glomerular inputs. We asked whether single glomeruli may be sufficient to detect the targets under less conservative assumptions. We used receiver operating characteristic (ROC) analysis, a standard technique in signal detection, to estimate the ability to decode using single glomeruli, assuming linear summation of glomerular inputs and no trial-to-trial variability. We found that even in these favorable conditions, the best glomerulus-target odor pairing leads to a performance of less than 75% (with top 1-percentile 68.0%, and mean ± SD performance 52.8 ± 0.04%, n = 5473 glomerulus-odor targets pairs; Fig. S4C–D).
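A single-glomerulus ROC analysis of this kind can be sketched as below. The sketch assumes that "performance" means the best accuracy over all thresholds on one glomerulus, considers only the direction "stronger activity indicates the target", and uses tiny hand-made arrays; the function names are hypothetical and the exact criterion used in the study may differ.

```python
import numpy as np

def roc_points(x, y):
    """ROC curve for one glomerulus. x: activity per trial, y: 0/1 target label."""
    thresholds = np.concatenate([[np.inf], np.unique(x)[::-1]])
    tpr = np.array([np.mean(x[y == 1] >= th) for th in thresholds])
    fpr = np.array([np.mean(x[y == 0] >= th) for th in thresholds])
    return fpr, tpr

def auc(fpr, tpr):
    # Area under the ROC curve by the trapezoidal rule.
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

def best_accuracy(x, y):
    """Highest fraction of correct trials over all thresholds (ASSUMED criterion)."""
    fpr, tpr = roc_points(x, y)
    p1 = np.mean(y == 1)
    return float(np.max(p1 * tpr + (1.0 - p1) * (1.0 - fpr)))

x = np.array([0.0, 1.0, 2.0, 3.0])
y_separable = np.array([0, 0, 1, 1])
y_overlap = np.array([0, 1, 0, 1])

print(auc(*roc_points(x, y_separable)), best_accuracy(x, y_separable))  # 1.0 1.0
print(auc(*roc_points(x, y_overlap)), best_accuracy(x, y_overlap))      # 0.75 0.75
```

Applying `best_accuracy` to every glomerulus-target pairing would give the kind of distribution summarized in the text.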

Since synaptic transmission is inherently noisy in the nervous system (del Castillo and Katz, 1954), an important question is how robust the decoder is to fluctuations in the readout weights. To test for robustness, we perturbed the classifier weights (for a dense readout, C = 10^6) by multiplying them by a random factor that is normally distributed with a mean of one. We varied the perturbation level by varying the standard deviation of this factor. We found that the linear decoder is fairly robust to changes in its readout weights and that robustness is higher for the linear mixing case (λ = 1, Fig. 3C). Robustness of classifier performance strongly depends on the specific target odors, yet it remained above 90% of optimal for all 13 trial statistics at perturbation levels that are less than 0.25 for λ = 1, and 0.1 for λ = 0.
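The perturbation test can be sketched in a few lines. The sketch assumes that each weight receives its own independent multiplicative factor and that the threshold is left unchanged; the weights and inputs are random placeholders rather than trained OLR weights, and the function names are hypothetical.

```python
import numpy as np

def accuracy(R, r, w, theta):
    return float(np.mean((R @ w - theta > 0) == r))

def perturbed_accuracy(R, r, w, theta, level, rng, n_rep=20):
    """Mean accuracy after multiplying each weight by a factor ~ N(1, level^2).

    ASSUMPTION: factors are independent per weight; theta is not perturbed.
    """
    accs = []
    for _ in range(n_rep):
        factor = rng.normal(1.0, level, size=w.shape)
        accs.append(accuracy(R, r, w * factor, theta))
    return float(np.mean(accs))

rng = np.random.default_rng(0)
w = rng.normal(size=72)                  # placeholder for trained readout weights
R = rng.normal(size=(500, 72))           # placeholder inputs, not glomerular data
theta = 0.0
r = (R @ w - theta > 0).astype(int)      # labels the unperturbed readout gets right

for level in (0.0, 0.1, 0.25, 0.5, 1.0):
    print(level, perturbed_accuracy(R, r, w, theta, level, rng))
```

The perturbation level at which accuracy first falls below 90% of the unperturbed value is the robustness measure described above.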

Learnability and experience dependency of readout weights

We established that this task is linearly separable, but even a linear decoder has at least as many degrees of freedom as the number of mouse glomeruli. This raises the question of how difficult it is to learn the readout weights; that is, how many examples are needed to properly constrain the weights? To quantify the need for learning, we first computed the odds of a ‘random’ readout to perform well. Random readout weights were drawn from a Gaussian distribution with the same mean and standard deviation as the optimal linear readout weights. We evaluated the performance of 1,000,000 such readouts per target pair used in the experiment assuming α = 0.1 and saturating glomerular responses (λ = 0), always using the experimental trial structure (Fig. 4A). The average performance was 58.41 ± 4.53% (mean ± SD). The chance that a random readout performs better than 80% is around 1 in 10^5. Thus, random hyperplanes typically fall short of separating the mixtures with target odors from the ones without. We next directly computed how many training samples are required to find the proper readout weights for performing the task. To find the minimum number of trials needed to generalize to the rest of the data, we varied the absolute number of training trials directly. With α = 0.1, ~ 30 random trials were needed to get to 90% of asymptotic performance assuming linear mixing (λ = 1), and ~ 150 trials for saturating mixture responses (λ = 0) (Fig. 4B). This difference highlights how saturation makes the task harder to learn, but crucially these numbers of training trials compare favorably to mice.
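The random-readout control can be sketched as follows: weights are drawn from a Gaussian matched in mean and standard deviation to a reference weight vector, and each random readout is given its own best threshold. The reference weights and inputs here are random placeholders, far fewer readouts are drawn than in the study, and the function names are hypothetical.

```python
import numpy as np

def best_threshold_accuracy(scores, r):
    """Accuracy at the best threshold, scanning midpoints between sorted scores."""
    s = np.unique(scores)
    cands = np.concatenate([[s[0] - 1.0], (s[:-1] + s[1:]) / 2.0, [s[-1] + 1.0]])
    return max(float(np.mean((scores > th) == r)) for th in cands)

def random_readout_accuracies(R, r, w_ref, n_readouts, rng):
    """Accuracies of readouts with weights ~ N(mean(w_ref), sd(w_ref)^2)."""
    W = rng.normal(w_ref.mean(), w_ref.std(), size=(n_readouts, w_ref.size))
    return np.array([best_threshold_accuracy(R @ w, r) for w in W])

rng = np.random.default_rng(0)
w_ref = rng.normal(size=20)              # placeholder for the optimal weights
R = rng.normal(size=(300, 20))           # placeholder inputs, not glomerular data
r = (R @ w_ref > 0).astype(int)          # labels the reference readout separates

accs = random_readout_accuracies(R, r, w_ref, 200, rng)
print("reference:", best_threshold_accuracy(R @ w_ref, r))
print("random readouts: mean", accs.mean(), "max", accs.max())
```

Because the threshold is optimized for every readout, each accuracy is at least the frequency of the more common class, so the comparison isolates the contribution of the weights themselves.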

Figure 4. Learnability and dependence on trial statistics.

(A) Distribution of performances for 13,000,000 random linear decoders with weights drawn from the distribution of OLE-weights and corresponding optimal threshold θ. The ability of 1,000,000 random readouts to detect the presence of the targets was evaluated for each of the 13 target odor pairs. The blue lines show the performance of the optimal linear decoder (OLE). (B) Performance vs absolute number of training samples for α=0.1 and C = 10^6. For each target the average performance when trained on a small, random subset of samples is shown; each curve is averaged over 20 random training sets. The inset shows the asymptotic average performance, which is reached when trained by 1,500–2,000 trials. The dashed, horizontal lines indicate 90% of the asymptotic performance, and the corresponding vertical lines mark the necessary training samples to achieve this performance. (C) Top: Average performance curves for OLE per target pair trained only on 80% of the single odor stimuli (yellow, OLE1), and on 80% of all the data, (blue, OLE14), with 0.1 noise and linear mixing based on 72 glomeruli. Bottom: same as top for non-linear mixing. (D) Average performance curves for OLE per target pair when trained only on 80% of the single odor stimuli (yellow, OLE1), and on 80% of all the data, (blue, OLE14) based on 421 glomeruli. The red curve shows the average performance ± s.e.m. of 5 mice trained on single odors and subsequently tested on mixtures of 1,3,8 and 14 odors (Mice1).

The analysis above indicates that a linear decoder with random readout weights is unlikely to perform well, but it can learn to generalize from a rather small number of random mixture stimuli to thousands of other mixtures. Due to nonlinear mixing, this ability to generalize beyond trained samples strongly depends on the quality of training samples. A particularly meager training set is given by samples of just single odor components. Therefore, we trained classifiers only on single odors and tested them on arbitrary mixture stimuli. Without saturation the OLE performance for mixtures with more than one component degrades only mildly, but is substantially worse than an OLE trained with a training set containing all component numbers (Fig. 4C, top). With saturation, the performance of the OLE degrades strongly with increasing number of odors, and already for mixtures of two components has a performance of below 70% (Fig. 4C, bottom). This inability to generalize is obviously stronger when fewer glomeruli are used for decoding. When we pool over all measured glomeruli (6 OBs, 421 dorsal glomeruli), the performance of the OLE decays more gracefully with the number of components (Fig. 4D). Thus, for the linear readout, the mixture model, as well as the training stimuli, strongly affects the ability of the decoder to detect target odors.

Discriminative learning vs. generative modeling

An important idea that is complementary to the purely discriminative learning considered above, is that neural representations are also formed in a task-independent, unsupervised way by learning a generative model of the sensory input (Mumford, 1994, Hinton and Sejnowski, 1999). The crucial advantage of generative modeling is the ability to learn tasks with much smaller number of task-specific training samples by exploiting general knowledge about the nature of the world. For our demixing task specifically this would mean that the glomerular representation (Fig. 1B) is first demixed into a representation akin to the one in Fig. 1A. Ideally this would allow the animal to perfectly learn the task from samples of just single odor components without the drop in performance observed for purely discriminative learning (cf. Fig. 4C–D).

To test the ability of mice to generalize beyond single odor components, we trained 5 mice for several sessions on the single odor task until they reached > 80% performance. During test sessions, we mainly presented single odors, but in 10% of the trials, we also presented mixture stimuli with 3, 8 and 14 components (equally distributed). Mice continued to perform well for single odorant trials, and for (novel) mixtures of three. For mixtures with more components performance dropped substantially (Fig. 4D), consistent with OLE predictions. This pattern was observed in all mice individually (Fig. S5C; Supplementary Table 1).

Our earlier experiments demonstrated that mice can learn this task when exposed to training samples with more complex odor components (Rokni et al., 2014). This suggests that due to the nonlinear nature of odor mixture encoding, both mice and OLEs have difficulty generalizing much beyond the trained complexity, and rather settle on decision boundaries that are good enough to solve the task. It further indicates that mice cannot significantly benefit from a generative model by which they could task-independently decompose odor mixtures into their individual odor components.

Capacity of linear readout

Our analysis demonstrated that the presence of a single odor in mixtures of up to 14 odors is explicitly available for a linear decoder from the rich repertoire of receptors. What is the capacity limit for such a decoder? We can estimate this by using surrogate odor representations based on the representations of the 16 odors we measured. Surrogate odors were generated by drawing with replacement from the activations of the individual glomeruli by the 16 odors. We then calculated how well a linear decoder can report the presence of a pair of target odors in mixtures of up to 128 odors. Performance of classifiers was dependent on the target odor but all classifiers performed above chance level even for 128 odors, and above 80% for mixtures of around 32 odors (Fig. 5A). This is remarkable since the classifier only uses 72 glomeruli out of a few thousand glomeruli in mice. We repeated the same analysis using pooled glomerular data from 6 olfactory bulbs (421 glomeruli). With this large number of (potentially redundant) inputs, the performance of classifiers was above 70% for mixtures of up to 128 odors (Fig. 5B). This analysis suggests that the capacity of mice to detect target odorants from mixtures may be well above what was tested.
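One way to read the surrogate-odor construction above is that each glomerulus of a surrogate odor receives the response of that same glomerulus to one of the 16 measured odors, chosen at random with replacement. The sketch below implements this reading; the independent draw per glomerulus, the function name `make_surrogate_odors`, and the placeholder matrix `O` are assumptions for illustration.

```python
import numpy as np

def make_surrogate_odors(O, n_new, rng):
    """Build surrogate odor patterns from measured ones.

    O has shape (n_odors, n_glom). For every surrogate odor and every
    glomerulus, the response of that glomerulus to a randomly chosen
    measured odor is copied (sampling with replacement).
    """
    n_odors, n_glom = O.shape
    idx = rng.integers(0, n_odors, size=(n_new, n_glom))
    return O[idx, np.arange(n_glom)]

rng = np.random.default_rng(0)
O = rng.exponential(1.0, size=(16, 72))      # placeholder for measured patterns
surrogates = make_surrogate_odors(O, 112, rng)
all_odors = np.vstack([O, surrogates])       # 16 measured + 112 surrogate = 128
print(all_odors.shape)
```

This keeps the response distribution of each glomerulus intact while breaking any structure across glomeruli, so the surrogate set is only as realistic as that independence assumption.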

Figure 5. Capacity of an optimal linear decoder in cocktail party tasks with increasing odor numbers.

(A) Performance of the optimal linear decoder with linear and nonlinear mixing and noise α = 0.1 as a function of the number of odors. Additional odor patterns beyond 16 are generated by sampling from the responses of the 72 glomeruli to the 16 odors as described in the main text. Each faded line represents such a randomly generated odor data set and the 13 solid lines are the averages of 4 such lines with the same target odors. The red dots highlight the experimental performance of mice. (B) Same as in panel A, but calculated by pooling across imaging experiments to form odor patterns with 421 glomeruli.

Discussion

An important task of the olfactory system is to identify odors of interest that are embedded in background mixtures. We formulated this task as a classification problem and asked whether a simple, biologically-plausible classifier can solve this task in the face of noise and non-linear integration of odorants at the level of olfactory receptors. To tackle this question we first determined an encoding model for mixture responses of olfactory sensory neurons as the input for the classifiers. The model is based on data from calcium imaging and generates mixture representations that are saturating sums of single odor representations with trial-to-trial variability. We find that olfactory receptors have a substantial capacity to transmit information about the composition of odor mixtures despite saturation and noise, which is explicitly available for a linear, multiglomerular readout. In recent years, object recognition has been identified as a core challenge to understand the visual system (DiCarlo et al., 2012). For (visual) object recognition, simple linear readouts are also sufficient to match behavioral performance (DiCarlo et al., 2012, Majaj et al., 2015). However, these linear readouts cannot be applied directly to photoreceptors, but rather to the neuronal activity of higher visual areas, after several nonlinear transformation steps, which are generated from multiple feedforward steps of processing (DiCarlo et al., 2012). Our finding that the identity of odors in mixtures can be directly decoded by linear readouts from glomeruli highlights a fundamental contrast between the olfactory and visual systems which is also mirrored anatomically: olfaction has substantially fewer anatomical stages than vision (Gire et al., 2013).

Mixture Model

The general question we have posed in this study is whether glomerular responses to mixtures that include target odors and those that exclude target odors are separable. Due to the large number of possible stimuli in the behavioral task (> 10,000 possible mixtures), we estimated a statistical model of mixture responses based on measurements made from a sample of mixtures. We found that the relationship between glomerular responses to mixtures and the responses to mixture components can be described with a saturating nonlinearity. For the odor stimuli we used in this study, suppressive or supralinear responses were not significantly above response variability (Rospars et al. 2008). We also systematically measured the trial-to-trial variability in glomerular responses. We found two forms of response variability: one that is correlated across glomeruli and can probably be attributed to variation in the level of anesthesia and breathing, and another that was uncorrelated across glomeruli and may represent trial-to-trial variability in individual input channels. The magnitude of the uncorrelated noise was, on average, 10% of response amplitude. In anesthetized rats, the CV of single OSN firing was found to be on the order of 1 (Duchamp-Viret et al., 2005). Single glomeruli receive a few thousand sensory axons although this number can vary significantly depending on receptor gene (Bressel et al., 2016). If OSNs were statistically independent of each other, the glomerular CV is expected to be on the order of 1–2%. The higher estimate in our study may indicate that sensory neuron responses co-vary and are not truly independent, but may also reflect non-biological sources of variability in our measurements.
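The 1–2% figure follows from a standard averaging argument, sketched here under the assumption that a glomerular signal is the sum of $N$ independent OSN responses with the same mean and the same coefficient of variation (the values of $N$ below are illustrative stand-ins for "a few thousand"):

$$\mathrm{CV}_{\mathrm{glom}} \approx \frac{\mathrm{CV}_{\mathrm{OSN}}}{\sqrt{N}}, \qquad \mathrm{CV}_{\mathrm{OSN}} \approx 1: \quad N = 2{,}500 \Rightarrow \mathrm{CV}_{\mathrm{glom}} \approx 2\%, \qquad N = 10{,}000 \Rightarrow \mathrm{CV}_{\mathrm{glom}} \approx 1\%.$$

Positive correlations between OSNs would slow this $1/\sqrt{N}$ decrease, which is one way a measured value near 10% could arise.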

Single glomeruli are not enough

In some cases, individual odorant receptors are thought to directly signal the presence of individual odors in a “labeled line” manner (Li & Liberles, 2015). These are thought to be involved in detecting specific odors that are important for innate behavior (Li & Liberles, 2015, Stowers et al., 2013). Here we find that individual glomeruli are not sufficient to report the presence or absence of individual odors. Although it is possible that our limited sample of dorsal glomeruli misses more specialist glomeruli with highly selective responses, it seems implausible that there will be such glomeruli for a vast array of odors. Indeed, experimental evidence strongly supports the idea that most odorant receptors in the main olfactory system are broadly tuned (Saito et al. 2009). Although individual glomeruli do not carry enough information to separate stimuli that contain a target odor from those that do not, we could reliably make this distinction by integrating information across different glomerular channels.

No preprocessing required, a linear readout is sufficient

Our decoders were based directly on the nonlinear, noisy glomerular inputs. Other than removing co-fluctuations across glomeruli, no intermediate stages of preprocessing such as normalization or decorrelation were necessary. Our analysis suggests that correlated trial-to-trial variability probably reflects slow changes due to anesthesia (Ecker et al., 2014) and breathing. Breathing-induced covariation could be easily removed by lateral interactions, which is why we neglected it here (Olsen et al., 2010, Blauvelt et al., 2013, Friedrich and Wiechert, 2014). Our measurements as well as those of others (Duchamp-Viret et al., 2005) indicate that noise levels at the glomeruli and overlap of glomerular patterns are not high enough to require an additional processing stage. The optimal linear readout relies on weights that reflect the pattern of glomerular activation by target odors and are orthogonal to the pattern for distractor odors. This mechanism inherently removes overlap of glomerular patterns. Normalization and decorrelation may become more important for large amplitude stimuli that may saturate neuronal elements more strongly or for suboptimal readouts (like template matching). Decorrelation of different types has been proposed in the olfactory bulb, most notably in the context of reducing overlap among patterns of activity for different odors (Friedrich and Wiechert, 2014, Gschwend et al., 2015). Such pattern decorrelation or separation is thought to help downstream readout when noise limits separability (Friedrich and Wiechert, 2014). In the future it will be interesting to look at the role and amount of noise correlations in glomeruli of awake animals performing this figure-ground segregation task.

Weights can be sparse but not random

Although our intent was not to make specific analogies with neural circuits in the mouse brain, it is tempting to compare the decoder readout weights to the connections from the OB to the piriform cortex (PC). Since PC neurons receive a sample (possibly random) of glomerular input (Giessel and Datta, 2015), we also asked whether imposing sparseness constraints on the decoder affects performance. Remarkably, less than 20 glomeruli were sufficient to achieve performance that matches mice. It is likely that even fewer glomeruli may suffice if we were able to choose from the entire complement of mouse glomeruli. We found that while sparse connectivity was sufficient for robust performance after adequate training, classifiers built with random connectivity based on the same statistics as the optimal linear readouts without training were typically not successful. The problem remains linearly separable within the higher dimensional PC representation, but crucially this recoding step is not necessary for a linear readout to be successful. Furthermore, our analysis of random readout suggests that even in a large PC population there might only be a few neurons that are selectively tuned to the target odors, due to the large number of mixtures. To the extent we can make an analogy to mouse olfactory circuits, our findings suggest that any stochastic connectivity from the OB to PC may have to be supplemented with synaptic modification in the inputs, associative fibers or outputs of the PC to allow odor mixture analysis. Whether such learning occurs associatively over an extended period as animals experience odor environments, or whether it requires more specific reinforcement is an interesting question for future experiments. 
Alternatively, learning could happen via granule cells that dynamically gate the output of appropriate glomerular channels via mitral cells (Koulakov and Rinberg, 2011, Markopoulos et al., 2012), which would predict the existence of target-odor specific feedback modulation of mitral cells.

Although the optimal linear readouts are sufficient to match the performance of mice given the encoding model, we also showed that SVMs, which can approximate arbitrarily complex decision boundaries, outperform those simple decoders and suffer from decaying performance for increasing numbers of distractors. This performance decay suggests that for both mice and machines the ultimate bottleneck in this task is the overlapping glomerular representation. While SVMs were not required to mimic mouse performance, it is conceivable that the myriad parallel pathways from the OB could implement decoding schemes similar to an SVM and that over time plasticity rules allow for the learning of intricate decision boundaries; however, for this segregation task nonlinear decision boundaries are not necessary.

Robustness and training dependence

Other than the overall performance, the difficulty of classification can also be assessed by the robustness of classification to modifications in readout weights and by the speed at which a classifier converges on the proper weights. Even linear classifiers can learn the task relatively easily in dozens of trials, despite response variability and saturation. Although a direct comparison of learning rates of classifiers and mice may not be appropriate, it is worth noting that mice take hundreds of trials to learn the task under our conditions. We also find that once a classifier has been trained, small perturbations of the weights did not strongly affect performance. This indicates that the classifier has robust performance in a local region of the weight space, which nevertheless cannot be readily reached by random assignment. Our analysis of random readouts indicates that there is a 1 in 10⁵ chance of having a performance of above 75%. Extrapolating this insight to neural architecture of the mouse olfactory system, it might mean that any projections that start out randomly (e.g., the OB to cortex projections) will need to be modified with learning, but that there might be several ‘template’ neurons whose high correlation with rewards will facilitate learning.

We showed that optimal linear readouts based on glomerular activity are highly sensitive to the quality of the training set. When trained on single odors, they perform poorly for mixtures. An ideal observer that knows how individual odors mix should perform accurately for mixtures even when trained on single odors. To be specific, assuming an explicit ‘atomic’ representation of the single odors in the olfactory cortex, and that a decoder has access to this population, the readout should generalize well to mixtures when solely trained on the atoms. This reflects the main advantage of generative modeling, which has therefore been hypothesized to play an important role for shaping sensory representations (Barlow, 1997, Mumford, 1994, Hinton and Sejnowski, 1999). We tested this prediction in mice and found that while their performance for single odors remains stable and their performance for mixtures of three is comparable, mice generalize poorly to mixtures of eight and fourteen. This suggests that mice learn the segregation task by adaptively adjusting decision boundaries directly based on incoming sensory information rather than highly processed, demixed representations. Due to the highly nonlinear representation of mixtures, such decision boundaries will generalize poorly when only trained with single odors. An alternative explanation for the mouse behavior (i.e., failure to detect target odor in mixtures when trained only on single odorants) is that they learned a different task rule - for instance, that two of the 16 odors are rewarded only when presented alone. Due to the novelty and low relative frequency of the mixture stimuli they may have ignored those stimuli and thus failed to generalize. Such an interpretation however is at odds with the good performance for mixtures of three odors and the fact that their errors consisted of both false alarms and misses. The training data from our previous study also argue against this alternate interpretation. 
The mice in that work were trained by picking odor mixtures with increasing complexities - from distributions ranging from few distractors to uniform distractor distributions. Whenever a mouse reached 80% performance, the distribution was changed (in three steps until the uniform distribution was presented in the test phase). There the mice required more than 1,000 trials to reach this level, and usually also encountered hundreds of trials per condition. This experience dependency dwarfs the small number of trials that the linear decoder needs to learn the task. However, carefully designed future experiments are necessary to address the question of how much training is required for learning in mice. Arguably, learning the task relies at least on two independent components. First, mice have to learn the task structure (go/no go, rewards, timing, etc.) and second mice have to learn which target odors are rewarded. By using the same task structure for different odor sets, one could greatly reduce the learning phase for the structure part and better estimate the number of trials that mice need for learning the task.

Experiments similar to those in the generalization task we studied in this paper (Fig. 4D) have been done in humans. Jinks and Laing trained humans initially for 3 days on single chemicals and then tested them on mixtures (Jinks and Laing, 1999). The subjects detected single chemicals with performance of around 70% and their performance for mixtures of 12 dropped to chance level. Our modeling, as well as our own experiments, suggests that this decay in performance is a consequence of both the training data and the synthetic encoding of stimuli, which makes generalization difficult. In vision, the intriguing characteristics of training schedules have long been noted – i.e. a minimal number of training examples is needed per session for learning and an interleaved stimulus presentation design can hinder learning (Aberg and Herzog, 2012). Consequently, we believe that the study of the impact of training schedules could also be fruitful for understanding olfactory circuits in mice.

Capacity

Overlapping representations of odors allow the olfactory system to encode more odors than the number of receptor types (Hopfield, 1999). However, broad tuning of receptors may decrease the discriminability of stimuli when noise and saturating glomeruli are considered. An earlier analysis of the coding capacity of spatial patterns of glomerular activation (Koulakov et al., 2007) pointed out that even with all-or-nothing glomerular activity, it is possible to simultaneously represent a dozen mixed odorants. Our analysis considers graded glomerular responses and empirical measurements of responses to specific odors and suggests that the capacity of the system is substantially greater. Whether this estimate is comparable to the capacity of mice remains unknown. Yet, understanding the capacity to detect and discriminate odors (Bushdid et al., 2014, Meister, 2015, Weiss et al., 2012) will provide crucial insights for understanding the olfactory system.

Conclusions

In this work, we have linked experimentally-measured glomerular responses to behavioral data in an odor demixing task. Using realistic assumptions about neuronal noise, and nonlinear interactions for mixtures, we show that the information about odor-mixture components at the level of olfactory receptors is already linearly separable and does not require any preprocessing or inference algorithms that rely on prior information and feedback circuits.

Methods

All computational analyses are based on published behavioral and imaging data (Rokni et al., 2014). Further imaging experiments to estimate trial-to-trial variability and mixing properties in glomeruli, as well as additional behavioral experiments, have been performed (described below). All procedures were performed using approved protocols in accordance with institutional (Harvard University Institutional Animal Care and Use Committee) and national guidelines. Additional details for all the methods can be found in the Supplemental Experimental Procedures.

Glomerular Imaging

Wide-field imaging was performed on adult OMP-GCaMP3 mice (Isogai et al., 2011) as described previously (Rokni et al. 2014) using a 10X objective (Olympus, NA 0.3). Blue light from an LED (M470L3, Thorlabs) was used for excitation, and the emitted light was filtered (500–550nmMDFGFP, Thorlabs) and collected with a CMOS camera (DMK 23U274, The Imaging Source GmbH). Odors were delivered using a homemade automated olfactometer (Rokni et al., 2014). All odorants were diluted to 30% v/v in diethyl phthalate and then further diluted 16 fold in air. The 16 odors and their mixtures were presented in a random order for 2 seconds (inter-stimulus interval of 45 seconds).

Imaging data analysis

All odor responses were first converted into dF/F images, with dF being the mean fluorescence in each pixel during odor presentation minus the mean fluorescence before odor presentation. Putative glomeruli were identified from averaged dF/F images (across repetitions of the same odor; see Supplemental Experimental Procedures), and the time-varying dF/F trace was extracted for each glomerulus-odor pair. Response magnitude was quantified by integrating these dF/F traces.

Variability of single odor responses was measured as the coefficient of variation (CV) of glomerular responses. To remove correlated variability, we subtracted from the individual trials $O_{jt}$ the best linear fit, for each odor, between the glomerular responses at each trial $O_{jt}$ and the average response for that odor $O_j$. We then calculated the remaining (private) variability for each glomerulus, i.e.

$$\eta_t = O_{jt} - (a_t O_j + b_t) \qquad (4)$$

where $\eta_t$ is the deviation of the glomerular response from the linear fit, $O_{jt}$ is the response to the $j$th odor on trial $t$, $O_j$ is the trial-averaged response to odor $j$, and $a_t$ and $b_t$ are the slope and intercept of the best linear fit between $O_{jt}$ and $O_j$. For each glomerulus $i$ we then calculated the standard deviation of $\eta_i$, and plotted it against its mean response amplitude $O_{ij}$. The non-correlated CV for each experiment was measured as the slope of the linear relationship between the non-correlated standard deviation and response amplitude (Fig. 1H).
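The procedure around Eqn. (4) can be sketched for a single odor as follows. The synthetic responses (a shared gain fluctuation per trial plus private noise with a CV of 0.1), the number of trials, and the function name `private_noise_cv` are assumptions chosen for illustration; the measured data are not reproduced here.

```python
import numpy as np

def private_noise_cv(X):
    """Remove correlated variability and estimate the private CV.

    X has shape (n_trials, n_glom): responses of all glomeruli to one odor.
    Returns the residuals eta (same shape) and the slope of the residual
    standard deviation against the mean response amplitude.
    """
    mean_resp = X.mean(axis=0)                      # trial-averaged response
    eta = np.empty_like(X)
    for t, trial in enumerate(X):
        a_t, b_t = np.polyfit(mean_resp, trial, 1)  # slope and intercept
        eta[t] = trial - (a_t * mean_resp + b_t)    # residual for trial t
    sd = eta.std(axis=0)                            # private SD per glomerulus
    cv = np.polyfit(mean_resp, sd, 1)[0]            # slope of SD vs amplitude
    return eta, cv

rng = np.random.default_rng(0)
n_trials, n_glom, alpha = 200, 72, 0.1
mean_amp = rng.uniform(0.2, 2.0, size=n_glom)            # hypothetical amplitudes
gain = rng.normal(1.0, 0.2, size=(n_trials, 1))          # shared fluctuation
private = 1.0 + alpha * rng.standard_normal((n_trials, n_glom))
X = gain * mean_amp * private

eta, cv = private_noise_cv(X)
print(f"estimated private CV: {cv:.3f}")
```

The per-trial fit absorbs the shared gain, so the slope reflects only the variability that is independent across glomeruli.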

Summation of mixture components was analyzed by comparing mixture responses to the linear sum of the mean responses to mixture components. The data were fitted by a sigmoidal function (for each glomerulus and mixture):

$$\sigma(R_0) = \frac{2A}{1 + e^{-R_0 s}} - A \qquad (5)$$

where σ(R0) is the response to a mixture, A is the saturation level for the glomerulus, R0 is the linear sum of mixture component responses, and s is a free parameter.

Mixture models for decoding analysis

For the decoding analysis, we considered the following encoding model, which is based on our measurements. Given the average glomerular response vectors $O_j$ and the trial structure $c_j(t)$, i.e. a binary variable denoting the presence of odor $j$ in trial $t$, the (linear) glomerular response pattern to a mixture in trial $t$ is given by Eqn. (2) in the main text. This mean response is subject to trial-to-trial variability. The linear response $R_0$ is thus the sum of the single component responses $O_j$, each with an added noise term $\eta_j$ that is normally distributed with a mean of zero and a variance of $\alpha^2 O_j^2$, as defined by Eqn. (2). The factor α controls the coefficient of variation.

To account for saturation, we model the nonlinear response $\sigma(R_0)$ to a mixture in trial $t$ as a sigmoid function of the linear response $R_0$:

$$\sigma(R_0(t)) = \frac{2A}{1 + e^{-R_0(t)\, s}} - A \qquad (6)$$

where $A$ is the saturated response amplitude and $s$ is a free parameter that governs the slope of this sigmoidal function. For each glomerulus $i$ we assumed $A$ to be equal to the maximum of the linear sum of component responses $R_0^{(i)}(t)$ and set $s = 10/A$. This makes $\sigma(R_0^{(i)}(t))$ almost fully saturated for inputs larger than $A/2$, and the output level is around 50% of the saturation level for input values around $A/10$.
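The encoding model described above (noisy linear sum followed by the saturating sigmoid with s = 10/A) can be sketched as below. The response patterns, the trial structure, and the choice to compute A from the noise-free linear sums are assumptions for illustration; the function names are hypothetical.

```python
import numpy as np

def saturating_response(R0, A):
    """Sigmoid of the linear response: 2A / (1 + exp(-R0 * s)) - A, s = 10 / A."""
    s = 10.0 / A
    return 2.0 * A / (1.0 + np.exp(-R0 * s)) - A

def linear_response(O, c_t, alpha, rng):
    """Sum of the component responses present in one trial.

    Each component response O_j carries Gaussian noise with zero mean and
    standard deviation alpha * O_j. c_t is a boolean vector over odors.
    """
    noisy = O * (1.0 + alpha * rng.standard_normal(O.shape))
    return noisy[c_t].sum(axis=0)

rng = np.random.default_rng(0)
O = rng.exponential(1.0, size=(16, 72))        # hypothetical odor patterns
c = rng.random((200, 16)) < 0.3                # hypothetical trial structure
A = (c.astype(float) @ O).max(axis=0)          # per-glomerulus saturation level
R0 = np.stack([linear_response(O, c_t, 0.1, rng) for c_t in c])
R = saturating_response(R0, A)

# Numerical check of the two operating points quoted in the text (A = 1).
print(saturating_response(0.5, 1.0))   # about 0.987: almost saturated at A/2
print(saturating_response(0.1, 1.0))   # about 0.462: roughly half at A/10
```

With this parameterization the sigmoid equals A·tanh(5·R0/A), so the response is zero for zero input and approaches A for large inputs.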

Decoding analysis

The glomerular responses in a given trial t are drawn from the probability distribution for response vectors R(t), which are generated using the mixture model. For any given trial t, the target was either present or not and we denote this ideal behavior by the binary variable r(t). Any readout function f(R,ϴ) will use R(t) as its input, with parameters ϴ that need to be optimized. This optimization can be framed using the 0–1 loss function δ(r,f(R,ϴ)) and the empirical risk functional for a subset of trials I with cardinality |I| (Schoelkopf and Smola, 2002):

$$A_{\mathrm{emp}}(\theta) = \frac{1}{|I|} \sum_{t \in I} \Bigl(1 - \delta\bigl(r(t), f(R(t), \theta)\bigr)\Bigr) \qquad (7)$$

Since the Kronecker-δ is one when both entries are the same, and zero otherwise, this empirical risk functional measures the fraction of incorrect classifications by the readout f(R,ϴ). The goal is then to find parameters ϴ0 for the readout f(R,ϴ0) that minimize the empirical risk on the training set I. Since the 0–1 loss is not differentiable and its optimization is NP-hard, it is common to minimize surrogate loss functions that are convex and thus facilitate the determination of a unique optimum.
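As a small illustration of these definitions, the 0–1 empirical risk is the fraction of trials on which the readout and the ideal behavior disagree (so one minus the risk is the fraction correct), and a convex surrogate such as the base-2 logistic loss upper-bounds the 0–1 loss on every trial. The function names and the toy numbers below are hypothetical.

```python
import numpy as np

def empirical_risk(r, y):
    """Fraction of trials where readout y disagrees with ideal behavior r."""
    r = np.asarray(r, dtype=bool)
    y = np.asarray(y, dtype=bool)
    return float(np.mean(r != y))

def logistic_surrogate(r, scores):
    """Per-trial convex surrogate: log2(1 + exp(-margin)).

    The margin is positive when the sign of the score agrees with the label.
    """
    signs = 2.0 * np.asarray(r, dtype=float) - 1.0
    margin = signs * np.asarray(scores, dtype=float)
    return np.logaddexp(0.0, -margin) / np.log(2.0)

r = np.array([1, 0, 1, 1])
scores = np.array([2.0, 0.5, 0.3, -1.0])   # readout before thresholding at 0
y = scores >= 0
zero_one = (y != r.astype(bool)).astype(float)

print(empirical_risk(r, y))                # 0.5: two of four trials are wrong
print(logistic_surrogate(r, scores))
```

Minimizing the surrogate is not the same as minimizing the 0–1 risk, but it yields a convex problem, which is the trade-off the text refers to.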

To compare different readouts we compare their performance on the test set, i.e. the other trials that have not been used for training the readout. All read-outs are cross-validated using the repeated random subsample technique; unless otherwise specified we used 80% training data, 20% test data and 20 repetitions with randomly sampled training and test data sets. We report the average test performance, defined as the mean number of correct classifications of r(t) by f(R(t),ϴ0) averaged over the test set.
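The repeated random subsampling scheme (80% training, 20% test, 20 repetitions) can be sketched as follows, using a least-squares linear readout with a threshold chosen on the training trials as the example decoder. The synthetic data, linear mixing, and all function names are assumptions for illustration, not the study's code.

```python
import numpy as np

def fit_ole(R, r):
    """Least-squares weights plus the threshold that is best on training data."""
    w, *_ = np.linalg.lstsq(R, r.astype(float), rcond=None)
    scores = R @ w
    ordered = np.sort(scores)
    cuts = (ordered[:-1] + ordered[1:]) / 2.0   # midpoints between scores
    acc = [np.mean((scores >= cut) == r) for cut in cuts]
    theta = -cuts[int(np.argmax(acc))]          # readout is H(w.R + theta)
    return w, theta

def predict_ole(R, params):
    w, theta = params
    return (R @ w + theta) >= 0

def repeated_random_subsampling(R, r, fit, predict,
                                n_rep=20, train_frac=0.8, seed=0):
    """Average test accuracy over random train/test partitions."""
    rng = np.random.default_rng(seed)
    n = len(r)
    n_train = int(round(train_frac * n))
    accs = []
    for _ in range(n_rep):
        perm = rng.permutation(n)
        train, test = perm[:n_train], perm[n_train:]
        params = fit(R[train], r[train])
        accs.append(float(np.mean(predict(R[test], params) == r[test])))
    return float(np.mean(accs)), accs

# Hypothetical data: 16 odors, 72 glomeruli, linear mixing, noise CV of 0.1.
rng = np.random.default_rng(1)
O = rng.exponential(1.0, size=(16, 72))
c = rng.random((500, 16)) < 0.3
r = c[:, 0] | c[:, 1]
noise = 1.0 + 0.1 * rng.standard_normal((500, 16, 72))
R = (c[:, :, None] * O[None] * noise).sum(axis=1)

mean_acc, accs = repeated_random_subsampling(R, r, fit_ole, predict_ole)
print(f"mean test accuracy over {len(accs)} splits: {mean_acc:.2f}")
```

Both the weights and the threshold are fitted only on the training partition of each repetition, so the reported value is a genuine out-of-sample estimate.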

For the readout function f, we considered the following four options which are commonly optimized based on different surrogate loss functions (details in Supplemental Experimental Procedures):

  1. Single glomerulus readout: We used standard ROC analysis to obtain an upper bound for single glomerulus performance (Fig. S4C–D).

  2. Linear decoder: The linear decoder depends on the following parameters: the readout weights $w$ and the decision threshold ϴ, and has output $y = H(w^{T} \cdot R(t) + ϴ)$ (Eqn. (3)). The optimal linear weights (OLE weights) are the ones that minimize the norm
    $$\lVert r(t) - w^{T} R(t) \rVert^{2} \qquad (8)$$
    All reported performances are the averages calculated by cross validation, which was used to determine the optimal threshold and weights. For each resampling, we calculated the OLE weights as described above and computed the optimal decision threshold ϴ as the one that maximizes the training performance.
  3. Support Vector Machines: We studied the performance of support vector machines (SVM) with radial basis functions as kernels.

  4. Logistic regression: We used regularized logistic regression to decode the presence of a target odor. For binary classification, the likelihood $\mathcal{L}_t$ for the logistic regression is given by:
    $$\mathcal{L}_t = \rho\bigl(w^{T} R(t)\bigr)^{r(t)} \Bigl(1 - \rho\bigl(w^{T} R(t)\bigr)\Bigr)^{1 - r(t)} \qquad (9)$$
    with logistic function $\rho(s) = 1/(1 + \exp(-s))$. Minimizing misclassification can be formulated as minimizing the negative logarithm of the likelihood. Therefore, the error functional for the regularized logistic regression is given by:
    $$\sum_i |w_i| - C \sum_{t \in I} \log \mathcal{L}_t \qquad (10)$$
    We used the L1-norm to bias towards sparse read-outs, and the constant $C$ varies the degree of sparseness as it balances the trade-off between classification errors and the summed absolute readout weights. We varied the regularization constant from $10^{-2}$ to $10^{6}$. We employed the Scikit-learn toolbox in Python to perform the SVM and logistic regression analysis (Pedregosa et al., 2011).
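The effect of the regularization constant C on sparseness can be sketched with scikit-learn's `LogisticRegression`, whose L1-penalized objective has the same form as Eqn. (10). The data below are synthetic, and the solver choice (`liblinear`) and the three values of C are assumptions for illustration; argument names may differ between scikit-learn versions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: 16 odors, 72 glomeruli, linear mixing, noise CV of 0.1.
rng = np.random.default_rng(0)
O = rng.exponential(1.0, size=(16, 72))
c = rng.random((400, 16)) < 0.3
r = (c[:, 0] | c[:, 1]).astype(int)
noise = 1.0 + 0.1 * rng.standard_normal((400, 16, 72))
R = (c[:, :, None] * O[None] * noise).sum(axis=1)

# Small C penalizes the summed absolute weights strongly (sparse readout);
# large C emphasizes classification errors (dense readout).
n_nonzero = {}
for C in (1e-2, 1.0, 1e2):
    clf = LogisticRegression(penalty="l1", C=C, solver="liblinear", max_iter=1000)
    clf.fit(R, r)
    n_nonzero[C] = int(np.count_nonzero(clf.coef_))
    print(f"C = {C:g}: {n_nonzero[C]} of {R.shape[1]} glomeruli used, "
          f"training accuracy {clf.score(R, r):.2f}")
```

Sweeping C in this way traces out the trade-off between the number of glomeruli read out and the achievable performance.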

Details for the weight perturbation analysis, random readout analysis and single-odor training simulations can be found in the Supplemental Experimental Procedures.

Behavioral testing

Behavioral training was performed as described earlier (Rokni et al., 2014). In brief, 5 C57BL/6 mice (Charles River) were trained on the task using single component odors. In each trial a single odorant was presented for 2 seconds (inter-trial interval of 10 seconds). One of 2 targets was presented in half the trials (go). Mice had to lick for go trials and refrain from licking for no go trials within the 2 second period. A correct “go” trial was rewarded with a water drop, correct rejections were not rewarded, and incorrect trials were punished by a 5 second timeout. In testing sessions, which began after mice performed above 80% correct in a session, mixtures of 3, 8, or 14 components (each mixture size equally probable) were presented every 10th trial.

Acknowledgments

We thank Philipp Berens, Mackenzie Amoroso and Alexandra Ding for helpful feedback. This work was supported by Harvard University, by DFG grant MA 6176/1-1 (AM), and by Marie Curie Fellowship PIOF-GA-2013-622943 (AM). MB has received financial support from the Bernstein Center for Computational Neuroscience (FKZ 01GQ1002) and the German Excellency Initiative through the Centre for Integrative Neuroscience Tübingen (EXC307). Research in VNM’s lab is supported by grants from the NIH (DC011291, DC014453). Computational resources were provided and maintained by Harvard FAS Research Computing.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

AUTHOR CONTRIBUTIONS

A.M., D.R., M.B. and V.N.M. designed research; A.M. performed simulations and model analysis with input from M.B.; D.R. performed imaging and behavioral experiments; V.K. contributed the data on glomerular maps. A.M., D.R., M.B. and V.N.M. wrote the paper with input from all authors.

References

  1. Aberg KC, Herzog MH. About similar characteristics of visual perceptual learning and LTP. Vision Research. 2012;61:100–106. doi: 10.1016/j.visres.2011.12.013.
  2. Barlow HB. The knowledge used in vision and where it comes from. Philosophical Transactions of the Royal Society of London B: Biological Sciences. 1997;352(1358):1141–1147. doi: 10.1098/rstb.1997.0097.
  3. Bell AJ, Sejnowski TJ. An Information-Maximization Approach to Blind Separation and Blind Deconvolution. Neural Computation. 1995;7(6):1129–1159. doi: 10.1162/neco.1995.7.6.1129.
  4. Berens P, Ecker AS, Cotton RJ, Ma WJ, Bethge M, Tolias AS. A Fast and Simple Population Code for Orientation in Primate V1. The Journal of Neuroscience. 2012;32(31):10618–10626. doi: 10.1523/JNEUROSCI.1335-12.2012.
  5. Blauvelt DG, Sato TF, Wienisch M, Murthy VN. Distinct spatiotemporal activity in principal neurons of the mouse olfactory bulb in anesthetized and awake states. Frontiers in Neural Circuits. 2013;7. doi: 10.3389/fncir.2013.00046.
  6. Bressel OC, Khan M, Mombaerts P. Linear correlation between the number of olfactory sensory neurons expressing a given mouse odorant receptor gene and the total volume of the corresponding glomeruli in the olfactory bulb. Journal of Comparative Neurology. 2016;524(1):199–209. doi: 10.1002/cne.23835.
  7. Brody CD, Hopfield JJ. Simple networks for spike-timing-based computation, with application to olfactory processing. Neuron. 2003;37(5):843–852. doi: 10.1016/s0896-6273(03)00120-x.
  8. Bushdid C, Magnasco MO, Vosshall LB, Keller A. Humans Can Discriminate More than 1 Trillion Olfactory Stimuli. Science. 2014;343(6177):1370–1372. doi: 10.1126/science.1249168.
  9. Cuevas Rivera D, Bitzer S, Kiebel SJ. Modelling Odor Decoding in the Antennal Lobe by Combining Sequential Firing Rate Models with Bayesian Inference. PLoS Comput Biol. 2015;11(10):e1004528. doi: 10.1371/journal.pcbi.1004528.
  10. del Castillo J, Katz B. Quantal components of the end-plate potential. The Journal of Physiology. 1954;124(3):560–573. doi: 10.1113/jphysiol.1954.sp005129.
  11. DiCarlo J, Zoccolan D, Rust N. How Does the Brain Solve Visual Object Recognition? Neuron. 2012;73(3):415–434. doi: 10.1016/j.neuron.2012.01.010.
  12. Duchamp-Viret P, Chaput MA, Duchamp A. Odor Response Properties of Rat Olfactory Receptor Neurons. Science. 1999;284(5423):2171–2174. doi: 10.1126/science.284.5423.2171.
  13. Duchamp-Viret P, Kostal L, Chaput M, Lansky P, Rospars JP. Patterns of spontaneous activity in single rat olfactory receptor neurons are different in normally breathing and tracheotomized animals. Journal of Neurobiology. 2005;65(2):97–114. doi: 10.1002/neu.20177.
  14. Ecker A, Berens P, Cotton RJ, Subramaniyan M, Denfield G, Cadwell C, Smirnakis S, Bethge M, Tolias A. State Dependence of Noise Correlations in Macaque Primary Visual Cortex. Neuron. 2014;82(1):235–248. doi: 10.1016/j.neuron.2014.02.006.
  15. Firestein S, Picco C, Menini A. The relation between stimulus and response in olfactory receptor cells of the tiger salamander. The Journal of Physiology. 1993;468(1):1–10. doi: 10.1113/jphysiol.1993.sp019756.
  16. Friedrich RW, Wiechert MT. Neuronal circuits and computation: pattern decorrelation in the olfactory bulb. FEBS Lett. 2014;588:2504–2513. doi: 10.1016/j.febslet.2014.05.055.
  17. Galan RF, Weidert M, Menzel R, Herz AVM, Galizia CG. Sensory memory for odors is encoded in spontaneous correlated activity between olfactory glomeruli. Neural Computation. 2006;18(1):10–25. doi: 10.1162/089976606774841558.
  18. Giessel AJ, Datta SR. Olfactory maps, circuits and computations. Curr Opin Neurobiol. 2014;24(1):120–132. doi: 10.1016/j.conb.2013.09.010.
  19. Gire DH, Restrepo D, Sejnowski TJ, Greer C, De Carlos JA, Lopez-Mascaraque L. Temporal processing in the olfactory system: can we see a smell? Neuron. 2013;78(3):416–432. doi: 10.1016/j.neuron.2013.04.033.
  20. Godfrey PA, Malnic B, Buck LB. The mouse olfactory receptor gene family. Proceedings of the National Academy of Sciences of the United States of America. 2004;101(7):2156–2161. doi: 10.1073/pnas.0308051100.
  21. Gottfried JA. Central mechanisms of odour object perception. Nature Reviews Neuroscience. 2010;11(9):628–641. doi: 10.1038/nrn2883.
  22. Grabska-Barwinska A, Beck J, Pouget A, Latham P. Demixing odors - fast inference in olfaction. In: Burges CJC, Bottou L, Welling M, Ghahramani Z, Weinberger KQ, editors. Advances in Neural Information Processing Systems 26. 2013. pp. 1968–1976.
  23. Gschwend O, Abraham NM, Lagier S, Begnaud F, Rodriguez I, Carleton A. Neuronal pattern separation in the olfactory bulb improves odor discrimination learning. Nature Neuroscience. 2015;18(10):1474–1482. doi: 10.1038/nn.4089.
  24. Hinton G, Sejnowski TJ. Unsupervised learning: foundations of neural computation. MIT Press; Cambridge, MA: 1999.
  25. Hiratani N, Fukai T. Mixed Signal Learning by Spike Correlation Propagation in Feedback Inhibitory Circuits. PLoS Comput Biol. 2015;11(4):e1004227. doi: 10.1371/journal.pcbi.1004227.
  26. Hopfield JJ. Olfactory computation and object perception. Proceedings of the National Academy of Sciences. 1991;88(15):6462–6466. doi: 10.1073/pnas.88.15.6462.
  27. Hopfield JJ. Odor space and olfactory processing: Collective algorithms and neural implementation. Proceedings of the National Academy of Sciences. 1999;96(22):12506–12511. doi: 10.1073/pnas.96.22.12506.
  28. Isogai Y, Si S, Pont-Lezica L, Tan T, Kapoor V, Murthy VN, Dulac C. Molecular organization of vomeronasal chemoreception. Nature. 2011;478(7368):241–245. doi: 10.1038/nature10437.
  29. Jinks A, Laing DG. A limit in the processing of components in odour mixtures. Perception. 1999;28(3):395–404. doi: 10.1068/p2898.
  30. Koulakov A, Gelperin A, Rinberg D. Olfactory Coding With All-or-Nothing Glomeruli. Journal of Neurophysiology. 2007;98(6):3134–3142. doi: 10.1152/jn.00560.2007.
  31. Koulakov AA, Rinberg D. Sparse incomplete representations: a potential role of olfactory granule cells. Neuron. 2011;72(1):124–136. doi: 10.1016/j.neuron.2011.07.031.
  32. Li Z. A model of olfactory adaptation and sensitivity enhancement in the olfactory bulb. Biol Cybern. 1990;62(4):349–61. doi: 10.1007/BF00201449.
  33. Li Q, Liberles SD. Aversion and attraction through olfaction. Curr Biol. 2015;25(3):R120–129. doi: 10.1016/j.cub.2014.11.044.
  34. Majaj NJ, Hong H, Solomon EA, DiCarlo JJ. Simple Learned Weighted Sums of Inferior Temporal Neuronal Firing Rates Accurately Predict Human Core Object Recognition Performance. The Journal of Neuroscience. 2015;35(39):13402–13418. doi: 10.1523/JNEUROSCI.5181-14.2015.
  35. Markopoulos F, Rokni D, Gire DH, Murthy VN. Functional Properties of Cortical Feedback Projections to the Olfactory Bulb. Neuron. 2012;76(6):1175–1188. doi: 10.1016/j.neuron.2012.10.028.
  36. Meister M. On the dimensionality of odor space. eLife. 2015;4:e07865. doi: 10.7554/eLife.07865.
  37. Mumford D. Neural architectures for pattern-theoretic problems. MIT Press; Cambridge, MA: 1994.
  38. Olsen SR, Bhandawat V, Wilson RI. Divisive Normalization in Olfactory Population Codes. Neuron. 2010;66(2):287–299. doi: 10.1016/j.neuron.2010.04.009.
  39. Otazu GH, Leibold C. A Corticothalamic Circuit Model for Sound Identification in Complex Scenes. PLoS ONE. 2011;6(9):e24270. doi: 10.1371/journal.pone.0024270.
  40. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E. Scikit-learn: Machine Learning in Python. J Mach Learn Res. 2011;12:2825–2830.
  41. Richard MB, Taylor SR, Greer CA. Age induced disruption of selective olfactory bulb synaptic circuits. Proceedings of the National Academy of Sciences of the United States of America. 2010;107(35):15613–15618. doi: 10.1073/pnas.1007931107.
  42. Rokni D, Hemmelder V, Kapoor V, Murthy VN. An olfactory cocktail party: figure-ground segregation of odorants in rodents. Nature Neuroscience. 2014;17(9):1225–1232. doi: 10.1038/nn.3775.
  43. Rosenblatt F. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review. 1958;65(6):386–408. doi: 10.1037/h0042519.
  44. Rospars JP, Lansky P, Chaput M, Duchamp-Viret P. Competitive and Noncompetitive Odorant Interactions in the Early Neural Coding of Odorant Mixtures. The Journal of Neuroscience. 2008;28(10):2659–2666. doi: 10.1523/JNEUROSCI.4670-07.2008.
  45. Saito H, Chi Q, Zhuang H, Matsunami H, Mainland JD. Odor coding by a Mammalian receptor repertoire. Sci Signal. 2009;2(60). doi: 10.1126/scisignal.2000016.
  46. Schoelkopf B, Smola A. Learning with kernels: support vector machines, regularization, optimization and beyond. MIT Press; Cambridge, MA: 2002.
  47. Shen K, Tootoonian S, Laurent G. Encoding of Mixtures in a Simple Olfactory System. Neuron. 2013;80(5):1246–1262. doi: 10.1016/j.neuron.2013.08.026.
  48. Stettler DD, Axel R. Representations of odor in the piriform cortex. Neuron. 2009;63(6):854–864. doi: 10.1016/j.neuron.2009.09.005.
  49. Stowers L, Cameron P, Keller JA. Ominous odors: olfactory control of instinctive fear and aggression in mice. Current Opinion in Neurobiology. 2013;23(3):339–345. doi: 10.1016/j.conb.2013.01.007.
  50. Tootoonian S, Lengyel M. A Dual Algorithm for Olfactory Computation in the Locust Brain. In: Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ, editors. Advances in Neural Information Processing Systems 27. 2014. pp. 2276–2284.
  51. Weiss T, Snitz K, Yablonka A, Khan RM, Gafsou D, Schneidman E, Sobel N. Perceptual convergence of multi-component mixtures in olfaction implies an olfactory white. Proceedings of the National Academy of Sciences. 2012;109(49):19959–19964. doi: 10.1073/pnas.1208110109.
  52. Wilson DA, Sullivan RM. Cortical processing of odor objects. Neuron. 2011;72(4):506–519. doi: 10.1016/j.neuron.2011.10.027.
