Significance
Natural odors typically consist of many molecules at different concentrations, which together determine the odor identity. This information is collectively encoded by olfactory receptors and then forwarded to the brain. However, it is unclear how the receptor activity can encode both the composition of the odor and the concentrations of its constituents. We study a simple model of the olfactory receptors from which we derive design principles for optimally communicating odor information in a given natural environment. We use these results to discuss biological olfactory systems, and we propose how they can be used to improve artificial sensor arrays.
Keywords: olfaction, sensing, natural statistics, information theory, molecular recognition
Abstract
Natural odors typically consist of many molecules at different concentrations. It is unclear how the numerous odorant molecules and their possible mixtures are discriminated by relatively few olfactory receptors. Using an information theoretic model, we show that a receptor array is optimal for this task if it achieves two possibly conflicting goals: (i) Each receptor should respond to half of all odors and (ii) the response of different receptors should be uncorrelated when averaged over odors presented with natural statistics. We use these design principles to predict statistics of the affinities between receptors and odorant molecules for a broad class of odor statistics. We also show that optimal receptor arrays can be tuned to either resolve concentrations well or distinguish mixtures reliably. Finally, we use our results to predict properties of experimentally measured receptor arrays. Our work can thus be used to better understand natural olfaction, and it also suggests ways to improve artificial sensor arrays.
Discrimination of olfactory signals occurs in a high-dimensional space of odor stimuli in which a large number of distinct molecules and their mixtures can be distinguished by a much smaller number of receptors (1–3). For example, humans have about 300 distinct olfactory receptors (4), which can sense at least 2,100 odorant molecules (5), and the real number might be much larger (1). Moreover, humans can differentiate between mixtures of up to 30 odorants (6). Such remarkable molecular discrimination is thought to use a combinatorial code (7, 8), where typical odorant molecules bind to receptors of multiple types (1, 3). Each receptor type is expressed in many cells (9), and the information from all receptors of the same type is accumulated in corresponding glomeruli in the olfactory bulb (10, 11) (see Fig. 1A). The activity of a single glomerulus is thus the total signal of the associated receptor type, so the information about the odor is encoded in the activity pattern of the glomeruli (11, 12). This activity pattern is interpreted by the brain to learn about the composition and the concentration of the inhaled odor. We here study how receptor arrays can maximize the transmitted information.
It is known (13, 14) that the input−output characteristics of sensory apparatuses of many organisms are tailored to the statistics of the organism’s natural environment to maximize information transmission. For example, in the visual circuit of the fly, the input−output relationship of neurons is matched to the cumulative distribution of the input distribution (13). Similar observations have since been made in many sensory systems (14, 15) and even in transcriptional regulation (16). In all these cases, the distinguishable outputs of the sensory system must be dedicated to equal parts of the input distribution, which is known as Laughlin’s principle (13) or histogram equalization (17). Intuitively, more of the response range is dedicated to common stimuli, at the expense of less frequent stimuli (13).
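Histogram equalization can be illustrated with a short sketch. The exponential stimulus distribution, the four output levels, and the function names are illustrative assumptions, not taken from the cited studies. The sketch only shows that an input−output curve equal to the cumulative distribution of the stimuli makes all output levels equally likely.

```python
import math
import random

def equalizing_response(x, rate=1.0):
    """Input-output curve chosen as the CDF of an exponential stimulus (assumed)."""
    return 1.0 - math.exp(-rate * x)

def output_histogram(n_samples=20000, n_levels=4, seed=0):
    """Fraction of stimuli that land in each distinguishable output level."""
    rng = random.Random(seed)
    counts = [0] * n_levels
    for _ in range(n_samples):
        x = rng.expovariate(1.0)          # stimulus drawn from the environment
        y = equalizing_response(x)        # response between 0 and 1
        level = min(int(y * n_levels), n_levels - 1)
        counts[level] += 1
    return [count / n_samples for count in counts]

fractions = output_histogram()
```

Each of the four levels receives about a quarter of the stimuli, so common (weak) stimuli occupy a larger share of the response range than rare (strong) ones.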
Similarly, the binding affinities of olfactory receptors might reflect the natural statistics of odors in an organism’s environment. Odors vary across environments and differ in both their frequency and composition (18). For example, some molecules might frequently appear together because they originate from the same source, whereas others are rarely found in the same odor. Additionally, some odors are more important to recognize than others, which corresponds to considering an increased frequency for these odors. Together, the frequencies and correlations constitute the natural olfactory scene.
It is not clear how olfactory receptors can account for natural odor statistics. Merely dedicating more receptors to common odors is not optimal, given the small number of available receptors and the many-to-many relationship between receptors and odors. Further, the value of a receptor is strongly dependent on how it complements the other receptors in the array; many “good” receptors can still create a poor array. Finally, the concentrations of molecules composing an odor can vary widely. Odors need to be distinguished in both quality and quantity; hence receptors must vary in both what molecules they respond to and how strongly they do this. Given the statistics of an olfactory scene, what combination of odorants should different receptors in an array respond to?
We use an information theoretic approach to quantify how well a receptor array is matched to given odor statistics. We generalize Laughlin’s principle to the high-dimensional case and show that optimal receptor arrays should obey two general principles: (i) Each receptor should be active half the time when odors are presented with natural statistics. (ii) The activities of any pair of receptors should be uncorrelated when averaged over all odors presented with natural statistics. If both conditions are satisfied for an array of receptors with binary readouts, all activity patterns are equally likely when odors are presented with natural statistics (see Fig. 1B). The two basic principles may be obvious with some thought, but they usually cannot be satisfied simultaneously. We thus also determine the relative costs of violating the two conditions and use this to carry out numerical and analytical optimizations to determine conditions for optimal receptor arrays. Furthermore, our model implies relationships between the typical ligand concentrations and the ability to discriminate mixtures that have been missed before.
After introducing our general framework below, we first discuss general properties of optimal receptor arrays. We then consider two different classes of natural statistics, for which we find optimal receptors in terms of random matrices. Here, our information theoretic approach provides a combined measure of the array’s performance in multiple aspects—from the resolution of ligand concentrations to the discrimination of mixture composition. We thus finally discuss the trade-off between such potentially mutually exclusive goals and compare our results to experimentally measured receptor arrays.
Results
Odors are mixtures of odorant molecules that are ligands of olfactory receptors. Any odor can be described by a vector $\boldsymbol{c}$ that specifies the concentrations of all possible ligands ($c_i \geq 0$). During a single sniff, the ligands in the odor come in contact with different odor receptors. In the simplest case, the sensitivity of receptor n to ligand i can be described by a single number $S_{ni}$, and the total excitation $e_n$ of receptor n is given by (19, 20)
[1] $e_n = \sum_i S_{ni}\, c_i$
Typical receptors have a nonlinear dose–response curve (21), and the output is thus a nonlinear function of $e_n$. Moreover, receptors are subject to noise (22), e.g., from stochastic binding, which limits the number of distinguishable outputs. To capture both effects, we consider receptors with only two output states, which corresponds to large noise (23). In this case, the activity $a_n$ of receptor n is given by
[2] $a_n = \begin{cases} 0, & e_n \leq 1 \\ 1, & e_n > 1 \end{cases}$
i.e., the receptor is active if its excitation exceeds a threshold. Eqs. 1 and 2 describe the mapping of the odor $\boldsymbol{c}$ to the activity pattern $\boldsymbol{a}$, where the receptor array is characterized by the sensitivity matrix $S_{ni}$ (see Fig. 1C). This activity pattern is then analyzed by the brain to infer the odor $\boldsymbol{c}$. Such a distributed representation of odors in activity patterns has been compared with compressed sensing (24); here we focus on how this representation can be tuned to match the structure of natural odors.
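The mapping from an odor to an activity pattern can be sketched in a few lines. This is a minimal illustration under the assumptions stated in the text: a linear excitation and a unit threshold. The sensitivity values and the odor below are hypothetical.

```python
def excitations(S, c):
    """Linear excitation of each receptor: sum over ligands of sensitivity times concentration."""
    return [sum(s_ni * c_i for s_ni, c_i in zip(row, c)) for row in S]

def activities(S, c, threshold=1.0):
    """Binary readout: a receptor is active (1) if its excitation exceeds the threshold."""
    return [1 if e_n > threshold else 0 for e_n in excitations(S, c)]

# Hypothetical array with 3 receptors (rows) and 4 ligands (columns)
S = [
    [2.0, 0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0, 4.0],
    [0.1, 0.1, 0.1, 0.1],
]
c = [0.4, 0.0, 0.6, 0.0]     # odor that contains ligands 0 and 2
pattern = activities(S, c)   # receptor 0 is excited to 1.1 and is the only active one
```

The third receptor responds weakly to every ligand and stays silent for this odor, which shows why both the choice of ligands and the magnitude of the sensitivities matter.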
We assume that the structure of natural odors in a given environment can be captured by a probability distribution $P_\mathrm{env}(\boldsymbol{c})$ from which odors are drawn. $P_\mathrm{env}(\boldsymbol{c})$ can encode, for example, the fact that some ligands are more common than others or that some ligands are strongly correlated or anticorrelated in their occurrence. Because natural odor statistics are hard to measure (18), we work with a broad class of distributions characterized by a few parameters. We define $p_i$ to be the probability with which ligand i occurs in a random odor. The correlations between the occurrences of ligands are captured by a covariance matrix $p_{ij}$. We expect $p_i$ to be small because any given natural odor typically contains tens to hundreds of ligands (20, 25), which is a small subset of all ligands (18). When a ligand i is present, we assume its concentration $c_i$ has mean $\mu_i$ and standard deviation (SD) $\sigma_i$. Thus, the full natural odor statistics are parameterized by $p_i$, $\mu_i$, and $\sigma_i$ for all ligands i and a covariance matrix $p_{ij}$ in our model.
Optimal Receptor Arrays.
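An optimal receptor array must tailor receptor sensitivities $S_{ni}$ so that the odors-to-activity mapping given by Eqs. 1 and 2 dedicates more activity patterns to more frequent or more important odors as specified by $P_\mathrm{env}(\boldsymbol{c})$. In information theoretic terms, the array must maximize the mutual information $I(\boldsymbol{c}, \boldsymbol{a})$ (26). In our model, the mapping from $\boldsymbol{c}$ to $\boldsymbol{a}$ is deterministic, and I can be written as the entropy of the output distribution $P(\boldsymbol{a})$,
[3] $I = -\sum_{\boldsymbol{a}} P(\boldsymbol{a}) \log_2 P(\boldsymbol{a})$
where the sum is over all possible activity patterns $\boldsymbol{a}$. Note that $P(\boldsymbol{a}) = \int P(\boldsymbol{a}|\boldsymbol{c})\, P_\mathrm{env}(\boldsymbol{c})\, \mathrm{d}\boldsymbol{c}$, where $P(\boldsymbol{a}|\boldsymbol{c})$ describes the mapping from $\boldsymbol{c}$ to $\boldsymbol{a}$. Consequently, I depends on $S_{ni}$ and the odor environment $P_\mathrm{env}(\boldsymbol{c})$. In fact, I is maximized by sensitivities $S_{ni}$ that are tailored to $P_\mathrm{env}(\boldsymbol{c})$ such that all activity patterns $\boldsymbol{a}$ are equally likely (13, 26).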
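Because the mapping is deterministic, the mutual information can be estimated by sampling odors and computing the entropy of the resulting activity patterns. The sketch below assumes a deliberately simple environment (independent ligands at unit concentration) and two hypothetical two-receptor arrays. It is meant as an illustration, not as the procedure used for the figures.

```python
import math
import random
from collections import Counter

def pattern_entropy(patterns):
    """Entropy (in bits) of the empirical distribution of activity patterns."""
    counts = Counter(tuple(p) for p in patterns)
    total = sum(counts.values())
    return -sum(k / total * math.log2(k / total) for k in counts.values())

def sample_patterns(S, p, n_samples=20000, seed=1):
    """Sample odors and return the activity pattern evoked by each one.

    Assumption: ligand i is present independently with probability p[i],
    always at unit concentration; a receptor is active if its summed
    excitation exceeds 1.
    """
    rng = random.Random(seed)
    patterns = []
    for _ in range(n_samples):
        c = [1.0 if rng.random() < p_i else 0.0 for p_i in p]
        pattern = tuple(
            int(sum(s * c_i for s, c_i in zip(row, c)) > 1.0) for row in S
        )
        patterns.append(pattern)
    return patterns

p = [0.5, 0.5]
complementary = [[2.0, 0.0], [0.0, 2.0]]   # each receptor senses its own ligand
redundant = [[2.0, 0.0], [2.0, 0.0]]       # both receptors sense ligand 0

info_complementary = pattern_entropy(sample_patterns(complementary, p))
info_redundant = pattern_entropy(sample_patterns(redundant, p))
```

The complementary array uses all four activity patterns equally often and transmits close to 2 bits, whereas the redundant array transmits only about 1 bit even though each of its receptors is active half the time.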
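The mutual information I can be approximated (27) in terms of the mean activities $\langle a_n \rangle$ and the covariance between receptors, $\mathrm{cov}(a_n, a_m)$, encoded by $P(\boldsymbol{a})$,
[4] $I \approx -\sum_n \left[ \langle a_n \rangle \log_2 \langle a_n \rangle + (1 - \langle a_n \rangle) \log_2 (1 - \langle a_n \rangle) \right] - \frac{8}{\ln 2} \sum_{n<m} \mathrm{cov}(a_n, a_m)^2$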
which is an expansion up to quadratic order in $\mathrm{cov}(a_n, a_m)$. The first term gives the information gained through each receptor in isolation. The second term describes the reduction of information due to correlations between different receptors. For both Eqs. 3 and 4, the maximal mutual information of $N_\mathrm{r}$ bits for $N_\mathrm{r}$ receptors can only be obtained if
[5a] $\langle a_n \rangle = \tfrac{1}{2}$
and
[5b] $\mathrm{cov}(a_n, a_m) = 0 \quad \text{for } n \neq m$
Consequently, in a receptor array optimized for its natural environment, each receptor responds to about half of all odors and any pair of receptors is uncorrelated in its response to odors, assuming odors are presented with frequency $P_\mathrm{env}(\boldsymbol{c})$.
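The quadratic approximation can be checked numerically for a pair of receptors. In the sketch below, the prefactor 8/ln 2 of the correlation penalty is an assumption obtained by expanding the exact two-receptor entropy around half-active, uncorrelated receptors. The function names are hypothetical.

```python
import math

def binary_entropy(q):
    """Entropy (in bits) of a binary variable that is active with probability q."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1.0 - q) * math.log2(1.0 - q)

def approx_information(mean_a, cov_a):
    """Single-receptor entropies minus a quadratic penalty for correlations.

    Assumption: the penalty is (8 / ln 2) * cov^2 for every receptor pair.
    """
    n = len(mean_a)
    single = sum(binary_entropy(q) for q in mean_a)
    penalty = sum(cov_a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
    return single - 8.0 / math.log(2) * penalty

def exact_pair_entropy(cov):
    """Exact entropy of two half-active receptors with covariance cov."""
    probs = [0.25 + cov, 0.25 + cov, 0.25 - cov, 0.25 - cov]
    return -sum(q * math.log2(q) for q in probs if q > 0)

uncorrelated = approx_information([0.5, 0.5], [[0.25, 0.0], [0.0, 0.25]])
correlated = approx_information([0.5, 0.5], [[0.25, 0.05], [0.05, 0.25]])
exact = exact_pair_entropy(0.05)
```

For two half-active receptors with covariance 0.05, the approximation and the exact entropy agree to better than a thousandth of a bit, and both lie below the 2 bits reached without correlations.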
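These design principles follow from very general considerations, but they may not always be simultaneously achievable. To understand such constraints, we study how microscopic properties of receptor arrays (the sensitivities $S_{ni}$) determine both $\langle a_n \rangle$ and $\mathrm{cov}(a_n, a_m)$. The mean receptor activity is given by the probability that the associated excitation exceeds 1, $\langle a_n \rangle = 1 - F_n(1)$, where $F_n$ denotes the cumulative distribution function of $e_n$ (see Supporting Information). The covariance $\mathrm{cov}(a_n, a_m)$ can be estimated in terms of $\mathrm{cov}(e_n, e_m)$ using a normal approximation around the maximum of I (see Supporting Information). These statistics of $e_n$ can be calculated from Eq. 1 and read
[6a] $\langle e_n \rangle = \sum_i S_{ni} \langle c_i \rangle$
[6b] $\mathrm{cov}(e_n, e_m) = \sum_{i,j} S_{ni} S_{mj}\, \mathrm{cov}(c_i, c_j)$
where $\langle c_i \rangle$ and $\mathrm{cov}(c_i, c_j)$ follow from $P_\mathrm{env}(\boldsymbol{c})$.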
Combining Eqs. 4 and 6 to estimate mutual information, we can quantify how well an array’s sensitivities $S_{ni}$ are matched to natural odor statistics $P_\mathrm{env}(\boldsymbol{c})$. As a computational matter, these equations also allow a rapid calculation of mutual information without calculating the full distribution $P(\boldsymbol{a})$.
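The excitation statistics follow directly from the linearity of Eq. 1. The sketch below assumes, for simplicity, that ligands occur independently of each other, so that the covariance of the concentrations is diagonal. The function names and the example array are hypothetical.

```python
def concentration_moments(p, mu, sigma):
    """Mean and variance of each ligand concentration.

    Assumption: ligand i is present with probability p[i]; when present,
    its concentration has mean mu[i] and SD sigma[i]; otherwise it is zero.
    """
    mean_c = [p_i * m for p_i, m in zip(p, mu)]
    var_c = [
        p_i * (m * m + s * s) - (p_i * m) ** 2
        for p_i, m, s in zip(p, mu, sigma)
    ]
    return mean_c, var_c

def excitation_moments(S, p, mu, sigma):
    """Mean vector and covariance matrix of the excitations for independent ligands."""
    mean_c, var_c = concentration_moments(p, mu, sigma)
    mean_e = [sum(s * m for s, m in zip(row, mean_c)) for row in S]
    cov_e = [
        [sum(a * b * v for a, b, v in zip(row_n, row_m, var_c)) for row_m in S]
        for row_n in S
    ]
    return mean_e, cov_e

# Two receptors that share ligand 1 and are therefore correlated
S = [[1.0, 1.0], [0.0, 2.0]]
mean_e, cov_e = excitation_moments(S, p=[0.5, 0.5], mu=[1.0, 1.0], sigma=[0.0, 0.0])
```

The off-diagonal entry of `cov_e` is nonzero only because both receptors respond to the second ligand, which is the microscopic origin of correlated activities.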
Random Sensitivity Matrices.
We next study which sensitivity matrices $S_{ni}$ obey the optimization goals given in Eq. 5 for given odor statistics. Here, we will show that random $S_{ni}$ with independent and identically distributed entries drawn from the right distribution can be close to optimal. This is because such matrices generically have low correlations, and the resulting activities are thus only weakly correlated. In this section, we study what distributions lead to $\langle a_n \rangle = \tfrac{1}{2}$ and under what conditions these matrices minimize $\mathrm{cov}(a_n, a_m)$ for two different classes of odor distributions.
Narrow concentration distributions.
We begin with the simple case where the concentration distributions are narrow, . In this case, we can focus on determining which ligands appear in a mixture. Receptors that are optimal for this task must be highly sensitive to some ligands while they ignore the others, but the exact value of the sensitivity does not matter. This property can be encoded in a binary sensitivity matrix where if receptor n reacts to ligand i and if it does not. We can then calculate activity statistics using Eqs. 2 and 6, as shown in Supporting Information. In the simple case of uncorrelated mixtures ( for ), and . In Supporting Information, we also calculate corrections due to the correlated appearance of ligands (); e.g., , where is the receptor activity in the uncorrelated case.
In the case of uncorrelated mixtures, we find, using Eq. 5, that for optimal receptor arrays must satisfy
[7a] |
and
[7b] |
Receptors are thus optimal if (i) the occurrence probabilities of the ligands they react to add up to 1/2 and (ii) no ligand activates multiple receptors. Because any given ligand is rare in natural odors, , such optimization is equivalent to a partition problem where the probabilities have to be put into groups (i.e., a group of ligands for each receptor), such that the sum of the elements is close to 1/2, while a minimal number of elements should appear in several groups. Eq. 4 gives the relative cost of violating these two possibly conflicting requirements.
This partition problem can be solved approximately using random binary sensitivity matrices. The ensemble of such matrices is characterized by a single parameter, the fraction of nonzero entries or sparsity ξ. Fig. 2A shows that there is an optimal sparsity , at which I is maximized. It follows from that
[8] |
where is the mean mixture size (see Supporting Information). This condition for random matrices agrees well with the sparsity found from numerical optimization over all binary matrices (see Fig. 2B). However, for small s, the sparsity becomes large, which leads to significant correlations and thus reduced performance. Optimal matrices thus have a sparsity that is lower than predicted by Eq. 8 for small mixture sizes s (see Fig. 2B).
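The sparsity at which a random binary receptor is active half the time can also be found numerically. The sketch below assumes independent ligands and averages over the ensemble of random matrices, so that a receptor stays silent with probability equal to the product of (1 − ξ p_i) over all ligands. It is a hypothetical illustration and does not reproduce the closed form of Eq. 8.

```python
def mean_activity(xi, p):
    """Ensemble-averaged probability that a random binary receptor is active.

    Assumption: each ligand i occurs independently with probability p[i], and
    the receptor reacts to it with probability xi (the sparsity).
    """
    inactive = 1.0
    for p_i in p:
        inactive *= 1.0 - xi * p_i
    return 1.0 - inactive

def half_active_sparsity(p, tol=1e-10):
    """Bisect for the sparsity with mean activity 1/2 (None if unreachable)."""
    if mean_activity(1.0, p) < 0.5:
        return None
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_activity(mid, p) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

xi_large_mixtures = half_active_sparsity([0.1] * 50)    # mean mixture size 5
xi_small_mixtures = half_active_sparsity([0.02] * 50)   # mean mixture size 1
```

Smaller mixtures require a larger sparsity to keep receptors active half the time, which is the regime where correlations between receptors become significant.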
Wide concentration distributions.
In reality, odor concentrations vary widely, and receptor arrays must thus measure both odor composition and concentrations. The concentration of a single ligand can be measured if many receptors react to it with different sensitivities (7). The receptor array is optimal for this task if all possible outputs occur with equal frequency. This is the case if the inverse of the sensitivities follows the same distribution as the ligand concentrations (13), which is known as Laughlin’s principle. However, it is not clear how this principle can be generalized for measuring the concentration of multiple ligands simultaneously.
We study this problem by considering random sensitivities that are lognormally distributed. This choice is motivated by the complex interaction between receptors and ligands, which typically leads to normally distributed binding energies (28). We will show later that experimentally measured sensitivities indeed appear to be lognormally distributed. Lognormal distributions are characterized by two parameters, the mean and the SD λ of the underlying normal distribution. We thus next ask how these parameters have to be chosen to maximize the mutual information I. To estimate I, we need to consider the excitations , which approximately also follow a lognormal distribution (29). Their statistics are given by Eq. 6 and read and , where and . We use this to calculate from Eq. 2 and find that the receptor array is optimal () if (see Supporting Information)
[9] |
We test this equation by numerically calculating the mutual information I as a function of and λ. Fig. 3A shows that Eq. 9 predicts the optimal parameters of lognormally distributed sensitivities very well. Fig. 3B shows that this result also predicts the mean for numerical optimizations over general sensitivity matrices.
Lognormally distributed sensitivities perform badly if the distribution width λ is small (see Fig. 3A). This is expected because receptors with narrowly distributed sensitivities respond similarly to all ligands, leading to large correlations and thus reduced performance I. Interestingly, for large enough λ, the correlations are so small that the exact value of λ does not influence I significantly (see Fig. 3A). In fact, for very large λ, the sensitivities are likely very large or very small compared with . When the mean sensitivity is chosen according to Eq. 9, receptors can thus only detect whether ligands are present or not, corresponding to the binary sensitivities discussed above, which cannot resolve the concentration of the ligands. Consequently, λ must influence how well such receptor arrays can resolve concentrations.
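How the mean activity depends on the parameters of lognormally distributed sensitivities can be explored with a small Monte Carlo sketch. The number of ligands, the ligand frequency, the unit concentrations, and the parameter values are illustrative assumptions. The sketch does not implement Eq. 9, but it shows why the mean of the underlying normal distribution must be tuned to the odor statistics.

```python
import random

def mean_activity_lognormal(log_mean, lam, n_ligands=20, p=0.25,
                            n_trials=2000, seed=2):
    """Fraction of (receptor, odor) draws in which the receptor is active.

    Assumptions: every trial draws a fresh random receptor and a fresh odor;
    ligands occur independently with probability p at unit concentration;
    sensitivities are lognormal with underlying mean log_mean and SD lam.
    """
    rng = random.Random(seed)
    active = 0
    for _ in range(n_trials):
        excitation = 0.0
        for _ in range(n_ligands):
            if rng.random() < p:   # ligand is part of this odor
                excitation += rng.lognormvariate(log_mean, lam)
        active += excitation > 1.0
    return active / n_trials

weak = mean_activity_lognormal(log_mean=-6.0, lam=1.0)    # almost never active
strong = mean_activity_lognormal(log_mean=2.0, lam=1.0)   # almost always active
```

Both extremes waste the receptor's single bit. An intermediate value of `log_mean`, at which the activity is close to one half, lies between them and could be located by bisection.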
Trade-off between concentration resolution and mixture discriminability.
When the distribution width λ is large, the receptor arrays have similar performance I, so they are equally good at the combined problem of resolving concentrations and discriminating mixtures. However, the performance in the individual problems can vary widely. Because, in many contexts, we might wish to trade off performance, say, by sacrificing some ability to discriminate mixtures in favor of a better concentration resolution, we next investigate these properties in detail.
We define the concentration resolution R as the ratio of the concentration c at which a single ligand is presented and the concentration change that is necessary to register a change, . Here, we consider the simple case where η additional receptors have to be excited to register a change in concentration. R is a function of the concentration c at which it is measured and its maximal value
[10] |
is obtained for , which is the inverse of the median of the sensitivity distribution (see Supporting Information).
The range of concentrations that can be detected by the receptor array is given by the ratio of the largest concentration at which concentration differences can be detected to the lowest detectable concentration , the odor detection threshold (30). In terms of η, the logarithm of the concentration range reads (see Supporting Information)
[11] |
where is the inverse error function. Eq. 11 shows that λ determines the number of concentration decades over which the receptor array is sensitive.
Taken together, λ has opposing effects on the resolution and the range of concentration measurements (see Fig. 4A). Consequently, λ can be tuned either for receptors that resolve concentrations well or cover a large concentration range. If only single ligands are measured, the optimal λ only depends on the concentration distribution . In this case, the mutual information I can be calculated from the resolution function , and optimizing is equivalent to maximizing I (31). For odor mixtures, I accounts for a combination of the concentration resolution and the mixture discrimination, and maximizing I does not uniquely determine an optimal receptor array. We thus next study how the distribution width λ influences the ability to discriminate mixtures.
We first consider mixtures of s ligands, each at concentration c, and determine the maximal size where adding an additional ligand does not significantly alter the activity pattern. Here is given by the largest s that obeys (see Supporting Information)
[12] |
where with being the cumulative distribution function of a lognormal distribution with mean μ and variance . Fig. 5A shows that increases with decreasing concentrations, but, if the concentration falls below the odor detection threshold, individual ligands cannot be detected (dotted lines).
Not all mixtures with fewer than ligands can be distinguished from each other. We show this by calculating the Hamming distance h of the activity patterns of two mixtures, i.e., the number of differences in the output. For simplicity, we consider mixtures that contain s ligands, sharing of them. In this case, a given receptor is activated by one of the mixtures if , where and are the excitations caused by the shared and the different ligands, respectively. Approximating the probability distribution of the excitations as a lognormal distribution, we can calculate the expected distance h (see Supporting Information). Fig. 5B shows that this approximation (solid lines) agrees well with numerical calculations (symbols). The figure also shows that mixtures can only be distinguished well if the concentration of the constituents is in the right range. This is because receptors are barely excited at concentrations that are too low, whereas they are saturated at high concentrations. The distance h also strongly depends on the number of shared ligands between the two mixtures, which has also been shown experimentally (32). The distance vanishes for , but Fig. 5B shows that a single different ligand can be sufficient to distinguish mixtures in the right concentration range (green line). This range increases with the width λ of the sensitivity distribution, similar to the range over which concentrations can be measured (see Eq. 11). The suitable concentration range is also a function of the mean sensitivity , which, in turn, must be adjusted to the odor statistics (see Eq. 9). Consequently, our model predicts that only mixtures with total concentrations near the average concentration in natural mixtures can be distinguished well.
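The Hamming distance between the activity patterns of two mixtures is straightforward to compute numerically. The array size, the lognormal parameters, and the two mixtures below are hypothetical choices for illustration. They are not the settings used for Fig. 5B.

```python
import random

def activity_pattern(S, c):
    """Binary activity of each receptor for odor c (unit threshold)."""
    return [int(sum(s * c_i for s, c_i in zip(row, c)) > 1.0) for row in S]

def hamming_distance(a, b):
    """Number of receptors whose activity differs between two patterns."""
    return sum(x != y for x, y in zip(a, b))

def mixture(ligands, n_ligands, concentration):
    """Odor vector in which the listed ligands share one concentration."""
    c = [0.0] * n_ligands
    for i in ligands:
        c[i] = concentration
    return c

rng = random.Random(3)
n_receptors, n_ligands, lam = 50, 30, 1.0
# Assumed random array with lognormally distributed sensitivities
S = [[rng.lognormvariate(0.0, lam) for _ in range(n_ligands)]
     for _ in range(n_receptors)]

# Two mixtures of four ligands that share three of them
odor_a = mixture([0, 1, 2, 3], n_ligands, concentration=0.2)
odor_b = mixture([0, 1, 2, 4], n_ligands, concentration=0.2)
h = hamming_distance(activity_pattern(S, odor_a), activity_pattern(S, odor_b))
```

Repeating the calculation over a range of concentrations would show the distance shrinking when receptors are either barely excited or saturated.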
Experimentally Measured Receptor Arrays.
The response of receptors to individual ligands has been measured experimentally for flies (33) and humans (34). We use these published data to estimate the statistics of realistic sensitivity matrices as described in Supporting Information. Fig. 6 shows the histograms of the logarithms of the sensitivities for flies and humans. Both histograms are close to a normal distribution, with similar SDs , which implies lognormally distributed sensitivities. Using a simple binding model between receptors and ligands, can also be interpreted as the SD of the interaction energies (see Supporting Information). Consequently, these interaction energies exhibit a similar variation on the order of 1 for both organisms, which could be caused by the biophysical similarity of the receptors.
We next use the measured lognormal distribution for the sensitivities to compare the concentration resolution R predicted by Eq. 10 to measured “just noticeable relative differences” (23). For humans (), the measured values are as low as 4% (35), which implies . Using , this suggests that about four receptors have to be activated before a change in concentration can be registered. Additionally, our theory predicts that humans can sense concentrations over about 2.6 orders of magnitude, which follows from Eq. 11 for , , and . However, we are not aware of any measurements of the concentration range for humans.
Our theory also predicts the maximal number of ligands that can be distinguished as a function of the concentration c of the individual ligands. For , we expect that the maximal number of ligands in a mixture is around 20 if individual ligands can be detected (see Fig. 5A). Experimental studies report similar numbers, e.g., (36) and (6). However, Fig. 5A shows that strongly depends on the concentration of the individual ligands and thus on experimental details. Similarly, how well mixtures can be discriminated also depends strongly on the ligand concentration. Fig. 5B shows that the concentration range over which mixtures can be distinguished is less than an order of magnitude for .
Discussion
We studied how arrays of olfactory receptors can be used to measure odor mixtures, focusing on the combinatorial code of olfaction, i.e., how the combined response of multiple receptors can encode the composition (quality) and the concentration (quantity) of odors. Such arrays are optimal if each receptor responds to half of the encountered odors and the receptors have distinct ligand binding profiles to minimize correlations.
Our simple model of binary receptors can, in principle, distinguish a huge number of odors, because there are different output combinations for . However, it is not clear whether all outputs are achievable and how they are used to distinguish odors. We showed that the mean receptor sensitivity must be tailored to the mean concentration to best use the large output space. Another important parameter of receptor arrays is the fraction of receptors that is activated by a single ligand, which is equivalent to the sparsity ξ in the simple case of binary sensitivities. If ξ is small, combining different ligands typically leads to unique output patterns that allow identification of the mixtures, but the concentration of isolated ligands cannot be measured reliably, because only a few receptors are involved. Conversely, if ξ is large, mixtures of multiple ligands will excite almost all receptors, such that neither the odor quality nor the odor quantity can be measured reliably. However, here, the concentration of an isolated ligand can be measured precisely. We discussed this property in detail for sensitivities that are lognormally distributed, where the width λ controls whether mixtures can be distinguished well or concentrations can be measured reliably. Interestingly, experiments find that individual ligands at moderate concentration only excite a few glomeruli (37), but natural odors at native concentrations can excite many (38). This could imply that the sensitivities are indeed adapted such that each receptor is excited about half the time for natural odors.
Our model implies that having more receptor types can improve all properties of the receptor array. In particular, both the concentration resolution R and the typical distance h between mixtures are proportional to , a prediction that can be tested experimentally. For instance, mice, with receptor types, are very good at identifying a single odor in a mixture (39), but flies, with (33), should perform much worse. However, quantitative comparisons might be difficult because the discrimination performance strongly depends on the normalized concentration at which odors are presented. In fact, we predict that mixtures can hardly be distinguished if the concentration of the individual ligands is changed by an order of magnitude (see Fig. 5B).
Our results also apply to artificial chemical sensor arrays known as “artificial noses” (40, 41). Having more sensors improves the general performance of the array, but it is also important to tune the sensitivity of individual sensors. Here, sensors should be as diverse as possible while still responding to about half the incoming mixtures. Unfortunately, building such chemical sensors is difficult, and their binding properties are hard to control (41). If the sensitivity matrix of the sensor array is known, our theory can be used to estimate the information that receptor n contributes as where , such that (see Eq. 4). This can then be used for identifying poor receptors that contribute only a little information to the overall results.
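Such a per-sensor estimate can be sketched as follows. The splitting used here is an assumption derived from the pairwise approximation: each sensor receives its own binary entropy minus half of the correlation penalty of every pair it belongs to, so that the contributions sum to the approximate total information. The function names are hypothetical.

```python
import math

def binary_entropy(q):
    """Entropy (in bits) of a binary variable that is active with probability q."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1.0 - q) * math.log2(1.0 - q)

def receptor_contributions(mean_a, cov_a):
    """Approximate information contributed by each receptor.

    Assumption: every pair penalty (8 / ln 2) * cov^2 is shared equally
    between its two receptors, giving a factor 4 / ln 2 per receptor.
    """
    n = len(mean_a)
    contributions = []
    for i in range(n):
        penalty = sum(cov_a[i][j] ** 2 for j in range(n) if j != i)
        contributions.append(binary_entropy(mean_a[i]) - 4.0 / math.log(2) * penalty)
    return contributions

# Sensors 0 and 1 are duplicates of each other; sensor 2 is independent
mean_a = [0.5, 0.5, 0.5]
cov_a = [
    [0.25, 0.25, 0.00],
    [0.25, 0.25, 0.00],
    [0.00, 0.00, 0.25],
]
info = receptor_contributions(mean_a, cov_a)
```

The duplicated sensors receive a lower score than the independent one, which flags them as candidates for replacement. Because the expansion is quadratic, the numbers are only quantitatively reliable for weak correlations.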
Our focus on the combinatorial code of the olfactory system certainly neglects intricate details of the system. For instance, we do not consider the dynamics of sniffing and odor absorption, which are the first processing steps and influence the perception (42). Further, our simple model of the binding of odorants to receptors, described by sensitivity matrices with independent entries, neglects biophysical constraints that will cause chemically similar ligands to excite similar receptors (8, 43). This is important because it makes it difficult to distinguish similar ligands (44), and it might thus be worthwhile to dedicate more receptors to such a part of chemical space. Additionally, receptors or glomeruli might interact with each other, e.g., causing inhibition reducing the signal upon binding a ligand (45). We can, in principle, discuss inhibition in our model by allowing for negative sensitivities, but more complicated features cannot be captured by the linear relationship in Eq. 1. One important nonlinearity is the dose–response curve of individual receptor neurons (21), which we approximate by a step function (see Eq. 2). This simplification reduces the information capacity of a single glomerulus to 1 bit, whereas it is likely higher in reality. However, we expect that allowing for multiple output levels would only increase the concentration resolution and not change the discriminability of mixtures very much (23). Additionally, these perceptual quantities could be influenced by other processes, e.g., lateral inhibition between glomeruli (11, 46) and top-down modulation that adjusts the sensory system based on behavior (46). Besides such enhancements of olfactory sensing, further processing can only remove information, so our results provide an upper bound for the ability to recognize odors.
Receptor Sensitivities
Equilibrium Binding Model.
We consider a simple model where receptors get activated when they bind ligands . This binding is described by the chemical reaction , where is the receptor−ligand complex. The equilibrium of the reaction is characterized by a binding constant , which reads
[S1] |
where is the interaction energy between receptor n and ligand i. In equilibrium, the concentrations denoted by square brackets obey . Hence,
[S2] |
where we consider the case where multiple ligands compete for the same receptor. Here, denotes the fixed concentration of receptors, and is the concentration of free ligands. We consider a simple receptor model in which the excitation of a receptor of type n is proportional to the concentration of bound ligands,
[S3] |
where characterizes the excitability of receptor type n. As discussed in the Introduction, the excitations of all receptors of a given type are accumulated in the respective glomeruli, whose excitation is thus given by , where is the number of receptors of type n. In the simple case of binary outputs, a glomerulus becomes active if its excitation exceeds a threshold , , where denotes the Heaviside step function. We consider the case , where the glomerulus signals before the associated receptors become saturated. In this case, we can linearize Eq. S2 and introduce the rescaled quantities
[S4] |
A simple theory (28) predicts that the interaction energies between receptors and ligands are normally distributed. For the receptor model described above, this implies lognormally distributed binding constants (see Eq. S1). In this case, the sensitivities will also be lognormally distributed (see Eq. S4).
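The link between normally distributed energies and lognormally distributed binding constants can be checked with a few lines. The sketch assumes that the binding constant is the exponential of the negative interaction energy, measured in units of the thermal energy. The sign convention, the energy SD of 1.5, and the sample size are assumptions for illustration.

```python
import math
import random
import statistics

rng = random.Random(4)
energy_sd = 1.5   # assumed SD of the interaction energies (thermal units)
energies = [rng.gauss(0.0, energy_sd) for _ in range(5000)]

# Assumed Boltzmann-like relation between energy and binding constant
binding_constants = [math.exp(-energy) for energy in energies]

# The logarithm of the binding constants inherits the SD of the energies
log_sd = statistics.pstdev([math.log(K) for K in binding_constants])

# A lognormal distribution is skewed: its mean exceeds its median
mean_K = statistics.mean(binding_constants)
median_K = statistics.median(binding_constants)
```

Any constant prefactor, such as the rescaling that turns binding constants into sensitivities, shifts the mean of the logarithm but leaves its SD unchanged.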
Measured Receptor Sensitivities.
Response matrices have been measured experimentally for flies (33) and humans (34). The fly database has been constructed by merging data from many studies that used various methods to measure receptor responses (33). It contains a nonzero response for 5,482 receptor−ligand pairs, covering all 52 receptors that are present in flies. Fig. 6A shows the histogram of the logarithm of the associated sensitivities together with a normal distribution with the same mean and variance as the data.
The only comprehensive study of human olfactory receptors used a luciferase assay to measure receptor responses in vitro (34). It reports the intensity of clones of 511 human olfactory receptors in response to various concentrations of 73 ligands. Typically, the intensity of a given receptor−ligand pair increases monotonically as a function of ligand concentration c. We normalize the intensity to lie between 0 and 1 and fit a hyperbolic tangent function to determine the concentration at which the normalized intensity reaches 0.5. Here, the only fit parameters are the concentration and the slope of the tangent function at this point. We exclude poor fits, where the relative error in either parameter is above . This leaves us with 203 of the 623 receptor−ligand combinations, for which we then define the sensitivity as . Fig. 6B shows the histogram of the logarithm of these sensitivities together with a normal distribution with the same mean and variance as the data.
Receptor Response
We next discuss the statistics of the receptor responses as a function of the odor statistics . We first analyze narrow concentration distributions, where only the presence of ligands needs to be detected and their concentration is of minor interest. Then, we consider the more complex case of wide concentration distributions, which require a distribution of sensitivities for sensing different concentrations.
Narrow Concentration Distribution.
In this case, we are only interested in measuring the composition of an odor . Consequently, it is sufficient to consider binary vectors , where . Instead of parameterizing the statistics by the ligand frequency and the correlations , it will prove useful to write the odor statistics as
[S5] |
Here, denotes the commonness of ligand i (related to ) and parameterizes correlations between ligands i and j (related to ). Without loss of generality, is symmetric with zeros on the diagonal. The associated partition function , which ensures that , reads
[S6] |
where the integral is over all binary vectors of length . Note that in this section on narrow concentration distributions, we use the Einstein summation convention, i.e., we imply summation over repeated indices.
Uncorrelated mixtures.
For uncorrelated mixtures (), the partition function reads . The probability of finding a ligand then reads
[S7] |
where the notation and the index * denote the average with respect to uncorrelated mixtures. Eq. S7 provides the mapping between the commonness used in Eq. S5 and the ligand frequency used in the main text. The covariance follows from
[S8] |
and reads
[S9] |
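The results for uncorrelated mixtures can be verified by brute-force enumeration for a small number of ligands. The sketch below assumes that Eq. S5 has an Ising-like form, P(c) proportional to exp(sum_i h_i c_i + sum_ij J_ij c_i c_j); the names h, J, and the helper functions are illustrative choices. For J = 0, it checks that each ligand frequency equals 1/(1 + exp(-h_i)) and that different ligands are uncorrelated.

```python
import itertools
import math


def odor_probabilities(h, J=None):
    """Enumerate all binary mixtures c and return a dict {c: P(c)}.

    Assumed form: P(c) = exp(sum_i h[i]*c[i] + sum_ij J[i][j]*c[i]*c[j]) / Z.
    """
    n = len(h)
    weights = {}
    for c in itertools.product((0, 1), repeat=n):
        exponent = sum(h[i] * c[i] for i in range(n))
        if J is not None:
            exponent += sum(J[i][j] * c[i] * c[j]
                            for i in range(n) for j in range(n))
        weights[c] = math.exp(exponent)
    Z = sum(weights.values())  # partition function, ensures normalization
    return {c: w / Z for c, w in weights.items()}


def ligand_frequencies(P):
    """Probability that each ligand is present."""
    n = len(next(iter(P)))
    return [sum(p for c, p in P.items() if c[i]) for i in range(n)]


def covariance(P, i, j):
    """Covariance between the presence of ligands i and j."""
    mean_i = sum(p * c[i] for c, p in P.items())
    mean_j = sum(p * c[j] for c, p in P.items())
    return sum(p * c[i] * c[j] for c, p in P.items()) - mean_i * mean_j
```

The enumeration scales as 2^N and is therefore only useful as a check for a handful of ligands.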
The receptor activity , given by Eq. 2, is a function of the excitation , which involves the binary sensitivities . Consequently, and the step function in Eq. 2 can be approximated by
[S10] |
which becomes exact in the limit . We use this to calculate the moments of ,
[S11a] |
and
[S11b] |
where
[S12a] |
[S12b] |
In particular, we have in the limit ,
[S13a] |
[S13b] |
[S13c] |
Hence,
[S14a] |
[S14b] |
We develop these equations to linear order in to obtain Eqs. 7.
The receptor activity for binary sensitivity matrices with independent and identically distributed entries is described by
[S15a] |
[S15b] |
where is the mean number of ligands in a mixture. Here, ξ denotes the sparsity of , which is the only parameter of the random ensemble. The optimal sparsity at which I is maximized is given by the condition . Using Eq. S14a and solving for ξ, we obtain
[S16] |
which, for large at constant s, becomes .
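The condition of half-active receptors can be illustrated with a short calculation. The sketch below makes simplifying assumptions that are not spelled out in the text above: all ligands appear independently with the same probability p = s/N, a receptor is active whenever at least one ligand to which it is sensitive is present, and the average is taken over both odors and random binary sensitivities. Under these assumptions the mean activity is 1 - (1 - ξp)^N, and setting it to 1/2 gives a sparsity that approaches ln(2)/s for many ligands. All function names are hypothetical.

```python
import math
import random


def mean_activity(xi, s, n_ligands):
    """Ensemble-averaged activity 1 - (1 - xi*p)^N with p = s/N."""
    p = s / n_ligands
    return 1.0 - (1.0 - xi * p) ** n_ligands


def optimal_sparsity(s, n_ligands):
    """Sparsity at which mean_activity equals 1/2."""
    return n_ligands * (1.0 - 2.0 ** (-1.0 / n_ligands)) / s


def simulate_mean_activity(xi, s, n_ligands, n_samples=20000, seed=0):
    """Monte Carlo estimate; odor and sensitivities are redrawn per sample."""
    rng = random.Random(seed)
    p = s / n_ligands
    active = 0
    for _ in range(n_samples):
        # A ligand excites the receptor if it is present and the
        # receptor is sensitive to it.
        if any(rng.random() < p and rng.random() < xi
               for _ in range(n_ligands)):
            active += 1
    return active / n_samples


if __name__ == "__main__":
    xi = optimal_sparsity(s=5, n_ligands=50)
    print(xi, math.log(2) / 5, simulate_mean_activity(xi, 5, 50))
```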
Correlated mixtures.
We consider weakly correlated mixtures, where we expand all results to linear order in . Hence,
[S17] |
The probability with which ligand i appears reads
[S18] |
where
[S19] |
with . Hence,
[S20] |
where we used and . Similarly, the covariance between ligands can be calculated from , which involves
[S21] |
and thus reads
[S22] |
where we used Eq. S8. Hence,
[S23] |
where . Taken together, Eqs. S20 and S23 provide the mapping between used in the main text and used in the definition of in Eq. S5.
The statistics of the receptor activity follow from
[S24a] |
and
[S24b] |
which can be expressed as
[S25a] |
[S25b] |
where
[S26a] |
[S26b] |
Expanding the fractions in Eq. S25, we obtain
[S27a] |
[S27b] |
Substituting Eq. S26, this becomes
[S28] |
where, in the last expression, the can also be replaced by to first order in . For the simple case of a random, binary sensitivity matrix with sparsity ξ, we obtain
[S29] |
In the case where the correlations are predominately positive (), the frequency of individual ligands and the receptor response are increased, and , respectively. Consequently, the optimal sparsity must be smaller than in the uncorrelated case to have .
Wide Concentration Distribution.
We next consider mixtures where the concentrations of the individual ligands are drawn from a continuous distribution (). For simplicity, we consider uncorrelated mixtures (, for ), which are characterized by the ligand frequencies , their mean concentrations μi, and their SD σi. In the case where a receptor is excited by many ligands, its excitation can be described by a lognormal distribution (29), which is parameterized by the mean and variance given in Eq. 6. The associated mean receptor activity is then given by the probability that . Using the survival function of a lognormal distribution that is parameterized by its mean and variance we obtain
[S30] |
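The survival function used here can be evaluated with the complementary error function. The following sketch converts the mean and variance of a lognormally distributed excitation into the parameters of the underlying normal distribution and returns the probability of exceeding the threshold. The threshold value of 1 is an assumption about the rescaled excitation, and the function names are illustrative.

```python
import math


def lognormal_params(mean, variance):
    """Return (mu, sigma) of ln(x) for a lognormal x with given mean and variance."""
    sigma2 = math.log(1.0 + variance / mean**2)
    return math.log(mean) - 0.5 * sigma2, math.sqrt(sigma2)


def mean_activity(exc_mean, exc_var, threshold=1.0):
    """Probability that a lognormal excitation exceeds the threshold."""
    mu, sigma = lognormal_params(exc_mean, exc_var)
    z = (math.log(threshold) - mu) / (math.sqrt(2.0) * sigma)
    return 0.5 * math.erfc(z)


def activity_variance(a_mean):
    """Variance of a binary variable with mean a_mean."""
    return a_mean * (1.0 - a_mean)
```

In this parameterization the mean activity equals 1/2 exactly when the median of the excitation, exp(mu), coincides with the threshold.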
Because , the associated variance reads
[S31] |
which also determines the diagonal values of the covariance matrix . For , we have
[S32] |
where is the multivariate distribution of the two excitations and . We approximate by a normal distribution, which describes the excitations and in the vicinity of . This distribution is characterized by the means together with the covariances , which comprise five parameters in total. Hence,
[S33] |
for . The associated covariance follows from the definition , where we obtain the mean by expanding Eq. S30 around the optimal point for small ,
[S34] |
which is the same approximation that also led to Eq. S33. Consequently, we have
[S35] |
The conditions for optimal sensitivity matrices, and , can thus be expressed as (see Eqs. S30 and S35)
[S36a] |
[S36b] |
For small , this reduces to , which indeed leads to in the approximation given in Eq. S34.
Numerical Simulations.
We use a simple two-step procedure to draw odors from the statistics . First, we determine the ligands that appear in a given mixture by drawing a random binary vector with from
[S37] |
analogously to Eq. S5. Here, and determine and according to Eqs. S20 and S22, respectively. In the case of narrow concentration distributions, the odor is given by . For wide concentration distributions, we draw the concentration for each ligand i that appears in a mixture () from a lognormal distribution with mean and SD , which is described by the probability density function
[S38] |
where . Taken together, this describes the odor statistics .
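The two-step procedure can be sketched as follows for the uncorrelated case, where each ligand is drawn independently; sampling correlated binary vectors would require an additional scheme such as Markov chain sampling, which is not shown. The function name and argument layout are hypothetical. The lognormal parameters are obtained from the requested mean and SD of the concentration.

```python
import math
import random


def draw_odor(p, mu, sigma, rng):
    """Draw one odor vector for uncorrelated mixtures.

    p[i]     : probability that ligand i is present
    mu[i]    : mean concentration of ligand i when present
    sigma[i] : SD of that concentration (0 fixes the concentration at mu[i])
    """
    odor = []
    for p_i, mu_i, sigma_i in zip(p, mu, sigma):
        if rng.random() >= p_i:
            odor.append(0.0)            # step 1: ligand is absent
        elif sigma_i == 0:
            odor.append(mu_i)           # narrow concentration distribution
        else:
            # step 2: lognormal concentration with mean mu_i and SD sigma_i
            s2 = math.log(1.0 + (sigma_i / mu_i) ** 2)
            odor.append(rng.lognormvariate(math.log(mu_i) - 0.5 * s2,
                                           math.sqrt(s2)))
    return odor


if __name__ == "__main__":
    rng = random.Random(1)
    print(draw_odor([0.3, 0.8], [1.0, 2.0], [0.5, 1.0], rng))
```

Setting all mean concentrations to 1 and all SDs to 0 reduces the odor to the binary vector drawn in the first step.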
Given these odor statistics and a sensitivity matrix , the mutual information I can, in principle, be calculated from Eqs. 1−3. Calculating to evaluate Eq. 3 involves an integral over all odors of the nonlinear function given in Eq. 2. We approximate this integral using Monte Carlo sampling of the odor statistics . Because of the stochastic nature of Monte Carlo sampling, the calculated I is not exact. Consequently, we use the stochastic, derivative-free numerical optimization method covariance matrix adaptation evolution strategy (CMA-ES) (47) to optimize the sensitivity matrix with respect to I to produce Fig. 3B.
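The Monte Carlo estimate can be sketched as below. The sketch assumes noise-free binary receptors that are active when the summed excitation exceeds 1, so that the mutual information reduces to the entropy of the activity patterns; both points are assumptions about Eqs. 1−3, which are not reproduced here. The helper names, including the toy odor sampler, are hypothetical, and the optimization step with CMA-ES is not shown.

```python
import math
import random
from collections import Counter


def activity_pattern(S, odor, threshold=1.0):
    """Binary activity of each receptor for a given odor vector."""
    return tuple(int(sum(s * c for s, c in zip(row, odor)) > threshold)
                 for row in S)


def estimate_information(S, sample_odor, n_samples=10000, seed=0):
    """Plug-in entropy (in bits) of activity patterns over sampled odors."""
    rng = random.Random(seed)
    counts = Counter(activity_pattern(S, sample_odor(rng))
                     for _ in range(n_samples))
    return -sum((k / n_samples) * math.log2(k / n_samples)
                for k in counts.values())


def coin_flip_odor(rng, n_ligands=2, p=0.5):
    """Toy odor statistics: each ligand present independently with probability p."""
    return [int(rng.random() < p) for _ in range(n_ligands)]


if __name__ == "__main__":
    # Two receptors that each respond to a different ligand: about 2 bits.
    print(estimate_information([[2.0, 0.0], [0.0, 2.0]], coin_flip_odor))
    # Two identical receptors are fully correlated: about 1 bit.
    print(estimate_information([[2.0, 0.0], [2.0, 0.0]], coin_flip_odor))
```

The plug-in estimator is biased toward low values when the number of samples is not much larger than the number of distinct patterns, which is one reason the estimated I fluctuates between runs.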
Properties of Arrays with Random Sensitivities
We study properties of receptor arrays characterized by random sensitivity matrices whose entries are independent and identically distributed. Here, we consider a lognormal distribution for the sensitivities, whose probability density function and cumulative distribution function read
[S39a] |
[S39b] |
and are parameterized by the mean and the width λ, which is the SD of the underlying normal distribution. Note that all following calculations could also be performed for other sensitivity distributions.
Concentration Resolution.
The fraction of receptors that are activated by a single ligand at concentration c reads
[S40] |
The typical concentration change that is necessary to excite η additional receptors is then defined by the condition . Expanding around c, the solution for reads
[S41] |
For lognormally distributed sensitivities, the maximum of the associated resolution is given in Eq. 10.
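The linearization behind the concentration resolution can be made concrete for lognormally distributed sensitivities. The sketch below assumes that a receptor is activated by a single ligand when the product of sensitivity and concentration exceeds 1, and it parameterizes the lognormal distribution by its mean and by the width λ of the underlying normal distribution. The function names are illustrative.

```python
import math


def activated_fraction(c, mean_sens, width):
    """Fraction of receptors with S*c > 1 for lognormal S."""
    mu = math.log(mean_sens) - 0.5 * width**2
    return 0.5 * math.erfc(-(math.log(c) + mu) / (math.sqrt(2.0) * width))


def concentration_step(c, mean_sens, width, n_receptors, eta=1.0):
    """First-order concentration change that excites eta additional receptors."""
    mu = math.log(mean_sens) - 0.5 * width**2
    z = (math.log(c) + mu) / width
    # Derivative of activated_fraction with respect to c.
    dF_dc = math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * width * c)
    return eta / (n_receptors * dF_dc)


if __name__ == "__main__":
    c0 = math.exp(0.5)  # half of the receptors are active for mean 1, width 1
    dc = concentration_step(c0, 1.0, 1.0, n_receptors=300)
    print("resolution c/dc:", c0 / dc)
```

Within this sketch, the resolution c/δc is largest at the concentration where half of the receptors are active, where it equals N/(ηλ√(2π)) for N receptors.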
Concentration Range.
The minimal concentration that can be sensed is defined by the condition , whereas is given by . Solving these equations, the concentration range becomes
[S42] |
where is the inverse function of the cumulative distribution function . For lognormally distributed sensitivities, we obtain Eq. 11.
Maximal Number of Distinguishable Ligands.
In the simple case of a mixture with s ligands, all at concentration c, the fraction of excited receptors is given by
[S43] |
where is the cumulative distribution function of the sum . If the are lognormally distributed, the distribution for can also be approximated by a lognormal distribution (29), which has mean and variance . In this case,
[S44] |
where is the squared coefficient of variation. Fig. S2A shows that Eq. S44 approximates the numerically determined well.
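The lognormal approximation for the sum of sensitivities can be compared with direct sampling. The sketch below matches the mean and variance of a sum of s independent lognormal sensitivities to a single lognormal distribution and evaluates the fraction of receptors whose excitation exceeds 1; the threshold value and the function names are assumptions made for illustration.

```python
import math
import random


def excited_fraction_approx(s, c, mean_sens, width):
    """Moment-matched lognormal approximation of P(c * sum_i S_i > 1)."""
    cv2 = (math.exp(width**2) - 1.0) / s       # squared CV of the sum
    sigma2 = math.log(1.0 + cv2)
    mu = math.log(s * mean_sens) - 0.5 * sigma2
    return 0.5 * math.erfc((math.log(1.0 / c) - mu) / math.sqrt(2.0 * sigma2))


def excited_fraction_mc(s, c, mean_sens, width, n_samples=20000, seed=0):
    """Monte Carlo estimate of the same fraction."""
    rng = random.Random(seed)
    mu = math.log(mean_sens) - 0.5 * width**2
    hits = sum(
        c * sum(rng.lognormvariate(mu, width) for _ in range(s)) > 1.0
        for _ in range(n_samples)
    )
    return hits / n_samples


if __name__ == "__main__":
    for s in (1, 5, 20):
        print(s, excited_fraction_approx(s, 1.0 / s, 1.0, 1.0),
              excited_fraction_mc(s, 1.0 / s, 1.0, 1.0))
```

For s = 1 the approximation is exact by construction, so any difference to the sampled value there is purely statistical.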
We next consider the maximal number of ligands that can be distinguished. For simplicity, we consider the case where mixtures can be distinguished when they excite activity patterns that differ in at least η receptors. Because a mixture with s components on average excites receptors, this condition reads
[S45] |
Expanding as a function of s, this condition can be approximated by
[S46] |
where . Fig. S2B shows that this function has a single peak. Mixtures with ligands can thus all be distinguished from each other if
[S47] |
Here, the first condition ensures that c is above the odor detection threshold, and the second condition ensures that the two largest mixtures excite sufficiently different activity patterns.
Discriminability of Two Mixtures of Equal Size.
We next consider how well two mixtures can be discriminated. For simplicity, we consider two mixtures, each with s ligands of which are shared. We call these two mixtures plus () and minus () to distinguish them. To determine the Hamming distance between the activation patterns, we first consider the excitations of a single receptor caused by the two mixtures,
[S48a] |
Here, denotes the set of ligands appearing in both mixtures, and denote those only appearing in one of the mixtures. Note that we only consider the case where the ligands appear with the same concentration c. The excitations can be rewritten as
[S49] |
where the are random variables. Here, is distributed according to , and are distributed according to . The probability that the receptor activity is the same for both mixtures is given by
[S50] |
The first term can be expressed as
[S51] |
where is the number of ligands that differ between the two mixtures. Here, denotes the probability density function of . Eq. S51 can also be written as
[S52] |
Similarly, the second term in Eq. S50 can be expressed as
[S53] |
where the first term is the probability that the ligands appearing in both mixtures excite the receptor alone. The second term denotes the probability that although is not large enough, both and are sufficient to bring the excitation above threshold. The mean Hamming distance between the activation patterns of the two mixtures then reads
[S54] |
To test this equation, we randomly draw mixtures at given s and , determine their activation pattern according to Eqs. 1 and 2, and determine the associated difference. Fig. S1 shows that Eq. S54 agrees well with these numerical results. Although h is a function of s, , c, , λ, and , the only important parameters are s, , , and λ, because is just a prefactor and only sets the scale of typical concentrations. We can thus explore the behavior by plotting h as a function of s and for different (see Fig. S3). This plot shows that mixtures can be distinguished well when the concentration is in the right interval. Fig. S3 can be used to determine the parameter region in which a receptor array is likely able to distinguish two mixtures. In the simple case where the activity patterns must be different in at least η receptors, mixtures can typically be distinguished if .
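The numerical test described above can be sketched as a direct simulation. The code assumes independent lognormal sensitivities, a common concentration c for all ligands, and receptors that are active when their excitation exceeds 1; it estimates the mean Hamming distance by sampling rather than by evaluating Eq. S54. The function and parameter names are hypothetical.

```python
import math
import random


def mean_hamming_distance(s, s_shared, c, mean_sens, width,
                          n_receptors, n_trials=200, seed=0):
    """Average number of receptors that respond differently to two mixtures.

    Both mixtures contain s ligands at concentration c, of which
    s_shared ligands are common to both.
    """
    rng = random.Random(seed)
    mu = math.log(mean_sens) - 0.5 * width**2

    def summed_sensitivity(k):
        return sum(rng.lognormvariate(mu, width) for _ in range(k))

    total = 0
    for _ in range(n_trials):
        for _ in range(n_receptors):
            shared = summed_sensitivity(s_shared)
            e_plus = c * (shared + summed_sensitivity(s - s_shared))
            e_minus = c * (shared + summed_sensitivity(s - s_shared))
            total += (e_plus > 1.0) != (e_minus > 1.0)
    return total / n_trials


if __name__ == "__main__":
    for shared in (0, 5, 9):
        print(shared, mean_hamming_distance(10, shared, 0.1, 1.0, 1.0, 300))
```

Because the receptors are statistically identical in this sketch, the mean distance is simply the number of receptors times the probability that a single receptor responds differently to the two mixtures.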
Acknowledgments
We thank Carl Goodrich, Venkatesh N. Murthy, and Michael Tikhonov for helpful discussions and a critical reading of the manuscript. This research was funded by the National Science Foundation (NSF) through DMR-1435964, DMR-1420570, and DMS-1411694. M.P.B. is an investigator of the Simons Foundation. D.Z. was also funded by the German Science Foundation through ZW 222/1-1, the NSF through PHY11-25915, the National Institutes of Health Award 5R25GM067110-07, and the Moore Foundation Award 2919.
Footnotes
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1600357113/-/DCSupplemental.
References
- 1. Touhara K, Vosshall LB. Sensing odorants and pheromones with chemosensory receptors. Annu Rev Physiol. 2009;71:307–332. doi: 10.1146/annurev.physiol.010908.163209.
- 2. Su CY, Menuz K, Carlson JR. Olfactory perception: Receptors, cells, and circuits. Cell. 2009;139(1):45–59. doi: 10.1016/j.cell.2009.09.015.
- 3. Mainland JD, Lundström JN, Reisert J, Lowe G. From molecule to mind: An integrative perspective on odor intensity. Trends Neurosci. 2014;37(8):443–454. doi: 10.1016/j.tins.2014.05.005.
- 4. Verbeurgt C, et al. Profiling of olfactory receptor gene expression in whole human olfactory mucosa. PLoS One. 2014;9(5):e96333. doi: 10.1371/journal.pone.0096333.
- 5. Dunkel M, et al. SuperScent—A database of flavors and scents. Nucleic Acids Res. 2009;37(Database issue):D291–D294. doi: 10.1093/nar/gkn695.
- 6. Weiss T, et al. Perceptual convergence of multi-component mixtures in olfaction implies an olfactory white. Proc Natl Acad Sci USA. 2012;109(49):19959–19964. doi: 10.1073/pnas.1208110109.
- 7. Hopfield JJ. Odor space and olfactory processing: Collective algorithms and neural implementation. Proc Natl Acad Sci USA. 1999;96(22):12506–12511. doi: 10.1073/pnas.96.22.12506.
- 8. Malnic B, Hirono J, Sato T, Buck LB. Combinatorial receptor codes for odors. Cell. 1999;96(5):713–723. doi: 10.1016/s0092-8674(00)80581-4.
- 9. Hasin Y, et al. High-resolution copy-number variation map reflects human olfactory receptor diversity and evolution. PLoS Genet. 2008;4(11):e1000249. doi: 10.1371/journal.pgen.1000249.
- 10. Maresh A, Rodriguez Gil D, Whitman MC, Greer CA. Principles of glomerular organization in the human olfactory bulb—Implications for odor processing. PLoS One. 2008;3(7):e2640. doi: 10.1371/journal.pone.0002640.
- 11. Murthy VN. Olfactory maps in the brain. Annu Rev Neurosci. 2011;34:233–258. doi: 10.1146/annurev-neuro-061010-113738.
- 12. Leon M, Johnson BA. Olfactory coding in the mammalian olfactory bulb. Brain Res Brain Res Rev. 2003;42(1):23–32. doi: 10.1016/s0165-0173(03)00142-5.
- 13. Laughlin S. A simple coding procedure enhances a neuron’s information capacity. Z Naturforsch C. 1981;36(9-10):910–912.
- 14. Ruderman DL, Bialek W. Statistics of natural images: Scaling in the woods. Phys Rev Lett. 1994;73(6):814–817. doi: 10.1103/PhysRevLett.73.814.
- 15. Lewicki MS. Efficient coding of natural sounds. Nat Neurosci. 2002;5(4):356–363. doi: 10.1038/nn831.
- 16. Tkacik G, Callan CG, Jr, Bialek W. Information flow and optimization in transcriptional regulation. Proc Natl Acad Sci USA. 2008;105(34):12265–12270. doi: 10.1073/pnas.0806077105.
- 17. Hummel R. Image enhancement by histogram transformation. Comput Graph Image Process. 1977;6(2):184–195.
- 18. Wright GA, Thomson MG. 2005. Odor perception and the variability in natural odor scenes. Integrative Plant Biochemistry, Recent Advances in Phytochemistry (Elsevier, New York), Vol 39, pp 191–226.
- 19. McGann JP, et al. Odorant representations are modulated by intra- but not interglomerular presynaptic inhibition of olfactory sensory neurons. Neuron. 2005;48(6):1039–1053. doi: 10.1016/j.neuron.2005.10.031.
- 20. Lin Y, Shea SD, Katz LC. Representation of natural stimuli in the rodent main olfactory bulb. Neuron. 2006;50(6):937–949. doi: 10.1016/j.neuron.2006.03.021.
- 21. Reisert J, Restrepo D. Molecular tuning of odorant receptors and its implication for odor signal processing. Chem Senses. 2009;34(7):535–545. doi: 10.1093/chemse/bjp028.
- 22. Lowe G, Gold GH. Olfactory transduction is intrinsically noisy. Proc Natl Acad Sci USA. 1995;92(17):7864–7868. doi: 10.1073/pnas.92.17.7864.
- 23. Koulakov A, Gelperin A, Rinberg D. Olfactory coding with all-or-nothing glomeruli. J Neurophysiol. 2007;98(6):3134–3142. doi: 10.1152/jn.00560.2007.
- 24. Stevens CF. What the fly’s nose tells the fly’s brain. Proc Natl Acad Sci USA. 2015;112(30):9460–9465. doi: 10.1073/pnas.1510103112.
- 25. Knudsen JT, Tollsten L, Bergström LG. Floral scents? A checklist of volatile compounds isolated by head-space techniques. Phytochemistry. 1993;33(2):253–280.
- 26. Atick JJ. Could information theory provide an ecological theory of sensory processing? Network. 2011;22(1-4):4–44. doi: 10.3109/0954898X.2011.638888.
- 27. Sessak V, Monasson R. Small-correlation expansions for the inverse Ising problem. J Phys A. 2009;42:055001.
- 28. Lancet D, Sadovsky E, Seidemann E. Probability model for molecular recognition in biological receptor repertoires: Significance to the olfactory system. Proc Natl Acad Sci USA. 1993;90(8):3715–3719. doi: 10.1073/pnas.90.8.3715.
- 29. Fenton LF. The sum of log-normal probability distributions in scatter transmission systems. IRE Trans Commun Syst. 1960;8(1):57–67.
- 30. Abraham MH, Sánchez-Moreno R, Cometto-Muñiz JE, Cain WS. An algorithm for 353 odor detection thresholds in humans. Chem Senses. 2012;37(3):207–218. doi: 10.1093/chemse/bjr094.
- 31. Bialek W. Biophysics: Searching for Principles. Princeton Univ Press; Princeton, NJ: 2012.
- 32. Bushdid C, Magnasco MO, Vosshall LB, Keller A. Humans can discriminate more than 1 trillion olfactory stimuli. Science. 2014;343(6177):1370–1372. doi: 10.1126/science.1249168.
- 33. Münch D, Galizia CG. Door 2.0—Comprehensive mapping of Drosophila melanogaster odorant responses. Sci Rep. 2016;6:21841. doi: 10.1038/srep21841.
- 34. Mainland JD, Li YR, Zhou T, Liu WLL, Matsunami H. Human olfactory receptor responses to odorants. Sci Data. 2015;2:150002. doi: 10.1038/sdata.2015.2.
- 35. Cain WS. Differential sensitivity for smell: “Noise” at the nose. Science. 1977;195(4280):796–798. doi: 10.1126/science.836592.
- 36. Jinks A, Laing DG. A limit in the processing of components in odour mixtures. Perception. 1999;28(3):395–404. doi: 10.1068/p2898.
- 37. Saito H, Chi Q, Zhuang H, Matsunami H, Mainland JD. Odor coding by a Mammalian receptor repertoire. Sci Signal. 2009;2(60):ra9. doi: 10.1126/scisignal.2000016.
- 38. Vincis R, Gschwend O, Bhaukaurally K, Beroud J, Carleton A. Dense representation of natural odorants in the mouse olfactory bulb. Nat Neurosci. 2012;15(4):537–539. doi: 10.1038/nn.3057.
- 39. Rokni D, Hemmelder V, Kapoor V, Murthy VN. An olfactory cocktail party: Figure-ground segregation of odorants in rodents. Nat Neurosci. 2014;17(9):1225–1232. doi: 10.1038/nn.3775.
- 40. Albert KJ, et al. Cross-reactive chemical sensor arrays. Chem Rev. 2000;100(7):2595–2626. doi: 10.1021/cr980102w.
- 41. Stitzel SE, Aernecke MJ, Walt DR. Artificial noses. Annu Rev Biomed Eng. 2011;13:1–25. doi: 10.1146/annurev-bioeng-071910-124633.
- 42. Wachowiak M. All in a sniff: Olfaction as a model for active sensing. Neuron. 2011;71(6):962–973. doi: 10.1016/j.neuron.2011.08.030.
- 43. Hallem EA, Carlson JR. Coding of odors by a receptor repertoire. Cell. 2006;125(1):143–160. doi: 10.1016/j.cell.2006.01.050.
- 44. Perez M, Giurfa M, d’Ettorre P. The scent of mixtures: Rules of odour processing in ants. Sci Rep. 2015;5:8659. doi: 10.1038/srep08659.
- 45. Ukhanov K, Corey EA, Brunert D, Klasen K, Ache BW. Inhibitory odorant signaling in Mammalian olfactory receptor neurons. J Neurophysiol. 2010;103(2):1114–1122. doi: 10.1152/jn.00980.2009.
- 46. Wilson RI. Early olfactory processing in Drosophila: Mechanisms and principles. Annu Rev Neurosci. 2013;36:217–241. doi: 10.1146/annurev-neuro-062111-150533.
- 47. Hansen N. The CMA evolution strategy: A comparing review. In: Lozano JA, Larrañaga P, Inza I, Bengoetxea E, editors. Towards a New Evolutionary Computation. Springer; New York: 2006. pp. 75–102.