World Journal of Otorhinolaryngology - Head and Neck Surgery. 2018 Jun 28;4(1):11–28. doi: 10.1016/j.wjorl.2018.03.001

Measurement of chemosensory function

Richard L Doty
PMCID: PMC6051764  PMID: 30035257

Abstract

Although hundreds of thousands of patients seek medical help annually for disorders of taste and smell, relatively few medical practitioners quantitatively test their patients' chemosensory function, instead taking their complaints at face value. This is clearly not the approach taken with patients complaining of visual, hearing, or balance problems. Accurate chemosensory testing is essential to establish the nature, degree, and veracity of a patient's complaint, as well as to aid in counseling and in monitoring the effectiveness of treatment strategies and decisions. In many cases, patients perseverate on chemosensory loss that objective assessment demonstrates has resolved. In other cases, patients are malingering. Olfactory testing is critical not only for establishing the validity and degree of the chemosensory dysfunction, but also for helping patients place their dysfunction into perspective relative to the function of their peer group. It is well established, for example, that olfactory dysfunction is the rule, rather than the exception, in members of the older population. Moreover, it is now apparent that such dysfunction can be an early sign of neurodegenerative diseases such as Alzheimer's and Parkinson's. Importantly, older anosmics are three times more likely to die over the course of an ensuing five-year period than their normosmic peers, a situation that may be averted in some cases by appropriate nutritional and safety counseling. This review provides the clinician, as well as the academic and industrial researcher, with an overview of the available means for accurately assessing smell and taste function, including up-to-date information and normative data for advances in this field.

Keywords: Smell, Taste, Psychophysics, Age, Identification, Detection, Threshold, Olfaction, Anosmia, Ageusia, Hyposmia, Hypogeusia

Introduction

Chemosensory disorders are common in the general population, impacting safety, nutrition, and quality of life. Persons who cannot smell or taste have less enjoyment from eating, drinking, and the natural environment, and are at higher risk from such dangers as spoiled foods, tainted water, fire, leaking natural gas, and toxic environments. Importantly, olfactory dysfunction can be an early sign of such neurodegenerative diseases as Alzheimer's and Parkinson's.1 Since older persons with smell loss are three times more likely to die over the course of a 4- to 5-year period,2, 3 it behooves the modern physician to be aware of his or her patient's degree of smell function.

Unfortunately, quantitative testing of the senses of taste and smell is rarely, much less routinely, performed in the clinic. Without testing, the accuracy of a patient's chemosensory complaint cannot be definitively established. Indeed, most persons are inaccurate in assessing the nature and degree of their chemosensory problem and considerable return of function can occur, often without patient awareness.4, 5, 6, 7 In one study, for example, only 18% of patients with bilateral anterior tongue taste loss following sectioning of both chorda tympani nerves were aware of their deficit.8 Without testing, it is nearly impossible to detect malingering,9 and it cannot be determined whether a perceived decline in function is normal for the patient's age and sex.10, 11 Without testing, the efficacy of pharmacological, surgical, or other therapeutic interventions cannot be accurately ascertained.

As demonstrated by quantitative testing, smell disturbances are generally believed to be more common than taste disturbances.12 In fact, most patients who complain clinically of a “taste” disturbance actually have altered smell function.12 The flavor of foods, which is often interpreted as “taste”, largely depends upon volatiles that reach the olfactory receptors via the nasal pharynx during deglutition.13 Aside from sweet, sour, bitter, salty, savory (“umami”), and perhaps chalky or metallic sensations, nearly all “taste” sensations are olfactory sensations. This can be demonstrated by holding one's nose while drinking coffee or eating a piece of chocolate. Until the blockage of airflow is released, no coffee or chocolate “taste” will be perceived. Although meaningful decrements in the basic taste-bud mediated qualities can occur, this is relatively rare. The most common bona fide taste problems are dysgeusias or distortions of taste, or persistent phantogeusias, i.e., the presence of taste sensations in the absence of obvious taste stimuli. Salty and bitter phantogeusias are typical, often as side effects of medications.

This review provides up-to-date information on the types of smell and taste tests available for both clinic and laboratory applications. Recent advances in practical ways to test smell and taste function in the clinic are provided, along with normative data for some tests. The focus is on psychophysical tests -- tests that quantify a subject's conscious perception of stimuli. Most such tests are based on 19th and 20th century concepts of Weber,14 Fechner,15 Thurstone,16 Stevens,17 Tanner and Swets,18 and others (e.g., Peryam and Pilgrim)19 and do not rely on complex equipment. Psychophysical tests are generally more sensitive and reliable in detecting and quantifying chemosensory disturbances than extant electrophysiological tests. The latter tests, unlike their auditory counterparts (e.g., auditory brainstem evoked response), cannot reliably identify the locus of pathology within the brain, although summated electrical responses can be measured from the surface of the tongue and olfactory epithelium. Such recordings are difficult to measure and olfactory responses are present long after death, making them a poor surrogate for conscious smell perception. More comprehensive reviews of olfactory and gustatory tests, including electrophysiological tests, are available elsewhere.20, 21, 22, 23, 24, 25, 26, 27, 28

Olfactory tests

Psychophysical olfactory tests can be divided into threshold and suprathreshold categories. Threshold tests establish the lowest concentration of an odorant that can be perceived (detection threshold) or recognized as a quality (recognition threshold). Detection thresholds are lower than recognition thresholds. Unfortunately, some investigators fail to instruct their subjects to make the distinction between detection (which does not require the perception of an actual odor, only some sensation being present) and recognition (which requires such a perception), thereby increasing the variability of their threshold measures. Suprathreshold tests include ones that assess the ability to discern subtle differences between above-threshold concentrations of a given stimulus (e.g., the difference threshold), as well as tests of quality identification, discrimination, memory, intensity, and hedonics (e.g. pleasantness/unpleasantness). Most olfactory tests are strongly correlated with one another, although exceptions exist.29 When a correlation exists between two tests, its magnitude is largely dictated by the least reliable test. Despite different names, chemosensory tests often measure the same underlying physiologic processes. In the case of olfaction, for example, this can reflect the degree of damage to afferent pathways, including the receptor cells within the olfactory epithelium.

Odor threshold tests

A number of odor threshold tests have been developed. Their popularity is due, in part, to the fact that they are akin to pure-tone auditory hearing threshold tests -- tests which are familiar to most physicians. Odor detection threshold tests have achieved the most widespread use, given their relatively high reliability and amenability to forced-choice testing. Nowadays, phenyl ethyl alcohol (PEA) is the most commonly employed odorant in clinical threshold testing, given its relatively low propensity to stimulate intranasal trigeminal afferents, its relatively wide dynamic perceptual range, and its pleasant rose-like smell at higher concentrations.30 Other odorants that have been used clinically include n-butanol (rancid sweet/alcohol),31 amyl (pentyl) acetate (banana-like),32 phenyl ethyl methyl ethyl carbinol (pleasant and mildly herbaceous),33 γ-undecalactone (soft peach-like),34 iso-valeric acid (putrid sweat),34 skatole (vegetable garbage),34 and methyl cyclopentenolone (burnt/caramel).34

In light of the thousands of potential odorants that are available, one may question whether a threshold test score for a given odorant is an accurate measure of a patient's overall olfactory function. The answer is that, with rare exception, persons who are insensitive to one odor tend to be insensitive to other odors, and vice versa.35 One of several potential physiological explanations for this phenomenon is that beginning early in life less-than-total damage (e.g., from viruses) occurs cumulatively over time within the olfactory epithelium. Such damage impacts a range of receptor types whose combinatory processing dictates the perceived intensity and quality of specific odorants.36 Genetic factors may be involved, either alone or in combination, with environmental insults.37

Administration time for olfactory detection thresholds is around 15 min for a person with a normal sense of smell when stimuli are presented manually using devices such as the recently developed Snap & Sniff® “smell wands” (Fig. 1). This time can be reduced by the use of automated self-administered olfactometers that vary stimulus concentrations according to algorithms that take into account subject responses (Fig. 2). Test times for persons with no smell are much quicker, as their responses quickly go off scale since they cannot detect even the highest presented concentration.

Fig. 1.

The Snap & Sniff® threshold test. The test kit comprises 20 smell “wands”. Five contain no odor and the others contain, in the case of the standard odorant phenyl ethyl alcohol, half-log stimulus dilutions ranging from 10⁻² (strongest) to 10⁻⁹ (weakest) concentrations. When the thumb of the operator pushes the black ring forward, an odorized tip is presented to the subject. Releasing the ring retracts the tip back into the wand's housing. This test makes it impossible to touch the nose to the odor source. Photographs courtesy of Sensonics International, Haddon Heights, New Jersey USA. Copyright © 2017, Sensonics International.

Fig. 2.

The self-administered computerized olfactory test system (SCOTS). The dome, which can be readily exchanged for other domes with different sets of odors, contains up to 40 odorants that can be individually released, or released in combination, to the sniffing port. The standard configuration in a single dome is a 12-item smell identification test and a phenyl ethyl alcohol threshold test analogous to the one employed in the Snap & Sniff® threshold test (Fig. 1). Photograph courtesy of Sensonics International, Haddon Heights, New Jersey USA. Copyright © 2017, Sensonics International.

The ascending method of limits (AML) and the single staircase (SS) psychophysical procedures are the two most common means for presenting olfactory stimuli to determine a threshold value.38, 39 In the AML procedure, odorants are presented from low to high concentrations, usually in half-log dilution steps. A transition point between no detection and detection is estimated and repeated runs are performed to increase reliability. The SS procedure typically begins in the same way. However, once the threshold region is reached and reliable detection occurs, the odorant concentration is decreased until incorrect responses occur, at which time concentration is again increased. An average of a number of the up-down transitions (“reversals”) serves as the estimate of the threshold (e.g., the last four of seven reversals). In most AML and SS procedures, forced-choice trials are made, i.e., a blank is presented along with the stimulus in a counterbalanced order on each trial. This results in more reliable threshold values, eliminates response biases (e.g., the tendency to be liberal or conservative in reporting the presence of sensation under uncertain circumstances), and allows for the detection of malingering by persons who are avoiding correct responses. Although in both the AML and SS procedures trials are initially begun at lower concentrations to minimize adaptation from higher concentrations, adaptation appears to be minimal within the perithreshold concentration range.40 In general, SS procedures produce more reliable measures than AML procedures.41 Normative detection threshold data for the Snap & Sniff® wands and the SS procedure are shown in Table 1 and Fig. 3.
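The single staircase logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the protocol of any commercial test: the function name `single_staircase`, the rule that two consecutive correct forced-choice trials are needed before stepping down (`hits_to_descend`), the trial cap, and the use of `None` for an off-scale result are all hypothetical choices. The half-log steps, the 10⁻⁹ to 10⁻² range, and the averaging of the last four of seven reversals follow the text.

```python
def single_staircase(respond, start=-9.0, top=-2.0, step=0.5,
                     n_reversals=7, n_averaged=4,
                     hits_to_descend=2, max_trials=200):
    """Estimate a detection threshold in log10 concentration units.

    respond(conc) must return True when the subject picks the odorant
    (not the blank) on a forced-choice trial at that concentration.
    Returns None when too few reversals occur (e.g., an off-scale subject).
    """
    conc = start          # begin at the weakest concentration
    direction = +1        # +1 = ascending, -1 = descending
    streak = 0            # consecutive correct trials at this concentration
    reversals = []        # concentrations at which the direction changed

    for _ in range(max_trials):
        if len(reversals) == n_reversals:
            break
        if respond(conc):
            streak += 1
            if streak < hits_to_descend:
                continue  # assumed rule: repeat the same concentration
            move = -1     # reliable detection: present a weaker stimulus
        else:
            move = +1     # miss: present a stronger stimulus
        streak = 0
        if move != direction:
            reversals.append(conc)
            direction = move
        # Stay within the available concentration series.
        conc = min(top, max(start, conc + move * step))

    if len(reversals) < n_reversals:
        return None
    tail = reversals[-n_averaged:]
    return sum(tail) / len(tail)


# Idealized observer who is always correct at or above 10^-5 and never below.
estimate = single_staircase(lambda conc: conc >= -5.0)
```

With a real subject, `respond` would wrap the presentation of a wand and a blank in counterbalanced order; the idealized observer here simply makes the up-down behavior easy to follow.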

Table 1.

Percentile ranks for Snap & Sniff® bilateral odor detection threshold scores (n = 386).

Odorant: phenyl ethyl alcohol. Threshold scores ≥ −2.40 log vol/vol are indicative of a threshold deficit. Courtesy of Sensonics International, Haddon Heights, New Jersey USA. Copyright ©2017 Sensonics International.

Fig. 3.

Frequency distribution of Snap & Sniff® bilateral detection threshold test scores. n = 386. Odorant is phenyl ethyl alcohol (PEA). Scores ≥ −2.00 are indicative of total anosmia. Copyright © 2017 Sensonics International, Haddon Heights, NJ USA.

Odor recognition thresholds are less commonly measured than odor detection thresholds in the clinic. One might argue on theoretical grounds that a descending method of limits (where decreasing rather than increasing concentrations are presented) would be more likely to result in a recognition than in a detection threshold, since the initial trials would be at concentrations where a clear qualitative sensation would be more evident, thereby biasing the subject to focus on odor quality rather than subtle differences in intensity. However, this has not been specifically tested. To my knowledge, the only commercially available threshold test that purposively assesses recognition thresholds is the Japanese T&T olfactometer.34 An AML procedure is used for each of five odorants. As the concentration series is ascended, the subject first reports when the detection of an odor occurs. A few concentrations later, the recognition of the odor quality is typically ascertained. Although this test has proved very useful in a number of studies, its reliability is lower than that of threshold measures incorporating forced-choice trials.41

Odor identification tests

Odor identification tests are widely used clinically to determine the degree of a person's olfactory function. A number of such tests have been developed.31, 38, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52 In most cases, a series of odorants are presented and the task of the subject is to identify the odor of each. Since odor identification can be difficult without cues, several response choices are provided and the subject is asked to select the one that best signifies the target stimulus. In forced-choice tests, the subject must select an answer even when no odor is perceived or uncertainty is present.31, 53, 54 Because a number of subjects, particularly older persons, have difficulty with the concept of providing an answer when nothing is smelled, some tests allow for an additional response category of “no smell”.37 However, this negates the possibility of detecting malingering from improbable responding and decreases the chance of identifying the detection of subtle stimuli.9 As noted for threshold tests, the reliability of forced-choice tests is higher than that of non-forced choice tests, in part because forced-choices mitigate the influences of the subjects' response criteria, i.e., the conservativeness or liberalness of reporting the presence of a stimulus under uncertain conditions.20
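The idea of detecting malingering from improbable responding can be made concrete with a binomial calculation. The sketch below assumes a hypothetical forced-choice test of 40 items with four response alternatives per item, so that a person who smells nothing and guesses is expected to get about a quarter of the items right. The function name `prob_at_most` and its default parameters are illustrative assumptions, not values taken from a published scoring manual.

```python
from math import comb


def prob_at_most(n_correct, n_items=40, n_choices=4):
    """Probability of scoring n_correct or fewer by pure guessing.

    Assumes independent items, each with n_choices equally likely
    alternatives (a simple binomial model).
    """
    p = 1.0 / n_choices
    return sum(
        comb(n_items, i) * p**i * (1.0 - p) ** (n_items - i)
        for i in range(n_correct + 1)
    )


# A score far below the chance expectation of 10 is itself improbable.
p_low = prob_at_most(5)
```

Under these assumptions a score of 5 or fewer arises by guessing less than 5% of the time, which is why very low scores on a forced-choice test suggest that correct answers are being avoided. Adding a "no smell" option removes this check, because an absence of correct answers is then no longer improbable.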

In the early 1980s, my colleagues and I developed the 40-item University of Pennsylvania Smell Identification Test (UPSIT; commercially known as the Smell Identification Test, Fig. 4).53 This widely used test has now been translated into over 30 languages and administered to around a million persons worldwide. There are now 3-, 4-, and 12-item versions of this test. The UPSIT's ability to be self-administered and sent through the mail has made it possible to test hundreds, even thousands, of persons within very short periods of time.55, 56, 57 This test was the genesis for a 1986 National Geographic Magazine survey of smell function sent to over 10 million subscribers.58 An 8-item version of this test is now a part of the U.S. National Health and Nutrition Examination Survey (NHANES).59

Fig. 4.

The 40-odor University of Pennsylvania Smell Identification Test (UPSIT).53 This test is comprised of four booklets, each containing 10 microencapsulated (“scratch and sniff”) odors which are released by a pencil tip. The examinee is required to provide an answer on each test item (see columns on last page of each booklet) even if no odor is perceived or the perceived odor does not smell like one of the response alternatives (i.e., the test is forced-choice). Photograph courtesy of Sensonics International, Haddon Heights, New Jersey USA. Copyright ©2017, Sensonics International.

Using normative data from ∼4000 subjects, the UPSIT establishes the degree of a patient's dysfunction in an absolute sense in a manner analogous to auditory testing, i.e., no loss, mild loss, moderate loss, severe loss, and total loss. Moreover, by employing percentile rankings, the degree of relative dysfunction compared to that of healthy persons of the same age and sex can be ascertained (Fig. 5). Such information is useful in establishing whether a patient's function is appropriate for their peer group. Thus, while an older person may notice a loss of function, the function may still be above average for his or her age. It is very therapeutic for a patient to know when this is the case, as it helps him or her to keep the dysfunction in perspective and to minimize the anxiety and depression that can accompany such loss. Over the age of 85 years, nearly 60% of men and women have severe or total loss; 40% of men and 26% of women are totally anosmic.11

Fig. 5.

Male normative data for the University of Pennsylvania Smell Identification Test (UPSIT; n = 1819). Female normative data are found elsewhere (Doty11). Note classification of dysfunction in absolute terms (normosmia, mild microsmia, moderate microsmia, severe microsmia, anosmia), as well as relative terms (percentiles for age categories). Note also that low test scores suggest avoidance of correct responses in a forced-choice situation, i.e., probable malingering. From Doty.11 Copyright ©2017, Sensonics International.

Odor discrimination tests

An odor discrimination test evaluates whether, independent of naming or identification, a subject can perceive the difference between two or more odorants on the basis of their quality. In its simplest form, successive pairs of odorants are presented, sometimes the same and other times different, and the task is to indicate whether they are the same or different. The number of trials in which correct differentiation is made is the test score.60 Other tests have the subject pick the “odd” stimulus from a set of foils that have the same smell (when three stimuli are involved, this is termed a triangle test), or to choose a previously presented odor from a set of stimuli, only one of which is the original odor. When different delay intervals are interspersed between the smelling of a “target” odor and the set of odors containing the target, short-term odor memory is also being measured.61 However, if the odorants are familiar ones, then semantic memory, rather than olfactory memory, may be what is mainly measured. Perhaps the most sophisticated odor discrimination test is one where similarities among a number of odors, presented in pairs, are estimated. The similarity ratings are then subjected to a mathematical algorithm that places the odorants in n-dimensional space (usually two- or three-dimensions), denoting the relative relationships among the stimuli. The spatial representations of the stimuli by persons with olfactory deficits are haphazard, unlike the representations of those without such deficits.62 Because of the lack of normative data and time considerations, this approach has only rarely been performed in the clinic.63

Discrimination tests can also focus on the intensity differences within a concentration series. “Differential thresholds” are defined as the smallest difference that can be discriminated reliably between two suprathreshold concentrations of an odor. Classically, the size of the increment in concentration that can be perceived (ΔI) was termed a just noticeable difference (JND) and served as the discrimination metric. Although ΔI/I is a constant across some segments of the concentration series of a given odorant, it can vary at higher and lower concentrations and varies among odorants. In effect, ΔI/I can be viewed as an index of the odorant's dynamic perceptual range. Differential thresholds are rarely measured in the clinic, largely because of practical constraints, lack of standardization, and the need to define where in a given concentration series the JND should be measured. Nonetheless, practical differential thresholds that differ somewhat from classical JNDs have been used clinically. Eichenbaum et al,64 for example, employed 10 binary dilutions of four odorants as separate stimulus sets (acetone, ethanol, almond extract, lemon extract). Within each set, the highest and lowest concentrations were initially presented to the subject and the task was to identify which of the two stimuli smelled stronger. On subsequent trials, the strongest stimulus of the set continued to be presented while the comparison stimulus was gradually increased in concentration until, after 10 trials, it was the same as the strongest stimulus. This process was repeated twice for each odorant. The differential threshold was defined as the “lowest dilution sample for which identification up to and including that dilution was errorless” (p. 462).
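For readers less familiar with the classical JND mentioned at the start of this paragraph, the constancy of ΔI/I over part of the concentration range can be written as the relation below. The symbol c for the constant is introduced here only for illustration and does not appear in the original text.

```latex
\frac{\Delta I}{I} = c
\quad\Longrightarrow\quad
\Delta I = c\,I
```

As a purely hypothetical example, if c were 0.2 for some odorant, a concentration of 10 units would need to rise by about 2 units to be noticed, and a concentration of 100 units by about 20 units. Where c drifts at the extremes of the series, this proportionality no longer holds.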

Odor intensity tests

Tests seeking to understand the build-up of the perceived intensity of a stimulus as its concentration increases take numerous forms. Rating scales are the most common way to assess the perceived intensity of suprathreshold odorants. Some scales, termed visual analogue or line scales, are labeled only at their extremes (e.g., very weak – very strong).65 Scales in which discrete categories are present are termed category scales.66 Some scales attempt to address the logarithmic-like build-up of sensation as the concentration of an odorant is increased by placing adjectives along the scale in a non-linear fashion that minimizes clumping of responses at the ends of the scale.67, 68, 69 Such scales are termed labeled magnitude scales. Examples of various types of rating scales are shown in Fig. 6.

Fig. 6.

Examples of four types of rating scales. From left to right: (a) a labeled magnitude scale; (b) standard category scale in which the subject provides answers in discrete categories; (c) a visual analogue or graphic scale with anchors (descriptors) at each end; (d) a category scale with logarithmic visual density referents to denote non-linear increasing magnitudes of sensation, with verbal anchors at each end. Copyright © 2002, Richard L. Doty.

Labeled magnitude scales were derived from magnitude estimation procedures in which an estimate of a stimulus' intensity is provided relative to that of other presented stimuli in a ruler- or ratio-like manner. For example, an odor given an intensity value of 6 would be an odor that is perceived as twice as strong as an odor given an intensity value of 3 and half as strong as an odor given an intensity value of 12. The key is maintaining ratio relationships among the perceived intensities. While numbers are typically assigned to the perceived intensities, other means of signifying the relative intensities can also be used, such as pulling a tape measure a distance proportional to the perceived intensity. Usually a given subject is allowed to initially choose the general range of numbers with which he or she feels most comfortable (termed the free modulus method). For example, one subject may choose to assign the intensity of the first odor encountered the number 50, whereas another subject may choose the number 10. If the next odor was twice as strong, the first of these two subjects would estimate its intensity using the number 100 whereas the second would assign the number 20 to its intensity. This method is preferred to the method in which the initial number is set by the experimenter (termed the fixed modulus method), since, depending upon where the chosen number falls in the stimulus concentration continuum, the ratio relations can become distorted.

Data from magnitude estimates are usually plotted on log–log axes (log concentration vs log intensity estimates), where the derived functions typically prove to be linear. The data are best fitted by a power function, ψ = kφⁿ, where ψ = perceived intensity, k = a constant whose logarithm is the Y intercept of the log–log plot, φ = stimulus concentration, and n = the exponent, i.e., the slope of the log–log plot. Although the slope is an index of the change in the rate of sensation as concentration increases, when the free modulus method is employed, differences in distances on the Y axis among individual functions depend upon the idiosyncratic choice of numbers by the subjects. Hence, these distances have little quantitative value. To overcome this problem, investigators have developed a procedure termed cross-modal matching,70 in which stimuli from two modalities, e.g., smell and hearing, are judged relative to one another on the same continuum. For example, in one study intensity functions of young (18–26 years) and old (65–85 years) persons were compared.71 Low pitch broad-band tones were interspersed among the presented odorant concentrations. Under the assumption that the perceived intensity of such tones is not influenced by age, the numbers assigned to the tones can be used as indicants of differences in individual number usage. These numbers can be used to place the individual olfactory functions in proper perspective to one another, making the differences among the individual odor intensity functions on the Y axis meaningful. As shown in Fig. 7, in one study the decrement in reported stimulus strength observed in old persons relative to young ones was consistent across concentrations for the odorant amyl butyrate.
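Because a power function is a straight line on log–log axes, its two parameters can be recovered by ordinary least squares on the logged data. The sketch below is one minimal way to do this in plain Python; the function name `fit_power_function` and the example numbers are hypothetical and are not data from any study cited here.

```python
from math import log10


def fit_power_function(concentrations, estimates):
    """Fit psi = k * phi**n by least squares on log10-log10 axes.

    Returns (k, n): n is the slope of the log-log line and
    log10(k) is its intercept.
    """
    xs = [log10(c) for c in concentrations]
    ys = [log10(e) for e in estimates]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    n = sxy / sxx
    k = 10 ** (mean_y - n * mean_x)
    return k, n


# Two hypothetical subjects with the same exponent but different moduli.
phi = [1.0, 10.0, 100.0, 1000.0]
subject_a = [50.0 * c**0.5 for c in phi]   # chose large numbers
subject_b = [10.0 * c**0.5 for c in phi]   # chose small numbers
k_a, n_a = fit_power_function(phi, subject_a)
k_b, n_b = fit_power_function(phi, subject_b)
```

In this made-up example both subjects yield the same exponent while their k values differ fivefold, which illustrates why vertical offsets under the free modulus method reflect number choice rather than sensation, and why a common reference such as the tones used in cross-modal matching is needed to compare them.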

Fig. 7.

Magnitude estimates given to six concentrations of amyl butyrate after adjustment for number usage by using a cross-modal matching procedure. Each age group was comprised of 10 men and 10 women, the younger ranging from 18 to 26 years and older ranging from 65 to 85 years. From Stevens and Cain,71 with permission. Copyright © 1982, ANKHO International, Inc.

Taste tests

Several taste tests applicable in the clinical setting have been developed, largely paralleling the types of tests employed in olfaction. Thus, both threshold and suprathreshold taste tests have been devised. Such tests can be classified as those that use either chemical stimuli (chemogustometry) or electric stimuli (electrogustometry). Chemical stimuli are presented in numerous ways, including cups or small vessels whose contents can be sampled by “sipping & spitting”4 or, in some cases, swallowed;72 cotton Q-tips or fine brushes previously dipped in tastants;73, 74 syringes;75 medicine droppers;76 micropipettes;4 and small discs or strips made of methylcellulose polymers or filter paper impregnated with tastants.77, 78, 79 In electrogustometry, which requires no rinsing between stimulus presentations, microampere (μA) currents are administered to target regions via small disk-shaped electrodes.80 A modern electrogustometer that makes it possible to apply both anodal and cathodal stimuli to oral regions is shown in Fig. 8, along with a taste testing system that uses disposable plastic tabs whose monometer cellulose ends are embedded with dried tastants.

Fig. 8.

Left: A modern electrogustometer and its electrodes. This device can apply a wide range of perithreshold currents with the electrode on the tongue being either the anode (standard configuration) or cathode. Right: A modern clinical taste testing system employing plastic disposable tabs whose ends are embedded with tastants. Photograph courtesy of Sensonics International, Haddon Heights, New Jersey USA. Copyright © 2017 Sensonics International.

To gain an overall assessment of taste perception, whole-mouth testing is usually employed since it reflects what the patient is actually experiencing during deglutition. However, regional testing can provide information as to whether a given taste nerve is dysfunctional. Regional testing is most practical using electrogustometry, since no rinsing is involved and no lingering stimuli are present from one trial to the next. Although it would be ideal to evaluate the function of the left and right oral regions innervated by each cranial nerve (i.e., CN Ⅶ – anterior fungiform papillae, anterior foliate papillae, soft palate; CN Ⅸ – circumvallate and posterior foliate papillae; CN Ⅹ – esophagus and epiglottal surface), regional testing is usually confined to sectors of the anterior tongue. This is largely because of the gag reflex and difficulties in accurately presenting and localizing stimuli to the soft palate and deep regions of the oral cavity and throat.

Taste threshold tests

Analogous to olfactory threshold tests, a taste detection threshold is defined as the lowest concentration of a tastant that can be discerned from a control, usually water. When identification of a quality is required, then a recognition threshold is being measured. Taste detection thresholds are easy to measure and have been widely employed in academic, medical, and industrial settings. Clinically, electrogustometric threshold measurement is popular, given the ability to present stimuli to small regions of the tongue without the requirement of rinsing between trials.

Chemical threshold tests

Numerous psychophysical procedures for presenting chemical stimuli have been developed, including the AML and the staircase procedures described earlier for olfaction, as well as a descending method of limits (DML) procedure. Harris and Kalmus81, 82, 83 invented a reliable whole-mouth DML test using a sorting procedure. On a given trial, a number of cups are presented, half of which contain an above-threshold concentration of a tastant and the other half water alone. The subject's task is to sort the cups into those with and those without a taste. When this is done correctly, the test is repeated with the tastants being at the next lower log-based concentration. This continues stepwise through lower concentrations until inaccurate sorting occurs. When 14 cups are employed, as in the study by Settle,84 correct identification of 12 or more cups is statistically significant (P < 0.006) and the assumption is made that this stimulus concentration can be discerned at above-chance levels. If 11 or fewer of the sorts are correct (P ≥ 0.029), the assumption is made that concentration is not reliably discernable. In this example, the threshold was defined as the mean of these two final concentrations.
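The probabilities quoted for the 14-cup sort can be approximated with a simple binomial model. The sketch below assumes that each cup is classified independently with a 50% chance of being correct when the subject is guessing; this is a modeling assumption made here for illustration, and the function name `sort_tail_probability` is hypothetical. The source does not state which statistical model was used.

```python
from math import comb


def sort_tail_probability(min_correct, n_cups=14):
    """Chance probability of sorting at least min_correct cups correctly.

    Assumes each cup is an independent 50/50 guess (binomial model).
    """
    favorable = sum(comb(n_cups, i) for i in range(min_correct, n_cups + 1))
    return favorable / 2**n_cups


p_12_or_more = sort_tail_probability(12)   # 106 / 16384
p_11_or_more = sort_tail_probability(11)   # 470 / 16384
```

Under this model the chance of 11 or more correct sorts is about 0.029, matching the value given in the text, while the chance of 12 or more is about 0.0065, close to the quoted bound. The agreement suggests, but does not prove, that a calculation of this kind underlies the criterion.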

Data from a number of subjects tested using the Harris and Kalmus procedure are presented in Table 2. This study examined the whole-mouth taste sensitivity of 308 men and 368 women, all under the age of 56, for the bitter tastants 6-n-propylthiouracil (PROP) and quinine sulfate. Sensitivity to the sour tastant hydrochloric acid was similarly assessed in 163 men and 155 women.85 For hydrochloric acid, women, on average, were more sensitive than men. Despite the fact that an age-related trend was found, the effects were not strong within this age group.

Table 2.

Whole mouth taste thresholds for three stimuli produced by the Harris-Kalmus technique.

Stimulus concentrations (in deionized water)

| Solution number | Molar conc. | Solution number | Molar conc. | Solution number | Molar conc. |
|---|---|---|---|---|---|
| 1 | 7.32 × 10^−7 | 6 | 2.34 × 10^−5 | 11 | 7.50 × 10^−4 |
| 2 | 1.46 × 10^−6 | 7 | 4.69 × 10^−5 | 12 | 1.50 × 10^−3 |
| 3 | 2.93 × 10^−6 | 8 | 9.38 × 10^−5 | 13 | 3.00 × 10^−3 |
| 4 | 5.86 × 10^−6 | 9 | 1.88 × 10^−4 | 14 | 6.00 × 10^−3 |
| 5 | 1.17 × 10^−5 | 10 | 3.75 × 10^−4 | 15 | 1.20 × 10^−2 |

Hydrochloric acid (sour)

| Age group (years) | Men: N | Men: mean (SD) solution number | Women: N | Women: mean (SD) solution number | P value |
|---|---|---|---|---|---|
| 1–10 | 31 | 10.62 (1.43) | 39 | 9.77 (1.25) | 0.0100 |
| 11–20 | 50 | 10.32 (1.14) | 53 | 9.70 (1.28) | 0.0110 |
| 21–30 | 42 | 10.22 (2.32) | 15 | 9.80 (1.69) | NS (0.5239) |
| 31–40 | 21 | 11.52 (1.12) | 20 | 10.30 (0.96) | 0.0006 |
| 41–55 | 19 | 11.68 (1.61) | 28 | 10.43 (1.33) | 0.0057 |

PROP (bitter)

| Age group (years) | Men: N | Men: mean (SD) solution number | Women: N | Women: mean (SD) solution number | P value |
|---|---|---|---|---|---|
| 1–10 | 39 | 9.56 (2.27) | 49 | 9.59 (2.47) | NS (0.953) |
| 11–20 | 112 | 9.37 (2.24) | 148 | 9.47 (2.50) | NS (0.739) |
| 21–30 | 94 | 9.43 (2.48) | 56 | 9.30 (2.81) | NS (0.768) |
| 31–40 | 36 | 9.83 (2.65) | 42 | 9.95 (2.31) | NS (0.831) |
| 41–55 | 27 | 10.81 (2.33) | 63 | 9.84 (2.61) | NS (0.099) |

Quinine (bitter)

| Age group (years) | Men: N | Men: mean (SD) solution number | Women: N | Women: mean (SD) solution number | P value |
|---|---|---|---|---|---|
| 1–10 | 39 | 5.36 (1.72) | 49 | 5.49 (1.87) | NS (0.080) |
| 11–20 | 112 | 4.99 (1.91) | 148 | 4.94 (1.22) | NS (0.798) |
| 21–30 | 94 | 5.19 (1.72) | 56 | 5.44 (1.73) | NS (0.392) |
| 31–40 | 36 | 6.39 (1.98) | 42 | 5.89 (1.75) | NS (0.240) |
| 41–55 | 27 | 5.56 (2.44) | 63 | 6.13 (2.00) | NS (0.250) |

Thresholds are expressed as solution numbers, whose concentrations are listed at the top of the table. Modified from Glanville et al.85
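
The concentrations listed in Table 2 halve with each step down from solution 15, so a solution number can be converted to molarity with a single expression. The sketch below assumes that exact halving holds throughout the series (the tabulated values agree with it to three significant figures) and, as a further assumption not made in the source, that fractional mean solution numbers may be interpolated geometrically. The function name is illustrative.

```python
def molar_concentration(solution_number: float) -> float:
    """Approximate molar concentration for a Table 2 solution number.

    Assumes the series halves at each step down from solution 15
    (1.20e-2 M) and that fractional solution numbers can be interpolated
    geometrically; both are assumptions made for illustration.
    """
    return 1.20e-2 * 2.0 ** (solution_number - 15)

for n in (1, 10, 15):
    print(n, f"{molar_concentration(n):.2e}")
```

Under these assumptions, a mean threshold such as 9.56 falls between the concentrations of solutions 9 and 10.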

Although not necessarily problematic for a clinical test so long as standard presentation protocols are adhered to, it should be noted that a number of factors can impact taste thresholds, including water temperature,86 amount of saliva present in the mouth (NaCl thresholds can be influenced by circadian rhythms in salivary Na+ content),87, 88, 89 intertrial intervals (short intervals can produce “threshold drift”),90, 91 presence or absence of rinsing between trials,92 stimulus volume (smaller volumes can produce higher thresholds),93 stimulus duration (shorter durations can produce higher thresholds),94 and, when locally applied, the number of papillae, hence taste buds, in the tested area.78

The influence of stimulus duration on taste thresholds is shown in Fig. 9.94 In this study, stimuli were flowed over defined regions of the tongue. This was accomplished by glass pipettes attached to the tongue by a vacuum surround. A computerized gustometer flowed boluses of the appropriate duration over the circumscribed lingual regions of interest (Fig. 10).95 As shown in Fig. 11, the more fungiform papillae stimulated, the greater the taste sensitivity.

Fig. 9.

Mean NaCl detection thresholds (±SEM) on the anterior tongue as a function of stimulus duration plotted on log–log axes. The four data points, from left to right, correspond to 200, 400, 750 and 1500 ms. The relationship between threshold and stimulus duration follows a power function. From Bagla et al.94 Copyright © 1997 Oxford University Press.

Fig. 10.

Left: University of Pennsylvania Regional Automated Taste Testing System (RATTS). From Bagla et al.94 Copyright © 2001 Elsevier Science, Inc. Right: Glass stimulation device viewed from below. The stimuli flow through the 25 mm² central chamber (A). A vacuum is present on the annular chamber (B), which holds the device securely to the tongue. The pressure (∼40 mmHg) was calibrated with a differential pressure gauge connected by a tube to the distal end of the vacuum chamber (C). Copyright © 1997 Oxford University Press.

Fig. 11.

Left: Mean (±SEM) threshold values obtained from 8 subjects for NaCl presented to the four anterior tongue regions for two stimulation areas (12.5 and 50 mm²). The number of papillae counted under videomicroscopy is indicated by the dark bars, and threshold values by the gray bars. Note that the threshold scale is inverted, such that greater sensitivity is depicted at the top of the scale. Right: Tongue regions where stimulators were centered. From Doty et al.129 Copyright © 2001 Elsevier Science Inc.

Electrical threshold tests

The ability of electrical stimulation of the tongue to evoke taste sensations was well known to experimenters in the 19th century. Although they did not perform exacting threshold tests, they debated the means by which electric current induced taste sensations in a surprisingly modern manner. For example, in relation to passing current across the tongue, Erb96 noted in 1883, “Whether these sensations are due to the local action of the alkalies and acids produced by electrolysis, or to the stimulation of the nerves of taste or their terminal organs, is still undecided.” To some extent this question remains open today.

Since such early observations, numerous electrogustometers have been developed specifically to test taste function.8, 97, 98, 99, 100 Clinically, such devices are very practical: they are portable, and very low levels of electrical stimuli can be rapidly and safely presented to small regions of the tongue without intervening rinses. Electrical thresholds generally correlate with most chemical thresholds,78 although such correlations are not always large,101, 102 and only under special circumstances do electrical stimuli produce classic taste sensations, usually as a result of cathodal rather than anodal stimulation (e.g., sweet).80, 103, 104 Like chemical thresholds, electrical thresholds vary across studies and depend upon such factors as sex, age, and smoking behavior.105, 106, 107 Because electrode sizes vary considerably among studies, ranging from 12.5 mm² to 234 mm², comparisons across studies can be problematic. Electrodes of 20 mm² are the most common.97, 99, 108, 109, 110, 111 In general, threshold values are higher, i.e., sensitivity is lower, as electrode size becomes smaller.105 Like chemical thresholds, electrical thresholds correlate with the number of underlying fungiform papillae (Fig. 12).108

Fig. 12.

Mean (±S.E.M.) number of fungiform papillae and current density thresholds on four anterior tongue sites for two different sized electrode areas. From Doty et al.129 See Fig. 11 for depicted stimulation sites. Copyright © 2001 Elsevier Science Inc.

Data for electrical thresholds obtained from the tip of the tongue of 74 male and 82 female non-smokers are shown in Table 3.101 These data are presented in μA rather than the original dB values to allow for easy comparisons across studies. A two-alternative forced-choice initially ascending staircase procedure was employed, and stimulus duration was half a second.

Table 3.

Electrogustometric thresholds on the anterior tip region of the tongue as a function of sex and age. Values are mean (SD) in μA. Modified from Pavis et al.106

| Age | Men: N | Men: Left | Men: Right | Men: L&R | Women: N | Women: Left | Women: Right | Women: L&R |
|---|---|---|---|---|---|---|---|---|
| 10–14 | 8 | 5.6 (8.8) | 5.9 (8.5) | 5.7 | 9 | 6.1 (9.1) | 5.1 (8.6) | 5.6 |
| 15–19 | 9 | 5.2 (9.2) | 6.3 (8.1) | 5.7 | 12 | 5.3 (9.1) | 6.4 (8.0) | 5.8 |
| 20–29 | 10 | 10.5 (9.1) | 8.3 (8.0) | 9.3 | 11 | 11.1 (9.3) | 8.1 (8.1) | 9.5 |
| 30–39 | 10 | 15.5 (10.2) | 12.7 (8.1) | 14.0 | 12 | 10.6 (9.8) | 13.0 (9.1) | 11.7 |
| 40–49 | 11 | 20.1 (9.3) | 23.8 (9.3) | 21.9 | 12 | 20.9 (10.0) | 25.7 (9.8) | 23.2 |
| 50–59 | 11 | 31.4 (9.8) | 29.1 (10.0) | 30.2 | 11 | 33.5 (10.4) | 54.6 (10.2) | 42.8 |
| 60–69 | 10 | 49.8 (11.2) | 43.3 (10.0) | 46.4 | 10 | 52.4 (11.0) | 49.2 (10.3) | 50.8 |
| >70 | 5 | 51.8 (10.4) | 42.2 (9.6) | 46.8 | 5 | 54.6 (10.3) | 40.6 (9.8) | 47.1 |
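
The combined L&R values in Table 3 appear to match the geometric rather than the arithmetic mean of the left and right values, which is what averaging on a decibel scale and converting back to μA would produce. This is an inference from the tabulated numbers, not something stated in the source. The sketch below assumes the usual 20·log10 convention for current ratios; the reference current `ref_ua` is a hypothetical parameter, and the averaged result does not depend on its value.

```python
from math import log10, sqrt

def ua_to_db(current_ua: float, ref_ua: float) -> float:
    """Current in microamps expressed in dB relative to a reference current."""
    return 20.0 * log10(current_ua / ref_ua)

def db_to_ua(level_db: float, ref_ua: float) -> float:
    """Inverse of ua_to_db."""
    return ref_ua * 10.0 ** (level_db / 20.0)

def mean_in_db(left_ua: float, right_ua: float, ref_ua: float = 1.0) -> float:
    """Average two currents on the dB scale and return the result in microamps.

    ref_ua is an arbitrary (hypothetical) reference; it cancels out.
    """
    mean_db = (ua_to_db(left_ua, ref_ua) + ua_to_db(right_ua, ref_ua)) / 2.0
    return db_to_ua(mean_db, ref_ua)

# Men aged 20-29 in Table 3: left 10.5, right 8.3, tabulated L&R 9.3
print(round(mean_in_db(10.5, 8.3), 1))  # 9.3
print(round(sqrt(10.5 * 8.3), 1))       # 9.3 (geometric mean)
```

For this row the arithmetic mean would be 9.4, so the distinction is visible at the precision of the table.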

Aside from assessing general taste dysfunction, the most common clinical use of electrogustometry is to assess taste function as a marker for the prognosis of Bell's and related palsies such as Ramsay Hunt syndrome,8, 112, 113 although its general value in this regard has been questioned.114 Electrogustometric thresholds have also been useful in assessing adverse effects of tonsillectomy,8 chemotherapy for cancer,115 and diabetes and its development,116 as well as assessing the success of chorda tympani reconstruction surgery.117, 118

Taste identification tests

As with olfaction, identification tests are useful in assessing taste function in the clinic. In most tests, different concentrations of tastants representing the basic taste qualities are presented in random or quasi-random orders. The task is to report, in forced-choice fashion, which taste quality is perceived on a given trial.

At our center, we employ a regional chemical taste test in which 15 μl of sucrose (0.490 M), sodium chloride (0.310 M), citric acid (0.015 M), and caffeine (0.040 M), equated for viscosity using cellulose to minimize stimulus drift, are micropipetted to left and right anterior and posterior tongue regions. The posterior regions tested are on or near the lateral circumvallate papillae.4 During testing, the stimuli are placed on the target regions of the extended tongue and the subject points to a chart indicating whether the taste sensation is sweet, sour, bitter, or salty. The tongue is then retracted and the mouth rinsed with purified water. The test comprises 96 forced-choice trials (4 tastants × 4 lingual regions × 6 repetitions), with the maximum score for a given tastant across all segments of the tongue being 24.
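
The trial arithmetic and per-tastant scoring described above can be sketched as follows. The record layout (presented tastant, tongue region, response) and all names are hypothetical and are not taken from the test's actual scoring materials.

```python
from collections import Counter

TASTES = ("sweet", "sour", "bitter", "salty")
REGIONS = ("left anterior", "right anterior", "left posterior", "right posterior")

def score_by_tastant(trials):
    """Count correct identifications per tastant.

    `trials` is an iterable of (presented, region, response) tuples;
    this record layout is a hypothetical one chosen for illustration.
    """
    scores = Counter({taste: 0 for taste in TASTES})
    for presented, _region, response in trials:
        if response == presented:
            scores[presented] += 1
    return dict(scores)

# Synthetic full protocol: 4 tastants x 4 regions x 6 repetitions = 96 trials,
# answered perfectly here to show the maximum attainable scores.
trials = [(t, r, t) for t in TASTES for r in REGIONS for _ in range(6)]
scores = score_by_tastant(trials)
print(len(trials), scores)
```

With four response alternatives, pure guessing would be expected to yield about 6 of 24 correct per tastant.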

The data from a study using this test to evaluate the influences of terbinafine (Lamisil®) on taste function are presented in Fig. 13. The test scores are averaged across tongue regions (left and right, front and back) and, despite the small sample, verify the complaints of subjects who presented with taste dysfunction from this oral medication.119

Fig. 13.

Influence of terbinafine (Lamisil®) on taste identification test scores for stimuli representing the four major taste qualities.119 The agents were presented to left and right anterior and posterior regions of the tongue using micropipettes. The test scores represent the summation of scores across all four lingual regions. Dark bars are terbinafine patients and gray bars controls (see text for details). From Doty & Haxel.119 Copyright © 2005 The American Laryngological, Rhinological and Otological Society, Inc.

In our whole-mouth chemical taste test, 10 ml samples of five concentrations each of sucrose, sodium chloride, citric acid, and caffeine are sipped and then expectorated. The stimuli are presented in a counterbalanced order and testing is performed twice, with a rinse of purified water occurring between trials.120 A total of 40 trials (4 tastants × 5 concentrations × 2 presentations) is presented, and the total possible identification score for a given tastant is 10, with 40 being the maximum for the overall test.

In a study employing this procedure, early-stage Parkinson's disease patients were found to be less able than controls to accurately identify the salty taste of sodium chloride and the bitter taste of caffeine.121 The findings for caffeine are shown in Fig. 14.

Fig. 14.

Mean (±SEM) percent correct performance of 29 PD patients and 29 matched controls in identifying the bitter taste quality of caffeine at each of five stimulus concentrations. From Doty et al.121 Copyright © 2015 Springer.

A practical way of presenting chemical tastants to subjects in the clinical setting is to employ circular disks or elongated “taste strips” in which tastants are embedded into filter paper or other materials (see example of such strips in Fig. 8). One novel material is pullulan (α-1,4-; α-1,6-glucan) combined with the polymer hydroxypropyl methylcellulose.77 Disks made of this material dissolve directly on the tongue and do not require removal after stimulation.

In a popular 32-trial test, filter paper strips impregnated with sucrose (0.05, 0.10, 0.20, and 0.40 g/ml), citric acid (0.050, 0.090, 0.165, and 0.300 g/ml), sodium chloride (0.016, 0.040, 0.100, and 0.250 g/ml), and quinine hydrochloride (0.0004, 0.0009, 0.0024, and 0.0060 g/ml) are applied to each side of the anterior tongue.79 A few blank trials are given as well, and rinsing occurs between trials. The subject indicates whether each stimulus is sweet, sour, bitter, or salty. This test is sensitive to age and sex, as shown in Table 4.

Table 4.

Normative data (percentiles) for a taste study using filter paper strips for men and women according to three age groups. Data, which were essentially equivalent for the left and right sides of the tongue, represent left and right sides combined. Modified from Landis et al.79

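| Percentile | Age 18–40 years: Women (n = 141) | Age 18–40 years: Men (n = 84) | Age 41–60 years: Women (n = 122) | Age 41–60 years: Men (n = 84) | Age > 60 years: Women (n = 55) | Age > 60 years: Men (n = 51) |
|---|---|---|---|---|---|---|
| 10th | 19 | 17 | 15 | 9 | 10.2 | 9 |
| 25th | 23 | 21 | 19 | 13 | 16 | 13 |
| 50th | 27 | 25 | 24 | 21 | 22 | 19 |
| 75th | 30 | 28 | 27 | 24.75 | 26 | 24 |
| 90th | 32 | 30 | 30 | 27 | 28.4 | 25 |
| Mean (SD) | 26.3 (5.1) | 24.3 (5.3) | 23.0 (5.7) | 19.1 (7.1) | 20.6 (6.5) | 18.2 (5.9) |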
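
One way such normative percentiles might be used is to look up the highest tabulated percentile that a patient's score reaches within the matching age and sex group. The sketch below transcribes the Table 4 cut-offs; the lookup rule, function name, and group keys are illustrative assumptions rather than a published scoring procedure.

```python
# Percentile cut-offs (10th, 25th, 50th, 75th, 90th) transcribed from Table 4.
PERCENTILES = (10, 25, 50, 75, 90)
NORMS = {
    ("18-40", "women"): (19, 23, 27, 30, 32),
    ("18-40", "men"): (17, 21, 25, 28, 30),
    ("41-60", "women"): (15, 19, 24, 27, 30),
    ("41-60", "men"): (9, 13, 21, 24.75, 27),
    (">60", "women"): (10.2, 16, 22, 26, 28.4),
    (">60", "men"): (9, 13, 19, 24, 25),
}

def highest_percentile_reached(score, age_group, sex):
    """Highest tabulated percentile whose cut-off the score meets.

    Returns None when the score is below the 10th-percentile cut-off.
    The lookup rule is an illustrative assumption.
    """
    reached = None
    for pct, cutoff in zip(PERCENTILES, NORMS[(age_group, sex)]):
        if score >= cutoff:
            reached = pct
    return reached

print(highest_percentile_reached(22, "41-60", "men"))    # 50
print(highest_percentile_reached(22, "18-40", "women"))  # 10
print(highest_percentile_reached(8, ">60", "men"))       # None
```

The same raw score of 22 thus sits at or above the median for men aged 41–60 but only at or above the 10th percentile for women aged 18–40.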

A perplexing issue with taste identification testing is that many subjects consistently confuse the quality of above-threshold tastants. For example, in one study, sour stimuli were called bitter by 19%, and salty by 2.4%, of the 1000 subjects. Bitter stimuli were termed sour by 11.4% and salty by 3.5% of the participants. Salty stimuli were called bitter by 7.3% and sour by 7.0%.

Age and sex influenced some confusions (e.g., 30.7% of those > 68 years of age exhibited sour-bitter confusions, compared to only 13.5% of those < 50 years of age). Subjects who were most sensitive to the bitter taste of phenylthiocarbamide (PTC) had fewer sour-bitter confusions than those less sensitive to this compound (30.9% vs 40.7%). The basis of taste confusions is not entirely clear, although the authors concluded that both biological and experiential factors are likely involved.

Taste intensity tests

Suprathreshold taste intensity has been assessed in academic and clinical settings. In 1932, Fernberger122 had subjects assign the taste of PTC to the categories of “tasteless”, “slightly bitter”, “bitter”, “very bitter”, and “extremely bitter”. This study is probably the first to use a category scale to measure the relative intensity of a tastant's quality.

In addition to identifying the quality of each tastant presented in the Center's 40-trial whole-mouth test (see section above), each subject rates the intensity of a given stimulus on the category scale with logarithmic visual density referents depicted in Fig. 6. The relationship between NaCl concentrations and perceived intensity ratings using this test is shown in Fig. 15.4 This study addressed the question of whether anosmia or hyposmia impacts taste perception. As can be seen in the figure, taste perception was not altered by smell loss, and NaCl intensity ratings were linearly related to the logarithm of the tastant concentrations.

Fig. 15.

Mean (±SEM) whole mouth taste intensity ratings for five concentrations of NaCl (salty) in anosmics, hyposmics and normosmics. P values are for olfactory function group main effects from ANOVAs performed on data from each stimulus concentration. The data show no effect of smell impairment on intensity ratings for NaCl. From Stinton et al.4 Copyright © 2010 American Psychological Association.

An example of the advantage of a labeled magnitude scale over a category scale for assessing taste intensities is shown in Fig. 16.123 In this graph, the scale values are plotted as a function of the number of fungiform papillae within the sampled tongue regions. Note that the labeled magnitude scale shows a clear association between the variables whereas the category scale does not.

Fig. 16.

Perceived taste intensity as a function of fungiform papillae density on the anterior tongue. Graphs on the left were obtained with the general labeled magnitude scale (gLMS); graphs on the right were obtained with a nine-point category scale. All correlations between the papilla density and perceived intensity on the left were statistically significant (P < 0.05), but only the correlation for quinine was significant on the right. From Snyder et al.123 Copyright © 2004 Imprint Academic.

Suprathreshold taste scaling has also employed magnitude estimation procedures. In a pioneering study published in 1948, Lewis124 presented a series of various “standard” concentrations of sweet, sour, bitter and salty tasting stimuli to 6 subjects. Each taste quality was separately tested. The subjects' task was to choose from the array of concentrations those that were perceived as half as strong as each of the presented standards (method of fractionation). From this, magnitude scales were derived that fit power functions, with the slope of the functions being around 1.0. Subsequently, Stevens125 reported power function exponents of 1.3, 1.3, and 0.8 for sucrose, NaCl, and saccharin, respectively.
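
To make the exponents concrete, a power function of the form ψ = k·φ^n implies that multiplying the concentration by a factor r multiplies the perceived intensity by r^n. The sketch below applies this to a doubling of concentration using the exponents quoted above; the function name is illustrative, and the constant k cancels in the ratio.

```python
def intensity_ratio(concentration_ratio: float, exponent: float) -> float:
    """Ratio of perceived intensities under a power law psi = k * phi**exponent."""
    return concentration_ratio ** exponent

# Exponents quoted in the text for sucrose, NaCl, and saccharin.
for name, n in (("sucrose", 1.3), ("NaCl", 1.3), ("saccharin", 0.8)):
    print(name, round(intensity_ratio(2.0, n), 2))
# sucrose 2.46
# NaCl 2.46
# saccharin 1.74
```

An exponent above 1 therefore means perceived intensity grows faster than concentration, whereas an exponent below 1 means it grows more slowly.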

Magnitude estimation studies using electrical stimuli have also been reported, with exponents being quite similar to those observed for taste stimuli. In a study of six subjects, Jauhiainen et al.,126 using bipolar stimulation, found that the magnitude of electric taste followed a power function with an exponent of 1.2. Helmbrecht,127 using pulsed stimulation, reported an exponent of 1.08 averaged across several different pulse sequences.

The most sophisticated electrogustometry study to date used the method of cross-modal matching to better define the relationship between current and perceived intensity.128 Single half-second square wave anodal pulses were administered to each side of five lingual and soft palate regions: tongue tip 1 cm from the midline; anterior tongue side 2 cm from tip on lateral margin; posterior tongue side in region of foliate papillae; posterior medial area of circumvallate papillae; soft palate 1 cm from the midline and 1 cm above the superior pole of the anterior palatine arch. The stimuli, which were applied in random order with regard to locus and magnitude, were interspersed with 1000 Hz pulses of white noise ranging from 40 to 60 dB SPL in 5 dB increments. The 12 subjects (age range: 20–26 years) had normal hearing with thresholds ≤ 20 dB over the 250–8000 Hz frequency range.

As shown in Fig. 17, the slope of the power function for the intensity on the tip of the tongue (exponent = 1.18) was 69% steeper than those of the other tongue regions (P = 0.002), which did not differ significantly from one another (exponents for anterior side = 0.75, posterior side = 0.74, posterior medial area = 0.64, and soft palate = 0.66). The midpoint of the function for the anterior tongue (22 dB – EGM), a measure of the absolute intensity of the response curve, differed significantly from the midpoints of the functions of the other tongue regions. Thus, both the build-up in sensation as a function of increasing current and the overall perceived intensity of the stimulus were greatest on the front of the tongue.
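
The 69% figure is consistent with comparing the tongue-tip exponent against the unweighted mean of the other four exponents. How the percentage was actually computed is not stated, so the sketch below is only a plausibility check under that assumption.

```python
tip_exponent = 1.18
other_exponents = {
    "anterior side": 0.75,
    "posterior side": 0.74,
    "posterior medial area": 0.64,
    "soft palate": 0.66,
}

# Assumption: the comparison is against the unweighted mean of the other regions.
mean_other = sum(other_exponents.values()) / len(other_exponents)
percent_steeper = (tip_exponent / mean_other - 1.0) * 100.0

print(round(mean_other, 4), round(percent_steeper))  # 0.6975 69
```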

Fig. 17.

Perceived magnitude of electrical taste (converted to a tone level) as a function of electrical stimulus intensity in dB using an electrogustometer. n = 12. From Salata et al.128 Copyright © 1991 Oxford University Press.

Conclusions

This review has provided an overview of the numerous psychophysical tests available for quantitatively assessing the senses of taste and smell. As with vision, hearing, and balance, it is now possible to accurately determine the nature and degree of chemosensory dysfunction based upon straightforward psychophysical tests. As with these other senses, the degree of dysfunction, i.e., whether mild, moderate, severe, or total loss is present, can be easily determined. Moreover, most cases of malingering can be detected using forced-choice testing. Unlike electrophysiological tests, psychophysical tests directly reflect conscious experience and are more practical in terms of cost and training of administrative personnel.
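
The logic behind detecting malingering with forced-choice tests is that a person who truly cannot perceive the stimuli should score near chance, so a score far below chance suggests that correct answers are being deliberately avoided. A minimal sketch of that reasoning, assuming independent trials with equally likely response alternatives (a binomial model), is given below; the function name and the example score are hypothetical and do not come from any patient data.

```python
from math import comb

def p_at_most(k: int, n: int, chance: float) -> float:
    """Probability of k or fewer correct responses out of n under pure guessing.

    Assumes independent trials with a fixed chance level (binomial model).
    """
    return sum(
        comb(n, i) * chance**i * (1 - chance) ** (n - i) for i in range(k + 1)
    )

# Hypothetical case: 2 correct out of 40 four-alternative trials (chance = 0.25).
p = p_at_most(2, 40, 0.25)
print(f"{p:.4f}")
```

A very small probability here indicates that so low a score is unlikely to arise from guessing alone; where to place the cut-off is a clinical judgment that is not specified in this review.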

Of the available smell and taste tests, identification and detection threshold tests are the most widely employed, reflecting their relative ease of administration and high reliability. Identification tests are generally preferred, since they can be self-administered and reflect the overall function of the involved senses. Identification tests tap into the fact that the chemical senses have evolved to provide key information about the environment critical for survival, and it is the entire system, including the peripheral receptors, afferent nerves, and multiple regions of the brain, that is taken into account by such tests. Is the air or water safe? Is a food poisonous or edible? Is there a smell that signals impending danger, such as of a fire or leaking gas? To achieve these ends, both innate and acquired processes are involved in chemoreception, in a manner similar to what occurs in vision. As with other sensory systems, periodic assessment of chemosensory function would seem to be in the best interests of patients and physicians alike, particularly since dysfunction of these senses can be a warning sign for impending health issues that impact not only quality of life, but longevity as well.

Financial disclosures

The author receives funding from the Michael J. Fox Foundation for Parkinson's Research. He is a consultant to Acorda Therapeutics, Eisai Co, Ltd, and Johnson & Johnson. He receives royalties from Cambridge University Press, Johns Hopkins University Press, and John Wiley & Sons, Inc. He is president of, and a major shareholder in, Sensonics International, a manufacturer and distributor of smell and taste tests, some of which are mentioned in this article.

Edited by Yi Fang

Footnotes

Peer review under responsibility of Chinese Medical Association.

Articles from World Journal of Otorhinolaryngology - Head and Neck Surgery are provided here courtesy of Chinese Medical Association
