Author manuscript; available in PMC: 2026 Mar 31.
Published in final edited form as: J Opt Soc Am A Opt Image Sci Vis. 2025 May 1;42(5):B432–B442. doi: 10.1364/JOSAA.545286

Application of the Angular Indication Measurement and Foraging Interactive D-Prime Paradigms to Tablet-Based Color Vision Testing

Christabel Arthur 1,*, Jingyi He 2, Peter J Bex 3, Jan Skerswetat 3,4, Michael A Crognale 1
PMCID: PMC13035409  NIHMSID: NIHMS2150882  PMID: 40793549

Abstract

Numerous computer-based tests are available for evaluating color vision deficiencies (CVDs). Here we demonstrate the application of two paradigms—the angular indication measurement (AIM) and the foraging interactive D-prime (FInD)—to a tablet-based assessment of color detection and discrimination. Comparison between the anomaloscope and all other tests, including the Cambridge color test (CCT), the Mollon–Reffin test, and the AIM and FInD tests, showed good agreement in identifying color deficiencies. The defect-type classification revealed that AIM color discrimination showed the highest agreement with anomaloscopy, whereas AIM color detection showed the lowest agreement. Combining either AIM or FInD detection and discrimination data resulted in better agreement than any single test. The results suggest that the AIM and FInD tablet tests provide relatively rapid, useful, and informative CVD screening, with portability suitable for field research and the clinic.

1. INTRODUCTION

A key aspect of detailed vision assessment is color vision testing. Comprehensive color vision testing is important to screen not only for congenital color vision deficiencies [1–3] but also for acquired color vision deficiencies (CVDs), which may indicate underlying pathology [4–7]. The distribution of congenital and acquired CVDs across different populations worldwide is not well established [8], due in part to a paucity of testing in underdeveloped parts of the world. The availability of color vision tests is also important because color-normal vision is a requirement for employment in some professions [1].

Color vision testing paradigms include naming targets in pseudoisochromatic plates, color matching with an anomaloscope, ordered arrangement of color samples, and identification (naming) of the colors in a test lantern [1,9–11]. Tests such as the Ishihara pseudoisochromatic plates [12] are designed to classify patients’ color vision as either normal or abnormal without a measure of severity. Some of these tests also attempt to differentiate between protans and deutans with varying degrees of success. Other tests, such as the HRR test [13], not only classify the type of CVD but also provide a rough estimate of the severity of the anomaly as “mild,” “medium,” or “strong.” For tests such as the FM-100 hue test [14], classifications are based on normative datasets that indicate whether a patient’s results are within or outside normal limits. Color matching via anomaloscope is particularly good at classification (protan versus deutan) and detection of CVD, as are the Rabin Cone Contrast Test [15] and the color assessment and diagnosis (CAD) test [16]. In addition to classification, the Cambridge color test [17] provides an estimate of detection thresholds. Color vision tests that can quantify discrimination or detection, like the Cambridge color test, afford information that is particularly useful for tracking acquired CVD [7].

The advantages and disadvantages of the many color vision tests have been reviewed elsewhere (e.g., [9–11,18]). For clinical practice, test simplicity, speed, reduction of administrator effort, and ease of classification (e.g., via plate tests) are priorities, while for research, more exacting and quantitative data (e.g., via numerical threshold measurements) are typically preferable. However, a comparison across color vision tests is difficult because different tests express results in different units and often measure different capacities. In fact, the expression of the results in terms of the most basic physiological determinant of color vision, cone contrast, is not easily accomplished because of the large variation in photopigment types [19], cone ratios [20], pigment density [21], rod contribution [22], and preretinal filtering [23] in both the general and color-vision-deficient populations.

One of the biggest drawbacks of many color vision tests is that the results can be greatly influenced by the motivation and/or criterion of the observer. Malingering, or even careless responding, can inflate error scores and lead to overestimation of the severity of color deficiency, while heightened attention can improve scores. These problems can be mitigated by paradigm selection. Practice, memorization, and non-standard lighting can also reduce errors, generating an underestimation of the severity of loss [24].

Computerized testing platforms are becoming increasingly common as they solve some of the issues encountered in the recording and administration of color vision tests. Computerized tests can account for color variance and lighting errors [9,18]. Examiner errors and bias are also limited by the automation of scoring. However, variation across device models could cause differences in stimulus display and potentially lead to inconsistent results [25]. Nonetheless, because computerized platforms, such as tablets and even smartphones, are portable and increasingly affordable, they are improving the ease with which color vision testing can be done outside the clinic and even in remote regions.

Recently, a set of computerized tests of visual function has been developed that utilizes two novel paradigms, the angular indication measurement (AIM) and the foraging interactive D-prime (FInD) [26,27]. The AIM and FInD paradigms are generalizable, self-administered, response-adaptive platforms for assessing visual perception. The AIM paradigm estimates accuracy from angular report error, i.e., the difference between the indicated and actual orientations. The FInD paradigm computes a signal-to-noise ratio (d′) based on the detectability of the stimulus. A key feature of both paradigms is the presentation of multiple stimuli per chart in randomized order, ranging from high to low intensity. Both paradigms thus remove the memory biases that are problematic in printed tests and ensure that every chart includes some stimuli visible to the participant, avoiding the frustration that near-threshold runs can cause in conventional forced-choice methods. Successful application of the AIM and FInD tests to measure color detection and discrimination in adults with and without CVDs has also recently been reported [28,29].

In the present study, we examine the feasibility of administering a tablet-based version of the AIM and FInD tests. The platform employs touch-screen responses and on-screen instructions to provide for self-administration and could be administered in non-clinical settings. This diagnostic suite offered by PerZeption Inc. includes the AIM and FInD tests for a range of visual function assessments, including color vision, visual acuity, contrast sensitivity, motion, and form coherence. Both AIM and FInD color detection and discrimination tasks were used to assess the visual performance in participants with and without CVDs. We also tested the participants with a battery of some common color vision tests, including the Cambridge color test (CCT), the Mollon–Reffin test, the Farnsworth–Munsell hue test (FM-100), and the Oculus HMC anomaloscope (Typ 47715), and made quantitative comparisons where possible. Lastly, we investigated the trade-off between time and performance for AIM and FInD to analyze how many repetitions were sufficient to arrive at a steady performance level.

2. METHODS

A. Participants

We tested 24 participants with color-normal (CN) vision and 24 participants with CVD. Four of the participants with CVD were female. The ages ranged from 14 to 67 years (median = 25). Participants completed a brief demographic and ocular health history questionnaire. All participants reported normal or corrected-to-normal visual acuity (22 wore glasses, and eight wore clear contact lenses) and no history of eye disease. Participants provided written informed consent. The Institutional Review Board of the University of Nevada, Reno, approved the research protocol, which conformed to the Declaration of Helsinki.

B. Test Parameters

Discrimination test stimuli comprised colors lying within the nominal equiluminant plane, around the endpoints of the cardinal directions of MBDKL color space [30,31], while the detection task comprised colors along the L-, M-, and S-cone isolating axes. The stimuli were generated by Psychtoolbox in MATLAB (MathWorks, USA) and presented on a Microsoft Surface Pro 8 tablet screen with a 2880 × 1920 resolution and a frame rate of 60 Hz. The display was calibrated with a Photo Research PR-670 SpectraScan spectroradiometer (JADAK, USA). The number of pixels per degree of the stimuli was 76.165, and the luminance of the background was 92.4 cd/m2. The stimuli were displayed in a 4 × 4 grid (16 cells) per chart, and participants completed a total of three charts per test. The stimuli on the first chart ranged from the minimum to the maximum task-dependent signal intensity in log steps, chosen to include below- and above-threshold stimuli for people with both typical and anomalous color vision. The second and third charts adapted to previous responses.

C. Stimuli and Procedures

1. AIM Color Detection

AIM color detection stimuli were L-, M-, and S-cone isolating, 2° Landolt C optotypes (letter stroke width and gap opening width were 0.4°) with different cone contrasts and gap orientations [Fig. 1(a)], randomized across 16 cell positions [29]. Each C target was embedded in a circular cell subtending a 5° diameter with an intercell gap width of 1°. The cells included a dynamic luminance noise pedestal with a Michelson contrast of 0.2, the noise check size was 4 pixels (3.2 arcmin), and the noise refresh rate was 14 Hz. The maximum cone contrasts for the L-, M-, and S-cone stimuli were 0.19, 0.23, and 0.86, respectively. The orientations of the AIM detection stimuli were randomly drawn from a uniform distribution between 0° and 360°. Participants were instructed to tap the gap orientation of the Landolt C stimuli. The difference between the actual and reported stimulus orientations in degrees was fit with a cumulative Gaussian as a function of cone contrast.
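The AIM fitting step can be sketched as follows. This is a Python stand-in (the authors used MATLAB), and all numerical values are illustrative, not taken from the paper: mean absolute angular report error falls from the chance level (about 90° for uniformly random reports of a 0–360° orientation) toward a floor as cone contrast rises, and the threshold is taken as the midpoint of the cumulative Gaussian.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

def aim_error_model(contrast, threshold, slope, floor, chance=90.0):
    """Mean absolute angular error (deg) vs. cone contrast: falls from
    chance (~90 deg for random guesses) to a floor; the threshold is
    the contrast at the curve's midpoint (fit on log contrast)."""
    z = (np.log10(contrast) - np.log10(threshold)) / slope
    return chance - (chance - floor) * norm.cdf(z)

rng = np.random.default_rng(0)
contrasts = np.logspace(-2.2, -0.7, 16)  # hypothetical cone contrasts
truth = dict(threshold=0.02, slope=0.25, floor=8.0)
errors = aim_error_model(contrasts, **truth) + rng.normal(0.0, 2.0, 16)

popt, _ = curve_fit(aim_error_model, contrasts, errors,
                    p0=[0.05, 0.3, 10.0],
                    bounds=([1e-3, 0.01, 0.0], [1.0, 2.0, 45.0]))
print(f"estimated detection threshold: {popt[0]:.3f}")
```

The fitted `threshold` lands near the simulated value of 0.02; the exact model parameterization (log-contrast abscissa, fixed chance level) is an assumption of this sketch, not a description of the published fit.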

Fig. 1.

AIM and FInD color stimuli. (a) AIM detection stimuli, (b) AIM discrimination stimuli, (c) FInD detection stimuli, (d) FInD discrimination stimuli. (e) The sequence of the AIM detection task on the PerZeption app with visual instructions. The app also included voice commands that instructed the participant at the start on how to perform the task. If the response error after an AIM chart was not significantly different from random responding, the voice command reminded the user of the protocol; otherwise, it encouraged the user to continue. These results were also noted on the analysis document so the examiner could review whether test performance was valid. This is an important feature, as random responses can be due to lack of visibility caused by CVD, lack of understanding of the task, or inattention on the part of the participant.

2. AIM Color Discrimination

AIM color discrimination stimuli were 2° bipartite circular patches with color differences and edge orientations randomized across cell locations [Fig. 1(b)] [29]. Discrimination thresholds were measured for four different regions of the MBDKL space starting at the endpoints of the cardinal axes. The colors at these endpoints are referred to here as purple (+S), red (+L-M), yellow (-S), and green (+M-L). Thresholds were measured for directions perpendicular to the cardinal axes within the equiluminant plane and expressed in degrees (azimuth) of the equiluminant color plane, symmetrical around the axis. Consequently, measurements at the purple and yellow ends describe discriminations made by the L–M pathway (perpendicular to the S axis), while discriminations at the red and green ends were made via S-cone modulation (perpendicular to the L–M axis). Bipartite patches were contained in a circular cell subtending a 5° diameter with an intercell gap of 1°. The Michelson contrast of the dynamic noise was 0.2, the noise check size was four pixels (3.2 arcmin), and the noise refresh rate was 14 Hz. The boundary of the bipartite stimuli was smoothed with a Gaussian (σ = 0.15°). The orientations of the bipartite edges were randomly drawn from a uniform distribution between 0° and 360°. Participants were instructed to indicate the orientation of the color-defined edge in the bipartite stimuli. The differences between the actual and reported stimulus orientations were fit with a cumulative Gaussian as a function of angular color difference. Note that “degrees” here refers to the accuracy of the orientation judgment of the bipartite field, not to the discrimination threshold expressed in degrees of the equiluminant color plane; this latter angle at threshold can be taken as a quantification of discrimination performance.
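The azimuth geometry can be made concrete with a minimal sketch. Assumptions to flag: the axis convention below (0° = +L-M, 90° = +S) is a common one for MBDKL-style spaces but is not stated in the paper, and the functions are hypothetical helpers for illustration only.

```python
import numpy as np

def equiluminant_direction(azimuth_deg):
    """Unit modulation direction in the nominal equiluminant plane;
    assumed convention: 0 deg = +L-M axis, 90 deg = +S axis."""
    a = np.radians(azimuth_deg)
    return np.array([np.cos(a), np.sin(a)])  # (L-M, S) components

def color_difference_angle(az1_deg, az2_deg):
    """Angle (deg) between two equiluminant directions; a threshold in
    these units quantifies discrimination performance."""
    d = np.dot(equiluminant_direction(az1_deg),
               equiluminant_direction(az2_deg))
    return np.degrees(np.arccos(np.clip(d, -1.0, 1.0)))

# e.g., two colors straddling the +S (purple) endpoint, 10 deg apart
print(color_difference_angle(90.0, 100.0))
```

In these terms, a discrimination threshold measured "perpendicular to the S axis at the purple endpoint" is an azimuthal offset around 90°, which modulates the L-M component.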

3. FInD Color Detection

The stimuli for FInD color detection were Gaussian blobs (σ = 1.0°) comprising colors along the +L-, +M-, and +S-cone isolating directions, emanating from the achromatic point [28] [Fig. 1(c)]. The stimuli included a 14 Hz dynamic luminance noise pedestal (check size 4 pixels, 3.2 arcmin) with a Michelson contrast of 0.2. The width of each square stimulus cell was 6°, and each cell was surrounded by a black frame (≈ 0 cd/m2). A random subset of cells contained a Gaussian target; the remainder contained only the noise pedestal (null). Participants were instructed to tap any cells that contained “faint versions” of the colored stimulus. Target-present or target-absent responses for each cell were classified as a hit, miss, false alarm, or correct rejection to estimate d′ as a function of stimulus intensity, and the probability of a Yes response was fit with a decision model as a function of d′ as in Ref. [28]. The FInD color detection threshold was estimated as the contrast at which d′ = 1.
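The d′ bookkeeping can be sketched as follows. The counts are invented, and the final step is a simplified stand-in: the published analysis fits a decision model to the probability of a Yes response, whereas this sketch just interpolates for the contrast at which d′ crosses 1.

```python
import numpy as np
from scipy.stats import norm

def dprime(hits, misses, fas, crs):
    """Signal-detection d' from per-contrast response counts, with a
    log-linear correction so 0% / 100% rates stay finite."""
    h = (hits + 0.5) / (hits + misses + 1.0)
    f = (fas + 0.5) / (fas + crs + 1.0)
    return norm.ppf(h) - norm.ppf(f)

# hypothetical (hit, miss, false alarm, correct rejection) counts
# pooled per contrast level across FInD charts
contrasts = np.array([0.005, 0.01, 0.02, 0.04])
counts = [(2, 8, 1, 9), (4, 6, 1, 9), (8, 2, 1, 9), (10, 0, 1, 9)]
d = np.array([dprime(*c) for c in counts])

# threshold: contrast at which d' crosses 1 (interpolated on log axis)
log_thr = np.interp(1.0, d, np.log10(contrasts))
print(f"detection threshold ~ {10 ** log_thr:.4f}")
```

With these counts, d′ rises monotonically with contrast and the threshold falls between the second and third contrast levels.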

4. FInD Color Discrimination

Stimuli for FInD color discrimination were two Gaussian blobs (σ = 1.0°) per cell, in 20% Michelson contrast, 14 Hz dynamic noise (check size 4 pixels, 3.2 arcmin). In a random subset of cells the two blobs had different colors (targets), while in the remaining cells the blob colors were identical (nulls) [28] [Fig. 1(d)]. As with the AIM discrimination task, thresholds were measured for modulations perpendicular to the endpoints of the cardinal axes and expressed in degrees subtended in the equiluminant color plane. Other stimulus parameters were as for the FInD detection stimuli, and participants indicated which cells contained blobs of different colors. Same or different responses for each cell were classified as a hit, miss, false alarm, or correct rejection to estimate d′ as a function of stimulus intensity, and the probability of a Yes response was fit with a decision model as a function of d′ as in Ref. [28]. The FInD color discrimination threshold was estimated as the color difference angle at which d′ = 1.

D. General Procedures

All participants were assessed binocularly with the AIM color detection and color discrimination and FInD color detection and color discrimination tasks. Testing times for the AIM and FInD tests were automatically recorded by the program, and those of the standard color vision tests were manually recorded using a stopwatch.

For AIM and FInD tests, the application provided a high-contrast example with audio instructions and written instructions on the screen before each task. An example was also displayed at the upper-left corner of each chart to guide participants in identifying targets. Participants were instructed that the stimuli were present in some of the cells and that the number of targets varied per chart. Each chart remained on the screen until the participant completed the task and tapped the “Next” button which appeared on the right side of the screen. Participants viewed the tests binocularly at a distance of 40 cm.

1. Legacy Color Tests

Participants were screened and classified with a battery of color vision tests, including the FM-100 hue test, the Mollon–Reffin test, the CCT, and a Rayleigh match. The HRR plate test was also used to screen for tritan deficiencies. Tests were conducted in accordance with the manufacturers’ instructions. The FM-100 hue test, HRR, and the Mollon–Reffin test were administered under a fluorescent lamp (Verilux Full Spectrum) with a reported color temperature of 6200 K and a color rendering index of 94, in an otherwise dark room. The CCT was run using a Cambridge VSG3 board installed in a PC and displayed on a calibrated Sony CRT monitor. Rayleigh matches were conducted on the Oculus HMC anomaloscope running in manual mode and with absolute adaptation from a laptop PC. The initial classification of CVD was via anomaloscope. We recognize that a small number of observers will make a normal Rayleigh match on the anomaloscope yet fail other tests of red–green color vision, such as pseudoisochromatic plates (one such condition has been called “pigmentfarbenanomalie” [32,33]). Two of the participants with normal anomalous quotients by anomaloscope showed signs of CVD by other tests. These results are described below.

2. Comparisons

We assessed the diagnostic potential of the AIM and FInD tests by inferring the classification from the numerical results and comparing it to classifications by some of the standardized color vision tests. Inferring the diagnosis was necessary since quantitative criteria for diagnosing CVD have not yet been developed for the tablet tests. To evaluate the relative ability to quantify CVD severity, we used MATLAB software (version R2023b) to perform intra-class correlations and compared the results from the PerZeption tablet tests with other tests that purport to grade severity. Detection threshold comparisons included the AIM and FInD detection tests with the CCT and the Mollon–Reffin tests. For discrimination threshold comparisons, we compared the AIM and FInD discrimination tests with the FM-100 standard total error score (TES) [34]. We also extracted detection and discrimination thresholds after the first, second, and third trial runs and compared these to assess whether test time can be shortened while preserving diagnostic accuracy.
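Because the tests express results in different units, rank-based correlation is a natural common currency for such comparisons. A minimal Python stand-in (the authors used MATLAB; the paired scores below are invented for illustration):

```python
import numpy as np
from scipy.stats import spearmanr

# hypothetical paired scores: tablet detection thresholds vs. CCT
# thresholds for the same observers (units differ; ranks do not care)
tablet = np.array([0.02, 0.03, 0.05, 0.12, 0.15, 0.19, 0.02, 0.04])
cct = np.array([45.0, 60.0, 90.0, 300.0, 520.0, 700.0, 50.0, 80.0])

rho, p = spearmanr(tablet, cct)
print(f"Spearman rho = {rho:.2f} (p = {p:.4f})")
```

`spearmanr` handles the tied tablet thresholds via midranks, so tests with coarse, discrete grading (such as the six-step Mollon–Reffin score) can still be compared against continuous thresholds.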

3. RESULTS

Data analyses for AIM and FInD were performed in MATLAB (MathWorks, USA). AIM threshold, slope, and noise fit parameters, together with FInD thresholds, were generated (Table 1). Only the threshold values from AIM and FInD were used in further data analysis. AIM and FInD detection scores were recorded for each of the three main cone axes. AIM and FInD discrimination scores along the cardinal directions were recorded at four color regions as described above. The mean duration of completion for each test is reported in Table 2.

Table 1.

One-way ANOVA for AIM and FInD Color Detection and Discrimination Thresholds, Noise, and Slope Comparing Across Color Directionsa

AIM Detection        CN (n = 24)       Deutan (n = 14)   Protan (n = 7)
                     F        p        F        p        F        p
Threshold            4.71     *        18.04    ***      2.46     0.113
Noise                0.41     0.662    1.81     0.177    0.14     0.874
Slope                0.42     0.656    4.5      *        0.65     0.536
AIM Discrimination   F        p        F        p        F        p
Threshold            357.07   ***      16.23    ***      54.41    ***
Noise                51.17    ***      0.45     0.721    2.55     0.08
Slope                20.98    ***      3.45     *        4.82     **
FInD Detection       F        p        F        p        F        p
Threshold            4.46     *        28.75    ***      39.06    ***
FInD Discrimination  F        p        F        p        F        p
Threshold            129.26   ***      37.96    ***      166.64   ***
a

p < 0.001, < 0.01, and < 0.05 are represented as ***, **, and *, respectively. Data were log-transformed to achieve normality.

Table 2.

Mean Testing Durations of Color Vision Testsa

Color Vision Test      CVD Duration (min)   CN Duration (min)
AIM detection          7.87 [2.68]          8.82 [3.08]
AIM discrimination     8.67 [2.93]          8.51 [2.99]
FInD detection         5.93 [2.15]          6.16 [2.24]
FInD discrimination    4.45 [1.54]          4.83 [1.57]
Anomaloscopy           6.94                 4.58
CCT                    2.23                 1.47
Mollon–Reffin          2.09                 0.46
FM-100                 13.03                10.26
a

Durations in square brackets are for AIM and FInD initial trials.

A. Rayleigh Matches

The anomalous quotient (AQ) and the extent of the matching range were used to diagnose CVD. Of the 24 CVD participants, seven were classified as protanomalous, three as deuteranopes, and 11 as deuteranomalous. Two of the female CVD participants produced normal Rayleigh matches, but their scores across other tests were ambiguous. CD21 was determined to be a low-discriminating CVD and showed a tritan deficiency on the HRR plate test. The other participant, CD12, appeared to be generally low discriminating, including tritan losses on the CCT. CD6 was unable to complete a Rayleigh match.

B. CCT Scores

CCT scores were recorded for the protan (L), deutan (M), and tritan (S) axes. For this version of the CCT, the criterion for CVD was a score >100. All CN passed the CCT and recorded thresholds below 100. For CVD participants, thresholds were consistently above 100 for the protan and deutan axes, except for CD8, CD12, and CD21, who recorded thresholds lower than 100.

C. Mollon–Reffin Scores

The Mollon–Reffin test summary score was based on the most saturated chip correctly identified along the protan, deutan, and tritan axes. All CN participants, as well as two CVD participants (CD8 and CD12), correctly identified all chips.

D. FM-100

Standard total error scores (TES) were generated from the FM-100 hue test. Error scores were categorized as superior discrimination (0–16), average discrimination (20–100), or low discrimination (>100). Based on TES, 92% of CVD participants were categorized as low discrimination and the remaining 8% as average discrimination. Of the CN participants, 2% were low discrimination, 83% average discrimination, and 8% superior discrimination.

E. AIM and FInD Tests

Each participant completed three trials (i.e., one initial and two subsequent adaptive charts) for the AIM and FInD tests. Note that the AIM and FInD discrimination tasks probe four different regions of color space. Each CVD participant is plotted against color normals for the AIM detection and discrimination tests in Figs. S1 and S2 and the FInD detection and discrimination tests in Figs. S3 and S4, respectively. CNs cluster at lower thresholds on the L and M axes for the AIM and FInD detection tests, while CVD thresholds are elevated. CVD thresholds are likewise elevated in the purple and yellow directions for the AIM and FInD discrimination tests. However, the pattern of elevation for protans and deutans on the L and M axes, respectively, is not always consistent with the classification by the anomaloscope. We therefore compared the ratios of the L and M scores of protans and deutans from the AIM and FInD detection tests, the CCT, and the Mollon–Reffin test to determine whether the two CVD groups separate. A trend toward separation of protans from deutans is seen for the CCT, the Mollon–Reffin, and the FInD tests (Fig. 4).

Fig. 4.

Spearman’s rho plots comparing Mollon–Reffin with AIM and FInD detection tasks. Thresholds on the L and M axes correlated well. Correlations on the S axes were weaker. Other details as in Fig. 3.

F. Comparisons

AIM and FInD color detection test thresholds, CCT thresholds, and Mollon–Reffin summary scores were further compared using Spearman's rank correlation to account for the distribution of our dataset. We included the Mollon–Reffin results as a test capable of grading severity, even though its grading is relatively coarse and discrete (six steps). In general, rho (ρ) values were high for tests expected to be affected by red–green CVD and much lower for tests sensitive to tritan losses. The lower ρ for tritan CVD was expected in the present study, since few subjects showed evidence of tritan loss. Curiously, the Mollon–Reffin test seemed to indicate more tritan losses than the other tests. The CCT and Mollon–Reffin results produced moderate-to-strong correlations with thresholds from the detection tests, with the FInD tests yielding stronger correlations than the AIM tests (Figs. 3 and 4). We found similar trends when we compared the CCT with the Mollon–Reffin test (Fig. 5).

Fig. 3.

Spearman’s rho plots comparing CCT with AIM and FInD detection tasks. CN data are the black circles, protan CVDs are displayed in red diamonds, deutan CVDs are shown in green squares, and CVDs determined to be low discriminators are shown in purple triangles. The symbols overlap so the purple triangles are not always visible. Thresholds on the L and M axes correlated well. Correlations on the S axes were weaker.

Fig. 5.

Spearman’s rho plots comparing the CCT with the Mollon–Reffin test. Correlations similar to those in Figs. 2 and 3 were found. Other details as in Fig. 3.

The AIM and FInD color discrimination thresholds for all four regions were summed and compared with the total error score (TES) of the FM-100. Moderate correlation was found for the summed AIM discrimination thresholds and strong correlation for the summed FInD discrimination thresholds, despite major differences in color geometry and metrics (Fig. 6). A comparison of summed scores from discrete regions of the FM-100, corresponding to protan, deutan, and tritan deficiencies, with similarly summed scores of the tablet discrimination tests proved unwieldy due to these differences.

Fig. 6.

Sum of the threshold scores generated from the AIM and FInD color discrimination axes compared with the total error scores of the FM-100. Other details as in Fig. 3.

We also examined how the addition of the second and third response-adaptive trials for the PerZeption tests affected results, to assess whether the diagnostic ability of AIM and FInD remained with fewer trials. A single trial was sufficient to reveal CVD (approximately 2.92 min for the AIM tests and 1.88 min for the FInD tests) (Fig. 7).

Fig. 7.

Thresholds after the first, second, and third trials. Box plots represent color normals, red diamonds represent protans, and green circles represent deutans. The presence of CVD is accurately established in the first trial.

G. Classification

Table 3 summarizes the classification results from the anomaloscope, CCT, Mollon–Reffin, AIM, and FInD tests. A supervised approach was used to classify each AIM and FInD dataset; i.e., data were labeled with diagnoses based on the anomaloscope classification. Specifically, the anomaloscope classification of CVD and CN was used as a reference. All CN data for the AIM and FInD tests were then used as a normative sample, and their means μ and standard deviations σ were calculated. The classification boundary is based on Eq. (1):

Upper bound = μ + x·σ, (1)

where x is the number of standard deviations. Table S3 includes the classifications for each AIM and FInD test. The classification for AIM and FInD discrimination used the following assumption to determine whether a dataset was classified as deutan or protan. For protans, the impairment along the yellow axis might be considered more pronounced because yellow perception is fundamentally altered by the absence or altered function of L cones, which directly affects one of the two main components (red) of yellow. While deutan observers also have significant deficits along the purple and yellow axes, their impairment might be slightly less severe in terms of yellow perception because, even with altered green perception, some aspect of yellow (the red component) is still perceived. Hence, if the CVD criterion was met, protans were identified by an imbalance between the purple and yellow directions, whereas deutans showed more equally impaired purple and yellow directions (see Table S3).
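Equation (1) amounts to a per-condition normative cutoff. A minimal sketch, assuming synthetic CN log thresholds and an invented three-condition layout (the real tests use more conditions):

```python
import numpy as np

rng = np.random.default_rng(1)
# hypothetical CN log thresholds: rows = 24 observers,
# columns = color conditions (e.g., L, M, S detection)
cn = rng.normal(loc=[-1.8, -1.7, -0.6], scale=0.1, size=(24, 3))

x = 2.0  # number of standard deviations, as in Eq. (1)
mu, sigma = cn.mean(axis=0), cn.std(axis=0, ddof=1)
upper = mu + x * sigma  # Eq. (1): per-condition upper bound

def flag_cvd(obs):
    """Flag CVD if any condition exceeds its normative upper bound."""
    return bool(np.any(obs > upper))

print(flag_cvd(np.array([-1.2, -1.1, -0.6])))  # elevated L/M thresholds
print(flag_cvd(np.array([-1.8, -1.7, -0.6])))  # within normal limits
```

The protan/deutan assignment described above would then look at which conditions exceed their bounds and by how much, e.g., a purple/yellow imbalance for protans.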

Table 3.

Testing Results of CVD Participants for the Anomaloscope, the CCT, Mollon–Reffin, and AIM and FInD Color Testsa

Subject ID Anomaloscope CCT Mollon–Reffin AIM Disc. Severity AIM Det. Severity FInD Disc. Severity FInD Det. Severity
CD1 Deuteranomalous D D D μ + 4σ P/D μ + 2σ D μ + 4σ D μ + 10σ
CD2 Deuteranope D P/D D μ + 4σ D μ + 10σ D μ + 4σ D μ + 10σ
CD3 Protanomalous D P P Y ≥ μ + 4σ T μ + 3σ P μ + 4σ P μ + 5σ
CD4 Protanomalous P P Y ≥ μ + 4σ N μ, L,M worst P μ + 4σ P/D μ + 5σ, L-worst
CD5 Protanomalous P P P Y ≥ μ + 2σ D μ + 2σ P μ + 4σ P μ + 5σ
CD6 P P P L, M ≥ μ N μ, L worst P μ + 3σ P μ + 4σ
CD7 Deuteranomalous D D D P, Y ≥ μ + 4σ D μ + 4σ D μ + 4σ D μ + 10σ
CD8 Deuteranomalous N N N P, Y ≥ μ N μ, L worst N Y ≥ μ N all ≥ μ
CD9 Deuteranomalous D P/D P P ≥ μ + 2σ D μ + 4σ D μ + 4σ P/D μ + 5σ, M-worst
CD10 Deuteranomalous P D D P, Y ≥ μ + 4σ D μ + 4σ D μ + 4σ D μ + 10σ
CD11 Protanomalous P P P Y ≥ μ + 4σ D μ + 2σ P μ + 4σ P/D μ + 5σ, L-worst
CD12 Normal T N N R, G ≥ μ N μ, L worst N P, Y, G ≤ μ N all ≤ μ
CD13 Deuteranomalous D D D P, Y ≥ μ + 4σ D μ + 4σ P μ + 4σ D μ + 10σ
CD14 Deuteranomalous D D D P, Y ≥ μ + 4σ D μ + 10σ D μ + 4σ D μ + 10σ
CD15 Deuteranomalous D D D P, Y ≥ μ + 4σ D μ + 5σ P μ + 4σ D μ + 10σ
CD16 Deuteranope P/D D D P, Y ≥ μ + 4σ P/D μ + 10σ P μ + 4σ P/D μ + 10σ, M-worst
CD17 Deuteranomalous D P/D P P ≥ μ + 4σ P μ + 5σ P μ + 4σ D μ + 5σ
CD18 Deuteranope P/D P/D D P, Y ≥ μ + 4σ P μ + 10σ D μ + 4σ P/D μ + 10σ, M-worst
CD19 Deuteranomalous D D D P, Y ≥ μ + 4σ P/D μ + 5σ, L worst P μ + 4σ D μ + 10σ
CD20 Deuteranomalous D D P Y ≥ μ + 4σ P μ + 5σ P μ + 4σ P/D μ + 10σ, M worst
CD21 Normal N T N P ≥ μ N all ≥ μ P μ + 4σ P μ + 3σ
CD22 Protanomalous P P P Y ≥ μ + 4σ N μ, L worst P μ + 3σ P μ + 5σ
CD23 Protanomalous P P P Y ≥ μ + 4σ P μ + 1σ P μ + 4σ P/D μ + 5σ, L-worst
CD24 Protanomalous P P P Y ≥ μ + 4σ N μ, L worst P μ + 4σ P μ + 10σ
CN vs. CVD N=23 N=22 N=23 N=23 N=23 N=23
Agreement [n] 22/23 21/22 22/23 19/23 22/23 22/23
Agreement [%] 96% 91% 96% 83% 96% 96%
CVD Type
Agreement [n] 17/23 16/22 19/23 10/23 15/23 14/23
Agreement [%] 74% 73% 83% 43% 65% 61%
a

Shown below the table is the agreement between the anomaloscope (reference) and all other tests.

P indicates protans, D indicates deutans, T indicates tritans, N indicates normal, and P/D indicates CVD with both P and D impairments.

1. Support Vector Machine Learning Classification

We also applied supervised machine learning for classification. In a previous study by some of the authors [29], a support vector machine (SVM) was used to classify threshold data from the computer-based AIM paradigm, demonstrating strong consistency between AIM thresholds and anomaloscope results. Here, we repeat this approach for both the tablet-based FInD and AIM paradigms.

Thresholds from each paradigm served as training data, with each color condition treated as a separate feature, resulting in seven features per paradigm. Following the procedures in Ref. [29], we used anomaloscope color-matching results as ground-truth labels and implemented a two-stage SVM classification protocol: the first stage distinguished color-normal (CN) from color-deficient (CD) individuals, while the second stage classified protan versus deutan deficiencies. We employed five-fold cross-validation.

When combining FInD color detection and discrimination, the trained classifiers achieved accuracies of 98% (stage 1) and 95% (stage 2) when tested on the full dataset. For both AIM tests, the highest observed accuracies were 98% and 71%, respectively.
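The two-stage protocol can be sketched with scikit-learn. This is a stand-in, not the authors' code: the data are synthetic, well-separated clusters, and the feature layout (3 detection + 4 discrimination log thresholds) and kernel choice are assumptions.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(2)
# hypothetical 7-feature log thresholds per observer
# (3 detection axes + 4 discrimination directions)
cn = rng.normal(0.0, 0.3, size=(24, 7))
protan = rng.normal([1.5, 0.2, 0.0, 2.0, 0.3, 2.0, 0.3], 0.3, size=(7, 7))
deutan = rng.normal([0.2, 1.5, 0.0, 2.0, 2.0, 2.0, 2.0], 0.3, size=(14, 7))

# stage 1: CN vs. CVD, five-fold cross-validation
X1 = np.vstack([cn, protan, deutan])
y1 = np.array([0] * 24 + [1] * 21)
acc1 = cross_val_score(SVC(kernel="linear"), X1, y1, cv=5).mean()

# stage 2: protan vs. deutan, within the CVD group only
X2 = np.vstack([protan, deutan])
y2 = np.array([0] * 7 + [1] * 14)
acc2 = cross_val_score(SVC(kernel="linear"), X2, y2, cv=5).mean()
print(f"stage 1: {acc1:.2f}, stage 2: {acc2:.2f}")
```

With only 7 protan and 14 deutan observers, five-fold cross-validation leaves very few held-out samples per fold, which is one reason the paper notes that larger samples would improve both classifiers.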

4. DISCUSSION

The present study evaluated the usability of a tablet-based version of PerZeption's AIM and FInD color detection and discrimination tests. We tested CVD and color-normal participants and compared the results with those obtained using other standardized color vision tests: the anomaloscope, the CCT, the Mollon–Reffin test, and the FM-100 test. The CCT was chosen as a comparison test because, like the AIM and FInD detection tests, its threshold values are contrast-based, permitting correlation. Threshold comparisons with the CCT and Mollon–Reffin test and discrimination comparisons with the FM-100 were made using Spearman's rank-order rho. Additionally, by using the anomaloscope classification as a reference, we provided a classification metric for the AIM and FInD tests.

As predicted, CVD participants recorded higher detection thresholds than CN participants along the L- and M-cone axes. CVD participants also showed worse discrimination when judgments required L–M color-opponent activity (the purple and yellow conditions), as shown in Figs. S1–S4. The AIM and FInD detection threshold estimates correlated strongly with scores from the CCT and Mollon–Reffin tests, and the AIM and FInD discrimination threshold estimates likewise correlated well with total error scores from the FM-100. In both detection and discrimination, the FInD tests produced slightly better agreement than the AIM tests. Moreover, our results demonstrate that CVD classification remains sufficiently accurate when the number of trials is reduced, shortening testing duration to approximately 2.92 and 1.88 min for the AIM and FInD tests, respectively, and making the tests more practical for clinical applications.

In this study, we used a small set of control and CVD observers to deploy two classification strategies, a within-normal-limits criterion and the previously established SVM machine learning approach [29], with anomaloscope classifications as reference labels. Using the within-normal-limits criterion, all FInD and AIM tests showed high agreement with the anomaloscope for CVD identification. For CVD-type categorization, AIM color discrimination performed slightly better than the CCT and Mollon–Reffin tests, whereas both FInD and AIM color detection performed slightly worse (Table 3). A previous study that investigated the agreement among the Ishihara test, the FM-100 hue test, and the DIVE test showed the same pattern: the tests reliably identified CVDs but had lower agreement on CVD type [35]. The SVM approach, which combined the detection and discrimination results within each paradigm, yielded a significant improvement in classification over single-test SVM results and over the within-normal-limits approach. It is important to note that both classification approaches suffer from small datasets and will therefore improve with larger sample sizes for each sub-group. The AIM color detection test had a notably reduced CVD and CVD-type classification ability compared to the other tests (Table 3), which may be due to the size of the noise checks, as they did not completely mask the luminance edge of the Landolt C. This interpretation is supported by the observation that the impaired or reduced color directions were often in agreement with those of the reference tests. Lastly, a recent study using AIM color detection on a screen [29] reported better classification performance, suggesting that increasing the check size would improve the luminance masking and consequently the detectability of CVD observers.

Even with technological advancements, pseudoisochromatic plate tests remain the most widely used color vision tests despite their limited diagnostic value, because they are fast, easy to administer, and the most accessible of all available tests [11,36]. The present tests incorporate some of the advantages of pseudoisochromatic plates, such as the addition of luminance noise. Notably, AIM uses a smoothed edge between the bipartite stimuli and a dynamic noise mask, thereby reducing the influence of edge and luminance artifacts. Like the CCT, the present tests determine thresholds adaptively, reducing testing time. In addition, it appears that administration can be shortened to a single trial with little loss of measurement accuracy.

Although the FM-100 has been used to measure hue discrimination and may be suitable for detecting acquired defects [37,38], its total error score (TES) is not an optimal measure of CVD severity [39–41]. Compared to the FM-100, the AIM and FInD discrimination tests take less time to complete, and their threshold scores may be particularly helpful for tracking changes in acquired CVDs. Additionally, both methods provide the user with additional performance markers, including the slope of the psychometric function [42,43], that may prove useful in research and the clinic. The ability to assess both detection and discrimination affords a comprehensive evaluation of CVD. As with some other computer-based tests, instructions and automated scoring are integrated into the design of the AIM and FInD tests, facilitating self-administration.

This combination of findings provides support for the notion that AIM and FInD detection and discrimination tablet-based color tests may be a valuable addition to a test battery, particularly outside of the lab or clinic. To rigorously evaluate the clinical utility of the AIM and FInD tests, a strong normative database needs to be established. Successful application of the tablet-based version will also require verification of display calibration, stability, and consistency across individual devices.

Supplementary Material

Supplement

Supplemental document. See Supplement 1 for supporting content.

Fig. 2.

Threshold ratios of L/M. Initial classification is based on anomaloscopy. Red dots represent participants classified as protans, and green dots represent participants classified as deutans. High ratios should indicate protans, and low ratios should indicate deutans.

Acknowledgment.

JS and PJB were supported by the NIH.

Funding.

National Institutes of Health (NIH) (R01 EY029713).

Footnotes

Disclosures. CA, JH, and MAC declare no competing interests. JS and PJB are inventors of the AIM and FInD methods, including the color detection and discrimination tests. JS and PJB are shareholders and founders of PerZeption Inc. The patents (pending) for both AIM and FInD are owned by Northeastern University, Boston, and exclusively licensed to PerZeption Inc.

Data availability.

Data for AIM and FInD tests underlying the results presented in this paper have been provided in Supplement 1. All other data are not publicly available at this time but may be obtained from the authors upon reasonable request.

REFERENCES

  • 1.National Research Council (US) Committee on Vision, Procedures for Testing Color Vision: Report of Working Group 41 (National Academies, 1981). [PubMed] [Google Scholar]
  • 2.Gegenfurtner KR and Sharpe LT, Color Vision: From Genes to Perception (Cambridge University, 2001). [Google Scholar]
  • 3.Swanson WH and Cohen JM, “Color vision,” Ophthalmol. Clin. North Am 16, 179–203 (2003). [DOI] [PubMed] [Google Scholar]
  • 4.Pokorny J and Smith VC, “Eye disease and color defects,” Vis. Res 26, 1573–1584 (1986). [DOI] [PubMed] [Google Scholar]
  • 5.Crognale MA, Switkes E, Rabin J, et al. , “Application of the spatiochromatic visual evoked potential to detection of congenital and acquired color-vision deficiencies,” J. Opt. Soc. Am. A 10, 1818–1825 (1993). [DOI] [PubMed] [Google Scholar]
  • 6.Elliot AJ, Fairchild MD, and Franklin A, Handbook of Color Psychology (Cambridge University, 2015). [Google Scholar]
  • 7.Simunovic MP, “Acquired color vision deficiency,” Surv. Ophthalmol 61, 132–155 (2016). [DOI] [PubMed] [Google Scholar]
  • 8.Birch J, “Worldwide prevalence of red-green color deficiency,” J. Opt. Soc. Am. A 29, 313–320 (2012). [DOI] [PubMed] [Google Scholar]
  • 9.Hasrod N and Rubin A, “Colour vision: a review of the Cambridge Colour Test and other colour testing methods,” Afr. Vision Eye Health 74, 7 (2015). [Google Scholar]
  • 10.Dain SJ, “Clinical colour vision tests,” Clin. Exp. Optom 87, 276–293 (2004). [DOI] [PubMed] [Google Scholar]
  • 11.Fanlo Zarazaga A, Gutiérrez Vásquez J, and Pueyo Royo V, “Review of the main colour vision clinical assessment tests,” Arch. Soc. Esp. Oftalmol 94, 25–32 (2019). [DOI] [PubMed] [Google Scholar]
  • 12.Ishihara S, The Series of Plates Designed as Tests for Colour Blindness (Handaya Hongo Harukich, 1917). [Google Scholar]
  • 13.Hardy LH, Rand G, and Rittler MC, “H–R–R Polychromatic Plates,” J. Opt. Soc. Am 44, 509–523 (1954). [Google Scholar]
  • 14.Farnsworth D, “The Farnsworth-Munsell 100-hue and dichotomous tests for color vision,” J. Opt. Soc. Am 33, 568–578 (1943). [Google Scholar]
  • 15.Rabin J, Gooch J, and Ivan D, “Rapid quantification of color vision: the cone contrast test,” Invest. Ophthalmol. Vis. Sci 52, 816–820 (2011). [DOI] [PubMed] [Google Scholar]
  • 16.Seshadri J, Christensen J, Lakshminarayanan V, et al. , “Evaluation of the new web-based ‘colour assessment and diagnosis’ test,” Optom. Vis. Sci 82, 882 (2005). [DOI] [PubMed] [Google Scholar]
  • 17.Mollon J and Reffin J, “A computer-controlled color-vision test that combines the principles of Chibret and of Stilling,” J. Physiol.-London 414 (1989). [Google Scholar]
  • 18.French A, Rose K, Cornell E, et al. , “The evolution of colour vision testing,” Aust. Orthopt. J 40, 7–15 (2020). [Google Scholar]
  • 19.Sharpe LT, Stockman A, Jägle H, et al. , “Red, green, and red-green hybrid pigments in the human retina: correlations between deduced protein sequences and psychophysically measured spectral sensitivities,” J. Neurosci 18, 10053–10069 (1998). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 20.Brainard DH, Roorda A, Yamauchi Y, et al. , “Functional consequences of the relative numbers of L and M cones,” J. Opt. Soc. Am. A 17, 607–614 (2000). [DOI] [PubMed] [Google Scholar]
  • 21.Berendschot TT, van de Kraats J, and van Norren D, “Foveal cone mosaic and visual pigment density in dichromats,” J. Physiol 492, 307–314 (1996). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Cao D, Pokorny J, Smith VC, et al. , “Rod contributions to color perception: linear with rod contrast,” Vision Res. 48, 2586–2592 (2008). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.He J, Taveras-Cruz Y, and Eskew RT, “Modeling individual variations in equiluminance settings,” J. Vis 21(7), 15 (2021). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Ng JS and Liem SC, “Can the Farnsworth D15 color vision test be defeated through practice?” Optom. Vis. Sci 95, 452 (2018). [DOI] [PubMed] [Google Scholar]
  • 25.Dain SJ, Kwan B, and Wong L, “Consistency of color representation in smart phones,” J. Opt. Soc. Am. A 33, A300–A305 (2016). [DOI] [PubMed] [Google Scholar]
  • 26.Bex P and Skerswetat J, “FInD—foraging interactive D-prime, a rapid and easy general method for visual function measurement,” J. Vis 21(9), 2817 (2021). [Google Scholar]
  • 27.Skerswetat J, He J, Shah JB, et al. , “A new, adaptive, self-administered, and generalizable method used to measure visual acuity,” Optom. Vis. Sci 101, 451 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.He J, Bex PJ, and Skerswetat J, “Rapid measurement and machine learning classification of colour vision deficiency,” Ophthal. Physiol. Opt 43, 1379–1390 (2023). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.He J, Skerswetat J, and Bex PJ, “Novel color vision assessment tool: AIM color detection and discrimination,” bioRxiv (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.MacLeod DIA and Boynton RM, “Chromaticity diagram showing cone excitation by stimuli of equal luminance,” J. Opt. Soc. Am 69, 1183–1186 (1979). [DOI] [PubMed] [Google Scholar]
  • 31.Derrington AM, Krauskopf J, and Lennie P, “Chromatic mechanisms in lateral geniculate nucleus of macaque,” J. Physiol 357, 241–265 (1984). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 32.Kandatsu A and Kitahara K, “The visual characteristics of a case of Pigmentfarbenanomalie,” in Colour Vision Deficiencies XI: Proceedings of the 11th Symposium of the International Research Group on Colour Vision Deficiencies, Including the Joint IRGCVD-AIC Meeting on Mechanisms of Colour Vision, Drum B, ed., Sydney, Australia, 21–24 June 1991. (Springer Netherlands, 1993), pp. 113–117. [Google Scholar]
  • 33.Tanabe S and Hukami K, “Results of clinical colour vision tests of ‘Pigmentfarbenanomale’,” in Colour Vision Deficiencies XIII: Proceedings of the 13th Symposium of the International Research Group on Colour Vision Deficiencies, Cavonius CR, ed., Pau, France, 27–30 July 1995. (Springer Netherlands, 1997), pp. 99–104. [Google Scholar]
  • 34.Farnsworth D, The Farnsworth-Munsell 100-Hue Test for the Examination of Color Discrimination (Munsell Color Company, 1957). [Google Scholar]
  • 35.Fanlo-Zarazaga A, Echevarría JI, Pinilla J, et al. , “Validation of a new digital and automated color perception test,” Diagnostics 14, 396 (2024). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 36.Birch J, “Efficiency of the Ishihara test for identifying red-green colour deficiency,” Ophthal. Physiol. Opt 17, 403–408 (1997). [PubMed] [Google Scholar]
  • 37.Foote KG, Neitz M, and Neitz J, “Comparison of the Richmond HRR 4th edition and Farnsworth–Munsell 100 hue test for quantitative assessment of tritan color deficiencies,” J. Opt. Soc. Am. A 31, A186–A188 (2014). [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 38.François J and Verriest G, “On acquired deficiency of colour vision, with special reference to its detection and classification by means of the tests of Farnsworth,” Vis. Res 1, 201–219 (1961). [Google Scholar]
  • 39.Smith VC, Pokorny J, and Pass AS, “Color-axis determination on the Farnsworth-Munsell 100-hue test,” Am. J. Ophthalmol 100, 176–182 (1985). [DOI] [PubMed] [Google Scholar]
  • 40.Birch J, “Use of the Farnsworth-Munsell 100-hue test in the examination of congenital colour vision defects,” Ophthal. Physiol. Opt 9, 156–162 (1989). [DOI] [PubMed] [Google Scholar]
  • 41.Lakowski R, “Uses and abuses of the Farnsworth-Munsell 100-hue test,” in Colour Vision Deficiencies IX: Proceedings of the 9th Symposium of the International Research Group on Colour Vision Deficiencies, St. John’s College, Annapolis, Maryland, 1–3 July 1987. (Springer Netherlands, 1989), pp. 375–395. [Google Scholar]
  • 42.Tyler CW, “Why we need to pay attention to psychometric function slopes,” in Vision Science and Its Applications (Optica Publishing Group, 1997), paper SuD.2. [Google Scholar]
  • 43.Maloney LT, “The slope of the psychometric function at different wavelengths,” Vis. Res 30, 129–136 (1990). [DOI] [PubMed] [Google Scholar]
