Familiarity influences visual detection in a task that does not require explicit recognition

Pei-Ling Yang; Diane M Beck

doi:10.3758/s13414-023-02703-7

2023 Apr 4;85(4):1127–1149. doi: 10.3758/s13414-023-02703-7

PMCID: PMC10072023 PMID: 37014611

Abstract

The current study aims to explore one factor that likely contributes to these statistical regularities, familiarity. Are highly familiar stimuli perceived more readily? Previous work showing effects of familiarity on perception has used recognition tasks, which arguably tap into post-perceptual processes. Here we use a perceptual task that does not depend on explicit recognition; participants were asked to discriminate whether a rapidly presented image was intact or scrambled. The familiarity level of stimuli was manipulated. Results show that famous or upright-oriented logos (Experiments 1 and 2) or faces (Experiment 3) were better discriminated than novel or inverted logos and faces. To further dissociate our task from recognition, we implemented a simple detection task (Experiment 4) and directly compared the intact/scrambled task to a recognition task (Experiment 5) on the same set of faces used in Experiment 3. The fame and orientation familiarity effects were still present in the simple detection task, and the duration needed on the intact/scrambled task was significantly less than that needed on the recognition task. We conclude that the familiarity effect demonstrated here is not driven by explicit recognition and instead reflects a true perceptual effect.

Keywords: Visual perception, Face perception

Introduction

We all agree that our experience not only shapes who we are, but also allows us to make inferences about the future. We learn to extract the relevant information from each event, scene, or situation that we encounter and then use that information to prepare for the next possible encounter. Although everyone agrees that prior knowledge informs our behavior and understanding, it is less widely believed that prior knowledge impacts perceptual processing.

One particularly influential model that predicts prior knowledge should impact perception is Rao and Ballard's (1999) hierarchical predictive coding model. In their model, they posit that later regions of the visual pathway generate predictions based on prior knowledge to facilitate the processing of inputs that have been encountered before. To achieve this facilitation, each level of the brain cascade computes prediction errors by comparing the current signals with the feedback predictions from later regions. Instead of processing and passing the raw input signals, each level of the brain thus only needs to iteratively reduce the prediction errors until the errors are minimized to settle on a representation (Friston, 2005; Rao & Ballard, 1999). Central to these models is a mechanism, usually unspecified, to extract statistical regularities from the world to serve as predictions. This model structure predicts that more experienced inputs, that is, inputs with which we are more familiar, should be better predicted and thus require less processing time than novel inputs. In other words, the visual system should more quickly settle on, and thus perceive, input that is more familiar.
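The logic of this prediction can be illustrated with a deliberately minimal toy, which is not Rao and Ballard's model: a single scalar representation is updated to reduce the error between an input and its prediction, and we count how many update steps are needed before the error falls below a tolerance. The function name `settle`, the learning rate, the tolerance, and the input values are all assumptions chosen for illustration.

```python
def settle(x, weight, r_init, lr=0.1, tol=1e-3, max_iter=10_000):
    """Iteratively reduce the prediction error x - weight * r.

    Returns the final representation and the number of update steps taken.
    All parameter values are illustrative assumptions.
    """
    r = r_init
    for step in range(max_iter):
        error = x - weight * r        # prediction error for the current estimate
        if abs(error) < tol:
            return r, step
        r += lr * weight * error      # gradient step on 0.5 * error ** 2
    return r, max_iter


x, weight = 2.0, 1.0

# "Familiar" input: the initial estimate (the prior) is already close to the input.
_, familiar_steps = settle(x, weight, r_init=1.9)

# "Novel" input: the initial estimate is far from the input.
_, novel_steps = settle(x, weight, r_init=0.0)

print(familiar_steps, novel_steps)
```

In this toy the well-predicted input settles in fewer steps than the poorly predicted one, which is the qualitative pattern the predictive coding account leads us to expect for familiar versus novel stimuli.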

In previous work, our lab developed an intact/scrambled paradigm to assess the effects of real-world statistical regularity on perceptual processing that precedes the need to label or explicitly identify a stimulus (Caddigan et al., 2017; Center et al., 2022; Greene et al., 2015). We use the term real-world statistical regularity to capture those regularities that are built up over a lifetime, rather than regularity introduced and learned within the experiment. In the intact/scrambled task, participants simply discriminate whether a target is an intact or a scrambled image (Caddigan et al., 2017; Center et al., 2022; Greene et al., 2015; Smith & Loschky, 2019). The advantage of this task is that it can assess perceptual discrimination sensitivity without introducing the need for participants to explicitly identify what they see; they need only report that they see something coherent rather than noise. In this paradigm the presentation duration is staircased for each participant to an accuracy threshold (previously between 70% and 82%), which results in a target presentation duration that is often very brief, sometimes a single refresh of the monitor. This thresholding procedure means that participants very often are unsure whether an image or scrambled noise was presented, and instead experience a luminance flicker followed by the mask. Crucially though, on some trials participants clearly see an image (Caddigan et al., 2017). As such, the task is a proxy for whether a participant can “see” the image.

Using this intact/scrambled paradigm, researchers have manipulated real-world statistical regularity in a variety of ways and asked whether it impacts how readily participants see a rapidly presented image. Greene et al. (2015) manipulated the probability of natural scenes. Probability, in this case, was defined as “the probability of happening in daily life” and was assessed via four independent and naïve observers. Results showed that probable images were more easily discriminated from fully phase-scrambled images than less probable images. Caddigan et al. (2017) examined the category representativeness of natural scenes. Each scene was rated by separate participants as to how representative it is of its own category. They found that more representative scenes were better discriminated from fully phase-scrambled scenes than less representative scenes. Center et al. (2022) extended the paradigm to isolated objects. They assessed real-world statistical regularity by manipulating the typicality of the viewpoint of an object. Previous research has shown that it is easier to identify canonical views of an object (Palmer et al., 1981). Using the same intact/scrambled task, results showed that typically oriented objects were actually perceived, not just identified, more readily than atypically oriented objects.

Interestingly, in all of these experiments real-world statistical regularity impacted perception despite the fact that the exact stimuli used were unlikely to have been previously experienced by the participants. However, given that predictive coding theory suggests that the predictions are based on statistical regularities extracted from a person's own experience, familiarity with a particular stimulus should be a strong modulator of perception. Familiarity, here, is defined as frequently encountered in one’s own life. We ask whether the intact/scrambled discrimination advantage for statistically regular images holds for images with which we have more experience or are more personally familiar.

Numerous studies have reported familiarity effects on perception, although their status as true perceptual effects is questionable. For instance, words that are more familiar have lower recognition thresholds than non-words (Solomon & Postman, 1952) and letter sequences that are more similar to English grammar require less exposure time to be recognized than random strings of letters (Miller et al., 1954). In the object domain, Gollin (1960) showed that training and familiarization of object images can improve participants’ recognition of line drawings in which the line segments delineating the object are fragmented. That is, familiar objects can be recognized with fewer line segments than unfamiliar objects. Critically, however, in these studies and many that followed, participants were asked to recognize objects (e.g., visual search: Hershler & Hochstein, 2009; Qin et al., 2014; Shen & Reingold, 2001; semantic context: Reingold & Jolicoeur, 1993; Snell & Grainger, 2017; see Baron, 2014, for review; object recognition: Bülthoff & Newell, 2006; Honda et al., 2011; general review: Krueger, 1975). Pylyshyn (1999) has argued that recognition, the process of interpreting our visual input, is better relegated to the realm of cognition, occurring after an “early-vision stage” in which properties such as color and shape are detected and individuating of object tokens occurs (see also Pylyshyn, 2001). Under this view then, the effects of familiarity assessed by recognition tasks should not be interpreted as evidence of familiarity’s effect on perception, or at least not on early vision. This view of early vision as encapsulated from cognition predicts that we should be able to detect the presence of something before we can recognize it.

One study that does explore the effect of familiarity on perception and uses the same intact/scrambled paradigm described above is that of Smith and Loschky (2019). In particular, they investigated whether the expectation of familiar scene sequences can induce the same processing advantage as previous intact/scrambled studies. The sequences were constructed from a series of first-person-view pictures navigating from one location to another location. To manipulate expectation, the sequences could be coherent, in the order experienced as you move through the real world, or randomized. Participants could anticipate the next scene in coherent sequences but not in randomized sequences. In essence, they asked whether participants’ intact/scrambled judgments were sensitive to the expectation set up by the familiar sequence of scenes. Their results showed that target scenes are better detected when they are embedded in coherent familiar sequences than in random sequences. Importantly, however, the scenes themselves were familiar in both the coherent and randomized sequences and so their study says less about the effect of familiarity and more about sequence prediction.

The current study aims to compare familiar and completely novel stimuli in the intact/scrambled paradigm to directly test the effect of familiarity. We predict that participants will be better at detecting the presence of familiar stimuli (as opposed to noise) than novel stimuli, indicating that familiarity impacts perceptual processes. In short, the current study uses this intact/scrambled paradigm to ask whether familiar objects are actually perceived, rather than recognized, more readily than unfamiliar ones.

Experiment 1: Intact/scrambled logos

To examine whether familiarity influences perceptual processes, we first compared famous and novel logos in an intact/scrambled discrimination task. Participants were asked to respond whether the target was an intact or a scrambled image under rapid presentation (Caddigan et al., 2017; Greene et al., 2015; Smith & Loschky, 2019). In addition to the fame factor, we also asked whether previous exposure impacted discrimination. In short, participants went through the full set of stimuli twice. We hypothesized that if a single repetition is enough to set up a memory trace, repetition might improve the discriminability in the repeated blocks. It is also possible that if the repetition makes the novel logos more familiar, repetition might reduce the advantage for famous over novel logos. Thus, we can look at two different effects of familiarity: familiarity established within the experiment with repetition, and familiarity established over the course of everyday life. Lastly, to verify that our stimuli were famous to our participants, participants were asked to rate how familiar they were with all the famous logos. Novel logos were computer-generated for the experiment, and thus novel to all participants.

Participants

Twenty-six participants (18 females, mean age = 18.9 years) were recruited from the University of Illinois participant pool and were compensated with course credit. This sample size was determined based on the range of sample sizes typically used in prior studies (Caddigan et al., 2017). An a priori power analysis was not conducted, but this sample size is sufficient to detect an effect of dz = .69 with 92% power (matched-pairs t-test in G*Power 3.1.9.4; Faul et al., 2007). This effect size is the one observed in Experiment 1 of Caddigan et al. (2017) when comparing d’ values for representative and less-representative scenes using a paired t-test. All participants had self-reported normal or corrected-to-normal vision. Written informed consent was obtained in accordance with procedures and protocols approved by the University of Illinois Institutional Review Board.
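The reported power can be checked approximately by simulation. The sketch below is a Monte Carlo check, not the analytic calculation performed by G*Power: it draws paired difference scores from a normal distribution with mean dz and unit variance, runs a two-tailed one-sample t-test on each simulated sample, and counts rejections. It assumes a two-tailed alpha of .05 (the text does not state the alpha level or the number of tails), and the critical value 2.0595 for df = 25 is hard-coded from standard t tables. The function name and the number of simulations are illustrative.

```python
import math
import random


def simulated_power(dz, n, t_crit, n_sims=10_000, seed=0):
    """Estimate the power of a two-tailed paired t-test by simulation.

    dz     : standardized mean of the paired differences
    n      : number of participants
    t_crit : two-tailed critical t value for df = n - 1 (supplied by the caller)
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        diffs = [rng.gauss(dz, 1.0) for _ in range(n)]
        mean = sum(diffs) / n
        var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
        t = mean / math.sqrt(var / n)
        if abs(t) > t_crit:
            rejections += 1
    return rejections / n_sims


T_CRIT_DF25 = 2.0595  # assumed: two-tailed alpha = .05, df = 25
power = simulated_power(dz=0.69, n=26, t_crit=T_CRIT_DF25)
print(f"simulated power = {power:.3f}")
```

Under these assumptions the simulated power should land close to the 92% reported above, up to Monte Carlo error.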

Stimuli and procedure

Target images contained full-color famous and computer-generated novel logos (Fig. 1). Novel logos, 12 for the practice experiment, 94 for the staircase experiment, and 101 for the main experiment, were created using the following websites: https://emblemmatic.org/markmaker/#/, https://www.launchaco.com/logo, and https://www.freelogodesign.org/. Words that appeared in the famous logos (e.g., Adidas) were also included in the novel logos. The famous logos were selected based on the rating data of a separate pilot study in which six participants viewed 106 logos for unlimited time and rated them for familiarity on a 7-point scale. Only images with a mean rating greater than or equal to 5 were included in subsequent experiments. Scrambled versions of the famous and novel logos were created using a diffeomorphism with 25% distortion (see Stojanoski & Cusack, 2014, for the mathematical algorithm). The resulting images were no longer recognizable as famous logos but retained many of the same image qualities, including the centralized positioning of the artwork. We created masks by “grid scrambling” both intact and scrambled images. An invisible ten by ten grid was imposed on the images, and the images were phase scrambled within each grid cell. Hence, each target had its corresponding mask. The intact logos used in the staircase were computer-generated logos. The intact logos used in the practice session were all novel logos. Logos used in the staircase and the practice sessions appeared only once in this experiment. Logos used in the main experiment were presented twice, once in an initial experiment and again in a repeated experiment. All the targets and the masks were cropped to the same size, a 320 px × 320 px square. Stimuli were presented on an 85-Hz CRT monitor with a resolution of 1,280 × 960 using the PsychoPy package (Peirce et al., 2019) and Python (Python Software Foundation. Python Language Reference, version 3.7). Participants viewed the stimuli with their chin on a chinrest situated 59 cm from the monitor, and thus the images subtended approximately 9.69 degrees of visual angle.
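The relation between physical size, viewing distance, and visual angle can be written as a short calculation. The sketch below uses the standard formula, angle = 2·atan(size / (2·distance)). The physical width of the images on the screen is not reported, so rather than assuming it, the code inverts the formula to show what on-screen width is implied by the reported 9.69 degrees at 59 cm. Function names are illustrative.

```python
import math


def visual_angle_deg(size_cm, distance_cm):
    """Visual angle (degrees) subtended by a stimulus of a given physical size."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


def size_for_angle_cm(angle_deg, distance_cm):
    """Physical size (cm) that subtends a given visual angle at a given distance."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)


# Values taken from the text: 9.69 degrees at a viewing distance of 59 cm.
implied_width_cm = size_for_angle_cm(9.69, 59)
print(f"implied on-screen width: {implied_width_cm:.2f} cm")
```

With the reported values, the implied width of the 320 px images comes out at roughly 10 cm.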

To assess the visual salience of images, we used the saliency toolbox (Walther & Koch, 2006) for MATLAB (version R2021a). The mean and the maximum salience for each image were calculated. Famous logos had larger mean salience than novel logos (Famous (M = .053) > Novel (M = .045), t(187.8) = 3.31, p = .001, Cohen’s d = .47), while novel logos had larger maximum salience than famous logos (Famous (M = 3.15) < Novel (M = 3.20), t(199.8) = -2.27, p = .024, Cohen’s d = .32). Because salience was not perfectly equated between our variables of interest, salience values were included in hierarchical logistic linear models as predictors. In Experiments 3, 4, and 5, we controlled for the visual saliency across stimulus sets.

The experiment had five sessions, all performed within the same hour: practice, staircase, main experiments (initial and repeated), and rating task. For all parts except the rating task, participants performed an intact/scrambled discrimination task. Each trial began with a fixation cross, then a target image (either intact or scrambled) appeared briefly in the middle of the screen followed by a mask (26 frames, 306 ms) (Fig. 2). The duration for the target image was determined for each participant by a staircasing procedure (see below). Participants were asked to respond “intact” or “scrambled” by pressing either the left or right “control” keys on a keyboard, the assignment of which was counter-balanced across participants. Participants were instructed to respond as fast as possible without sacrificing accuracy. The trial ended if participants did not make a response within 136 frames (1.6 s) after the onset of the mask.

Fig. 2 — The procedure for the intact/scrambled task. The target was presented in the middle of the screen and followed by a mask
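Because stimulus timing is specified in frames, the millisecond values reported in this section follow directly from the 85-Hz refresh rate. A minimal sketch of the conversion is given below; the helper names are illustrative and are not part of PsychoPy.

```python
REFRESH_HZ = 85  # refresh rate of the CRT monitor reported in the text


def frames_to_ms(n_frames, refresh_hz=REFRESH_HZ):
    """Duration in milliseconds of a given number of frames."""
    return n_frames * 1000 / refresh_hz


def ms_to_frames(duration_ms, refresh_hz=REFRESH_HZ):
    """Nearest whole number of frames for a requested duration."""
    return round(duration_ms * refresh_hz / 1000)


print(round(frames_to_ms(1), 1))    # duration of a single frame
print(round(frames_to_ms(26)))      # mask duration
print(round(frames_to_ms(136)))     # response window
print(ms_to_frames(118))            # practice target duration in frames
```

One frame lasts about 11.8 ms, so the 26-frame mask lasts about 306 ms, the 136-frame response window lasts 1.6 s, and the 118-ms practice duration corresponds to 10 frames.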

Practice: Each participant first completed 24 trials with the target duration set at 118 ms to familiarize themselves with the task. They received feedback on these practice trials; the word “incorrect” appeared in red in the middle of the screen along with a beep sound (100-Hz tone) for incorrect responses, whereas a black “correct” (and no sound) appeared for correct responses. Feedback was used only in the practice session.
Staircasing: Because we have previously found that individuals vary greatly on this intact/scrambled task, duration was staircased for each individual. We used the Quest algorithm (Watson & Pelli, 1983) with a 71% accuracy threshold to estimate presentation duration, and the resulting duration was used in the main and repeated-main experiments. The staircase procedure contained 188 trials and the possible durations were set to range from one to 21 frames at 11.8 ms per frame.
Main experiments: Because we were interested in repetition, each participant initially completed 404 trials of 404 unique stimuli (no repeats). We refer to this phase of the experiment as the main experiment. Immediately after completing the first repetition, they completed a second repetition of the task with all the 404 stimuli, but in a different order. We refer to this condition as the repeated main experiment.
Rating: After completing the repeated main experiment, participants rated how familiar they were with each famous logo on a 7-point scale, 1 (never seen this logo) to 7 (very familiar).
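The staircasing step can be sketched as a simplified Bayesian adaptive procedure in the spirit of QUEST. This is not Watson and Pelli's exact algorithm, nor PsychoPy's implementation: it keeps a posterior over a threshold duration on a grid from 1 to 21 frames, places each trial at the current posterior mean, and updates the posterior from the correct or incorrect response. The Weibull psychometric function and its guess rate, lapse rate, and slope are assumptions, as is the simulated observer used to exercise the code.

```python
import math
import random

GUESS, LAPSE, SLOPE = 0.5, 0.01, 3.5   # assumed psychometric parameters
MIN_FRAMES, MAX_FRAMES = 1, 21         # duration range reported in the text


def p_correct(duration, threshold):
    """Assumed Weibull psychometric function (duration and threshold in frames)."""
    return GUESS + (1 - GUESS - LAPSE) * (1 - math.exp(-(duration / threshold) ** SLOPE))


def posterior_mean(grid, log_post):
    peak = max(log_post)                          # subtract the peak for stability
    weights = [math.exp(lp - peak) for lp in log_post]
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)


def run_staircase(respond, n_trials=188):
    """Return the posterior-mean threshold after n_trials adaptive trials."""
    grid = [MIN_FRAMES + 0.1 * i for i in range(201)]   # 1.0 .. 21.0 frames
    log_post = [0.0] * len(grid)                        # flat prior
    for _ in range(n_trials):
        estimate = posterior_mean(grid, log_post)
        duration = min(MAX_FRAMES, max(MIN_FRAMES, round(estimate)))
        correct = respond(duration)
        for i, threshold in enumerate(grid):
            p = p_correct(duration, threshold)
            log_post[i] += math.log(p if correct else 1 - p)
    return posterior_mean(grid, log_post)


def duration_for_accuracy(threshold, accuracy=0.71):
    """Duration (frames) at which the assumed function predicts the given accuracy."""
    frac = (accuracy - GUESS) / (1 - GUESS - LAPSE)
    return threshold * (-math.log(1 - frac)) ** (1 / SLOPE)


# Hypothetical simulated observer with a true threshold of 8 frames.
TRUE_THRESHOLD = 8.0
rng = random.Random(1)


def simulated_observer(duration):
    return rng.random() < p_correct(duration, TRUE_THRESHOLD)


estimated_threshold = run_staircase(simulated_observer)
target_frames = duration_for_accuracy(estimated_threshold)
print(round(estimated_threshold, 2), round(target_frames, 2))
```

In an actual experiment the `respond` callback would present a trial at the requested duration and return whether the participant answered correctly; the duration returned by `duration_for_accuracy` would then be rounded to a whole number of frames.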

Results

The durations that were obtained during staircasing ranged from one frame (~ 12 ms) to 20 frames (~ 235 ms) across participants, with a mean of 9.76 frames (~ 115 ms) and a standard error of 1.48 frames (~ 17 ms). For the main experiment, we first excluded trials in which participants did not respond. Most participants missed fewer than 2% of the trials in each condition. The highest missing rate was 23% of trials in the novel intact repeated condition and this only happened in one participant. In the context of signal detection theory, intact images were viewed as signal present while scrambled images were considered as signal absent. Hit rates, therefore, were defined as ‘intact’ responses when the targets were indeed intact. False alarm rates were defined as ‘intact’ responses when the targets were scrambled. A sensitivity measure, d’, was calculated for both the famous and novel logo conditions in the main experiments. For both the initial and repeated phases of the main experiment, we observed higher d’ for famous (Initial: M = 2.54, SE = 0.23; Repeated: M = 2.49, SE = 0.21) than novel logos (Initial: M = 2.31, SE = 0.26; Repeated: M = 2.23, SE = 0.25) (Fig. 3). A two-way repeated measures ANOVA (afex package (Singmann et al., 2021) in R version 3.6.3 (R Core Team, 2020)) of familiarity (famous vs. novel) and repetition (initial vs repeated) revealed a significant main effect of familiarity (F(1, 25) = 18.38, p < .001, partial η² = .42), but no significant main effect of repetition (F(1, 25) = .07, p = .79, partial η² = .002) nor an interaction with repetition (F(1, 25) = .06, p = .81, partial η² = .002).
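The sensitivity and bias measures follow the standard signal detection formulas: d’ is Z(hit) minus Z(FA), and bias is -0.5 × (Z(hit) + Z(FA)), as given in the note to Table 1. A minimal sketch using only the Python standard library is shown below. The function names are illustrative, and the sketch does not handle hit or false alarm rates of exactly 0 or 1, for which the inverse normal is undefined; how such cases were treated is not stated in the text.

```python
from statistics import NormalDist

_z = NormalDist().inv_cdf  # inverse of the standard normal CDF


def d_prime(hit_rate, fa_rate):
    """Sensitivity: Z(hit) - Z(FA). Rates must lie strictly between 0 and 1."""
    return _z(hit_rate) - _z(fa_rate)


def bias_c(hit_rate, fa_rate):
    """Criterion: -0.5 * (Z(hit) + Z(FA)). Negative values favor 'intact' responses."""
    return -0.5 * (_z(hit_rate) + _z(fa_rate))


# Group-mean rates for famous logos in the initial phase (Table 1).
print(round(d_prime(0.90, 0.19), 2), round(bias_c(0.90, 0.19), 2))
```

Applied to the group-mean rates this gives a d’ of roughly 2.16 and a bias of roughly -0.20. The d’ value is lower than the mean d’ reported in Table 1, which is to be expected if d’ was computed for each participant and then averaged, because d’ is a nonlinear function of the rates.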

Fig. 3 — A Within-subject comparison violin plots. Each dot represents one participant, and the performance of each participant is connected with a line. The distribution shows the density function of the d’ distributions. The bar superimposed on the dots represents the 95% within-subject confidence interval based on the Cousineau-Morey-O’Brien method (Cousineau & O’Brien, 2014). The left-hand plots show the data from the initial presentation, and the right-hand plots show data from the repeated presentation. Blue represents famous logos, and red represents the novel logos. The main effect of fame is significant, but the main effect of repetition and the interaction effect are not. B The fame advantage was calculated as the d’ of the famous condition minus the d’ of the novel condition. Each dot represents one participant, and the performance of each participant is connected with a line

To better understand the factors contributing to higher sensitivity, we compared both hit rates and false alarms as a function of fame and repetition (Table 1). We observed higher hit rates for famous logos than novel logos (Fame: F(1, 25) = 52.00, p < .001, partial η² = .68) and no effect of repetition (F(1, 25) = .23, p = .64, partial η² = .01) nor interaction with repetition (F(1, 25) = .28, p = .60, partial η² = .01). We also observed higher false alarm rates for the scrambled versions of the famous logos than novel logos (Fame: F(1, 25) = 41.11, p < .001, partial η² = .62). We suspect that the difference in false alarm rates between famous and novel logos is due to low-level features that were retained in diffeomorphed famous logos (e.g., colors), causing participants to guess intact. There was no effect of repetition (F(1, 25) = .01, p = .92, partial η² = .00) nor an interaction with repetition for false alarms (F(1, 25) = .01, p = .92, partial η² = .00). Higher hit rates and false alarm rates resulted in a significant difference in bias between famous and novel logos (F(1, 25) = 108.6, p < .001, partial η² = .81).
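For readers who wish to check the reported effect sizes, partial η² can be recovered from an F statistic and its degrees of freedom as F·df_effect / (F·df_effect + df_error). The sketch below applies this identity to the F values reported above; the function name is illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return f_value * df_effect / (f_value * df_effect + df_error)


# F values reported in the text, all with df = (1, 25).
for label, f_value in [
    ("d' fame", 18.38),
    ("hit rate fame", 52.00),
    ("false alarm fame", 41.11),
    ("bias fame", 108.6),
]:
    print(label, round(partial_eta_squared(f_value, 1, 25), 2))
```

The recovered values agree with the reported partial η² of .42, .68, .62, and .81 to two decimal places.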

Table 1.

Summary of participant performance in Experiments 1, 2, and 3

	d’ (sensitivity)	Hit rate	FA rate	Bias	RT (ms)
Exp. 1 (Logo in-lab)
Initial - Famous	2.54 ± 0.16	0.90 ± 0.02	0.19 ± 0.04	-0.21 ± 0.09	559 ± 21
Initial - Novel	2.31 ± 0.15	0.82 ± 0.03	0.15 ± 0.03	0.12 ± 0.10	593 ± 22
Repeat - Famous	2.49 ± 0.17	0.89 ± 0.02	0.18 ± 0.04	-0.13 ± 0.09	539 ± 17
Repeat - Novel	2.23 ± 0.17	0.79 ± 0.03	0.14 ± 0.03	0.18 ± 0.10	557 ± 17
Exp. 2 (Logo online)
Famous upright	1.70 ± 0.14	0.84 ± 0.02	0.34 ± 0.03	-0.32 ± 0.06	589 ± 41
Famous inverted	1.60 ± 0.12	0.82 ± 0.02	0.31 ± 0.03	-0.21 ± 0.06	602 ± 26
Novel upright	1.43 ± 0.11	0.72 ± 0.02	0.27 ± 0.03	0.04 ± 0.07	614 ± 22
Novel inverted	1.35 ± 0.11	0.69 ± 0.02	0.26 ± 0.03	0.11 ± 0.07	657 ± 53
Exp. 3 (Face online)
Famous upright	1.88 ± 0.11	0.78 ± 0.02	0.21 ± 0.02	0.02 ± 0.06	584 ± 23
Famous inverted	1.60 ± 0.10	0.68 ± 0.02	0.20 ± 0.02	0.22 ± 0.06	618 ± 31
Novel upright	1.73 ± 0.10	0.75 ± 0.02	0.22 ± 0.02	0.07 ± 0.07	596 ± 28
Novel inverted	1.58 ± 0.10	0.66 ± 0.02	0.20 ± 0.02	0.26 ± 0.06	682 ± 66

Each cell represents mean ± standard error. The bias was calculated as -0.5*(Z(hit) + Z(FA)). Response time was calculated only for intact trials

There was no evidence of a speed/accuracy trade-off in these experiments (see Table 1); participants responded significantly faster to famous intact logos than to novel intact logos in initial and repeated presentations (Initial: 559 ms (SE = 21) vs. 593 ms (SE = 22), Repeated: 539 ms (SE = 17) vs. 557 ms (SE = 17); Fame: F(1, 25) = 53.54, p < .001, partial η² = .68; Interaction: F(1, 25) = 3.07, p = .092, partial η² = .11). That is, intact famous logos were detected both more accurately and more quickly than intact novel logos.

In follow-up analyses, we included salience in our statistical model to determine whether salience differences between intact famous and novel logos might be the cause of our familiarity effect. A random-intercept logistic hierarchical linear model was fitted to the accuracy (modeled on the log-odds scale) of each intact trial (lme4 package (Bates et al., 2015) in R version 4.1.2 (R Core Team, 2021)). Instead of using the d’ estimate, which aggregates across trials, accuracy on intact trials was chosen as the dependent variable so that we could model the salience of each image on each trial. This model contained five fixed-effect factors (fame, repetition, the interaction of fame and repetition, and the mean and maximum salience of each stimulus) and a random intercept for each participant. Results showed that the intercept, fame, and repetition were the only significant predictors of accuracy in this model (Intercept: β = 2.09, SE = 0.90, Z = 2.31, p = .021; Fame: β = 0.75, SE = 0.08, Z = 9.16, p < .001; Repetition: β = -0.22, SE = 0.07, Z = -3.17, p = .002). The negative beta weight associated with repetition indicates that participants were actually worse at discriminating the intact logos the second time they were encountered. This is more likely due to fatigue than to logo repetition itself. Importantly, not only did salience not account for significant variance, but the fame effect persisted when salience was included in the model, indicating that the familiarity effect cannot be attributed to low-level salience differences. Finally, the rating task confirmed that participants found our famous logos familiar, with an average rating of 6.57 on a 7-point Likert scale (7 is very familiar and 1 is have never seen this logo).
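Because the model is fitted on the log-odds scale, each coefficient can be read as an odds ratio by exponentiating it. The sketch below does this for the reported fame and repetition coefficients and adds an approximate 95% Wald interval from the reported standard errors. It assumes each coefficient reflects a one-unit change in its predictor; how fame and repetition were coded is not stated in the text, so the interpretation is tentative. Function names are illustrative.

```python
import math


def odds_ratio(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)


def wald_interval(beta, se, z=1.96):
    """Approximate 95% Wald confidence interval on the odds-ratio scale."""
    return math.exp(beta - z * se), math.exp(beta + z * se)


# Coefficients and standard errors reported in the text.
fame_or = odds_ratio(0.75)
fame_ci = wald_interval(0.75, 0.08)
repetition_or = odds_ratio(-0.22)

print(round(fame_or, 2), [round(v, 2) for v in fame_ci], round(repetition_or, 2))
```

Under that assumption, the odds of a correct "intact" response were about 2.1 times higher for famous than for novel logos, and about 0.8 times as high on the repeated presentation as on the initial one.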

Discussion

In Experiment 1, two aspects of familiarity were tested: familiarity built up through repeated exposure in our everyday lives (famous) and familiarity established within the experiment (repetition). An intact/scrambled task was used to assess the sensitivity of participants to the presence of coherent (intact) stimuli under rapid presentation. Our data suggested that famous logos were better perceived than novel logos. A single repetition within the context of the experiment was not sufficient to impact sensitivity. This influence of familiarity then may require more frequent exposures than a single repetition within the experiment. We note, however, that in this case we were repeating images that participants may not have perceived clearly due to the brief and masked presentations, potentially preventing the formation of a familiarity signal. Future work is needed to determine whether a single clearly perceived exposure might impact the intact/scrambled discrimination.

In the next experiment, we use the same set of stimuli to replicate the effect of fame (the stronger of the two effects from Experiment 1) and also introduce another source of familiarity, orientation. We assume that subjects have more experience with upright logos than inverted ones, and thus inverted logos would be less familiar than upright ones.

Experiment 2: Intact/scrambled logos (online)

In Experiment 1, we demonstrated an effect of familiarity on perception (the detection of an intact as opposed to scrambled image) when comparing famous and novel logos. In Experiment 2, we asked whether this familiarity effect can be replicated, and further explored the effect of upright (familiar) versus inverted (unfamiliar) orientation of the logo. Specifically, we picture-plane rotated the logos 180°. The same intact/scrambled task and procedure as in Experiment 1 were used. Experiment 2 and the following experiments were conducted online due to the COVID-19 pandemic. We note that we have less precise control of the duration of the stimuli in an online study than we do in the laboratory, due to differences in monitors and internet bandwidth and reliability. To tackle these limitations, we loaded the stimuli when the experiment was first initialized, we excluded participants based on dropped frames (see exclusion criteria below for a description of how dropped frames were calculated), we used a within-subject design such that any idiosyncrasies of the monitor are shared across conditions (e.g., decay rate of monitors), and we increased the number of participants to offset the variability in the online studies. Furthermore, the performance variability induced by any participant’s computer or internet connection should be viewed as variability that, if anything, reduces the effect size. Thus, if we find a significant effect of familiarity in the online studies, we can infer that we would get similar or larger effects with better timing control. This experiment was pre-registered on the Open Science Framework (OSF) website (https://osf.io/mnpw6).