Scientific Reports. 2026 Jan 23;16:6080. doi: 10.1038/s41598-026-36905-4

Construction and analysis of a packaging design preference model using eye-tracking degree of preference

Yingzhe Xiao 1, Jingli Fang 1, Hanyue Zhang 1, Qianxi Li 1, Yanyue Zhang 1
PMCID: PMC12901184  PMID: 41577746

Abstract

The homogenization of packaging design highlights the importance of objectively assessing consumer emotional responses. However, existing methods predominantly rely on subjective reports and lack deep integration with objective physiological metrics. To address this, this exploratory study aimed to construct and preliminarily validate a quantitative predictive model based on the Eye-tracking Degree of Preference (E-Dop). Eye-tracking metrics and subjective preference ratings were collected synchronously from 30 young adult participants as they viewed shampoo packaging samples, and a multiple linear regression model was constructed. The results indicated that, under the tested conditions, design elements such as medium-high saturation, cool tones, and rounded morphologies were significantly correlated with higher levels of visual attention and user preference ratings. The constructed E-Dop model demonstrated good predictive performance within the specific cohort (adjusted R² = 0.702) and passed rigorous Leave-One-Subject-Out Cross-Validation (r = 0.84). This study provides methodological evidence for using eye-tracking technology to quantify the “visceral-level” visual preference elicited by packaging design. The E-Dop model establishes a preliminary bridge connecting subjective preference reports with objective eye-tracking measurements. This framework not only provides initial physiological evidence in support of emotional design theory but also offers a prototypical, quantifiable decision-support framework for optimizing designs targeted at specific consumer segments.

Keywords: Packaging design, Evaluation method, Emotional experience, Eye-tracking technology, Predictive model

Subject terms: Computational biology and bioinformatics, Neuroscience, Psychology, Engineering, Materials science, Mathematics and computing

Introduction

In the highly homogenized consumer goods market, packaging design has transcended its protective function to become a critical element for brands to establish emotional connections with consumers and achieve market differentiation1,2. In the fast-paced retail environment, effective packaging not only conveys information but also elicits positive emotional responses, directly influencing purchase decisions3. This process aligns closely with emotional design theory, which posits that successful design should systematically shape user experience across visceral, behavioral, and reflective levels4,5. Among these, the visual presentation of packaging first operates at the visceral level, aiming to elicit rapid, automated emotional responses through sensory attributes such as color and form. With the advent of the “experience economy,” quantifying the emotional experiences evoked by packaging has become a central challenge for both corporations and designers6.

However, current packaging design evaluation predominantly relies on designers’ experiential intuition and subjective reporting methods such as questionnaires and focus groups. While these approaches capture users’ explicit attitudes7, they are susceptible to psychological factors such as social desirability bias and struggle to access the rapid, implicit, preconscious cognitive processes underlying consumer decision-making8. This has created a measurement gap between the subjective emotional experience of packaging and its objective physiological impact.

To fundamentally bridge this gap between subjective emotional experience and objective design elements, eye-tracking technology has been introduced as an objective measure of visual attention. Its core theoretical foundation is rooted in the “eye-mind hypothesis” in cognitive science, which posits that an individual’s point of gaze is closely linked to their immediate information processing content9. Nevertheless, prevailing research paradigms often exhibit a “fragmented” character, typically examining descriptive correlations between single or limited eye-tracking metrics and overall packaging design preferences10–12. This “one-at-a-time” analytical approach fails to capture the synergistic effects of multidimensional attentional processes and lacks the predictive capability to quantify preference levels for specific design elements13,14. Consequently, a critical methodological gap emerges: a framework that systematically integrates multi-dimensional eye-tracking metrics to construct a quantitative model for predicting subjective preference is currently lacking. This absence severely limits the decision-support value of eye-tracking data in design practice. Concurrently, in the field of complex user experience research, integrating multimodal measurements to enhance explanatory and predictive power has become a significant methodological trend.

To address this gap, this exploratory study aims to develop and preliminarily validate a novel evaluation framework based on an E-Dop within a specific yet essential consumer group (university students). This study attempted to integrate multidimensional objective eye-tracking data with subjective user preference reports, aiming to explore a new, data-driven path for design evaluation that combines subjective and objective evidence. The specific objectives are to: (a) explore and construct a multiple linear regression model that uses multidimensional eye-tracking metrics as independent variables to predict users’ subjective preference scores; (b) systematically investigate the effects of five core design elements (saturation, brightness, hue, image-text ratio, and morphology) on visual attention and preference; and (c) employ Leave-One-Subject-Out Cross-Validation (LOSOCV) to conduct a preliminary evaluation of the predictive efficacy and stability of the constructed model within the studied cohort.
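Objective (c) can be illustrated with a minimal sketch of Leave-One-Subject-Out Cross-Validation: in each fold, every observation from one participant is held out, the regression is refit on the remaining participants, and the held-out ratings are predicted. The helper names (`fit_ols`, `predict`, `losocv`, `pearson_r`) and the noise-free toy data are assumptions made for illustration; they are not the study's data or analysis code.

```python
def fit_ols(X, y):
    """Least-squares coefficients [intercept, b1, ...] via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(A[i][c]))
        A[c], A[p] = A[p], A[c]
        for i in range(k):
            if i != c:
                f = A[i][c] / A[c][c]
                A[i] = [a - f * b for a, b in zip(A[i], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(beta, x):
    """Apply fitted coefficients to one row of predictors."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def losocv(subjects, X, y):
    """Hold out all rows of one subject per fold; return pooled predictions."""
    preds = [None] * len(y)
    for held_out in sorted(set(subjects)):
        train = [i for i, s in enumerate(subjects) if s != held_out]
        test = [i for i, s in enumerate(subjects) if s == held_out]
        beta = fit_ols([X[i] for i in train], [y[i] for i in train])
        for i in test:
            preds[i] = predict(beta, X[i])
    return preds


def pearson_r(a, b):
    """Pearson correlation between observed and predicted values."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5


# Hypothetical toy data: 4 subjects x 3 stimuli, two eye-tracking predictors.
subjects = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]
X = [(0.2, 1.0), (0.5, 2.0), (0.9, 1.5), (0.3, 2.5), (0.7, 0.5), (0.1, 1.2),
     (0.6, 2.2), (0.4, 0.8), (0.8, 1.9), (0.35, 1.1), (0.55, 2.4), (0.15, 0.6)]
y = [1.0 + 2.0 * a - 0.5 * b for a, b in X]  # noise-free, so folds recover y

preds = losocv(subjects, X, y)
r = pearson_r(y, preds)
```

Splitting by participant rather than by row keeps all of a person's trials on one side of the split, so the correlation between observed and predicted ratings reflects generalization to unseen individuals within the cohort.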

The findings of this study are intended to provide initial methodological evidence for using eye-tracking technology to quantify the ‘visceral-level’ visual preference elicited by packaging design. Furthermore, they aim to offer a testable conceptual model for implementing an ‘integrated subjective-objective data’ decision-support framework in the field of packaging design evaluation. The study’s structure is shown in Fig. 1. To establish the theoretical foundation of this study and clarify its innovative contributions, a systematic review of relevant research on emotional design, the eye-tracking paradigm, and packaging design elements is presented next.

Fig. 1. Research structure.

Literature review and theoretical framework

Emotional design in packaging: value and measurement challenges

Packaging design has evolved beyond its physical functionality to become central in shaping brand experience and fostering emotional connections. This shift aligns closely with emotional design theory, which describes successful design across three interrelated levels: visceral, behavioral, and reflective15. In retail environments, packaging operates primarily at the visceral level, where it elicits immediate emotional responses through sensory attributes such as visual appearance and tactile qualities16. However, a key challenge in applying this theory empirically lies in quantifying these rapid, implicit, and often preconscious emotional reactions. Traditional self-report methods are limited in capturing such fleeting and instinctive responses, as participants may struggle to accurately recall subtle emotional shifts or may be influenced by social desirability bias. This measurement gap underscores the need for objective physiological approaches to empirically substantiate emotional design theory17. To address this need, this study aimed to explore an objective measurement pathway to quantify the emotional preference elicited by packaging design at the visceral level. Specifically, it defined the measurement target (visceral-level emotional responses) using emotional design theory, and attempted to operationalize this theoretical construct by employing the compatible eye-tracking paradigm, which is grounded in the “eye–mind hypothesis.”

Eye-tracking paradigm: from description to prediction

Among various objective tools, eye-tracking technology has emerged as a powerful instrument for decoding implicit cognitive processes, as it enables non-invasive and precise recording of visual attention allocation13. Methodologically, it is grounded in the “eye–mind hypothesis18,” which posits that an individual’s point of gaze is closely linked to their immediate information processing. Derived metrics, such as total fixation duration and time to first fixation, provide a quantifiable scale for visual behavior. The reliability of these eye-tracking metrics as objective data sources has been empirically supported in numerous applications with stringent precision requirements. For instance, studies in sophisticated human-robot interaction have demonstrated that high-precision eye-tracking calibration data can be reliably used to control collaborative robots, providing practical validation for the robustness of eye-tracking data in capturing and quantifying fine-grained cognitive intent19.

Existing studies have successfully applied these metrics for descriptive and diagnostic analyses, confirming the attention-capturing capacity of specific design elements11,12,20–22. However, a key paradigmatic limitation is that much of the current research remains at a “post-hoc explanation” stage14,23–25, addressing questions such as “Which design elements attract attention?”10 or “Is there a correlation between attention patterns and overall preference?”26, without advancing toward a priori prediction. This approach has not yet succeeded in integrating multidimensional eye-tracking indicators into a quantitative model capable of predicting preference intensity, thus leaving a methodological gap between descriptive association and predictive modeling. To bridge this methodological gap and address the need for objective quantification of “visceral-level” emotional responses, the E-Dop model constructed in this study attempts a theoretical integration. It establishes the “visceral-level” emotional preference, a core concern of emotional design theory, as the prediction target, while employing multidimensional eye-tracking metrics (supported by the “eye–mind hypothesis”) as objective, quantifiable explanatory variables to predict this target. The fundamental hypothesis is that, during the rapid visual evaluation of packaging, the immediate emotional reactions elicited by a design systematically modulate and are reflected in observable eye movement behavior patterns. The central purpose of this theoretical integration is to construct an operationalizable, behavior-based theoretical framework for measuring “visceral-level visual preference,” thereby aiming to advance the eye-tracking research paradigm from description and diagnosis towards a new stage of explanation and prediction.

Synergistic effects of design elements and an integrated research perspective

The overall experience of packaging design is shaped by the complex interaction of basic visual elements such as color, form, and layout. The individual psychological effects of these elements have already been studied extensively.

Color is one of the fastest elements to trigger visceral-level emotional responses27–29, with its influence rooted in color psychology30,31. Hue, saturation, and brightness collectively determine the emotional tone and visual impact of color. In the field of consumer research, the association between specific colors and emotional semantics, such as the link between blue hues and feelings of cleanliness and freshness, has been examined and discussed32. The visual layout and packaging form follow Gestalt principles, influencing the organization of information and readability. The image-text ratio relates to the organization of information and visual fluency, directly affecting cognitive load and aesthetic perception33. From a formal perspective, rounded curves are typically perceived as safer, softer, and more pleasant34. This common perceptual tendency is also supported by research in product perception and consumer behavior35, representing a classic manifestation of visceral-level design in emotional design theory36,37.

However, the dominant “isolated variable” approach in mainstream research has led to the neglect of synergistic effects among design elements. Packaging in the market is a composite entity, yet existing research often breaks down the intrinsic connections between its components. This limits the practical guidance these findings can provide for addressing key questions in real-world design decisions, such as “How should different elements be weighted?” or “How do they collectively influence consumer preference?” Because multiple design elements are rarely examined within a unified analytical framework that captures their relative contributions and interactions, the potential of existing design theory for comprehensive application remains underexplored. Therefore, this study aims to explore a feasible pathway for the objective quantification and prediction of overall packaging design preference. This is achieved by introducing eye-tracking technology to capture real-time visual cognitive processes and constructing a predictive model (E-Dop) that integrates multidimensional eye-tracking metrics.

Positioning the model from a multi-criteria decision-making perspective

The preceding discussion has focused on domain-specific challenges and the technological evolution of packaging design evaluation. Broadening the perspective to a more macroscopic methodological spectrum reveals that the core task of this study, constructing the E-Dop model by integrating multiple heterogeneous inputs to generate a comprehensive evaluation output, resonates profoundly with mature paradigms in decision science.

The systematic integration of subjective judgment and objective data to support complex decision-making is a central issue that has long been explored in Multi-Criteria Decision Making (MCDM) and hybrid evaluation frameworks38. This field employs tools such as the Analytic Hierarchy Process and Fuzzy Comprehensive Evaluation to structurally integrate multiple, often conflicting, criteria, thereby addressing uncertainty and informational complexity in decision processes39. Its methodological essence lies in the transparent and interpretable weighting and aggregation of diverse “input criteria” to form a comprehensive judgment.

The E-Dop model proposed in this study is highly isomorphic with the aforementioned paradigm in its methodological structure. The model defines several eye-tracking metrics (TTFF, FD, FFD, OL), each carrying distinct cognitive semantics, as a set of “input criteria” quantifying a design’s performance across dimensions such as initial attraction and sustained interest. Subsequently, via multiple linear regression, a transparent and interpretable statistical model, E-Dop achieves a weighted fusion of these criteria to output a quantified composite preference score.
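The weighted fusion described above can be written in the general form of a multiple linear regression. The coefficient symbols below are placeholders introduced for illustration only; this section does not report the fitted values, and the exact specification of the published model may differ.

```latex
\[
\widehat{\text{E-Dop}}
  = \beta_0
  + \beta_1\,\mathrm{TTFF}
  + \beta_2\,\mathrm{FD}
  + \beta_3\,\mathrm{FFD}
  + \beta_4\,\mathrm{OL}
\]
```

Under the multi-criteria reading, each coefficient acts as the weight of one input criterion, and its sign indicates whether a larger value of that metric is associated with higher or lower predicted preference.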

Consequently, this work can be explicitly positioned as a data-driven, physiology-based application of the MCDM philosophy to a specific, micro-level problem: consumer visual perception assessment. This positioning not only clarifies the cross-methodological significance of the study but also aims to advance eye-tracking technology beyond its traditional descriptive and diagnostic roles. It promotes its evolution into a quantitative, multi-criteria assessment system that supports design decision-making, thereby connecting with the broader frontier of data-driven decision analytics.

Identification of the research gap

Based on the foregoing analysis, a specific and clear research gap is identified. Methodologically, eye-tracking research needs to evolve from a descriptive analysis paradigm towards a predictive modeling paradigm. In terms of theoretical application, packaging design research must shift from examining elements in isolation to systematically integrating multiple aspects40,41. Therefore, the current field lacks a model framework capable of systematically integrating multidimensional eye-tracking metrics (which represent different potential cognitive facets of preference formation) and establishing a quantitative, interpretable predictive relationship with the intensity of users’ subjective preferences. Such a model would address the measurement needs of emotional design theory and provide design practice with decision support that transcends intuitive judgment.

To construct and validate this model, two sequential experiments were designed and conducted. First, a controlled eye-tracking experiment was performed to examine the effects of individual design elements on visual attention and preference.

Eye-tracking test and preference analysis of a single packaging design element

The eye-tracking device and its infrared camera photoreceptor require stable temperature and humidity, so the eye movement experiment was conducted in a temperature- and humidity-controlled environment. In addition, because eye movement experiments involve participants’ cognitive processes, noise and lighting in the experimental environment were controlled and distracting objects were removed to ensure the validity of the experiment.

All experimental procedures involving human participants complied with the ethical standards of the Declaration of Helsinki and were approved by the School of Packaging Engineering at Hunan University of Technology. The experiment was non-invasive, using an infrared eye tracker to record eye movements. Vulnerable groups, such as children, pregnant women, or individuals with decision-making disorders, were excluded. The study involved thirty adult college students, and written informed consent was obtained from all participants before the experiment began.

This study explores consumers’ visual preferences for packaging design through eye-tracking experiments. The participants browsed five groups of packaging samples in a standard environment, and eye movement indicators and subjective preference scores were recorded simultaneously. The experimental objective was to establish an E-Dop prediction model to guide the optimization of sustainable packaging design.

Experimental samples

Shampoo packaging was chosen as the experimental stimulus in this study owing to its typicality in the fast-moving consumer goods (FMCG) sector. The strong market competition in this category renders packaging visual appeal a decisive factor in consumer choice. Moreover, shampoo packaging integrates core design dimensions including colour, graphic layout, and three-dimensional form. It thus offers an ideal platform that balances practical relevance with methodological feasibility for the systematic manipulation of design variables.

Accordingly, the packaging design elements examined in this experiment were selected based on a comprehensive visual design framework. This framework deconstructs design elements into three core dimensions: colour, layout, and form. Colour, as the primary source of visual impact, was further broken down into its fundamental attributes: hue, saturation, and brightness. Layout design, which directs information organisation and communication, was operationalised through the image–text ratio as its key variable. Form refers to the physical structure that defines the packaging’s tangible presence. Together, these five elements encompass the essential aspects of packaging visual design, each of which is well established in the literature as influential on consumer perception and preference.

(1) Personnel samples.

Thirty undergraduate students who reported weekly shopping at large offline supermarkets or shopping malls were recruited. Eligibility was determined via a screening question in the recruitment questionnaire: “How often do you typically shop at a supermarket or shopping mall?” with responses including “Less than once a month,” “Once a month,” “Once a week,” and “Multiple times per week.” Only those who selected “Once a week” or more frequently were included.

All participants had no history of ophthalmic disease and had normal or corrected-to-normal vision. Exclusion criteria included excessively long eyelashes, colour vision deficiency, and uncorrected astigmatism. The exclusion of participants with uncorrected astigmatism was based on technical requirements to ensure accurate eye-tracking data acquisition, thereby maintaining consistent data quality and preventing measurement errors arising from variations in ocular optical characteristics.

This sampling strategy was deliberate and theoretically motivated. As an exploratory study aimed primarily at constructing the E-Dop model, the key initial objective was to validate the method’s internal validity. Consequently, a strategic decision was made to select a relatively homogeneous group of young, educated individuals from a core FMCG consumer segment as the initial sample. This approach prioritized clarity in the relationships between core variables by minimizing interference from extraneous group-level variability during the model-building phase. However, this also means that the external validity (i.e., generalizability) of the findings is currently limited primarily to consumer groups with characteristics similar to this sample. It should be emphasized that this demographic constitutes a core consumer group for personal care products such as shampoo. All findings and the validity of the model in this study were established within this specific population (Chinese university students) and experimental context, thereby laying a necessary methodological foundation for subsequent validation and calibration in broader populations.

(2) The experimental sample setup of a single packaging element.

The experimental shampoo packaging samples were organised into five groups based on distinct appearance design factors: saturation, brightness, hue, graphical proportion, and morphology. It is important to note that all visual stimuli used in this study (Fig. 2) are abstract schematic representations created specifically for this eye-tracking research. The forms depicted are generic representations of common packaging types. These stimuli were expressly redrawn to remove all brand identifiers and are used solely for the controlled analysis of the targeted design variables. Each group was subdivided into different Areas of Interest (AOI), defined as key regions within the stimulus material that enable subsequent analysis of metrics such as fixation duration, pupil size42, and scan rate within each zone.

Fig. 2. Experimental sample. (a) Saturation group samples; (b) brightness group samples (fixed saturation); (c) hue group samples (cool, warm, neutral tones); (d) graphical proportion group samples; (e) stylized morphological group samples.

The saturation group (Sample 1) was designed to control the saturation gradient, comprising four AOIs: low saturation (S1-1), medium saturation (S1-2), medium–high saturation (S1-3), and high saturation (S1-4), as illustrated in Fig. 2a.

The brightness group (Sample 2) maintained fixed saturation while varying colour brightness, resulting in four independent AOIs: low brightness (S2-1), medium brightness (S2-2), medium–high brightness (S2-3), and high brightness (S2-4), shown in Fig. 2b.

The hue group (Sample 3) divided the samples into three AOIs corresponding to cool (S3-1), neutral (S3-2), and warm (S3-3) tones, as presented in Fig. 2c.

The graphical proportion group (Sample 4) varied the image-to-text ratio on the packaging cover, with AOIs classified as text-only (S4-1), low-image proportion (S4-2), and high-image proportion (S4-3), as detailed in Fig. 2d.

The morphology group (Sample 5) manipulated the roundedness of the packaging shape along a gradient from sharp to smooth. Four AOIs were defined: smooth-rounded (S5-1), more rounded (S5-2), sharper (S5-3), and sharp (S5-4), as shown in Fig. 2e.

To ensure internal validity, all visual stimuli were generated using a parametric design process. Professional design software was employed to precisely control individual variables while keeping all other visual elements consistent, allowing for the independent and unambiguous observation of each design factor. Furthermore, to verify that the experimental stimuli effectively manipulated the target variables and exhibited good ecological validity, a pre-test was conducted with 20 preliminary participants who met the formal study’s criteria. In this pre-test, participants viewed the sample sets sequentially in a controlled setting and completed a structured questionnaire. The questionnaire consisted of two main sections: one that assessed visual discriminability by asking participants to rate, on a 5-point Likert scale, the distinctness of differences in the target characteristics across samples within each element group; and another that evaluated semantic appropriateness by having participants rate, also on a 5-point Likert scale, how suitable each design was as shampoo packaging. Based on the feedback collected, the stimulus set was refined and optimized to ensure the validity and reliability of the materials used in the formal experiment.

Experimental tasks

To obtain both the distribution of fixation counts across different sample groups and the subjective rating data for each group, participants completed two experimental tasks: a “Sample Group Eye Movement Browsing Task” (Browsing Task) and a “Subjective Scoring Feedback Task” (Feedback Task). In the Browsing Task, participants viewed five sets of display materials corresponding to the different sample groups. Each sample group was displayed for 10 s, with a 5 s rest between consecutive groups. After the Browsing Task was completed, participants filled out the subjective feedback form at the laboratory bench. The form used a five-point Likert scale {Strongly like, Like, Neutral, Dislike, Strongly dislike}, coded numerically as {5, 4, 3, 2, 1}. The experimental task composition is shown in Fig. 3.

Fig. 3. Experimental flowchart and experiment task map. (a) Experiment flow, (b) Experimental task.

Data preprocessing and quality control

To ensure the reliability and validity of the eye-tracking data used for analysis and to control for potential confounding variables and biases, a standardized data preprocessing and quality control protocol was implemented prior to formal data analysis. All eye-tracking data were processed in accordance with this protocol.

Quality assurance in eye-tracking data collection

During data collection, rigorous quality control procedures were implemented to ensure the fidelity of the eye-tracking recordings. First, a standard 9-point calibration was performed for each participant using the eye-tracker’s dedicated software before the experiment began. Formal data collection commenced only after the average calibration error consistently remained below 0.5 degrees of visual angle, thus ensuring precise tracking. Data quality was monitored in real time throughout the experiment. If substantial signal degradation occurred due to head movement or frequent blinking, the participant was given a short break and then recalibrated before continuing, to maintain data integrity and validity. To enhance data stability, binocular eye-tracking data were recorded synchronously. All subsequent analyses were based on the spatially averaged coordinates of the binocular fixation points provided by the eye-tracking analysis software. This integration of information from both eyes effectively reduces random noise arising from momentary signal loss in one eye or minor calibration discrepancies.
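The calibration gate and the binocular averaging described above can be sketched as follows. In the study these steps were carried out by the eye tracker's own software; the function names here are hypothetical, and falling back to the single valid eye when the other is momentarily lost is an assumption about how such averaging typically handles dropouts.

```python
from typing import Optional, Sequence, Tuple

Point = Optional[Tuple[float, float]]


def calibration_ok(errors_deg: Sequence[float], threshold: float = 0.5) -> bool:
    """True when the mean calibration error is below the threshold (degrees)."""
    return sum(errors_deg) / len(errors_deg) < threshold


def combine_binocular(left: Point, right: Point) -> Point:
    """Average both eyes; use the valid eye alone if the other is lost.

    Returns None when neither eye produced a valid sample.
    """
    if left is not None and right is not None:
        return ((left[0] + right[0]) / 2, (left[1] + right[1]) / 2)
    return left if left is not None else right


# Hypothetical gaze samples in screen pixels.
both = combine_binocular((100.0, 200.0), (104.0, 196.0))
one_eye = combine_binocular(None, (1.0, 2.0))
lost = combine_binocular(None, None)
```

Averaging the two eyes dampens small per-eye calibration offsets, while the fallback keeps a usable coordinate through a brief single-eye dropout.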

Standardized data preprocessing pipeline

To ensure the quality, reliability, and reproducibility of the eye-tracking data used for analysis, all raw gaze trajectory data underwent standardized preprocessing to extract stable, behaviorally meaningful aggregate metrics for subsequent analysis. The specific procedure was as follows. First, the raw gaze trajectory data were exported. A velocity-threshold identification algorithm, built into the eye-tracking analysis software, was then used to automatically classify the raw data stream into fixation events, saccade events, and invalid data points. For each pre-defined Area of Interest (AOI) within an experimental trial, all fixation events occurring within that region were extracted. Subsequently, a two-step data cleaning procedure was applied: (1) Brief fixations lasting less than 60 milliseconds were merged into adjacent, longer fixation events. This corrected potential oversegmentation by the event detection algorithm and ensured that each fixation event corresponded to a meaningful unit of information processing. (2) Continuous segments of invalid data, caused by significant head movement, prolonged blinking, or persistent signal loss that could not be reliably classified as any valid oculomotor event, were marked as missing. Data from these marked periods were excluded when calculating aggregate metrics for each AOI.
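A minimal sketch of the procedure above: velocity-threshold classification of gaze samples into fixation events, followed by merging of fixations shorter than 60 ms. The study used the algorithm built into the analysis software, so this is only an illustration. The 30 deg/s velocity threshold, the sample format, and the rule of merging a short fixation into the preceding event (or the following one when no preceding event exists) are assumptions.

```python
def detect_fixations(samples, velocity_threshold=30.0):
    """Velocity-threshold identification of fixation events.

    samples: list of (t_ms, x_deg, y_deg), or None where the signal is invalid.
    Returns fixation events as (start_ms, end_ms) tuples.
    """
    events, start, prev = [], None, None
    for s in samples:
        if s is None or prev is None:
            slow = False  # invalid data never extends a fixation
        else:
            dt = (s[0] - prev[0]) / 1000.0
            dist = ((s[1] - prev[1]) ** 2 + (s[2] - prev[2]) ** 2) ** 0.5
            slow = dt > 0 and dist / dt < velocity_threshold
        if slow:
            if start is None:
                start = prev[0]
        elif start is not None:
            events.append((start, prev[0]))  # close the running fixation
            start = None
        prev = s
    if start is not None:
        events.append((start, prev[0]))
    return events


def merge_short_fixations(fixations, min_duration=60):
    """Fold fixations shorter than min_duration (ms) into a neighbour."""
    merged, pending = [], None
    for start, end in fixations:
        if pending is not None:  # absorb a short leading fixation
            start, pending = pending[0], None
        if end - start < min_duration:
            if merged:
                merged[-1] = (merged[-1][0], end)
            else:
                pending = (start, end)
        else:
            merged.append((start, end))
    if pending is not None:
        merged.append(pending)
    return merged


# Hypothetical 100 Hz stream: fixation, saccade, fixation, then signal loss.
samples = [(0, 0.0, 0.0), (10, 0.0, 0.0), (20, 0.0, 0.0), (30, 0.0, 0.0),
           (40, 5.0, 0.0), (50, 5.0, 0.0), (60, 5.0, 0.0), None]
events = detect_fixations(samples)
cleaned = merge_short_fixations([(0, 200), (210, 250), (300, 500)])
```

Invalid samples close any running fixation and are otherwise ignored, which mirrors the practice of marking such segments as missing rather than interpolating across them.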

It should be noted that marking invalid data segments as missing and excluding them from the calculation of aggregate metrics is a widely adopted practice in eye-tracking research to ensure the reliability of the derived indices. This approach may result in a loss of micro-level temporal continuity. However, the objective of this study was to develop a predictive model linking macro-level eye-movement metrics (e.g., total fixation duration) to overall preference, rather than to analyze millisecond-level dynamic gaze patterns.

Definitions of key eye-tracking metrics

To eliminate terminological ambiguity and establish a clear foundation for subsequent analysis, strict operational definitions are provided for all core eye-tracking metrics used in this study43. These definitions adhere to standard terminology in the field of eye-tracking research. The calculation method, cognitive interpretation, and specific reference within the context of this study for each metric are detailed in Table 1.

Table 1. Eye-tracking metric homogeneity test.

| Eye movement index | Full name | Definition and description |
| --- | --- | --- |
| FC | Fixation count | The total number of fixation events occurring within an Area of Interest (AOI) |
| FD | Fixation duration | The sum of the durations of all fixation points within a specific AOI |
| TTFF | Time to first fixation | The elapsed time from stimulus onset until the first fixation point lands within the target AOI |
| FFD | First fixation duration | The duration of the very first fixation event within a specific AOI |
| OL | Observation length | The total dwell time within an AOI, including both fixation and saccade times |
| OC | Oculomotor capture | The total number of times a participant’s gaze enters a specific AOI; used to measure re-visitation frequency to that area |
| FB | Fixation blink | A brief data gap caused by blinks or signal loss, with a duration below the minimum fixation threshold |

*FD represents the cumulative sum of ‘effective fixation time’ during which cognitive processing occurs. In contrast, OL measures the ‘total visit time’ within an AOI, encompassing both effective fixations and saccadic movements.
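To make the definitions in Table 1 concrete, the sketch below derives FC, FD, TTFF, and FFD for one AOI from a list of fixations. The dictionary keys (`aoi`, `onset`, `duration`) and the function name `aoi_metrics` are hypothetical; OL and OC are omitted because they also require saccade and visit information that this simplified input does not carry.

```python
def aoi_metrics(fixations, aoi):
    """Compute FC, FD, TTFF and FFD for one AOI.

    `fixations` is a list of dicts with hypothetical keys:
    'aoi' (label), 'onset' (seconds from stimulus onset), 'duration' (seconds).
    """
    hits = sorted(
        (f for f in fixations if f["aoi"] == aoi),
        key=lambda f: f["onset"],
    )
    if not hits:
        # The AOI was never fixated: latency and first duration are undefined.
        return {"FC": 0, "FD": 0.0, "TTFF": None, "FFD": None}
    return {
        "FC": len(hits),                         # number of fixations in the AOI
        "FD": sum(f["duration"] for f in hits),  # summed fixation time
        "TTFF": hits[0]["onset"],                # latency of the first fixation
        "FFD": hits[0]["duration"],              # duration of the first fixation
    }
```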

Statistical reliability and control measures

Multiple strategies were employed to ensure statistical reliability and control systematic bias. First, measurement consistency was verified by calculating the intraclass correlation coefficient for key eye-tracking metrics, which confirmed good test-retest reliability. Second, a Latin square design was used to counterbalance the presentation order of the five packaging sample groups, thereby controlling for order effects. Third, a single-blind experimental procedure was implemented, and environmental variables such as lighting, temperature, and humidity were controlled within a standardized laboratory setting to minimize variability introduced by the experimenter or environment. Together, these transparent and standardized procedures yielded a controlled, high-quality dataset for all subsequent statistical analyses, supporting the internal validity of the study.
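As an illustration of the counterbalancing step, a cyclic Latin square for the five sample groups can be generated as below. The specific square used in the experiment is not reported, so the cyclic construction, the function names, and the rule for assigning participants to rows are all assumptions. Under this assignment rule, 30 participants would cycle through the five orders six times each.

```python
def latin_square(conditions):
    """Cyclic Latin square: each condition appears once per row and column."""
    n = len(conditions)
    return [
        [conditions[(row + col) % n] for col in range(n)]
        for row in range(n)
    ]


def order_for_participant(participant_index, conditions):
    """Assumed assignment rule: participants cycle through the rows in turn."""
    square = latin_square(conditions)
    return square[participant_index % len(conditions)]
```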

Based on the rigorously preprocessed and quality-controlled dataset of eye-tracking and preference ratings, descriptive statistical analyses were first conducted on the subjective preferences and eye-tracking metrics for each sample group to investigate the fundamental influence of design elements on users’ visual behavior and preferences.

Experimental results and discussion

Subjective preference score

The subjective preference scores of the different groups in the eye movement experiments are shown in Fig. 4, where “Sample1-5” denotes the sample group number. On the abscissa, S1-1 to S1-4 denote the mean preference for the individual elements within the Sample1 group.

Fig. 4.


Means of preferences for different sample groups.

In the saturation sample group, the average preference score for the medium-high saturation sample (S1-3) was significantly higher than for the other three samples (S1-3 > S1-2 > S1-4 > S1-1). Post-hoc tests showed that S1-3 scored significantly higher than the low-saturation sample (S1-1) (p < 0.001, Cohen’s d = 1.12, 95% CI [0.55, 1.68]) and also outperformed the high-saturation sample (S1-4) (p < 0.05, Cohen’s d = 0.65, 95% CI [0.10, 1.19]). According to Cohen’s guidelines (1988), these effect sizes indicate large and medium effects, respectively, suggesting that saturation’s impact on preference is not only statistically significant but also practically meaningful. In the lightness sample group, the highest preferences were for samples with medium to high lightness, particularly S2-3, whereas low and high lightness samples were less preferred. A clear preference gradient appeared in the hue sample group, with the mean preference for the cool color sample (S3-1) higher than that for warm and neutral colors. In the graphical proportion sample group, the pure text sample (S4-1) had the lowest mean preference, while the sample with a low image proportion had the highest, followed by the sample with a high image proportion. Lastly, in the morphology sample group, the more rounded and smooth type (S5-2) had a significantly higher mean preference than the others, with the sharp sample receiving the lowest preference score.
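For readers less familiar with the effect sizes quoted above, the sketch below computes Cohen’s d with a pooled standard deviation. The exact variant used in the study is not stated, and because all participants rated every sample a paired-samples variant may have been used instead, so this pooled form and the function name `cohens_d` are assumptions for illustration only.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(a, b):
    """Cohen's d for two sets of scores, using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2
    ) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)
```

Under Cohen’s conventional benchmarks, values near 0.5 are read as medium effects and values of 0.8 or more as large effects, which matches the interpretation given in the text.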

Eye-tracking metric

In this study, all time-based eye-tracking metrics, including TTFF, FD, FFD, and OL, were recorded and analyzed in seconds (s), and all subsequent statistical analyses are based on this unit. In eye-tracking data, a larger number and longer duration of fixations within an area of interest generally indicate that the target in that area is more interesting, more attention-grabbing, or more important to the observer. Two typical indicators, fixation count (FC) and fixation duration (FD), were therefore combined to analyze participants’ fixation behavior, as shown in Fig. 5.

Fig. 5.


Mean values of eye movement indices in different sample groups. (a) Saturation group, (b) Brightness group, (c) Hue group, (d) Graphical proportion group, (e) Morphology group.

Figure 5 illustrates the differences in the mean values of the two metrics, fixation count and fixation duration, across different sample groups. As shown, the area of interest with medium-high saturation (S1-3) exhibited the highest values for both fixation count and fixation duration. Both metrics followed the trend S1-3 > S1-2 > S1-4 > S1-1, indicating the highest level of participant interest in the medium-high saturation region and the weakest visual appeal for the low-saturation packaging design. Statistical tests further revealed that the fixation duration for the medium-high saturation area of interest (S1-3) was significantly greater than that for the low-saturation area (S1-1) (p < 0.001, Cohen’s d = 0.94, 95% CI [0.39, 1.49]). This large effect size indicates a substantial difference in the ability to attract and maintain user attention between these conditions. In the lightness group, the medium-high brightness regions of interest represented by S2-2 garnered the most attention from the subjects. Unlike the trend observed in the saturation group, the mean values for the low and high brightness regions in the lightness group did not differ significantly on either indicator, and both attracted only a low level of attention. In the hue sample group, the decreasing gradient S3-1 > S3-2 > S3-3 indicates that the cool color series performed best in capturing user attention and interest. In the graphical proportion group, the ordering S4-2 > S4-3 > S4-1 indicates that the area of interest with a lower image proportion attracted the most attention, whereas the purely textual decoration design was poorly received by users. Finally, in the morphology sample group, the ordering S5-2 > S5-3 > S5-1 > S5-4 indicates that the rounded, smooth morphology (S5-2) attracted the most attention, while both overly rounded designs and excessively sharp packaging reduced user favorability.

Eye movement heatmap

All eye-tracking heatmaps in this study were generated using a group-aggregated visualization method based on the fixation data from all 30 participants under corresponding experimental conditions. The specific procedure was as follows: the standard heatmap generation function of professional eye-tracking analysis software was employed. Each valid fixation point for each participant was modeled as a two-dimensional Gaussian kernel parameterized by its spatial coordinates and duration. These kernels were then summed and normalized to produce a heatmap reflecting the relative density of visual attention distribution across the participant group. In the resulting heatmaps, areas in warm colors (e.g., red, yellow) indicate regions with the highest cumulative fixation duration and attentional focus. In contrast, areas in cool colors (e.g., green) indicate regions receiving less fixation. This visualization enables intuitive observation of the degree of user attention allocated to different AOIs44. In relevant research on eye-tracking heatmaps, it has been noted that using fixation duration as the metric for heatmap generation is more effective45, as it reflects greater information and value than a user’s fleeting glance.
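The kernel-summation idea described above can be sketched in a few lines. This is not the algorithm of the eye-tracking software, whose internals are not described here; the kernel width `sigma`, the choice to weight each kernel by fixation duration, the grid resolution, and the function name are all assumptions made for illustration.

```python
from math import exp


def fixation_heatmap(fixations, width, height, sigma=1.5):
    """Sum duration-weighted 2D Gaussian kernels and scale the peak to 1.

    `fixations` is an iterable of (x, y, duration_s) in grid coordinates.
    Returns a list of `height` rows, each with `width` values in [0, 1].
    """
    grid = [[0.0] * width for _ in range(height)]
    for fx, fy, duration in fixations:
        for y in range(height):
            for x in range(width):
                dist2 = (x - fx) ** 2 + (y - fy) ** 2
                # Assumption: kernel amplitude is proportional to duration.
                grid[y][x] += duration * exp(-dist2 / (2 * sigma ** 2))
    peak = max(max(row) for row in grid)
    if peak > 0:
        grid = [[value / peak for value in row] for row in grid]
    return grid
```

Pooling the fixations of all participants into one call gives a group-aggregated map; mapping values near 1 to warm colors and values near 0 to cool colors reproduces the color convention described in the text.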

As shown in Fig. 6, the medium- and medium-high-saturation sample groups attracted greater visual attention from participants, a pattern consistent with the results observed for brightness. When combined with participants’ post-test subjective ratings, these findings suggest that medium-to-high brightness levels are more likely to be preferred in the colour design of personal care packaging. In the hue group, a noticeable difference in attention was observed between the cool-tone and neutral-tone areas of interest. Within the image–text ratio group, the low-image-ratio design received the highest degree of fixation, with gaze concentrated primarily on the key graphic components of the area of interest. In the morphology group, the rounded-smooth shape attracted the most attention, whereas fixation levels for the other morphological variations showed minimal differentiation. Overall, the attention patterns revealed by the eye-tracking heatmaps align with the quantitative eye movement metrics, thereby providing convergent support for the interpretation of the eye-tracking data. The heatmaps offer an effective visual representation of fixation density and duration, complementing the numerical eye-tracking indices.

Fig. 6.


Eye movement hotspots of different sample groups (Heat maps were generated using Tobii Pro Lab software, version 1.116). (a) Saturation group eye movement hotspot, (b) Brightness group eye movement hotspot, (c) Hue group eye movement hotspot, (d) Graphical proportion group eye movement hotspot, (e) Modeling morphology group eye movement hotspot.

The distribution of heatmap intensity reflects participants’ attentional preferences during viewing and offers indirect insight into their immediate cognitive processing tendencies. Fixation hotspots were predominantly clustered in the decorative and primary graphic regions of the packaging, indicating that during the initial viewing phase, participants’ cognitive resources were directed mainly toward overall aesthetic impression, atmosphere perception, and the extraction of key visual information. In contrast, textual areas—such as ingredient lists and functional descriptions—received relatively limited fixation. This suggests that in brief “first-glance” encounters typical of fast-moving retail environments, the visual impact of packaging and the immediate brand impression it conveys play a more decisive role in capturing consumer attention and shaping initial preference than does detailed textual content. This observation is consistent with the rapid decision-making processes that characterize modern consumer behaviour.

Discussion

A comprehensive analysis of the eye-tracking metrics and subjective user-preference data from this experiment revealed systematic differences in participants’ preferences for the various shampoo packaging design elements under the study’s specific conditions. Based on data from the tested samples and the Chinese university student cohort, the results indicate that, within the context of shampoo packaging appearance design, the following combination of factors exhibited relatively higher attractiveness: medium–high saturation, medium–high brightness, cool tones, a low image-to-text ratio, and relatively rounded morphologies. Correspondingly, designs may benefit from avoiding excessively low or high saturation/brightness, neutral colour tones, purely textual decoration, and sharp morphological forms. It is emphasized that these observations were obtained under controlled conditions in which variables such as layout and typography were kept constant, thereby isolating fundamental visual elements. Their value lies in offering preliminary, trend-based evidence to inform subsequent design exploration in similar contexts.

These systematic preference patterns can be further interpreted through colour psychology and emotional design theory. First, the inverted U-shaped preference curve for saturation (medium–high > high > low) reflects the ‘optimal arousal level’ principle in visual perception. Excessively low saturation lacks visual impact and fails to attract attention, whereas overly high saturation can appear glaring or unnatural, causing discomfort. Medium–high saturation strikes a balance between attentional capture and visual comfort.

Second, the significantly higher preference scores and visual attention directed towards cool tones (e.g., blue hues) align with the “cool colour effect” described in colour psychology. In packaging design, cool tones are commonly associated with cleanliness, freshness, rationality, and trustworthiness, attributes that resonate with the core values intended for personal care products and likely elicited more positive emotional reactions within this experimental setting. It should be noted, however, that colour preference can be strongly influenced by cultural background; thus, these findings are primarily applicable to the specific group studied.

Finally, the pronounced preference for rounded, smooth forms in the morphology group can be understood through a dual theoretical lens. From an emotional design perspective, rounded morphologies operate at the visceral level, directly evoking impressions of safety, softness, and approachability. From a cognitive psychology standpoint, processing fluency theory suggests that continuous, smooth curves require less cognitive effort to process than sharp angles, and this fluency itself enhances aesthetic experience. This theory further provides a key theoretical perspective for understanding the consistent association observed in this study between longer fixation durations and higher subjective preference ratings. It posits that, during rapid visual evaluation under low cognitive load, design elements that are more visually harmonious and easier to process, such as rounded morphologies or cool tones with positive semantic associations, likely lead to higher processing fluency. This enhanced fluency can directly induce positive affective experiences and preference. Consequently, longer fixations may reflect deeper, more pleasurable visual exploration facilitated by high fluency, thereby operationally linking eye movement behavioral metrics to intrinsic affective experience. This explains why sharp morphological lines tend to be perceived as cold or threatening and are generally less preferred.

Comparing the mean trends of the eye-tracking metrics—fixation count and fixation duration—across the five sample groups with participants’ post-experiment subjective ratings revealed a consistent pattern. This indicates that eye-tracking metrics can reflect users’ emotional attitudes and preferences toward visual stimuli with reasonable accuracy. However, it is crucial to cautiously distinguish between “visual attention” and “emotional preference.” Longer fixations may also stem from cognitive confusion or aversion. In this specific study context, interpreting the observed eye-tracking patterns as indicators of “positive preference” was supported primarily by three lines of evidence. First, the nature of the task, free viewing of familiar FMCG (shampoo) packaging, entailed relatively low cognitive load and informational complexity, reducing the likelihood that “confusion” was the dominant explanation. Second, a high degree of concordance between behavior and report provided the core empirical evidence: the eye-tracking data and post-test subjective preference ratings showed a highly consistent synergistic effect. A stable finding was that designs performing better on eye-tracking metrics (e.g., shorter TTFF, longer FD) consistently received significantly higher subjective preference scores. This correlation between objective physiological response and subjective report provided key support for linking these specific eye-tracking patterns to positive preference in this context. Furthermore, this interpretation aligns with established theoretical predictions, as the best-performing design elements (cool tones, rounded forms) correspond with classic findings known to elicit positive affect. Therefore, this study confirms the high feasibility of combining eye-tracking measurement with subjective assessment for packaging design evaluation. 
The design preferences derived from the experimental data and theoretical analysis offer theoretically grounded, practical guidance for personal care packaging design. It should be clarified that the multiple linear regression model adopted here aimed to establish a transparent, interpretable baseline; therefore, interaction terms between metrics were not included. Investigating potential interactions among different eye-tracking metrics (e.g., between very short TTFF and very long FFD) represents a valuable direction for future research. Furthermore, individual differences (e.g., in aesthetic taste or brand preference) exist and likely account for part of the variance unexplained by the model. The Leave-One-Subject-Out Cross-Validation (LOSOCV) results, however, indicate that the model maintained robust predictive performance even when confronted with unobserved individual differences.

In summary, under the conditions of this study, which involved an initial, visually driven evaluation of familiar FMCG products in a low-cognitive-load task, specific eye-tracking patterns (faster attentional capture, longer early and sustained fixations) effectively predicted subjective preference. This relationship is supported by processing fluency theory and validated through multiple lines of evidence, including the nature of the task, the concordance between behavioral and self-reported data, and theoretical consistency. In research contexts involving highly complex products, deep textual reading, or high-stakes decisions, the relationship between eye-tracking metrics and preference may become more complicated or even reversed. Therefore, the interpretation of eye-tracking data must be closely tied to the research task, product type, and experimental context.

Following the systematic investigation of single-element effects, a second experiment was conducted to develop a quantitative model to predict overall preference, with the aim of examining the relationship between multidimensional eye-tracking metrics and preference for integrated packaging designs.

Eye movement test and preference analysis of comprehensive packaging design elements

Eye-tracking technology provides indicators that help researchers understand how a user’s attention is distributed across the elements of a stimulus, as well as the user’s decision-making and cognitive processes. Such indicators can be used to assess whether a product’s visual design aligns with business objectives and whether its layout meets user expectations. Studies have also shown that changes in pupil diameter can reflect cognitive and emotional shifts. Thus, eye-tracking metrics can reveal users’ emotional responses and, to some extent, reflect their preferences toward visual stimuli.

This section investigates the relationship between multi-index eye-tracking data and user preference through eye-tracking tests, constructing a predictive model using multiple linear regression.

Experimental samples

The sample for this experiment is a single group of five different shampoo packaging appearance design schemes. Each scheme was defined as a separate area of interest (AOI), denoted AOI1, AOI2, AOI3, AOI4, and AOI5 from left to right, in order to study the distribution of participants’ fixations across the five areas (Fig. 7). The packaging shown is an abstract schematic representation created for this eye-tracking study, and the forms depicted are generic representations of common types. The figure has been redrawn to remove all brand identifiers and is used solely for the analysis of the design variables in this research. The five schemes vary in design factors such as shape, decoration, and color and differ in overall style, providing a sufficient basis for the later evaluation of differences in eye movement indicators and preferences.

Fig. 7.


Sample group of comprehensive elements. (a) AOI1, (b) AOI2, (c) AOI3, (d) AOI4, (e) AOI5.

Experimental results and analysis

Preference score

Following the eye movement test, the 30 participants completed a preference feedback form for the five packaging schemes using a 5-point Likert scale. The mean preference scores for the five areas of interest are shown in Fig. 8 and follow the order AOI5 > AOI2 > AOI4 > AOI3 > AOI1. Participants most preferred the fifth design scheme, which uses a transparent material in a cool tone and creates a clear, clean, and fresh visual and psychological impression. In contrast, Scheme 1, which had the lowest mean preference score, used a large area of dark black combined with red, creating a sense of dreariness and insecurity. The fourth scheme has a rounded, smooth morphology, and its transparent material also creates a clear atmosphere; however, the sense of lightness and cleanliness expected of shampoo packaging is not as strong as in the second scheme. In the third scheme, the color brightness of the package is low, and its large apparent volume gives an impression of heaviness.

Fig. 8.


Preference degree of the comprehensive element sample group.

Eye-tracking metric data

The eye movement indicators that effectively reflect the user’s psychology and attitude in the area of interest primarily include the number of fixations, pupil size, fixation duration, duration of the first fixation, first fixation time, saccadic time, number of observation points, and length of observation time.

As shown in Fig. 9, the means of first fixation duration (FFD), fixation duration (FD), and fixation count (FC) all followed the same trend: AOI5 > AOI2 > AOI4 > AOI3 > AOI1. AOI5 received the highest values on all three metrics. Consistent with the conclusions of the single-element eye movement test, AOI5 was the most attractive to users, while participants showed less interest in AOI1 and AOI3.

Fig. 9.


Eye movement indicators of the comprehensive element sample group.

However, the time to first fixation (TTFF) shows the opposite trend: AOI1 > AOI3 > AOI4 > AOI2 > AOI5. This is because TTFF indicates how long it takes for the user to first notice the area of interest. The shorter the time to first fixation, the more likely the area is to attract participants’ attention immediately.

Eye movement heatmap

As shown in Fig. 10, the heatmap of the multi-scheme sample group indicates that AOI5 has the strongest hotspot effect, marked by intense red patches. This suggests that this area received the highest fixation count and the longest fixation duration. The red patches are located primarily in the decorative design area of the packaging, showing that users’ cognitive focus is mainly on interpreting the ornamental design. The hotspots of the other areas of interest are also concentrated in the graphical decoration of the packaging, highlighting that, among the packaging’s visual features, users are most attracted to the decorative elements.

Fig. 10.


Hot spot map of the comprehensive element sample group (Heat maps were generated using Tobii Pro Lab software, version 1.116).

Based on the integrated analysis of the aforementioned eye-tracking metrics and heatmap visualizations, the different packaging design schemes exhibit systematic variation across multiple dimensions of visual attention, including initial attraction and sustained engagement. To investigate how these multidimensional eye-tracking metrics collectively predict users’ subjective preferences and to establish a quantitative predictive relationship, a model relating eye-tracking indicators to preferences will be constructed next.

Model of the relationship between eye movement indicators and preference

Determine the indicators

A two-stage indicator screening procedure was implemented prior to including eye-tracking metrics in the multiple linear regression model to ensure robustness and parsimony.

First, homogeneity-of-variance tests were conducted using the packaging design solutions as the grouping variable. These tests verified whether each eye-tracking metric met the prerequisite assumptions for subsequent parametric analyses. The results are presented in Table 2. The FB metric had a significance value below 0.05, indicating heteroscedasticity and failure to meet parametric test assumptions; this metric was therefore excluded from subsequent analyses. The remaining eye-tracking metrics (OC, FD, OL, FFD, TTFF) all had p-values greater than 0.05, confirming homogeneity of variances across groups, and passed the initial screening. This step eliminated metrics that did not meet the fundamental prerequisites for analysis.

Table 2.

Eye-tracking metric homogeneity test.

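| Eye-tracking metric | Levene statistic | df1 | df2 | Sig. |
|---|---|---|---|---|
| OC | 1.025 | 15 | 100 | 0.364 |
| FD | 1.785 | 15 | 100 | 0.068 |
| FB | 2.815 | 15 | 100 | 0.029 |
| OL | 0.419 | 15 | 100 | 0.203 |
| FFD | 1.391 | 15 | 100 | 0.214 |
| TTFF | 0.537 | 15 | 100 | 0.664 |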
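The Levene statistic in Table 2 can be understood as a one-way ANOVA F ratio applied to absolute deviations from each group’s center. The sketch below computes that statistic for illustration. Centering on the group mean (rather than the median) and the function name `levene_w` are assumptions, and the p-value lookup against the F distribution is omitted.

```python
from statistics import mean


def levene_w(groups):
    """Levene's W: an F ratio computed on absolute deviations from group means."""
    # Absolute deviation of every observation from its own group mean.
    z = [[abs(v - mean(g)) for v in g] for g in groups]
    k = len(z)
    n_total = sum(len(g) for g in z)
    z_grand = mean([v for g in z for v in g])
    between = sum(len(g) * (mean(g) - z_grand) ** 2 for g in z)
    within = sum((v - mean(g)) ** 2 for g in z for v in g)
    return (n_total - k) / (k - 1) * between / within
```

A large W relative to the F distribution with (k − 1, N − k) degrees of freedom gives a small significance value, which is the criterion used above to exclude FB.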

Subsequently, a one-way analysis of variance (ANOVA) was performed to identify preliminary predictors. The five metrics that passed the homogeneity-of-variance test were analyzed using a one-way ANOVA, with packaging design preference scores as the dependent variable. This analysis aimed to determine whether significant differences existed in these metrics across packaging designs with different preference levels, thus evaluating their potential association with user preference. The results in Table 3 showed that FD, OL, FFD, and TTFF had significance values below 0.05, indicating statistically significant associations with user preference. These metrics were therefore selected as candidate independent variables for building the regression model. In contrast, the OC metric had a significance value greater than 0.05, indicating no significant association with preference, and was consequently omitted from further regression analysis. This step provided the key rationale for selecting the final four core metrics.

Table 3.

Single-factor ANOVA of eye movement indexes.

| Eye movement index | F | Sig. |
|---|---|---|
| OC | 0.625 | 0.581 |
| FD | 3.185 | 0.018 |
| OL | 2.517 | 0.021 |
| FFD | 1.691 | 0.037 |
| TTFF | 2.237 | 0.029 |
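The F values in Table 3 are ratios of between-group to within-group variance. A minimal sketch of that calculation is given below, assuming that the values of one metric are grouped by packaging design; the grouping, the function name `one_way_anova_f`, and the omission of the p-value lookup are simplifications made for the example.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F ratio of a one-way ANOVA for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([v for g in groups for v in g])
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))
```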

In summary, a two-step screening process involving tests for homogeneity of variance and one-way analysis of variance (ANOVA) identified four core eye-tracking metrics: FD, OL, TTFF, and FFD. This procedure ensured that the final set of metrics used to construct the E-Dop model satisfied the fundamental assumptions of statistical testing and provided preliminary evidence of a significant correlation with the dependent variable (user preference), thereby establishing a foundation for a robust multivariate predictive model.

Multiple linear regression

To construct the E-Dop predictive model, multiple linear regression was employed as the core analytical method. During this exploratory model-building phase, multiple linear regression offers distinct advantages due to the transparency of its parameter estimates and the strong interpretability of its results. It enables direct testing of theoretical relationships between selected eye-tracking metrics and user preference within an intuitive framework, thereby providing a clear, comparable benchmark for the development of more complex models. While acknowledging that more sophisticated modeling techniques (e.g., machine learning algorithms) represent a natural direction for future model evolution, establishing a robust, interpretable multiple linear regression baseline model at this stage is essential for validating the core methodological hypothesis—that multidimensional eye-tracking metrics can systematically predict subjective preference.

A standard multiple linear regression was conducted by entering all independent variables into the model. As shown in Table 4, the adjusted R2 value was 0.702. This indicates that, under the controlled conditions and with the homogeneous sample of this study, the linear model explains over 70% of the variance in preference scores. In behavioral science and user experience research, especially for models involving complex human preferences, this level of explained variance is generally regarded as good to excellent explanatory power, providing strong initial evidence for the predictive effectiveness of the proposed model. Another notable statistic in the table is the Durbin-Watson (DW) value of 1.691. The DW statistic is used in regression analysis to detect serial correlation (autocorrelation) in the residuals; when serial correlation is present, the regression may be spurious, which would undermine the credibility of the results. The closeness of the observed value to the ideal value of 2 indicates the absence of substantial serial correlation, thereby satisfying a key assumption of the regression model.

Table 4.

Model summary table.

| Model | R | R-squared | Adjusted R-squared | Standard error of the estimate | Durbin-Watson |
|---|---|---|---|---|---|
| 1 | 0.840a | 0.706 | 0.702 | 1.8143 | 1.691 |
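The Durbin-Watson statistic reported in Table 4 is computed from the ordered regression residuals. A minimal sketch, with the function name `durbin_watson` chosen for this example:

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: values near 2 suggest no first-order autocorrelation."""
    # Sum of squared differences between consecutive residuals.
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator
```

The statistic ranges from 0 to 4: values well below 2 point to positive serial correlation and values well above 2 to negative serial correlation.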

The ANOVA results are shown in Table 5. The null hypothesis of this analysis of variance is that none of the independent variables has a significant effect on the dependent variable. The reported significance level is 0.000 (i.e., p < 0.001), well below 0.05, meaning that an F statistic this large would be very unlikely if the null hypothesis were true. Therefore, the null hypothesis can be rejected, indicating that at least one of the four eye-tracking metrics exerts a statistically significant influence on preference scores.

Table 5.

ANOVA of eye movement indexes.

| Model | Sum of squares | df | Mean square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 19.787 | 4 | 4.972 | 150.971 | 0.000b |
| Residuals | 8.362 | 251 | 0.033 | | |
| Total | 28.143 | 255 | | | |

Regression coefficients are presented in Table 6. The coefficients for all four eye-tracking metrics were statistically significant (Sig. < 0.05). The constant in the regression equation was 0.536. The coefficients for the eye-tracking metrics FD, OL, TTFF, and FFD were 0.307, 0.662, −0.047, and 0.385, respectively. To assess potential multicollinearity and ensure the stability of the coefficient estimates, Variance Inflation Factor (VIF) values were calculated. The VIF values for the predictor variables were as follows: FD (VIF = 2.19), OL (VIF = 2.07), TTFF (VIF = 1.06), and FFD (VIF = 1.23). All VIF values were well below the commonly used threshold of 10 and also below the more conservative threshold of 5, indicating the absence of severe multicollinearity and supporting the stability of the coefficient estimates.

Table 6.

Coefficient table.

| Model | Unstandardized coefficient (B) | Standard error | Sig. |
|---|---|---|---|
| (constant) | 0.536 | 0.128 | 0.002 |
| FD (K1) | 0.307 | 0.023 | 0.000 |
| OL (K2) | 0.662 | 0.038 | 0.001 |
| TTFF (K3) | −0.047 | 0.015 | 0.000 |
| FFD (K4) | 0.385 | 0.013 | 0.003 |

*The time-based predictor variables in this table—FD, OL, TTFF, and FFD—are all measured in seconds (s). The unstandardized coefficients (B) represent the change in the preference score (E-Dop) associated with a one-second shift in the respective variable.
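The VIF values reported above can be read back as the share of each predictor’s variance that is explained by the other predictors. The relation below is the standard definition of the VIF; the $R_j^2$ values are back-calculated here from the reported VIFs purely for illustration and are not figures given by the authors.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
\quad\Longrightarrow\quad
R_j^2 = 1 - \frac{1}{\mathrm{VIF}_j}

% Back-calculated from the reported VIF values (illustrative only):
R_{\mathrm{FD}}^2 = 1 - \frac{1}{2.19} \approx 0.54, \qquad
R_{\mathrm{TTFF}}^2 = 1 - \frac{1}{1.06} \approx 0.06
```

Read this way, roughly half of the variance in FD is shared with the other three predictors, whereas TTFF is almost independent of them.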

Based on the above, the multiple linear regression equation for predicting subjective preference from eye-tracking metrics is constructed as follows: E-Dop = 0.536 + 0.307K1 + 0.662K2 − 0.047K3 + 0.385K4, where K1, K2, K3, and K4 denote FD, OL, TTFF, and FFD, respectively (Table 6). The constant in the equation represents the model’s baseline prediction. The positive coefficients for FD, OL, and FFD align with the classic cognitive hypothesis of “attention as interest,” indicating that longer fixation duration, observation length, and first fixation duration are significantly associated with higher subjective preference. The significantly negative coefficient for TTFF provides empirical confirmation of the theoretical importance of “first-glance attraction,” whereby shorter latency to first fixation predicts a higher preference score.
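The regression equation can be applied directly to a set of AOI metrics. The sketch below uses the coefficients from Table 6; the function name `predict_e_dop` and the input values in the usage note are hypothetical and chosen only to show the arithmetic.

```python
# Coefficients reported in Table 6 (all predictors in seconds).
E_DOP_COEFFICIENTS = {
    "const": 0.536,
    "FD": 0.307,     # K1: fixation duration
    "OL": 0.662,     # K2: observation length
    "TTFF": -0.047,  # K3: time to first fixation
    "FFD": 0.385,    # K4: first fixation duration
}


def predict_e_dop(fd, ol, ttff, ffd):
    """Predicted preference score from the four metrics, in seconds."""
    c = E_DOP_COEFFICIENTS
    return (
        c["const"]
        + c["FD"] * fd
        + c["OL"] * ol
        + c["TTFF"] * ttff
        + c["FFD"] * ffd
    )
```

For example, hypothetical inputs of FD = 1.0 s, OL = 2.0 s, TTFF = 0.5 s, and FFD = 0.3 s give 0.536 + 0.307 + 1.324 − 0.0235 + 0.1155 ≈ 2.26. As with any linear model, predictions for inputs outside the range observed in the study should be treated with caution.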

To provide a more comprehensive evaluation of the model’s overall explanatory power and the relative importance of each predictor, the following supplementary analyses were conducted. First, the model’s overall effect size was calculated (Cohen’s f2 = 2.36). According to Cohen’s (1988) guidelines (where f2 ≥ 0.35 indicates a large effect size), this model demonstrates a large effect, confirming from a standardized perspective that the set of eye-tracking metrics explains a substantial, practically meaningful proportion of variance in preference. Second, to quantify the unique explanatory contribution of each variable after controlling for the others, the squared semi-partial correlation coefficients (sr2) were computed. These represent the proportion of variance in preference scores uniquely explained by each variable. The results are as follows: FFD (sr2 = 0.23) and OL (sr2 = 0.16) contributed the largest shares of unique variance, followed by FD (sr2 = 0.12), while TTFF made a smaller unique contribution (sr2 = 0.01). This finding suggests that, within the model, “early deep processing” (as reflected by FFD) and “sustained engagement” (as jointly reflected by OL and FD) constitute the core cognitive processes driving user preference. Given that all predictors in this study share the same unit of measurement (seconds), the magnitude of the unstandardized coefficients (B) directly reflects the comparative predictive strength of each variable on the preference score on this common time scale.
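For readers who wish to check the effect-size arithmetic, a short worked calculation follows. It assumes that the reported value of 0.702 (the adjusted R2 given in the abstract) was the quantity entered into Cohen's formula; the source does not state this explicitly. The second line gives the standard definition of the squared semi-partial correlation as the drop in R2 when predictor j is removed from the full model.

```latex
\[
f^2 = \frac{R^2}{1 - R^2} = \frac{0.702}{1 - 0.702} = \frac{0.702}{0.298} \approx 2.36
\]
\[
sr_j^2 = R^2_{\text{full}} - R^2_{(-j)}, \qquad
\sum_j sr_j^2 = 0.23 + 0.16 + 0.12 + 0.01 = 0.52
\]
```

Under the same assumption, the gap between 0.702 and the summed unique contributions (0.52) would correspond to variance that the predictors explain jointly rather than uniquely.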

Integrating these multidimensional analyses, the model, as an exploratory outcome, effectively combines eye-tracking metrics representing “sustained interest” (FD, OL, FFD) and “initial attraction” (TTFF). It thereby provides a quantifiable preliminary benchmark for understanding and predicting users’ visual preferences under the specific experimental conditions of this study.

Model validation

To rigorously evaluate the predictive efficacy and potential generalizability of the E-Dop model to new, unseen consumers, this study employed Leave-One-Subject-Out Cross-Validation (LOSOCV) as the primary validation framework. This method simulates an idealized scenario in which the model encounters a single new user. In each iteration, one of the 30 participants was held out as an independent test set. The model, with its predefined form and variables, was retrained on data from the remaining 29 participants (with coefficients re-estimated) and then used to predict the preference score of the participant left out. This process was strictly repeated 30 times, ensuring that each participant’s predicted value was generated by a model that had not been trained on their own data. This procedure yields a set of quantitative results specifically for assessing the model’s ability to predict new individuals.
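The procedure described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: it assumes each participant contributes several rows (one per viewed sample), uses ordinary least squares via NumPy, and runs on synthetic data. The helper names `fit_ols`, `predict`, and `losocv` are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns [intercept, b1, ..., bp]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    return coef[0] + X @ coef[1:]

def losocv(X, y, subjects):
    """Leave-one-subject-out predictions.

    Every row is predicted by a model re-estimated without any row
    belonging to that row's subject.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    subjects = np.asarray(subjects)
    preds = np.empty_like(y)
    for s in np.unique(subjects):
        held_out = subjects == s
        coef = fit_ols(X[~held_out], y[~held_out])
        preds[held_out] = predict(coef, X[held_out])
    return preds

# Synthetic stand-in data: 30 subjects x 6 stimuli, 4 metrics (all values hypothetical).
rng = np.random.default_rng(1)
n_subjects, n_items = 30, 6
subjects = np.repeat(np.arange(n_subjects), n_items)
X = rng.uniform(0.2, 3.0, size=(n_subjects * n_items, 4))
y = 0.5 + X @ np.array([0.3, 0.6, -0.05, 0.4]) + rng.normal(0.0, 0.2, len(X))

preds = losocv(X, y, subjects)
r = np.corrcoef(preds, y)[0, 1]
mae = np.mean(np.abs(preds - y))
print(f"r = {r:.2f}, MAE = {mae:.2f}")
```

Splitting by subject rather than by row is the key design choice: it prevents a participant's own responses from leaking into the model that predicts them.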

The LOSOCV results provide strong evidence of the model’s predictive consistency within the study’s sample. Analysis revealed a statistically significant correlation between the model’s predicted values and the actual observed values, with a Pearson correlation coefficient of r = 0.84 (95% CI [0.69, 0.92], p < 0.001). The mean absolute error (MAE) of the predictions was 0.24 (95% CI [0.21, 0.28]). Analysis of the prediction error distribution indicated an absence of systematic bias. These results primarily indicate that the model demonstrates good stability within the current sample framework. The “eye-tracking metrics–preference” relationship it captures is not merely an overfitting to specific individual data but can generalize to new individuals within this homogeneous cohort. A scatter plot of predicted versus actual values (Fig. 11) visually demonstrates the linear association between them.
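As a consistency check on the reported interval, the sketch below computes an approximate 95% confidence interval for a Pearson correlation using the Fisher z-transformation. That the authors used this method, and that the correlation was computed over n = 30 prediction-observation pairs, are assumptions; the source does not say how the interval was obtained. The function name `pearson_ci` is hypothetical.

```python
import math

def pearson_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a Pearson r via the Fisher z-transformation."""
    z = math.atanh(r)                 # Fisher z of the observed correlation
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

lo, hi = pearson_ci(0.84, 30)
print(round(lo, 2), round(hi, 2))
```

With r = 0.84 and n = 30, the bounds round to 0.69 and 0.92, which agrees with the interval reported above under these assumptions.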

Fig. 11.

Comparison of predicted and actual values based on LOSOCV.

It must be noted that the evaluation mentioned above was conducted in a controlled laboratory environment and based on a sample with highly similar demographic and educational backgrounds (Chinese university students). Therefore, the favorable performance indicated by the LOSOCV should primarily be interpreted as evidence of reliable “internal validity” and “stability” of the model within the current homogeneous study cohort, providing initial support for the feasibility and predictive potential of the E-Dop methodological framework. Concurrently, it must be explicitly stated that the homogeneity of the sample may have reduced inter-subject variability in preferences. The current results primarily validate the model’s effectiveness within specific boundary conditions and do not equate to possessing generalizable predictive power for broad, heterogeneous external consumer populations.

In summary, using Leave-One-Subject-Out Cross-Validation, this study obtained evidence supporting the model’s accuracy, error range, and statistical significance in predicting new individuals similar to those in the training sample. The results suggest that the predictive relationship established by the E-Dop model under the defined conditions of this study exhibits good robustness without signs of significant overfitting. This provides a solid preliminary empirical foundation for developing the E-Dop model into a prototypical packaging design preference assessment tool that warrants further validation and calibration in broader, more diverse populations. The external validity and generalizability of the model to more diverse consumer groups represent a key direction for future research.

E-Dop model: a three-stage interpretive framework for visual preference formation

Following the validation of the E-Dop model’s predictive robustness via LOSOCV, the focus of this study naturally shifts to exploring its underlying cognitive mechanisms. The primary contribution of the constructed E-Dop model lies in providing a quantitative predictive tool. Meanwhile, the pattern revealed by the model offers an integrative and heuristic perspective for understanding the potential cognitive mechanisms through which packaging design influences consumer preference. Specifically, the finding that four eye-tracking metrics (TTFF, FFD, FD, OL) with distinct cognitive implications can systematically predict preference underpins this perspective. Its essence is an attempt to build an interpretable theoretical bridge between objective visual behavior and subjective emotional preference.

This section aims to elucidate the cognitive stages to which these metrics may map, based on empirical findings and in conjunction with emotional design theory and cognitive psychology. It attempts to construct an explanatory theoretical framework linking the key design elements identified in “Eye-tracking test and preference analysis of a single packaging design element” (e.g., medium-high saturation, cool tones, rounded morphology) to the final formation of preference. Building on this, a “Three-Stage Visual Preference Formation Pathway” (Fig. 12) is proposed. It is crucial to explicitly state that this framework is an exploratory theoretical construct derived from the study’s data patterns, intended to integrate and interpret existing findings. It attempts to explain how a set of objective, time-sequenced metrics maps onto a coherent process of subjective experience formation. It is therefore cautiously positioned as an explanatory framework; the exact causal pathways within it await rigorous future verification through methods such as mediation analysis or structural equation modeling. Nonetheless, it provides a dynamic, temporally sequenced conceptual lens for understanding the potentially distinct roles different eye-tracking metrics play in the preference formation process and how design elements may exert their influence in stages.

Fig. 12.

A three-stage interpretive framework for visual preference formation.

Stage 1: Initial Attraction and Perceptual Salience (Represented by TTFF). A comprehensive theoretical perspective is required for interpreting the negative coefficient of Time to First Fixation (TTFF). The dual-pathway theory of visual cognition posits that attentional capture can stem from stimulus-driven, bottom-up “visual salience” or be subject to goal-driven, top-down “cognitive modulation” by the observer’s expectations and internal preferences. In this study, packaging designs achieving shorter TTFF (e.g., those with medium-high saturation and cool tones) also received significantly higher subjective preference ratings and longer subsequent fixation measures (FFD, FD, OL). This coherent behavioral pattern, “rapid attraction, deep processing, high subjective preference”, suggests that in the specific context of packaging design evaluation, TTFF may capture not merely an isolated, contrast-driven “obtrusiveness,” but more likely an “effective initial attraction” aligned with subsequent positive cognitive appraisal. In other words, the visual properties of preferred design elements (e.g., the “clean,” “fresh” semantics associated with cool tones) likely resonated with participants’ internal positive schemas for shampoo products. This caused attentional capture to occur not only due to “salience” but also due to “desirability,” triggering a rapid attentional orienting that was “licensed” or “enhanced” by preliminary positive cognitive assessment. Operationally, within the context of the E-Dop model, TTFF is therefore interpreted as a quantifiable behavioral signal of effective initial attraction. It serves as an objectively measured marker that reflects a design’s successful passage through the preliminary screening of a user’s internal positive schema. Functioning as a predictive behavioral marker, it identifies designs capable of swiftly passing early cognitive filters and initiating subsequent positive evaluation processes. 
Although its absolute contribution is modest, its statistical significance and theoretical coherence establish it as a crucial and necessary starting point in the predictive chain.

Stage 2: Early Interest and Information Decoding (Represented by FFD). First Fixation Duration (FFD) is associated with early cognitive processing immediately following successful attentional capture. Its significant positive unstandardized coefficient (B = 0.385, p < 0.01) and unique contribution indicate that a design element’s efficacy lies in stimulating cognitive interest and initiating preliminary information decoding at the first encounter. This stage marks the transition from passive attentional capture to active cognitive exploration. For instance, when a “cool-toned” package successfully captures attention, its associated positive semantic connotations may encourage deeper information decoding during the first fixation, effectively “locking” the gaze and manifesting as a longer FFD. An extended FFD may indicate that the design sparked sufficient cognitive interest and preliminary positive appraisal from the outset, laying the groundwork for deeper subsequent evaluation. This finding inspires the hypothesis that a prolonged FFD could represent a critical juncture where objective fixation behavior is closely coupled with early positive subjective appraisal. The quality of early, deep processing may carry significant weight in preference formation. A sufficiently long and engaged first fixation might signify that a design successfully triggered profound information decoding and positive affective evaluation in the initial moment. This “first impression” effect may contribute substantially to the formation of final preference.

Stage 3: Sustained Engagement and Interest Maintenance (Jointly Represented by FD and OL). FD and OL are more closely associated with sustained cognitive evaluation and affective engagement, reflecting the user’s depth of exploration and the design’s capacity to sustain interest. In this stage, factors such as the high processing fluency and pleasure potentially afforded by “rounded morphologies,” combined with the overall harmony of the visual layout, support continued viewing, evidenced by longer FD and OL (positive coefficients of 0.307 and 0.662, respectively). These two metrics, with OL having the largest unstandardized coefficient, strongly support the classic view in cognitive psychology that “the allocation of attentional resources is a core manifestation of interest and preference intensity”, constituting the primary evidence for predicting high preference. As cumulative objective measures of engagement, FD and OL directly quantify this sustained, positive cognitive and affective involvement. Complementing the “early processing depth” represented by FFD, FD and OL together serve as cumulative indicators of “overall engagement volume”, reflecting whether a design, after making a good first impression, continues to offer visually explorable content and a pleasurable experience that sustains user interest, thereby solidifying a positive initial reaction into a stable preference.

The proposed “Three-Stage” framework integrates eye-tracking metrics with distinct cognitive-temporal meanings into a coherent narrative of preference formation. At its core, this narrative systematically maps discrete objective behavioral measurements onto the dynamic process of subjective preference formation. This not only provides a potential theoretical annotation for the predictive mechanism of the E-Dop model but also offers a phased diagnostic logic for design optimization. Based on the data and theory presented, one interpretation of the E-Dop model is as follows: a shorter TTFF can be interpreted as efficiently initiating the evaluation process; a longer FFD may reflect greater depth of critical early processing; and FD/OL are linked to sustained engagement, collectively characterizing preference strength. The notable predictive power of FFD raises the intriguing speculation that early processing depth may carry disproportionate weight in overall preference judgments, a hypothesis warranting future investigation. This framework may provide a heuristic diagnostic tool for design practice. Designers could consider potential issues by referring to metrics at different stages: Is initial attraction insufficient (prolonged TTFF)? Does the design fail to spark early interest (short FFD) effectively? Or is the experience unsustainable (short FD/OL)? This could guide more targeted optimization strategies.

Finally, it must be explicitly stated that the “Three-Stage Visual Preference Formation Pathway” proposed here is a preliminary, exploratory, and interpretive theoretical framework derived from empirical data patterns and intended to inspire future research. It represents an attempt to integrate objective eye-tracking data streams with subjective preference judgments theoretically. This framework is grounded in staged theories of visual cognitive processing and inspired by the systematic predictive patterns and variable importance comparisons observed in this study. However, it is crucial to acknowledge that the multiple linear regression analysis employed was designed to build an efficient overall predictive model and did not statistically test for direct causal mediation effects within the specific sequential pathway of TTFF, FFD, and FD/OL. Therefore, a critical task for future research is to empirically quantify and validate the indirect effects and causal links between these stages using methods such as Structural Equation Modeling, longitudinal experimental designs, or formal mediation analysis. This represents the essential next step in evolving this interpretive framework into a rigorously tested, mature theoretical model.

While this study provides preliminary validation of the E-Dop model’s effectiveness, as an exploratory investigation, its conclusions exist within specific boundary conditions. The following section will transparently discuss the limitations of this study and outline future research directions accordingly.

Study limitations and future directions

This study developed and preliminarily validated an E-Dop model for packaging design, providing quantitative data to support objective evaluation. However, as an exploratory methodological validation study, its conclusions should be interpreted within explicit boundary conditions. A forthright examination of these limitations is intrinsic to scientific rigor and provides a clear starting point for deepening and extending subsequent research. Future work could build on the established framework to explore ways to enhance the model’s applicability, robustness, and explanatory power in more complex scenarios. The following discussion elaborates on limitations across three aspects: sample characteristics and ecological validity, stimuli and measurement scope, and methodology, and transforms each limitation into a defined direction for future research.

Sample representativeness and generalizability

This study possesses inherent limitations regarding external validity, primarily stemming from strategic design choices made to prioritize internal validity. These choices also define the model’s current application boundaries.

First and foremost, it is crucial to note that this study employed a homogeneous sample of Chinese university students (N = 30) to construct and validate the model. The selection of this limited and homogeneous sample aligns with the exploratory phase’s priority on internal validity. The strategic aim during initial model building was to control for extraneous variables such as age and broad cultural background, thereby clarifying the core relational chain of “design elements, eye-tracking response, subjective preference” to validate the feasibility of the E-Dop methodological framework. However, while this strategy facilitates the establishment of a clear initial model, it also constitutes the primary limitation regarding the external validity of the study’s conclusions. The limited sample size affects the statistical power and stability of the model’s parameter estimates. Therefore, the currently constructed E-Dop model should first be regarded as a proof-of-concept and trend-prediction tool effective within this specific cohort. More specifically, the high homogeneity of the sample in terms of age, educational background, and cultural context implies that the identified design element preferences (e.g., positive responses to cool tones and rounded forms) and the predictive weights of the E-Dop model itself are likely moderated by these group characteristics. Although this demographic constitutes a core consumer group for products like shampoo, lending direct relevance to the findings of this market segment, it must be explicitly stated that the model’s direct applicability is most relevant to consumer groups with characteristics similar to this sample (young, highly educated). Furthermore, the cultural homogeneity of the sample implies that the findings (e.g., preferences for cool tones and rounded morphologies) may be influenced by specific cultural contexts. 
Extrapolating the model to consumers of different ages or cultural backgrounds, or to other product categories, requires extreme caution and necessitates systematic calibration and validation through larger, more diverse samples in future research.

Second, the experimental environment was highly controlled, utilizing on-screen presentations of static packaging images in a standardized laboratory setting. This setup was a necessary prerequisite for ensuring high-precision eye-tracking data collection and eliminating environmental confounds to test the core methodological hypothesis. However, it differs from the complex, multi-sensory (e.g., tactile), dynamic, and goal-driven decision-making processes consumers engage in within authentic retail environments. Consequently, the efficacy of the current model in more complex, real-world shopping scenarios remains to be tested.

Based on these considerations, future research can extend this work in the following directions. First, the E-Dop model should be validated and calibrated using larger, more demographically diverse samples (encompassing different ages, genders, cultures, and regions) to precisely define its boundaries of generalizability and to investigate the potential moderating effects of these variables systematically. Specifically, future work could delve into how specific contextual factors, such as lifestyle and product usage scenarios, moderate visual preferences, testing whether consumer preference for packaging attributes, such as roundedness versus sturdiness in form or the “freshness” semantics of cool tones, varies according to differences in their living environment or usage habits. This expansion of the empirical data foundation is essential for subsequently exploring personalized model calibration or context adaptation. Second, the applicability and transferability of the E-Dop framework should be systematically tested across different product categories (e.g., food, electronics) to conduct cross-group and cross-category validity studies. Third, ecological validity should be significantly enhanced, for example, by testing the model in Virtual Reality (VR)-simulated shopping environments or combining it with multi-sensory evaluations of physical packaging prototypes to examine the model’s robustness in scenarios that approximate real-world decision-making.

Depth and breadth of experimental stimuli and design elements

A second layer of limitations pertains to the deliberate focus on a specific set of design elements and the measured affective dimension. This focus clearly defines the theoretical and applicative scope of the current model while simultaneously charting directions for its potential integration with advanced computational methods.

This study systematically manipulated five core visual variables: color (hue, saturation, brightness), image-to-text ratio, and morphology (rounded-to-sharp continuum). This selection, focusing on highly parameterizable two-dimensional visual attributes, was aligned with the goal of constructing a precise and controllable initial model. However, packaging design is a multi-dimensional entity. Other significant attributes, such as texture, material perception, specific typographic styles, and key ergonomic properties (e.g., gripability, anti-slip features, ease of opening), were not incorporated. These elements either rely heavily on tactile interaction and three-dimensional physicality or involve complex interaction parameters, representing highly valuable areas for future exploration.

It must be explicitly stated that the core construct measured and predicted in this study is “immediate preference directly driven by the visual design of packaging.” This construct was operationalized by characterizing it with objective eye-tracking metrics (e.g., total fixation duration, time to first fixation) and by correlating it with subjective preference ratings (Degree of Preference). This definition closely corresponds to the “visceral level” response in emotional design theory—the rapid, automatic affective tendency directly elicited by the design’s appearance. This study did not intend to measure more complex, multi-dimensional emotional states (e.g., pleasure, arousal) or deeper cognitive evaluations (e.g., brand identity). Therefore, the E-Dop model can be regarded as an effective predictive tool for this specific dimension of “visually-driven immediate preference.”

Future research can advance along two complementary paths. First, broadening the design variables and exploring multimodal integration: Building upon the established visual model, future work could systematically integrate tactile attributes and ergonomic variables to examine the interactive effects of multimodal perception on preference. For instance, using high-fidelity 3D-printed prototypes could test whether visually preferred forms maintain their appeal during actual handling experience. Second, deepening affective correlation and constructing hybrid analytical frameworks: The “immediate preference” predicted by the E-Dop model could be correlated with broader emotional measurement tools or neurophysiological indicators (e.g., Electroencephalography, EEG). In particular, the synchronous collection of multi-modal data (e.g., eye-tracking + EEG) would lay the groundwork for constructing hybrid computational models capable of capturing the complex mapping of ‘neuro-perceptual-preference’ relationships. Models integrating, for example, deep learning feature extraction with fuzzy inference could better explain how design elements are integrated through multi-sensory channels to influence decision-making ultimately. This approach would help establish the criterion-related validity of the construct predicted by our model within a richer network of affective and cognitive theories.

Methodological and model complexity

The methodological approach adopted in this study represents a transparent and robust baseline, establishing a clear comparative foundation for subsequent evolution. Concurrently, there is a clear recognition of its limitations regarding complexity, precision, and scope of application, which directly inform several promising directions for future research.

Firstly, there is considerable scope for further refinement in model interpretation and the granular evaluation of feature contributions. The multiple linear regression model employed here, valued for its parameter transparency and intuitive interpretability, is well-suited to the exploratory phase of validating fundamental hypotheses about relationships among core variables. Analysis of the significance, direction, and relative magnitude of the regression coefficients provided a foundational interpretation of the necessity and relative weight of the four predictors, FD, OL, TTFF, and FFD, within the model. For instance, the large positive coefficient for OL indicates a substantial contribution to the prediction of preference. In contrast, the small but statistically significant negative coefficient for TTFF provides empirical support for the notion that rapid initial attraction facilitates preference formation. However, more granular feature-importance assessment techniques, such as ablation experiments (systematically removing specific features to observe changes in model performance) or sensitivity analyses, could quantify the independent contributions and interaction effects of each predictor from a perspective focused on model robustness and feature dependencies. Such analyses are key to advancing a predictive model from “preliminarily effective” to “systematically optimized.” As an exploratory study with the primary goal of validating the basic feasibility of the E-Dop methodological framework, the focus at this stage was on establishing a complete, interpretable baseline model. Consequently, systematic feature ablation analysis and comparative studies with more complex non-linear models are explicitly positioned as core tasks for a future phase dedicated to model performance optimization and mechanistic exploration.
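The ablation idea mentioned above can be sketched in a few lines: refit the model with one predictor removed and record the drop in R². For a linear model with an intercept, this drop equals the squared semi-partial correlation of the removed predictor. The code below is a hedged illustration on synthetic data; the function names `r_squared` and `ablation` and all data values are assumptions, not part of the study.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

def ablation(X, y, names):
    """Drop in R^2 when each predictor is removed from the full model."""
    full = r_squared(X, y)
    return {
        name: full - r_squared(np.delete(X, j, axis=1), y)
        for j, name in enumerate(names)
    }

# Synthetic stand-in data (hypothetical): four independent predictors.
rng = np.random.default_rng(2)
X = rng.normal(0.0, 1.0, size=(200, 4))
y = 0.5 + X @ np.array([0.3, 0.6, -0.05, 0.4]) + rng.normal(0.0, 0.3, 200)

drops = ablation(X, y, ["FD", "OL", "TTFF", "FFD"])
for name, d in drops.items():
    print(f"{name}: delta R^2 = {d:.3f}")
```

A fuller ablation study would also compare cross-validated error rather than in-sample R², since in-sample R² can never increase when a predictor is removed.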

Secondly, substantial potential exists for enhancing the precision of data acquisition and processing. The current study relies on widely adopted, standardized pipelines for eye-tracking data processing. Future work could integrate more advanced time-series data imputation algorithms, such as deep learning methods based on self-attention mechanisms (e.g., the SAITS model) [46], to achieve higher-fidelity data recovery at the source, potentially improving the accuracy of dynamic metric calculations. Similarly, for studies involving three-dimensional spatial perception, more precise gaze estimation models (e.g., the “Cyclopean eye” approach) could be explored as alternatives to simple binocular averaging to mitigate vergence errors [47].

Finally, the current model architecture and explanatory capacity represent a starting point. Although the adjusted R² (0.702) and LOSOCV results demonstrate good internal consistency and predictive trend under the controlled conditions of this study, approximately 30% of the variance remains unexplained. This unexplained portion may stem from complex nonlinear relationships not captured by the linear model; unmeasured individual differences (e.g., long-formed aesthetic style, brand knowledge); latent factors related to consumer background, such as preferences shaped by specific usage habits, lifestyle, or cultural context; and contextual factors not replicated in the experimental environment (e.g., purchase task, brand familiarity). Furthermore, for future research involving multi-unit or multi-modal modeling, the routine reporting of standardized coefficients will be treated as an important and necessary practice to facilitate more comparable scholarly dialogue across studies. Future research could advance along two complementary paths: first, the evolution of model architecture, introducing models like Random Forests or Support Vector Machines on larger and more diverse samples to probe non-linear interactions, or exploring deep learning models that incorporate temporal features; second, the deep integration of multimodal data to construct a “neuro-behavioral-report” framework that combines eye-tracking, neurophysiological signals (EEG, GSR), and contextual variables to enhance ecological validity and explanatory depth.

In summary, the primary contribution of this study lies in establishing a quantifiable methodological pathway linking the visual features of packaging design, eye-tracking behavior, and subjective preference. The methodological limitations discussed in this section aim to define the scope of applicability of the current work objectively and systematically chart its evolution. Through continued exploration along these directions, the E-Dop framework holds the potential to evolve from a baseline model into a progressively more robust, interpretable, and insightful paradigm for design evaluation and decision support.

Conclusion

This exploratory study successfully developed and preliminarily validated an E-Dop predictive model within a particular consumer group (university students) and product category (shampoo packaging). The results demonstrate that, under the controlled experimental conditions of this study, integrating multiple eye-tracking metrics through multiple linear regression can effectively quantify and predict users’ subjective design preferences. The model passed the rigorous test of LOSOCV, exhibiting a highly significant correlation between predicted and actual preference scores, along with a low mean absolute error (MAE). These findings primarily confirm that the model possesses good internal validity and stability within the studied cohort, providing positive initial evidence for its predictive potential.

The experiments further identified that, under the tested conditions, design elements such as medium-high saturation, cool tones, and rounded morphologies were key drivers for enhancing visual attention and user preference. This research moves beyond simple metric correlations by extending the E-Dop model into a conceptual framework that offers an interpretation of the mechanisms of visual preference formation. Within this framework, a shorter Time to First Fixation (TTFF) is interpreted as an objective behavioral indicator of effective “initial attraction” in packaging design. In turn, longer total FD and FFD reflect the design’s capacity to elicit deep cognitive processing and sustained interest. This theoretical pathway, from initial attraction to sustained engagement, provides a new analytical perspective for understanding the cognitive and emotional responses elicited by packaging at the visual level. It is crucial to emphasize that the primary objective of the constructed E-Dop model is to predict immediate preference tendencies directly driven by the visual design of packaging. This positioning closely corresponds to the “visceral-level” response in emotional design theory, the rapid, automatic affective response triggered by sensory attributes. Therefore, the model aims to quantify a key antecedent affective component within the complex chain of consumer decision-making. It is not designed to, nor can it replace, the assessment of final choice or purchase intention, which is influenced by higher-order cognitive processes such as brand knowledge, specific product feature comparisons, or detailed contextual evaluation. Consequently, the construction of the E-Dop model represents not merely a predictive tool but also a methodological practice that integrates objective physiological measurement with subjective preference reports, offering a data-driven analytical paradigm for the evaluation of emotional design.

On both theoretical and practical fronts, this work offers a methodological exploration and a preliminary foundation for the objective evaluation and optimization of packaging design. First, it proposes a quantifiable research framework that links multidimensional eye-tracking data with subjective design preference. Second, the E-Dop model outlines a potential workflow for data-driven pre-screening in design. Designers or researchers can apply the model through a streamlined process: (1) create digital prototypes of the packaging design alternatives; (2) recruit a small group of representative target consumers for a rapid eye-tracking test; (3) extract the key eye-tracking metrics and calculate their mean values; (4) input these means into the E-Dop equation to compute a quantitative preference prediction score for each alternative. This enables objective comparison and trend prediction before committing to mass production. The comparability of results depends on standardizing the test environment, task, and data analysis procedures. The workflow aims to balance practicality and scientific validity, and its effectiveness rests on the model framework validated in this study.
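
Steps (3) and (4) of this workflow can be sketched as follows. The coefficients, intercept, and metric names are hypothetical placeholders chosen only so that the example runs; they are not the fitted E-Dop equation, which should be substituted from the model section before any real use.

```python
from statistics import mean

# HYPOTHETICAL placeholder coefficients, NOT the fitted E-Dop equation.
EDOP_INTERCEPT = 3.0
EDOP_COEFFS = {"ttff_s": -0.8, "total_fd_s": 0.5, "ffd_s": 1.0}

def mean_metrics(trials):
    """Step 3: average each eye-tracking metric over the recorded trials of one design."""
    return {name: mean(t[name] for t in trials) for name in EDOP_COEFFS}

def edop_score(metrics):
    """Step 4: plug the metric means into a linear preference equation."""
    return EDOP_INTERCEPT + sum(EDOP_COEFFS[n] * metrics[n] for n in EDOP_COEFFS)

def rank_alternatives(trials_by_design):
    """Score every design alternative and sort from highest to lowest predicted preference."""
    scores = {d: edop_score(mean_metrics(t)) for d, t in trials_by_design.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Made-up measurements (in seconds) from two participants per design.
trials_by_design = {
    "A": [{"ttff_s": 0.4, "total_fd_s": 2.0, "ffd_s": 0.2},
          {"ttff_s": 0.6, "total_fd_s": 3.0, "ffd_s": 0.4}],
    "B": [{"ttff_s": 1.0, "total_fd_s": 1.0, "ffd_s": 0.2},
          {"ttff_s": 1.2, "total_fd_s": 2.0, "ffd_s": 0.2}],
}
for design, score in rank_alternatives(trials_by_design):
    print(f"design {design}: predicted preference {score:.2f}")
```

Keeping the equation in one small function makes it straightforward to swap in recalibrated coefficients when the model is re-fitted for a different population or product category.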

The preliminary insights into associations between specific design elements and preference, as well as the model’s predictive performance, were obtained under controlled laboratory conditions with specific variables held constant (e.g., layout, brand information) and from a sample that was highly homogeneous in age, educational background, and culture. These findings should therefore be interpreted cautiously as trend evidence and methodological validation within specific boundary conditions, not as universal design rules or a mature predictive tool. Targeted validation and any necessary adjustments are required before they are applied to other populations or contexts. Their primary value lies in providing initial confirmation that multidimensional eye-tracking metrics can feasibly be used to quantify “visually-driven immediate preference,” and in establishing a clear starting point and direction for subsequent validation, calibration, and model optimization in broader populations and more complex scenarios.

Testing and extending this model across diverse populations and application scenarios is a clear and critical direction for future research. Future work should first validate the model’s generalizability on larger, more diverse samples, while systematically examining the potential moderating effects of demographic variables such as age, gender, and cultural background. Methodologically, there is also scope to explore more complex statistical models or machine learning algorithms to improve predictive robustness and better capture potential nonlinear relationships. This study provides a pathway and a necessary methodological foundation for that subsequent work. In summary, by constructing and preliminarily validating the E-Dop model, this work takes a conceptual and methodological step towards a more objective and quantifiable paradigm for assessing packaging design preference. Its conclusions apply directly only to the examined consumer group and controlled experimental environment; within those limits, they provide preliminary empirical evidence and a reference for design optimization targeting similar market segments, while also establishing a starting point for broader exploration.

Acknowledgements

The authors gratefully acknowledge all participants.

Author contributions

Y.Z.X. and J.L.F. conceived and designed the study. Y.Z.X. was primarily responsible for conducting the formal analysis, investigation, data visualization, and writing the original draft. J.L.F. contributed to the methodology, supervised the project, administered its progress, and participated in writing and extensively reviewing the manuscript. H.Y.Z. assisted in the formal analysis and visualization. Q.X.L. and Y.Y.Z. contributed to the methodological development and investigation. All authors reviewed the manuscript.

Funding

This research was funded by the National Key Research and Development Project of China: Green Substitution of Raw Materials for Express Packaging and Ecological Design Technology for Products (2023YFC3904600).

Data availability

The data and materials supporting the findings of this study are available from the corresponding author upon reasonable request.

Declarations

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Articles from Scientific Reports are provided here courtesy of Nature Publishing Group
