Abstract
Introduction
Abnormal pigmentation plays an important role in various skin diseases and in studies of whitening efficacy. Three-dimensional pigmented epidermis-on-a-chip models provide a crucial in vitro platform for exploring melanin production and regulation in skin. However, dynamic and non-invasive quantitative assessment of melanin distribution remains difficult with traditional histological methods.
Methods
In this study, an AI-assisted objective evaluation framework was established for three-dimensional pigmented epidermis-on-a-chip models based on brightfield images. Melanin regions were segmented using the MEM-ViT algorithm, and their morphological features were extracted to build a multi-indicator comprehensive analysis system for determining the “good/poor” status of the model.
Results
The results showed 98% consistency between algorithmic predictions and manual annotations, demonstrating the reliability and generalization capability of the proposed method. The framework enabled accurate segmentation of melanin regions and standardized evaluation of model quality without staining.
Discussion
This method provides a rapid, non-invasive, and standardized approach for evaluating 3D pigmented epidermis-on-a-chip models. It offers a useful technical pathway for drug efficacy research, whitening mechanism analysis, and objective assessment of skin pigmentation-related disorders.
Keywords: AI-based quantitative evaluation, melanin distribution, pigmented epidermis-on-a-chip, semantic segmentation, vision transformer (ViT)
1. Introduction
Research on pigmentary disorders (e.g., melasma, cutaneous photoaging, and marginal repigmentation in vitiligo), together with efficacy evaluation of cosmetic products such as skin-whitening/spot-lightening and photoprotective skincare, urgently requires an in vitro model that recapitulates both melanogenesis and intercellular melanin transfer while capturing the spatial distribution and temporal dynamics of pigmentation (Figure 1a) (Lei and Hearing, 2020; Bento-Lopes et al., 2023; Miao et al., 2025). Conventional two-dimensional monolayer cell models cannot couple melanocytes and keratinocytes and lack the stratified barrier context (Costa Gagosian et al., 2025; Seiberg, 2001). This makes it difficult to reproduce clinically relevant spatial phenotypes such as pigment homogeneity, lesion area, and pigment deposition. Therefore, in vitro experimental results often show poor correlation with clinical efficacy. In contrast, three-dimensional pigmented epidermis-on-a-chip models are co-constructed from melanocytes and keratinocytes (Li et al., 2023). In addition, they possess a stratified architecture from the basal layer to the stratum corneum, as well as paracrine signaling, cell–cell adhesion, and receptor-mediated pathways between neighboring cells (Hall et al., 2022; Tada et al., 1998; Correia et al., 2018). These features confer higher external validity for drug screening, efficacy verification, and safety assessment.
FIGURE 1.
Applications of the pigmented epidermis model and AI-based evaluation workflow: (a) Model applications: Pigmented epidermis-on-a-chip models are used for disease modeling of photoaging, vitiligo, melasma, and related pigmentary disorders, and efficacy testing of drugs and cosmetic products. (b) AI evaluation workflow: Brightfield images acquired from 3D pigmented epidermis-on-a-chip models are processed by image preprocessing and segmentation using the MEM-ViT algorithm, followed by extraction of quantitative metrics and statistical analysis.
Traditionally, skin model evaluation relies heavily on histological assessments (e.g., H&E staining, immunohistochemistry) (Hall et al., 2022). However, these methods require embedding and sectioning, involve labor-intensive procedures, and are inherently end-point measurements, which makes continuous, dynamic observation difficult. Functional assays (such as tyrosinase activity and gene or protein expression) are likewise destructive and do not permit real-time monitoring (Shi et al., 2024; Bass et al., 2017). In contrast, direct observation of three-dimensional pigmented epidermis-on-a-chip models under brightfield microscopy provides a non-invasive morphological evaluation approach. It is easy to perform and enables rapid assessment of melanin distribution and overall model status. Nevertheless, results depend on subjective judgment and lack quantitative precision and standardized comparability. Against this backdrop, the introduction of image-based artificial intelligence (AI) analysis is of particular importance. It can transform subjective image inspection into objective, standardized, and scalable quantitative metrics, sensitively capturing multiscale spatial heterogeneity difficult to discern with the naked eye (Wu et al., 2022; Schmitz et al., 2021). On the other hand, it allows evaluation under label-free brightfield imaging conditions (Shen et al., 2022; Park et al., 2023), thereby facilitating verification of the stability and consistency of the model as a testing platform.
However, this process still faces several methodological challenges: first, brightfield images are easily affected by uneven illumination, glare, and out-of-focus artifacts, necessitating color and illumination calibration; second, melanin patches often have fuzzy boundaries and pronounced variations in scale, undermining the comparability of results across wells and time points; and finally, the use of different magnifications may further lower analytical consistency. Recent advances in deep learning offer solutions to the blurred-boundary problem, such as edge-aware branches (Dong et al., 2022), graph convolution (Wang et al., 2024; Liu et al., 2022), and edge-aware loss functions (Zhan and Yang, 2025; Zheng et al., 2020). For multi-scale variation, researchers have proposed multi-scale dilated convolutions (Wang et al., 2023; Gao et al., 2026) and various meta-learners that capture long- and short-range dependencies (Yang et al., 2025; Yang et al., 2026; Holail et al., 2025).
To enable effective analysis, an AI-based evaluation framework was established for pigmented epidermis-on-a-chip models using brightfield images (Figure 1b). Based on our proposed MEM-ViT (Melanin Estimation via Vision Transformer) algorithm, precise segmentation of melanin regions was first performed to generate mask images, and four static metrics were subsequently extracted from these masks to quantitatively characterize the extent of pigment deposition, as well as its spatial scale and optical properties. Furthermore, these metrics were jointly analyzed to enable standardized determination of model quality (“good” vs. “poor”) and the suitability of the model as a testing platform. Experimental results demonstrate that, compared with existing deep learning algorithms, MEM-ViT achieved high-precision segmentation of melanin regions and extraction of their morphological features, and effectively mitigated interference due to variations in brightfield imaging conditions and culture environments. This work provides a non-invasive, efficient, and reliable technical approach for graded evaluation of pigmented epidermis-on-a-chip models.
2. Materials and methods
2.1. Cell sources and culture
Normal human keratinocytes (NHKs) and normal human melanocytes (NHMs) were isolated from foreskin tissue obtained from pediatric circumcision procedures. Sample collection complied with relevant laws and institutional ethical guidelines, and written informed consent and approval were obtained from the institutional ethics committee. Cells were isolated by separating the epidermis and dermis using 0.25% (w/v) trypsin at 4 °C overnight. On the following day, basal layer cells were gently scraped from the dermal papillary surface and collected by centrifugation at 200 g for 5 min. These procedures were carried out in accordance with Li et al. (2023).
NHKs were seeded onto collagen IV–precoated culture dishes and maintained in the same medium system as described by Li et al. (2023). NHMs were cultured in Medium 254 (Gibco, United States) supplemented with Human Melanocyte Growth Supplement (Gibco, United States). Cells were routinely maintained at 37 °C in a humidified atmosphere of 5% CO2, subcultured at 60%–80% confluence, and used within six passages (Li et al., 2023).
When constructing melanocyte-containing systems, keratinocytes and melanocytes were seeded at an approximate ratio of 10:1, and the keratinocyte and melanocyte growth media were mixed at the same 10:1 ratio to prepare the co-culture medium. This formulation has been validated in the same model to support NHK proliferation and differentiation while remaining compatible with NHM co-culture. During the air–liquid interface (ALI) differentiation phase, the calcium concentration in the medium was adjusted to 1.2 mM.
2.2. Construction of 3D pigmented epidermis-on-a-chip models
Following the method for constructing 3D pigmented epidermis-on-a-chip models described by Li et al. (2023), keratinocytes and melanocytes were seeded onto PET porous membranes at a total density of approximately 6 × 105 cells per well, with a keratinocyte-to-melanocyte ratio of about 10:1. After 2 days of submerged co-culture, the cultures were transitioned to air–liquid interface (ALI) conditions for 14 days to promote epidermal differentiation and formation of a stratified, melanin-containing epidermis. Detailed procedures, reagents, and equipment are described in Li et al. (2023).
2.3. Data acquisition and annotation
An Avatarget high-throughput imaging system was used to acquire brightfield images of the pigmented epidermis-on-a-chip models. After several days of culture, images were captured at ×4 and ×10 magnification. For each sample, multiple fields of view were imaged and stitched when necessary to cover the entire well of the plate. To improve image sharpness and avoid defocus, multi-plane 2D images were processed using a focus-stacking algorithm to generate fully focused composite images. The final images were saved in JPG or TIFF format at resolutions of 2,248 × 2,048 or 1,360 × 1,024 pixels. In the semantic segmentation dataset, each brightfield image was manually annotated by two trained experts using the LabelMe software. The annotators delineated, at the pixel level, the boundaries of melanocytes and melanin patches, labeling all visible cells and spheroid structures as foreground (class 1), and all other regions, including the background and culture medium, as background (class 0). Ambiguous regions were carefully reviewed because of the relatively low contrast of brightfield images. If the inter-annotator agreement, measured by the pixel-wise intersection-over-union (IoU), was below 80%, the image was re-evaluated and corrected by a third senior annotator. The final binary masks were exported in PNG format, maintaining a one-to-one correspondence with the original images.
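The inter-annotator agreement check described above can be sketched in a few lines of Python. The pixel-wise IoU and the 80% review threshold follow the protocol in the text; the function names and the nested-list mask representation are illustrative, not taken from the actual annotation pipeline.

```python
def mask_iou(mask_a, mask_b):
    """Pixel-wise IoU between two binary masks given as nested lists of 0/1."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += a & b
            union += a | b
    # Two empty masks agree perfectly by convention.
    return inter / union if union else 1.0

def needs_review(mask_a, mask_b, threshold=0.80):
    """Flag an image for third-annotator review when agreement falls below 80%."""
    return mask_iou(mask_a, mask_b) < threshold
```

In practice the masks would be loaded from the exported PNG files before comparison.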
2.4. Vision transformer
Vision Transformer (ViT) is a model that introduces the Transformer architecture, originally developed in the field of natural language processing, into computer vision tasks (Dosovitskiy, 2020). Unlike conventional convolutional neural networks (CNNs), which rely on local convolutional kernels to capture spatial features, ViT divides an image into fixed-size patches, flattens them, and encodes them into a one-dimensional sequence. It then uses a self-attention mechanism to model relationships among patches at a global level, thereby enabling more effective capture of long-range dependencies and global contextual information. Within each Transformer layer, self-attention is computed as:
\[ \mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V \]

where $Q$, $K$, and $V$ denote the query, key, and value matrices derived from the patch embeddings, and $d_k$ is the dimension of the key vectors. This global modeling capability allows ViT to exhibit powerful feature extraction performance when trained on large-scale datasets with sufficient computational resources. Vision Transformers show considerable promise in semantic segmentation tasks. Traditional CNN-based models are often limited by their local receptive fields when dealing with complex scenes, whereas Transformer-based architectures can effectively integrate global semantic information during the encoding stage, leading to more accurate segmentation in scenarios with blurred boundaries or subtle inter-class differences (Han et al., 2023). Recently proposed variants, such as TransUNet (Chen et al., 2021) and Swin-UNet (Cao et al., 2022), combine Transformer modules with U-Net–style encoder–decoder architectures and have achieved significant performance gains in medical imaging. We likewise adopt an encoder–decoder architecture for the algorithm in this paper. At each decoding stage $l$, the upsampled feature map $F_{l+1}$ is fused with the corresponding skip feature $S_l$ as:

\[ F_l = \phi\big(\left[\mathrm{Up}(F_{l+1});\, S_l\right]\big) \]

where $[\,\cdot\,;\,\cdot\,]$ denotes channel-wise concatenation, $\mathrm{Up}(\cdot)$ is the upsampling operator, and $\phi(\cdot)$ represents convolutional transformations followed by normalization and non-linear activation.
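As an illustration of the attention computation described above, the following is a minimal pure-Python sketch of scaled dot-product attention, with matrices represented as nested lists. It omits batching, multi-head projection, and learned weights, all of which a real ViT layer would include.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def matmul(A, B):
    """Naive matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(K[0])
    K_T = [list(col) for col in zip(*K)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(Q, K_T)]
    weights = [softmax(row) for row in scores]  # each row sums to 1
    return matmul(weights, V)
```

Each output row is a convex combination of the value rows, which is what lets every patch aggregate information from all other patches.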
2.5. Model optimization
To improve segmentation performance on low-contrast brightfield images, the following optimization strategies were adopted during training. As deep learning models are supervised methods, they require a large number of data samples to adequately tune model parameters. However, manually annotating melanin regions in images of the pigmented epidermis-on-a-chip models is time-consuming and labor-intensive. Therefore, to mitigate overfitting in the small-sample setting, data augmentation was conducted to increase both the size and diversity of the training set.
Typical augmentation techniques were applied, including random cropping, geometric transformations (e.g., rotation, translation, and flipping), adjustments of color, intensity, and contrast, as well as non-rigid image transformations (e.g., elastic deformations). The model was optimized using a composite loss function, defined as a weighted sum of binary cross-entropy (BCE) loss and Dice loss. In the BCE term, the positive class was reweighted (weighting factor 0.5) to alleviate the dominance of abundant background pixels (Li et al., 2024; Zhao et al., 2020). The Dice loss simultaneously optimizes pixel-wise classification and the overall overlap between predicted and ground-truth shapes.
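The composite objective can be sketched as follows. The 0.5 positive-class weight follows the text; the equal weighting between the BCE and Dice terms (`alpha`) and the flattened-list inputs are illustrative assumptions rather than the paper's exact configuration.

```python
import math

def bce_dice_loss(pred, target, pos_weight=0.5, alpha=0.5, eps=1e-7):
    """Weighted BCE plus Dice loss on flattened probability/label lists.

    pred   : predicted foreground probabilities in [0, 1]
    target : ground-truth labels (0 or 1)
    """
    n = len(pred)
    # Binary cross-entropy with a reweighted positive class.
    bce = -sum(
        pos_weight * t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
        for p, t in zip(pred, target)
    ) / n
    # Soft Dice loss: 1 minus the overlap ratio of prediction and ground truth.
    inter = sum(p * t for p, t in zip(pred, target))
    dice = 1 - (2 * inter + eps) / (sum(pred) + sum(target) + eps)
    return alpha * bce + (1 - alpha) * dice
```

A near-perfect prediction drives both terms toward zero, while a confidently wrong one is penalized heavily by the BCE term.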
The Adam optimizer (initial learning rate 1 × 10−3, β1 = 0.9, β2 = 0.99) was used, and a 50% learning rate decay was applied if the validation loss did not decrease for 10 consecutive epochs. In addition, dropout layers with a rate of 0.5 were inserted at the end of the encoder and the beginning of the decoder to further suppress overfitting (Srivastava et al., 2014).
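The plateau-based decay rule (halve the learning rate after 10 epochs without validation improvement) reduces to a small amount of bookkeeping. The class below is an illustrative sketch; in practice a framework scheduler such as PyTorch's `ReduceLROnPlateau` would typically be used instead.

```python
class PlateauDecay:
    """Halve the learning rate when validation loss stalls for `patience` epochs."""

    def __init__(self, lr=1e-3, factor=0.5, patience=10):
        self.lr, self.factor, self.patience = lr, factor, patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Call once per epoch with the current validation loss; returns the LR."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr *= self.factor
                self.bad_epochs = 0
        return self.lr
```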
2.6. Model training and evaluation
We trained the model on a workstation equipped with four NVIDIA RTX 3090 GPUs (11 GB memory each). The dataset was split on a per-sample basis into training (50%), validation (20%), and test (30%) sets to ensure that images from different magnifications were evenly distributed across the three subsets. The batch size was set to 8, and the maximum number of training epochs was 200, with early stopping applied when the validation performance no longer improved. During training, the raw images and their corresponding binary masks were used as input–label pairs. The input images were normalized before feeding into the network, and the resulting probability maps were binarized using a threshold of 0.45 to generate the predicted segmentation masks.
To assess the segmentation performance of the trained model, comparison between model predictions and ground-truth labels is typically quantified using four basic concepts. True positives (TP) denote the number of pixels correctly identified by the model as belonging to the target class. False positives (FP) refer to pixels that are incorrectly classified as the target class when they actually do not. True negatives (TN) represent pixels correctly identified as background, whereas false negatives (FN) correspond to target pixels that the model fails to detect, i.e., pixels that truly belong to the target class but are misclassified as background. Based on these quantities, several commonly used evaluation metrics can be computed. Accuracy (Acc) measures the overall correctness of pixel-wise classification and reflects the model’s global segmentation performance. The Dice similarity coefficient (DSC, also known as the F1-score) evaluates the overlap between predicted and ground-truth regions, placing greater emphasis on the consistency of the target region. The intersection-over-union (IoU) quantifies the ratio between the intersection and union of the predicted and ground-truth masks and is one of the most widely used metrics for assessing semantic segmentation performance. In summary, Acc reflects overall accuracy, DSC emphasizes overlap consistency, and IoU provides a more stringent assessment of the quality of the predicted regions (Cheng et al., 2021; Zou et al., 2004; Minaee et al., 2021).
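These metrics follow directly from the four pixel counts. A minimal sketch on flattened binary masks (the function name is illustrative):

```python
def segmentation_metrics(pred, truth):
    """Compute Acc, DSC (F1), and IoU from flattened binary masks (lists of 0/1)."""
    tp = sum(p & t for p, t in zip(pred, truth))          # target pixels found
    fp = sum(p & (1 - t) for p, t in zip(pred, truth))    # background called target
    fn = sum((1 - p) & t for p, t in zip(pred, truth))    # target pixels missed
    tn = sum((1 - p) & (1 - t) for p, t in zip(pred, truth))
    acc = (tp + tn) / len(pred)
    dsc = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 1.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 1.0
    return acc, dsc, iou
```

Note that IoU is always the strictest of the three: for the same prediction it is never larger than DSC, since the union in its denominator double-counts every error.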
2.7. Evaluation indicators
Once the segmentation masks were obtained from the model, quantitative analysis was performed to evaluate the characteristics of melanin distribution and to interpret the biological or pharmacological relevance of the segmentation results. The primary evaluation metrics include the area-related, optical, and spatial distribution features of melanin patches. The specific definitions and descriptions of these metrics are provided below.
2.7.1. Total area of melanin patches

\[ A_{\mathrm{total}} = \sum_{i=1}^{N} A_i \]

Here, $A_i$ denotes the area of the $i$-th melanin patch, and $N$ is the total number of patches. This metric reflects the overall level of melanin deposition in the sample. A larger total melanin area indicates more severe melanin accumulation, whereas a marked reduction suggests that melanogenesis is inhibited or that a pronounced depigmenting effect has been achieved.
2.7.2. Average area of melanin patches
Mean area ($A_{\mathrm{mean}} = A_{\mathrm{total}}/N$) is used to evaluate the average size of individual melanin deposition sites and can indicate whether melanogenesis tends to be locally concentrated or diffusely distributed. A larger $A_{\mathrm{mean}}$ suggests a tendency for melanin to form larger aggregated patches, whereas a smaller $A_{\mathrm{mean}}$ may indicate more dispersed melanin granules and lower melanogenic activity.
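Given a predicted binary mask, the per-patch areas underlying the total and mean patch areas can be obtained by connected-component labeling. Below is a minimal pure-Python sketch using 4-connectivity; in practice a library routine such as `scipy.ndimage.label` would normally be used.

```python
def patch_areas(mask):
    """Areas (in pixels) of connected melanin patches in a binary mask.

    mask: nested lists of 0/1; patches are 4-connected foreground regions.
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    areas = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                # Flood-fill one patch with an explicit stack.
                stack, area = [(y, x)], 0
                seen[y][x] = True
                while stack:
                    cy, cx = stack.pop()
                    area += 1
                    for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                areas.append(area)
    return areas
```

The total area is then `sum(patch_areas(mask))`, and dividing by `len(patch_areas(mask))` gives the mean patch area.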
2.7.3. Melanin patch light transmittance
\[ T = \frac{I_{\mathrm{patch}}}{I_{\mathrm{bg}}} \]

Here, $I_{\mathrm{patch}}$ denotes the mean light intensity within the melanin patch region, and $I_{\mathrm{bg}}$ represents the mean light intensity in the background or control region. Transmittance $T$ is used to quantify the degree of light absorption by melanin-containing areas. Greater melanin deposition corresponds to lower transmittance since melanin strongly absorbs light. This metric can be used to evaluate the efficacy of skin-whitening or anti-melanogenic agents: an increase in $T$ indicates enhanced light transmittance of the skin or tissue and a reduction in melanin content.
2.7.4. Optical density of melanin patches
\[ OD = \log_{10}\!\left(\frac{I_{\mathrm{bg}}}{I_{\mathrm{patch}}}\right) \]

Here, $I_{\mathrm{patch}}$ and $I_{\mathrm{bg}}$ are defined as above, and a higher $OD$ value indicates more intense melanin deposition. Optical density ($OD$) is an optical parameter that characterizes the concentration of melanin deposition and is commonly used in the analysis of microscopic images and histological sections. An increase in $OD$ reflects stronger light absorption and a higher melanin concentration in the corresponding region. In general, $OD$ is positively correlated with melanin content.
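Assuming the conventional definitions of transmittance and optical density from the mean intensities described above, both optical metrics reduce to one-line computations once the patch and background intensities have been measured:

```python
import math

def transmittance(i_patch, i_bg):
    """T = I_patch / I_bg: fraction of light passing through the pigmented region."""
    return i_patch / i_bg

def optical_density(i_patch, i_bg):
    """OD = log10(I_bg / I_patch): higher values indicate denser melanin deposition."""
    return math.log10(i_bg / i_patch)
```

Note that $OD = -\log_{10} T$, so the two metrics carry the same information on different scales: transmittance is linear in intensity, while optical density is linear in absorber concentration.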
2.7.5. Composite score formula
We aimed to aggregate multiple raw metrics (total patch area $A_{\mathrm{total}}$, mean patch area $A_{\mathrm{mean}}$, transmittance $T$, and optical density $OD$) into a single composite score ranging from 0 to 100, where higher scores indicate more severe melanin deposition. To this end, a min–max linear transformation was applied to rescale each set of measurements to the target range $[0, 100]$.
\[ x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}\,(b - a) + a \]

Here, $x$ denotes the original data, $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the data, respectively, and $a$ and $b$ are the lower and upper bounds of the target range (set to 0 and 100 in this study). $x'$ denotes the normalized result.
The composite score is obtained by linearly combining the above rescaled metrics as follows:
\[ S = w_1 A'_{\mathrm{total}} + w_2 A'_{\mathrm{mean}} + w_3 T' + w_4 OD' \]

Here, $w_1$ = 0.35, $w_2$ = 0.10, $w_3$ = 0.15, and $w_4$ = 0.30. $A'_{\mathrm{total}}$, $A'_{\mathrm{mean}}$, $T'$, and $OD'$ represent the normalized total patch area, mean patch area, transmittance, and optical density, respectively. Since $A'_{\mathrm{total}}$ and $OD'$ are typically the primary indicators, they are assigned higher weights. In contrast, $A'_{\mathrm{mean}}$ was used mainly to characterize the distribution pattern, and thus its weight was kept lower to avoid redundant emphasis on area-related information.
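The rescaling and weighted combination can be sketched as follows, using the weights stated above; the helper names are illustrative, and `composite_score` expects metrics that have already been rescaled to [0, 100].

```python
def rescale(x, xs, lo=0.0, hi=100.0):
    """Min-max transform of x relative to the observed range of the sample set xs."""
    x_min, x_max = min(xs), max(xs)
    if x_max == x_min:
        return lo  # degenerate range: all samples identical
    return (x - x_min) / (x_max - x_min) * (hi - lo) + lo

def composite_score(a_total, a_mean, t, od):
    """Linear combination of the four rescaled metrics with the stated weights."""
    return 0.35 * a_total + 0.10 * a_mean + 0.15 * t + 0.30 * od
```

Because the rescaling is relative to the observed minimum and maximum, scores are comparable within a batch; comparisons across batches require fixing the reference range.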
3. Results
3.1. Formation of the 3D pigmented epidermis-on-a-chip model and AI-based analysis
In this study, the three-dimensional pigmented epidermis-on-a-chip model developed under air–liquid interface (ALI) culture formed a well-defined stratified epidermal structure after 2 weeks (Figures 2a,b). Melanocytes were clearly localized to the basal layer, and melanin was efficiently synthesized and transferred to the overlying keratinocytes, thereby recapitulating the in vivo pattern of epidermal melanin deposition (Figure 2b). To assess model quality, samples were categorized into two groups (“good” and “poor”) based on morphological features observed in microscopic images.
FIGURE 2.
Construction of the pigmented epidermis-on-a-chip model and evaluation metrics: (a) Schematic illustration of submerged co-culture of normal human keratinocytes and melanocytes on the chip, followed by proliferation and differentiation of keratinocytes under air–liquid interface (ALI) conditions to form a stratified epidermis; (b) H&E staining of the pigmented epidermis-on-a-chip model after 2 weeks of ALI culture, with a magnified view of the local epidermal structure; and (c) Brightfield images of the pigmented epidermis-on-a-chip model are input into the algorithm, which evaluates the epidermal model (good vs. poor) based on the defined quantitative metrics.
“Good” models: Pigmentation appeared as fine, discrete micro-spots with an overall homogeneous distribution (Figure 2c). No large confluent patches were observed at low magnification and only delicate and uniform microtextures were present at higher magnification. This morphology suggests stable melanogenesis and efficient melanin transfer to keratinocytes, with well-coupled spatial organization. Samples with these characteristics were defined as “good” and are recommended as suitable testing platforms for skincare products, cosmetics, and pharmacological agents.
“Poor” models: Multiple dark, island-like pigmented foci were present and frequently fused into larger patches (Figure 2c), forming extensive areas at low magnification. Pronounced edge effects or ring-shaped inhomogeneities were often observed (e.g., the peripheral region markedly darker than the center, or vice versa). At high magnification, the patch size distribution was heterogeneous, and local textures appeared coarse. Such features indicate spatial disequilibrium in melanin production and transfer, with substantial variability both within batches and between positions. These samples are classified as “poor” and are not recommended as stable testing platforms.
To more comprehensively evaluate the quality of the pigmented epidermis-on-a-chip model, the outputs obtained after algorithmic segmentation of melanin patches were analyzed. A set of morphological and optical parameters were established as evaluation metrics, including: (1) total melanin patch area to reflect the overall level of melanin deposition in the model; (2) mean melanin patch area to characterize the uniformity of melanin deposition and the degree of pigment aggregation; (3) melanin patch transmittance to assess the impact of melanin on light penetration based on image grayscale values or optical measurements; and (4) melanin patch optical density calculated from grayscale intensity to quantitatively describe the concentration and depth of melanin deposition. These metrics were subsequently rescaled and linearly combined to obtain a composite evaluation score. Joint analysis of the above indicators not only enables a macroscopic assessment of the overall pigmentation of the model, but also reveals the spatial distribution and optical characteristics of melanin deposition at a microscopic level. This provides a systematic basis for comparing differences in melanogenesis under different culture conditions, external stimuli, or pharmacological treatments.
To enable comprehensive evaluation of the model, full-well images at ×4 magnification were acquired as the primary quantitative dataset, whereas images at ×10 magnification were used only as morphological corroboration (to examine the presence of pigment aggregates and textural features). A total of 30 pigmented epidermis-on-a-chip models were included in this study. During image acquisition, approximately 1000 brightfield images were collected in total, because multiple raw fields of view were acquired for each model at different magnifications and stitched when necessary to cover the entire well. For quantitative analysis, each full-well composite image corresponded to one epidermal chip and served as the primary input for MEM-ViT analysis.
3.2. Deep learning–based melanin detection method
Within the overall methodological framework described in Section 2, the proposed MEM-ViT network serves as the core component for melanin detection. The overall architecture is shown in Figure 3. The network consists of three main components: (1) a ViT encoder (Dosovitskiy, 2020), (2) a multi-branch decoder, and (3) a post-processing pipeline. In this network, a Vision Transformer (ViT) was employed as the encoder for brightfield images of the epidermal models, leveraging the self-attention mechanism to directly model global relationships among image patches and thereby overcome the local receptive field limitations of conventional convolutional neural networks. The features produced by the encoder were propagated to the upsampling decoder via skip connections, enabling fusion of high-level semantic information with low-level details. Inspired by U-Net (Ronneberger et al., 2015), five skip connections were designed in the decoder. The first skip connection originates from the raw input image and performs feature extraction using two 3 × 3 convolutional layers with batch normalization and ReLU activation. The remaining four skip connections extract tokens from intermediate ViT layers that are reshaped into spatial feature maps and then fed into the decoder; each feature map is further processed by two convolutional layers to preserve multi-scale contextual information. Deep features are progressively upsampled through a series of transposed convolution layers, which double the spatial resolution at each stage, followed by convolutions to adjust the channel dimension. At each decoding stage, the features are fused with those from the corresponding skip connection to ensure effective integration of multi-scale information, thereby improving the segmentation accuracy of the epidermal structure. Finally, the resulting segmentation mask undergoes post-processing to exactly match the resolution of the input images of the 3D pigmented epidermis-on-a-chip models.
FIGURE 3.
Overall architecture of the MEM-ViT algorithm. The algorithm utilizes a Vision Transformer as the backbone network to extract features from the input image, and incorporates a “U”-shaped concatenation structure to output the melanin area.
3.3. Model performance evaluation
Quantitative assessment of the morphological features of 3D pigmented epidermis-on-a-chip models under brightfield imaging—such as melanin distribution, integrity of the epidermal structure, and other morphological parameters—constitutes one of the most direct and critical evaluation criteria in skin pigmentation research and drug screening. Unlike conventional two-dimensional cell images, which typically exhibit clear and stable monolayer structural features, brightfield imaging of 3D pigmented epidermis-on-a-chip models is influenced by multiple factors, including the culture system, tissue thickness, imaging depth, illumination conditions, and pharmacological interventions. Therefore, images acquired under different experimental conditions and magnifications often exhibit pronounced imaging artifacts. This poses serious challenges for both traditional image processing methods and current deep learning algorithms. These artifacts mainly include local occlusion and illumination inhomogeneity caused by melanin deposition, defocus and signal attenuation induced by imaging depth, morphological variations arising from different culture conditions, as well as interlayer tissue overlap and blurred boundaries. In summary, these complex factors make the automatic identification and segmentation of melanin structures within the epidermal layer under brightfield imaging highly uncertain and challenging. To systematically evaluate the performance of different algorithms in such complex scenarios, a brightfield image dataset of 3D pigmented epidermis-on-a-chip models was constructed and consisted of approximately 1000 images of pigmented epidermal models. The dataset was randomly divided into 50% for training, 20% for validation, and 30% for testing.
In our experiments, the proposed method was compared with several representative cell and tissue segmentation architectures, including U-Net (Ronneberger et al., 2015), U-Net++ (Zhou et al., 2020), SegFormer (Xie et al., 2021), and Mask R-CNN (Johnson, 2018). Each model was adapted and optimized for our task and trained for 100 epochs under the same training protocol (Figure 4). Mask R-CNN and SegFormer exhibited relatively large oscillations in the training loss, whereas the other models showed smoother convergence; our model achieved the fastest and most stable convergence. Furthermore, to quantitatively evaluate melanin segmentation performance, we employed three metrics (IoU, DSC, and Acc) to assess the segmentation accuracy of six models, including our MEM-ViT (Table 1). The results showed that our method achieved improvements of 2.4%, 4.3%, and 2.2% in IoU, DSC, and Acc, respectively, compared with mainstream algorithms. Even under conditions of uneven illumination, tissue overlap, and complex morphology, our approach maintained high robustness and accuracy. Table 2 presents a comparative analysis of computational efficiency at a 512 × 512 resolution. SegFormer demonstrates superior real-time performance with a minimum latency of 9.3 ms, attributed to its efficient hierarchical design. Conversely, Mask R-CNN incurs the highest computational cost (182.4G FLOPs) because of its multi-scale feature pyramid network and the dense per-proposal computations in the RPN and mask/box heads. Notably, Mask R-CNN also exhibits the maximum latency (45.2 ms), reflecting the architectural overhead of its two-stage pipeline despite moderate FLOPs. Although MEM-ViT possesses a larger parameter footprint, it maintains competitive inference speed, suggesting that the ViT-B backbone offers an optimal trade-off between model capacity and throughput for this task.
FIGURE 4.
Training loss curves of different models on the training and validation sets. (a–f) Comparison of training and validation losses for six classical semantic segmentation models on the 3D pigmented epidermis-on-a-chip dataset.
TABLE 1.
Comparison of melanin segmentation performance across different models.
| Methods | IoU (%) | DSC (%) | Acc (%) |
|---|---|---|---|
| U-Net | 65.2 | 76.8 | 88.1 |
| U-Net++ | 69.5 | 81.4 | 89.5 |
| Mask-RCNN | 63.7 | 75.1 | 87.6 |
| SegFormer | 60.8 | 72.4 | 86.3 |
| Swin-UNet | 71.8 | 80.3 | 90.2 |
| MEM-ViT | 74.2 | 84.6 | 92.4 |
TABLE 2.
Performance comparison of various architectures at the same input resolution.
| Methods | Backbone | Params (M) | FLOPs (G) | Inference time (ms) |
|---|---|---|---|---|
| U-Net | Standard CNN | 31.0 | 68.9 | 18.4 |
| U-Net++ | Standard CNN | 39.5 | 82.2 | 32.8 |
| Mask R-CNN | ResNet50-FPN | 44.3 | 182.4 | 45.2 |
| SegFormer | MiT-B0 | 24.7 | 72.6 | 9.3 |
| Swin-UNet | Swin-B | 88.2 | 116.2 | 22.6 |
| MEM-ViT | ViT-B | 89.5 | 132.6 | 26.1 |
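The three accuracy measures reported in Table 1 are standard pixel-wise overlap metrics. As an illustrative sketch (not the authors' pipeline; the function name `seg_metrics` and the binary-mask convention are assumptions), they can be computed from a predicted mask and a ground-truth mask as follows:

```python
import numpy as np

def seg_metrics(pred: np.ndarray, gt: np.ndarray) -> dict:
    """Pixel-wise IoU, Dice coefficient (DSC), and accuracy for binary masks."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    sizes = pred.sum() + gt.sum()
    # Guard against empty masks: two empty masks agree perfectly.
    iou = inter / union if union else 1.0
    dsc = 2 * inter / sizes if sizes else 1.0
    acc = (pred == gt).mean()  # fraction of correctly labeled pixels
    return {"IoU": float(iou), "DSC": float(dsc), "Acc": float(acc)}
```

Note that DSC is systematically more generous than IoU for the same overlap (e.g., one true-positive pixel against one false positive gives IoU = 0.5 but DSC ≈ 0.67), which is why the two columns in Table 1 differ in magnitude while preserving the same ranking.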
3.4. Experimental evaluation
Prediction and visualization were performed on samples from the test set to demonstrate the performance of the proposed model on semantic segmentation of 3D pigmented epidermis-on-a-chip models under brightfield imaging, and to further validate the accuracy of the AI-based composite evaluation method in discriminating between “good” and “poor” models (Figure 5). Figure 5a shows brightfield microscopic images of well-performing pigmented epidermis-on-a-chip models, characterized by an intact epidermal structure and a homogeneous pigment distribution that closely resembles native skin tissue. Figure 5b presents the corresponding segmentation results, in which the algorithm exhibits high consistency and accuracy in boundary localization and region identification, effectively capturing the spatial distribution of melanin. Figure 5c displays brightfield images of poorly performing models, marked by uneven pigment distribution and large island-like pigmented patches. Figure 5d shows the corresponding segmentation visualizations, where the algorithm effectively distinguishes melanin islands of different sizes.
FIGURE 5.
Visualization of “good” and “poor” pigmented epidermis-on-a-chip models and their corresponding segmentation results: (a) Brightfield image of a “good” pigmented epidermis-on-a-chip model. Scale bar: 1 mm; (b) Segmentation result of the “good” model produced by the proposed algorithm; (c) Brightfield image of a “poor” pigmented epidermis-on-a-chip model; and (d) Segmentation result of the “poor” model produced by the proposed algorithm. In (b, d), the red overlay represents the AI-predicted segmentation mask used to visualize melanin patches rather than melanin intensity. The uniform peripheral red area reflects a cell-free blank plate background exposed by centripetal contraction during culture and should not be interpreted as homogeneous pigmentation.
To further validate the reliability and generalization capability of the proposed AI-based method for objective evaluation of pigmented epidermal models, extensibility tests were conducted under different experimental conditions and across multiple sample batches. By assessing model performance on an independent test set not used during training and analyzing the consistency of responses across multiple feature-based indicators, we aimed to verify the robustness of the model under varying image quality, illumination conditions, and melanin distribution patterns.
Figure 6a shows the Pearson correlation heatmap between the four multidimensional evaluation metrics proposed in this study and the manually assigned labels (good/bad). The results indicate that two of the metrics exhibit a strong positive correlation with the manual labels (r > 0.95), suggesting that these parameters effectively capture the overall pigment load and optical density of the samples and are key variables for distinguishing high- from low-quality models. A third metric shows a moderate correlation with the manual labels (r ≈ 0.78), suggesting that it reflects, to some extent, transmittance characteristics associated with melanin deposition and provides a useful reference for model quality discrimination. The remaining metric displays a relatively weaker correlation with the labels (r ≈ 0.58) but still shows a consistent trend, indicating that it provides auxiliary information on the uniformity of pigment distribution. Overall, the high correlations among these metrics support the consistency and rationality of the selected features in describing the state of the pigmented epidermal models.
FIGURE 6.
Evaluation of metric correlations and algorithm accuracy: (a) Correlation heatmap between the four proposed evaluation metrics and the labels (good or bad); and (b) Confusion matrix comparing manual annotations and algorithmic predictions on the test set.
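The coefficients in the heatmap of Figure 6a can, in principle, be reproduced from the per-sample metric values and 0/1 quality labels; with binary labels, the point-biserial correlation reduces to Pearson's r. A minimal sketch (the helper name `pearson_r` is an assumption, not the authors' code):

```python
import numpy as np

def pearson_r(x, y) -> float:
    """Pearson correlation between a per-sample feature vector and labels.

    With 0/1 labels this equals the point-biserial correlation, i.e. the
    quantity shown in a metric-vs-label correlation heatmap.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()  # center both variables
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))
```

For instance, a feature that increases monotonically with quality yields r close to 1, while an uninformative feature yields r near 0, matching the interpretation of the heatmap values above.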
Figure 6b presents the confusion matrix illustrating the agreement between manual annotations and algorithmic predictions. The method achieved high discriminative accuracy for both “good” and “bad” samples: among the samples manually classified as “good,” 148 cases were consistent with the algorithm-based classification and only 2 were misclassified; among the samples manually classified as “bad,” 146 were correctly classified and 4 were misclassified. The overall agreement reached 98%, indicating that the evaluation system constructed from MEM-ViT segmentation results and multi-indicator fusion exhibited high reliability and stable discriminative performance in distinguishing high-quality from low-quality pigmented epidermis-on-a-chip models. These findings further support the feasibility and scientific validity of the multi-metric feature fusion-based quantitative evaluation approach for automated analysis of brightfield images.
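The 98% figure follows directly from the counts in Figure 6b: (148 + 146) / (148 + 2 + 146 + 4) = 294/300 = 0.98. A one-line check (the 2 × 2 matrix layout and the function name are assumptions for illustration):

```python
def overall_agreement(cm) -> float:
    """Fraction of samples where algorithmic and manual labels agree.

    cm: 2x2 nested list [[good_agree, good_mis], [bad_mis, bad_agree]],
    following the counts reported in Figure 6b.
    """
    correct = cm[0][0] + cm[1][1]          # diagonal = agreements
    total = sum(sum(row) for row in cm)    # all annotated samples
    return correct / total

agreement = overall_agreement([[148, 2], [4, 146]])  # 294/300 = 0.98
```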
4. Discussion
In this study, an AI-assisted quantitative evaluation framework was established for a pigmented epidermis-on-a-chip model to objectively assess melanin distribution under bright-field imaging. By integrating a 3D pigmented epidermal culture with an automated image-analysis pipeline, the system can classify overall pigmentation outcomes and simultaneously extract multiple quantitative parameters describing the area, intensity and spatial distribution of melanin. The AI-based classification reached 98% agreement with expert annotations, supporting its reliability as an objective readout for pigmentation in epidermis-on-a-chip platforms and laying the groundwork for subsequent applications such as efficacy testing and mechanistic studies.
Previous studies evaluating pigmentation in skin equivalents or reconstructed epidermal models mainly relied on qualitative or semi-quantitative approaches such as gross visual inspection, histological staining (e.g., Fontana–Masson) and immunohistochemistry or immunofluorescence of melanogenesis-related markers (Hall et al., 2022; Nissan et al., 2011). These methods require tissue processing and sectioning, are time-consuming and costly, and are intrinsically destructive, which precludes longitudinal assessment on the same sample and limits their suitability for high-throughput applications (Benito-Martínez et al., 2020). Even with the introduction of digital image analysis, it is often limited to simple readings, such as the percentage of pixels corresponding to melanin in manually selected epidermal regions or the average intensity of user-defined areas. This leads to considerable inter-operator variability and challenges in standardizing quantitative criteria across laboratories (Miot et al., 2012; Pena et al., 2022). By contrast, our AI-based framework operates directly on bright-field images of intact pigmented epidermis-on-a-chip, eliminating the need for additional labeling or invasive sampling and thus enabling non-invasive, low-cost, and scalable assessment of melanin distribution. Compared with these traditional workflows, the automated pipeline markedly improves analytical efficiency, reduces dependence on subjective human scoring, and facilitates reproducible, standardized evaluation of pigmentation for large-scale experiments and screening studies.
Another advantage of our approach is that it provides a multidimensional quantitative description of pigmentation rather than a single aggregated score. From bright-field images, the model outputs area-related, optical-density and spatial-distribution features of melanin, enabling a more nuanced and biologically informative characterization of pigmentation patterns (Benito-Martínez et al., 2020; Pena et al., 2022). Compared with AI methods that primarily generate segmentation masks or global hyperpigmentation severity scores and typically report only standard metrics such as IoU, DSC or accuracy (Kojima et al., 2021; Zhang et al., 2025; Draelos et al., 2025), our ViT-based segmentation model, when benchmarked against U-Net, UNet++, SegFormer and Mask R-CNN, achieved the highest IoU, DSC and pixel-level accuracy on both the training and independent test sets, supporting its robustness and generalizability for automated analysis of pigmented epidermis-on-a-chip images (Ronneberger et al., 2015; Zhou et al., 2020; Xie et al., 2021; Johnson, 2018).
This study still has several limitations, as the framework was trained on a relatively small dataset and relies only on static endpoint images. Future work will expand data diversity and incorporate dynamic time-series modeling by combining imaging features with temporal changes in melanin production and redistribution under different stimuli to more closely approximate the native skin microenvironment. Such a spatiotemporal extension would further increase the utility of the model for drug screening, studies of skin-whitening mechanisms and the assessment of pathological pigmentation states.
5. Conclusion
In summary, the proposed AI-driven quantitative evaluation framework demonstrates high accuracy, repeatability, and biological interpretability in the brightfield image analysis of three-dimensional pigmented epidermis-on-a-chip models. By combining semantic segmentation with multi-metric feature fusion, this approach enables an objective and standardized assessment of melanin distribution. Future studies will further extend this framework by incorporating time-series analysis to achieve dynamic monitoring of melanogenesis and melanin metabolism. This is expected to provide new technical support for drug efficacy evaluation and studies of skin-whitening mechanisms.
Funding Statement
The author(s) declared that financial support was received for this work and/or its publication. This work was supported by the Science and Technology Project of Jiangsu Province (Grant No. BK20232023), the Frontier Technology Research and Development Program of Jiangsu Province (Grant No. BF2024074), and the Open Research Fund of Southeast University and Jiangsu Province Hospital (Grant No. 2024-K02).
Footnotes
Edited by: Stephanie J. Hachey, University of California, Irvine, United States
Reviewed by: Kui Yang, Wuhan University, China
Linh Vuong, University of California, Irvine, United States
Data availability statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/supplementary material.
Ethics statement
The studies involving human participants were reviewed and approved by the Medical Ethics Committee of Suzhou University Affiliated Children’s Hospital (approval No. 2022CS163). Written informed consent to participate in this study was provided by the participants’ legal guardians.
Author contributions
YY: Writing – review and editing, Writing – original draft. XD: Writing – original draft. YaL: Writing – original draft. YM: Writing – original draft. YuL: Writing – original draft. ZZ: Writing – review and editing. BS: Writing – review and editing. XL: Writing – review and editing. JZ: Writing – review and editing. JO: Writing – review and editing. NS: Writing – original draft, Writing – review and editing. NY: Writing – review and editing, Writing – original draft. QH: Writing – review and editing, Writing – original draft. ZG: Writing – original draft, Writing – review and editing. ZC: Writing – original draft, Writing – review and editing.
Conflict of interest
The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that generative AI was not used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
- Bass J. J., Wilkinson D. J., Rankin D., Phillips B. E., Szewczyk N. J., Smith K., et al. (2017). An overview of technical considerations for Western blotting applications to physiological research. Scand. J. Med. Sci. Sports 27 (1), 4–25. 10.1111/sms.12702 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Benito-Martínez S., Zhu Y., Jani R. A., Harper D. C., Marks M. S., Delevoye C. (2020). Research techniques made simple: cell biology methods for the analysis of pigmentation. J. Invest. Dermatol. 140 (2), 257–268.e8. 10.1016/j.jid.2019.12.002 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Bento-Lopes L., Cabaço L. C., Charneca J., Neto M. V., Seabra M. C., Barral D. C. (2023). Melanin's journey from melanocytes to keratinocytes: uncovering the molecular mechanisms of melanin transfer and processing. Int. J. Mol. Sci. 24 (14), 11289. 10.3390/ijms241411289 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cao H., Wang Y., Chen J., Jiang D., Zhang X., Tian Q., et al. (2022). “Swin-unet: Unet-like pure transformer for medical image segmentation,” in Computer Vision – ECCV 2022 Workshops (Cham: Springer; ), 205–218. 10.1007/978-3-031-25066-8_9 [DOI] [Google Scholar]
- Chen J., Lu Y., Yu Q., Luo X., Adeli E., Wang Y., et al. (2021). Transunet: transformers make strong encoders for medical image segmentation. arXiv Preprint arXiv:2102.04306. 10.48550/arXiv.2102.04306 [DOI] [Google Scholar]
- Cheng B., Girshick R. B., Dollár P., Berg A. C., Kirillov A. (2021). “Boundary IoU: improving object-centric image segmentation evaluation,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 15334–15342. 10.1109/CVPR46437.2021.01508 [DOI] [Google Scholar]
- Correia M. S., Moreiras H., Pereira F. J. C., Neto M. V., Festas T. C., Tarafder A. K., et al. (2018). Melanin transferred to keratinocytes resides in nondegradative endocytic compartments. J. Invest. Dermatol. 138 (3), 637–646. 10.1016/j.jid.2017.09.042 [DOI] [PubMed] [Google Scholar]
- Costa Gagosian V. S., Coronel R., Buss B. C., Dos Santos M. L. F., Liste I., Anta B., et al. (2025). In vitro skin models as non-animal methods for dermal drug development and safety assessment. Pharmaceutics 17 (10), 1342. 10.3390/pharmaceutics17101342 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dong Y., Liu Q., Du B., Zhang L. (2022). Weighted feature fusion of convolutional neural network and graph attention network for hyperspectral image classification, IEEE Trans. Image Process. 31, 1559–1572. 10.1109/TIP.2022.3144017 [DOI] [PubMed] [Google Scholar]
- Dosovitskiy A. (2020). An image is worth 16x16 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929. [Google Scholar]
- Draelos R. L., Kesty C. E., Kesty K. R. (2025). Artificial intelligence predicts fitzpatrick skin type, pigmentation, redness, and wrinkle severity from color photographs of the face. J. Cosmet. Dermatol 24 (4), e70050. 10.1111/jocd.70050 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gao G., Li C., Zhang X., Yao B., Chen Z. (2026). Mamba-CNN hybrid multi-scale ship detection Network driven by a dual-perception feature of Doppler and Scattering. ISPRS 232, 656–674. 10.1016/j.isprsjprs.2026.01.004 [DOI] [Google Scholar]
- Hall M. J., Lopes-Ventura S., Neto M. V., Charneca J., Zoio P., Seabra M. C., et al. (2022). Reconstructed human pigmented skin/epidermis models achieve epidermal pigmentation through melanocore transfer. Pigment. Cell Melanoma Res. 35 (4), 425–435. 10.1111/pcmr.13039 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Han K., Wang Y., Chen H., Chen X., Guo J., Liu Z., et al. (2023). A Survey on vision transformer. IEEE Trans. Pattern Anal. Mach. Intell. 45 (1), 87–110. 10.1109/TPAMI.2022.3152247 [DOI] [PubMed] [Google Scholar]
- Holail S., Saleh T., Xiao X., Zahran M., Xia G. S., Li D. (2025). Edge-CVT: edge-Informed CNN and vision transformer for building change detection in satellite imagery. ISPRS J. Photogrammetry Remote Sens. 227, 48–68. 10.1016/j.isprsjprs.2025.05.021 [DOI] [Google Scholar]
- Johnson J. W. (2018). Adapting mask-rcnn for automatic nucleus segmentation. arXiv Preprint arXiv:1805.00500. 10.48550/arXiv.1805.00500 [DOI] [Google Scholar]
- Kojima K., Shido K., Tamiya G., Yamasaki K., Kinoshita K., Aiba S. (2021). Facial UV photo imaging for skin pigmentation assessment using conditional generative adversarial networks. Sci. Rep. 11 (1), 1213. 10.1038/s41598-020-79995-4 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lei T. C., Hearing V. J. (2020). Deciphering skin re-pigmentation patterns in vitiligo: an update on the cellular and molecular events involved. Chin. Med. J. Engl. 133 (10), 1231–1238. 10.1097/CM9.0000000000000794 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li Q., Wang C., Li X., Zhang J., Zhang Z., Yang K., et al. (2023). Epidermis-on-a-chip system to develop skin barrier and melanin mimicking model. J. Tissue Eng. 14, 20417314231168529. 10.1177/20417314231168529 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Li Q., Jia X., Zhou J., Shen L., Duan J. (2024). Rediscovering bce loss for uniform classification. arXiv preprint arXiv:2403.07289. 10.48550/arXiv.2403.07289 [DOI] [Google Scholar]
- Liu Q., Dong Y., Zhang Y., Luo H. (2022). A fast dynamic graph convolutional network and CNN parallel network for hyperspectral image classification. IEEE Trans. Geoscience Remote Sens. 60, 1–15. 10.1109/tgrs.2022.3179419 [DOI] [Google Scholar]
- Miao F., Wan J., Zhou Y., Shi Y. (2025). Unraveling melasma: from epidermal pigmentation to microenvironmental dysregulation. Biol. (Basel) 14 (10), 1402. 10.3390/biology14101402 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Minaee S., Boykov Y., Porikli F., Plaza A. J., Kehtarnavaz N., Terzopoulos D., et al. (2021). Image segmentation using deep learning: a survey. IEEE Transactions Pattern Analysis Machine Intelligence 44 (7), 3523–3542. 10.1109/TPAMI.2021.3059968 [DOI] [PubMed] [Google Scholar]
- Miot H., Brianezi G., Tamega A. d. A., Miot L. D. B. (2012). Techniques of digital image analysis for histological quantification of melanin. An. Bras. Dermatol. 87, 608–611. 10.1590/s0365-05962012000400014 [DOI] [PubMed] [Google Scholar]
- Nissan X., Larribere L., Saidani M., Hurbain I., Delevoye C., Feteira J., et al. (2011). Functional melanocytes derived from human pluripotent stem cells engraft into pluristratified epidermis. Proc. Natl. Acad. Sci. U. S. A. 108 (36), 14861–14866. 10.1073/pnas.1019070108 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Park T., Kim T. K., Han Y. D., Kim K. A., Kim H., Kim H. S. (2023). Development of a deep learning based image processing tool for enhanced organoid analysis. Sci. Rep. 13 (1), 19841. 10.1038/s41598-023-46485-2 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pena A.-M., Decencière E., Brizion S., Sextius P., Koudoro S., Baldeweck T., et al. (2022). In vivo melanin 3D quantification and z-epidermal distribution by multiphoton FLIM, phasor and Pseudo-FLIM analyses. Sci. Rep. 12 (1), 1642. 10.1038/s41598-021-03114-0 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ronneberger O., Fischer P., Brox T. (2015). “U-net: convolutional networks for biomedical image segmentation,” in Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015 (Cham: Springer; ), 234–241. 10.1007/978-3-319-24574-4_28 [DOI] [Google Scholar]
- Schmitz R., Madesta F., Nielsen M., Krause J., Steurer S., Werner R., et al. (2021). Multi-scale fully convolutional neural networks for histopathology image segmentation: from nuclear aberrations to the global tissue architecture. Med. Image Anal. 70, 101996. 10.1016/j.media.2021.101996 [DOI] [PubMed] [Google Scholar]
- Seiberg M. (2001). Keratinocyte-melanocyte interactions during melanosome transfer. Pigment. Cell Res. 14 (4), 236–242. 10.1034/j.1600-0749.2001.140402.x [DOI] [PubMed] [Google Scholar]
- Shen C., Lamba A., Zhu M., Zhang R., Zernicka-Goetz M., Yang C. (2022). Stain-free detection of embryo polarization using deep learning. Sci. Rep. 12 (1), 2404. 10.1038/s41598-022-05990-6 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Shi G., Feng Y., Tonissen K. F. (2024). Development of a human tyrosinase activity inhibition assay using human melanoma cell lysate. Biotechniques 76 (11), 547–551. 10.1080/07366205.2024.2441637 [DOI] [PubMed] [Google Scholar]
- Srivastava N., Hinton G. E., Krizhevsky A., Sutskever I., Salakhutdinov R. (2014). Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15 (1), 1929–1958. [Google Scholar]
- Tada A., Suzuki I., Im S., Davis M. B., Cornelius J., Babcock G., et al. (1998). Endothelin-1 is a paracrine growth factor that modulates melanogenesis of human melanocytes and participates in their responses to ultraviolet radiation. Cell Growth Differ. 9 (7), 575–584. [PubMed] [Google Scholar]
- Wang C., Yang K., Yang W., Qiang H., Xue H., Lu B., et al. (2023). R-MFNet: analysis of urban carbon stock change against the background of land-use change based on a residual multi-module fusion network. Remote Sens. (Basel). 15 (11), 2823. 10.3390/rs15112823 [DOI] [Google Scholar]
- Wang J., Ai T., Wu H., Xu H., Xiao T., Li G. (2024). Graph-based spatial co-location pattern mining: integrate geospatial analysis and logical reasoning. Int. J. Digital Earth 17 (1), 2390434. 10.1080/17538947.2024.2390434 [DOI] [Google Scholar]
- Wu Y., Cheng M., Huang S., Pei Z., Zuo Y., Liu J., et al. (2022). Recent advances of deep learning for computational histopathology: principles and applications. Cancers (Basel) 14 (5), 1199. 10.3390/cancers14051199 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Xie E., Wang W., Yu Z., Anandkumar A., Álvarez J. M., Luo P. (2021). SegFormer: simple and efficient design for semantic segmentation with transformers. Adv. Neural Information Processing Systems 34, 12077–12090. 10.48550/arXiv.2105.15203 [DOI] [Google Scholar]
- Yang K., Cui D., Wang C., Tang Q., Miao L. (2025). Intelligent assessment of habitat quality based on multiple machine learning fusion methods. Eng. Appl. Artif. Intell. 162, 112395. 10.1016/j.engappai.2025.112395 [DOI] [Google Scholar]
- Yang K., Cui D., Zhan C. (2026). Spatial distribution of bird diversity sensitivity to air pollutants. J. Clean. Prod. 542, 147616. 10.1016/j.jclepro.2026.147616 [DOI] [Google Scholar]
- Zhan C., Yang K. (2025). WCMamba: enhancing high-resolution remote sensing image semantic segmentation with pyramid wavelet convolution and SS2D. Knowledge-Based Syst. 324, 113877. 10.1016/j.knosys.2025.113877 [DOI] [Google Scholar]
- Zhang J., Jiang Q., Chen Q., Hu B., Chen L. (2025). Deep learning-based multiclass framework for real-time melasma severity classification: clinical image analysis and model interpretability evaluation. Clin. Cosmet. Investig. Dermatol 18, 1033–1044. 10.2147/CCID.S508580 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zhao R., Qian B., Zhang X., Li Y., Wei R., Liu Y., et al. (2020). “Rethinking dice loss for medical image segmentation,” in 2020 IEEE international conference on data mining (ICDM) (Piscataway, NJ: IEEE; ), 851–860. 10.1109/ICDM50108.2020.00094 [DOI] [Google Scholar]
- Zheng X., Huan L., Xia G. S., Gong J. (2020). Parsing very high resolution urban scene images by learning deep ConvNets with edge-aware loss. ISPRS J. Photogrammetry Remote Sens. 170, 15–28. 10.1016/j.isprsjprs.2020.09.019 [DOI] [Google Scholar]
- Zhou Z., Siddiquee M. M. R., Tajbakhsh N., Liang J. (2020). UNet++: redesigning skip connections to exploit multiscale features in image segmentation. IEEE Trans. Med. Imaging 39 (6), 1856–1867. 10.1109/TMI.2019.2959609 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Zou K. H., Warfield S. K., Bharatha A., Tempany C. M. C., Kaus M. R., Haker S. J., et al. (2004). Statistical validation of image segmentation quality based on a spatial overlap index1: scientific reports. Acad. Radiol. 11 (2), 178–189. 10.1016/s1076-6332(03)00671-8 [DOI] [PMC free article] [PubMed] [Google Scholar]