BMC Biology. 2025 Oct 21;23:313. doi: 10.1186/s12915-025-02411-8

A knowledge-driven deep learning framework for organoid morphological segmentation and characterization

Yiming Qin 1,2, Jiajia Li 3, Yin Heng 2, Zheyuan Wang 4, Dezhi Wu 4, Mahi Rahman 2, Pengwei Hu 5, Tobias Plötz 2, Alexander Hopp 2, Nicholas Kurniawan 6, Mathias Winkel 2, Philipp Harbach 2, Chunling Tang 7, Feng Tan 2
PMCID: PMC12538981  PMID: 41121276

Abstract

Background

Organoids have great potential to revolutionize various aspects of biomedical research and healthcare. Researchers typically use the fluorescence-based approach to analyse their dynamics, which requires specialized equipment and may interfere with their growth. Therefore, it is an open challenge to develop a general framework to analyse organoid dynamics under non-invasive and low-resource settings.

Results

In this paper, we present a knowledge-driven deep learning system named TransOrga-plus to automatically analyse organoid dynamics in a non-invasive manner. Given a bright-field microscopic image, TransOrga-plus detects organoids through a multi-modal transformer-based segmentation module. To provide customized and robust organoid analysis, a biological knowledge-driven branch is embedded into the segmentation module, integrating biological knowledge, e.g. the morphological characteristics of organoids, into the analysis process. Then, based on the detection results, a lightweight multi-object tracking module that decouples visual and identity features tracks organoids over time. Finally, TransOrga-plus outputs the dynamics analysis to assist biologists in further research. To train and validate our framework, we curate a large-scale organoid dataset encompassing diverse tissue types and various microscopic imaging settings. Extensive experimental results demonstrate that our method outperforms all baselines in organoid analysis. The results show that TransOrga-plus provides analytical results comparable to those of biologists and significantly accelerates the organoid workflow.

Conclusions

In conclusion, TransOrga-plus integrates biological expertise with a cutting-edge deep learning model and enables the non-invasive analysis of various organoids in complex, low-resource, and time-lapse settings.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12915-025-02411-8.

Keywords: Organoid, Deep learning, Knowledge-driven

Background

Organoids are three-dimensional structures that mimic the architecture and function of organs in a miniaturized and simplified form [1]. They are derived from stem cells or tissue samples and cultured in vitro, providing a more physiologically relevant model compared to traditional two-dimensional cell cultures. In biomedical [2–7] and healthcare research [8–11], organoids are beginning to demonstrate their revolutionary potential. For example, they can be used to model human diseases more accurately than traditional cell cultures, allowing researchers to better understand disease mechanisms and identify potential treatments.

To cultivate organoids, researchers are investigating bio-engineering approaches for fast and stable in vitro culturing. One key aspect of bio-engineering organoids is the ability to longitudinally analyse their growth dynamics, such as organoid cell morphology, population distribution, and time-course variance. However, researchers and practitioners still face several challenges when performing this analytical work. First, organoids grow in complex culture media with interference factors, such as air bubbles and nutritional debris. These interference factors are not static and change over time. Second, organoid dynamics is an integrated process characterized by morphological heterogeneity. Different types of organoids exhibit diverse morphologies, and even within the same type, their morphology can vary over time. Additionally, connections and overlaps between organoid cells during growth further complicate the analysis of organoid dynamics.

Existing methods typically use fluorescence-stained microscopic images or genetic modification [11–15] to conduct organoid dynamics analysis. Researchers first stain the organoids with specific fluorescent dyes that help differentiate organoid cells from the culture medium. They then label the organoids, either manually or with computer assistance, and empirically set microscopic parameters to conduct growth analysis. However, these methods are invasive and highly resource intensive. Some fluorescence dye-based approaches may disrupt the intrinsic cellular dynamics of the original samples [16, 17] or induce cumulative toxicity due to prolonged culture periods and restricted diffusion within the hydrogel matrix [18]. Moreover, these methods require the additional purchase of fluorescent dyes and professional staining, resulting in high resource overhead. Therefore, there is a growing need for an automatic, non-invasive, and low-resource approach [19–23]. Previous researchers proposed non-invasive and low-resource approaches using bright-field or phase-contrast microscopic images [24–29]. Furthermore, specialized imaging platforms have been developed for label-free, non-invasive assessment of cellular viability [30–33]. However, these methods commonly suffer from limited robustness and generalizability. First, compared to fluorescent images, bright-field or phase-contrast microscopic images lack colour and texture context, which poses a challenge to the robustness of the image analysis. Second, current deep learning-based methods depend heavily on large numbers of organoid samples for training and lack domain knowledge, resulting in poor generalizability. In real-world practice, however, organoid samples are limited, and biologists do not rely solely on images. Instead, they consider both the images and biological knowledge, such as the tissue type and culture medium components, to make analytical decisions. Additionally, the design of existing methods still has inherent flaws for organoid dynamics analysis. For instance, the pyramid structure used for feature extraction in classical deep learning-based recognition methods can result in unavoidable information loss and consequently inaccurate organoid detection [28].

To address these gaps, we propose a knowledge-driven deep learning framework, named TransOrga-plus, for organoid dynamics analysis. Given the bright-field microscopic image and the biological knowledge provided by scientists, TransOrga-plus automatically detects, tracks, and analyses cellular dynamics. Our model mainly contains three modules: a multi-modal segmentation module with an embedded biological knowledge-driven branch, a tracking module, and an analysis module. Unlike traditional hand-crafted formulas, the term biological knowledge in our work specifically refers to image-based morphological characteristics of organoid-derived cells as recognized by domain experts, such as shape, size, texture, edge contrast, and compactness. These features, grounded in biological expertise, are used to differentiate meaningful organoid-derived cells in bright-field microscopy images. The biological knowledge-driven branch extracts features from the biological knowledge provided by the user and then fuses them with the whole-image features to guide the analysis. To accurately detect organoids, we develop a multi-modal transformer-based model that uses frequency domain features to provide morphological clues and spatial domain features to provide visual clues. The tracking module decouples the identity features and visual features of organoids for lightweight multi-organoid tracking. Finally, the analysis module outputs single-organoid analysis, bulk analysis, and time-course analysis. Extensive experimental results demonstrate that TransOrga-plus outperforms all baselines in organoid dynamics analysis and achieves performance comparable to manual analysis. Additionally, TransOrga-plus successfully completes customized organoid dynamics analysis based on feedback from biologists.

Specifically, our novelties and contributions can be summarized as follows:

  1. We proposed TransOrga-plus, a knowledge-driven deep learning framework for robust and customizable organoid dynamics analysis. The framework integrates detection, tracking, and human-in-the-loop feedback to adapt to diverse experimental conditions and morphological variability.

  2. We designed a biological knowledge-driven module that incorporates domain-specific prior knowledge, interactively or explicitly, into the learning process. This module guides detection and tracking with interpretable biological constraints and improves generalizability with limited data by reducing reliance on extensive annotations.

  3. To train and validate our learning framework, we curated a large-scale dataset from both internal and external scenarios. This dataset covers a wide range of tissue types at different maturity phases. Extensive experimental results show that our framework not only outperforms all baselines in detection and tracking accuracy but also generalizes to various tissue types.

Results

Overall framework

The overview of TransOrga-plus is shown in Fig. 1a. Given the bright-field microscopic image and biological knowledge, TransOrga-plus automatically analyses organoids. TransOrga-plus mainly contains three modules: a multi-modal segmentation module with an embedded biological knowledge-driven branch, a lightweight tracking module, and an analysis module. The multi-modal segmentation module uses visual and frequency domain clues to detect organoids in bright-field microscopic images. Using the biological knowledge-driven branch, we integrate user-provided biological knowledge into the model and guide it to generate personalized analysis results. The lightweight tracking module is designed to handle high-throughput organoid data. The analysis module uses the detection and tracking results to complete organoid dynamics analysis. To train and validate our model, we build a large-scale dataset that contains different organoid types. The collection of our dataset is shown in Fig. 1b and c.

Fig. 1.

The overview of TransOrga-plus and our dataset. a The overview of TransOrga-plus. Given bright-field microscopic images and the biological knowledge provided by scientists, TransOrga-plus automatically and interactively detects, tracks, and analyses organoids. b Dataset. Our dataset includes OrganoID, TU/e, and Merck. c Dataset annotation. A hybrid approach is used to annotate our dataset efficiently. Biologists first manually select and annotate a sub-dataset. TransOrga-plus is trained on the sub-dataset. Using the trained TransOrga-plus, biologists obtain coarse masks of the remaining samples and then correct them to produce final annotations

Large-scale organoid dataset

To train and evaluate the proposed framework, we curated a large-scale dataset by following these procedures, as shown in Fig. 1b. The raw data came from the OrganoID [28], Eindhoven University of Technology (TU/e) [34], and Merck (Table 1). Our large-scale dataset comprises 1153 bright-field microscopic images, encompassing a diverse array of organoid types and various imaging settings. The organoid types include salivary adenoid cystic carcinoma (ACC), colon epithelia (Colon), lung epithelia (Lung), pancreatic ductal adenocarcinoma (PDAC), and mammary (Mammary). Regarding image settings, we have 862 images with a resolution of 512 × 512 pixels and 291 high-resolution images (1024 × 1024 pixels or higher). Additionally, our dataset features 42 microscopic image sequences, each capturing a 92-h growth process of organoids. For data labelling, we adopted a hybrid annotation approach which contains manual annotation and annotation with TransOrga-plus assistance (Fig. 1c). The detailed data curation process is described in the “Methods” section.

Table 1.

The curated dataset. Our dataset includes samples from OrganoID, TU/e, and Merck. It contains diverse types of organoids and image settings

| Source | Total samples | High-resolution samples | Sequence samples | Tissue types |
|---|---|---|---|---|
| OrganoID | 66 | 0 | 42 | Colon, Lung, PDAC, ACC |
| TU/e | 1074 | 278 | 0 | Lung, Mammary |
| Merck | 13 | 13 | 0 | Lung |

Accurate, robust, and generalizable organoid detection

We conduct comparative experiments on our dataset to investigate whether TransOrga-plus detects complex organoids in microscopic images accurately and robustly and whether it generalizes across tissue types. We benchmark TransOrga-plus against the following state-of-the-art methods: SegNet [35], A-Unet [36], StarDist [37], CellPose [38], ilastik [18], and OrganoID [28]. The training and validation datasets contain different tissue types of organoids, including ACC, Colon, Lung, and PDAC.

The Dice coefficient measures the overlap between the predicted segmentation mask and the ground-truth mask. Mean Intersection over Union (mIoU) evaluates performance across the negative (background) and positive (organoid) classes in segmentation tasks, with 1 indicating perfect segmentation. Precision measures the proportion of correctly predicted positive (organoid) pixels out of all predicted positive pixels. Recall measures the proportion of correctly predicted positive pixels out of all actual positive pixels. The F1-score is the harmonic mean of Precision and Recall; a high F1-score indicates a model with both high precision and high recall, which matters when both false positives and false negatives affect segmentation accuracy. As shown in Table 2 and Fig. 2a–d, compared with baselines, TransOrga-plus achieves Dice 0.919 ± 0.02, mIoU 0.851 ± 0.04, precision 0.819 ± 0.07, recall 0.904 ± 0.01, and F1-score 0.856 ± 0.04 (all P < 0.001). These are the highest values in Table 2 for every metric except recall, where A-Unet (0.931 ± 0.04) and CellPose (0.913 ± 0.01) score higher. The qualitative results are shown in Fig. 3. The baseline methods produce false and broken detections (red arrows) due to interference from bubbles and debris in the medium, whereas our approach is resistant to such interference. The baseline methods also lack constraints that encourage plausible segmentation shapes, resulting in irregular and fragmented outputs (red arrows).
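The metric definitions above can be sketched for a single pair of binary masks as follows. This is a minimal illustration and not the authors' evaluation code: the function name `segmentation_metrics`, the zero-denominator convention, and the per-image computation are assumptions, and the way the paper aggregates scores across images is not specified here. Note that on one pair of binary masks the pixel-level F1-score and the Dice coefficient coincide.

```python
import numpy as np


def segmentation_metrics(pred, truth):
    """Pixel-level metrics for one pair of binary masks (1 = organoid, 0 = background)."""
    pred = np.asarray(pred).astype(bool)
    truth = np.asarray(truth).astype(bool)

    tp = int(np.sum(pred & truth))    # organoid pixels predicted as organoid
    fp = int(np.sum(pred & ~truth))   # background pixels predicted as organoid
    fn = int(np.sum(~pred & truth))   # organoid pixels predicted as background
    tn = int(np.sum(~pred & ~truth))  # background pixels predicted as background

    def ratio(num, den):
        # Assumed convention: an empty denominator scores 0.0.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    iou_organoid = ratio(tp, tp + fp + fn)
    iou_background = ratio(tn, tn + fp + fn)
    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "miou": (iou_organoid + iou_background) / 2,  # mean over the two classes
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Toy example: the prediction misses one column of a 2 x 3 organoid.
truth = np.zeros((4, 4), dtype=int)
truth[:2, :3] = 1
pred = np.zeros((4, 4), dtype=int)
pred[:2, :2] = 1
scores = segmentation_metrics(pred, truth)
```

In this toy case the prediction has no false positives, so precision is 1.0 while recall is 4/6, which illustrates why the two are reported separately.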

Table 2.

Quantitative results of baselines and our method on different tissues. We compute the Dice, mIoU, Precision, Recall, and F1-score of different methods on various types of organoids. The upward arrow indicates that a higher score is better. We highlight the best results using bold

| Model | Dice ↑ | mIoU ↑ | Precision ↑ | Recall ↑ | F1-score ↑ |
|---|---|---|---|---|---|
| SegNet [35] | 0.853 ± 0.04 | 0.746 ± 0.06 | 0.741 ± 0.13 | 0.789 ± 0.05 | 0.738 ± 0.08 |
| A-Unet [36] | 0.899 ± 0.03 | 0.818 ± 0.06 | 0.743 ± 0.11 | 0.931 ± 0.04 | 0.818 ± 0.07 |
| OrganoID [28] | 0.871 ± 0.03 | 0.773 ± 0.04 | 0.698 ± 0.07 | 0.872 ± 0.05 | 0.768 ± 0.06 |
| StarDist [37] | 0.791 ± 0.01 | 0.654 ± 0.01 | 0.721 ± 0.02 | 0.876 ± 0.01 | 0.791 ± 0.02 |
| CellPose [38] | 0.754 ± 0.01 | 0.605 ± 0.01 | 0.642 ± 0.02 | 0.913 ± 0.01 | 0.754 ± 0.03 |
| ilastik [18] | 0.767 ± 0.01 | 0.622 ± 0.02 | 0.728 ± 0.01 | 0.810 ± 0.01 | 0.767 ± 0.01 |
| Ours | 0.919 ± 0.02 | 0.851 ± 0.04 | 0.819 ± 0.07 | 0.904 ± 0.01 | 0.856 ± 0.04 |

Fig. 2.

Results of organoid morphological segmentation and characterization. a The Dice score of SOTA methods and our method. b The mIoU score of SOTA methods and our method. c The Precision score of SOTA methods and our method. d The F1-score of SOTA methods and our method. The analysis used a paired Student t test. e Tracking accuracy. We calculate the tracking accuracy of TransOrga-plus against manual tracking. f Organoid count. The x-axis indicates the number counted by biologists and the y-axis indicates the number counted by our method. g Organoid area calculation. The x-axis indicates the area calculated by biologists, and the y-axis indicates the area calculated by our method. h Single organoid area curve. We recorded the area change over time of five organoid cells using TransOrga-plus

Fig. 3.

Comparison results of organoid detection. We compared our method with baselines on different types of organoids. Our method achieves the best results. We highlight the failure cases using red arrows

Regarding scalability to high-resolution images, we evaluate model performance on high-resolution microscopic images (1024 × 1024 pixels or higher), which also tests the scalability of each model's architecture. Additional file 1: Fig. S1 shows that the state-of-the-art (SOTA) OrganoID suffers from performance degradation, whereas TransOrga-plus maintains robust detection performance on high-resolution images.

To evaluate the effectiveness of our proposed modules, we also conducted an ablation study. In the ablation experiments, we remove the proposed modules one by one and retrain the model lacking the module to get the results. The ablation study results shown in Table 3 indicate that all proposed modules positively impact organoid recognition.

Table 3.

Quantitative Results of Ablation Study. Our proposed components contribute significantly to model performance. The upward arrow indicates that a higher score is better. We highlight the best results using bold

| Model | Dice ↑ | mIoU ↑ | Precision ↑ | Recall ↑ | F1-score ↑ |
|---|---|---|---|---|---|
| Ours without multi-modal | 0.879 ± 0.03 | 0.789 ± 0.06 | 0.692 ± 0.04 | 0.916 ± 0.02 | 0.776 ± 0.03 |
| Ours without $L_{com}$ | 0.841 ± 0.03 | 0.711 ± 0.02 | 0.579 ± 0.10 | 0.930 ± 0.02 | 0.719 ± 0.06 |
| Ours without biological knowledge | 0.882 ± 0.03 | 0.823 ± 0.03 | 0.726 ± 0.06 | 0.890 ± 0.01 | 0.800 ± 0.03 |
| Ours | 0.919 ± 0.02 | 0.851 ± 0.04 | 0.819 ± 0.07 | 0.904 ± 0.01 | 0.856 ± 0.04 |

We further conducted an organoid type classification task across four categories: ACC, Colon, Lung, and PDAC. Our model demonstrated robust performance, achieving an average classification accuracy of 92.86% ± 2.28%, indicating its strong ability to distinguish between different organoid types based on morphological or structural features.

Consistent tracking of organoid growth and cellular viability analysis

We conduct tracking experiments to evaluate TransOrga-plus's ability to track organoid growth consistently. The microscopy image sequences were formed by taking snapshots of the organoids in the medium every 2 h over a period of 92 h. TransOrga-plus takes a microscopy image sequence as input and produces numerically labelled tracking results. Specialized biologists performed the organoid tracking task on the same microscopy image sequences to provide ground-truth labels. Tracking accuracy is defined as the fraction of identified organoids that are correctly matched at each time step [28]. We also performed tracking experiments at longer time intervals, e.g. 4, 6, and 12 h, by discarding intermediate frames.
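The tracking-accuracy definition and the frame-discarding protocol above can be illustrated with a short sketch. It assumes that predicted and manual track IDs have already been aligned to a common labelling and that a sequence includes a frame at 0 h; the function names and the dictionary-based representation are hypothetical, not the authors' implementation.

```python
def tracking_accuracy(predicted_ids, reference_ids):
    """Fraction of identified organoids whose track ID matches the manual reference.

    Both arguments describe ONE time step and map a detection key
    (hypothetical: any hashable identifier of a detected organoid)
    to its track ID.
    """
    identified = [key for key in predicted_ids if key in reference_ids]
    if not identified:
        return 0.0
    correct = sum(predicted_ids[key] == reference_ids[key] for key in identified)
    return correct / len(identified)


def subsample_frames(frames, base_interval_h=2, target_interval_h=4):
    """Emulate a longer imaging interval by discarding intermediate frames."""
    if target_interval_h % base_interval_h:
        raise ValueError("target interval must be a multiple of the base interval")
    return frames[:: target_interval_h // base_interval_h]


# Acquisition times in hours for one sequence, assuming a frame at 0 h.
times_h = list(range(0, 93, 2))
every_12_h = subsample_frames(times_h, target_interval_h=12)
```

Keeping the subsampling separate from the accuracy computation means the same per-step metric can be reused unchanged at every interval.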

We compared our method with the original segment-and-track anything (STA) model [40] and with STA fine-tuned on our dataset. The results, shown in Additional file 1: Fig. S2, indicate that our method successfully captures the growth trajectory of organoids at different time steps, demonstrating TransOrga-plus's advantage of integrating biological knowledge for long-term tracking. The original segment anything model (SAM) shows poor generalization on organoids. The tracking accuracy results are shown in Fig. 2e and range from 92.1 to 94.4%. The temporal consistency of the tracking results allows for the observation of dynamic phenotypic changes at the single-cell level, which have been associated with cellular viability [30]. A video demonstration of the tracking capabilities is also available on our official website (https://github.com/dev-csftan/TransOrga).

Performance improved by biological knowledge

To assess the benefits of the knowledge-driven module in TransOrga-plus, we conducted interactive experiments with emulated biologist feedback. In the interactive experiments, given a microscopy image, (1) TransOrga-plus outputs the initial detection result; (2) based on the initial detection result, the biologist provides specific knowledge, i.e. organoid morphology, through a bounding box; (3) conditioned on the extracted biological knowledge, TransOrga-plus outputs the refined detection result.

We validated the knowledge-driven process on various tissue types with different biological insights. We also utilized the detection metrics for quantitative analysis. As shown in Fig. 4, the biologist selects the organoid morphology in the red box as biological knowledge feedback for the TransOrga-plus. The model refines the detection results for organoid cells with similar morphologies and eliminates those with significant differences, as indicated by the red arrows. The results demonstrate that the knowledge-driven module can handle a diverse range of organoids under complex and incomplete situations. Quantitative metrics, as shown in Table 3, also show that fusing biological knowledge improves the detection performance.

Fig. 4.

Comparison results of organoid detection conditioned on biological knowledge (i.e. organoid morphology). We compared the results of our method with and without integrating biological knowledge. The yellow region represents the detection region. Error regions are highlighted with red arrows. The results indicate that biological information has a positive impact on detection outcomes

Automatic and accurate organoid analysis

TransOrga-plus can automatically analyse multiple organoid indicators through single-organoid analysis, bulk analysis, and time-course analysis. For single-organoid analysis, our method can accurately categorize different types of organoids (Fig. 3). For bulk analysis, we measure the organoid count and areas and compare the values obtained with our method to those obtained manually. Because the data sources use different scales, we measure area as the number of pixels occupied. The results are shown in Fig. 2f and g, where each black dot represents a sample. The x-axis indicates the results calculated by biologists, and the y-axis indicates the results calculated by our method. r² is 0.96 for organoid count and 0.92 for organoid area, which indicates that our method achieves results comparable to manual analysis. For time-course analysis, we record the area of single organoids over time. The results are shown in Fig. 2h and indicate that our method can efficiently track the area changes of different organoid cells.
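As an illustration of how agreement between manual and automatic measurements could be quantified, the sketch below computes r² as the squared Pearson correlation. This interpretation of r² is an assumption (the text does not say whether it is a squared correlation or the coefficient of determination of a fitted line), and the function name and toy values are hypothetical.

```python
from math import fsum


def r_squared(manual, automatic):
    """Squared Pearson correlation between paired manual and automatic measurements."""
    n = len(manual)
    if n != len(automatic) or n < 2:
        raise ValueError("need two equally long sequences with at least two samples")
    mean_x = fsum(manual) / n
    mean_y = fsum(automatic) / n
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(manual, automatic))
    sxx = fsum((x - mean_x) ** 2 for x in manual)
    syy = fsum((y - mean_y) ** 2 for y in automatic)
    if sxx == 0 or syy == 0:
        raise ValueError("r squared is undefined for constant measurements")
    return (sxy * sxy) / (sxx * syy)


# Toy organoid counts per image: biologist vs. automatic (made-up numbers).
manual_counts = [1, 2, 3, 4]
automatic_counts = [1, 3, 2, 4]
agreement = r_squared(manual_counts, automatic_counts)
```

The same function applies unchanged to areas measured in pixels, since the squared correlation is invariant to the measurement scale.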

Discussion

Organoid dynamics analysis holds significant importance in biomedical and healthcare research. Organoid data is high-throughput data containing a large number of organoid cells. Therefore, there is a critical need for an automatic organoid analysis approach. OrganoID previously introduced a deep learning method to extract organoid information from bright-field microscopic images of organoids. Nonetheless, organoid analysis remains challenging due to the intricate nature of organoids and the scarcity of adequate training samples. In response, we propose TransOrga-plus, a novel image-based and knowledge-driven system for organoid analysis. TransOrga-plus comprises four main modules: a knowledge-fusion-and-guidance module, a multi-modal segmentation module, a lightweight tracking module, and an analysis module. Our model is trained and validated on our large-scale organoid dataset encompassing diverse organoid types and varied image settings. To our knowledge, TransOrga-plus represents the pioneering application of a biological knowledge-driven deep learning approach to organoid analysis and provides an automatic, non-invasive, resource-efficient, and personalized organoid dynamics analysis tool.

Previous studies [24–28] have demonstrated the potential of deep learning methods for detecting organoids in bright-field images. However, these approaches are limited by their model structure and rely heavily on the size of the training dataset, leading to limited accuracy and poor generalization. In our study, we introduce a transformer architecture augmented with biological knowledge. The transformer effectively captures robust long-range visual features, preserving global context and feature integrity. Moreover, our method handles ultra-high-resolution images, significantly expanding its applicability. Incorporating biological knowledge embeds specific biological characteristics into the model, reducing its reliance on the dataset alone and thereby enhancing generalization.

Organoid dynamics analysis is crucial for advancing biomedical and healthcare research, including elucidating disease mechanisms and developing treatment strategies. Traditional approaches stain organoids with fluorescent dyes or use genetically modified fluorescent organoid cells for detection, tracking, and analysis. However, fluorescence dye-based approaches may disrupt the intrinsic cellular dynamics of the original samples [16, 17] or induce cumulative toxicity due to prolonged culture periods and restricted diffusion within the hydrogel matrix [18]. There is a pressing need for non-invasive methods that use bright-field imaging. However, compared to stained images, bright-field images often lack sufficient detail. In our study, we propose a multi-modal transformer-based segmentation module designed to detect organoids in bright-field microscopic images. This model integrates frequency domain information to capture morphological cues and spatial domain information to extract visual cues.

Cost-effectiveness plays a pivotal role in organoid research. Biomedical and healthcare studies often require iterative experimentation to achieve the desired outcomes, underscoring the need for economical approaches. In organoid dynamics analysis, methods that use fluorescent dyes incur not only the expense of the dyes but also the need for specialized staining expertise, significantly raising research costs. Our approach uses bright-field images without staining, thereby automating organoid dynamics analysis and markedly improving cost-effectiveness. Moreover, biologists can easily tailor TransOrga-plus through the proposed knowledge-fusion-and-guidance module by incorporating biological insights, facilitating personalized organoid dynamics analysis. This capability addresses specific experimental requirements more effectively than previous deep learning-based methods.

The dataset is a crucial limiting factor for organoid dynamic analysis research based on deep learning. In our study, we have curated a comprehensive dataset encompassing various types of organoid tissues and diverse image settings. Our dataset incorporates publicly available OrganoID data, along with proprietary data from TU/e and Merck. By providing this extensive dataset, we establish a robust foundation for advancing deep learning methods in organoid research, paving the way for future innovations in the field.

Our study has limitations. First, in the current version, biological knowledge is primarily provided through visual data; integrating other forms, such as mathematical formulations and natural language descriptions, warrants further investigation. Second, our current model exhibits reduced accuracy in detecting very small organoid cells, necessitating future modifications to the model architecture. Third, our dataset does not encompass all types of organoids. Fourth, our dataset does not contain intracellular structures, ground truth for organoid viability, or the 3D structure of organoids, prompting ongoing efforts to expand and diversify it. In addition, the ablation study shows that removing $L_{com}$ slightly increases detection recall, likely because $L_{com}$ enforces a shape prior that favours compact object structures, which may cause non-compact organoids to be partially missed or inadequately segmented, thereby lowering recall. In future work, we plan to explore adaptive regularization strategies that modulate compactness constraints based on local context or learned shape distributions. These steps are crucial for enhancing the comprehensiveness and efficacy of our approach in organoid research.

Conclusions

In conclusion, we have developed TransOrga-plus, a biological knowledge-driven deep learning system comprising a knowledge-fusion-and-guidance module, a multi-modal segmentation module, a lightweight tracking module, and an analysis module. TransOrga-plus represents a non-invasive, cost-effective, and personalized tool for analysing organoid dynamics. This system significantly enhances the efficiency of organoid dynamic analysis, thereby advancing research in organoid-related fields. Extensive experiments demonstrate that TransOrga-plus holds promise as a transformative solution for organoid culturing and research.

Methods

Data collection

Data sources

We collected our dataset from different sources, including OrganoID, TU/e, and Merck Corporation. OrganoID is a public dataset containing 66 organoid samples. Researchers at TU/e collected 1352 organoid samples. Researchers at Merck collected 13 organoid samples.

Data types

All organoid samples were recorded as RGB microscopic images. The general resolution is 512 × 512 pixels. We also collected high-resolution microscope images, all of which have resolutions exceeding 1024 × 1024 pixels. The ground-truth mask is a binary image with the same resolution as its associated microscopic image, where 1 represents the organoid and 0 represents the background.

Data labelling

Data labelling is very time-consuming and labour-intensive due to the complex and high-throughput organoids in each microscopic image. To mitigate this issue, we adopted a hybrid labelling method. First, we manually labelled 44 samples of various types of organoids as a sub-dataset. Second, we used the sub-dataset to train TransOrga-plus and employed various data augmentation techniques, e.g. crop, rotation, and contrast adjustment, to enhance generalization. Third, we used the trained TransOrga-plus to infer coarse masks for the samples without ground-truth masks. Finally, biologists corrected the coarse masks to produce the ground-truth masks.

Segmentation module

Input preprocessing

To address memory and computational constraints while preserving fine-grained morphological detail for high-density input and high-resolution input, we employed a sliding window (divide-and-conquer strategy) method. The original input image is tiled into overlapping 512 × 512 patches. Each patch is processed independently by the segmentation and tracking modules. The outputs are then stitched back together, using overlap-aware merging and confidence-based voting to ensure spatial coherence and avoid boundary artifacts. This strategy allows the model to maintain high segmentation fidelity while being computationally feasible on standard GPUs. Our current model is unable to reliably detect or segment organoid-derived cells that are smaller than the size of a single visual pixel in the input bright-field microscopy images. This limitation is primarily due to the resolution constraints of the imaging setup and the receptive field of the neural network.
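The tiling-and-stitching strategy described above can be sketched as follows. This is an illustrative simplification rather than the authors' implementation: the stride of 384 pixels, the plain averaging of overlapping probabilities (standing in for the overlap-aware merging and confidence-based voting), the 0.5 threshold, and the `predict_patch` callable are all assumptions.

```python
import numpy as np


def tile_starts(length, tile, stride):
    """Start offsets of tiles covering [0, length); the last tile sits flush with the edge."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)
    return starts


def predict_tiled(image, predict_patch, tile=512, stride=384, threshold=0.5):
    """Run a patch-level predictor over overlapping tiles and merge by averaging.

    `predict_patch` is a hypothetical callable that maps an image patch to a
    2D array of per-pixel organoid probabilities with the patch's height and width.
    """
    height, width = image.shape[:2]
    prob_sum = np.zeros((height, width), dtype=np.float64)
    hits = np.zeros((height, width), dtype=np.float64)
    for y in tile_starts(height, tile, stride):
        for x in tile_starts(width, tile, stride):
            patch = image[y:y + tile, x:x + tile]
            prob_sum[y:y + tile, x:x + tile] += predict_patch(patch)
            hits[y:y + tile, x:x + tile] += 1.0
    # Every pixel is covered by at least one tile, so `hits` is never zero.
    return (prob_sum / hits) >= threshold


# Stand-in predictor for demonstration only: thresholds pixel intensity.
demo_image = np.zeros((1024, 1024))
demo_image[100:200, 450:600] = 1.0
demo_mask = predict_tiled(demo_image, lambda p: (p > 0.5).astype(float))
```

Anchoring the final tile to the image edge keeps every tile at the full 512 × 512 size, at the cost of a larger overlap for the last row and column of tiles.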

Multi-modal encoder

On the one hand, adverse factors in the medium and the imaging equipment often introduce extra noise into microscopic images. On the other hand, microscopic images lack sufficient colour and texture context for organoids. To address these challenges, we introduce frequency domain features to augment the spatial domain features, as shown in Fig. 5. By employing the Fourier transform, we obtain frequency domain features that not only isolate noise but also capture image variations. Given the microscopic image $I_{tgt}$, we obtain the amplitude spectrum $A_{tgt}$ and the phase spectrum $P_{tgt}$ as follows,

Fig. 5.

The architecture of TransOrga-plus. TransOrga-plus mainly contains three modules: a multi-modal segmentation module with an embedded biological knowledge-driven branch, a tracking module, and an analysis module. Given a bright-field microscopic image or sequence, the multi-modal segmentation module recognizes the organoids in each image and represents them as an organoid mask. The biological knowledge-driven branch introduces biological knowledge feedback from biologists into the model to obtain customized results. Using the sequence of organoid images and segmentation results, the tracking module consistently tracks organoids over time. Based on the segmentation and tracking results, single-organoid analysis, bulk analysis, and time-course analysis are conducted

$$A_{tgt} = \mathcal{A}(\mathrm{FFT}(I_{tgt})) + e^{-10}, \qquad P_{tgt} = \mathcal{P}(\mathrm{FFT}(I_{tgt})), \tag{1}$$

where FFT denotes the Fourier transform, $\mathcal{A}(\cdot)$ and $\mathcal{P}(\cdot)$ are the amplitude and phase spectrum functions, respectively, and $e^{-10}$ is a constant. We then obtain the logarithmic amplitude $L_{tgt}$ from $A_{tgt}$. A mean filter with a kernel size of 3×3, a stride of 1, and a padding of 1 yields the mean spectrum $L'_{tgt}$ from $L_{tgt}$. The residual spectrum $R_{tgt}$ is computed as the difference between $L_{tgt}$ and $L'_{tgt}$, and an inverse Fourier transform of $R_{tgt}$ combined with the phase spectrum $P_{tgt}$ yields $E_{tgt}$. $E_{tgt}$ is then passed through a Gaussian smoothing filter to generate the saliency map $S_{tgt}$ as follows,

$$S_{tgt} = G\left(\mathrm{IFFT}\left(\exp(R_{tgt} + P_{tgt})\right)^{2}\right), \tag{2}$$

where IFFT denotes the inverse Fourier transform and $G$ denotes the Gaussian smoothing filter. Finally, we concatenate the spatial-domain features $I_{tgt}$ and the frequency-domain features $S_{tgt}$ along the channel dimension and obtain the fused feature map $O_{tgt}$ through a convolution layer.
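The frequency-domain branch can be illustrated with a short NumPy sketch of a spectral-residual saliency map. It follows the steps in the text (amplitude and phase, log amplitude, 3×3 mean filter, residual, inverse transform, Gaussian smoothing), with several assumptions that are not stated in the source: the phase is applied as an imaginary exponent, exp(R + iP), as in the usual spectral-residual formulation; the squared magnitude of the inverse transform is taken; the Gaussian width `sigma` is arbitrary; and all function names are hypothetical.

```python
import numpy as np

def box_filter3(x):
    """3x3 mean filter with stride 1 and zero padding of 1."""
    p = np.pad(x, 1, mode="constant")
    h, w = x.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += p[dy:dy + h, dx:dx + w]
    return out / 9.0

def gaussian_blur(x, sigma=2.0):
    """Separable Gaussian smoothing with edge replication (sigma is assumed)."""
    radius = int(3 * sigma)
    t = np.arange(-radius, radius + 1)
    k = np.exp(-t ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    p = np.pad(x, radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, p)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def spectral_residual_saliency(image, eps=np.exp(-10.0), sigma=2.0):
    """Saliency map S_tgt for a 2D grayscale image."""
    spectrum = np.fft.fft2(image)
    amplitude = np.abs(spectrum) + eps           # A_tgt (Eq. 1)
    phase = np.angle(spectrum)                   # P_tgt (Eq. 1)
    log_amp = np.log(amplitude)                  # L_tgt
    residual = log_amp - box_filter3(log_amp)    # R_tgt = L_tgt - L'_tgt
    # Assumption: the phase enters as an imaginary exponent, exp(R + iP).
    recon = np.fft.ifft2(np.exp(residual + 1j * phase))
    return gaussian_blur(np.abs(recon) ** 2, sigma)   # S_tgt (Eq. 2)
```

The resulting map has the same height and width as the input, so it can be stacked with the image along the channel axis before the fusing convolution.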

Once we obtain the fused feature map $O_{tgt}$, we partition it into patches of 16×16 pixels, which are transformed into a one-dimensional embedding sequence via patch embedding and position embedding. We then use multiple stacked Transformer layers, each containing Multi-head Self-Attention (MSA) and Multi-Layer Perceptron (MLP) blocks, to extract the features $Z$ as follows,

$$Z_l = \mathrm{MSA}(Z_{l-1}) + \mathrm{MLP}(\mathrm{MSA}(Z_{l-1})). \tag{3}$$

Layer normalization is applied before the MSA and MLP blocks; it is omitted from the formula for brevity. We denote by $Z_1, Z_2, \ldots, Z_L$ the features of the Transformer layers $l = 1, \ldots, L$.
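As a small illustration of the tokenization and of the layer structure in Eq. 3, the sketch below splits a feature map into 16×16 patches and composes one layer from two callables. The learned linear projection, the position embedding and layer normalization are omitted, and `msa` and `mlp` are hypothetical stand-ins for the real blocks.

```python
import numpy as np

def patchify(feature_map, patch=16):
    """Split an (H, W, C) map into a (H/16 * W/16, 16*16*C) token sequence."""
    h, w, c = feature_map.shape
    assert h % patch == 0 and w % patch == 0, "H and W must be multiples of patch"
    gh, gw = h // patch, w // patch
    x = feature_map.reshape(gh, patch, gw, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)               # (gh, gw, patch, patch, c)
    return x.reshape(gh * gw, patch * patch * c)  # tokens in row-major patch order

def transformer_layer(z, msa, mlp):
    """Eq. 3 without layer norm: Z_l = MSA(Z_{l-1}) + MLP(MSA(Z_{l-1}))."""
    a = msa(z)
    return a + mlp(a)
```

For a 512 × 512 input this yields 32 × 32 = 1024 tokens, which is the sequence length seen by every Transformer layer.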

Multi-branch decoder

To foster interaction across features from different layers, we introduce a multi-branch aggregation design that progressively integrates contexts from different layers. In our model, we select features from layers 2, 5, 8, and 11, denoted $Z_2$, $Z_5$, $Z_8$, and $Z_{11}$. First, we reshape the features from a 2D shape ($\frac{HW}{256} \times C$) to a 3D shape ($\frac{H}{16} \times \frac{W}{16} \times C$). These features then pass through three convolutional layers with kernel sizes of 1×1, 3×3, and 3×3, respectively. To promote interactions among branches, we introduce a top-down aggregation structure that merges top- and bottom-layer features via element-wise summation and enlarges the spatial dimensions through an up-sampling operation. Once we obtain the four sets of combined features, we concatenate them along the channel dimension and rescale them to the input dimensions using convolutional and up-sampling operations. From these multi-branch features, we generate the predicted organoid segmentation mask $M_{tgt}$, which has 2 channels.
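The data flow of the decoder can be sketched in NumPy as below. This is only a shape-level illustration under stated assumptions: the convolutions are omitted, nearest-neighbour repetition stands in for the learned up-sampling, and the function names are hypothetical.

```python
import numpy as np

def tokens_to_grid(z, h, w, patch=16):
    """Reshape (HW/256, C) tokens into an (H/16, W/16, C) feature map."""
    return z.reshape(h // patch, w // patch, z.shape[-1])

def top_down_aggregate(features):
    """Merge branches from deep to shallow by element-wise summation.

    features is ordered shallow -> deep (e.g. Z2, Z5, Z8, Z11). Each branch
    receives the already-merged deeper branch, and the four merged maps are
    concatenated along the channel axis.
    """
    merged = [features[-1]]
    for f in reversed(features[:-1]):
        merged.append(f + merged[-1])
    merged.reverse()                         # back to shallow -> deep order
    return np.concatenate(merged, axis=-1)

def upsample_nearest(x, factor=16):
    """Nearest-neighbour up-sampling of an (h, w, C) map back to input size."""
    return x.repeat(factor, axis=0).repeat(factor, axis=1)
```

With four branches of C channels each, the concatenated map has 4C channels at 1/16 resolution; a final projection to 2 channels would give the mask logits.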

Biological knowledge-driven branch

TransOrga-plus offers a knowledge-driven branch to interactively integrate feedback from biologists, as shown in Fig. 5. Biologists provide biological information through visual annotations $I_{bio}$, such as single-organoid regions and medium regions. Given a target image $I_{tgt}$ and the initial biological information $I_{bio}$, the multi-modal encoder extracts visual features $E_{tgt}$ and $E_{bio}$ from $I_{tgt}$ and $I_{bio}$, respectively. The biological knowledge-driven branch then injects the extracted biological knowledge $E_{bio}$ into the model, producing combined features $B$ that contain both the biological knowledge and the original bright-field image features. Finally, the decoder takes the combined features $B$ and outputs the target recognition result $M$ and the tracking result $T$. This branch is formulated as follows,

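$$\begin{aligned}
E_{tgt} &= \mathrm{Encoder}(I_{tgt}), \\
E_{bio} &= \mathrm{Encoder}(I_{bio}), \\
B &= \mathrm{BioEncoder}(E_{tgt}, E_{bio}), \\
M, T &= \mathrm{Decoder}(B).
\end{aligned} \tag{4}$$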

Tracking module

To effectively track high-throughput organoids in microscopic image sequences, we adopt the Decoupling Features in Hierarchical Propagation (DeAOT) method [40], which combines hierarchical propagation with a decoupled feature representation. Given a sequence of microscopic images and a reference annotation $M_t$, DeAOT propagates this annotation across the entire sequence, enabling consistent object tracking over time.

The core innovation of DeAOT lies in its decoupling of visual features and identification features into two distinct branches: the Visual Branch and the Identification Branch. In the Visual Branch, organoids are matched across frames via attention maps computed on patch-wise visual embeddings. These attention maps guide the propagation of visual features from previously stored frames to the current frame. Importantly, this propagation is entirely decoupled from identification embeddings, which ensures that the visual embeddings remain unbiased and purely appearance-based. The visual feature propagation is formulated as follows:

$$\tilde{I}_l^t = \mathrm{Att}(I_l^t W_l^K, I_l^m W_l^K, I_l^m W_l^V) = \mathrm{Corr}(I_l^t W_l^K, I_l^m W_l^K)\, I_l^m W_l^V, \tag{5}$$

where $I_l^t$ is the visual embedding of the $l$-th propagation layer for the $t$-th image and $I_l^m$ denotes the stored visual embeddings of $m$ images. $W_l^K$ and $W_l^V$ are parameters that project visual features into the matching space and the propagation space, respectively. The Identification Branch, in parallel, handles the propagation of object-specific identity features. It uses both the stored mask features and additional identification information to ensure that each organoid is consistently recognized across frames. This is formulated as:

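$$\tilde{M}_l^t = \mathrm{Att}(I_l^t W_l^K, I_l^m W_l^K, M_l^m W_l^{\bar{V}} + \mathrm{ID}(Y^m)) = \mathrm{Corr}(I_l^t W_l^K, I_l^m W_l^K)\,(M_l^m W_l^{\bar{V}} + \mathrm{ID}(Y^m)), \tag{6}$$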

where $M_l^m$ and $Y^m$ are the stored identification features of the $m$ masks, and $\mathrm{ID}(\cdot)$ is the identification method [40]. DeAOT enables accurate and real-time tracking of high-throughput organoids, preserving both spatial continuity and object identity across complex dynamic sequences.
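The two propagation formulas share one correlation map, which the following NumPy sketch makes explicit. It assumes that Corr is a row-wise softmax over scaled dot products, as in standard attention; that form, the argument names and the array shapes are assumptions for illustration, not details given in the text.

```python
import numpy as np

def corr(q, k):
    """Assumed form of Corr: row-wise softmax of scaled dot products."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    return w / w.sum(axis=-1, keepdims=True)

def propagate(i_t, i_m, m_m, id_m, w_k, w_v, w_v_id):
    """One propagation layer following Eqs. 5 and 6.

    i_t:  (N_t, C) visual embeddings of the current frame
    i_m:  (N_m, C) stored visual embeddings
    m_m:  (N_m, C) stored identification features
    id_m: (N_m, D) identification embeddings ID(Y^m) of the stored masks
    """
    attn = corr(i_t @ w_k, i_m @ w_k)          # computed from visual features only
    visual = attn @ (i_m @ w_v)                # Eq. 5, visual branch
    identity = attn @ (m_m @ w_v_id + id_m)    # Eq. 6, identification branch
    return visual, identity
```

Because the attention map is computed only from visual embeddings, changing the identification input leaves the visual output untouched, which is the decoupling the method relies on.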

Analysis module

After obtaining segmentation results, we can perform single-organoid analysis.

How to obtain single organoids: Typically, neural network image segmentation methods apply an absolute threshold to the predicted pixels to generate a binary detection mask. While effective, this approach ignores valuable prediction confidence information. During training, the network learns to produce a 2-pixel boundary from the training dataset. Consequently, the network predictions have slightly lower confidence at pixels near organoid boundaries. First, we compute the partial derivatives of pixel intensity to detect high-contrast regions. Second, we apply a blurring method to smooth noisy regions. Third, we employ a hysteresis-based threshold to identify locally strong edges. These edges are then removed from the thresholded prediction image to mark the centre of each organoid. The centres serve as the initial basins in a watershed transformation that separates contacting organoids. The image is further refined to eliminate organoids that are potentially outside the field of view or below a specific size threshold. The pipeline yields a labelled image in which the pixels of each organoid are assigned a unique organoid identifier (ID) number, facilitating subsequent single-organoid analysis.
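The marker-extraction part of this pipeline can be sketched in plain NumPy. The sketch is deliberately simplified and is not the authors' code: it uses a single gradient threshold instead of blurring plus hysteresis thresholding, it stops at the seeds and does not run the watershed itself, and the threshold values and function names are assumptions.

```python
import numpy as np

def label_components(mask):
    """4-connected component labelling by flood fill; 0 marks background."""
    labels = np.zeros(mask.shape, dtype=np.int32)
    current = 0
    for sy, sx in zip(*np.nonzero(mask)):
        if labels[sy, sx]:
            continue                      # pixel already belongs to a component
        current += 1
        labels[sy, sx] = current
        stack = [(sy, sx)]
        while stack:
            y, x = stack.pop()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = current
                    stack.append((ny, nx))
    return labels, current

def organoid_markers(confidence, mask_thr=0.5, edge_thr=0.1):
    """Seeds for a watershed: thresholded prediction minus strong edges."""
    gy, gx = np.gradient(confidence.astype(np.float64))
    edges = np.hypot(gy, gx) > edge_thr   # single threshold, not hysteresis
    seeds = (confidence > mask_thr) & ~edges
    return label_components(seeds)
```

For two touching organoids joined by a 2-pixel band of lower confidence, thresholding alone returns one blob, whereas removing the edge pixels leaves one seed per organoid. Those seeds would then be passed to a watershed routine from an image-processing library.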

For single-organoid analysis, the grouping of pixels in the segmentation mask is essential to identify individual organoids. For isolated organoids, this task is straightforward since all high-confidence pixels in a cluster correspond to one organoid. However, for organoids that are in physical contact, it becomes more challenging. To address this issue, we introduce an organoid separation pipeline that leverages the raw network prediction to group pixels into single-organoid clusters.

For time-course analysis, we measure changes such as the size and shape of individual organoids over time. Based on the single-organoid analysis for each frame, our tracking module consistently follows the individual organoids between different frames and outputs time-course organoid analysis.
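As a minimal sketch of the time-course measurement, the code below computes the pixel area of every organoid ID in each labelled frame and collects the values per organoid. It assumes that the tracking module has already assigned consistent, non-negative integer IDs across frames; the function names are hypothetical and area is used as the simplest size measure.

```python
import numpy as np

def organoid_areas(labels):
    """Pixel area per organoid ID in a labelled image (0 = background)."""
    counts = np.bincount(labels.ravel())
    return {int(i): int(n) for i, n in enumerate(counts) if i != 0 and n > 0}

def area_time_course(label_frames):
    """Map organoid ID -> list of areas over frames (None when absent)."""
    per_frame = [organoid_areas(f) for f in label_frames]
    ids = sorted(set().union(*per_frame))
    return {i: [a.get(i) for a in per_frame] for i in ids}
```

Shape descriptors such as circularity could be added per ID in the same way, giving one curve per organoid and per feature.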

Losses

As the first step in the system, segmentation determines the quality of all downstream processing. Considering the characteristics of organoids, we combine three loss functions, the focal loss $L_{focal}$, the dice loss $L_{dice}$ and the compact loss $L_{com}$, into a weighted segmentation objective. Because organoid and background pixels are imbalanced in the image, we use the focal loss $L_{focal}$ to penalize mispredicted pixels as follows,

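$$L_{focal} = -\alpha (1 - p_t)^{\gamma} \log(p_t), \qquad p_t = \begin{cases} \hat{p}, & \text{if } \mathrm{argmax}(M_{tgt}) = 1 \\ 1 - \hat{p}, & \text{otherwise,} \end{cases} \tag{7}$$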

where $\mathrm{argmax}(\cdot)$ returns the index of the channel with the maximum value, $\hat{p}$ is the predicted probability of the organoid class, and $\alpha$ and $\gamma$ are hyperparameters for the sample weight and the weight of hard cases, respectively. The dice loss $L_{dice}$ measures the ratio of the overlap between the predicted mask and the ground-truth mask to their union as follows,

$$L_{dice} = 1 - \frac{2\,|\mathrm{argmax}(M_{tgt}) \cap G_{tgt}|}{|M_{tgt} \cup G_{tgt}|}, \tag{8}$$

where $G_{tgt}$ is the ground-truth segmentation mask. Since most organoids are circular or elliptical and therefore highly compact, we exploit this property and introduce the compact loss $L_{com}$ as follows,

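$$L_{com} = \frac{\sum_{i \in \Omega} \sqrt{(p_i^h)^2 + (p_i^v)^2 + \epsilon}}{4\pi \left( \sum_{i \in \Omega} p_i + \epsilon \right)}, \tag{9}$$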

where $p_i$ is the predicted segmentation result at pixel $i$, $\Omega$ is the pixel set of the predicted segmentation mask, $p_i^h$ and $p_i^v$ are the gradients at pixel $i$ in the horizontal and vertical directions, and $\epsilon$ is a hyperparameter. Finally, we combine these losses to form the final loss as follows,

$$L_{final} = \lambda_1 L_{focal} + \lambda_2 L_{dice} + \lambda_3 L_{com}. \tag{10}$$
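The weighted objective can be illustrated with a NumPy sketch that operates on a per-pixel organoid probability map. Several choices here are assumptions rather than details from the text: the focal term selects $p_t$ using the ground-truth label (the standard focal-loss convention), the dice term uses the standard soft form with the sum of mask sizes in the denominator, `alpha = 0.25` and `gamma = 2.0` are common defaults, and the compact term sums over all pixels. A trainable version would use an automatic-differentiation framework instead of NumPy.

```python
import numpy as np

def focal_loss(p_hat, target, alpha=0.25, gamma=2.0):
    """Mean focal loss; alpha and gamma are assumed default values."""
    p_t = np.where(target == 1, p_hat, 1.0 - p_hat)
    p_t = np.clip(p_t, 1e-7, 1.0)            # avoid log(0)
    return float(np.mean(-alpha * (1.0 - p_t) ** gamma * np.log(p_t)))

def dice_loss(p_hat, target, eps=1e-6):
    """Standard soft Dice loss: 1 - 2|P∩G| / (|P| + |G|)."""
    inter = np.sum(p_hat * target)
    return float(1.0 - (2.0 * inter + eps) / (np.sum(p_hat) + np.sum(target) + eps))

def compact_loss(p_hat, eps=1e-6):
    """Summed gradient magnitude divided by 4*pi times the predicted area."""
    gv, gh = np.gradient(p_hat)               # vertical and horizontal gradients
    perimeter = np.sum(np.sqrt(gh ** 2 + gv ** 2 + eps))
    return float(perimeter / (4.0 * np.pi * (np.sum(p_hat) + eps)))

def final_loss(p_hat, target, lambdas=(1.0, 0.5, 0.5)):
    """Weighted sum of the three terms (Eq. 10)."""
    l1, l2, l3 = lambdas
    return (l1 * focal_loss(p_hat, target)
            + l2 * dice_loss(p_hat, target)
            + l3 * compact_loss(p_hat))
```

For two predictions of equal area, the compact term is smaller for a square blob than for a thin line, so it pushes the mask toward the round shapes typical of organoids.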

Implementation details

The proposed model is implemented using Python 3.9 and PyTorch 1.13 and trained for 100 epochs on Ubuntu 20.04 with an NVIDIA GeForce RTX 3090 GPU. We train the model using stochastic gradient descent with a stochastic weight-averaging strategy. Dropout is enabled during both training and testing for uncertainty estimation. The initial learning rate is 0.01 and is multiplied by 0.1 every 10 epochs. Our dataset is randomly split into training, validation and test sets (70%, 20% and 10%, respectively). The validation set is used to tune the model and select the best hyperparameters. In our experiments, we set $\lambda_1$ to 1.0, $\lambda_2$ to 0.5 and $\lambda_3$ to 0.5. Detailed guidelines on training our model on a private dataset are available at https://github.com/dev-csftan/TransOrga.
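The learning-rate schedule and the data split can be written down in a few lines of plain Python. This is a sketch of the stated settings only; the function names and the random seed are assumptions. In PyTorch the same decay corresponds to `torch.optim.lr_scheduler.StepLR` with `step_size=10` and `gamma=0.1`.

```python
import random

def learning_rate(epoch, base_lr=0.01, gamma=0.1, step=10):
    """Step decay: the rate is multiplied by gamma every `step` epochs."""
    return base_lr * gamma ** (epoch // step)

def split_dataset(items, fractions=(0.7, 0.2, 0.1), seed=0):
    """Random train/validation/test split; the seed is an assumption."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(round(fractions[0] * len(items)))
    n_val = int(round(fractions[1] * len(items)))
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]            # whatever remains after the first two
    return train, val, test
```

Under this schedule the rate stays at 0.01 for epochs 0 to 9, drops to 0.001 for epochs 10 to 19, and keeps shrinking tenfold every 10 epochs for the rest of the 100-epoch run.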

Supplementary Information

12915_2025_2411_MOESM1_ESM.docx (1.9MB, docx)

Additional file 1: Fig. S1. Comparison results on unseen high-resolution bright-field microscopic organoids images. Fig. S2. Comparison Results of organoid tracking using the microscopic image sequence.

Acknowledgements

The biological lab of Merck Group manually labelled the sub-dataset.

Abbreviations

TU/e: Eindhoven University of Technology
ACC: Salivary adenoid cystic carcinoma
Colon: Colon epithelia
Lung: Lung epithelia
PDAC: Pancreatic ductal adenocarcinoma
Mammary: Mammary
mIoU: Mean intersection over union
SOTA: State-of-the-art
STA: Segment-and-track anything
SAM: Segment anything model
MSA: Multi-head self-attention
MLP: Multi-layer perceptron
DeAOT: Decoupling features in the hierarchical propagation
ID: Identifier

Authors’ contributions

FT, CT, MW, PH and YQ conceived and designed the study. FT, CT, TP, NK, MW and YQ participated in preparing the manuscript. FT, CT, HY, MR, NK and YQ participated in data and sample collection. YQ, JL, ZW, DW, PH, TP and FT participated in designing the model. YQ, FT, CT, ZW, AH and NK participated in designing the statistical analysis plan. All authors read and approved the final manuscript.

Funding

This research was supported by Shanghai Science and Technology Commission Research Project No.23440790200.

Data availability

Access to the organoid data analysed in this manuscript is provided by the Merck Group. The data, code and demo for this paper are available at https://zenodo.org/records/16900123 [43].

Declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Chunling Tang, Email: chunling.tang@kcl.ac.uk.

Feng Tan, Email: feng.tan@merckgroup.com.

References

  • 1. Kretzschmar K, Clevers H. Organoids: modeling development and the stem cell niche in a dish. Dev Cell. 2016;38(6):590–600.
  • 2. Li C, Fleck JS, Martins-Costa C, Burkard TR, Themann J, Stuempflen M, et al. Single-cell brain organoid screening identifies developmental defects in autism. Nature. 2023;621(7978):373–80.
  • 3. Park DS, Kozaki T, Tiwari SK, Moreira M, Khalilnezhad A, Torta F, et al. iPS-cell-derived microglia promote brain organoid maturation via cholesterol transfer. Nature. 2023;623(7986):379–405.
  • 4. Volmert B, Kiselev A, Juhong A, Wang F, Riggs A, Kostina A, et al. A patterned human primitive heart organoid model generated by pluripotent stem cell self-organization. Nat Commun. 2023;14(1):8245.
  • 5. Hashimi M, Sebrell TA, Hedges JF, Snyder D, Lyon KN, Byrum SD, et al. Antiviral responses in a Jamaican fruit bat intestinal organoid model of SARS-CoV-2 infection. Nat Commun. 2023;14(1):6882.
  • 6. Harter MF, Recaldin T, Gerard R, Avignon B, Bollen Y, Esposito C, et al. Analysis of off-tumour toxicities of T-cell-engaging bispecific antibodies via donor-matched intestinal organoids and tumouroids. Nat Biomed Eng. 2023. doi: 10.1038/s41551-023-01156-5.
  • 7. Mead BE, Hattori K, Levy L, Imada S, Goto N, Vukovic M, et al. Screening for modulators of the cellular composition of gut epithelia via organoid models of intestinal stem cell differentiation. Nat Biomed Eng. 2022;6(4):476–94.
  • 8. Zhou Z, Van der Jeught K, Fang Y, Yu T, Li Y, Ao Z, et al. An organoid-based screen for epigenetic inhibitors that stimulate antigen presentation and potentiate T-cell-mediated cytotoxicity. Nat Biomed Eng. 2021;5(11):1320–35.
  • 9. Abe K, Yamashita A, Morioka M, Horike N, Takei Y, Koyamatsu S, et al. Engraftment of allogeneic iPS cell-derived cartilage organoid in a primate model of articular cartilage defect. Nat Commun. 2023;14(1):804.
  • 10. Beumer J, Geurts MH, Lamers MM, Puschhof J, Zhang J, van der Vaart J, et al. A CRISPR/Cas9 genetically engineered organoid biobank reveals essential host factors for coronaviruses. Nat Commun. 2021;12(1):5498.
  • 11. Brandenberg N, Hoehnel S, Kuttler F, Homicsko K, Ceroni C, Ringel T, et al. High-throughput automated organoid culture via stem-cell aggregation in microcavity arrays. Nat Biomed Eng. 2020;4(9):863–74.
  • 12. Fiorenzano A, Sozzi E, Birtele M, Kajtez J, Giacomoni J, Nilsson F, et al. Single-cell transcriptomics captures features of human midbrain development and dopamine neuron diversity in brain organoids. Nat Commun. 2021;12(1):7302.
  • 13. de Medeiros G, Ortiz R, Strnad P, Boni A, Moos F, Repina N, et al. Multiscale light-sheet organoid imaging framework. Nat Commun. 2022;13(1):4864.
  • 14. Ghosheh M, Ehrlich A, Ioannidis K, Ayyash M, Goldfracht I, Cohen M, et al. Electro-metabolic coupling in multi-chambered vascularized human cardiac organoids. Nat Biomed Eng. 2023(11). doi: 10.1038/s41551-023-01071-9.
  • 15. Mukashyaka P, Kumar P, Mellert DJ, Nicholas S, Noorbakhsh J, Brugiolo M, et al. High-throughput deconvolution of 3D organoid dynamics at cellular resolution for cancer pharmacology with Cellos. Nat Commun. 2023;14(1):8406.
  • 16. Bailey SR, Maus MV. Gene editing for immune cell therapies. Nat Biotechnol. 2019;37(12):1425–34.
  • 17. Ang LT, Tan AKY, Autio MI, Goh SH, Choo SH, Lee KL, et al. A roadmap for human liver differentiation from pluripotent stem cells. Cell Rep. 2018;22(8):2190–205.
  • 18. Berg S, Kutra D, Kroeger T, Straehle CN, Kausler BX, Haubold C, et al. Ilastik: interactive machine learning for (bio) image analysis. Nat Methods. 2019;16(12):1226–32.
  • 19. Kim J, Koo B-K, Knoblich JA. Human organoids: model systems for human biology and medicine. Nat Rev Mol Cell Biol. 2020;21(10):571–84.
  • 20. Hofer M, Lutolf MP. Engineering organoids. Nat Rev Mater. 2021;6(5):402–20.
  • 21. Fei K, Zhang J, Yuan J, Xiao P. Present application and perspectives of organoid imaging technology. Bioengineering. 2022;9(3):121.
  • 22. Bai L, Wu Y, Li G, Zhang W, Zhang H, Su J. AI-enabled organoids: construction, analysis, and application. Bioact Mater. 2024;31:525–48.
  • 23. Zhang X-S, Xie G, Ma H, Ding S, Wu Y-X, Fei Y, et al. Highly reproducible and cost-effective one-pot organoid differentiation using a novel platform based on PF-127 triggered spheroid assembly. Biofabrication. 2023;15(4):045014.
  • 24. Borten MA, Bajikar SS, Sasaki N, Clevers H, Janes KA. Automated brightfield morphometry of 3D organoid populations by OrganoSeg. Sci Rep. 2018;8(1):5319.
  • 25. Kassis T, Hernandez-Gordillo V, Langer R, Griffith LG. Orgaquant: human intestinal organoid localization and quantification using deep convolutional neural networks. Sci Rep. 2019;9(1):12479.
  • 26. Kok RNU, Hebert L, Huelsz-Prince G, Goos YJ, Zheng X, Bozek K, et al. OrganoidTracker: Efficient cell tracking using machine learning and manual error correction. PLoS ONE. 2020;15(10):e0240802.
  • 27. Larsen BM, Kannan M, Langer LF, Leibowitz BD, Bentaieb A, Cancino A, et al. A pan-cancer organoid platform for precision medicine. Cell Rep. 2021. doi: 10.1016/j.celrep.2021.109429.
  • 28. Matthews JM, Schuster B, Kashaf SS, Liu P, Ben-Yishay R, Ishay-Ronen D, et al. OrganoID: a versatile deep learning platform for tracking and analysis of single-organoid dynamics. PLoS Comput Biol. 2022;18(11):e1010584.
  • 29. Chiang C-C, Anne R, Chawla P, Shaw RM, He S, Rock EC, et al. Deep learning unlocks label-free viability assessment of cancer spheroids in microfluidics. Lab Chip. 2024;24(12):3169–82.
  • 30. Park S, Veluvolu V, Martin WS, Nguyen T, Park J, Sackett DL, et al. Label-free, non-invasive, and repeatable cell viability bioassay using dynamic full-field optical coherence microscopy and supervised machine learning. Biomed Opt Express. 2022;13(6):3187–94.
  • 31. Abd El-Sadek I, Shen LT-W, Mori T, Makita S, Mukherjee P, Lichtenegger A, et al. Label-free drug response evaluation of human derived tumor spheroids using three-dimensional dynamic optical coherence tomography. Sci Rep. 2023;13(1):15377.
  • 32. Wu H, Yang Y, Bagnaninchi PO, Jia J. Electrical impedance tomography for real-time and label-free cellular viability assays of 3D tumour spheroids. Analyst. 2018;143(17):4189–98.
  • 33. Abd El-Sadek I, Morishita R, Mori T, Makita S, Mukherjee P, Matsusaka S, et al. Label-free visualization and quantification of the drug-type-dependent response of tumor spheroids by dynamic optical coherence tomography. Sci Rep. 2024;14(1):3366.
  • 34. Tang C, Wang X, D’Urso M, van der Putten C, Kurniawan NA. 3D interfacial and spatiotemporal regulation of human neuroepithelial organoids. Adv Sci. 2022;9(22):2201106.
  • 35. Badrinarayanan V, Kendall A, Cipolla R. Segnet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell. 2017;39(12):2481–95.
  • 36. Oktay O, Schlemper J, Le Folgoc L, Lee M, Heinrich M, Misawa K, et al. Attention u-net: learning where to look for the pancreas. Proc 1st Med Imaging Deep Learn. 2018;1–12.
  • 37. Schmidt U, Weigert M, Broaddus C, Myers G. Cell detection with star-convex polygons. In: Med Image Comput Comput Assist Interv. 2018;11071:265–73.
  • 38. Stringer C, Wang T, Michaelos M, Pachitariu M. Cellpose: a generalist algorithm for cellular segmentation. Nat Methods. 2021;18(1):100–6.
  • 39. Kirillov A, Mintun E, Ravi N, Mao H, Rolland C, Gustafson L, et al. Segment Anything. Proc IEEE/CVF Int Conf Comput Vis. 2023;4015–26.
  • 40. Cheng Y, Li L, Xu Y, Li X, Yang Z, Wang W, et al. Segment and track anything. arXiv preprint arXiv:2305.06558. 2023.
  • 41. Yang Z, Yang Y. Decoupling features in hierarchical propagation for video object segmentation. In: Adv Neural Inf Process Syst. 2022;35:36324–36336.
  • 42. Yang Z, Wei Y, Yang Y. Associating objects with transformers for video object segmentation. In: Adv Neural Inf Process Syst. 2021;34:2491–2502.
  • 43. Qin Y, Li J, Heng Y, Wang Z, Wu D, Rahman M, et al. A knowledge-driven deep learning framework for organoid morphological segmentation and characterisation. 2025. Zenodo. doi: 10.5281/zenodo.16900123.


Articles from BMC Biology are provided here courtesy of BMC
