Biomedical Optics Express. 2025 Feb 4;16(3):894–909. doi: 10.1364/BOE.547292

OAH-Net: a deep neural network for efficient and robust hologram reconstruction for off-axis digital holographic microscopy

Wei Liu 1, Kerem Delikoyun 2,3, Qianyu Chen 2,3, Alperen Yildiz 3, Si Ko Myo 3, Win Sen Kuan 4,5, John Tshon Yit Soong 4,6, Matthew Edward Cove 6, Oliver Hayden 2,3, Hwee Kuan Lee 1,7,8,9,10,*
PMCID: PMC11919354  PMID: 40109528

Abstract

Off-axis digital holographic microscopy is a high-throughput, label-free imaging technology that provides three-dimensional, high-resolution information about samples, which is particularly useful in large-scale cellular imaging. However, the hologram reconstruction process poses a significant bottleneck for timely data analysis. To address this challenge, we propose a novel reconstruction approach that integrates deep learning with the physical principles of off-axis holography. We initialized part of the network weights based on the physical principles and then fine-tuned them via supervised learning. Our off-axis hologram network (OAH-Net) retrieves phase and amplitude images with errors that fall within the measurement error range attributable to hardware, and its reconstruction speed significantly surpasses the microscope’s acquisition rate. Crucially, OAH-Net, trained and validated on diluted whole blood samples, demonstrates remarkable external generalization capabilities on unseen samples with distinct patterns. Additionally, it can be seamlessly integrated with other models for downstream tasks, enabling end-to-end real-time hologram analysis. This capability further expands off-axis holography’s applications in both biological and medical studies.

1. Introduction

Digital holographic microscopy (DHM) is emerging as an innovative imaging modality in computational microscopy. It provides high-resolution, quantitative, and three-dimensional information about samples without labeling. These unique features make DHM a promising technique for imaging living cells, as it captures intracellular structures while preserving cells in their natural state, enabling more precise analysis [1–4].

DHM records the interference pattern between the object and reference beams, which is then reconstructed using algorithms to retrieve the wave of the object beam in terms of phase and amplitude [5–7]. Holography can be classified into two major types based on the alignment of the beams: inline holography and off-axis holography [8,9]. In inline holography, the reference beam is parallel to the object beam. Although the setup is relatively simple, hologram reconstruction requires multiple exposures of the same sample at varying sample-to-sensor distances. In off-axis holography, the reference beam is slightly tilted to form a small angle with the object beam, spatially separating the hologram components in the frequency domain and thereby facilitating reconstruction. Off-axis holography requires only one exposure per sample for reconstruction, making it an ideal technique for imaging nonstatic samples and for high-throughput applications. For example, off-axis DHM has been used for blood-based cellular diagnostics by processing blood samples through a specialized microfluidic channel and imaging blood cells at high speed [10,11].

To fully leverage DHM’s high-throughput imaging capabilities, high-speed analysis methods are essential for efficiently processing large volumes of images. With our previous pipeline [11], achieving the typical 30-minute turnaround time (TAT) required for hematology analysis remains infeasible (see Fig. 1(b)). Although deep-learning techniques have shown significant advantages in computer vision, with some state-of-the-art methods achieving real-time analysis speeds [12–15], these methods cannot be directly applied to DHM because holograms must first be reconstructed into phase and amplitude images. Consequently, the speed of hologram reconstruction has become the primary bottleneck, limiting DHM’s clinical feasibility for applications requiring rapid turnaround times.

Fig. 1. (a) Clinical workflow for point-of-care (POC) diagnostics using DHM, where clinical blood samples are screened with DHM. OAH-Net-based holographic reconstruction, integrated with an improved data analysis model, enables real-time analysis with a turnaround time of under 5 minutes for a measurement typically comprising 10,000 frames acquired in 1.5 minutes. (b) Comparison of the process breakdown between the Ovizio-based and OAH-Net-based workflows, showing that OAH-Net dramatically accelerates image processing and achieves near real-time POC diagnosis. (c) The overall architecture of OAH-Net, which consists of two main modules: Fourier Imager Heads (FIHs) and the complex-valued network (CVN). More details are provided in the Methods section. Although the size of the output images is 1/16 of the input image, there is no information loss, because the useful Fourier frequencies occupy less than 1/16 of the overall spectrum.
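As a quick check of the 1/16 figure in this caption: with the $1536 \times 2048$ input and $384 \times 512$ output sizes given in the Methods,

$$\frac{384 \times 512}{1536 \times 2048} = \frac{1}{4} \times \frac{1}{4} = \frac{1}{16},$$

so the output is one quarter of the input size along each axis.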

Two broad strategies have been used to improve the speed and accuracy of off-axis hologram reconstruction. The first strategy is purely data-driven, treating hologram reconstruction as a typical image-to-image task that converts holograms into phase and amplitude images [16–19]. A major drawback of this approach is the performance drop when dealing with unseen samples distinct from the training data. This is particularly problematic in clinical settings, where patient data varies substantially across different disease phenotypes, medical instrumentation, or user intervention. The second strategy focuses on filtering signals in the frequency domain, taking advantage of the physical principles of off-axis holography. Many manually designed frequency filters have been proposed, including but not limited to regular-shape filters [20,21], Butterworth filters [22], binary adaptive filters [23], weighted adaptive filters [24], and iterative thresholding filters [25]. Existing filters have certain limitations: they are either too coarse [20–22], need manual selection of a sample-dependent threshold [23,24], or involve an iterative thresholding process [25]. In addition to manually designed filters, Xiao et al. proposed a deep-learning approach that generates sample-dependent filters using a U-Net model [26]. However, training this model requires manually annotating frequency filters for each sample, a process that is both time-consuming and requires specialized expertise, making it challenging for users to train the model with their own data or fine-tune a pre-trained model.

In this study, we explore a supervised deep learning approach that leverages the physical principles of off-axis holography to streamline real-time hologram reconstruction. Our approach converts frequency domain operations, such as filtration, into matrix multiplications with trainable parameters, which are then refined within a deep learning framework. This framework generates ground truth automatically, removing the need for manual annotation and utilizing the extensive datasets provided by high-throughput DHM. Our model, the Off-Axis Hologram Network (OAH-Net), directly incorporates the physics of off-axis DHM within its architecture, enabling robust generalization even on samples with distinct patterns not encountered during training. Designed as an end-to-end solution, OAH-Net eliminates the need for preprocessing or post-processing steps, such as phase unwrapping.

One key advantage of OAH-Net is its single-pass processing approach. Unlike methods that rely on iterative reconstruction [18,25], our model processes each hologram only once, allowing all holograms within a batch to complete simultaneously and maximizing parallel computing efficiency. The inference speed is under 3 ms/frame, significantly faster than the 9.5 ms/frame acquisition rate of the microscope camera, achieving real-time processing of high-resolution (1536 × 2048 pixels) holograms. Additionally, the reconstruction error is minimal, within the range of the hardware’s inherent measurement error. When tested with YOLO for subsequent object detection, the performance on OAH-Net-reconstructed images closely matches that on ground-truth data.

Our work introduces a novel holographic reconstruction and phase retrieval algorithm tailored to clinical applications, outperforming established methods in accuracy and external generalizability. This study highlights the clinical relevance, cost-effectiveness, and diagnostic potential of our advanced algorithm, underscoring its value for healthcare applications.

2. Materials and methods

2.1. Data collection

Our data consist of de-identified hologram videos recording human whole blood samples flowing inside a microfluidic channel. All samples were taken from patients with written informed consent obtained before blood draw in the Emergency Department and the Medical Intensive Care Unit of the National University Hospital of Singapore. Ethics approval was obtained from the National Healthcare Group Domain Specific Review Board (reference numbers 2021/00930 and 2021/01130). Blood samples were diluted with phosphate-buffered saline containing 0.05% polyethylene oxide, which viscoelastically focused the blood cells into a thin line within the microscope’s focal plane, following protocols from our previous studies [10,11]. In total, 1672 videos were collected and randomly divided into 1038 for training, 331 for validation, and 303 for testing. Each video is approximately 1.5 minutes long with around 10,000 frames. However, due to storage limitations, only 200 frames were extracted from each video and used for this study.

Furthermore, we imaged a phase mask target (Quantitative Phase Target NIST Traceable, Benchmark Technologies, USA) to quantify our DHM setup’s spatial resolution and phase retrieval capabilities. Different areas of this static sample with varying features and thicknesses are shown in Fig. 5(a).

Fig. 5. (a) External generalization of OAH-Net. The white dashed squares in the 2D images indicate the areas plotted in 3D. The cross-section plots are provided in Fig. S4. (b) Comparison of SSIM values for phase and amplitude reconstructions across different models. All wedding cake images were grouped for statistical analysis.

2.2. DHM system overview

The customized DHM used in this study is provided by Ovizio Imaging Systems, Belgium, patented as "differential digital holographic microscopy" [27,28]. As shown in Fig. 2(a), a superluminescent light-emitting diode (SLED) from Osram provides partially coherent Köhler illumination, which is filtered through a pinhole. After the light scatters from blood cells flowing within a microfluidic channel, it is collected by a 40× objective lens with a numerical aperture of 0.55. A grating then splits the light, and a wedge introduces a slight tilt to the upper portion of the light beam. The reference and object beams are eventually superimposed and recorded by the imaging sensor to generate the holograms. Further technical details remain undisclosed by the device manufacturer.

Fig. 2. (a) Experimental setup of our DHM. (b) Illustration of the tilt angles between the object beam and the reference beams in our DHM setup. The two reference beams are tilted along the x and y axes, respectively. (c) An illustrative example of a raw hologram image and the corresponding spectrum after Fourier transform (denoted as $\mathcal{F}$), revealing the spatial separation of the image signals: 0, $\mathcal{F}(R_x R_x^* + R_y R_y^* + O O^*)$; 1, $\mathcal{F}(O R_x^*)$; 2, $\mathcal{F}(O R_y^*)$; 3, $\mathcal{F}(R_x R_y^*)$; 4, $\mathcal{F}(R_x^* R_y)$. More details are provided in the Methods and Supplement 1.

2.3. Spatial separation of off-axis holograms in the frequency domain

In our DHM device setup, there is one object beam and two reference beams tilted along the perpendicular x and y axes (see Fig. 2(b)). The intensity distribution $I_H$ recorded by the CCD sensor is the interference pattern between the reference waves and the object wave:

$$I_H = (R_x + R_y + O)(R_x + R_y + O)^* = R_x R_x^* + R_y R_y^* + O O^* + O R_x^* + R_x O^* + O R_y^* + R_y O^* + R_x R_y^* + R_x^* R_y \tag{1}$$

where $^*$ denotes the complex conjugate. Figure 2(c) shows that the terms in Eq. (1) can be effectively separated in the frequency domain, facilitating the extraction of $O R_x^*$ or $O R_y^*$ from $I_H$. The object wave $O$ can be reconstructed from either $O R_x^*$ or $O R_y^*$, with more details provided in Supplement 1.
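To make the separation step concrete, the following minimal NumPy sketch isolates one sideband of the hologram spectrum and re-centers it, using a plain circular crop rather than the trained filter introduced later. The function and variable names are illustrative, and the sideband location is assumed known from the setup's tilt angles:

```python
import numpy as np

def extract_sideband(hologram, center_row, center_col, radius):
    """Isolate one off-axis sideband, e.g. F(O Rx*), from a hologram.

    hologram: 2D real array (e.g. the 1536 x 2048 intensity image I_H)
    (center_row, center_col): sideband carrier location in the shifted
        spectrum, set by the reference beam tilt (assumed known here)
    radius: crop radius of a simple circular filter (cf. Ref. [20])
    """
    spectrum = np.fft.fftshift(np.fft.fft2(hologram))
    rows, cols = hologram.shape
    yy, xx = np.mgrid[:rows, :cols]
    mask = (yy - center_row) ** 2 + (xx - center_col) ** 2 <= radius ** 2

    # Keep only the chosen sideband, then shift it to the spectrum center
    # to demodulate the carrier frequency introduced by the beam tilt.
    sideband = np.where(mask, spectrum, 0)
    sideband = np.roll(sideband,
                       shift=(rows // 2 - center_row, cols // 2 - center_col),
                       axis=(0, 1))
    return np.fft.ifft2(np.fft.ifftshift(sideband))  # complex field O R*
```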

2.4. Model architecture

The architecture of OAH-Net is shown in Fig. 1(c), comprising two main modules: the Fourier Imager Heads (FIHs) and the Complex-Valued Network (CVN). The FIHs are designed to filter out the undesired components of the hologram in the Fourier frequency domain and to correct the frequency shift caused by the tilted reference beam. The CVN module converts the object beam wave from its complex-valued form into amplitude and unwrapped phase. In some DHM devices, including the one used in our experiments, there is more than one reference beam, and the object beam wave can be derived from any of them individually. In such cases, the CVN module is also responsible for merging the multiple waves of the object beam into one.

In the FIHs module of OAH-Net, $OR^*$ separation and shifting in the Fourier frequency domain are performed by two matrix multiplications (denoted as $\times$) and one element-wise matrix multiplication (denoted as $\odot$):

$$\tilde{F} = \left(M_l \times \mathcal{F}(I_H) \times M_r\right) \odot M_{mask} \tag{2}$$

where $\mathcal{F}$ is the Fourier transformation and $I_H$ is kept at its original high resolution of $1536 \times 2048$ pixels. The matrices $M_l$, $M_r$, and $M_{mask}$ are two-dimensional trainable matrices, with shapes $384 \times 1536$, $2048 \times 512$, and $384 \times 512$, respectively. $\tilde{F}$, the output of the FIH, is the Fourier spectrum corresponding to $OR^*$. A set of $M_l$, $M_r$, and $M_{mask}$ is grouped as one Fourier Imager Head. In our study, two heads are used to output $O R_x^*$ and $O R_y^*$, respectively. The initialization of $M_l$, $M_r$, and $M_{mask}$ is based on the circular-shape filter, with further details provided in Supplement 1.
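A minimal PyTorch sketch of one Fourier Imager Head implementing Eq. (2) is given below. The zero/one initialization values stand in for the binary crop/shift matrices and circular filter described in Supplement 1; class and parameter names are ours, not the authors':

```python
import torch
import torch.nn as nn

class FourierImagerHead(nn.Module):
    """Sketch of one FIH: crop/shift the spectrum with two trainable
    matrix multiplications, then filter with an element-wise mask."""

    def __init__(self, in_shape=(1536, 2048), out_shape=(384, 512)):
        super().__init__()
        h_in, w_in = in_shape
        h_out, w_out = out_shape
        # M_l (384 x 1536) and M_r (2048 x 512): in the paper these start
        # as 0/1 selection matrices that crop and shift the spectrum.
        self.M_l = nn.Parameter(torch.zeros(h_out, h_in))
        self.M_r = nn.Parameter(torch.zeros(w_in, w_out))
        # M_mask (384 x 512): initialized from the circular filter (Fig. 4a).
        self.M_mask = nn.Parameter(torch.ones(h_out, w_out))

    def forward(self, hologram):  # hologram: real tensor (batch, 1536, 2048)
        spectrum = torch.fft.fft2(hologram)  # complex spectrum F(I_H)
        # Cast the real trainable matrices to the complex dtype so the
        # matmuls and the element-wise mask apply to the spectrum directly.
        cropped = (self.M_l.to(spectrum.dtype) @ spectrum
                   @ self.M_r.to(spectrum.dtype))
        return cropped * self.M_mask.to(spectrum.dtype)  # (batch, 384, 512)
```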

The outputs of the FIHs, $O R_x^*$ and $O R_y^*$, are further demodulated to remove $R_x^*$ and $R_y^*$ via element-wise division, generating two object waves $O$ with minor differences. These two object waves are then fed into the CVN module as input. The primary tasks of the CVN are: 1) converting $O$ from the complex-valued format into the phase-amplitude format, 2) merging the multiple object waves $O$ into one, and 3) unwrapping the calculated phase. Task 1 requires no trainable parameters, while tasks 2 and 3 are implemented using CNN layers. Alternatively, the order can be switched: the multiple object waves $O$ are first merged in complex-valued format using complex-valued CNN layers, followed by conversion to the phase-amplitude format. We compared these two strategies and found no significant performance differences. Hence, we opt for the first strategy to avoid complex-valued computation in the CNN layers, thereby enabling further acceleration of the model with techniques such as NVIDIA TensorRT.
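The following sketch illustrates the first (chosen) CVN strategy: each complex object wave is converted to amplitude and wrapped phase, and a small real-valued CNN merges the two waves and learns phase unwrapping. The layer count and widths are illustrative assumptions, not the authors' configuration:

```python
import torch
import torch.nn as nn

class ComplexValuedNetworkSketch(nn.Module):
    """Sketch of the CVN: amplitude/phase conversion first (task 1,
    no trainable parameters), then a real-valued CNN that merges the
    two object waves and unwraps the phase (tasks 2 and 3)."""

    def __init__(self, hidden=16):
        super().__init__()
        # Two object waves -> 4 real channels (2 amplitudes, 2 wrapped phases).
        self.cnn = nn.Sequential(
            nn.Conv2d(4, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, 2, 3, padding=1),  # -> amplitude, unwrapped phase
        )

    def forward(self, o_x, o_y):  # complex tensors, each (batch, 384, 512)
        feats = torch.stack(
            [o_x.abs(), o_y.abs(), o_x.angle(), o_y.angle()], dim=1)
        out = self.cnn(feats)
        return out[:, 0], out[:, 1]  # merged amplitude, unwrapped phase
```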

2.5. Model training and testing

The entire model is trained end-to-end in a supervised manner using blood cell samples. The input consists of holograms recorded in the presence of samples and a background hologram recorded without any sample. The target images, or ground truth, are the phase ($\phi$) and amplitude ($A$) images reconstructed from the sample hologram using methods in which Fourier transformation and spatial filtering in the frequency domain are applied [5,29]. This study generated target images using Ovizio.api, software provided with our customized DHM whose technical details remain undisclosed [30]. Autofocusing is not considered or implemented in this study, as our flow cytometry setup ensures that more than 90% of the cells are imaged in correct focus [11].

As shown in Fig. 4(d), there is a data imbalance between the background pixels and the sample pixels. Hence, we use a weighted L1 loss function to guide the model’s focus toward areas of higher importance:

$$L_{loss} = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{w}\sum_{k=1}^{h}\left(\left|\hat{\phi}_{jk}^{i}-\phi_{jk}^{i}\right| + w_A \times \left|\hat{A}_{jk}^{i}-A_{jk}^{i}\right|\right)\times W\!\left(\hat{\phi}_{jk}^{i},\phi_{jk}^{i}\right) \tag{3}$$

where $\hat{A}$ and $\hat{\phi}$ are the predicted amplitude and phase, respectively; $j$ and $k$ are the pixel coordinates, while $w$ and $h$ are the width and height of the image, respectively; $i$ indexes the $i$th sample and $n$ is the batch size. $w_A$ is the weight for amplitude errors, with a default value of 0.1. $W$ is the pixel weight function:

$$W_{jk}\!\left(\hat{\phi}_{jk},\phi_{jk}\right) = 40\times\mathrm{clamp}\!\left[\max\!\left(|\hat{\phi}_{jk}|,|\phi_{jk}|\right),0,0.05\right] + 20\times\mathrm{clamp}\!\left[\mathrm{grad}(\phi_{jk}),0,0.1\right] + 1$$

where

$$\mathrm{clamp}(x,a,b) = \max\!\left[a,\min(x,b)\right]$$

and $\mathrm{grad}(\phi_{jk})$ is the gradient of the target phase image derived with the Sobel filter [31] followed by a Gaussian blur.
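A compact PyTorch rendering of Eq. (3) might look as follows. Note two assumptions: the gradient term is approximated with finite differences instead of the Sobel-plus-Gaussian operator above, and the sums are replaced by a mean (a normalization choice):

```python
import torch
import torch.nn.functional as F

def weighted_l1_loss(phase_pred, phase_gt, amp_pred, amp_gt, w_a=0.1):
    """Sketch of the weighted L1 loss in Eq. (3); tensors are (batch, H, W)."""
    # Pixel weights: emphasize high-|phase| (cell) pixels and phase edges.
    mag = torch.clamp(torch.maximum(phase_pred.abs(), phase_gt.abs()), 0, 0.05)
    # Finite-difference gradient magnitude, padded back to full size.
    dy = F.pad(phase_gt[..., 1:, :] - phase_gt[..., :-1, :], (0, 0, 0, 1))
    dx = F.pad(phase_gt[..., :, 1:] - phase_gt[..., :, :-1], (0, 1, 0, 0))
    grad = torch.clamp(torch.sqrt(dx ** 2 + dy ** 2), 0, 0.1)
    weight = 40 * mag + 20 * grad + 1

    per_pixel = (phase_pred - phase_gt).abs() + w_a * (amp_pred - amp_gt).abs()
    return (per_pixel * weight).mean()
```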

Fig. 4. (a) The circular shape filter in the frequency domain [20], which is also used as the initial weight of $M_{mask}$ in the FIHs. (b-c) The fine-tuned $M_{mask}$ in both FIHs. All the filters are shifted to position the zero frequency at the center for better display. (d) The phase value distribution of all the ground truth images with blood cell samples. Pixels with phase values greater than 0.2 constitute only 1.04% of all pixels in the blood cell dataset. (e) Detailed MAE of OAH-Net predictions for each phase value range on the blood cell test dataset.

All networks mentioned in this paper were trained using the Adam optimizer [32] with a constant learning rate optimized by grid search. Training was halted when the validation loss showed no improvement for 200 successive epochs, and the model with the lowest validation loss was selected for further testing. The inference speed was measured with a batch size of 1 on an NVIDIA GeForce RTX 4090 GPU, after compiling the networks with the PyTorch JIT compiler [33]. The structural similarity index (SSIM) was measured using the corresponding function in scikit-image [34], where the value range of the phase images was normalized from $(-2, 5)$ to $(0, 255)$, matching the range of the amplitude images for easier comparison.
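For reference, the SSIM comparison described above can be reproduced along these lines, assuming the phase range of $(-2, 5)$ rad stated here; function and variable names are ours:

```python
import numpy as np
from skimage.metrics import structural_similarity

def phase_ssim(phase_gt, phase_pred, lo=-2.0, hi=5.0):
    """SSIM between two phase images after rescaling the physical phase
    range (assumed [-2, 5] rad) to [0, 255], as described in the text."""
    def rescale(img):
        return np.clip((img - lo) / (hi - lo), 0.0, 1.0) * 255.0
    return structural_similarity(rescale(phase_gt), rescale(phase_pred),
                                 data_range=255.0)
```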

3. Results

3.1. Model performance

We first assess the performance of the trained model on test data recording blood cell samples. Overall, OAH-Net’s predictions are highly accurate. The error statistics are shown in Table 1, and a typical example is presented in Fig. 3(a). To compare against the intrinsic measurement error, we recorded hologram videos of several static samples (see Fig. 5(a)) and reconstructed the frames using Ovizio.api. The reconstructed images are not identical and exhibit minor fluctuations caused by the hardware. The mean absolute error (MAE) between any two successive frames averages 0.030 ± 0.001 for phase images and 0.677 ± 0.003 for amplitude images, higher than the prediction error of OAH-Net. Interestingly, in rare cases, low-frequency noise remains in the phase image reconstructed via Ovizio.api, as shown in Fig. 3(b). In contrast, OAH-Net successfully removes this noise and generates a more homogeneous background, even though it was trained on Ovizio.api outputs.

Table 1. Performance metrics of OAH-Net, its variants, and benchmarks on the blood cell test dataset.a

| Method | Inference speed (ms/frame) | Trainable parameters | Phase MAE1 | Phase MAE2 | Phase SSIM | Amplitude MAE1 | Amplitude MAE2 | Amplitude SSIM |
|---|---|---|---|---|---|---|---|---|
| HRNet [16] | 28.54 ± 0.54 | 2.8M | 0.052 ± 2.71e-3 | 0.232 ± 3.31e-3 | 0.974 ± 2.02e-3 | n/a | n/a | n/a |
| Y-Net [19] | 122.95 ± 0.09 | 14.7M | 0.104 ± 1.37e-2 | 0.221 ± 1.46e-2 | 0.957 ± 6.50e-3 | 3.787 ± 4.36e-1 | 5.161 ± 3.72e-1 | 0.944 ± 8.55e-3 |
| U-Net filter [26] | 19.11 ± 0.57 | 4.0M | 0.028 ± 5.19e-3 | 0.104 ± 3.46e-2 | 0.993 ± 2.00e-3 | 0.809 ± 4.24e-2 | 3.75 ± 3.02e-1 | 0.985 ± 3.04e-3 |
| Circular filter [20] | 2.65 ± 0.30 | 48.4K | 0.033 ± 6.72e-3 | 0.122 ± 5.39e-2 | 0.928 ± 4.51e-3 | 0.368 ± 5.27e-3 | 0.912 ± 2.89e-3 | 0.995 ± 3.77e-4 |
| OAH-Net vanilla | 2.65 ± 0.30 | 3.7M | 0.012 ± 3.07e-4 | 0.049 ± 9.29e-4 | 0.997 ± 7.89e-5 | 0.372 ± 1.65e-2 | 0.575 ± 3.92e-2 | 0.995 ± 2.19e-4 |
| OAH-Net variant-1 | 2.65 ± 0.30 | 441K | 0.021 ± 4.91e-4 | 0.085 ± 1.41e-3 | 0.995 ± 5.15e-5 | 0.454 ± 7.53e-2 | 0.827 ± 1.31e-2 | 0.994 ± 8.55e-4 |
| OAH-Net variant-2 | 5.59 ± 0.52 | 9.24M | 0.019 ± 5.11e-4 | 0.066 ± 2.32e-3 | 0.943 ± 7.94e-3 | 1.898 ± 2.18e-1 | 3.987 ± 1.49e-1 | 0.986 ± 9.25e-4 |

a MAE1, mean absolute error over the entire image; MAE2, mean absolute error over the cell regions only; SSIM, structural similarity index. Phase values span −2 to 5 rad; amplitude values span 0 to 255 a.u. The inference speed is measured using an NVIDIA GeForce RTX 4090 GPU with a batch size of 1 for all methods. The Circular filter method, OAH-Net vanilla, and OAH-Net variant-1 share the same model architecture and total parameter count (including trainable and non-trainable parameters), resulting in the same inference speed.

Fig. 3. Comparison between the target images and the images predicted by OAH-Net on the test data. (a) A typical sample with a moderate phase prediction error. (b) The sample with the highest phase prediction error; low-frequency noise remains in the target phase image, while it is successfully removed in the OAH-Net prediction. (c) Blood cells of various types reconstructed via OAH-Net exhibit well-preserved intracellular structures, which can be utilized for downstream tasks. RBC, red blood cell; MONO, monocyte; EOS, eosinophil; LYM, lymphocyte; NEU, neutrophil; PLT, platelet.

The error measurement discussed above pertains to the entire image and is denoted MAE1. In practice, it is more important to accurately reconstruct regions containing blood cells, defined as pixels with target phase values greater than 0.2. We denote the MAE computed over cell regions only as MAE2. MAE2 is slightly higher than MAE1, possibly due to the imbalanced data distribution: pixels in the cell regions constitute only 1.04% of all pixels in the blood cell dataset (see Fig. 4(d)). Although the loss function is customized to encourage the model to focus more on the cell regions, the data imbalance issue cannot be solved entirely. Nevertheless, further analysis suggests that this error has negligible effects on the performance of downstream tasks, as shown in Table 2. Figure 4(e) shows that predicting pixels with negative target phase values is more error-prone. These pixels are usually part of the background, and the deviation between their target and predicted values has the positive effect of making the background more homogeneous.
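The two metrics follow directly from these definitions; a short sketch (threshold as defined above, names ours):

```python
import numpy as np

def mae_metrics(pred, target, cell_threshold=0.2):
    """MAE over the whole image (MAE1) and over cell regions only (MAE2),
    with cell pixels defined as target phase values > 0.2."""
    err = np.abs(pred - target)
    mae1 = err.mean()
    cell_mask = target > cell_threshold
    mae2 = err[cell_mask].mean() if cell_mask.any() else np.nan
    return mae1, mae2
```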

Table 2. Performance of YOLO models on the blood cell dataset.a

| Image source | Object type | YOLOv5 amplitude mAP50 | YOLOv5 amplitude mAP95 | YOLOv5 phase mAP50 | YOLOv5 phase mAP95 | YOLOv8 amplitude mAP50 | YOLOv8 amplitude mAP95 | YOLOv8 phase mAP50 | YOLOv8 phase mAP95 |
|---|---|---|---|---|---|---|---|---|---|
| Ovizio | RBC | 0.992 ± 2.00e-2 | 0.922 ± 2.00e-2 | 0.993 ± 4.71e-4 | 0.963 ± 2.94e-3 | 0.993 ± 2.08e-2 | 0.930 ± 1.53e-2 | 0.993 ± 4.71e-4 | 0.975 ± 1.25e-3 |
| Ovizio | WBC | 0.993 ± 1.53e-2 | 0.966 ± 2.65e-2 | 0.995 ± 4.71e-4 | 0.993 ± 4.71e-4 | 0.995 ± 1.00e-3 | 0.970 ± 1.00e-3 | 0.995 ± 4.71e-4 | 0.993 ± 9.43e-3 |
| Ovizio | PLT | 0.956 ± 4.58e-2 | 0.676 ± 3.21e-2 | 0.983 ± 1.00e-3 | 0.884 ± 2.62e-3 | 0.959 ± 5.77e-4 | 0.691 ± 1.00e-3 | 0.984 ± 1.70e-3 | 0.889 ± 4.50e-3 |
| OAH-Net | RBC | 0.994 ± 2.37e-3 | 0.923 ± 4.51e-3 | 0.995 ± 4.71e-4 | 0.974 ± 4.71e-4 | 0.993 ± 2.00e-3 | 0.932 ± 2.00e-3 | 0.996 ± 4.71e-4 | 0.966 ± 8.16e-4 |
| OAH-Net | WBC | 0.993 ± 2.08e-3 | 0.972 ± 6.03e-3 | 0.995 ± 4.71e-4 | 0.990 ± 1.70e-3 | 0.993 ± 2.08e-3 | 0.846 ± 1.00e-3 | 0.995 ± 4.71e-4 | 0.990 ± 4.71e-4 |
| OAH-Net | PLT | 0.955 ± 4.36e-3 | 0.684 ± 4.16e-3 | 0.980 ± 2.49e-3 | 0.842 ± 9.43e-4 | 0.963 ± 1.53e-3 | 0.697 ± 1.00e-3 | 0.981 ± 3.68e-3 | 0.842 ± 4.50e-3 |

a mAP50, mean average precision at an intersection-over-union (IoU) threshold of 0.5 [35]; mAP95, mean average precision at an IoU threshold of 0.95 [36]. RBC, red blood cells; WBC, white blood cells; PLT, platelets.

As shown in Eq. (2), each FIH involves three trainable matrices, $M_l$, $M_r$, and $M_{mask}$, which are multiplied with the Fourier-transformed hologram $\mathcal{F}(I_H)$. $M_{mask}$ functions as a filter in the frequency domain, with the corresponding initial and fine-tuned weights shown in Fig. 4(a)-(c). The fine-tuned $M_{mask}$ of both FIHs are distinct from the initial circular shape and exhibit interesting patterns not reported before [24]. The initial and fine-tuned weights of $M_l$ and $M_r$ are shown in Fig. S2 and Fig. S3. Their initial weights consist only of 0s and 1s and are solely responsible for cropping and shifting the Fourier spectrum. After fine-tuning as trainable parameters, $M_l$ and $M_r$ partially contribute to frequency filtration and improve model performance. During training, the weights of $M_l$, $M_r$, and $M_{mask}$ are not strictly restricted to the range $[0, 1]$, but most of the fine-tuned values (approximately 99%) fall within this range. More detailed statistics are provided in Table S2.

3.2. Benchmarks and model variations

In Table 1, we compare the inference speed and performance of various methods, including OAH-Net, its variants, and other selected algorithms, evaluated on the blood cell test dataset. We excluded methods that require manual selection or iterative processing, as these are unsuitable for high-throughput analysis. Each method was trained three times with different initialization seeds. The vanilla version of OAH-Net consistently achieved the lowest MAE and highest SSIM across all metrics.

HRNet [16] and Y-Net [19] are standalone algorithms that adopt the purely data-driven strategy to convert holograms into phase images (HRNet) or both phase and amplitude images (Y-Net). Y-Net is based on U-Net [37], while HRNet is based on ResNet [38]. Both models were constructed as described in their original reports, with modifications to their final layers to output images 1/16 the size of the input, aligning with our model’s output dimensions. The loss function applied to both models is identical to that used in OAH-Net, as defined in Eq. (3). The amplitude error weight $w_A$ was set to 0 for HRNet, as it does not predict amplitude, while for Y-Net, $w_A$ was set to 1 to balance phase and amplitude errors.

The U-Net filter and circular filter are frequency-domain filtering approaches, not standalone hologram reconstruction methods. We integrated each of these filters within the OAH-Net framework, substituting them for Mmask while keeping Ml and Mr fixed at their initial values. The U-Net filter, following the method of Xiao et al. [26], produces a sample-dependent filter. In the original study, the U-Net filter was trained in a supervised manner with manually annotated frequency filters for each sample, a process that is time-consuming and potentially subjective. Here, the U-Net filter was indirectly trained within the OAH-Net framework in conjunction with the CVN module, without manual annotation of the filters. The circular filter is a classic, non-trainable, regular-shape filter [20], depicted in Fig. 4(a), which serves as Mmask without additional training. We tested three different radii for the circular filter, and the best performance is reported in Table 1.

The matrices $M_l$ and $M_r$ in the FIHs are primarily responsible for cropping and shifting in Fourier space. To investigate whether training these matrices is necessary, we developed OAH-Net variant-1, which keeps $M_l$ and $M_r$ fixed as binary masks for cropping and shifting while keeping the rest of the model identical to the original OAH-Net. The results in Table 1 indicate that variant-1 slightly underperforms the original OAH-Net, suggesting that training $M_l$ and $M_r$ can improve frequency filtration. Furthermore, when non-parallel carrier fringe patterns are generated by certain DHM setups [39], phase compensation cannot be achieved by merely shifting the Fourier frequency; the design of trainable matrices $M_l$ and $M_r$ would be advantageous for adapting to the more complex frequency mappings required for phase compensation. Another variant, OAH-Net variant-2, was tested to determine whether a more complex CVN module, adopting a U-Net architecture following Xiao’s method [26], could further improve model performance. This modification did not yield any notable performance gains but doubled the inference time per frame. Based on these findings, the vanilla OAH-Net remains the preferred model for its optimal balance of accuracy and efficiency.

3.3. Model generalization ability

To assess the generalization capability of OAH-Net, which was trained exclusively on blood cell images, we evaluated its performance on a diverse set of unseen sample patterns. As illustrated in Fig. 5(a), OAH-Net demonstrates consistent accuracy across various sample patterns, indicating that it leverages underlying physical principles rather than overfitting to the training data. In contrast, benchmark methods show a significant drop in SSIM, particularly on challenging samples with "parallel stripes" and the "Siemens star" pattern (Fig. 5(b)). The overall good performance of OAH-Net on the "wedding cake" pattern is likely due to its larger background area with minimal phase and amplitude variation, which is less demanding in terms of reconstruction fidelity. These results suggest that OAH-Net is robust to a range of pattern types, enabling broader applicability in real-world settings with variable sample conditions.

3.4. Performance in downstream task

In most cases, hologram reconstruction serves as a precursor to subsequent tasks such as classification and object detection for further sample analysis [40]. To assess the broader utility of our model, we performed additional validation by integrating it with two object detection models, YOLOv5 and YOLOv8 [12,13]. Our approach involved training and testing the YOLO models initially on target images, followed by repeating the process using images reconstructed by OAH-Net. Both YOLO models were trained using default settings without hyperparameter tuning. Performance comparison, detailed in Table 2, reveals no significant differences in object detection metrics between the two image sources. This outcome suggests that the accelerated hologram reconstruction achieved by our model does not entail any discernible performance trade-offs in subsequent object detection tasks. Furthermore, typical cells of various types reconstructed via OAH-Net are shown in Fig. 3(c). The intracellular structures are well preserved and can be used for a more detailed analysis. Due to limited annotations, object detection models treat all white blood cells as the same class, disregarding their subclasses in this study.
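For readers wishing to reproduce this kind of downstream validation, the ultralytics package exposes a compact training API; the checkpoint and dataset YAML names below are placeholders, not files from this study:

```python
from ultralytics import YOLO

# Hypothetical dataset config pointing at OAH-Net-reconstructed images
# with RBC/WBC/PLT bounding boxes; file names are placeholders.
model = YOLO("yolov8n.pt")                                # pretrained weights
model.train(data="blood_cells_oahnet.yaml", epochs=100)   # default settings
metrics = model.val()                                     # reports mAP metrics
```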

4. Discussion

We demonstrated an end-to-end network for off-axis hologram reconstruction to address the critical bottleneck in high-throughput DHM applications. By integrating deep learning models with the physical principles of off-axis holography, OAH-Net achieves high speed and accuracy in hologram reconstruction, outperforming state-of-the-art methods. Our model processes the original high-resolution hologram (1536 × 2048 pixels) without downsizing or dividing it into smaller fields of view, thus preserving all detailed sample information for precise downstream analysis. Our tests showed that the reconstructed images using OAH-Net performed equally well in the object detection task of different types of blood cells using YOLOv5 and YOLOv8 compared to ground truth data. Meanwhile, streamlined computation allows OAH-Net to achieve an inference speed of less than 3 ms/frame, significantly faster than the acquisition rate of the microscope camera (9.5 ms/frame). Given YOLOv8’s reported inference speed of 3.5 ms per frame for its largest model, OAH-Net could seamlessly integrate with YOLO models to create a real-time diagnostic pipeline encompassing both hologram reconstruction and data analysis. Future work will focus on implementing this pipeline and exploring correlations between the statistical characteristics of blood samples and patient health outcomes.
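A per-frame latency of this kind can be estimated with a simple timing loop; the sketch below reflects the batch-size-1 GPU setup described in the Methods but is not the authors' benchmarking script:

```python
import time
import torch

@torch.no_grad()
def ms_per_frame(model, n_frames=200, shape=(1, 1536, 2048), device="cuda"):
    """Rough per-frame inference latency (batch size 1) on a CUDA device."""
    x = torch.randn(*shape, device=device)
    for _ in range(10):          # warm-up for JIT compilation and CUDA init
        model(x)
    torch.cuda.synchronize()     # make sure queued kernels have finished
    start = time.perf_counter()
    for _ in range(n_frames):
        model(x)
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / n_frames * 1e3  # milliseconds
```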

Compared to previous frameworks [10,11], OAH-Net reduces the holographic reconstruction time by up to 97% (excluding downstream data analysis), achieving a TAT of under 5 minutes (see Fig. 1(b)). In addition, OAH-Net addresses the storage challenges associated with high-throughput imaging by eliminating the need to save raw holographic data. With real-time reconstruction, only the reconstructed images, at 1/16 of the original data size, are stored, significantly lowering storage requirements and costs. When integrated with downstream models for real-time analysis, storage efficiency can be further optimized by saving only frames containing regions of interest, enhancing the framework’s practicality and scalability for clinical applications.

OAH-Net demonstrated strong zero-shot generalization capability across a broad spectrum of sample patterns distinct from the training data. This is a crucial advantage for real-world DHM applications, where inference data distributions may differ significantly from the training data distribution due to sample-related or user-related variations. However, it is important to note that the integrated physical principles are responsible only for frequency filtration. The phase unwrapping task is performed in the CVN module, which is purely data-driven. Hence, samples with phase values outside the training data range of $[-2, 5]$ may be susceptible to reconstruction errors, necessitating fine-tuning of the model with additional data.

The target images used to train OAH-Net are automatically generated without manual annotation. In this study, they are derived via Ovizio.api but could be generated using other numerical methods [5,29]. As shown in our results, OAH-Net produces more homogeneous images and is more effective at filtering low-frequency noise than the ground truth method, indicating its robustness across different ground truth generation methods.

One potential limitation of OAH-Net is the absence of an integrated autofocusing module in the network, as our device setup ensures that more than 90% of cells are imaged in correct focus [11]. However, defocus remains a common and significant issue in many other DHM applications. Incorporating autofocusing functionality into the framework will be a key direction for future study.

Although this study primarily focuses on whole blood cells, the approach can be easily adapted to studies involving other cell types with minimal modification, such as cell culture quality control [41], infection monitoring [42], and toxicity testing [43]. The improvements in efficiency and robustness make the OAH-Net an advantageous and practical solution for real-world applications, especially in clinical diagnostics, where speed and accuracy are paramount. We anticipate that the network architecture’s inherent simplicity and demonstrated effectiveness will catalyze the further adoption of OAH-Net in quantitative phase imaging. Moreover, the potential applications of this approach extend beyond its current uses. It can be seamlessly integrated into various biomedical imaging modalities, especially where phase retrieval and direct processing into the frequency domain are essential. This adaptability and efficiency could mark a significant step forward in the field, promising to advance how we approach and utilize holographic imaging in medical diagnostics and biomedical research.

Supplemental information

Supplement 1. Supplementary.
boe-16-3-894-s001.pdf (9.9MB, pdf)

Acknowledgments

This research is supported by the National Research Foundation, Prime Minister’s Office, Singapore under its Campus for Research Excellence and Technological Enterprise (CREATE) program. The computational work for this article was partially performed on resources of the National Supercomputing Centre (NSCC), Singapore (https://www.nscc.sg).

Funding

National Research Foundation Singapore10.13039/501100001381 (NRF2019-THE002-0008).

Disclosures

W.L., K.D., Q.C., S.K.M., O.H., and H.L. are inventors of a pending patent related to this work filed by the Singapore Patent Office (10202401892U, filed on 26 June 2024). The authors declare that they have no other competing interests.

Data availability

Data and code underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Supplemental document

See Supplement 1 for supporting content.

References

1. Charrière F., Pavillon N., Colomb T., et al., "Living specimen tomography by digital holographic microscopy: morphometry of testate amoeba," Opt. Express 14(16), 7005–7013 (2006). 10.1364/OE.14.007005
2. Xu W., Jericho M. H., Meinertzhagen I. A., et al., "Digital in-line holography for biological applications," Proc. Natl. Acad. Sci. 98(20), 11301–11305 (2001). 10.1073/pnas.191361398
3. Park Y., Diez-Silva M., Popescu G., et al., "Refractive index maps and membrane dynamics of human red blood cells parasitized by Plasmodium falciparum," Proc. Natl. Acad. Sci. U. S. A. 105(37), 13730–13735 (2008). 10.1073/pnas.0806100105
4. Kemper B., Bauwens A., Bettenworth D., et al., Label-Free Quantitative In Vitro Live Cell Imaging with Digital Holographic Microscopy (Springer International Publishing, 2019), pp. 219–272.
5. Verrier N., Atlan M., "Off-axis digital hologram reconstruction: some practical considerations," Appl. Opt. 50(34), H136–H146 (2011). 10.1364/AO.50.00H136
6. Chen H., Huang L., Liu T., et al., "Fourier Imager Network (FIN): a deep neural network for hologram reconstruction with superior external generalization," Light: Sci. Appl. 11(1), 254 (2022). 10.1038/s41377-022-00949-8
7. Huang L., Chen H., Liu T., et al., "Self-supervised learning of hologram reconstruction using physics consistency," Nat. Mach. Intell. 5(8), 895–907 (2023). 10.1038/s42256-023-00704-7
8. Koch C. T., Lubk A., "Off-axis and inline electron holography: a quantitative comparison," Ultramicroscopy 110(5), 460–471 (2010). 10.1016/j.ultramic.2009.11.022
9. Latychevskaia T., Formanek P., Koch C., et al., "Off-axis and inline electron holography: experimental comparison," Ultramicroscopy 110(5), 472–482 (2010). 10.1016/j.ultramic.2009.12.007
10. Ugele M., Weniger M., Stanzel M., et al., "Label-free high-throughput leukemia detection by holographic microscopy," Adv. Sci. 5(12), 1800761 (2018). 10.1002/advs.201800761
11. Klenk C., Erber J., Fresacher D., et al., "Platelet aggregates detected using quantitative phase imaging associate with COVID-19 severity," Commun. Med. 3(1), 161 (2023). 10.1038/s43856-023-00395-6
12. Jocher G., "YOLOv5 by Ultralytics," GitHub (2020), https://github.com/ultralytics/yolov5.
13. Jocher G., Chaurasia A., Qiu J., "Ultralytics YOLOv8," GitHub (2023), https://github.com/ultralytics/ultralytics/blob/main/docs/en/models/yolov8.md.
14. Zhang H., Li F., Liu S., et al., "DINO: DETR with improved denoising anchor boxes for end-to-end object detection," arXiv (2022). 10.48550/arXiv.2203.03605
15. Liu Z., Lin Y., Cao Y., et al., "Swin Transformer: hierarchical vision transformer using shifted windows," arXiv (2021). 10.48550/arXiv.2103.14030
16. Ren Z., Xu Z., Lam E. Y. M., "End-to-end deep learning framework for digital holographic reconstruction," Adv. Photonics 1(01), 1 (2019).
17. Zhang G., Guan T., Shen Z., et al., "Fast phase retrieval in off-axis digital holographic microscopy through deep learning," Opt. Express 26(15), 19388–19405 (2018). 10.1364/OE.26.019388
18. Li Z., Chen Y., Sun J., et al., "High bandwidth-utilization digital holographic reconstruction using an untrained neural network," Appl. Sci. 12(20), 10656 (2022). 10.3390/app122010656
19. Wang K., Dou J., Kemao Q., et al., "Y-Net: a one-to-two deep learning framework for digital holographic reconstruction," Opt. Lett. 44(19), 4765–4768 (2019). 10.1364/OL.44.004765
20. Cuche E., Marquet P., Depeursinge C., "Spatial filtering for zero-order and twin-image elimination in digital off-axis holography," Appl. Opt. 39(23), 4070–4075 (2000). 10.1364/AO.39.004070
21. Mann C., Yu L., Lo C.-M., et al., "High-resolution quantitative phase-contrast microscopy by digital holography," Opt. Express 13(22), 8693–8698 (2005). 10.1364/OPEX.13.008693
22. Matrecano M., Memmolo P., Miccio L., et al., "Improving holographic reconstruction by automatic Butterworth filtering for microelectromechanical systems characterization," Appl. Opt. 54(11), 3428–3432 (2015). 10.1364/AO.54.003428
23. Weng J., Li H., Zhang Z., et al., "Design of adaptive spatial filter at uniform standard for automatic analysis of digital holographic microscopy," Optik 125(11), 2633–2637 (2014). 10.1016/j.ijleo.2013.11.035
24. Hong Y., Shi T., Wang X., et al., "Weighted adaptive spatial filtering in digital holographic microscopy," Opt. Commun. 382, 624–631 (2017). 10.1016/j.optcom.2016.08.056
25. He X., Nguyen C. V., Pratap M., et al., "Automated Fourier space region-recognition filtering for off-axis digital holographic microscopy," Biomed. Opt. Express 7(8), 3111–3123 (2016). 10.1364/BOE.7.003111
26. Xiao W., Wang Q., Pan F., et al., "Adaptive frequency filtering based on convolutional neural networks in off-axis digital holographic microscopy," Biomed. Opt. Express 10(4), 1613–1626 (2019). 10.1364/BOE.10.001613
27. Dubois F., Yourassowsky C., US patent 7,362,449 B2 (2004).
28. Dubois F., Yourassowsky C., US patent 9,207,638 B2 (2015).
29. Akhter N., Min G., Kim J. W., et al., "A comparative study of reconstruction algorithms in digital holography," Optik 124(17), 2955–2958 (2013). 10.1016/j.ijleo.2012.09.002
30. Ovizio Imaging Systems, "OsOne software," https://ovizio.com/software-osone/.
31. Kanopoulos N., Vasanthavada N., Baker R. L., "Design of an image edge detection filter using the Sobel operator," IEEE J. Solid-State Circuits 23(2), 358–367 (1988). 10.1109/4.996
32. Kingma D. P., Ba J., "Adam: a method for stochastic optimization," arXiv (2017). 10.48550/arXiv.1412.6980
33. Paszke A., Gross S., Massa F., et al., "PyTorch: an imperative style, high-performance deep learning library," in Advances in Neural Information Processing Systems, vol. 32, H. Wallach, H. Larochelle, A. Beygelzimer, et al., eds. (Curran Associates, Inc., 2019), pp. 8024–8035.
34. van der Walt S., Schönberger J. L., Nunez-Iglesias J., et al., "scikit-image: image processing in Python," PeerJ 2, e453 (2014). 10.7717/peerj.453
35. Everingham M., Van Gool L., Williams C. K., et al., "The PASCAL Visual Object Classes (VOC) challenge," Int. J. Comput. Vis. 88(2), 303–338 (2010). 10.1007/s11263-009-0275-4
36. Lin T.-Y., Maire M., Belongie S., et al., "Microsoft COCO: common objects in context," arXiv (2015). 10.48550/arXiv.1405.0312
37. Ronneberger O., Fischer P., Brox T., "U-Net: convolutional networks for biomedical image segmentation," arXiv (2015). 10.48550/arXiv.1505.04597
38. He K., Zhang X., Ren S., et al., "Deep residual learning for image recognition," arXiv (2015). 10.48550/arXiv.1512.03385
39. Min J., Yao B., Ketelhut S., et al., "Simple and fast spectral domain algorithm for quantitative phase imaging of living cells with digital holographic microscopy," Opt. Lett. 42(2), 227–230 (2017). 10.1364/OL.42.000227
40. Röhrl S., Bernhard L., Lengl M., et al., "Explainable feature learning with variational autoencoders for holographic image analysis," in BIOIMAGING (2023), pp. 69–77.
41. Kastl L., Isbach M., Dirksen D., et al., "Quantitative phase imaging for cell culture quality control," Cytometry Pt. A 91(5), 470–481 (2017). 10.1002/cyto.a.23082
42. Ekpenyong A. E., Man S. M., Achouri S., et al., "Bacterial infection of macrophages induces decrease in refractive index," J. Biophotonics 6(5), 393–397 (2013). 10.1002/jbio.201200113
43. Kühn J., Shaffer E., Mena J., et al., "Label-free cytotoxicity screening assay by digital holographic microscopy," Assay Drug Dev. Technol. 11(2), 101–107 (2013). 10.1089/adt.2012.476
