Abstract
Optical coherence Doppler tomography (ODT) is attracting increasing attention because of its unprecedented advantages in high contrast, capillary-level resolution and flow speed quantification. However, the trade-off between the signal-to-noise ratio (SNR) of ODT images and A-scan sampling density significantly slows down imaging speed, constraining clinical applications. To accelerate ODT imaging, a deep-learning-based approach is proposed to suppress the overwhelming phase noise arising from low sampling density. To handle the issue of limited paired training datasets, a generative adversarial network (GAN) is employed to implicitly learn the distribution underlying Doppler phase noise and to generate synthetic data. A 3D convolutional neural network (CNN) is then trained and applied for image denoising. We demonstrate that this approach outperforms traditional denoising methods in noise reduction and image detail preservation, enabling high-speed ODT imaging with low A-scan sampling density.
Keywords: cerebral capillary flow imaging, deep learning, optical coherence Doppler tomography, imaging speed improvement
1. INTRODUCTION
Optical Coherence Doppler Tomography (ODT) is an emerging biophotonic imaging technique that enables subsurface imaging of blood flow based on the Doppler effect induced by moving red blood cells1, 2. Compared to multiphoton microscopy and other angiography modalities (e.g., fluorescein angiography, indocyanine green angiography), ODT is tracer-free and capable of quantitative 3D imaging of microvascular flow over a large field of view (FOV), making it uniquely attractive for diagnoses associated with vascular abnormalities (e.g., tumor microenvironment, wound healing, retinal vascular occlusion) as well as in vivo assessment of disease prognosis and even therapeutic outcomes3–5. However, the limited imaging rate of current ODT hinders its clinical adoption. ODT operates on the detection of the phase shift (Δφ) between two A-scans separated by a given duration T: Δφ = 4nπTv/λ, where n is the refractive index, v is the flow speed and λ is the central wavelength. Slow and dense A-scans are required to reconstruct slow microcirculatory flows in each B-scan, resulting in long acquisition times for 3D capillary flow imaging over a large FOV2. Attempts to improve ODT imaging speed include increasing the A-scan rate (decreasing T) or reducing the A-scan sampling density. Increasing the A-scan rate compromises ODT sensitivity for detecting capillary flow networks, whereas reducing the sampling density decreases the correlation between adjacent A-scans and results in overwhelming background phase noise in ODT images6.
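The Doppler relation above can be inverted to estimate axial flow speed from a measured phase shift. A minimal sketch, assuming SI units and an illustrative tissue refractive index n ≈ 1.35 (the function name and default value are not from the paper):

```python
import math

def flow_speed(delta_phi, wavelength, T, n=1.35):
    """Invert delta_phi = 4*n*pi*T*v/lambda for the axial flow speed v.
    wavelength in meters, T (A-scan interval) in seconds -> v in m/s.
    n is an assumed tissue refractive index."""
    return delta_phi * wavelength / (4 * n * math.pi * T)
```

Note how a longer A-scan interval T maps the same phase shift to a slower flow, which is why slow, dense A-scans are needed to resolve slow capillary flows.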
To tackle this challenge, we propose a deep-learning-based denoising method that accelerates the frame rate of flow imaging by restoring flow signals from noisy ODT images acquired in a faster, lower-sampling-density mode. Methods such as block-matching and 3D filtering (BM3D)7 and sparsity-based dictionary learning8 have shown great promise for suppressing noise in OCT images. However, these methods fail to sufficiently preserve image details and are computationally intensive, and thus may not be appropriate for ODT image denoising. Recently, deep learning approaches have been reported to effectively address end-to-end image restoration problems including image denoising9, 10. Studies have featured convolutional neural networks to reduce speckle noise in OCT images11–13. Their flexible architecture and strong learning ability enable deep learning models to automatically and intensively extract image features, yielding better detail preservation than conventional methods. However, sufficient paired noisy and clean images are needed to train a deep neural network for image denoising. Such a paired training dataset of ODT images may not be practical to derive because of the high workload and complexity of acquiring in vivo data from live animals. In previous reports10, the noisy images in the training dataset were simulated by adding Gaussian noise to clean images. However, this method is not applicable to ODT images because the assumption that phase noise follows a Gaussian distribution is not guaranteed. In addition, the noise level in ODT images is not linearly correlated with the sampling density, so prior information on the noise level is unavailable.
To lift these limitations, we present a novel ODT denoising method based on a convolutional denoising network (DenoiseNet) that transforms a low-sampling (LS) image into a high-sampling (HS) image. To adapt to volumetric ODT images, we applied a 3D convolutional neural network to reduce background noise while preserving microvasculature. By further applying dilated convolution fusion blocks14, DenoiseNet covers more context with fewer layers, effectively improving feature learning and reducing the network scale. To solve the issue of limited LS-HS image pairs, we adopted a generative adversarial network (GAN) to implicitly learn the noise distribution from noise patches extracted from LS images. Noisy LS images were then simulated by adding GAN-generated noise samples to an existing HS image dataset, constructing the image pairs used to train the proposed DenoiseNet. This strategy allowed effective training of a deep neural network with only unpaired datasets available. By applying DenoiseNet to ODT images, microcirculatory flow signals can be successfully recovered from LS images disrupted by severe noise, making it possible to significantly increase the image acquisition speed while maintaining flow image quality.
2. METHODS
2.1. System setup and image acquisition
In vivo 3D ODT images of mouse microcirculatory cerebral blood flow (CBF) were acquired using an ultrahigh-resolution fiberoptic OCT (μOCT) setup15 illuminated with an ultrabroadband light source (λ=1310nm, Δλ=220nm) to achieve an axial resolution of 2.5μm. Light exiting the sample arm was collimated, transversely scanned by a fast servo mirror and focused with an NIR objective (f=16mm/NA=0.25) onto the mouse cortex through a cranial window, yielding a lateral resolution of ~3.2μm. In the detection arm, the spectral interference fringes encoding the depth profile of the OCT signal (A-scan) were detected by a linescan InGaAs camera (2048 pixels, 148klines/s; GL2048, Sensors Unlimited). Synchronized with transverse scans for μOCT acquisition, cross-sectional (B-scan) ODT images were reconstructed by the phase subtraction method2. The B-scan ODT images were acquired at 6k/s to detect capillary flows in mouse cortex, with 3.5k and 14k A-lines per B-scan for LS and HS flow images, respectively, corresponding to 2×3mm2 in the transverse and depth directions. A stack of 600 B-scans was acquired to form a 3D ODT image of 2×3×2.4mm3.
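The phase subtraction reconstruction referenced above can be sketched in a few lines: the Doppler phase shift between adjacent complex A-scans is the angle of their conjugate product. This is a common formulation of the method; the function name and array layout are illustrative assumptions:

```python
import numpy as np

def doppler_bscan(ascans):
    """Phase-subtraction ODT sketch: the Doppler phase shift between
    adjacent complex A-scans is the angle of their conjugate product.
    ascans: complex array of shape (n_alines, n_depth); returns an
    array of shape (n_alines - 1, n_depth) of phase shifts in radians."""
    return np.angle(ascans[1:] * np.conj(ascans[:-1]))
```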
To train the GAN for noise generation, 2D noise patches were extracted from LS ODT images: a mask of blood flows was first segmented on the noisy ODT images, and a sliding window was then used to search for regions without flow signals. A total of 20k noise patches were extracted from 500 LS B-scan images. To train DenoiseNet, ten 3D HS ODT images containing 6k B-scans in total were acquired as ground truth, whereas the corresponding LS B-scan ODT images (SimulD) were generated using the method described below (Section 2.2). In addition, a dataset of HS ODT images with additive white Gaussian noise of level σ = 15 (SimulD-G) was used to compare our method against training with additive Gaussian noise. An additional four 3D ODT image sets (ValidD), each containing paired LS and HS images, were acquired for evaluating denoising efficacy.
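The patch extraction step might look like the following hypothetical helper, which slides a window over an LS B-scan and keeps only patches free of segmented flow pixels (the window size and stride shown are illustrative choices, not values from the paper):

```python
import numpy as np

def extract_noise_patches(odt_bscan, flow_mask, patch=64, stride=32):
    """Slide a window over a B-scan and keep patches containing no
    segmented flow pixels. odt_bscan: 2D array; flow_mask: boolean
    2D array, True where flow was segmented."""
    patches = []
    H, W = odt_bscan.shape
    for r in range(0, H - patch + 1, stride):
        for c in range(0, W - patch + 1, stride):
            if not flow_mask[r:r + patch, c:c + patch].any():
                patches.append(odt_bscan[r:r + patch, c:c + patch])
    return patches
```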
2.2. GAN for noise learning
Training a deep neural network requires sizable and diverse datasets from large numbers of animals in vivo, which is not practical to acquire. To simplify dataset construction, we generated synthetic noisy images by adding noise to clean images. Some neural-network-based denoising methods were trained with noisy input images generated simply by adding Gaussian noise at a specific noise level10. However, such methods may not perform well here because the Doppler phase noise pattern is much more complicated than the Gaussian model, with contributions from multiple noise sources. To model Doppler phase noise, we applied a GAN to learn it from real ODT images acquired with low sampling density16, 17.
The GAN framework consisted of a generative network (G) and a discriminative network (D) that competed against each other17, as shown in Figure 1A. The proposed G applied a dense connection layer and a reshape layer to transform input random vectors into 2D feature maps. Five convolutional layers with the same number of kernels were then applied to extract features. The first four convolutional layers were each followed by batch normalization (BN), ReLU activation, and an upsampling layer that doubled the size of the feature maps; the output layer was activated with a tanh function18. The outputs of G, along with real noise patches extracted from LS ODT images, were fed to D, which similarly contained four convolutional layers followed by BN, and a dense connection layer as the output without activation. The kernel size for the convolutional layers in both G and D was set to 3×3. During adversarial training, the Wasserstein adversarial loss was applied to measure the similarity between the real data distribution pr and the generated data distribution pg19:
LG = −Ez[D(G(z))] (1)
LD = Ez[D(G(z))] − Ex∼pr[D(x)] (2)
where G and D are the operators of the generator and discriminator networks, respectively, and x and z represent real images and random input vectors. This adversarial loss pushes G to learn a mapping from random inputs to output images whose distribution matches that of the real images (Figure 1A)18.
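In these terms the two Wasserstein losses reduce to simple means over the critic's scores. A minimal NumPy sketch, with d_real = D(x) on real noise patches and d_fake = D(G(z)) on generated ones (the function name is illustrative):

```python
import numpy as np

def wgan_losses(d_real, d_fake):
    """Wasserstein adversarial losses: the generator minimizes
    -E[D(G(z))]; the critic minimizes E[D(G(z))] - E[D(x)],
    i.e. it maximizes the estimated Wasserstein distance."""
    g_loss = -np.mean(d_fake)
    d_loss = np.mean(d_fake) - np.mean(d_real)
    return g_loss, d_loss
```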
FIGURE 1.

A schematic to illustrate the proposed deep learning-based method. (A) GAN for noise image generation. In training stage, the generator (G) receives random vectors and produces synthetic noise patches. The discriminator (D) is trained to distinguish the synthetic noise patches from real-noise patches. Both G and D are CNN models; (B) DenoiseNet contains a denoise network for image denoise and an auxiliary D for consistency enhancement. The denoise network is constructed on an encoder-decoder structure based on 3D convolutional layers; D is a CNN model. 3DConv and BN represent 3D convolutional layer and batch normalization, respectively.
2.3. DenoiseNet for ODT denoise
Image denoising, which recovers a clean volumetric image x ∈ RN×M×L from a noisy image y ∈ RN×M×L, can be written as:
y = x + μ (3)
where μ ∈ RN×M×L is the noise image. The denoising process can be expressed as:
x̂ = f(y) (4)
where f is the denoiser and x̂ is the estimated image. DenoiseNet is thus trained by modelling f so that f(y) approaches x. It consists of two components: a single 3D convolutional network for image denoising and an additional discriminator network used to improve the consistency between denoised outputs and realistic HS images. These two components were trained in an adversarial fashion similar to the GAN in Section 2.2. Figure 1B illustrates the principle of DenoiseNet, in which the denoise network acts as G: the inputs are 3D patches of LS images and the outputs are 3D feature maps. The auxiliary discriminator network, D, determines whether the denoised images are consistent with real HS images. The denoise network employs an encoder-decoder architecture to reduce memory usage and computational time by first downsampling the inputs to half size. Instead of standard 2D convolutional layers, we employed 3D convolutional layers to better learn the curvilinear vasculature features in 3D ODT images. The encoder contains four 3D convolutional layers, of which the first downsamples the inputs to half size. The decoder utilizes three 3D dilated fusion blocks20, followed by a 3D deconvolutional layer to restore the feature maps to the original size. The 3D dilated fusion block merges normal and dilated 3D convolutional layers in two paths to learn better representations20. To facilitate training, a skip connection was built between the input and output to integrate the residual learning formulation. The associated D contains four 3D convolutional layers with a global sum layer that projects the feature map to a vector, which is processed by a fully connected layer for the final output. The kernel size is set to 3×3×3 throughout DenoiseNet. To train DenoiseNet, two loss functions were jointly used: Mean Absolute Error (MAE) for reconstruction and a GAN loss for the fidelity of outputs.
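The dilated-convolution idea behind the fusion blocks can be illustrated in 1D: dilation widens the receptive field without adding weights, which is why fewer layers can cover more context. A minimal sketch (the network's blocks use 3D kernels; this 1D version is purely illustrative):

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation=1):
    """Minimal 1D dilated convolution with valid padding. A kernel of
    length k spans (k - 1) * dilation + 1 input samples, so dilation=2
    with a 3-tap kernel covers 5 samples using only 3 weights."""
    k = len(kernel)
    span = (k - 1) * dilation + 1
    out = np.empty(len(x) - span + 1)
    for i in range(len(out)):
        out[i] = sum(kernel[j] * x[i + j * dilation] for j in range(k))
    return out
```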
Thus, the loss function of G in optimization becomes:
LG = E[|G(y) − x|] − αE[D(G(y))] (5)
where |G(y) − x| is the MAE reconstruction loss and α is a weighting hyperparameter that balances the two loss terms. The loss function of D remains the same as Equation 2. A three-phase training procedure was adopted to stabilize training21: (1) train the denoise network with the MAE loss only; (2) with the parameters of the denoise network frozen, train D with the GAN loss from Eq. (2); and (3) train the two networks jointly with the loss function from Eq. (5).
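The joint generator objective of Eq. (5) reduces to a weighted sum of the two terms. A minimal NumPy sketch, with the function name and argument layout as illustrative assumptions and alpha = 0.2 matching the training setup:

```python
import numpy as np

def generator_loss(denoised, target, d_fake, alpha=0.2):
    """Joint objective: MAE reconstruction term plus the Wasserstein
    adversarial term -E[D(G(y))], weighted by alpha."""
    mae = np.mean(np.abs(np.asarray(denoised) - np.asarray(target)))
    adv = -np.mean(d_fake)
    return mae + alpha * adv
```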
2.4. Training details
The length l of the input random vector for the GAN generator was set to 100. The size of the output patch V′ was 64×64. We trained the GAN in a min-max optimization procedure using the Adam solver with a learning rate of 5×10−4 and a mini-batch size of 64. For DenoiseNet, the size of the input patches was set to 64×64×64. In all three phases of training, the Adam solver with an initial learning rate of 10−4 was applied, with learning rate decay used in the first phase. The mini-batch size for DenoiseNet was set to 8 due to limited GPU memory. The parameter α in Eq. (5) was empirically set to 0.2 to balance image content consistency and background noise reduction. The proposed method was implemented in TensorFlow22 and trained on an NVIDIA RTX 2070 GPU with 8 GB of memory.
2.5. Performance evaluation
In our design, noise learning is assumed to improve the denoising performance of DenoiseNet. To investigate its influence, two network variants were trained: DenoiseNet-G, trained on SimulD-G without noise learning, and the full DenoiseNet, trained on SimulD. At testing time, all networks were evaluated on the dataset ValidD. Furthermore, the performance of our network was compared with other competing methods including BM3D7, DnCNN10, DDFN14 and IrCNN23. The DnCNN, DDFN and IrCNN models were trained with SimulD for a fair comparison with DenoiseNet.
To quantitatively evaluate the performance of the different methods, the peak signal-to-noise ratio (PSNR) was calculated as an evaluation metric. PSNR measures the similarity of the denoised image to the ground truth:
PSNR = 10·log10(MAX²/MSE) (6)
where MAX is the maximum value of the image and MSE is the mean squared error. In addition, to evaluate the performance in preserving vascular topology, we applied an automatic segmentation method24 to extract vascular masks from the denoised MIP images of each method. The Dice coefficient was then used to assess the similarity to the ground truth:
Dice = 2TP/(2TP + FP + FN) (7)
where TP, FP, and FN are the true positive, false positive, and false negative counts.
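Both metrics are straightforward to compute. A minimal NumPy sketch of Eqs. (6) and (7), with illustrative function names:

```python
import numpy as np

def psnr(denoised, truth, max_val):
    """Eq. (6): 10 * log10(MAX^2 / MSE)."""
    mse = np.mean((np.asarray(denoised, float) - np.asarray(truth, float)) ** 2)
    return 10 * np.log10(max_val ** 2 / mse)

def dice(pred_mask, true_mask):
    """Eq. (7): 2TP / (2TP + FP + FN) on boolean vascular masks."""
    tp = np.logical_and(pred_mask, true_mask).sum()
    fp = np.logical_and(pred_mask, ~true_mask).sum()
    fn = np.logical_and(~pred_mask, true_mask).sum()
    return 2 * tp / (2 * tp + fp + fn)
```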
3. RESULTS
3.1. Phase noise generation
Figure 2A plots the outputs of the generator at different training steps and compares the additive white Gaussian noise (AWGN) model and GAN-generated noise samples with the real images. The phase noise in real images shows heterogeneous patterns of randomly distributed bright spots. Given the correlation between phase noise and OCT signal intensity, this special noise pattern is associated with speckle noise resulting from the heterogeneity of tissue micromorphology. The Step-30K panels in Figure 2A show that the GAN effectively learned the heterogeneous noise patterns and generated patches highly similar to the real input images. In comparison, the AWGN patches in the Gaussian panels show uniformly distributed noise patterns that are much less correlated with the real images. The histograms of the corresponding noise types in Figure 2B indicate that the GAN can learn a complicated noise distribution and effectively simulate ODT noise characteristics.
FIGURE 2.

Comparison between AWGN and GAN-generated noise. (A) The outputs of the generator at different training steps show that the GAN gradually learns the noise distribution. At the final training step (30k), the generated noise patches show a highly similar distribution to the real ones, which differ significantly from Gaussian noise, as is evident from the three exemplary images displayed in each row; (B) the histograms show the differences among the three types of noise distribution.
3.2. Performance of denoise
To demonstrate the effectiveness of the proposed DenoiseNet approach, we compared it with other state-of-the-art methods (Figure 3), in which panels (Figure 3A,B) show exemplary clean HS and noisy LS images, and panels (Figure 3C–G) compare the denoised results from different methods. The 3D ODT images in Figure 3A–G show that the flow signal in the LS image (Figure 3B) is severely disturbed by phase noise, resulting in washout of microcirculatory flows compared to the clean HS image (Figure 3A). The phase noise is effectively reduced by the deep-learning-based methods (Figure 3D–G), but the noise reduction by BM3D (Figure 3C) is minimal compared with the input image (Figure 3B). To illustrate the noise reduction, we generated maximum intensity projection (MIP) images from the 3D ODT data. For better visualization, MIP images were encoded in an 8-bit glow pseudo color indicating flow velocity from 0mm/s to 1.4mm/s. As shown in Figure 3A1–G1, the deep learning methods effectively reduce the background noise (Figure 3D1–G1), whereas the non-deep-learning method fails (Figure 3C1). With respect to signal restoration, the deep-learning-based methods can well retrieve the capillary flows, whereas the SNR improvement by BM3D is marginal. We noticed that DenoiseNet achieved better flow signal consistency across B-scans than the other deep learning methods. As highlighted by the green arrows in Figure 3F1, flow discontinuity can be observed in the denoised images of IrCNN. This is likely because the 2D-convolution-based methods operate slicewise, discarding valuable 3D context information that is crucial for tracking 3D curvilinear structures. To further assess the efficacy of these denoising methods for enhancing the SNR of the capillary network, we compared the MIP images in deeper cortex at 100μm below the surface (Figure 3A2–G2) and the zoom-in images of the dashed boxes (Figure 3A3–G3).
Although DenoiseNet and the other deep-learning-based methods can effectively minimize phase noise, DenoiseNet (Figure 3G2,G3) is more reliable in preserving vascular connectivity, as indicated by the two blue arrows in the zoom-in images. Among the comparison methods, DnCNN and DDFN are slightly better than IrCNN at signal restoration, as highlighted by the green arrows in Figure 3D3–F3. This might reflect the limited depth of IrCNN (9 layers), which limits signal preservation. To further demonstrate that DenoiseNet outperforms the other methods in preserving vascular topology, we overlaid the vascular masks of the denoised images (Figure 3D2–G2) from individual methods with the ground truth and coded error pixels in red (Figure 3A4–D4). The overlaid masks clearly show that DenoiseNet maintains better connectivity of the vascular networks. Quantitative analyses of these methods are summarized in Table 1. DenoiseNet improves the PSNR by 70.1% over the LS image and outperforms all other methods, especially the non-deep-learning approach.
FIGURE 3.

Comparisons of the denoising efficacy of different methods. (A-G): 3D display of clean HS ODT, noisy LS ODT, and denoising results of a non-deep-learning-based method (BM3D) and deep-learning-based methods (DnCNN, DDFN, IrCNN, DenoiseNet), respectively. (A1-G1): MIP images of upper cortex (0–100μm); (A2-G2): MIP images of lower cortex between 100–200μm from the cortical surface; (A3-G3): zoom-in images within the dashed boxes in (A2-G2), with blue arrows highlighting that DenoiseNet enhances microvascular flows better than the other methods; (A4-D4): binary masks of (D2-G2) with the error (missing) pixels coded in red. Note the lower density of missing pixels for DenoiseNet (D4).
TABLE 1.
Summary of PSNR and Dice of different methods
| | LS ODT | BM3D | DnCNN | DDFN | IrCNN | DenoiseNet |
|---|---|---|---|---|---|---|
| PSNR | 20.18 | 25.23 | 33.82 | 34.31 | 33.80 | 34.33 |
| Dice | NA | NA | 0.851 | 0.858 | 0.831 | 0.862 |
3.3. Effects of noise learning
To demonstrate the efficacy of the noise-generation GAN in further enhancing DenoiseNet for phase noise suppression, we compared the results of DenoiseNet trained on Gaussian-noise-corrupted images (DenoiseNet-G) and on GAN-generated-noise-corrupted images (DenoiseNet). Figure 4A–B shows a pair of exemplary clean HS ODT and noisy LS ODT images; Figure 4C–D shows the resulting images denoised by the two training variations of DenoiseNet. Because of the different distributions of Gaussian noise and real noise, DenoiseNet-G failed to effectively remove the background noise, as indicated by the surface plots of noise distribution (Figure 4A2–D2) within the dashed green boxes of the zoom-in images (Figure 4A1–D1), although the capillary flows are largely preserved. By comparison, DenoiseNet (Figure 4D1) suppresses the background noise better and maintains the flow connectivity of the microvascular network. For quantitative analysis, we calculated the background noise levels (noise standard deviation) within each dashed blue box and compared the noise performance, as summarized in Table 2. The results show that both DenoiseNet-G and DenoiseNet drastically improved the PSNR, from 20.18 to 33.23 and 34.33, respectively. More importantly, the noise background of the image denoised with DenoiseNet (2.22±0.23) was significantly lower than that with DenoiseNet-G (4.38±0.24, p<0.005), demonstrating the efficacy of training DenoiseNet with noisy images generated by the GAN rather than with a generic Gaussian noise distribution.
FIGURE 4.

Noise reducing effects of DenoiseNet variations. (A-D): MIP images of HS and LS ODT, and denoised results of B by DenoiseNet-G and DenoiseNet, respectively; (A1-D1): zoom-in images of the dashed boxes. Blue boxes in C1 and D1 are regions to evaluate noise suppression effects. (A2-D2): surface plots of noise distribution within the dashed green boxes in (A1-D1).
TABLE 2.
Summary of PSNR and noise standard deviation (Noise SD) of DenoiseNet variations
| | HS ODT | LS ODT | DenoiseNet-G | DenoiseNet |
|---|---|---|---|---|
| PSNR | NA | 20.18 | 33.23 | 34.33 |
| Noise SD | 1.23±0.28 | 12.23±0.43 | 4.38±0.24 | 2.22±0.23 |
3.4. Enhancing the analysis of cocaine elicited microischemic events
The application of ODT to investigate the effects of cocaine on brain function has benefited from its high spatial resolution for flow detection and large FOV25, but the imaging rate is compromised to maintain the high sensitivity required for capillary flow detection. To address this problem, we acquired 3D ODT images of mouse cortex to track flow changes before and after cocaine injection. The proposed DenoiseNet method was aimed at enabling higher temporal resolution to measure the dynamic changes in the CBF network and to identify more detailed microischemic events induced by acute cocaine administration. To demonstrate the advantages of faster image acquisition enabled by the proposed noise deep learning method, we tracked flow changes using deep-learning-denoised LS ODT images and compared them with HS ODT images. The LS ODT images were acquired at a relatively high acquisition rate (10k/s) and with a low A-line number (3.5k/B-scan), reducing the total time for acquiring each image cube to about 1min. DenoiseNet was then applied to the LS images to reduce background noise and enhance the flow detection sensitivity. The HS ODT images were acquired at the same A-line rate but with a high A-line number (14k/B-scan), requiring a significantly longer acquisition time of 4min. The results in Figure 5 show that our deep-learning-based denoising approach can effectively suppress phase noise in 3D ODT imaging of the capillary CBF network, relaxing the time constraints on detecting subtle dynamic CBF changes; in our example it was applied to measure the response to a pharmacological challenge (e.g., cocaine), but it could also be applied to measure transient responses to functional stimulation. Figure 5A shows a full-size (2.2×1.8×1.4mm3) HS ODT image at baseline (e.g., t=−4min before cocaine injection), from which a smaller panel (0.35×1.8×1.4mm3) in the dashed blue box (Figure 5B) was selected to track dynamic flow changes.
Figure 5C,D shows pairs of flow dynamic images within the smaller panel acquired around cocaine injection (e.g., at t=−2, 8, 12 min) for fast LS ODT before and after DenoiseNet processing, respectively. Flows in four branch vessels (3 arterioles and 1 venule) and the averaged flows from three capillaries were selected for tracking cocaine-elicited flow changes. Because of the overwhelming background phase noise, most small flows in Figure 5C were barely detectable, whereas DenoiseNet effectively suppressed the phase noise and restored flow signals so that Figure 5D clearly displays the flow distributions. Figure 5E,F plots the time-lapse relative CBF changes, ΔCBF(t)=[CBF(t)−CBF(tb)]/CBF(tb) (tb: baseline), in the selected four vessels and the capillaries acquired with slow HS scans (Figure 5E) and fast LS scans (Figure 5F), respectively. Cocaine-induced CBF decreases lasted about 10min (Figure 5F), during which flow decreased in all four vessels and reached its minimum around 5–9min after injection (10mg/kg, iv), followed by a recovery phase until 10min and then some overshooting effects. Interestingly, the flow decreases were heterogeneous across these vessels, ranging from −10% to −35%, and both the decrease and recovery phases of the venular flow lagged slightly (~1min) behind the arteriolar flows. The capillary flow followed the same trend (decreasing first and then recovering), although it varied dramatically over time. In comparison, because of the fourfold slower acquisition of HS ODT images, the dynamic features of the CBF network (Figure 5E) were largely washed out; e.g., the flow decrease amplitudes of all vessels (<10%) were seriously underestimated, and the heterogeneous dynamic changes observed for the different microvessels could not be reliably tracked.
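The relative CBF change ΔCBF(t) plotted in Figure 5E,F is a simple normalization against baseline. A minimal sketch (the function name is illustrative):

```python
import numpy as np

def relative_cbf_change(cbf, baseline_idx=0):
    """dCBF(t) = (CBF(t) - CBF(tb)) / CBF(tb), with tb given by
    baseline_idx; returns fractional change per time point."""
    cbf = np.asarray(cbf, dtype=float)
    return (cbf - cbf[baseline_idx]) / cbf[baseline_idx]
```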
FIGURE 5.

Deep-learning-based 3D ODT to enhance dynamic tracking of CBF changes in mouse cortex induced by acute cocaine injection. (A) Full-size 3D HS ODT image at baseline (t=−4min), in which a smaller region in the dashed blue box was selected to track dynamic flow changes after cocaine injection; (B) A zoom-in HS ODT at baseline (t≈-4min); (C-D) 3D LS images at t=−2, 8, 12min before and after noise suppression with DenoiseNet; (E-F) Time-lapse CBF changes within 4 vascular compartments, e.g., 3 arterioles (yellow, green and purple curves), 1 venule (dark blue curve) and capillaries (blue curve) as marked in panels (B, D) acquired with slow (HS) ODT scans and fast (LS) ODT scans after denoise processing, respectively, and quantified as the ratio of the flow compared to baseline (pre-cocaine injection). The dynamic curve corresponds to the selected ROI with the same color. Note the much greater sensitivity to detect flow changes after noise suppression.
4. DISCUSSION
In this paper, we presented a deep learning approach to suppress the phase noise caused by low A-line sampling density, together with in vivo animal results that demonstrate its potential to significantly increase ODT imaging speed. The relevance of this method for ODT denoising can be summarized from two perspectives. (1) Instead of using Gaussian noise for simulation, we applied a GAN model to learn Doppler flow noise features directly from the acquired ODT images. The results in Figure 2 show that the GAN was able to effectively learn the phase noise features and generate noise patches with closer similarity to real noise patches than the Gaussian model. This allowed us to generate a training dataset with a smaller distribution gap to the real dataset and a denoising network that is more effective on real noisy images. Indeed, the comparative study shown in Figure 4 verified that the denoising network trained with our GAN model outperformed those trained with Gaussian noise. (2) We designed DenoiseNet to perform image denoising based on a 3D encoder-decoder architecture. To accommodate 3D ODT images, we employed 3D convolutional layers in the denoising network to more accurately capture the features of the vascular networks in ODT images and thus better preserve flow image details. To further enhance the correlation between the denoised outcome and the real HS ODT image, an auxiliary discriminator was adopted to perform adversarial training of the denoising network. Compared with other state-of-the-art methods, our DenoiseNet model showed superior performance in flow noise removal and recovery of fine capillary flows (Figure 3).
The ability to image dynamic CBF changes with high spatiotemporal resolution is critical to advance our understanding of the highly complex neurovascular interactions under various stimulations and of brain diseases associated with ischemia and the cascading cellular and molecular events. Unlike multiphoton microscopy, ODT enables tracer-free, larger-scale and quantitative 3D imaging of the CBF networks in mouse cortex, which is highly suitable for studying neurovascular events in pharmacological responses such as cocaine-elicited microischemia that require both high spatial resolution and a large field of view25, 26. However, as Doppler flow detection sensitivity is inversely proportional to the A-scan rate, a major challenge in current ODT studies lies in the fact that the ability of ODT to track flow network dynamics is compromised by reduced sensitivity for capillary flow detection and/or a limited field of view. As DenoiseNet can effectively reduce phase noise and thus enhance the signal-to-noise ratio of flow detection, this method may reduce A-scan numbers and accelerate 3D ODT image acquisition. As proof of concept, we applied the method to track the dynamic CBF response in mouse cortex to an acute cocaine challenge (1mg/kg, iv). Indeed, Figure 5 indicates that by applying DenoiseNet, we were able to increase the imaging rate by 4 times while maintaining almost the same high sensitivity for capillary flow detection. Importantly, the improved temporal resolution allowed us to track dynamic changes in various microvessels elicited by acute cocaine that could otherwise have been washed out. Although more work is needed to optimize deep-learning algorithms and ODT scanning schemes, these results show the potential to extend 3D ODT to applications in brain functional studies such as transient ischemia, stroke, cortical spreading depression, and various brain activations27–29.
In practice, the pattern and level of phase noise in ODT images are system dependent, affected by various factors including A-line sampling density, scanning methods and the laser source. To achieve optimal performance, the neural networks for image denoising need to be retrained with datasets from different imaging settings. Our proposed method provides a feasible solution for learning phase noise accurately, so that performance optimization for different imaging settings requires only the collection of a small LS ODT dataset, while the HS dataset can be shared across training sessions. In this study, the A-scan sampling density was set to 3.5k. In the future, we will explore the possibility of recovering ODT images with further reduced sampling density for higher imaging speed. One limitation of the proposed neural network stems from the unbalanced weight between background and flow signal. Due to the sparsity of the flow signal in ODT images, the background occupies more pixels in the training images, biasing DenoiseNet toward removing noise rather than recovering signal. To solve this issue, we will explore different weight balancing techniques30.
In summary, we report a deep-learning method for efficient phase noise removal from ODT images. With the datasets generated by the GAN, the proposed denoising network permits effective noise reduction and flow signal preservation. We also present in vivo mouse brain imaging results demonstrating its efficacy in shortening 3D ODT image acquisition time, making it suitable for brain functional studies at high spatiotemporal resolution.
ACKNOWLEDGMENTS
We thank Kichon Park for help with the in vivo imaging. This research was supported by National Institutes of Health grants R01DA029718 (C.D. and Y.P.), R21DA042597 (Y.P. and C.D.) and RF1DA048808 (Y.P. and C.D.).
Abbreviations:
- OCT: optical coherence tomography
- ODT: optical coherence Doppler tomography
- CNN: convolutional neural network
- GAN: generative adversarial network
- BM3D: block-matching and 3D filtering
- PSNR: peak signal-to-noise ratio
- HS: high sampling
- LS: low sampling
- MAE: mean absolute error
- CBF: cerebral blood flow
- G: generative network
- D: discriminative network
- AWGN: additive white Gaussian noise
- MIP: maximum intensity projection
- BN: batch normalization
Footnotes
CONFLICT OF INTEREST
The authors declare no potential conflict of interest.
REFERENCES
- [1]. Izatt JA, Kulkarni MD, Yazdanfar S, Barton JK, Welch AJ. Optics Letters. 1997, 22, 1439–1441.
- [2]. Zhao Y, Chen Z, Saxer C, Xiang S, de Boer JF, Nelson JS. Optics Letters. 2000, 25, 114–116.
- [3]. Standish BA, Lee KK, Jin X, Mariampillai A, Munce NR, Wood MF, Wilson BC, Vitkin IA, Yang VX. Cancer Research. 2008, 68, 9987–9995.
- [4]. You J, Pan C, Park K, Li A, Du C. Journal of Biophotonics. 2019, e201960091.
- [5]. Wang Y, Bower BA, Izatt JA, Tan O, Huang D. Journal of Biomedical Optics. 2008, 13, 064003.
- [6]. Park BH, Pierce MC, Cense B, Yun S-H, Mujat M, Tearney GJ, Bouma BE, de Boer JF. Optics Express. 2005, 13, 3931–3944.
- [7]. Dabov K, Foi A, Egiazarian K. Video denoising by sparse 3D transform-domain collaborative filtering. IEEE, 2007, pp. 145–149.
- [8]. Aharon M, Elad M, Bruckstein A. IEEE Transactions on Signal Processing. 2006, 54, 4311–4322.
- [9]. Dong C, Loy CC, He K, Tang X. Learning a deep convolutional network for image super-resolution. Springer, 2014, pp. 184–199.
- [10]. Zhang K, Zuo W, Chen Y, Meng D, Zhang L. IEEE Transactions on Image Processing. 2017, 26, 3142–3155.
- [11]. Menon SN, Reddy VV, Yeshwanth A, Anoop B, Rajan J. A novel deep learning approach for the removal of speckle noise from optical coherence tomography images using gated convolution–deconvolution structure. Springer, 2020, pp. 115–126.
- [12]. Qiu B, Huang Z, Liu X, Meng X, You Y, Liu G, Yang K, Maier A, Ren Q, Lu Y. Biomedical Optics Express. 2020, 11, 817–830.
- [13]. Devalla SK, Subramanian G, Pham TH, Wang X, Perera S, Tun TA, Aung T, Schmetterer L, Thiéry AH, Girard MJ. Scientific Reports. 2019, 9, 1–13.
- [14]. Chen C, Xiong Z, Tian X, Wu F. Deep boosting for image denoising. 2018, pp. 3–18.
- [15]. You J, Zhang Q, Park K, Du C, Pan Y. Optics Letters. 2015, 40, 4293–4296.
- [16]. Chen J, Chen J, Chao H, Yang M. Image blind denoising with generative adversarial network based noise modeling. 2018, pp. 3155–3164.
- [17]. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y. Generative adversarial nets. 2014, pp. 2672–2680.
- [18]. Radford A, Metz L, Chintala S. arXiv preprint arXiv:1511.06434, 2015.
- [19]. Arjovsky M, Chintala S, Bottou L. arXiv preprint arXiv:1701.07875, 2017.
- [20]. Yu F, Koltun V. arXiv preprint arXiv:1511.07122, 2015.
- [21]. Iizuka S, Simo-Serra E, Ishikawa H. ACM Transactions on Graphics (ToG). 2017, 36, 107.
- [22]. Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, Devin M, Ghemawat S, Irving G, Isard M. TensorFlow: a system for large-scale machine learning. 2016, pp. 265–283.
- [23]. Zhang K, Zuo W, Gu S, Zhang L. Learning deep CNN denoiser prior for image restoration. 2017, pp. 3929–3938.
- [24]. Li A, You J, Du C, Pan Y. Biomedical Optics Express. 2017, 8, 5604–5616.
- [25]. Ren H, Du C, Yuan Z, Park K, Volkow ND, Pan Y. Molecular Psychiatry. 2012, 17, 1017–1025.
- [26]. Pan Y, You J, Volkow ND, Park K, Du C. NeuroImage. 2014, 103, 492–501.
- [27]. Choi WJ, Li Y, Wang RK. IEEE Transactions on Medical Imaging. 2019, 38, 1427–1437.
- [28]. Obrenovitch TP, Chen S, Farkas E. NeuroImage. 2009, 45, 68–74.
- [29]. Chen W, Park K, Pan Y, Koretsky AP, Du C. NeuroImage. 2020, 210, 116554.
- [30]. Sudre CH, Li W, Vercauteren T, Ourselin S, Cardoso MJ. Generalised Dice overlap as a deep learning loss function for highly unbalanced segmentations. Springer, 2017, pp. 240–248.
