Significance
Near-infrared (NIR) fluorescence imaging using biocompatible dyes such as ICG in the NIR-I window (800–1,000 nm) has facilitated tumor imaging and imaging-guided surgery, but suffers from shallow imaging depth, low contrast, and poor clarity caused by light scattering and autofluorescence. On the other hand, fluorescence imaging in the NIR-IIb window (1,500–1,700 nm) affords much improved spatial resolution and deeper tissue penetration, but currently relies on fluorophores not approved for human use. This work presents a deep-learning–based approach that transforms a blurred NIR-I image into a much higher-clarity image resembling one acquired in NIR-IIb, closely matching the ground truth. The deep-learning–enabled high-resolution NIR imaging could facilitate basic biomedical research and empower diagnostics and imaging-guided surgery in the clinic.
Keywords: deep learning, near-infrared imaging, second near-infrared window
Abstract
Detecting fluorescence in the second near-infrared window (NIR-II) up to ∼1,700 nm has emerged as a novel in vivo imaging modality with high spatial and temporal resolution through millimeter tissue depths. Imaging in the NIR-IIb window (1,500–1,700 nm) is the most effective one-photon approach to suppressing light scattering and maximizing imaging penetration depth, but relies on nanoparticle probes such as PbS/CdS containing toxic elements. On the other hand, imaging in the NIR-I (700–1,000 nm) or NIR-IIa (1,000–1,300 nm) window can be done using biocompatible small-molecule fluorescent probes, including US Food and Drug Administration-approved dyes such as indocyanine green (ICG), but has a caveat of suboptimal imaging quality due to light scattering. It is highly desired to achieve the performance of NIR-IIb imaging using molecular probes approved for human use. Here, we trained artificial neural networks to transform a fluorescence image in the shorter-wavelength NIR window of 900–1,300 nm (NIR-I/IIa) into an image resembling an NIR-IIb image. With deep-learning translation, in vivo lymph node imaging with ICG achieved an unprecedented signal-to-background ratio of >100. Using preclinical fluorophores such as IRDye-800, translation of ∼900-nm NIR molecular imaging of PD-L1 or EGFR greatly enhanced the tumor-to-normal tissue ratio up to ∼20 from ∼5 and improved tumor margin localization. Further, deep learning greatly improved in vivo noninvasive NIR-II light-sheet microscopy (LSM) in resolution and signal/background. NIR imaging equipped with deep learning could facilitate basic biomedical research and empower diagnostics and imaging-guided surgery in the clinic.
Fluorescence detection in the second near-infrared window (1,000–1,700 nm, NIR-II window) has been explored for noninvasive in vivo imaging benefiting from deeper tissue penetration, lower background, and higher spatial resolution afforded by reduced light scattering and diminished autofluorescence (1–8). Imaging at the long-wavelength end of the NIR-II window (1,500–1,700 nm, NIR-IIb) benefits the most, allowing single-cell resolution at subcentimeter tissue penetration depth (9). Several classes of materials have been explored as NIR-IIb fluorescent probes for in vivo imaging, including carbon nanotubes (10, 11), inorganic semiconducting quantum dots (12–15), rare-earth–based down-conversion nanoparticles (16–18), and nanoparticles of organic molecules (19, 20). Some of the probes contain toxic elements such as Pb, Cd, and As that hinder clinical translation (21, 22). In contrast, organic small-molecule fluorophores exhibit favorable excretion pharmacokinetics and low in vivo toxicity (23). Several organic fluorescent probes have been approved by the US Food and Drug Administration (FDA) or are under clinical trials, such as indocyanine green (ICG), methylene blue, and bioconjugates of IRDye800CW (24). Recently, it has been demonstrated that by exploiting the off-peak emission tail of FDA-approved fluorophores in the NIR-II window, the imaging quality of ICG and related dyes could be further improved (24–26). Nevertheless, small-molecule fluorophores reported thus far mainly emit in the conventional near-infrared window (NIR-I, 700–1,000 nm) and the short-wavelength region of the NIR-II window (1,000–1,300 nm, NIR-IIa), which are not optimal for deep-tissue imaging due to scattering.
Here we explore training artificial neural networks on NIR images in the 900–1,300-nm window, using a large set of NIR-IIb images accumulated in our laboratory over the years as the target results. This allows a smeared NIR image to be translated into the sharp, high-signal/background images otherwise achievable only in NIR-IIb. We employ the generative adversarial network (GAN), a class of deep-learning algorithms that aims to generate new data with the same distribution as the training examples (27). Variants of GANs have been applied for image-to-image translation, in which an image belonging to one domain is transferred to an image in another domain. For instance, the pix2pix algorithm learns a mapping from one domain to another by training on pairs of images (28). However, collecting paired data can be time-consuming or impossible in many cases, and unsupervised algorithms, such as CycleGAN (29), have been proposed to circumvent this problem. In the CycleGAN model, a pair of mappings F: A → B and G: B → A are learned, and a cycle consistency loss is introduced to make sure G(F(A)) ≈ A and vice versa. These image-to-image translation algorithms have been explored for biomedical image synthesis tasks, such as cross-modality medical image transfer (30–32), denoising low-dose medical images (33–35), and reconstructing superresolved microscopic images from low-resolution inputs (36, 37). Compared to traditional image-processing techniques, the data-driven deep-learning approaches typically use less prior knowledge of the image formation processes and fewer hand-engineered features. Once the neural networks are trained, they can readily be used to generate images without further manual parameter search.
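As a minimal sketch of the cycle-consistency idea (assuming a PyTorch implementation; generator_a2b and generator_b2a are hypothetical stand-ins for the two learned mappings, not the networks used in this work), the reconstruction penalty can be written as an L1 distance between an image and its round-trip translation:

```python
import torch.nn.functional as F

def cycle_consistency_loss(real_a, generator_a2b, generator_b2a):
    """L1 distance between an image and its reconstruction after a full A -> B -> A cycle."""
    fake_b = generator_a2b(real_a)           # translate to the other domain
    reconstructed_a = generator_b2a(fake_b)  # translate back to the original domain
    return F.l1_loss(reconstructed_a, real_a)
```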
To enable artificial neural networks to transform a fluorescence image in the NIR-IIa (1,000–1,300 nm) window into one in the NIR-IIb (1,500–1,700 nm) window, we trained a CycleGAN model using two large sets of in vivo wide-field fluorescence images of mice taken in the NIR-IIa and NIR-IIb windows, respectively. After training, the generator network could transform a blurred NIR-IIa image into one resembling an NIR-IIb image without obvious artifacts. The neural networks generalized well to previously unseen data, allowing rapid image processing without further parameter optimization after training. Furthermore, the method trained on NIR-IIa images could be extended to improve NIR-I ∼900-nm fluorescence imaging, a modality already used in clinical trials in humans, without needing a large set of NIR-I images for training. Lymph node imaging with the clinical gold-standard NIR dye ICG achieved an unprecedented signal-to-background ratio (SBR) of >100 after image transformation. Further, we demonstrated that molecular imaging using a fluorophore–antibody complex with high tumor-targeting specificity in the NIR-I and NIR-IIa windows could be significantly improved by the neural networks. Upon image transformation, a tumor-to-normal tissue signal ratio of ∼20 could be realized for epidermal growth factor receptor (EGFR)-overexpressing tumors using an IRDye 800CW–Cetuximab conjugate in the NIR-I and NIR-IIa windows. The reduced background also allowed tumor margins to be identified more precisely, potentially facilitating imaging-guided tumor resection. Last, we showed that GAN-based image-to-image translation algorithms could enhance NIR-IIa light-sheet microscopy (LSM) using the supervised pix2pix model, owing to the availability of paired NIR-IIa and NIR-IIb LSM images (9). The generated LSM images exhibited vasculature SBR and sizes similar to those of the ground-truth NIR-IIb LSM images at various depths, increasing the depth limit of one-photon LSM in the NIR-IIa window from <2 mm to ∼2.5 mm.
Results
Training a CycleGAN Model for NIR Image Processing.
Using thousands of in vivo fluorescence images of mice, we trained a CycleGAN model (29) to transfer a wide-field NIR-IIa (1,000–1,300 nm) image to an NIR-IIb (1,500–1,700 nm) image (Fig. 1A). We employed 1,024 in vivo NIR-IIa and 1,800 NIR-IIb fluorescence images of mice recorded in our laboratory in the past 3 y (see examples in Fig. 1A and SI Appendix, Fig. S1). The images were randomly split into training, validation, and test sets with a ratio of 80:10:10. To further increase the diversity of the training set, random horizontal flips were applied for data augmentation. We analyzed the SBR of the images by plotting cross-sectional intensity profiles of the same area in the NIR-IIa and NIR-IIb windows (Fig. 1B). A much higher SBR was observed in the NIR-IIb window due to reduced scattering.
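The data preparation described above can be sketched as follows (an illustrative Python/torchvision snippet under our own assumptions, not the authors' code; the 80:10:10 split and random horizontal flip follow the text):

```python
import random
from torchvision import transforms

def split_dataset(image_paths, seed=0, ratios=(0.8, 0.1, 0.1)):
    """Randomly split image file paths into training, validation, and test sets (80:10:10)."""
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)
    n_train = int(ratios[0] * len(paths))
    n_val = int(ratios[1] * len(paths))
    return paths[:n_train], paths[n_train:n_train + n_val], paths[n_train + n_val:]

# Random horizontal flips increase the diversity of the training set.
train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
])
```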
Fig. 1.
CycleGAN-based NIR-IIa–to–NIR-IIb image transfer. (A) Comparison of NIR-IIa and NIR-IIb images. A balb/c mouse was injected with p-FE and P3-QDs at the same time and excited by an 808-nm laser. A 1,000-nm long-pass filter and a 1,200-nm short-pass filter were used to collect the NIR-IIa image, and a 1,500-nm long-pass filter was used to collect the NIR-IIb image. (Scale bar, 5 mm.) (B) Cross-sectional intensity profiles of the same area (labeled in A) imaged in the NIR-IIa and NIR-IIb windows. (C) Training process of the CycleGAN model. An NIR-IIa image was randomly selected from the training set and processed by the generator GA to obtain a generated NIR-IIb image, which was used as input for another generator GB to reconstruct the original NIR-IIa image. A discriminator DB was trained to tell whether an NIR-IIb image was real or generated. A cycle consistency loss was defined to ensure meaningful image-to-image translation. The overall loss is a weighted sum of the adversarial loss and the cycle consistency loss.
In the CycleGAN model, a pair of generators GA and GB were applied to transform images from one domain to the other, and a pair of discriminators DA and DB were used to differentiate real images from generated ones (Fig. 1C). We used a U-Net (38) architecture for the generators (SI Appendix, Fig. S2A). It consisted of an encoding path, in which a feature map of the original image was extracted by convolutional layers, and a decoding path, in which the extracted feature map was transformed into the final output image. A PatchGAN (28, 29) structure was used for the discriminators (SI Appendix, Fig. S2C). For an input NIR-IIa image x, GA(x) generated an image, and an adversarial loss was applied to enforce that the generated image looked similar to a real NIR-IIb image. Subsequently, GB(GA(x)) reconstructed the original image. To guarantee meaningful image-to-image translation, the reconstructed image was forced to be close to the original image by minimizing the cycle consistency loss, defined as the L1 distance between the original image x and the reconstructed image GB(GA(x)). The total loss function was a weighted sum of the adversarial loss and the cycle consistency loss. The generators were trained to minimize the loss function, while the discriminators were trained to maximize it (see Materials and Methods and SI Appendix for the detailed structure and training procedure of the neural networks) (29). Once the neural networks were trained, only the generator GA was needed for NIR-IIa image processing.
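The overall objective can be sketched as below (a simplified PyTorch illustration, not the authors' exact code; it assumes the least-squares adversarial loss used in the original CycleGAN paper (29), with GA, GB, DA, and DB as the networks described above and the weight lam illustrative):

```python
import torch
import torch.nn.functional as F

def cyclegan_losses(real_a, real_b, GA, GB, DA, DB, lam=10.0):
    """Generator and discriminator objectives for one batch of unpaired NIR-IIa (a) and NIR-IIb (b) images."""
    fake_b = GA(real_a)                  # NIR-IIa -> generated NIR-IIb
    fake_a = GB(real_b)                  # NIR-IIb -> generated NIR-IIa
    rec_a, rec_b = GB(fake_b), GA(fake_a)

    # Adversarial terms: the generators try to make the discriminators score generated images as real.
    pred_fb, pred_fa = DB(fake_b), DA(fake_a)
    adv = F.mse_loss(pred_fb, torch.ones_like(pred_fb)) + F.mse_loss(pred_fa, torch.ones_like(pred_fa))

    # Cycle consistency: L1 distance between original images and their round-trip reconstructions.
    cyc = F.l1_loss(rec_a, real_a) + F.l1_loss(rec_b, real_b)
    gen_loss = adv + lam * cyc           # weighted sum minimized by the generators

    # Discriminators learn to score real images as 1 and generated images as 0.
    disc_loss = 0.0
    for D, real, fake in [(DB, real_b, fake_b), (DA, real_a, fake_a)]:
        pred_real, pred_fake = D(real), D(fake.detach())
        disc_loss = disc_loss + F.mse_loss(pred_real, torch.ones_like(pred_real)) \
                              + F.mse_loss(pred_fake, torch.zeros_like(pred_fake))
    return gen_loss, disc_loss
```

In practice, the generator and discriminator losses are minimized with alternating optimizer steps, which realizes the min-max training described above.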
Contrast-Enhanced Wide-field NIR-IIa Fluorescence Imaging with CycleGAN.
After the neural networks were trained, the generator GA was used to process NIR-IIa images (see examples in Fig. 2A and SI Appendix, Fig. S3). Note that none of these examples was seen by the neural networks during training. After image transformation, the contrast of the images was largely enhanced, while features such as blood vessels, major organs, and lymph nodes were preserved and sharpened. To confirm the validity of this neural network-based image-processing method, a balb/c mouse was injected with a nanoscale NIR-IIa probe p-FE (39) (hydrodynamic size ∼12 nm) and an NIR-IIb fluorophore P3-QD (40) (hydrodynamic size ∼26 nm) at the same time, and fluorescence images of mouse blood vessels labeled by the two probes were recorded in the NIR-IIa and NIR-IIb windows, respectively, to obtain matched images in the two domains (Fig. 2B). The trained neural network was used to process the NIR-IIa image, producing a generated image (Fig. 2B) that remarkably resembled the ground truth, i.e., in this case the NIR-IIb image (Fig. 2B). The NIR-IIb and generated images showed highly similar intensity profiles (Fig. 2D) and spatial frequency patterns (SI Appendix, Fig. S4), demonstrating that the generator could faithfully enhance the contrast of the NIR-IIa images without introducing artifacts.
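Such profile and spatial-frequency comparisons can be reproduced with a short analysis sketch (illustrative numpy code under our own assumptions; the row index and column range selecting the cross-section are user-chosen and hypothetical):

```python
import numpy as np

def cross_section_profiles(images, row, col_start, col_end):
    """Normalized cross-sectional intensity profiles along one image row (cf. Fig. 2D)."""
    profiles = {}
    for name, img in images.items():        # e.g., {"NIR-IIa": ..., "generated": ..., "NIR-IIb": ...}
        line = img[row, col_start:col_end].astype(float)
        profiles[name] = line / line.max()  # normalize by the maximum intensity on the line
    return profiles

def radial_spectrum(img):
    """Radially averaged spatial-frequency magnitude, used to compare frequency content between images."""
    f = np.abs(np.fft.fftshift(np.fft.fft2(img.astype(float))))
    cy, cx = f.shape[0] // 2, f.shape[1] // 2
    y, x = np.indices(f.shape)
    r = np.hypot(y - cy, x - cx).astype(int)
    counts = np.bincount(r.ravel())
    return np.bincount(r.ravel(), weights=f.ravel()) / np.maximum(counts, 1)
```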
Fig. 2.
Wide-field fluorescence imaging with GAN. (A) Examples of real NIR-IIa images and generated images. (Scale bar, 1 cm.) (B) In vivo fluorescence imaging of a balb/c mouse injected with p-FE and P3-QDs. The NIR-IIa image was processed by the generator GA to obtain the contrast-enhanced image. (Scale bar, 5 mm.) (C) In vivo fluorescence imaging of a balb/c mouse injected with ICG and QDs and the images generated by the U-Net generator. (Scale bar, 1 cm.) (Fig. 2C: reproduced with permission from ref. 40.) (D) Cross-sectional intensity profiles of the same vessel (labeled in B) in the NIR-IIa, NIR-IIb, and generated images. (E) Normalized fluorescence intensity of the lines shown in C. Fluorescence intensity in D and E was normalized by the maximum intensity on the line. (F) A balb/c mouse was injected with the IR783@BSA-GSH complex and imaged in the NIR-I window using a CRi Maestro in vivo imaging system with an exposure time of 100 ms at 5 min post injection (42). The trained generator GA was used to transform the NIR-I image to a high-resolution image. (Scale bar, 1 cm.)
We found that the nearest neighbors of the generated images in the training set (SI Appendix, Fig. S5) looked similar to the generated images but were not identical, indicating that the neural network did not memorize results from the training set and generalized well to previously unseen data. Further, we analyzed the output feature map after the first two convolutional layers in the encoding part of the generator and found that some channels showed interpretable patterns. For example, channel 10 extracted the background tissue of the mice, channel 30 labeled major organs including the liver, spleen, and tumor, and channel 32 showed regions outside the mice (SI Appendix, Fig. S6). These results further confirmed that the neural network learned useful information from the training data, which was then utilized for synthesizing new images.
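Intermediate feature maps of this kind can be inspected with a forward hook (a generic PyTorch sketch; generator and layer are placeholders for the trained network and a convolutional layer of interest, not names from this work):

```python
import torch

def capture_feature_map(generator, layer, image):
    """Return the output feature map of one layer for a single input image of shape (1, C, H, W)."""
    captured = {}

    def hook(module, inputs, output):
        captured["features"] = output.detach()

    handle = layer.register_forward_hook(hook)
    with torch.no_grad():
        generator(image)
    handle.remove()
    return captured["features"]  # shape (1, channels, H', W'); individual channels can be visualized
```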
We compared the U-Net generator with another commonly used network structure with residual blocks (29, 41) (denoted ResNet, SI Appendix, Fig. S2B). ResNet could also generate high-contrast images after training (SI Appendix, Fig. S7). However, it suffered from more artifacts, as indicated by spatial frequency patterns mismatched with those of the ground-truth NIR-IIb images (SI Appendix, Fig. S4). The better performance of the U-Net might be attributed to the skip connections between the encoding path and the decoding path, which allowed context information captured in the encoding path to be passed to the decoding path more easily (38). To demonstrate the importance of both the adversarial loss and the cycle consistency loss, we performed an ablation study of the full loss function in which the neural networks were trained with only the adversarial loss or only the cycle consistency loss. In both cases, the generator performed much worse than with the full loss function (SI Appendix, Fig. S8). We also optimized the number of training epochs. When trained for only a small number of epochs, the neural networks failed to generate meaningful images. When the networks were trained for ∼30–60 epochs, high-quality NIR-IIb–like images could be obtained. Further increasing the number of training iterations could introduce artifacts caused by overfitting to the training examples (SI Appendix, Fig. S9).
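A toy two-level U-Net illustrates the skip connections in question (a minimal sketch under our own assumptions, not the architecture in SI Appendix, Fig. S2A): the decoder concatenates upsampled features with the corresponding encoder features before the final convolutions.

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Minimal two-level U-Net showing a single skip connection (assumes even image height/width)."""
    def __init__(self, ch=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU())
        self.down = nn.Sequential(nn.Conv2d(ch, 2 * ch, 3, stride=2, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(2 * ch, ch, 2, stride=2)
        self.dec = nn.Sequential(nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU())
        self.out = nn.Conv2d(ch, 1, 1)

    def forward(self, x):
        e = self.enc(x)                # encoder features at full resolution
        d = self.up(self.down(e))      # downsample, then upsample back
        d = torch.cat([d, e], dim=1)   # skip connection: reuse encoder features in the decoder
        return self.out(self.dec(d))
```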
Encouraged by the successful application of neural networks for NIR-IIa–to–NIR-IIb image translation, we transformed wide-field NIR-IIa fluorescence images of lymph nodes (40) using the trained generator. The FDA-approved small-molecule dye ICG was injected into the foot pads of the mouse concurrently with the NIR-IIb nanoparticle P3-QD for comparison (40). The emission tail of ICG was utilized for imaging in the NIR-IIa window (25), and the obtained images were processed by the trained U-Net generator to obtain contrast-enhanced images (Fig. 2C and SI Appendix, Fig. S10). The lymph node-to-background ratios measured with the NIR-IIa image, the generated image, and the NIR-IIb image were 8.44, 117.0, and 159.1, respectively, for the superficial sacral lymph node (Fig. 2E) and 5.65, 39.2, and 45.0, respectively, for the deep lumbar lymph node (SI Appendix, Fig. S10). Comparison of the full width at half maximum (FWHM) of the lymph nodes at different depths also confirmed that the generated images closely resembled the real NIR-IIb images (SI Appendix, Table S1). Interestingly, the SBR of lymph node imaging in the NIR-I window could also be improved by the neural network (SI Appendix, Fig. S11), even though NIR-I images were not used for training. The improved SBR and higher resolution of the structures suggested that, owing to deep learning, lymphatic imaging using an FDA-approved fluorescent dye can attain image quality similar to that of NIR-IIb imaging using PbS-based probes.
Further, we applied the neural networks to process NIR-I images recorded by commercial imaging systems. A balb/c mouse was injected with IR783@BSA-GSH and imaged with the CRi Maestro in vivo fluorescence-imaging system in the NIR-I window (42). Upon image transformation with the trained generator, the feature clarity, sharpness, and SBR of the image were largely improved (Fig. 2F). Deep learning from images captured on a home-built imaging system can thus yield neural networks that are readily applicable to other fluorescence-imaging platforms, affording a broadly useful approach to enhancing in vivo NIR imaging in general.
CycleGAN Deep Learning for Molecular Imaging of Cancer.
Next, we investigated neural networks for near-infrared molecular imaging of cancer using mouse tumor models. Traditional NIR-I imaging in the 800- to 900-nm range has entered clinical trials for tumor imaging and imaging-guided resection surgery (7, 43, 44), with the caveat of a low tumor-to-normal tissue signal ratio (T/NT) of ∼3–5 limited by light scattering and autofluorescence. We investigated squamous cell carcinoma of the head and neck, 90% of which overexpresses EGFR, a molecular target for diagnosis and anticancer therapy (45). A bioconjugate of IRDye800CW and Cetuximab, a monoclonal antibody against EGFR, has been evaluated in clinical trials as an NIR-I–imaging agent for detecting head and neck tumors during surgical procedures (46), but showed shallow imaging depth, low spatial resolution, and low signal/background ratios. To enhance molecular imaging in NIR-I by our deep-learning image transformation approach, we implanted SCC-1 human cancer cells subcutaneously in athymic nude mice and administered IRDye800CW-Cetuximab (Fig. 3A) intravenously. The mice were then imaged in the NIR-I (900–1,000 nm, Fig. 3B) and NIR-IIa (1,100–1,300 nm, Fig. 3C) windows at 24 h post injection. After image processing with the neural network, high T/NT values of 18.2 in the NIR-I window and 25.3 in the NIR-IIa window could be achieved (Fig. 3E), 3–5 times higher than those of the original images.
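A minimal sketch of how such a ratio can be computed from region-of-interest (ROI) masks (our own illustrative numpy code; the exact ROI definitions used in this work are not reproduced here):

```python
import numpy as np

def tumor_to_normal_ratio(image, tumor_mask, normal_mask):
    """T/NT: mean signal in a tumor ROI divided by mean signal in a normal-tissue ROI."""
    image = np.asarray(image, dtype=float)
    return image[tumor_mask].mean() / image[normal_mask].mean()
```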
Fig. 3.
Molecular imaging with CycleGAN. (A) Conjugation of IRDye800-NHS to Cetuximab. (B and C) Nude mice (n = 3) with SCC-1 tumors were injected with IR800CW-Cetuximab. The NIR-I (B, 900–1,000 nm) and NIR-IIa (C, >1,100 nm) images were taken at 24 h post injection. The trained U-Net generator was used to process the original images. (Scale bar, 1 cm.) (D) High-resolution NIR-I (900–1,000 nm) imaging of an SCC-1 tumor at 24 h after the injection of IR800CW-Cetuximab. (Scale bar, 5 mm.) (E) Tumor-to-normal tissue signal ratio of the real and generated images in the NIR-I and NIR-IIa windows. (F) Fluorescence intensity of the lines shown in D.
In vivo fluorescence imaging has been explored in the clinic for imaging-guided resection of tumors. We imaged the SCC-1 tumors at a higher magnification (Fig. 3D). The cross-sectional line profile of the generated image showed a reduced FWHM (2.95 vs. 3.52 mm) compared to that of the original image (Fig. 3F). The significantly improved image clarity and sharper resection margin could allow more precise removal of tumors while minimizing damage to surrounding normal tissue, a key requirement for imaging-guided resection surgery in the clinic.
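FWHM values like these can be estimated from a cross-sectional line profile, for example as below (a simple numpy sketch under our own assumptions, without sub-pixel interpolation; pixel_size_mm is the known pixel pitch of the imaging system):

```python
import numpy as np

def fwhm_mm(profile, pixel_size_mm):
    """Full width at half maximum of a background-subtracted line profile, in millimeters."""
    p = np.asarray(profile, dtype=float)
    p = p - p.min()                          # crude background subtraction
    above = np.where(p >= p.max() / 2.0)[0]  # indices where the profile exceeds half maximum
    return (above[-1] - above[0]) * pixel_size_mm
```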
We further explored enhancing NIR-I and NIR-IIa molecular imaging in a mouse model of immunotherapy based on anti–PD-L1. Immunotherapy based on checkpoint blockade of the programmed cell death protein-1 (PD-1) or its ligand PD-L1 has shown great promise for treating cancer in the clinic (47, 48). In vivo molecular imaging could assess the expression level of PD-L1 in tumors in real time and help evaluate the efficacy of immunotherapy. We conjugated IRDye800CW to Atezolizumab and administered it intravenously (i.v.) to balb/c mice bearing CT26 tumors overexpressing PD-L1. Similar to the SCC-1 model, translation of NIR-I and NIR-IIa images enhanced T/NT up to ∼20 from ∼5 (SI Appendix, Fig. S12). The same neural network was compatible with different targeting ligands and tumor types, establishing the generality of this method.
To compare deep-learning–enhanced NIR-IIa imaging with real NIR-IIb imaging, we conjugated a small-molecule NIR-IIa dye, IR12-NHS (49), and an NIR-IIb PbS/CdS core-shell quantum dot (40) to Atezolizumab and injected them i.v. into a CT26 tumor-bearing balb/c mouse. At 24 h post injection, we recorded wide-field images of the tumor in the NIR-IIa window by detecting IR12–anti-PD-L1 (1,000–1,200 nm, SI Appendix, Fig. S13) and in the NIR-IIb window by detecting PbS/CdS–anti-PD-L1 (1,500–1,700 nm, SI Appendix, Fig. S13). Upon transforming the NIR-IIa image, the measured T/NT of the generated image was much higher than that of the original NIR-IIa image and approached that of the ground-truth NIR-IIb image (T/NT ∼26.2 in the generated image vs. 8.46 in NIR-IIa and 30.8 in NIR-IIb). Furthermore, the generated image resembled the ground-truth NIR-IIb image, showing similar cross-sectional intensity profiles in the tumor area (SI Appendix, Fig. S13B). This result showed that with deep learning, molecular imaging in NIR-IIa (1,000–1,300 nm) afforded results similar to those in NIR-IIb.
Deep Learning for NIR-II LSM.
LSM in the NIR-II window is a recent development allowing in vivo volumetric optical imaging of mouse tissues with high spatial and temporal resolution in a noninvasive manner (9). Both the excitation and emission wavelengths have been shifted to the 1,300-nm range to suppress light scattering and increase imaging depth/volume. Here, we explored deep learning for NIR LSM with organic probes. For comparison, two nanoparticle-based NIR-II probes, p-FE (785-nm excitation, 1,000- to 1,300-nm emission in NIR-IIa) (39, 50) and PbS/CdS CSQD (785-nm excitation, 1,500- to 1,700-nm emission in NIR-IIb) (14), were injected into the same mouse. The mouse was euthanized at 30 min after administration, and ex vivo imaging of the brain vasculature containing the circulating fluorescent probes was performed with our home-built light-sheet microscope described previously (see ref. 9 and Materials and Methods for details). We first used the pretrained U-Net generator to transform the NIR-IIa LSM images to NIR-IIb ones. However, compared to the ground-truth NIR-IIb LSM images, the generated results showed broadening of structures and artifacts such as vertical stripes (SI Appendix, Fig. S14). This could be attributed to the mismatched distributions of the training data (whole-body imaging) and the test data (LSM imaging). Compared to whole-body low-magnification images, LSM images had much smaller fields of view and feature sizes, requiring alternative training methods for faithful transformation.
We retrained the neural network on matched NIR-IIa images and corresponding NIR-IIb images recorded in the same tissue volumes (9) using the supervised image-to-image translation algorithm pix2pix (28). The pix2pix model utilized a generator GA to transfer an NIR-IIa LSM image to an NIR-IIb one, and a discriminator DB to differentiate real and generated NIR-IIb LSM images. Instead of using randomly selected images, pairs of NIR-IIa and NIR-IIb LSM images at the same position were used as inputs when training the neural network. Unlike in unconditional GANs, the discriminator DB also took the generator's input image as part of its input (SI Appendix and Fig. 4A). We used a U-Net (SI Appendix, Fig. S2A) as the generator and a PatchGAN (SI Appendix, Fig. S2C) as the discriminator (28). The training set consisted of 1,000 NIR-IIa LSM images and 1,000 NIR-IIb LSM images (9) (Fig. 4B), and the loss function was a weighted sum of the adversarial loss and the L1 distance between the generated image and the real NIR-IIb image (see Materials and Methods for the detailed architecture and training procedures).
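The generator objective for this paired setting can be sketched as follows (an illustrative PyTorch snippet under our own assumptions, not the authors' code; GA and DB denote the networks described above, the channel-wise concatenation reflects the conditional discriminator input, and the weight lam is illustrative):

```python
import torch
import torch.nn.functional as F

def pix2pix_generator_loss(real_iia, real_iib, GA, DB, lam=100.0):
    """Adversarial term plus L1 distance to the paired ground-truth NIR-IIb image."""
    fake_iib = GA(real_iia)
    # Conditional discriminator: it sees the NIR-IIa input concatenated with the generated NIR-IIb image.
    pred_fake = DB(torch.cat([real_iia, fake_iib], dim=1))
    adv = F.binary_cross_entropy_with_logits(pred_fake, torch.ones_like(pred_fake))
    return adv + lam * F.l1_loss(fake_iib, real_iib)
```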
Fig. 4.
Pix2pix-based NIR-IIa–to–NIR-IIb LSM image processing. (A) pix2pix model used for training. A pair of NIR-IIa and NIR-IIb LSM images were selected from the training set. The NIR-IIa image was processed by the generator GA to obtain a generated NIR-IIb image. The real or generated NIR-IIb image was concatenated with the real NIR-IIa image and used as an input of the discriminator DB. The overall loss is a weighted sum of the adversarial loss and the L1 distance between the real and generated NIR-IIb images. (B) LSM images at different depths from the training set (9). (Scale bar, 200 µm.)
After training, NIR-IIa LSM images at different depths were used as inputs for the U-Net generator. Tissue scattering made it difficult to identify cerebral vasculature at depths of >2.0 mm in the original images recorded in the NIR-IIa window. In contrast, in the deep-learning–generated images, the background signal was significantly reduced, allowing three-dimensional (3D) volumetric imaging quality similar to that of the ground-truth NIR-IIb LSM imaging (Fig. 5A and Movie S1). Further, we analyzed the FWHM of the smallest vessels (Fig. 5B and SI Appendix, Fig. S15) and the SBR (Fig. 5C) at different depths. The original NIR-IIa LSM data suffered from broadened vessels and reduced SBR at greater tissue penetration depths, while the generated LSM images achieved results similar to the real NIR-IIb LSM data (Fig. 5 B and C). Deep learning afforded a powerful approach to enhancing 3D volumetric imaging using LSM.
Fig. 5.
LSM image processing by pix2pix GAN. (A) LSM images at different depths. (Scale bar, 200 µm.) (B and C) Comparison of FWHM (B) and SBR (C) at various depths. The error bars in B and C represent the SD of five data points at each depth.
Discussion
Fluorescence imaging is a useful modality for probing biological systems with high spatial and temporal resolution. Several fluorescent molecules and their conjugates have been approved by the FDA for human use or have progressed into clinical trials, but none of these molecules emits in the NIR-IIb window, where light scattering is minimized and imaging depth and resolution are maximized. Although the image quality can be significantly improved by utilizing the off-peak emission tail in the NIR-IIa window (13, 26, 49), it is far from optimal owing to residual tissue light scattering in the <1,300-nm spectral range. Although other image-processing methods, such as deconvolution (51), have been explored to improve the resolution of fluorescence imaging, these methods rely on a priori information about the specific imaging system and are difficult to generalize to new optical systems.
This work developed a deep-learning–based method to transform an NIR-I or NIR-IIa image into an NIR-IIb–like one possessing the highest signal/background ratios and spatial resolution among all one-photon NIR-imaging approaches. None of the data shown in this work was seen by the neural networks during training, suggesting that the methods generalize well to new imaging data. We also showed that the method is compatible with a wide range of NIR fluorophores and targeting ligands, regardless of their detailed optical and biological properties, establishing the versatility of this framework.
The ability of neural networks to generate high-resolution images from scattering-blurred NIR images could open new opportunities in clinical translation. Instead of trying to improve the biocompatibility and alleviate the toxicity of nanoparticle-based NIR-IIb fluorophores, one could apply FDA-approved molecules directly and transform the low-resolution images into high-resolution ones, which would be an excellent application of artificial intelligence. We demonstrated here that an unprecedented lymph node-to-background ratio of >100 could be achieved for ICG-based imaging after transformation by the neural network, which could enhance sentinel lymph node mapping in the clinic (52). Furthermore, the tumor-to-normal tissue signal ratio could be significantly improved in a mouse model of colon cancer and a mouse model of head and neck cancer after administration of IRDye800CW–antibody conjugates, and clearer tumor margins were identified. These results could facilitate fluorescence imaging for diagnosis of tumors or imaging-guided resection of tumors. In the case of diagnosis, one will need to investigate many positive and negative images, analyze tumor/normal tissue signal ratios, and compare the results with and without transformation. One could then adjust cutoff criteria for assessing positive vs. negative cases with the goal of improving diagnostic accuracy based on NIR imaging. We expect that a similar scheme can be adapted to train neural networks to transform images of larger animals or humans into sharp NIR-IIb–like images. Alternatively, parameters of the neural networks trained on imaging data of small animals can be used as a starting point, and a transfer learning approach (53) can be used to fine-tune the model. Ultimately, one may be able to apply the deep-learning–based NIR fluorescence image-processing methods for human use in the clinic.
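A minimal sketch of such transfer learning (our own illustrative PyTorch code, assuming paired fine-tuning data from the new imaging domain; in a fully unpaired setting the CycleGAN objective sketched earlier would be reused instead):

```python
import torch
import torch.nn.functional as F

def fine_tune_generator(generator, pretrained_weights_path, dataloader, epochs=10, lr=1e-5):
    """Initialize from weights trained on small-animal images, then fine-tune on new-domain image pairs."""
    generator.load_state_dict(torch.load(pretrained_weights_path))
    optimizer = torch.optim.Adam(generator.parameters(), lr=lr)  # small learning rate for fine-tuning
    for _ in range(epochs):
        for nir_input, nir_iib_target in dataloader:
            loss = F.l1_loss(generator(nir_input), nir_iib_target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return generator
```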
In addition to applications in clinical diagnostics and imaging-guided surgery, neural network-enabled NIR fluorescence imaging could also provide a powerful tool for biomedical research. For instance, the NIR-II LSM shown in this work allowed in vivo and ex vivo 3D deep-tissue imaging at cellular resolution with small-molecule fluorescent probes. We showed that LSM in the NIR-IIa window could afford penetration depth and SBR similar to LSM in the NIR-IIb window, suggesting that LSM could be performed in a wider wavelength region without compromising performance. With the expanded optical window, more candidate fluorophores could be utilized, enabling more biomarkers to be probed at the same time. Further, the smaller size of these small-molecule probes compared to nanomaterials allows easier migration in biological tissues, leading to more precise localization of the targeted structures. With the help of deep neural networks, the cost-efficient and less invasive LSM in the NIR-IIa window could become a complementary method to other in vivo optical imaging methods, such as two-photon microscopy (54).
Materials and Methods
The materials and methods used in this study are described in detail in SI Appendix, Materials and Methods. Information includes descriptions of NIR-II fluorescent probes used in this study, mouse handling methods, wide-field NIR-II fluorescence imaging, CycleGAN structure and training, bioconjugation methods, NIR-II LSM, and pix2pix structure and training. All animal experiments were approved by the Stanford Institutional Animal Care and Use Committee (IACUC).
Supplementary Material
Acknowledgments
This study was supported by the NIH (DP1-NS-105737). We thank Shoujun Zhu for the CRi Maestro imaging data.
Footnotes
The authors declare no competing interest.
This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2021446118/-/DCSupplemental.
Data Availability.
Datasets and code for training and testing the models are available at https://dailab.stanford.edu under “Software for deep learning for in vivo near-infrared imaging Download: Software.”
References
1. Welsher K., et al., A route to brightly fluorescent carbon nanotubes for near-infrared imaging in mice. Nat. Nanotechnol. 4, 773–780 (2009).
2. Hong G., et al., In vivo fluorescence imaging with Ag2S quantum dots in the second near-infrared region. Angew. Chem. Int. Ed. Engl. 51, 9818–9821 (2012).
3. Hong G., et al., Ultrafast fluorescence imaging in vivo with conjugated polymer fluorophores in the second near-infrared window. Nat. Commun. 5, 4206 (2014).
4. Antaris A. L., et al., A small-molecule dye for NIR-II imaging. Nat. Mater. 15, 235–242 (2016).
5. Cosco E. D., et al., Flavylium polymethine fluorophores for near- and shortwave infrared imaging. Angew. Chem. Int. Ed. Engl. 56, 13126–13129 (2017).
6. Yang Y., et al., Small-molecule lanthanide complexes probe for second near-infrared window bioimaging. Anal. Chem. 90, 7946–7952 (2018).
7. Hong G., Antaris A. L., Dai H., Near-infrared fluorophores for biomedical imaging. Nat. Biomed. Eng. 1, 10 (2017).
8. Ding F., Fan Y., Sun Y., Zhang F., Beyond 1000 nm emission wavelength: Recent advances in organic and inorganic emitters for deep-tissue molecular imaging. Adv. Healthc. Mater. 8, e1900260 (2019).
9. Wang F., et al., Light-sheet microscopy in the near-infrared II window. Nat. Methods 16, 545–552 (2019).
10. Diao S., et al., Biological imaging without autofluorescence in the second near-infrared region. Nano Res. 8, 3027–3034 (2015).
11. Diao S., et al., Fluorescence imaging in vivo at wavelengths beyond 1500 nm. Angew. Chem. Int. Ed. Engl. 54, 14758–14762 (2015).
12. Franke D., et al., Continuous injection synthesis of indium arsenide quantum dots emissive in the short-wavelength infrared. Nat. Commun. 7, 12749 (2016).
13. Bruns O. T., et al., Next-generation in vivo optical imaging with short-wave infrared quantum dots. Nat. Biomed. Eng. 1, 56 (2017).
14. Zhang M., et al., Bright quantum dots emitting at ∼1,600 nm in the NIR-IIb window for deep tissue fluorescence imaging. Proc. Natl. Acad. Sci. U.S.A. 115, 6590–6595 (2018).
15. Ma Z., et al., Near-infrared IIb fluorescence imaging of vascular regeneration with dynamic tissue perfusion measurement and high spatial resolution. Adv. Funct. Mater. 28, 1803417 (2018).
16. Zhong Y., et al., Boosting the down-shifting luminescence of rare-earth nanocrystals for biological imaging beyond 1500 nm. Nat. Commun. 8, 737 (2017).
17. Fan Y., et al., Lifetime-engineered NIR-II nanoparticles unlock multiplexed in vivo imaging. Nat. Nanotechnol. 13, 941–946 (2018).
18. Zhong Y., et al., In vivo molecular imaging for immunotherapy using ultra-bright near-infrared-IIb rare-earth nanoparticles. Nat. Biotechnol. 37, 1322–1331 (2019).
19. Sun C., et al., J-aggregates of cyanine dye for NIR-II in vivo dynamic vascular imaging beyond 1500 nm. J. Am. Chem. Soc. 141, 19221–19225 (2019).
20. Li Y., et al., ACQ-to-AIE transformation: Tuning molecular packing by regioisomerization for two-photon NIR bioimaging. Angew. Chem. Int. Ed. Engl. 59, 12822–12826 (2020).
21. Liu Z., et al., In vivo biodistribution and highly efficient tumour targeting of carbon nanotubes in mice. Nat. Nanotechnol. 2, 47–52 (2007).
22. Liu Z., et al., Circulation and long-term fate of functionalized, biocompatible single-walled carbon nanotubes in mice probed by Raman spectroscopy. Proc. Natl. Acad. Sci. U.S.A. 105, 1410–1415 (2008).
23. Vahrmeijer A. L., Hutteman M., van der Vorst J. R., van de Velde C. J. H., Frangioni J. V., Image-guided cancer surgery using near-infrared fluorescence. Nat. Rev. Clin. Oncol. 10, 507–518 (2013).
24. Zhu S., Tian R., Antaris A. L., Chen X., Dai H., Near-infrared-II molecular dyes for cancer imaging and surgery. Adv. Mater. 31, e1900321 (2019).
25. Carr J. A., et al., Shortwave infrared fluorescence imaging with the clinically approved near-infrared dye indocyanine green. Proc. Natl. Acad. Sci. U.S.A. 115, 4465–4470 (2018).
26. Hu Z., et al., First-in-human liver-tumour surgery guided by multispectral fluorescence imaging in the visible and near-infrared-I/II windows. Nat. Biomed. Eng. 4, 259–271 (2020).
27. Goodfellow I. J., et al., Generative adversarial networks. arXiv:1406.2661 (10 June 2014).
28. Isola P., Zhu J., Zhou T., Efros A. A., Image-to-image translation with conditional adversarial networks. arXiv:1611.07004 (21 November 2016).
29. Zhu J., Park T., Isola P., Efros A. A., Unpaired image-to-image translation using cycle-consistent adversarial networks. arXiv:1703.10593 (30 March 2017).
30. Nie D., et al., Medical image synthesis with deep convolutional adversarial networks. IEEE Trans. Biomed. Eng. 65, 2720–2730 (2018).
31. Choi H., Lee D. S.; Alzheimer's Disease Neuroimaging Initiative, Generation of structural MR images from amyloid PET: Application to MR-less quantification. J. Nucl. Med. 59, 1111–1117 (2018).
32. Jin C. B., et al., Deep CT to MR synthesis using paired and unpaired data. Sensors (Basel) 19, 2361 (2019).
33. Wang Y., et al., 3D conditional generative adversarial networks for high-quality PET image estimation at low dose. Neuroimage 174, 550–562 (2018).
34. Chen H., et al., Low-dose CT denoising with convolutional neural networks. arXiv:1610.00321 (2 October 2016).
35. Ran M., et al., Denoising of 3D magnetic resonance images using a residual encoder-decoder Wasserstein generative adversarial network. Med. Image Anal. 55, 165–180 (2019).
36. Wang H., et al., Deep learning enables cross-modality super-resolution in fluorescence microscopy. Nat. Methods 16, 103–110 (2019).
37. Jin L., et al., Deep learning enables structured illumination microscopy with low light levels and enhanced speed. Nat. Commun. 11, 1934 (2020).
38. Ronneberger O., Fischer P., Brox T., U-Net: Convolutional networks for biomedical image segmentation. arXiv:1505.04597 (18 May 2015).
39. Wan H., et al., A bright organic NIR-II nanofluorophore for three-dimensional imaging into biological tissues. Nat. Commun. 9, 1171 (2018).
40. Ma Z., et al., Advancing nanomedicine with cross-link functionalized nanoparticles for rapid excretion. Angew. Chem. Int. Ed., 10.1002/anie.202008083 (2020).
41. He K., Zhang X., Ren S., Sun J., Deep residual learning for image recognition. arXiv:1512.03385 (10 December 2015).
42. Tian R., et al., Albumin-chaperoned cyanine dye yields superbright NIR-II fluorophore with enhanced pharmacokinetics. Sci. Adv. 5, eaaw0672 (2019).
43. van der Vorst J. R., et al., Near-infrared fluorescence-guided resection of colorectal liver metastases. Cancer 119, 3411–3418 (2013).
44. Warram J. M., et al., Fluorescence-guided resection of experimental malignant glioma using cetuximab-IRDye 800CW. Br. J. Neurosurg. 29, 850–858 (2015).
45. Zimmermann M., Zouhair A., Azria D., Ozsahin M., The epidermal growth factor receptor (EGFR) in head and neck cancer: Its role and treatment implications. Radiat. Oncol. 1, 11 (2006).
46. Gao R. W., et al., Safety of panitumumab-IRDye800CW and cetuximab-IRDye800CW for fluorescence-guided surgical navigation in head and neck cancers. Theranostics 8, 2488–2495 (2018).
47. Okazaki T., Honjo T., PD-1 and PD-1 ligands: From discovery to clinical application. Int. Immunol. 19, 813–824 (2007).
48. Chen L., Han X., Anti-PD-1/PD-L1 therapy of human cancer: Past, present, and future. J. Clin. Invest. 125, 3384–3391 (2015).
49. Zhu S., et al., Repurposing cyanine NIR-I dyes accelerates clinical translation of near-infrared-II (NIR-II) bioimaging. Adv. Mater. 30, e1802546 (2018).
50. Ma Z., et al., A theranostic agent for cancer therapy and imaging in the second near-infrared window. Nano Res. 12, 273–279 (2019).
51. Richardson W., Bayesian-based iterative method of image restoration. J. Opt. Soc. Am. 62, 55–59 (1972).
52. Lin H., Ding Z., Kota V. G., Zhang X., Zhou J., Sentinel lymph node mapping in endometrial cancer: A systematic review and meta-analysis. Oncotarget 8, 46601–46610 (2017).
53. Tan C., et al., A survey on deep transfer learning. arXiv:1808.01974 (6 August 2018).
54. Helmchen F., Denk W., Deep tissue two-photon microscopy. Nat. Methods 2, 932–940 (2005).