Abstract
Multiplex immunofluorescence (MxIF) is an emerging technique that allows multiple cellular and histological markers to be stained simultaneously on a single tissue section. However, with multiple rounds of staining and bleaching, it is inevitable that the scarce tissue is physically depleted. Thus, a digital way of synthesizing such missing tissue would be appealing, since it would increase the usable area for downstream single-cell analysis. In this work, we investigate the feasibility of employing generative adversarial network (GAN) approaches to synthesize missing tissue using 11 MxIF structural molecular markers (i.e., epithelial and stromal). Briefly, we integrate a multi-channel high-resolution image synthesis approach to synthesize the missing tissue from the remaining markers. The performance of different methods is quantitatively evaluated via the downstream cell membrane segmentation task. Our contribution is that we, for the first time, assess the feasibility of synthesizing missing tissue in MxIF via quantitative segmentation. The proposed synthesis method has reproducibility comparable to the baseline method when reconstructing the missing tissue region only, but improves whole-tissue synthesis by 40%, which is crucial for practical application. We conclude that GANs are a promising direction for advancing MxIF imaging with deep image synthesis.
Keywords: MxIF, inpainting, reproducibility, multi-channel
1. INTRODUCTION
Crohn’s disease (CD) is a challenging inflammatory bowel disease (IBD) characterized by chronic, relapsing, and remitting bowel inflammation [1]. The prevalence of IBD is increasing, with an estimated 3.1 million, or 1.3%, of U.S. adults affected [2]. The Gut Cell Atlas (GCA), an initiative funded by The Leona M. and Harry B. Helmsley Charitable Trust, seeks to create a reference platform for understanding the human gut, focused on comparing Crohn’s disease patients to healthy controls (https://www.gutcellatlas.helmsleytrust.org/). The GCA project provides a unique opportunity to map cell number, distribution, and protein expression profiles as a function of anatomical location and physiological perturbations of IBD. As part of data collection [3], our site acquires formalin-fixed paraffin-embedded tissues from the terminal ileum (TI) and ascending colon (AC), followed by multiplexed immunofluorescence (MxIF) staining and imaging to understand cell composition, functional state, and cell-to-cell fluctuations [4].
MxIF is an emerging technique that allows multiple cellular and histological markers to be assessed on a single tissue section through repeated rounds of staining, imaging, stripping, and re-staining [4], [5]. The current MxIF acquisition pipeline we use includes a total of 29 markers distributed over 19 rounds [4]. The limited size of endoscopic biopsies and the tissue degradation caused by repeatedly cutting sections are an impetus for MxIF; however, tissue is still lost across staining rounds. Figure 1 shows patches (8192×8192 pixels) from three samples scanned at 20× magnification across different rounds of DAPI staining. The white regions of the final retention tissue masks indicate tissue that survives from the first round of staining until the last round. The red portions indicate areas of tissue lost along the staining rounds. Overall, this study observed a missing-tissue ratio of 28.87 ± 14.3% (mean ± std) across nine biopsies. Some physical loss of tissue is inevitable, which reduces the overall tissue projection area available for downstream analysis. Thus, investigating a digital reconstruction approach to reproduce the missing tissue is appealing, since it would increase the usable projection area for downstream single-cell analysis. Moreover, synthetic data generation is a crucial adjunct strategy to complement staining optimization given the limited sample size.
Figure 1.
A critical issue of the MxIF staining technique is tissue loss. Such loss is typically inevitable during iterative staining and de-staining of the same tissue. Here we present three different rounds of DAPI stains from three biopsy samples. The colored dots are zoomed in to show how tissue is lost along the staining sequence. The white region of the final retention tissue mask indicates tissue that survives from the first round of staining until the last round. The red portion indicates the overall tissue lost along the staining sequence.
In medical image analysis, numerous image inpainting and synthesis techniques have been developed for improving registration, segmentation, and classification [6]–[11]. However, there have been very few image inpainting studies in the digital pathology field, and most such work has focused on H&E staining and nuclei segmentation. Hou et al. synthesized H&E nucleus segmentation masks [12]. Liu et al. proposed a nuclei inpainting mechanism to remove auxiliary nuclei in synthesized histopathology images [13]. Gong et al. inpainted nuclei regions and simulated the nuclei’s texture and intensity characteristics in authentic images to create a foreground [14]. To the best of our knowledge, no study has focused on tissue inpainting for MxIF.
2. METHODS
Our work is motivated by the example in Figure 2, where γActin is the last structural marker to be stained and has the largest amount of missing tissue. To reproduce γActin, we can directly utilize the remaining γActin tissue to provide intra-stain structural knowledge, as standard image inpainting does. Meanwhile, the remaining markers provide inter-stain morphological spatial relationships that also offer valuable structural mapping. Thus, the first challenge of this work is integrating both inter- and intra-stain spatial information to synthesize the marker with missing tissue. A typical practice in medical image inpainting [15], [16] imposes rectangular boxes on the original image and reconstructs the missing parts. However, MxIF tissue loss can occur both on the boundary and in the middle of the tissue; the inpainting pattern is therefore random, and modeling such randomness is challenging. Finally, it is hard to obtain ground truth labeling due to MxIF’s high resolution. To address these challenges, we propose a novel multi-channel image synthesis approach, called pixNto1-MT, that inpaints the missing tissue by modeling the markers’ inter- and intra-stain conditional probabilities.
Figure 2.
Four types of structural stains are shown, targeting cell, universal, stromal, and epithelial composition. The number in brackets represents the staining round; γActin is stained last. Here, we assume that γActin contains missing tissue, so using intra-stain knowledge to reconstruct γActin is attainable. Meanwhile, we assume the other markers hold complete image signals from previous rounds, so we can also potentially use inter multi-modal knowledge to synthesize the broken γActin. Thus, this paper aims to integrate both inter- and intra-stain morphological information to restore the missing tissue.
In this work, we aim to reconstruct the partial tissue of the last round’s structural stain. We assume that complete images from previous rounds of staining are available for the regions that are missing in the current staining round. A canonical image inpainting design reconstructs the image with missing values from the image itself. Since extra information from other markers is available, the synthesis problem is reformulated as conditional cross-modality inpainting that uses both inter multi-modal and intra-image information.
Generative adversarial networks (GANs) have been broadly validated in medical image inpainting and synthesis [17], [18]. MxIF markers are stained on the same tissue sample, and Figure 2 shows the pixel-to-pixel correspondence across different markers, so we formulate such cross-modality synthesis in a paired Pix2Pix GAN manner [19]. Pix2Pix is a conditional GAN framework: the generator G synthesizes fake images to fool the discriminator, while the discriminator D criticizes the generator’s output against the real image. Training is performed using the following GAN loss:
\mathcal{L}_{GAN}(G, D) = \mathbb{E}_{x,y}[\log D(x, y)] + \mathbb{E}_{x}[\log(1 - D(x, G(x)))] \quad (1)
where x is the input image and y is the real image. As noted above, no existing inpainting work focuses on missing tissue in MxIF, and it is unknown how the reconstructed image should be used. Classic computer vision inpainting pastes the reconstructed missing area back into the original image. In the MxIF scenario, however, we often cannot determine which parts are missing, or whether the loss is due to tissue depletion or the staining operation. Moreover, if the inpainted tissue’s intensity is inconsistent with the original image, it may be difficult to paste the synthesized tissue into the original tissue seamlessly. Thus, we model y as the entire marker tissue to synthesize, rather than only the missing tissue. As illustrated in Figure 3, we present a GAN-based network called pixNto1-MT, where MT stands for “missing tissue”. The framework has two main components:
Intra-stain modeling: to simulate realistic missing tissue training samples, we build a biochemical mask library. Briefly, we sample real mask patches from our available datasets, split them into 1024×1024 tiles, and keep the patches whose non-zero intensity ratios are between 20% and 80%. In total, 523 masks are applicable. These masks are randomly applied to the modality that is supposed to have missing tissue (γActin in our case) to simulate various tissue erosion inputs for G, while the original images are treated as real synthesis targets for D.
Inter multi-modal modeling: the corresponding stains that contain full images are simply fed to both G and D in a multi-channel manner.
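As a concrete illustration of the intra-stain modeling step, the following sketch builds a mask library from real retention masks and applies a randomly drawn mask to the target marker. The function names, the configurable `tile` size, and the exact sampling scheme are our own illustrative assumptions; the paper's pipeline uses 1024×1024 tiles and keeps masks with non-zero ratios between 20% and 80%.

```python
import numpy as np

def build_mask_library(tissue_masks, tile=1024, lo=0.20, hi=0.80):
    """Split real retention masks into tiles and keep those whose
    non-zero ratio lies in [lo, hi], emulating partial tissue loss."""
    library = []
    for mask in tissue_masks:
        h, w = mask.shape
        for r in range(0, h - tile + 1, tile):
            for c in range(0, w - tile + 1, tile):
                patch = mask[r:r + tile, c:c + tile]
                ratio = np.count_nonzero(patch) / patch.size
                if lo <= ratio <= hi:
                    library.append(patch > 0)
    return library

def simulate_missing_tissue(target, library, rng):
    """Zero out the target marker (gamma-actin in the paper) under a
    randomly drawn mask to create a simulated tissue-erosion input for G."""
    mask = library[rng.integers(len(library))]
    return target * mask
```

The masked image is fed to the generator G, while the untouched original image serves as the real target for the discriminator D.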
Figure 3.
Workflow of pixNto1-MT, which inpaints the missing tissue regions by aggregating information from the residual tissue and from other channels. Only four markers are shown for illustration. Specifically, the biochemical mask library provides noise applied to the target channel to generate simulated stains with missing tissue. The generator aims to synthesize the whole target image patch instead of only the missing area.
Let M denote the set of inter multi-modal markers, and let the biochemical mask library be δ, so that δ(y) denotes the stain with missing tissue. Let X represent the union X = {M, δ(y)}. The final objective function is
\mathcal{L}_{GAN}(G, D) = \mathbb{E}_{X,y}[\log D(X, y)] + \mathbb{E}_{X}[\log(1 - D(X, G(X)))] \quad (2)
The proposed method only manipulates the input of the training data. Since it is architecture-agnostic, the intra-stain and inter multi-modal framework can be incorporated into any other pipeline.
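Because the method only manipulates the model inputs, the multi-channel conditioning X = {M, δ(y)} reduces to stacking the complete inter-modal markers with the eroded target along the channel axis. A minimal sketch follows; the function name and channel-first layout are our assumptions, not the paper's code.

```python
import numpy as np

def build_generator_input(inter_markers, eroded_target):
    """Stack the complete inter multi-modal markers M with the eroded
    target delta(y) into one multi-channel input X = {M, delta(y)}."""
    channels = list(inter_markers) + [eroded_target]
    return np.stack(channels, axis=0)  # shape: (n_markers + 1, H, W)
```

In this study, the 10 complete structural markers plus the eroded γActin channel would yield an 11-channel generator input.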
3. DATA AND EXPERIMENTAL SETTING
Datasets.
9 biopsies were collected from 3 CD patients and 2 healthy controls, from the ascending colon and terminal ileum. All data were de-identified and collected under Institutional Review Board approval. The dataset includes 1 active disease and 8 non-disease biopsies. The MxIF markers used were scanned at 20× magnification and stained in the following order: DAPI (first round), Muc2, Collagen, β catenin, pEGFR, HLA A, PanCK, Na-KATPase, Vimentin, SMA, and γActin. Note that the functional markers were not evaluated in this study since their patterns are more disease-dependent than those of the structural markers. Standard DAPI-based registration and autofluorescence correction were applied to build the pixel-to-pixel relationship between different markers [4]. We computed tissue masks covering the pixels that contained all markers across all staining rounds to ensure effective learning. The masks were applied to the images, which were then preprocessed with group-wise linear normalization. For testing purposes, part of the tissue on γActin was artificially removed to emulate the missing tissue case.
Model training.
We adopted high-resolution image synthesis and semantic manipulation with conditional GANs (pix2pixHD) [20] as our backbone. We randomly chose four samples for training and five samples for testing. For each training sample, we first split each image into 1024×1024 patches without re-sampling. Except for the γActin channel, if any channel of a patch contained less than 5% non-zero intensity pixels, we removed that patch from all markers in the training dataset. Finally, we randomly split 80% of the dataset for training (180 patches) and the remaining 20% for validation (45 patches). Following pix2pixHD, we first pretrained a coarse generator on data loaded at 512×512 with a batch size of 4, then trained a high-resolution model at the original patch size (1024×1024) with a batch size of 1. During both the coarse and fine training processes, models were trained for 200 epochs and saved every 10 epochs. We chose the structural similarity index measure (SSIM) to evaluate validation synthesis performance and select the best model. All models were trained on an NVIDIA Titan Xp 12GB graphics card and implemented in PyTorch (https://github.com/NVIDIA/pix2pixHD).
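Checkpoint selection by validation SSIM can be sketched as follows. For brevity, this uses a simplified single-window SSIM rather than the usual sliding-window variant (e.g., skimage.metrics.structural_similarity, which the paper may well have used instead), and the data layout is hypothetical.

```python
import numpy as np

def global_ssim(a, b, data_range=1.0):
    """Simplified single-window SSIM, used here only to rank checkpoints."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (a.var() + b.var() + c2)
    return num / den

def select_best_checkpoint(val_pairs_by_epoch):
    """Pick the saved epoch with the highest mean validation SSIM.
    val_pairs_by_epoch: {epoch: [(synthesized, real), ...]} (hypothetical layout)."""
    scores = {epoch: np.mean([global_ssim(s, r) for s, r in pairs])
              for epoch, pairs in val_pairs_by_epoch.items()}
    return max(scores, key=scores.get)
```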
Experiment design.
The baseline of this work applies trivial image erosion to synthesize the missing tissue. During the training phase, we iteratively apply OpenCV’s morphological erosion [21], with a trivially random iteration count, until the ratio of the eroded image to the original image reaches a threshold. As baselines, we choose a 50% tissue missing ratio (erosion-50%) and a 90% tissue depletion ratio (erosion-90%). Due to the large testing set, it is infeasible to annotate all testing data manually. To validate the effectiveness of the inpainted images, we use an ilastik pixel classification project to generate membrane probability masks [22]. Four αActinin patches (averaging 1482 pixels) were interactively traced by a domain expert in cell and molecular biology, guided by the intermediate output of ilastik feature selection. To validate the usefulness of the image synthesis, we concatenate all testing patches back to the original full sample scale and feed them to ilastik. We then split the generated membrane probability map back into 1024×1024 tiles for patch-based performance analysis. There are 1671 available patches for whole-tissue synthesis analysis and 203 usable patches for missing tissue reconstruction analysis. We compare the ilastik output on the original images with that on the synthesized images to validate reproducibility. We focus on two comparisons: the missing tissue region only, and the whole tissue area. As stated earlier, it is impractical to paste the inpainted tissue back onto the remaining tissue, so we use the full synthesized tissue sample for the ilastik downstream analysis. Mean squared error (MSE) is selected as the metric. For the sensitivity analysis, we study the tissue erosion ratio applied to the γActin channel’s input training data: in addition to the two baselines, we set three more missing tissue thresholds of 60%, 70%, and 80%.
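The erosion baseline can be sketched as below. To stay self-contained, this uses a hand-rolled 3×3 binary erosion in place of cv2.erode (the paper uses OpenCV's morphological erosion [21]); the stopping condition on the missing-tissue ratio mirrors the thresholds described above.

```python
import numpy as np

def erode_once(img):
    """Minimal 3x3 binary erosion (stand-in for cv2.erode with a 3x3 kernel)."""
    padded = np.pad(img > 0, 1)
    keep = np.ones(img.shape, dtype=bool)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            keep &= padded[1 + dr:padded.shape[0] - 1 + dr,
                           1 + dc:padded.shape[1] - 1 + dc]
    return img * keep

def erosion_baseline(img, missing_ratio=0.5, max_iter=1000):
    """Erode iteratively until the missing-tissue ratio reaches the
    threshold, e.g. 0.5 for erosion-50% or 0.9 for erosion-90%."""
    original = np.count_nonzero(img)
    eroded = img.copy()
    for _ in range(max_iter):
        if original == 0 or 1 - np.count_nonzero(eroded) / original >= missing_ratio:
            break
        eroded = erode_once(eroded)
    return eroded
```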
4. RESULTS
One testing γActin patch (1024×1024) was randomly selected, as illustrated in Figure 4. Four regions were zoomed for both missing tissue and whole-tissue synthesis. Within the zoomed regions of interest, the baseline erosion-50% generated membrane probability lines more similar to the reference than erosion-90% in the missing tissue scenario, but created broken membrane probability lines in the whole-tissue case. The proposed pixNto1-MT achieved visually stable performance in both test cases. Quantitatively, Figure 5 shows that pixNto1-MT moderately outperformed erosion-50% and achieved comparable results to erosion-90% in the missing tissue reconstruction validation. For whole-tissue reproducibility, pixNto1-MT surpassed erosion-50% and erosion-90% by an average of 60% and 40%, respectively, with significant differences (p ≤ 1.00e-04). The reproducibility results on the active CD disease sample and normal tissue show a similar performance pattern between pixNto1-MT and the baselines. Table 1 presents the sensitivity analysis, where reproducibility for the missing tissue region becomes worse as more of the remaining tissue is preserved during the training phase. Not surprisingly, preserving a larger portion of the remaining tissue led to better whole-tissue reproducibility.
Figure 4.
The intensity of the original tissue and synthesized images is tuned together. The left panel shows the qualitative ilastik reproducibility performance of inpainting synthesis on a random patch (1024×1024 tile) for the baselines erosion-50% and erosion-90% and our proposed method (pixNto1-MT).
Figure 5.
Quantitative results with MSE. The Wilcoxon signed-rank test indicates significant differences (marked with ‘*’, p ≤ 1.00e-04); ns represents non-significant.
Table 1.
Sensitivity analysis reproducibility results (mean ± standard deviation) when using different tissue erosion ratio thresholds to emulate the missing tissue case.
| MSE × 10−3 | pixNto1-MT | erosion-50% | erosion-60% | erosion-70% | erosion-80% | erosion-90% |
| --- | --- | --- | --- | --- | --- | --- |
| Missing tissue only | 2.15±1.07 | 2.38±1.27 | 2.30±1.19 | 2.23±1.06 | 2.16±1.06 | 2.13±1.06 |
| Whole tissue | 0.65±0.52 | 1.09±1.38 | 1.08±0.14 | 1.23±0.10 | 1.49±0.12 | 1.60±0.13 |
5. DISCUSSION
Our proposed pixNto1-MT is more accurate than the baseline methods in overall reproducibility performance. For missing tissue reproducibility, the baseline achieved results similar to pixNto1-MT, implying that inter multi-modal information may carry more weight in the inpainting task. The sensitivity analysis shows that the proposed biochemical mask library provides more dynamic noise on the input training data than a trivial image erosion operation. The sensitivity analysis results also suggest a trade-off between remaining-tissue and missing-tissue reconstruction that is worth further investigation. With more data collection, we need to further validate the usefulness of the synthesized images in a disease-driven manner, i.e., by including more biopsies from patients with active CD and surgical patients. It will also be helpful to assess the usefulness of the synthesized images across different batches. The most critical future step is to evaluate the performance of real versus synthesized images on downstream tasks (e.g., cell segmentation), since that is the ultimate goal of the membrane segmentation. The next technical step would be to integrate multiple discriminators [23] to model local and global image variance and improve inpainting performance. Another future direction is to incorporate the synthesized masks to help cluster the functional markers. For instance, we might apply the technique to generate complete membrane masks and assess whether this results in more precise segmentation of clumped cells, ultimately creating full cell maps of IBD.
6. ACKNOWLEDGEMENTS
This research was supported by the Leona M. and Harry B. Helmsley Charitable Trust grant G-1903-03793 and G-2103-05128, NSF CAREER 1452485, and in part using the resources of the Advanced Computing Center for Research and Education (ACCRE) at Vanderbilt University, Nashville, TN. This project was supported in part by the National Center for Research Resources, Grant UL1 RR024975-01, and is now at the National Center for Advancing Translational Sciences, Grant 2 UL1 TR000445-06, the National Institute of Diabetes and Digestive and Kidney Diseases, the Department of Veterans Affairs I01BX004366, and I01CX002171. The de-identified imaging dataset(s) used for the analysis described were obtained from ImageVU, a research resource supported by the VICTR CTSA award (ULTR000445 from NCATS/NIH), Vanderbilt University Medical Center institutional funding and Patient-Centered Outcomes Research Institute (PCORI; contract CDRN-1306-04869).
REFERENCES
- [1].Baumgart DC and Sandborn WJ, “Crohn’s disease,” Lancet, vol. 380, no. 9853, pp. 1590–1605, 2012. [DOI] [PubMed] [Google Scholar]
- [2].Dahlhamer JM, Zammitti EP, Ward BW, Wheaton AG, and Croft JB, “Prevalence of inflammatory bowel disease among adults aged≥ 18 years—United States, 2015,” Morb. Mortal. Wkly. Rep, vol. 65, no. 42, pp. 1166–1169, 2016. [DOI] [PubMed] [Google Scholar]
- [3].Bao S et al. , “A cross-platform informatics system for the Gut Cell Atlas: integrating clinical, anatomical and histological data,” in Medical Imaging 2021: Imaging Informatics for Healthcare, Research, and Applications, 2021, vol. 11601, p. 1160106. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [4].Lin J-R, Fallahi-Sichani M, and Sorger PK, “Highly multiplexed imaging of single cells using a high-throughput cyclic immunofluorescence method,” Nat. Commun, vol. 6, no. 1, pp. 1–7, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [5].Stack EC, Wang C, Roman KA, and Hoyt CC, “Multiplexed immunohistochemistry, imaging, and quantitation: a review, with an assessment of Tyramide signal amplification, multispectral imaging and multiplex analysis,” Methods, vol. 70, no. 1, pp. 46–58, 2014. [DOI] [PubMed] [Google Scholar]
- [6].Liu X, Xing F, Yang C, Kuo C-CJ, El Fakhri G, and Woo J, “Symmetric-constrained irregular structure inpainting for brain mri registration with tumor pathology,” in Brainlesion: glioma, multiple sclerosis, stroke and traumatic brain injuries. BrainLes (Workshop), 2021, vol. 12658, p. 80. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [7].Zhang H, Bakshi R, Bagnato F, and Oguz I, “Robust Multiple Sclerosis Lesion Inpainting with Edge Prior,” in International Workshop on Machine Learning in Medical Imaging, 2020, pp. 120–129. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [8].Xiong H, Wang C, Barnett M, and Wang C, “Multiple Sclerosis Lesion Filling Using a Non-lesion Attention Based Convolutional Network,” in International Conference on Neural Information Processing, 2020, pp. 448–460. [Google Scholar]
- [9].Guizard N, Nakamura K, Coupé P, Fonov VS, Arnold DL, and Collins DL, “Non-local means inpainting of MS lesions in longitudinal image processing,” Front. Neurosci, vol. 9, p. 456, 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [10].Annunziata R, Garzelli A, Ballerini L, Mecocci A, and Trucco E, “Leveraging multiscale hessian-based enhancement with a novel exudate inpainting technique for retinal vessel segmentation,” IEEE J. Biomed. Heal. informatics, vol. 20, no. 4, pp. 1129–1138, 2015. [DOI] [PubMed] [Google Scholar]
- [11].Zheng T et al. , “Unsupervised segmentation of COVID-19 infected lung clinical CT volumes using image inpainting and representation learning,” in Medical Imaging 2021: Image Processing, 2021, vol. 11596, p. 115963F. [Google Scholar]
- [12].Hou L, Agarwal A, Samaras D, Kurc TM, Gupta RR, and Saltz JH, “Robust histopathology image analysis: To label or to synthesize?,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 8533–8542. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [13].Liu D et al. , “Unsupervised Instance Segmentation in Microscopy Images via Panoptic Domain Adaptation and Task Re-weighting (Supplementary material).” [Google Scholar]
- [14].Gong X, Chen S, Zhang B, and Doermann D, “Style Consistent Image Generation for Nuclei Instance Segmentation,” in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2021, pp. 3994–4003. [Google Scholar]
- [15].Mirsky Y, Mahler T, Shelef I, and Elovici Y, “CT-GAN: Malicious tampering of 3D medical imagery using deep learning,” in 28th {USENIX} Security Symposium ({USENIX} Security 19), 2019, pp. 461–478. [Google Scholar]
- [16].Jin D, Xu Z, Tang Y, Harrison AP, and Mollura DJ, “CT-realistic lung nodule simulation from 3D conditional generative adversarial networks for robust lung segmentation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2018, pp. 732–740. [Google Scholar]
- [17].Jiao L, Wu H, Wang H, and Bie R, “Multi-scale semantic image inpainting with residual learning and GAN,” Neurocomputing, vol. 331, pp. 199–212, 2019. [Google Scholar]
- [18].Liu H, Wan Z, Huang W, Song Y, Han X, and Liao J, “PD-GAN: Probabilistic Diverse GAN for Image Inpainting,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 9371–9381. [Google Scholar]
- [19].Isola P, Zhu J-Y, Zhou T, and Efros AA, “Image-to-image translation with conditional adversarial networks,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 1125–1134. [Google Scholar]
- [20].Wang T-C, Liu M-Y, Zhu J-Y, Tao A, Kautz J, and Catanzaro B, “High-resolution image synthesis and semantic manipulation with conditional gans,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 8798–8807. [Google Scholar]
- [21].Howse J, OpenCV computer vision with python Packt Publishing Birmingham, 2013. [Google Scholar]
- [22].Sommer C, Straehle C, Koethe U, and Hamprecht FA, “Ilastik: Interactive learning and segmentation toolkit,” in 2011 IEEE international symposium on biomedical imaging: From nano to macro, 2011, pp. 230–233. [Google Scholar]
- [23].Yu J, Lin Z, Yang J, Shen X, Lu X, and Huang TS, “Generative image inpainting with contextual attention,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 5505–5514. [Google Scholar]