Significance Statement
Podocytes are depleted in several renal parenchymal processes. The current gold standard for identifying podocytes relies on histopathologic staining of nuclei using specific antibodies and manual enumeration, which is expensive and laborious. We have developed PodoSighter, a cloud-based tool for automated, label-free podocyte detection and three-dimensional quantification from periodic acid–Schiff-stained histologic sections. A diverse dataset from rodent models of glomerular diseases (diabetic kidney disease, crescentic GN, and dose-dependent direct podocyte toxicity and depletion), human biopsies for steroid-resistant nephrotic syndrome, and human autopsy tissue demonstrates the generalizability of the tool. Samples were derived from multiple laboratories, supporting broad application. This tool may facilitate clinical assessment and research involving podocyte morphometry.
Keywords: podocyte detection, deep learning, pix2pix GAN, Deeplab, cloud, cloud computing, urinary tract, viscera, podocytes, CNN
Visual Abstract
Abstract
Background
Podocyte depletion precedes progressive glomerular damage in several kidney diseases. However, the current standard of visual detection and quantification of podocyte nuclei from brightfield microscopy images is laborious and imprecise.
Methods
We have developed PodoSighter, an online cloud-based tool, to automatically identify and quantify podocyte nuclei from giga-pixel brightfield whole-slide images (WSIs) using deep learning. Ground-truth to train the tool used immunohistochemically or immunofluorescence-labeled images from a multi-institutional cohort of 122 histologic sections from mouse, rat, and human kidneys. To demonstrate the generalizability of our tool in investigating podocyte loss in clinically relevant samples, we tested it in rodent models of glomerular diseases, including diabetic kidney disease, crescentic GN, and dose-dependent direct podocyte toxicity and depletion, and in human biopsies from steroid-resistant nephrotic syndrome and from human autopsy tissues.
Results
The optimal model yielded high sensitivity/specificity of 0.80/0.80, 0.81/0.86, and 0.80/0.91, in mouse, rat, and human images, respectively, from periodic acid–Schiff-stained WSIs. Furthermore, the podocyte nuclear morphometrics extracted using PodoSighter were informative in identifying diseased glomeruli. We have made PodoSighter freely available to the general public as turnkey plugins in a cloud-based web application for end users.
Conclusions
Our study demonstrates an automated computational approach to detect and quantify podocyte nuclei in standard histologically stained WSIs, facilitating podocyte research, and enabling possible future clinical applications.
Podocyte depletion, either absolute (reduced number of podocytes per glomerulus) or relative (reduced number per glomerular volume), is associated with progressive glomerular damage in many kidney diseases,1,2 including diabetic kidney disease (DKD),3,4 crescentic GN,2,5 puromycin nephropathy,5–7 and steroid-resistant nephrotic syndrome (SRNS),8,9 among others. Therefore, an efficient method of podocyte quantification would be of significant research and clinical value.
On standard histochemical stains, podocytes are identified as nuclei within the Bowman space abutting the glomerular basement membrane and are best seen at the periphery of the glomerulus (hereinafter referred to as “peripheral podocytes”). When such clues on relative location to other glomerular structures are absent, podocyte nuclei can be difficult to identify without molecular-specific markers. These difficult-to-identify podocytes are often located more centrally in the glomerulus (hereinafter referred to as “central podocytes”).
Because the two-dimensional estimation of podocyte nuclear density (number per glomerular area, henceforth referred to as “podocyte area density”) lacks critical information regarding the pathophysiology behind podocyte depletion,10 the three-dimensional assessment of podocyte nuclear density (number per glomerular volume, henceforth referred to as “podocyte volume density”) was established as the gold standard. One such gold standard follows the approach proposed by Venkatareddy et al.11 (also known as the Wiggins method), which involves the manual counting of immunostained podocyte nuclei and the subsequent application of a correction factor (CF) to estimate the podocyte volume density. This approach is labor intensive and has added technical costs. Alternative methods to estimate podocyte counts employ optical dissectors, or use flow cytometry–based12 techniques. These techniques have inherent drawbacks involving time-intensive and tedious manual counting13 of podocyte nuclei and the use of additional stains, all beyond the scope of clinical practice. To our knowledge, no automated method exists that directly detects and quantifies podocyte nuclei and their volume densities from standard brightfield images of histologically stained renal tissue sections.
We have developed and tested a robust in situ podocyte quantification tool that we have named PodoSighter. This cloud-based application detects podocyte nuclei and quantifies their volume densities from giga-pixel brightfield microscopy whole-slide images (WSIs) of histologically stained renal tissue sections using deep learning. This computational tool was trained using immunohistochemical (IHC) and immunofluorescence (IF) stains for podocyte nuclei within renal tissue sections as ground-truth. To demonstrate the adaptability of PodoSighter for different molecular targets, we used two popular podocyte nuclei markers, p57 and Wilms’ Tumor 1 (WT1). We show the robustness of our tool using multispecies (mouse, rat, and humans), multi-institutional data, comprised of a diverse set of glomerular diseases, including rodent models for DKD, dose-dependent direct podocyte toxicity and depletion, and crescentic GN, and human biopsies for SRNS and human autopsy tissue. Additionally, this dataset included diverse staining protocols, thereby incorporating large amounts of variation in the training data, aiding in the generation of a robust pipeline. The extracted morphometrics of podocyte nuclei using PodoSighter were found to be informative in identifying diseased glomeruli, suggesting the potential application of PodoSighter in basic and clinical research, and possibly providing future applications in the clinical setting. We have made PodoSighter freely available to the general public as turnkey plugins in a cloud-based web application for end users.
Methods
Overview
In deep learning, automated detection of structures within an image requires two sets of images for network training: (1) “images” (wherein relevant structures are to be identified), and (2) “label images” (wherein all pixels of relevant structures are demarcated). For our PodoSighter pipeline (Figure 1), each tissue section was stained twice to acquire two types of WSI: (1) periodic acid–Schiff (PAS) stains served as “images” and (2) IF or IHC stains specifically labeled for podocyte nuclei using the markers p57 or WT1 served as the basis of creating “label images” (i.e., ground-truth) after thresholding and processing of the raw signal. Image registration14 was performed to provide a near-perfect pixel-to-pixel overlay of the PAS and IF/IHC images, resulting in demarcation of podocyte nuclei on the PAS stain using the podocyte nucleus–specific IF/IHC positive stain signal.
Because identification of glomeruli is a prerequisite for podocyte detection, we used a modified version of our previously published computational tool15,16(preprint) (termed H-AI-L) to automatically identify and extract glomeruli from the PAS-stained WSIs. Next, the training images for deep learning in our pipeline were generated by using image patches of the extracted PAS-stained glomeruli along with their corresponding IF/IHC image patches, which were processed and converted into label images of suitable formats for network training. After training, the deep learning network models were tested using hold-out PAS WSIs, and the resultant predictions were displayed along with input PAS WSIs.
Data Acquisition
A total of 122 WSIs were used (Table 1). For rodent analysis, 74 renal sections were obtained from three models: (1) streptozotocin (STZ)-induced DKD mouse model, (2) nephrotoxic serum (NTS) nephritis mouse model mimicking human crescentic GN (also referred to as nephrotoxic nephritis model), and (3) puromycin aminonucleoside nephropathy (PAN) rats, which is a dose-dependent, direct podocyte toxicity and depletion model. The human data included 14 autopsy sections and 34 pediatric biopsies, the latter of which were obtained from patients with SRNS. For rodent analysis, each of the control and diseased WSI datasets was randomly split into 80%/20% portions for training/testing. For the human data, each dataset was randomly split into 80%/20% portions for training/testing.
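The random 80%/20% split described above can be sketched as follows. This is a minimal illustration only: the function name `split_wsis`, the fixed seed, and the group sizes and identifiers are hypothetical and do not reproduce the split actually used in the study.

```python
import random


def split_wsis(wsi_ids, train_fraction=0.8, seed=0):
    """Randomly split WSI identifiers into training and testing subsets."""
    ids = list(wsi_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


# Hypothetical identifiers; control and diseased groups are split separately.
control = [f"control_{i:02d}" for i in range(12)]
diseased = [f"diseased_{i:02d}" for i in range(12)]

train, test = [], []
for group in (control, diseased):
    group_train, group_test = split_wsis(group)
    train += group_train
    test += group_test

print(len(train), "training WSIs;", len(test), "testing WSIs")
```

Splitting each group separately keeps both control and diseased slides represented in the training and testing portions.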
Table 1.
Sr. No. | Species | Dataset | Country of Origin | Disease | Podocyte Marker | #WSIs | #Glomeruli |
---|---|---|---|---|---|---|---|
1 | Mouse | STZ-model | USA | DKD | WT1 | 24 | ∼1.5K |
2 | Mouse | NTN-model | France | Crescentic GN | p57 | 10 | ∼2.8K |
3 | Rat | PAN-model | Germany | Dose-dependent direct podocyte toxicity and depletion | WT1 | 20 | ∼5K |
4 | Rat | PAN-model | Germany | Same as above | p57 | 20 | ∼5K |
5 | Human | Autopsy sections^a | USA | — | WT1 | 14 | ∼2K |
6 | Human | Autopsy sections^a | USA | — | p57 | 14 | ∼2K |
7 | Human | Biopsy sections | Germany | SRNS | WT1 | 19 | ∼0.3K |
8 | Human | Biopsy sections | Germany | SRNS | p57 | 15 | ∼0.3K |
Multi-institutional data encompassing various species, podocyte nuclei markers, and disease models utilized for this study. The number of WSIs and the total number of glomeruli extracted per dataset are listed. NTN, nephrotoxic nephritis.
^a For the autopsy data, the same sections were stained using both WT1 and p57 markers, one after the other, via bleaching.
To generate the training set, each section was stained in one of two ways: (1) initially stained with IF or bleachable IHC markers and subsequently with PAS, or (2) initially stained with PAS and subsequently stained using IHC markers (Supplemental Figure 1). For podocyte nuclei labeling, the podocyte nuclear markers p57 and WT1 were used.
Tissue Preparation, Staining, and Imaging
For the STZ model, we used a standard STZ-treated mouse model.17 All animal studies were performed in accordance with protocols approved by the Institutional Animal Care and Use Committee at University at Buffalo. They were also consistent with federal guidelines and regulations and followed the recommendations of the American Veterinary Medical Association guidelines on euthanasia. The STZ drug dosage is listed in Supplemental Table 1. Renal tissue sections from seven control and seven STZ-treated mice were used, which were formalin fixed and paraffin embedded (FFPE), then sectioned at 2 µm thickness. The tissue sections were spotted with DAPI mounting media (Vectashield Antifade Mounting Medium with DAPI, ex/em [nm]: 358/461; Vector Laboratories, Inc., Burlingame, CA). The podocyte nuclei were labeled using WT1 (ab89901; Abcam, Cambridge, UK) as a primary antibody with Alexa Fluor 594 (ex/em [nm]: 590/617) goat anti-rabbit IgG (1:1000, Life Technologies, Carlsbad, CA) as the secondary antibody. The slides were imaged for fluorescence at 40× magnification (0.13 µm/pixel) using an Aperio VERSA digital whole slide scanner (Leica Biosystems, Buffalo Grove, IL). After fluorescence imaging, the slides were prepared for poststaining via PAS. Slides were soaked in xylene for about an hour, until the edge of the coverslip could gently be lifted with a knife. Once the coverslip was removed, the tissues were rehydrated. Rehydration consisted of 2 × 10-minute washes in xylene, followed by 2 × 10-minute washes in 100% ethanol, before stepping down with 2 × 5-minute washes in 70% ethanol, and then submerging the slides in double distilled H2O. A PAS Stain Kit (ab150680; Abcam, Cambridge, UK) was used to poststain the tissues. The slides were again imaged in brightfield mode at 40× magnification with the Aperio VERSA system.
We also used NTS to induce a passive mouse model of GN. Two-month-old C57BL/6J mice received a retro-orbital injection of either NTS (15 µl, diluted in 85 µl of PBS) or PBS (100 µl) at days 0, 1, and 2 (2.5 µl/g), and were sacrificed at day 10. Kidneys were harvested, and FFPE 3-µm-thick sections were obtained (sectioned using a Leica RM-2145 microtome). After dewaxing, rehydration, and antigen retrieval using a citrate buffer, the sections were incubated overnight at 4°C with the primary antibody. Rabbit anti-p57 (1:250, ab75974; Abcam, Cambridge, MA) was used as the primary antibody, and donkey anti-rabbit AF488-conjugated antibody (1:500, Invitrogen, Carlsbad, CA) as the secondary antibody. The nuclei were stained with Hoechst (1 µg/ml). The slides were imaged for fluorescence at 40× magnification using a digital whole slide scanner (Nanozoomer HT2.0, C9600–12, Hamamatsu). After fluorescence imaging, the slides were prepared for poststaining via PAS. Slides were soaked in distilled water at 40°C for about 10 minutes until the edge of the coverslip could gently be lifted with a knife. Once the coverslip was removed, the tissues were washed in distilled water. The slides were incubated with periodic acid (1%) for 5 minutes, washed again in distilled water, incubated with Schiff reagent for 30 minutes, and then counterstained with hematoxylin. The slides were again imaged in brightfield mode at 40× magnification with a Hamamatsu Nanozoomer scanner.
For the PAN model,6 a single dose of 15 mg of puromycin aminonucleoside (P7130; Sigma-Aldrich, Germany) per 100 g of body weight was given intravenously to rats. FFPE 2-µm-thick sections were extracted and labeled using a standard PAS reaction with hematoxylin nuclear counterstain. The slides were imaged in brightfield mode at 40× magnification with a Hamamatsu Nanozoomer scanner (Hamamatsu, Herrsching am Ammersee, Germany). Subsequently, the slides were labeled using either p57 (1:500; BioSB, Inc., Santa Barbara, CA) or WT1 (1:20; Leica Biosystems Novocastra, Bannockburn, IL). Both were run on a Leica Bond Max staining machine with the Polymer Refine Detection Kit from Leica. The slides were again imaged in brightfield mode using the Nanozoomer scanner.
For the autopsy data, FFPE 4-µm-thick sections (sectioned using a Leica RM-2255 microtome) were deparaffinized with xylene and rehydrated with graded concentrations of ethanol. For antigen retrieval, the slides were transferred to a pressure cooker containing an EDTA solution (BSB 0032, BioSB, Inc., Santa Barbara, CA) and boiled at high pressure for 30 minutes. After retrieval, the slides were subjected to multiplexed immunohistochemistry. On day 1, slides were blocked with a peroxidase blocker (BSB 0054, BioSB, Inc., Santa Barbara, CA), washed once with an immunoDNA washer buffer (BSB 0150, BioSB, Inc., Santa Barbara, CA), and then incubated with a 1:100 dilution of rabbit anti-human p57 antibody (ab75974) for 90 minutes. After washes, a horseradish peroxidase (HRP)-conjugated anti-rabbit secondary reagent (RU-HRP-100; Diagnostic BioSystem, Pleasanton, CA) was applied. The AEC-red chromogen (SK-4205; Vector Laboratories, Inc., Burlingame, CA) was used for color development. After staining, the slides were mounted with an aqueous medium (H-5501; Vector Laboratories, Inc., Burlingame, CA) and then digitally scanned at 20× magnification using a MoticEasyScan Pro (Motic). On day 2, the slides were warmed in PBS buffer at 55°C until the coverslip detached on its own. After detachment of the coverslip, the slides were decolored in 90% ethanol for 5–10 minutes and stripped in an antibody elution buffer (0.2% SDS, 62.5 mM Tris-HCl, pH 6.8, and 0.8% β-mercaptoethanol) in a 55°C water bath for 15 minutes. The slides were rinsed with distilled H2O for 10 minutes, washed with wash buffer twice for 10 minutes, and incubated in PBS for 10 minutes. After completion of antibody elution, a WT1 antibody (ab89901) at a 1:300 working dilution was applied, and a highly sensitive Mouse/Rabbit PolyDetector Plus DAB HRP Detection System (BSB 0269, BioSB, Inc., Santa Barbara, CA) was used to develop tissue staining with AEC chromogen. Slides were mounted and scanned as previously described.
On day 3, the coverslips were removed and the slides were decolorized in the same manner. The slides were finally counterstained with periodic acid and hematoxylin using a PAS kit (VWR, 84000252, supplier 87007). The slides were then scanned in brightfield mode at 40× magnification.
For the human pediatric samples used in this study, patients and their representatives gave informed consent for the use of tissue after the completion of diagnostics. FFPE 2-µm-thick sections were utilized. The slides were labeled using standard PAS reaction with hematoxylin nuclear counterstain. Subsequently, the slides were labeled using either p57 (1:500) or WT1 (1:20). Both were run on a Leica Bond Max staining machine with the Polymer Refine Detection Kit from Leica. The slides were again imaged in brightfield mode at 40× magnification with a Hamamatsu Nanozoomer scanner.
Image Registration/Alignment
PAS-stained WSIs and their corresponding IF/IHC WSIs were aligned via landmark-based image registration.14 The four corners of the bounding box containing the tissue and their centroids were used as landmark points for registration. For cases that displayed slight misalignment (as seen in most human slides), an offset value was manually added to obtain precise alignment. Additionally, for cases with multiple tissue sections on the same slide or in cases with artifacts, a manual translation value was added. However, because our dataset was generated by careful quality control of tissue sectioning, IF staining, and coverslip removal for PAS staining, and because the WSI data were generated using the same imaging system, the registration process was simplified drastically. The registered WSIs were visually inspected for alignment. Note that manual processing for registration is tied only to the training dataset and does not alter the automatic aspect of the pipeline in the test set.
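The landmark-based alignment described above can be sketched as follows. The text specifies the landmarks (four bounding-box corners and their centroid) and an optional manual offset, but not the transform model; this sketch assumes an axis-aligned scale plus translation fitted by least squares. All function names and the toy masks are hypothetical.

```python
def rectangle_mask(height, width, top, bottom, left, right):
    """Toy binary tissue mask: 1 inside the given rectangle, 0 elsewhere."""
    return [[1 if top <= r <= bottom and left <= c <= right else 0
             for c in range(width)] for r in range(height)]


def bounding_box_landmarks(mask):
    """Four corners of the tissue bounding box plus their centroid, as (x, y)."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    corners = [(left, top), (right, top), (right, bottom), (left, bottom)]
    cx = sum(x for x, _ in corners) / 4
    cy = sum(y for _, y in corners) / 4
    return corners + [(cx, cy)]


def fit_axis(src, dst):
    """Least-squares fit of dst = scale * src + shift along one axis."""
    n = len(src)
    mean_s, mean_d = sum(src) / n, sum(dst) / n
    var = sum((s - mean_s) ** 2 for s in src)
    scale = sum((s - mean_s) * (d - mean_d) for s, d in zip(src, dst)) / var
    return scale, mean_d - scale * mean_s


def register(moving_mask, fixed_mask, manual_offset=(0.0, 0.0)):
    """Return a function mapping moving-image (x, y) to fixed-image (x, y)."""
    mov = bounding_box_landmarks(moving_mask)
    fix = bounding_box_landmarks(fixed_mask)
    sx, tx = fit_axis([p[0] for p in mov], [p[0] for p in fix])
    sy, ty = fit_axis([p[1] for p in mov], [p[1] for p in fix])
    ox, oy = manual_offset  # manual correction for slight misalignment
    return lambda x, y: (sx * x + tx + ox, sy * y + ty + oy)


fixed = rectangle_mask(8, 12, top=2, bottom=5, left=3, right=8)   # e.g., PAS
moving = rectangle_mask(8, 12, top=1, bottom=4, left=1, right=6)  # e.g., IF/IHC
to_fixed = register(moving, fixed)
print(to_fixed(1, 1))  # top-left tissue corner of the moving image
```

The `manual_offset` argument plays the role of the manually added offset or translation mentioned in the text for slides with slight misalignment or multiple sections.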
Automated Extraction of Glomeruli
Glomeruli were identified and extracted from the PAS-stained WSIs using our previously published H-AI-L tool.15 Briefly, H-AI-L is a convolutional neural network (CNN) trained to automatically segment glomeruli from brightfield WSIs of renal tissue sections. Alternatively, the pipeline permits users to insert manual glomerular annotations. Because the PAS and IF/IHC WSIs were previously aligned by image registration, extraction of the glomerulus from the PAS-stained WSI also extracted the same glomerulus on the IF/IHC WSI.
Generation of Training Images
The label image for each glomerulus was generated by segmenting the podocyte nucleus–specific IF/IHC signal using standard morphologic processing.18 For deep learning, we studied two different methods, CNN-based semantic segmentation and generative adversarial network (GAN)-based synthetic image generation. The former uses an image mask generated from the IF/IHC-stained image on the basis of the positive signal for podocyte nuclei to demarcate the podocyte nuclei of the corresponding PAS-stained image. The latter fundamentally translates one type of image (domain A) into another type of image (domain B). For the goal of our study, domain A images were the PAS-stained glomeruli and domain B images were either the IF/IHC images labeling podocyte nuclei or an in silico image (generated before training) demarcating podocyte nuclei (Supplemental Figure 2). We examined a subset of datasets from our cohort to determine which one of these two domain B images generated better performance, and chose the latter for our GAN-based pipeline.
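As a rough illustration of how a label image can be derived from a marker signal, the sketch below thresholds a single-channel intensity patch and discards small connected components. The text states only that standard morphologic processing was used; the threshold, the minimum object size, the 4-connectivity, and the function name `make_label_mask` are all assumptions made for this example.

```python
def make_label_mask(signal, threshold, min_pixels):
    """Threshold a marker-intensity patch and drop small connected components."""
    h, w = len(signal), len(signal[0])
    binary = [[signal[r][c] >= threshold for c in range(w)] for r in range(h)]
    seen = [[False] * w for _ in range(h)]
    label = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if not binary[r][c] or seen[r][c]:
                continue
            # Flood fill (4-connectivity) to collect one connected component.
            stack, component = [(r, c)], []
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            # Keep only components large enough to be a plausible nucleus.
            if len(component) >= min_pixels:
                for y, x in component:
                    label[y][x] = 1
    return label


# Toy 4 x 5 intensity patch: one 4-pixel object and one isolated bright pixel.
patch = [
    [0,   0,   0, 0, 200],
    [0, 180, 190, 0,   0],
    [0, 170, 210, 0,   0],
    [0,   0,   0, 0,   0],
]
mask = make_label_mask(patch, threshold=128, min_pixels=3)
for row in mask:
    print(row)
```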
Proposed Deep Learning Network Model Training and Testing
The Deeplab V3+ network19 and the pix2pix conditional GAN20 were employed for the CNN- and GAN-based pipelines, respectively. For the CNN, the training labels were uint8 images with index values corresponding to individual classes (e.g., background, glomerulus, podocyte nuclei, and other glomerular nuclei). Our model was trained for 50K steps, using a batch size of 12 and a learning rate of 1e−3, on an NVIDIA GeForce GTX 1080 GPU (NVIDIA, Santa Clara, CA). The Xception network backbone was used with the implementation and hyperparameters described by Chen et al.19 For the GAN, the pix2pix network models were trained for 20 epochs on an NVIDIA GeForce GTX 1080 GPU on the basis of the PyTorch implementation and hyperparameters specified by Isola et al.20 For quality control, image patches containing staining or technical artifacts were removed from the analysis. Post training, these deep learning network models were tested using hold-out PAS WSIs. Podocyte nuclei were identified by the network models in each H-AI-L segmented glomerular location, and the resultant podocyte nuclear boundaries were mapped back to the input PAS-stained images. Because H-AI-L preserves the original glomerular locations, the segmented podocyte nuclei can be visualized on the WSI level in the test set.
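The index-valued training labels for the CNN can be illustrated with the sketch below, which merges three binary masks into one label image whose integer values identify the class of each pixel. The specific index assigned to each class, the precedence of podocyte nuclei over other nuclei, and the function name `encode_label` are assumptions for illustration; the text specifies only that the labels were uint8 images with one index per class.

```python
# Hypothetical class indices; each fits in a single uint8 value.
CLASS_INDEX = {
    "background": 0,
    "glomerulus": 1,
    "podocyte_nucleus": 2,
    "other_nucleus": 3,
}


def encode_label(glomerulus_mask, nuclei_mask, podocyte_mask):
    """Merge binary masks into one index-valued label image (values 0 to 3)."""
    label = []
    for g_row, n_row, p_row in zip(glomerulus_mask, nuclei_mask, podocyte_mask):
        row = []
        for g, n, p in zip(g_row, n_row, p_row):
            if not g:
                value = CLASS_INDEX["background"]
            elif p:
                value = CLASS_INDEX["podocyte_nucleus"]
            elif n:
                value = CLASS_INDEX["other_nucleus"]
            else:
                value = CLASS_INDEX["glomerulus"]
            row.append(value)
        label.append(row)
    return label


glomerulus = [[0, 1, 1, 1], [0, 1, 1, 1]]
nuclei     = [[0, 1, 0, 1], [0, 0, 0, 1]]
podocytes  = [[0, 1, 0, 0], [0, 0, 0, 0]]
label_image = encode_label(glomerulus, nuclei, podocytes)
print(label_image)
```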
Performance Evaluation of Proposed Deep Learning Network Models for Podocyte Nuclei Detection
To evaluate the performance of the CNN and the GAN in identifying podocyte nuclei, we compared the network predictions with the ground-truth labels provided by the IF/IHC images. The ground-truth podocyte nuclei were extracted automatically from the IF/IHC images by segmenting the WT1- or p57-positive nuclei via morphologic processing,18 and the sensitivity and specificity of each network were computed by comparing these nuclei with the network-estimated podocyte nuclei.
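Sensitivity and specificity can be computed from a ground-truth mask and a predicted mask as in the sketch below. It assumes a pixel-wise comparison of binary masks; whether the reported values were computed per pixel or per nucleus is not stated here, so the unit of comparison is an assumption, as is the function name.

```python
def sensitivity_specificity(truth, prediction):
    """Pixel-wise sensitivity and specificity for two binary masks."""
    tp = fp = tn = fn = 0
    for t_row, p_row in zip(truth, prediction):
        for t, p in zip(t_row, p_row):
            if t and p:
                tp += 1
            elif t and not p:
                fn += 1
            elif p:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


truth_mask = [[1, 1, 0, 0], [0, 0, 0, 0]]
predicted_mask = [[1, 0, 0, 1], [0, 0, 0, 0]]
sens, spec = sensitivity_specificity(truth_mask, predicted_mask)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```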
Podocyte Volume Density Estimation
Once the podocyte nuclei were detected by the two networks, the podocyte volume densities were extracted using the Wiggins method (Venkatareddy et al.11), wherein the detected podocyte nuclear counts were first corrected using a CF value (derived as a function of the podocyte nuclear shape, mean caliper diameter, and tissue thickness [T]), and then divided by the glomerular volume (calculated as the glomerular area × T).
Because the Wiggins method has been established in IF images but not yet validated in PAS images, we first assessed the feasibility of using PAS images for the extraction of mean nuclear caliper diameter and CF values. Thus, we extracted the mean apparent caliper diameter (d) (calculated by averaging the width and height of the bounding box containing each podocyte nucleus, and then taking the mean of this value over all of the podocyte nuclei per WSI) independently from: (1) the podocyte nuclei in the ground-truth IF/IHC WSIs, and (2) the same nuclei in the corresponding PAS WSIs (hematoxylin-stained nuclei), from all of the mouse, rat, and human p57-stained sections (59 WSIs with approximately 11K glomeruli) (Supplemental Figure 3). Next, their respective true caliper diameter (D) values were obtained by applying the quadratic equation derived by Venkatareddy et al.,11 employing a shape coefficient of k=0.72 and the respective T values. Then the CF values (calculated as a function of D and T) were computed and compared. For this part of the study, we restricted our analysis to the datasets employing the p57 marker to avoid potential inclusion of parietal epithelial cells (PECs) because WT1 is also reported to be expressed by PECs.21
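The chain from apparent diameter to volume density can be sketched as follows. The explicit forms used here, D = [(d − T) + sqrt((d − T)^2 + 4kdT)] / (2k) and CF = 1 / (D/T + 1), are stated from memory of the published method and are assumptions that should be checked against Venkatareddy et al.11 before use. The final step (corrected count divided by glomerular area × T) follows the description above. Function names and the example inputs are hypothetical.

```python
import math


def true_caliper_diameter(d, T, k=0.72):
    """True mean caliper diameter D from apparent diameter d and thickness T.

    Positive root of k*D**2 - (d - T)*D - d*T = 0 (assumed form of the
    quadratic equation; k is the shape coefficient).
    """
    return ((d - T) + math.sqrt((d - T) ** 2 + 4 * k * d * T)) / (2 * k)


def correction_factor(D, T):
    """Correction factor CF as a function of D and T (assumed form)."""
    return 1.0 / (D / T + 1.0)


def podocyte_volume_density(nuclear_count, glomerular_area, d, T, k=0.72):
    """Corrected nuclear count divided by glomerular volume (area x T)."""
    D = true_caliper_diameter(d, T, k)
    cf = correction_factor(D, T)
    return nuclear_count * cf / (glomerular_area * T)


# Hypothetical inputs: lengths in micrometers, area in square micrometers.
density = podocyte_volume_density(nuclear_count=12, glomerular_area=8000.0,
                                  d=5.0, T=2.0)
print(f"{density:.3e} podocytes per cubic micrometer")
```

Because CF lies between 0 and 1, the corrected count is always smaller than the raw count of nuclear profiles, which compensates for large nuclei being sampled by more than one section.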
Comparison of Network Performance with Manual Ground-Truth
To evaluate the performance of the networks in extracting podocyte volume densities, we obtained a manual ground-truth, wherein the p57-labeled podocyte nuclei were manually counted from all of the glomeruli in the IF/IHC images from the hold-out WSIs. Next, d, D, CF, and podocyte volume densities were automatically extracted from these IF/IHC ground-truth podocyte nuclei using the Wiggins method (shown in Supplemental Figure 3). (Note that it would be straightforward to count all of the podocyte nuclei from IF/IHC images using simple morphologic image analysis tools18; however, we conducted manual counting to remain consistent with the Wiggins method.) The resulting values were then compared with the corresponding values (d, D, CF, and podocyte volume densities) derived from the deep learning–estimated podocyte nuclei in the PAS images of the same glomeruli. To quantify the agreement between the ground-truth and estimated podocyte volume densities, the Bland–Altman plot22 was utilized, which constructs the 95% limits of agreement using the mean (µ) and 1.96× the SD (σ) of the differences between the two measurements. We further assessed the association between the computationally estimated and ground-truth podocyte volume densities using Pearson’s correlation coefficient.
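The agreement statistics described above can be sketched as follows: the Bland–Altman limits are µ ± 1.96σ of the paired differences, and Pearson's coefficient measures linear association. The use of the sample (n − 1) standard deviation is an assumption, and the density values are made-up placeholders rather than study data.

```python
import math
import statistics


def bland_altman(ground_truth, estimated):
    """Mean difference and 95% limits of agreement (mu +/- 1.96 * sigma)."""
    diffs = [g - e for g, e in zip(ground_truth, estimated)]
    mu = statistics.mean(diffs)
    sigma = statistics.stdev(diffs)  # sample SD; an assumption
    return mu, mu - 1.96 * sigma, mu + 1.96 * sigma


def pearson_r(a, b):
    """Pearson's correlation coefficient between two equal-length sequences."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(ss_a * ss_b)


# Placeholder volume densities (arbitrary units), one value per WSI.
truth_density = [210.0, 180.0, 150.0, 120.0, 95.0]
network_density = [200.0, 185.0, 140.0, 125.0, 90.0]

mu, lower, upper = bland_altman(truth_density, network_density)
r = pearson_r(truth_density, network_density)
print(f"bias={mu:.1f}, limits=({lower:.1f}, {upper:.1f}), r={r:.3f}")
```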
Network Performance in Human Biopsies with Respect to Number of Podocytes and Glomeruli
To evaluate the potential clinical use of our proposed computational tool in analyzing human biopsies, we analyzed its performance in quantifying podocyte nuclei from the human pediatric biopsy cohort (34 WSIs). The ground-truth podocyte count per glomerulus was measured using the method discussed in Podocyte Volume Density Estimation and the total number of glomeruli per WSI was obtained automatically using the glomerular annotations generated by our H-AI-L tool.15,16 We used a three-fold cross-validation method to train and test our network exclusively on all 34 WSIs. For this study, we chose the CNN network, which displayed the better performance of the two proposed deep learning network models. Once the network had detected podocyte nuclei on the biopsy WSIs, a regression plot (Supplemental Figure 4) was used to assess the performance of the network on each biopsy, with respect to two criteria: (1) number of podocyte nuclei per glomerulus, and (2) number of glomerular profiles per WSI. The dice coefficient (a measure of similarity between estimated and ground-truth podocyte nuclei, where a dice coefficient of 0 signifies no overlap and 1 signifies perfect overlap between the podocyte nuclei pixels in the ground truth and the estimated podocyte nuclei) was used to compare the performance of the network with the ground-truth IF/IHC images. Note that in our human pediatric biopsy WSI cohort a single WSI contains tissue sections from multiple cores, and hence the number of glomeruli per WSI varies from 4 to 56. Further, in practice, renal pathologists may integrate results from <20 glass slides (hence WSIs), each with one or more sections, for assessment of renal biopsies in the clinic. For this process there exists no fixed or consensus rule. For this study, our model was trained for 50K steps, using a batch size of 12 and a learning rate of 1e−3, on an NVIDIA GeForce GTX 1080 GPU.
The Xception network backbone was used with the implementation and hyperparameters described by Chen et al.19 Additionally, because 19 of the 34 WSIs had the WT1 marker as the ground-truth, we used the CNN model trained exclusively on p57-labeled podocyte nuclei to identify podocyte nuclei (and eliminate PECs).
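The dice coefficient used above can be computed on binary masks as 2 × |A ∩ B| / (|A| + |B|). The sketch below is a minimal illustration; the convention of returning 1.0 when both masks are empty and the function name are assumptions.

```python
def dice_coefficient(truth, prediction):
    """Dice overlap between two binary masks (0 = no overlap, 1 = perfect)."""
    intersection = truth_pixels = predicted_pixels = 0
    for t_row, p_row in zip(truth, prediction):
        for t, p in zip(t_row, p_row):
            truth_pixels += 1 if t else 0
            predicted_pixels += 1 if p else 0
            intersection += 1 if (t and p) else 0
    total = truth_pixels + predicted_pixels
    # Both masks empty: treated as perfect agreement (an assumed convention).
    return 2 * intersection / total if total else 1.0


truth_nuclei = [[1, 1, 1, 0], [0, 0, 0, 0]]
predicted_nuclei = [[0, 1, 1, 1], [0, 0, 0, 0]]
dice = dice_coefficient(truth_nuclei, predicted_nuclei)
print(f"dice={dice:.3f}")
```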
Network Performance with respect to Preanalytical Errors
To determine how network performance varies in the presence of preanalytical errors, we had three renal pathologists identify glomerular image patches containing such errors (e.g., imaging aberrations, tissue folds, and cutting artifacts) from our mouse, rat, and human cohort data. The pathologists identified approximately 100 glomerular image patches containing such artifacts. We then analyzed the performance of the network in these image patches. For the full list of artifacts, see Supplemental Methods. For this study, we chose the CNN network, which displayed the better performance of the two proposed deep learning network models.
Statistical Analysis
To determine if there was significant podocyte depletion in the disease datasets (STZ-treated mice, NTS-treated mice, and PAN-treated rats) as compared with their respective control counterparts, we analyzed three separate results: (1) ground-truth podocyte volume densities (as obtained from the IF/IHC images), (2) CNN-estimated volume densities (as estimated from the corresponding PAS images), and (3) GAN-estimated volume densities (as estimated from the corresponding PAS images). To determine the type of statistical test to be used for evaluating podocyte depletion in disease cases, we first analyzed the distribution of all of the aforementioned podocyte volume densities using the Shapiro–Wilk23 normality test. The normality test indicated that none of the podocyte volume densities (ground-truth, CNN estimated, and GAN estimated) were normally distributed, and consequently the one-sided Mann–Whitney–Wilcoxon test24 was utilized to assess the significance of podocyte depletion in the disease datasets, separately for ground-truth, CNN-estimated, and GAN-estimated podocyte volume densities.
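A one-sided rank-based comparison of this kind can be sketched in plain Python as below. This is a simplified illustration, not the analysis code used in the study: it computes the U statistic with midranks and a normal approximation without tie or continuity correction, and it omits the Shapiro–Wilk step. Statistical packages (for example, `scipy.stats.mannwhitneyu` with `alternative="less"`) provide more complete implementations. The density values are made-up placeholders.

```python
import math


def mann_whitney_less(x, y):
    """One-sided test that values in x tend to be smaller than values in y.

    Returns (U statistic for x, approximate p-value) using midranks and a
    normal approximation without tie or continuity correction.
    """
    values = list(x) + list(y)
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1  # average rank shared by tied values
        for pos in range(i, j + 1):
            ranks[order[pos]] = midrank
        i = j + 1
    n1, n2 = len(x), len(y)
    u_x = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    p_value = 0.5 * (1 + math.erf(z / math.sqrt(2)))  # P(Z <= z)
    return u_x, p_value


# Placeholder volume densities (arbitrary units).
diseased_density = [95.0, 102.0, 110.0, 88.0, 99.0, 105.0]
control_density = [140.0, 151.0, 133.0, 160.0, 147.0, 138.0]
u_stat, p = mann_whitney_less(diseased_density, control_density)
print(f"U={u_stat}, one-sided p={p:.4f}")
```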
Effects of Central Versus Peripheral Podocytes
Central podocytes and their nuclei are significantly harder to identify accurately by visual assessment than peripheral podocyte nuclei because they lack clear structural reference locations. However, we hypothesized that computational algorithms should perform similarly in identifying both the central and peripheral podocyte nuclei. To test this hypothesis, we had three renal pathologists (A1, A2, and A3) manually annotate podocyte nuclei from a subset of the PAS images (72 image patches: three randomly chosen glomeruli from each of the 24 WSIs from the STZ dataset). To quantitatively analyze the detection of central and peripheral podocyte nuclei, a hand-segmented mask was used to separate the two for these image patches. This mask was selected so that any podocyte nucleus that was not directly at the periphery of the glomerular tuft was considered a central podocyte nucleus (Supplemental Figure 5). The IF images, along with the mask, were used to obtain the ground-truth of peripheral and central podocyte nuclei. The detection performance of the computational algorithm and of the three pathologist annotators was qualitatively examined. Additionally, we calculated the linear weighted Cohen’s kappa25 (κ) for the annotators and the algorithm, for three classes of nuclei within the glomerulus: (1) peripheral podocytes, (2) central podocytes, and (3) nonpodocyte cells. κ values of <0, 0–0.2, 0.21–0.4, 0.41–0.6, 0.61–0.8, and 0.81–1, respectively, indicated no, slight, fair, moderate, substantial, and near-perfect agreement. We further used Bland–Altman22 plots to quantify the agreement between the ground-truth and estimated podocyte volume densities by the networks and the agreement between the ground-truth and the three annotators.
For the ground-truth podocyte volume densities, the IF/IHC images were used to automatically obtain the d, D, and CF values, whereas for the estimated volume densities (by the network and the annotators), the corresponding PAS images were used to obtain the d, D, and CF values. Because the goal in this study is to examine the power of the computational algorithm in identifying central versus peripheral podocyte nuclei in comparison with manual annotation, we used one of the two deep learning methods considered in this work, namely, the GAN-based method.
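The linear weighted Cohen's kappa described above can be sketched as follows, using disagreement weights |i − j| / (K − 1) for K ordered classes. The class ordering (0 = peripheral podocyte, 1 = central podocyte, 2 = nonpodocyte cell) and the example ratings are assumptions made for illustration only.

```python
def linear_weighted_kappa(rater_a, rater_b, n_classes):
    """Linear weighted Cohen's kappa for two raters over ordered classes."""
    n = len(rater_a)
    observed = [[0] * n_classes for _ in range(n_classes)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(observed[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    weighted_observed = weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = abs(i - j) / (n_classes - 1)  # linear disagreement weight
            weighted_observed += weight * observed[i][j] / n
            weighted_expected += weight * row_totals[i] * col_totals[j] / (n * n)
    # Undefined (division by zero) if both raters use a single class only.
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical labels for six nuclei: ground truth versus one annotator.
ground_truth_labels = [0, 1, 2, 0, 1, 2]
annotator_labels = [0, 1, 2, 0, 1, 1]
kappa = linear_weighted_kappa(ground_truth_labels, annotator_labels, n_classes=3)
print(f"kappa={kappa:.2f}")
```

With linear weights, confusing adjacent classes is penalized half as much as confusing the two extreme classes, so the chosen class ordering affects the result.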
Extraction of Podocyte Nuclear Morphometrics to Identify Diseased Glomeruli
The podocyte nuclei were identified from PAS images in one of three ways: nuclei that were (1) WT1 or p57 positive in the corresponding IF/IHC images (ground-truth), (2) CNN estimated, and (3) GAN estimated. Depending on the identification method, the features extracted from these nuclei are referred to as “ground-truth podocyte features,” “CNN-podocyte features,” and “GAN-podocyte features.” The extracted podocyte nuclear features included morphometric features. A detailed list of such features is available in Supplemental Table 2. Additionally, the glomerular areas were extracted.
To determine if podocyte nuclear morphometrics were of value, a naïve Bayesian (NB) classifier with and without these features was trained to distinguish diseased from control glomeruli. The data were split into 80%/20% portions for training/testing. First, a classifier (NB1) was trained using glomerular area as the only feature. Second, the glomerular area and the ground-truth podocyte features were used to train a separate classifier (NB2). Third, the glomerular area and the CNN-podocyte features were used to train a separate classifier (NB3). Lastly, the glomerular area and the GAN-podocyte features were used to train a separate classifier (NB4). We hypothesized that the extracted features from podocyte nuclei were informative if the classifiers incorporating podocyte nuclear morphometrics (i.e., NB2, NB3, NB4) performed better than the ones without (NB1).
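The comparison between a classifier trained on glomerular area alone and one trained with additional nuclear features can be sketched as follows. This is a minimal illustration, not the study's implementation: it assumes a Gaussian naive Bayes model (the text does not state the likelihood model), uses synthetic data with one weakly informative "area" feature and one strongly informative "nuclear" feature, and all function names are hypothetical.

```python
import math
import random


def fit_gaussian_nb(X, y):
    """Estimate the class prior and per-feature mean/variance for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (len(rows) / len(y), stats)
    return model


def predict(model, x):
    """Return the class with the highest log-posterior for feature vector x."""
    def log_posterior(prior, stats):
        lp = math.log(prior)
        for v, (mean, var) in zip(x, stats):
            lp += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
        return lp
    return max(model, key=lambda label: log_posterior(*model[label]))


def accuracy(model, X, y):
    return sum(predict(model, x) == t for x, t in zip(X, y)) / len(y)


# Synthetic glomeruli: label 0 = control, 1 = diseased (illustrative only).
rng = random.Random(0)
data = []
for _ in range(400):
    label = rng.randint(0, 1)
    area = rng.gauss(100 + 5 * label, 20)   # weakly informative feature
    nuclear = rng.gauss(5 * label, 1)       # strongly informative feature
    data.append(((area, nuclear), label))
split = int(0.8 * len(data))                # 80%/20% training/testing split
train, test = data[:split], data[split:]


def run(feature_idx):
    """Train and test using only the selected feature columns."""
    X_train = [[x[i] for i in feature_idx] for x, _ in train]
    X_test = [[x[i] for i in feature_idx] for x, _ in test]
    model = fit_gaussian_nb(X_train, [t for _, t in train])
    return accuracy(model, X_test, [t for _, t in test])


acc_area_only = run([0])         # analogue of NB1
acc_with_nuclear = run([0, 1])   # analogue of NB2 to NB4
print(acc_area_only, acc_with_nuclear)
```

The same structure applies to each of NB2, NB3, and NB4; only the source of the nuclear features changes.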
Cloud-Based Plugin Development
HistomicsUI,26 a distributed system with RESTful API, developed by Kitware Inc. (Clifton Park, NY), was utilized to deploy our algorithm as an online web plugin for end users; see https://athena.ccr.buffalo.edu/. We used Docker, a framework enabling the development of applications in independent containers, to package our pipeline as a Docker image, which was then deployed to the cloud. The Docker image is available at https://bit.ly/3e6XZzs for anyone to download and install our pipeline on another server, and for further development. Detailed instructions on executing the pipeline are provided in Supplemental Methods. Video instructions for using the pipeline are available at https://bit.ly/3e6XZzs. Once the user-defined inputs are provided, the plugin runs to generate an output xml file with podocyte nuclear annotations, which can be viewed either in Aperio ImageScope, or in the web interface itself by selecting an option from the drop-down menu to convert the xml into a json file. An Excel file is generated by the plugin containing the average podocyte nuclear counts, d, D, and CF values, and the podocyte volume density per WSI.
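The xml-to-json conversion mentioned above can be sketched as below. This is an assumption-laden illustration rather than the plugin's actual converter: it assumes the xml follows the Aperio ImageScope layout (Annotations, Annotation, Regions, Region, Vertices, Vertex with X and Y attributes) and that the json resembles an annotation document with a name and a list of closed polyline elements. The function name and the example contour are hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def aperio_xml_to_json(xml_text, name="podocyte nuclei"):
    """Convert ImageScope-style region contours into a JSON annotation string."""
    root = ET.fromstring(xml_text)
    elements = []
    for region in root.iter("Region"):
        # Each vertex becomes an [x, y, z] point; z is fixed at 0 for 2D slides.
        points = [
            [float(vertex.get("X")), float(vertex.get("Y")), 0.0]
            for vertex in region.iter("Vertex")
        ]
        if points:
            elements.append({"type": "polyline", "closed": True, "points": points})
    return json.dumps({"name": name, "elements": elements})


# Hypothetical single-contour example.
example = """<Annotations>
  <Annotation Id="1">
    <Regions>
      <Region Id="1">
        <Vertices>
          <Vertex X="10" Y="10"/>
          <Vertex X="20" Y="10"/>
          <Vertex X="20" Y="20"/>
        </Vertices>
      </Region>
    </Regions>
  </Annotation>
</Annotations>"""

doc = json.loads(aperio_xml_to_json(example))
print(len(doc["elements"]), doc["elements"][0]["points"][0])
```

In practice, the drop-down option in the web interface performs this step, so the sketch is mainly useful for understanding or post-processing the output files offline.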
Results
Deep Learning Networks Demonstrate High Performance Regardless of Species and Disease State
The CNN- and GAN-based outputs (Figure 2) displayed similar results in most datasets (Table 2). CNN consistently showed high sensitivity/specificity in all species and a higher true-positive rate in detecting podocyte nuclei than the GAN in rat and human datasets (Figure 3). Additionally, in all disease models, CNN exhibited sensitivity >0.70 and specificity >0.70, validating the robustness of the network. GAN demonstrated variable performance across the datasets: it outperformed the CNN in the STZ-treated mouse model (sensitivity/specificity of 0.86/0.83 versus 0.80/0.80) but underperformed the CNN on the NTS-treated mouse model (0.33/0.89 versus 0.80/0.81) and on the WT1-labeled human data (0.60/0.82 versus 0.82/0.87) (Table 2).
Table 2.
Species | Podocyte Marker | Dataset | pix2pix, Ctrl. | pix2pix, Dis. | CNN, Ctrl. | CNN, Dis. |
---|---|---|---|---|---|---|
Mouse | WT1 | STZ-model | 0.83/0.85 | 0.90/0.82 | 0.73/0.80 | 0.83/0.79 |
Mouse | p57 | NTN-model | 0.35/0.87 | 0.31/0.91 | 0.94/0.73 | 0.71/0.86 |
Rat | WT1 | PAN-model | 0.75/0.82 | 0.80/0.80 | 0.80/0.85 | 0.80/0.79 |
Rat | p57 | PAN-model | 0.80/0.83 | 0.83/0.82 | 0.83/0.90 | 0.80/0.87 |
Human | WT1 | Autopsy sections | 0.67/0.80 (overall) | | 0.85/0.85 (overall) | |
Human | p57 | Autopsy sections | 0.75/0.91 (overall) | | 0.75/0.94 (overall) | |
Human | WT1 | Biopsy sections | 0.42/0.89 (overall) | | 0.75/0.97 (overall) | |
Human | p57 | Biopsy sections | 0.83/0.92 (overall) | | 0.86/0.94 (overall) | |
Performance of the networks for individual datasets compared with the IF/IHC ground-truth. Network performance (sensitivity/specificity) was calculated by comparing the network-estimated podocytes with the ground-truth podocytes extracted from the IF/IHC images via morphologic processing. Ctrl., control; Dis., disease; NTN, nephrotoxic nephritis.
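The sensitivity/specificity pairs reported in Table 2 follow the usual confusion-matrix definitions, sketched below. The unit of comparison (pixels versus whole nuclei) is an assumption here; the example uses flat binary sequences in which 1 marks a podocyte nucleus and 0 marks anything else, and the function name is hypothetical.

```python
def detection_metrics(truth, predicted):
    """Sensitivity, specificity, and precision from paired binary labels."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    return {
        "sensitivity": tp / (tp + fn),   # true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
        "precision": tp / (tp + fp),     # fraction of detections that are correct
    }


# Toy example: 4 true podocyte items, 6 background items.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = detection_metrics(truth, predicted)
print(metrics)
```

Because background items usually far outnumber podocyte items, specificity can remain high even when precision is modest, which is worth keeping in mind when reading the table.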
CF Values Computed from IF/IHC Images Are Similar to PAS Images
The extracted D and CF values, independent from the IF/IHC images and the PAS images, from each dataset, are shown in Table 3. For the full table, see Supplemental Table 3. Overall, the D values from PAS images demonstrated an absolute error of 0.4±0.2 µm, 0.8±0.4 µm, and 0.7±0.6 µm in mouse, rat, and human data, compared with the IF/IHC images. However, the calculated CF values remained relatively unchanged (Table 3) between the IF/IHC and PAS images with an average absolute error of 0.01±0.01. Additionally, our automated technique generated similar D and CF values to that of the values reported by Venkatareddy et al.11 in mice, rat, and human data (Supplemental Table 4).
Table 3.
Species | Podocyte Nuclei Marker | Disease State | D from PAS (µm) | CF from PAS | D from IF/IHC (µm) | CF from IF/IHC | Average Absolute Error in CF (PAS versus IF/IHC) |
---|---|---|---|---|---|---|---|
Mouse | p57 | Control | 6.81±0.24 | 0.31±0.01 | 6.20±0.26 | 0.32±0.01 | 0.01 |
Mouse | p57 | NTN | 6.73±0.48 | 0.31±0.01 | 7.05±0.27 | 0.30±0.01 | 0.01 |
Rat | p57 | Control | 7.19±0.31 | 0.22±0.01 | 7.79±0.87 | 0.21±0.02 | 0.01 |
Rat | p57 | PAN | 6.92±0.22 | 0.22±0.01 | 8.10±0.69 | 0.20±0.01 | 0.02 |
Human | p57 | Biopsy | 6.62±0.28 | 0.23±0.01 | 7.85±0.89 | 0.21±0.02 | 0.02 |
Human | p57 | Autopsy | 7.98±0.84 | 0.33±0.02 | 8.33±1.17 | 0.33±0.03 | 0.00 |
The absolute error in CF values on the basis of true D extracted from PAS and IF/IHC images. The results indicate that the calculated CF values remain relatively unchanged between the IF/IHC and PAS images. NTN, nephrotoxic nephritis.
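A short calculation helps explain why CF changes little even when D differs between stains. The sketch assumes the correction factor takes the single-section form CF = 1/(D/T + 1), where T is the section thickness; the 3 µm thickness used here is an illustrative assumption, not a value taken from the tables, so the outputs are not expected to reproduce the tabulated per-WSI averages exactly.

```python
def correction_factor(true_diameter_um, section_thickness_um):
    """Assumed single-section correction factor: CF = 1 / (D / T + 1)."""
    return 1.0 / (true_diameter_um / section_thickness_um + 1.0)


thickness_um = 3.0                                # assumed section thickness
cf_pas = correction_factor(6.81, thickness_um)    # D from PAS (mouse control)
cf_if = correction_factor(6.20, thickness_um)     # D from IF/IHC (mouse control)

# A 0.61 µm difference in D moves CF by roughly 0.02 under these assumptions.
print(round(cf_pas, 3), round(cf_if, 3), round(abs(cf_pas - cf_if), 3))
```

Because D appears only in the denominator alongside T, errors in D of a fraction of a micrometer translate into small shifts in CF, which is consistent with the small absolute errors in Table 3.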
Deep Learning Network Model Estimated Podocyte Volume Densities Are Similar to Manual Ground-Truth
The network-estimated and ground-truth podocyte D values, CFs, and podocyte volume densities are shown in Table 4. For the full table, see Supplemental Table 5. The CNN and GAN both generated podocyte volume densities with low absolute residual errors (0.47±0.49 and 0.49±0.53 podocytes/10⁴ µm³, respectively) compared with the manual ground-truth (Figure 4), confirming both networks generated similar results. Moreover, Figure 4 demonstrates the high signal-to-noise ratio of our proposed methods. Additionally, for all species, in aggregate, the CNN (r=0.82, P<0.001) and GAN (r=0.68, P<0.001) demonstrated strong and moderate correlations with the ground-truth, respectively.
Table 4.
Species | Disease State | Ground-Truth (manual counts, IF/IHC): D (µm) | Ground-Truth: CF | Ground-Truth: PD (n/10⁴ µm³) | CNN (PAS): D (µm) | CNN: CF | CNN: PD (n/10⁴ µm³) | pix2pix (PAS): D (µm) | pix2pix: CF | pix2pix: PD (n/10⁴ µm³) |
---|---|---|---|---|---|---|---|---|---|---|
Mouse | Control | 6.43 | 0.32 | 2.42 | 6.84 | 0.30 | 3.24 | 8.19 | 0.27 | 1.81 |
Mouse | NTN | 7.08 | 0.30 | 1.47 | 6.19 | 0.33 | 1.93 | 8.12 | 0.27 | 1.06 |
Rat | Control | 7.95 | 0.20 | 0.97 | 7.17 | 0.22 | 1.37 | 7.96 | 0.20 | 0.95 |
Rat | PAN | 8.37 | 0.19 | 0.98 | 7.20 | 0.22 | 1.43 | 7.47 | 0.21 | 0.95 |
Human | Biopsy | 7.96 | 0.20 | 1.84 | 8.25 | 0.20 | 2.05 | 9.25 | 0.18 | 1.72 |
Human | Autopsy | 9.64 | 0.29 | 0.56 | 10.12 | 0.28 | 0.60 | 11.53 | 0.26 | 0.47 |
The absolute error in CF values on the basis of true nuclear D extracted from PAS and IF/IHC images are shown. The results indicate that both networks generate podocyte volume densities that are similar to the ground-truth. PD, podocyte volume density; NTN, nephrotoxic nephritis.
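As an illustration of the Bland–Altman quantities (bias and 95% limits of agreement), the sketch below uses the six group-level podocyte volume densities listed in Table 4. The study's plots are based on finer-grained data, so these numbers only demonstrate the calculation; the function name is hypothetical.

```python
import math


def bland_altman(reference, estimate):
    """Return (bias, lower limit, upper limit) of agreement for paired values."""
    diffs = [e - r for r, e in zip(reference, estimate)]
    bias = sum(diffs) / len(diffs)
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (len(diffs) - 1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Podocyte volume densities (n per 10^4 cubic micrometers) from Table 4.
ground_truth = [2.42, 1.47, 0.97, 0.98, 1.84, 0.56]
cnn = [3.24, 1.93, 1.37, 1.43, 2.05, 0.60]
pix2pix = [1.81, 1.06, 0.95, 0.95, 1.72, 0.47]

cnn_bias, cnn_low, cnn_high = bland_altman(ground_truth, cnn)
gan_bias, gan_low, gan_high = bland_altman(ground_truth, pix2pix)
print(round(cnn_bias, 2), round(gan_bias, 2))
```

On these group means the CNN bias is positive (about +0.40) and the pix2pix bias is negative (about -0.21), reflecting that the tabulated CNN densities sit above the ground-truth and the pix2pix densities sit below it.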
Network Performance Remains Unaffected by the Number of Podocytes and Improves with Increase In Glomerular Profiles per WSI
Supplemental Figure 4A shows the Dice coefficient plotted against the average ground-truth podocyte nuclei count per glomerulus in a WSI, which varies from six to 38 podocyte nuclei per glomerulus. The Dice coefficient shown is the average over all of the glomeruli in a WSI. Supplemental Figure 4B shows the same performance metric, averaged over all of the glomeruli in a WSI, with respect to the number of glomerular profiles per WSI, which varies from four to 56 glomeruli per WSI.
Our results indicated that the performance of the network remained unaffected (R²=0.01 with P=0.49) by the number of podocytes per glomerulus, but improved with an increase in the number of glomeruli per biopsy (R²=0.11 with P=0.05). Our data contained WSIs with glomerular profiles in the range of 4–56, and for each WSI the Dice coefficients were reasonable. This result suggests it is theoretically possible to set a desired performance with statistical significance with respect to the total number of glomeruli present in a WSI, or for a clinical assessment using our method involving multiple WSIs.
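The two quantities used in this analysis can be sketched as follows: the Dice coefficient between a predicted and a ground-truth binary mask, and the coefficient of determination of a simple linear fit. Treating R² as the squared Pearson correlation of a one-variable linear regression is an assumption, and the function names and toy inputs are hypothetical.

```python
def dice_coefficient(truth, predicted):
    """Dice = 2 * |A and B| / (|A| + |B|) for flat binary masks."""
    intersection = sum(1 for t, p in zip(truth, predicted) if t and p)
    total = sum(1 for t in truth if t) + sum(1 for p in predicted if p)
    return 2.0 * intersection / total if total else 1.0


def r_squared(x, y):
    """Squared Pearson correlation, equal to R^2 of a simple linear fit."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Toy masks: one overlapping pixel out of two in each mask.
dice = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])

# Toy per-WSI values: glomeruli per WSI versus average Dice (illustrative only).
glomeruli_per_wsi = [4, 10, 20, 35, 56]
average_dice = [0.50, 0.58, 0.55, 0.62, 0.64]
r2 = r_squared(glomeruli_per_wsi, average_dice)
print(dice, round(r2, 2))
```

A per-WSI value as plotted in Supplemental Figure 4 would be the mean of `dice_coefficient` over all glomeruli in that WSI.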
Network Performance with Respect to Preanalytical Errors Highlights the Importance of Quality Control
The network demonstrated an average Dice coefficient value of 0.56±0.14 (with a minimum of 0.28 and a maximum of 0.84) in these image patches. The high variation in Dice coefficients demonstrates that the network performance fluctuates with varying types of artifacts, highlighting the importance of quality control before analysis.
Network Performance for Central Versus Peripheral Podocytes
In detecting peripheral podocyte nuclei, the three annotators (A1, A2, and A3) achieved sensitivity/specificity of 0.88/0.97, 0.84/0.97, and 0.82/0.98, respectively, whereas our computational tool achieved sensitivity/specificity values of 0.93/0.96. The Bland–Altman plot further confirmed this finding (Figure 5). Furthermore, Figure 5 shows that pix2pix displays a smaller residual error of podocyte volume density than for human annotations. In contrast, for central podocyte nuclei, annotators A1, A2, and A3 achieved sensitivity/specificity values of 0.24/0.99, 0.41/0.98, and 0.42/0.99, respectively, whereas the computational tool achieved sensitivity/specificity of 0.86/0.98. Additionally, on average, compared with the ground-truth, the annotators displayed substantial agreement in identifying peripheral, central, and nonpodocyte nuclei, with κ = 0.78 and 95% confidence interval (CI), 0.75 to 0.81, whereas the computational tool offered a near-perfect agreement, with κ = 0.85 and 95% CI, 0.83 to 0.88. As further validation, the conditional probability of class assignment with respect to the ground-truth was computed for the annotators and compared with that of the computational approach (Supplemental Table 6). The computational approach demonstrated the highest conditional probabilities for accurately identifying central and peripheral podocyte nuclei, in comparison with the expert annotators.
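The conditional probability of class assignment can be read as a row-normalized confusion matrix: for each ground-truth class, the fraction of nuclei assigned to each class. The sketch below uses hypothetical labels chosen to mimic an annotator who misses most central podocytes; the function name is also an assumption.

```python
def conditional_probabilities(truth, assigned, classes):
    """P(assigned class | ground-truth class) for every pair of classes."""
    table = {}
    for true_class in classes:
        idx = [i for i, t in enumerate(truth) if t == true_class]
        table[true_class] = {
            c: sum(1 for i in idx if assigned[i] == c) / len(idx)
            for c in classes
        } if idx else {}
    return table


classes = ["peripheral", "central", "nonpodocyte"]
truth = ["peripheral"] * 4 + ["central"] * 4 + ["nonpodocyte"] * 2
# Hypothetical annotator: good on peripheral nuclei, poor on central nuclei.
assigned = (["peripheral"] * 3 + ["nonpodocyte"]
            + ["central"] + ["nonpodocyte"] * 3
            + ["nonpodocyte"] * 2)

probs = conditional_probabilities(truth, assigned, classes)
print(probs["peripheral"]["peripheral"], probs["central"]["central"])
```

The diagonal entries correspond to per-class sensitivities, so a low value for the central class reproduces the pattern described for the human annotators.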
Networks Identify Relative Podocyte Loss in Kidney Disease Models
The STZ- and NTS-treated mice showed a significant (P<0.001) reduction in the podocyte volume densities (Figure 6), compared with controls, in the ground-truth IF/IHC images quantified by manual counting. The deep learning networks also detected a significant podocyte depletion (P<0.001) compared with control. The PAN-treated rats did not show a significant podocyte depletion (P>0.05) compared with their controls in the respective ground-truth IF/IHC images, and thus a statistically significant difference was not detected using the computational models.
Podocyte Nuclear Morphometrics Improve Classifier Performance in Distinguishing Control from Diseased Glomeruli
Classifier NB1, which used only the glomerular area feature to classify control versus diseased glomeruli, offered a slight improvement over a baseline classifier that assigns classes at random (Table 5). The best classification performance was obtained in the NTS-treated model, showing a classifier performance improvement, with respect to baseline, of 1.27-fold. This result indicates the glomerular area is informative in predicting the disease state. Furthermore, NB2, NB3, and NB4 showed a significant improvement in testing accuracies compared with NB1, indicating that podocyte nuclear morphometrics enhanced the classifiers’ predictive performance in identifying glomeruli from diseased kidneys. The best classification performance was again observed in the NTS-treated model using computationally estimated podocyte nuclei, with a classifier performance improvement, with respect to baseline, of 1.42-fold (Table 5).
Table 5.
Dataset | NB1 Trn. Acc. | NB1 Tst. Acc. | NB2 Trn. Acc. | NB2 Tst. Acc. | NB3 Trn. Acc. | NB3 Tst. Acc. | NB4 Trn. Acc. | NB4 Tst. Acc. |
---|---|---|---|---|---|---|---|---|
STZ mouse WT1 | 0.62 | 0.58 | 0.73 | 0.68ᵃ | 0.68 | 0.70ᵃ | 0.69 | 0.62ᵃ |
Baseline | 0.49 | 0.52 | 0.49 | 0.52 | 0.49 | 0.52 | 0.49 | 0.52 |
NTN mouse p57 | 0.65 | 0.66 | 0.70 | 0.67ᵃ | 0.72 | 0.74ᵃ | 0.63 | 0.69ᵃ |
Baseline | 0.53 | 0.52 | 0.53 | 0.52 | 0.53 | 0.52 | 0.53 | 0.52 |
PAN rat WT1 | 0.48 | 0.50 | 0.60 | 0.55ᵃ | 0.60 | 0.56ᵃ | 0.55 | 0.59ᵃ |
Baseline | 0.48 | 0.48 | 0.48 | 0.48 | 0.48 | 0.48 | 0.48 | 0.48 |
PAN rat p57 | 0.55 | 0.57 | 0.60 | 0.59ᵃ | 0.66 | 0.59ᵃ | 0.64 | 0.59ᵃ |
Baseline | 0.50 | 0.55 | 0.50 | 0.55 | 0.50 | 0.55 | 0.50 | 0.55 |
The performance of the NB classifier with (NB2, NB3, and NB4) and without (NB1) podocyte morphometric features is shown here. NB1 indicates the classifier trained using glomerular area alone, NB2 indicates the classifier trained using glomerular area and podocyte (ground-truth) morphologic features, NB3 indicates the classifier trained using glomerular area and podocyte (identified by CNN) morphologic features, and lastly, NB4 indicates the classifier trained using glomerular area and podocyte (identified by GAN) morphologic features. Tst. Acc., testing accuracy; Trn. Acc., training accuracy; NTN, nephrotoxic nephritis.
ᵃClassifiers displaying an increase in testing accuracies with the addition of podocyte morphometrics.
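The fold improvements quoted in the text can be recovered from Table 5 if they are taken as the ratio of a classifier's testing accuracy to the baseline testing accuracy. That reading is an assumption, although it matches the reported 1.27-fold and 1.42-fold values for the NTN mouse p57 dataset; the function name is hypothetical.

```python
def fold_improvement(test_accuracy, baseline_accuracy):
    """Ratio of a classifier's testing accuracy to the random baseline."""
    return test_accuracy / baseline_accuracy


# NTN mouse p57 dataset, testing accuracies from Table 5 (baseline 0.52).
nb1_fold = fold_improvement(0.66, 0.52)   # glomerular area only
nb3_fold = fold_improvement(0.74, 0.52)   # area plus CNN-podocyte features
print(round(nb1_fold, 2), round(nb3_fold, 2))
```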
Cloud-Based Plugin Enables Easy Access to PodoSighter Pipelines
The PodoSighter plugin layout is shown in Figure 7 along with a representative PAS image of a human biopsy. The predicted podocyte nuclei by one of the networks are highlighted in green.
Custom-Designed In Silico Ground-Truth Images Enhance pix2pix Performance
We found that employing the in silico IF images as domain B for training the GAN generated a sensitivity/specificity of 0.80/0.80, whereas the IF/IHC based ground-truth images generated sensitivity/specificity of 0.75/0.90. Thus, the in silico IF images were chosen as the optimal domain B for training the GAN-based computational model. These results indicate that the GAN may be sensitive to signal variations in IF images. Thus, preprocessing them to eliminate such variations led to an improvement in the network (for the case of in silico image-based ground-truths), potentially due to the elimination of stain-related variations in the training set, such as varying signals of antibodies (Figure 8).
Effects of Pixel-Wise Class Imbalance
The pixel-wise distribution of podocyte nuclei in all datasets revealed that the GAN model performed better in uniformly distributed datasets (STZ-treated model). However, it underperformed in datasets with a skewed distribution (NTS-treated model) or with large SDs in the percentage of podocyte nuclei pixels per glomerulus (human WT1-labeled biopsy cases and autopsy cases) (Supplemental Figure 6).
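The class-imbalance measure discussed here, the percentage of podocyte nuclear pixels per glomerulus and its spread across a dataset, can be sketched as below. The masks are hypothetical flat binary lists (1 = podocyte nucleus pixel), constructed so that one dataset is uniform and the other is skewed; the function name is an assumption.

```python
import statistics


def podocyte_pixel_percentages(masks):
    """Percentage of podocyte-nucleus pixels in each glomerular mask."""
    return [100.0 * sum(mask) / len(mask) for mask in masks]


# Three glomeruli of 100 pixels each per toy dataset.
uniform_masks = [[1] * 5 + [0] * 95, [1] * 6 + [0] * 94, [1] * 4 + [0] * 96]
skewed_masks = [[1] * 1 + [0] * 99, [1] * 2 + [0] * 98, [1] * 15 + [0] * 85]

uniform = podocyte_pixel_percentages(uniform_masks)
skewed = podocyte_pixel_percentages(skewed_masks)

# Similar mean percentages, but a much larger SD in the skewed dataset.
print(statistics.mean(uniform), statistics.stdev(uniform))
print(statistics.mean(skewed), round(statistics.stdev(skewed), 2))
```

A summary of this kind could serve as a quick screen before deciding which of the two pipelines is better suited to a new dataset.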
Discussion
The podocyte-depletion hypothesis in progressive glomerular diseases27 has led to the realization that quantifying podocytes could provide valuable prognostic information. The subsequent realization that the podocyte area densities were not informative enough to identify the pathophysiology behind podocyte depletion led to the establishment of podocyte volume densities as the gold standard for quantifying podocyte depletion.10
Several approaches have been proposed for quantifying podocytes. The single FFPE section method reported by Venkatareddy et al.11 uses regular 2–5 µm histologic sections to estimate the podocyte volume density. The thin and thick section method28 uses two histologic sections from which the podocyte area density is obtained independently. Subsequently, the differences in podocyte number are related to the difference in section thickness to derive the podocyte volume density. A more recent approach, as reported by Zimmerman et al.,29 uses a deep learning–based workflow to perform podocyte morphometric profiling. Another recent approach by Schaub et al.30 uses high-resolution optical scanning images to extract the podocyte volume densities. However, all of the aforementioned approaches use antibody-based staining protocols to identify podocytes, which are not part of the clinical routine. Although the quantification of podocytes in immunostained images may be a more direct approach, the use of additional staining is not always possible, especially in the research and clinical domains, where the availability of human tissues is limited. One such direct approach is demonstrated in a companion work from our group by Santo et al.31 (preprint), wherein standard morphologic processing is applied to IHC-stained renal sections to automatically quantify p57-positive nuclei from various murine models of kidney diseases. Furthermore, to the best of our knowledge, no comprehensive automated pipeline currently exists for whole-slide–based (1) label-free identification of podocyte nuclei, (2) automated quantification of the nuclear caliper diameters, and (3) estimation of the podocyte volume densities per WSI.
In this study, we introduce PodoSighter, a framework encompassing two separate deep learning pipelines: (1) CNN-based training (using Deeplab V3+ network32), and (2) GAN-based training (using pix2pix conditional GAN20), to automatically detect and quantify podocyte nuclei in standard brightfield WSIs of renal tissue. To demonstrate the robustness of our pipeline, we used a diverse dataset encompassing multiple species and obtained from multiple laboratories, which included rodent disease models such as DKD, crescentic GN, and dose-dependent direct podocyte toxicity and depletion, wherein podocyte damage is reported, and human SRNS and autopsy. This large and diverse range of preanalytic variation likely makes our trained machine learning models robust for podocyte detection and quantification.
Additionally, the volume of podocyte nuclei can be estimated on the basis of the geometrical assumptions mentioned by Venkatareddy et al.,11 namely that podocyte nuclei are ellipsoidal, with a shape factor of k=0.72 and spheroid geometry parameter of α=0.80. Thus, with α being the ratio of the equatorial axis (A) to the radius of an equivalent sphere (Asphere), one can approximate the podocyte nuclei volume by simply using the equation for volume estimation of a spheroid, where A is equivalent to D/2, and D is the true podocyte nuclear caliper diameter. Because the D values are already being calculated by the pipeline, the volume of the nucleus would be highly correlated with the D value.
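A rough numerical reading of this paragraph is sketched below. It rests on an interpretation that is not spelled out in the text: that the "equivalent sphere" is the sphere with the same volume as the nucleus, so that its radius is A/α with A = D/2 and α = 0.80, and the nuclear volume is the volume of that sphere. The function name and the example diameters are hypothetical.

```python
import math

ALPHA = 0.80  # spheroid geometry parameter quoted in the text


def nuclear_volume_um3(true_caliper_diameter_um, alpha=ALPHA):
    """Volume of the assumed equal-volume sphere with radius (D / 2) / alpha."""
    equatorial_axis = true_caliper_diameter_um / 2.0   # A = D / 2
    sphere_radius = equatorial_axis / alpha             # A_sphere = A / alpha
    return (4.0 / 3.0) * math.pi * sphere_radius ** 3


volume_7 = nuclear_volume_um3(7.0)
volume_8 = nuclear_volume_um3(8.0)
print(round(volume_7, 1), round(volume_8, 1))
```

Under this reading the volume scales with the cube of D, which is one way to see why the estimated volume would track the D values already reported by the pipeline.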
To generate the training set for PodoSighter, proper tissue staining and image registration were crucial. Some tissue sections were initially stained with PAS, and subsequently with IHC markers (not vice versa). This staining sequence generates a cleaner PAS image, enabling more accurate identification of podocytes directly from the stained images, which is closer to the intended application on normal PAS slides. Further, among the IHC markers, although the p57 marker generally is expressed only by podocytes, in our datasets, this marker stained a few PECs. However, such false-positive staining was rare. Regarding the image registration, because our dataset was generated with careful quality control in tissue staining and imaging, the registration process was relatively straightforward. Note that registration is needed only for training set generation and does not affect the automated aspect of the proposed pipeline in detecting podocyte nuclei.
Deep convolutional neural networks have been widely implemented33–35 in the field of histopathology. Similarly, generative models have lately seen a surge in interest in all domains of science, offering new opportunities to generate realistic synthetic images. Our CNN-based pipeline performed uniformly well, regardless of disease state, podocyte nuclei marker, or species, whereas our GAN-based pipeline displayed varying performance with different datasets. We believe the crucial capability, being able to adjust the label weights, provided the CNN with an added advantage over the GAN method, especially for a task involving a large class imbalance. This observation is further reflected in Supplemental Figure 6, suggesting that the GAN performs better than the CNN in datasets with uniform distributions of podocyte pixels with respect to the glomeruli. Our CNN-based pipeline, if implemented in clinical practice, could improve the speed and accuracy of podocyte detection. The GAN-based model could enhance research studies requiring highly accurate identification of podocytes in carefully curated datasets that have minimal variations in podocyte intensities or pixels (i.e., the number of pixels in the image occupied by podocytes).
In our GAN-based pipeline, experiments to identify the ideal domain B for training the network revealed that the GAN models perform better with in silico IF images demarcating the podocyte structures only, and not with the IF/IHC gray-scale intensity images. The improvement was potentially due to the elimination of staining-related artifacts in in silico images. Furthermore, this technique of using in silico images also allows combining multiple datasets for the training, regardless of the imaging system or the chromogen used to stain podocytes. Additionally, the in silico IF images were designed to depict capillary boundaries and tubules in different color contrasts (than podocytes), allowing the GAN system to learn better from the increased contrast in the domain B images.
Podocytes are very difficult to manually annotate in histologically stained WSIs, often because they are difficult to register with respect to the capillary loops in an image. Podocytes in the middle of a two-dimensional glomerular projection are the most difficult to annotate because they are frequently indistinguishable from other resident glomerular cell types. We have shown this trend quantitatively in comparisons of manual annotators’ performance with respect to the ground-truth (see Figure 5, Supplemental Figure 5, Supplemental Table 6, and the section Network Performance for Central Versus Peripheral Podocytes). Additionally, this complexity in identifying podocytes has prevented the development of automated approaches for detecting podocytes in standard brightfield images. As a superior alternative, our automated computational tool exploiting the IF/IHC ground-truth labels in our training dataset can detect podocyte nuclei directly from PAS-stained brightfield images, regardless of their locations within the glomerulus.
Additionally, our framework quantifies podocyte morphometric variations, which proved to be informative in identifying diseased glomeruli. However, the highest testing accuracy achieved by our NB classifier was 0.74. Furthermore, both the CNN and GAN models detect some false-positive podocytes, which limited their overall precision values in podocyte detection to approximately 0.54 and 0.36, respectively. Incorporating a larger training set, with a larger diversity in disease models, is expected to address this issue. Nevertheless, our proposed framework is a robust and reproducible alternative for automated podocyte detection from PAS WSIs.
Finally, we incorporated our developed network models as end-user web plugins in a cloud-based application, a user-friendly and turnkey solution that eliminates the need to download additional software and concerns about operating system compatibility. Developers can use the shared Docker image to implement our tool in their own servers and can customize the tool further using enhanced training sets and other end-user specific options.
Disclosures
A.Z. Rosenberg reports receiving research funding from the National Institutes of Health and the National Kidney Foundation; reports receiving honoraria from Georgetown University, Ichilov Hospital (Tel Aviv, Israel), and Stony Brook University; and reports being a scientific advisor or member of Escala. D. Manthey reports being employed by Kitware Inc. F. Thaiss reports having consultancy agreements with B. Braun Viamedis, Novartis, Sanofi, and Union Chimique Belge; reports receiving honoraria from Alexion, Bristol Myers Squibb, Chiesi, Hexal, Novartis, Pfizer, and Sanofi; and reports being a scientific advisor or member of Novartis and Sanofi. J.E. Tomaszewski reports having an ownership interest in AXA, General Electric, and Neurovascular Diagnostics, Inc.; reports receiving honoraria from the American College of Veterinary Pathologists, Dakota Cancer Collaborative on Translational Activity, and the University of Washington; reports being a scientific advisor or member of Neurovascular Diagnostics, Inc.; and reports having other interests/relationships with the Board of Directors, Kidney Precision Medicine Project (National Institute of Diabetes and Digestive and Kidney Diseases [NIDDK]) external evaluation panel through May 2021, National Kidney Foundation, SPIE (the Society of Photo-Optical Instrumentation Engineers) Medical Imaging, Western New York Chapter, and the Editorial Boards of SPIE Medical Imaging; Dakota Cancer Collaborative on Translational Activity, External Evaluation Panel, and the Journal of Pathology Informatics. J.U. Becker reports having consultancy agreements with Sanofi. P.F. Hoyer reports having consultancy agreements with Boehringer Ingelheim; and reports being a scientific advisor or member of the Archives of Disease in Childhood. P.L. Tharaux reports having consultancy agreements with and receiving honoraria from Travere Therapeutics; and reports being a scientific advisor or member as an Advisory Board Member for Nature Reviews Nephrology, Associate Editor for Kidney International, the French National Institute for Medical Research, the French Society of Cardiology, and the French Society of Hypertension. P. Sarder reports receiving research funding from the CKD Biomarker Consortium, Clinical and Translational Science Institute at the University at Buffalo, Kidney Precision Medicine Project, NIDDK, the State University of New York, and the University at Buffalo; and reports being a scientific advisor or member as Associate Editor of PLoS One, and Editorial Board Member of the Journal of the American Society of Nephrology. All remaining authors have nothing to disclose.
Funding
This work was supported by NIDDK grant R01 DK114485, National Institutes of Health OD (Office of Director) grants R01 DK114485 02S1 and R01 DK114485 03S1, NIDDK CKD Biomarker Consortium grant U01 DK103225, NIDDK Kidney Precision Medicine Project grant U2C DK114886, and the Deutsche Forschungsgemeinschaft (BE-3801 to J.U. Becker).
Acknowledgments
D. Govind designed and implemented the computational algorithm and wrote the manuscript; J.U. Becker, J. Dang, P.F. Hoyer, A.Z. Rosenberg, F. Thaiss, P.L. Tharaux, and R. Yacoub generated the data for the study; J.U. Becker, K.-Y. Jen, A.Z. Rosenberg, P.L. Tharaux, J.E. Tomaszewski, and R. Yacoub assisted in manuscript preparation; K.-Y. Jen, A.Z. Rosenberg, and V. Walavalkar were the expert renal pathologists who performed manual annotations of podocyte nuclei; J. Miecznikowski assisted with the statistical analysis; I. Mohammad and A.M. Worral performed the staining and digitization of the tissue sections; B. Lutnick and D. Manthey assisted with the HistomicsUI plugin development; J.U. Becker contributed to the concept of 3D podometrics; P. Sarder conceived the overall research scheme, coordinated the study team, conceptualized the overall computational study design and the statistical performance analysis, supervised the computational implementation, critically analyzed the results, and assisted in manuscript preparation. We acknowledge the assistance of the Multispectral Imaging Suite and Histology Core Laboratory in the Department of Pathology and Anatomical Sciences, Jacobs School, University at Buffalo. We thank Prof. James Ballard for detailed scientific editing of the work. We also acknowledge the anonymous reviewers for providing valuable information that improved the quality of the paper.
Footnotes
Published online ahead of print. Publication date available at www.jasn.org.
Data Availability Statement
Relevant data for our study (WSIs and trained models) can be found at https://bit.ly/3e6XZzs.
Code Availability
Our code for the PodoSighter pipeline can be found at https://github.com/SarderLab/PodoSighter, and the PodoSighter plugin can be found at https://github.com/SarderLab/HistomicsTK_PodoSighter. To convert xml formats into binary masks, we used the py-wsi package, available at https://github.com/ysbecca/py-wsi. The Docker image of the HistomicsUI web plugin is available at https://bit.ly/3e6XZzs for end-users to establish the tool on their own servers and conduct further customization. The automated pipeline developed for caliper diameter estimation directly from immunostained images is also available at https://bit.ly/3e6XZzs.
Supplemental Material
This article contains the following supplemental material online at http://jasn.asnjournals.org/lookup/suppl/doi:10.1681/ASN.2021050630/-/DCSupplemental.
Supplemental Figure 1. Data description.
Supplemental Figure 2. Generation of in silico IF images for training the pix2pix network.
Supplemental Figure 3. Automation of Wiggins method for estimating podocyte nuclear caliper diameter.
Supplemental Figure 4. Performance of the proposed deep learning network model in detecting and segmenting podocyte nuclei in human biopsy WSIs with respect to the number of podocyte nuclei per glomerulus and number of glomerular profiles per WSI.
Supplemental Figure 5. Comparison of pix2pix results with manual annotation of podocytes.
Supplemental Figure 6. Pixel-wise class imbalance of training data.
Supplemental Table 1. STZ dose per mouse for DKD data.
Supplemental Table 2. Extracted features per PAS image patch from podocyte nuclei (as identified by ground-truth, CNN, and pix2pix) and glomeruli, averaged over each dataset, per disease state.
Supplemental Table 3. Comparison of d, D, and CF values extracted from IF and PAS images.
Supplemental Table 4. Comparison of D and CF values extracted from mouse, rat, and human IF/IHC ground-truth images with values reported by Venkatareddy et al., JASN, 2014.
Supplemental Table 5. Comparison of d, D, and CF values, and podocyte volume density estimates extracted from ground-truth IF/IHC images and computationally predicted podocyte nuclei in PAS images from the hold-out WSIs.
Supplemental Table 6. Comparison of conditional probability of class assignment for central and peripheral podocyte nuclei, and nonpodocyte glomerular nuclei between the expert annotators and the GAN.
References
- 1. Yu D, Petermann A, Kunter U, Rong S, Shankland SJ, Floege J: Urinary podocyte loss is a more specific marker of ongoing glomerular damage than proteinuria. J Am Soc Nephrol 16: 1733–1741, 2005
- 2. Nagata M: Podocyte injury and its consequences. Kidney Int 89: 1221–1230, 2016
- 3. Lin JS, Susztak K: Podocytes: The weakest link in diabetic kidney disease? Curr Diab Rep 16: 45, 2016
- 4. Miyauchi M, Toyoda M, Kobayashi K, Abe M, Kobayashi T, Kato M, et al.: Hypertrophy and loss of podocytes in diabetic nephropathy. Intern Med 48: 1615–1620, 2009
- 5. Henique C, Bollée G, Loyer X, Grahammer F, Dhaun N, Camus M, et al.: Genetic and pharmacological inhibition of microRNA-92a maintains podocyte cell cycle quiescence and limits crescentic glomerulonephritis. Nat Commun 8: 1829, 2017
- 6. Meyer-Schwesinger C, Lange C, Bröcker V, Agustian P, Lehmann U, Raabe A, et al.: Bone marrow-derived progenitor cells do not contribute to podocyte turnover in the puromycin aminoglycoside and renal ablation models in rats [published correction appears in Am J Pathol 179: 537, 2011]. Am J Pathol 178: 494–499, 2011
- 7. Kim YH, Goyal M, Kurnit D, Wharram B, Wiggins J, Holzman L, et al.: Podocyte depletion and glomerulosclerosis have a direct relationship in the PAN-treated rat. Kidney Int 60: 957–968, 2001
- 8. Machuca E, Benoit G, Antignac C: Genetics of nephrotic syndrome: Connecting molecular genetics to podocyte physiology. Hum Mol Genet 18[R2]: R185–R194, 2009
- 9. Bridges CR, Myers BD, Brenner BM, Deen WM: Glomerular charge alterations in human minimal change nephropathy. Kidney Int 22: 677–684, 1982
- 10. Lemley KV, Bertram JF, Nicholas SB, White K: Estimation of glomerular podocyte number: A selection of valid methods. J Am Soc Nephrol 24: 1193–1202, 2013
- 11. Venkatareddy M, Wang S, Yang Y, Patel S, Wickman L, Nishizono R, et al.: Estimating podocyte number and density using a single histologic section. J Am Soc Nephrol 25: 1118–1129, 2014
- 12. Wanner N, Hartleben B, Herbach N, Goedel M, Stickel N, Zeiser R, et al.: Unraveling the role of podocyte turnover in glomerular aging and injury. J Am Soc Nephrol 25: 707–716, 2014
- 13. Puelles VG, Bertram JF: Counting glomeruli and podocytes: Rationale and methodologies. Curr Opin Nephrol Hypertens 24: 224–230, 2015
- 14. Lutnick B, Ginley B, Govind D, McGarry SD, LaViolette PS, Yacoub R, et al.: An integrated iterative annotation technique for easing neural network training in medical image analysis. Nat Mach Intell 1: 112–119, 2019
- 15. Lutnick B, Ginley B, Govind D, McGarry SD, LaViolette PS, Yacoub R, et al.: An integrated iterative annotation technique for easing neural network training in medical image analysis. Nat Mach Intell 1: 112–119, 2019
- 16. Lutnick B, Manthey D, Sarder P: A tool for user friendly, cloud based, whole slide image segmentation. arXiv. 10.1111/j.1440-1797.2007.00796.x (Preprint posted January 18, 2021)
- 17. Tesch GH, Allen TJ: Rodent models of streptozotocin-induced diabetic nephropathy. Nephrology (Carlton) 12: 261–266, 2007
- 18. Chen LC, Zhu Y, Papandreou G, Schroff F, Adam H: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: Computer Vision–ECCV 2018. ECCV 2018. Lecture Notes in Computer Science, vol. 11211, edited by Ferrari V, Hebert M, Sminchisescu C, Weiss Y. Cham, Springer, 2018, pp 833–851
- 19. Chen L-C, Zhu Y, Papandreou G, Schroff F, Adam H, editors: Encoder-decoder with atrous separable convolution for semantic image segmentation. Proceedings of the European conference on computer vision (ECCV), Hawaii Convention Center, Honolulu, HI, July 21–26, 2017
- 20. Isola P, Zhu J-Y, Zhou T, Efros AA, editors: Image-to-image translation with conditional adversarial networks. Proceedings of the IEEE conference on computer vision and pattern recognition, Hawaii Convention Center, Honolulu, HI, July 21–26, 2017
- 21. Kabgani N, Grigoleit T, Schulte K, Sechi A, Sauer-Lehnen S, Tag C, et al.: Primary cultures of glomerular parietal epithelial cells or podocytes with proven origin. PLoS One 7: e34907, 2012
- 22. Giavarina D: Understanding Bland Altman analysis. Biochem Med (Zagreb) 25: 141–151, 2015
- 23. Shapiro SS, Wilk MB: An analysis of variance test for normality (complete samples). Biometrika 52: 591–611, 1965
- 24. Mann HB, Whitney DR: On a test of whether one of two random variables is stochastically larger than the other. Ann Math Stat 18: 50–60, 1947
- 25. Cohen J: Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit. Psychol Bull 70: 213–220, 1968
- 26. Gutman DA, Khalilia M, Lee S, Nalisnik M, Mullen Z, Beezley J, et al.: The digital slide archive: A software platform for management, integration, and analysis of histology for cancer research. Cancer Res 77: e75–e78, 2017
- 27. Wiggins R-C: The spectrum of podocytopathies: A unifying view of glomerular diseases. Kidney Int 71: 1205–1214, 2007
- 28. Sanden SK, Wiggins JE, Goyal M, Riggs LK, Wiggins RC: Evaluation of a thick and thin section method for estimation of podocyte number, glomerular volume, and glomerular volume per podocyte in rat kidney with Wilms’ tumor-1 protein used as a podocyte nuclear marker. J Am Soc Nephrol 14: 2484–2493, 2003
- 29. Zimmermann M, Klaus M, Wong MN, Thebille A-K, Gernhold L, Kuppe C, et al.: Deep learning-based molecular morphometrics for kidney biopsies. JCI Insight 6: 144779, 2021
- 30. Schaub JA, O’Connor CL, Shi J, Wiggins RC, Shedden K, Hodgin JB, et al.: Quantitative morphometrics reveals glomerular changes in patients with infrequent segmentally sclerosed glomeruli [published online ahead of print January 11, 2021]. J Clin Pathol 10.1136/jclinpath-2020-207149
- 31. Santo BA, Govind D, Daneshpajouhnejad P, Yang X, Wang XX, Myakala K, et al.: PodoCount: A robust, fully automated whole-slide podocyte quantification tool. bioRxiv. 10.1101/2021.04.27.441689 (Preprint posted April 28, 2021)
- 32. Chen L-C, Papandreou G, Kokkinos I, Murphy K, Yuille AL: Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans Pattern Anal Mach Intell 40: 834–848, 2018
- 33. Xing F, Xie Y, Yang L: An automatic learning-based framework for robust nucleus segmentation. IEEE Trans Med Imaging 35: 550–566, 2016
- 34. Van Valen DA, Kudo T, Lane KM, Macklin DN, Quach NT, DeFelice MM, et al.: Deep learning automates the quantitative analysis of individual cells in live-cell imaging experiments. PLOS Comput Biol 12: e1005177, 2016
- 35. Joon Ho D, Fu C, Salama P, Dunn KW, Delp EJ, editors: Nuclei segmentation of fluorescence microscopy images using three dimensional convolutional neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Hawaii Convention Center, Honolulu, HI, July 21–26, 2017
Data Availability Statement
Relevant data for our study (WSIs and trained models) can be found at https://bit.ly/3e6XZzs.