Journal of Pathology Informatics
. 2021 Nov 1;12:44. doi: 10.4103/2153-3539.330157


PMCID: PMC8609292
J Pathol Inform. 2021 Nov 1;12:44.

A New Resource for Pathology Informatics: The New Mexico Decedent Image Database

Shamsi Daneshvari Berry 1,2, Heather J H Edgar 3,4

Background: The New Mexico Decedent Image Database is a new and unique database comprising computed tomography (CT) scans and associated personal, health, lifestyle, and circumstance-of-death data. The database consists of 15,242 decedents whose deaths were investigated at the New Mexico Office of the Medical Investigator.

Methods: In 2014, a study determined the minimum data set to associate with these CT images using a modified Delphi method. Through an iterative process, researchers from a wide variety of fields (anthropology, medicine, forensics, informatics, epidemiology, biomedical research, and dentistry) selected variables that they believed to be essential to making the CT scans useful to researchers in multiple fields. In 2016, the National Institute of Justice awarded a grant (2016-DN-BX-0144) to create the CT database with associated lifestyle, health, and cause of death information. The data for all 69 variables derive from both the medical examiner’s database and phone interviews with next of kin.

Results: The sample includes decedents who died between 2010 and 2017 and accounts for approximately 11% of deaths in New Mexico. Over two-thirds of the scans have no discernible decomposition. Ten thousand decedents are male, 30% are Hispanic, and 13% are Native American. Natural causes of death and accidents account for 73% of the sample, with the remainder of deaths due to suicide, homicide, or unknown causes. The information available can differ greatly between individuals but can include variables such as education, occupations, habitual activities, number of children, country of origin for the decedent, parents, and grandparents, health history, medications, socioeconomic status, and medical diagnoses.

Conclusions: As of January 2021, there were 372 users of the database from 34 countries, with over 100 requesting access to the CT images. Reasons for accessing the database have included research, education, and art. Projects have included COVID-19 research, age estimation, cancer treatment, automobile safety, biomechanics, morphometric analyses, and informatics projects. The database is available at NMDID.UNM.EDU.

J Pathol Inform. 2021 Nov 1;12:44.

Multi-Biomarker Analysis for Hodgkin Lymphoma using Automatic Registration

Abubakr Shafique 1, Morteza Babaie 1, Adrian Batten 2, Soma Skdar 2, Hamid Tizhoosh 1

Background: Hodgkin lymphoma (HL) affects 8500 people annually in the United States, approximately 10.2% of all lymphoma cases.[1] An accurate and timely diagnosis of HL is critical for an appropriate treatment plan.[1] For accurate diagnosis, pathologists jointly analyze multiple protein expressions and tissue morphology using hematoxylin and eosin (H&E) staining as well as immunohistochemical (IHC) biomarkers including CD20, CD30, and Pax5. Currently, pathologists manually examine the co-localized areas across IHC and H&E slides for a final diagnosis,[2] which is a tedious and challenging task because of the morphological deformations introduced during slide preparation and the large variations in cell appearance and tissue morphology across different stains. Aligning all the IHC and H&E images is therefore important for efficient and accurate analysis.

Methods: We propose a two-step automatic feature-based cross-staining WSI registration to enable the localization of metastatic foci in the assessment of HL. In the first step, WSI pairs were aligned allowing for translation, rotation, and scaling using an affine transformation. The registration was performed automatically by first detecting landmarks in both source and target images using the scale-invariant feature transform (SIFT), then finding point correspondences with the fast sample consensus (FSC) protocol, and finally aligning the images.[3] In the second step, the rigidly aligned source and target images are vertically divided into three sections to distribute the landmarks evenly over the tissue region. These automatically detected landmarks are then used as the control points for non-rigid transformation using a local weighted mean (LWM) transformation.
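The pipeline above relies on SIFT landmark detection, FSC correspondence filtering, and an affine alignment before the LWM refinement. Below is a minimal sketch of the affine step only, written with OpenCV on downsampled grayscale thumbnails; RANSAC stands in for FSC (which OpenCV does not provide), and the non-rigid LWM step is omitted.

```python
import cv2
import numpy as np

def affine_register(source_gray: np.ndarray, target_gray: np.ndarray) -> np.ndarray:
    """Warp the source (IHC) thumbnail into the target (H&E) coordinate frame."""
    sift = cv2.SIFT_create()
    kp_s, des_s = sift.detectAndCompute(source_gray, None)
    kp_t, des_t = sift.detectAndCompute(target_gray, None)

    # Match descriptors and keep strong correspondences (Lowe's ratio test).
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des_s, des_t, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]
    src_pts = np.float32([kp_s[m.queryIdx].pt for m in good])
    dst_pts = np.float32([kp_t[m.trainIdx].pt for m in good])

    # Translation + rotation + scaling (partial affine), robustly estimated.
    M, _ = cv2.estimateAffinePartial2D(src_pts, dst_pts, method=cv2.RANSAC)
    h, w = target_gray.shape
    return cv2.warpAffine(source_gray, M, (w, h))
```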

Results: For evaluation, we used 47 pairs of images registered both automatically and manually to compare the performance of the proposed system. The images were acquired from the Grand River Hospital, Ontario, Canada. The specimens were taken from different parts of the body. The Jaccard similarity index was calculated for each manually and automatically registered pair to evaluate the proposed system. The average Jaccard similarity index across the 47 pairs was 0.92. A few registered samples from the data are shown in Figure 1.
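As a rough illustration of the evaluation metric, the sketch below computes the Jaccard similarity index on binary tissue masks of a registered pair; how the masks are obtained (e.g., by thresholding the registered images) is an assumption, not stated in the abstract.

```python
import numpy as np

def jaccard_index(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Jaccard similarity of two boolean tissue masks of equal shape."""
    intersection = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return float(intersection / union) if union else 1.0
```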

Figure 1. A few examples of registered biomarkers aligned to the target H&E image.

Conclusion: Multi-protein cross-stain analysis is pivotal for disease diagnosis such as HL for suitable prognosis. For this, we propose a fully automatic and accurate stain registration method to align IHC biomarkers with H&E for an accurate diagnosis.

References

1. Ansell SM. Hodgkin lymphoma: 2018 update on diagnosis, risk-stratification, and management. Am J Hematol. 2018;93:704-15. doi: 10.1002/ajh.25071.
2. Trahearn N, Epstein D, Cree I, Snead D, Rajpoot N. Hyper-stain inspector: A framework for robust registration and localised co-expression analysis of multiple whole-slide images of serial histology sections. Sci Rep. 2017;7:5641. doi: 10.1038/s41598-017-05511-w.
3. Wu Y, Ma W, Gong M, Su L, Jiao L. A novel point-matching algorithm based on fast sample consensus for image registration. IEEE Geosci Remote Sens Lett. 2015;12:43-7.
J Pathol Inform. 2021 Nov 1;12:44.

Interpreting Complex and Uncommon MNS Alleles from Whole Genomes

Justin B L Halls 1,2, Sunitha Vege 3, Judith Aeschlimann 3, Helen H Mah 1, Matthew S Lebo 1,2,4,5, Prathik K Vijay Kumar 5, Connie M Westhoff 3, William J Lane 1,2

Background: The MNS blood group system consists of three homologous genes: GYPA, GYPB, and GYPE. Many MNS alleles contain complex structural variations (SV) such as partial gene deletions and multi-step gene recombinations that form hybrid genes, which represent a challenge for the development of WGS genotyping algorithms. Here we performed WGS on nine established MNS samples exemplifying diverse types of MNS alleles: U+var and GYP hybrid series GYP (A-B), GYP (B-A), GYP (B-A-B), and GYP (B-E-B).

Methods: The MNS single nucleotide variations (SNV) and SV were identified using our bloodTyper software, with SV called using a combination of read depth, paired read, and split read interpretations. Analysis of interpretive gaps from these nine known samples was used to update bloodTyper, which was then used to call the MNS alleles in all 3,202 high-coverage whole genomes from the 1000 Genomes Project.

Results: U+var was shown to be expressed mostly as a hemizygous change trans to GYPB deletions, a finding confirmed in 25 known U+var samples. The analysis of the nine known samples also led to the description of unique breakpoints and characterization of three novel alleles: GYP*Hil.02, *JL.02, *JL.03, and confirmation of the recently described GYP*Bun.02. In addition, the GYP*JL.03 sample was identified to be compound heterozygous for the GYPA*01N allele, allowing for the first ever description of the exact breakpoint for this long established allele. Furthermore, the breakpoints for the GYP*Dantu (NE) were updated to include a region of GYPB exon 6 in a duplicated copy of GYPE. Analysis of the 1000 Genomes Project found GYP*Hil, *Sch (with three different breakpoints), *Dantu (NE) and several potentially novel alleles including two B-A hybrids, one E-A hybrid, and four complex SV likely representing several recombination events.

Conclusion: This work enhances characterization of WGS data within the MNS blood group system to include rare alleles, and further develops genomic analytical strategies and automated interpretation of blood group alleles.

J Pathol Inform. 2021 Nov 1;12:44.

Generalizability of Machine Learning Models for Autoverification of Mass Spectrometry Assays in the Clinical Laboratory

James H Harrison 1, Zacary Abushmaies 1

Background: Generalizable machine learning models retain utility across contexts such as location or task. We previously reported a support vector model for autoverification of tetrahydrocannabinol (THC) by mass spectrometry. We now assess the generalizability of this type of model to other mass spectrometry assays using cocaine/benzoylecgonine (CBE) as a test case.

Methods: Retrospective data from routine urine THC (n=1267) and CBE (n=982) analyses were exported from Agilent instrumentation (Santa Clara, CA, USA). Samples repeated based on operator judgement were the targets for identification by machine learning. Data analysis and machine learning used Python 3.8.3 (Anaconda, Inc., Austin, Texas, USA) with the Scikit-Learn library (v. 0.23.2). Run parameters were normalized to the middle calibrator, data were scaled to the interquartile range, and features were pruned based on correlation and importance ranking. The data for each assay were divided into training (67%) and test (33%) sets. The training data were concatenated to create an additional combined training/test data set (67%/33%). Separate support vector classifiers were built using each training set, and predictive performance was evaluated within and between the test data sets.
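A minimal sketch of this kind of preprocessing and training pipeline in scikit-learn is shown below; the file name, column names, and the use of RobustScaler to approximate interquartile-range scaling are assumptions, not the authors' exact implementation.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import RobustScaler
from sklearn.svm import SVC
from sklearn.metrics import classification_report

# Hypothetical export with one row per injection; "repeated" is the label
# derived from operator judgement. Column names are placeholders.
df = pd.read_csv("thc_runs.csv")
X, y = df.drop(columns=["repeated"]), df["repeated"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, stratify=y, random_state=0)

scaler = RobustScaler()  # centers on the median and scales by the IQR
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

clf = SVC(probability=True).fit(X_train_s, y_train)
print(classification_report(y_test, clf.predict(X_test_s)))
```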

Results: Ten percent of THC and 15% of CBE assays were repeated. Six features were highest ranked and used in the analysis: sample height/area, sample peak shape, sample qualifier ion 1 and 2 ratios, internal standard peak area ratio, and internal standard qualifier ion 1 ratio. Recall, precision, and ROC AUC of the THC model for repeated assays in the THC and CBE test data were 0.91/0.69/0.97 and 0.58/0.87/0.96, respectively. The CBE model yielded 0.94/0.68/0.98 (CBE data) and 0.98/0.50/0.93 (THC data). The model trained on combined data [Figure 1] yielded recall, precision, and AUC values of 0.94/0.64/0.97 (combined), 0.98/0.59/0.98 (THC), and 0.90/0.79/0.98 (CBE).

Figure 1. ROC curves for the combined SVM evaluated on the combined, CBE, and THC test data.

Conclusions: The CBE-trained model was generalizable to THC data, and the combined model was reasonably generalizable to both individual assays. The THC model was not generalizable to CBE data. These results suggest that common models may be possible for mass spectrometry autoverification if they are carefully constructed and validated.

J Pathol Inform. 2021 Nov 1;12:44.

Mycobacterial Detection and Localization on WSI Using Machine Learning

Chady Meroueh 1, Jun Jiang 2, Thomas Flotte 2

Introduction: Identification of mycobacteria in microscopic slides remains labor-intensive. Recently, object detection models (such as for mitosis detection) have been introduced into clinical practice. However, detection of mycobacteria could be problematic due to their smaller size and lack of defined structures. We aim to evaluate the feasibility of using machine learning for mycobacterial detection within histological slides.

Methods: We collected 138 AFB-stained slides from Mayo Clinic (2016-2018, 26 positive and 112 negative slides, 40x, Philips scanner). Five slides were annotated by 2 annotators with a Cohen’s kappa of 86.9% (consensus by a senior pathologist), with the remainder annotated by a single pathologist. The initial dataset was divided using a 70/15/15 ratio (D0, Train: {627 patches, 819 instances}, Validation: {27 patches, 19 instances}, held-out Test: {36 patches, 26 instances}; patch = 640x640 pixels). An out-of-the-box YOLOv5 algorithm was trained in a transfer learning manner with weights initialized from a publicly available model (YOLOv5x, ImageNet). Hyperparameter tuning with mosaic data augmentation was used to train multiple models.

Results: Using test-time image augmentation, an ensemble of 4 models (M1) showed a precision of 0.855 (recall = 0.621, F1 = 0.72). This ensemble was then used to detect mycobacteria on previously non-annotated patches (D1, Train: {531 patches, 1118 instances} + 504 negative patches). A single new model, M2 (dataset = D0+D1), showed improvement in recall (0.724) and F1 score (0.77). Furthermore, 13 negative test whole slides were inferenced through our models (M1 and M2). A slide is considered positive if a single mycobacterium is detected on any of its patches. M2 (precision = 0.43, recall = 0.75, time = 77 minutes, 1 T4 GPU) showed an improvement in all metrics compared to M1 (precision = 0.28, recall = 0.5, time = 250 minutes).
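The slide-level decision rule described above (a slide is positive if any patch yields at least one detection) can be expressed compactly. In the sketch below, detect_patch is a hypothetical callable standing in for the trained detector or ensemble.

```python
from typing import Callable, Iterable, List

def slide_is_positive(patches: Iterable, detect_patch: Callable[[object], List[dict]]) -> bool:
    """Return True if any 640x640 patch yields at least one mycobacterium detection.

    `detect_patch` is a placeholder for the trained YOLO-style model/ensemble and
    is assumed to return a (possibly empty) list of detections per patch.
    """
    return any(len(detect_patch(patch)) > 0 for patch in patches)
```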

Conclusion: In our preliminary testing, our trained model showed promising results for mycobacterial detection from our small dataset. This baseline model will prove extremely helpful for automatic annotation of numerous slides from multiple institutions, scanned on multiple scanners, in order to iteratively enrich our dataset for generalizability and real-world application.

J Pathol Inform. 2021 Nov 1;12:44.

A Statistically Robust Approach to Machine Learning for Model Development and Validation under the Constraint of Small Datasets

Cherub Kim 1, Zhen Zhang 1

Background: Despite the existence of a large amount of high-quality laboratory data on clinical analytes, most pathology datasets available for machine learning are annotated manually, which limits the sample size of available data for model development and validation. Machine learning models developed on small datasets may have high variance and poor generalizability in independent validation. We present a statistically principled workflow for machine learning model development for small-dataset problems.

Technology: A grid search combining cross-validation, bootstrapping, and multiple rounds of stratified splitting was used to select support vector machine parameters and analyte combinations. The Scikit-learn v. 0.24.0, Pandas v. 1.2.0, Matplotlib v. 3.3.3, and JupyterLab v. 3.0.0 libraries were used in Python 3.8 to train and evaluate the models.

Methods: We perform a full grid search over the range of model parameters and analyte combinations. At each grid point, the development set is bootstrapped ten times. Each bootstrap is stratified-split five times into train and test sets. The support vector machine and analyte combination specified by each grid point is trained and tested with each of these datasets, and the median and standard deviation of performance are calculated as an overall score for each grid point. The top-scoring parameter and variable combinations are used to train a five-member ensemble of support vector machines. These are evaluated with the initially left-out validation set. This method was tested on an n = 81 prostate cancer marker dataset with 12 different cancer markers (including PHI) labeled as aggressive versus non-aggressive prostate cancer. The performance of our model was compared to that of PHI by area under the curve, F1, and F2 scores using stratified five-fold cross-validation.
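A sketch of the bootstrap-plus-stratified-split scoring loop is shown below using scikit-learn; the parameter grid, split sizes, and the use of AUC as the score are illustrative assumptions rather than the authors' exact configuration.

```python
import numpy as np
from sklearn.model_selection import ParameterGrid, StratifiedShuffleSplit
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score
from sklearn.utils import resample

def score_grid(X, y, grid, n_boot=10, n_splits=5, seed=0):
    """Median/SD AUC per grid point over bootstrap x stratified-split resamples.

    X and y are NumPy arrays for the development set (validation set held out).
    """
    rng = np.random.RandomState(seed)
    results = []
    for params in ParameterGrid(grid):
        aucs = []
        for _ in range(n_boot):
            Xb, yb = resample(X, y, random_state=rng.randint(1 << 30))
            splitter = StratifiedShuffleSplit(n_splits=n_splits, test_size=0.3,
                                              random_state=rng.randint(1 << 30))
            for tr, te in splitter.split(Xb, yb):
                clf = SVC(probability=True, **params).fit(Xb[tr], yb[tr])
                aucs.append(roc_auc_score(yb[te], clf.predict_proba(Xb[te])[:, 1]))
        results.append((params, np.median(aucs), np.std(aucs)))
    return results

# Illustrative grid of SVM hyperparameters; analyte subsets would be added similarly.
grid = {"C": [0.1, 1, 10], "kernel": ["rbf", "linear"]}
```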

Results: Figure 1 depicts our parameter and variable selection process and result validation. Models developed using our method show good consistency (four of five splits) in outperforming PHI in F1 and F2 score.

Conclusions: We demonstrate a method for developing robust machine learning models given the small datasets that are common in pathology.

Figure 1. Dataflow diagram for selection of model parameters and variable combinations and five-fold cross-validation of the proposed method.

J Pathol Inform. 2021 Nov 1;12:44.

Automation of Proficiency Testing to Improve Adoption and Quality of Laboratory Testing

Oluwatobi O Ozoya 1, Philip R Foulis 1,2, George T Carlton 2, Steven J Agosti 2

Background: The College of American Pathologists (CAP) provides Proficiency Testing (PT) to clinical laboratories as one of several methods to assure quality in the interpretation and reporting of results. Pertinent challenges arise when incorporating proficiency testing materials into routine laboratory workflow. These challenges include a mismatch between the PT materials and the usual clinical orders or panels utilized. There is also increased demand on laboratory staff and time to order PT-specific tests and required analytes, resulting in disruption of workflow. Three areas were identified to improve our PT process: the need to track the personnel who assayed PT materials and their competency levels, reducing the labor-intensive process of ordering specific analytes, and eliminating transcription errors during submission of PT results.

Technology: Access (Microsoft, Redmond, Washington), VistA (Veterans Information Systems and Technology Architecture, Washington, DC), and the CAP portal (College of American Pathologists, Northfield, Illinois).

Methods: Proficiency testing surveys are defined in an Access database (Microsoft, Redmond, Washington). Each survey has a predefined set of analytes. The supervisory technologist selects the survey. Using the electronic health record (Veterans Information Systems and Technology Architecture, VistA, Washington, DC), the computer automatically orders the requisite tests. Labels are generated and affixed to the testing material, which is given to a medical technologist and processed. The technologist’s name is recorded, ensuring that testing material is randomly assigned across all individuals. After verification, the results are automatically uploaded into the CAP portal.

Results: We eliminated transcription errors, decreased supervisor overhead for PT oversight by 90%, and tracked the users who perform PT testing, thereby satisfying the regulatory requirement for competency assessment. We now have a database interfaced to our laboratory information system that automatically places orders based on the analytes for a specific PT instance. The new system closely mirrors patient testing by labeling PT material similarly to patient samples. The process now takes approximately 30 minutes and is fully automated. The automation of our PT program has drastically reduced the time required to administer, monitor, and report results.

Conclusion: Incorporating competency assessment into routine laboratory processing can be better performed through automated processes.

J Pathol Inform. 2021 Nov 1;12:44.

Pooled Laboratory Testing for COVID-19

Vandana Panwar 1, Christoph U Lehmann 2,3, Richard Healey 4, Andrew Quinn 1

Background: SARS-CoV-2 mass diagnostic screening is essential to disrupt its transmission and spread. Gold standard RT-qPCR testing capacity is limited, and the testing is time-consuming and resource-intensive. Pooled testing, or group testing, provides a solution. Pooled testing was first introduced in 1943 and has recently been applied in the context of COVID-19 screening. Here we describe a novel high-throughput workflow implemented in our laboratory to screen high volumes of specimens and discuss optimal sample size selection.

Methods: As depicted in the workflow diagram, the client portal generates orders with patient identifiers and transmits them to Epic Beaker. Human-readable (accession) and machine-readable (instrument) IDs are transmitted back to the client portal, which creates barcoded labels. Three robots pool specimens on 384-well plates containing RT-qPCR reagents and document the relationship of each pool to its individual specimens. RT-qPCR is performed by QuantStudio. If a pool tests positive or indeterminate, the constituent specimens are identified and run individually. We leveraged the formula below to determine the likelihood of a pool being positive and the optimal sample size for our pools.

P (probability that a pool is positive) = 1 − (1 − prevalence of infection) ^ (pooled sample size)
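For illustration, the sketch below implements this formula together with the expected number of PCR reactions per specimen (one pooled reaction per pool plus individual retests of positive pools), which can be minimized over candidate pool sizes; the prevalence value is a placeholder.

```python
def p_pool_positive(prevalence: float, pool_size: int) -> float:
    """Probability that a pool of `pool_size` specimens tests positive."""
    return 1 - (1 - prevalence) ** pool_size

def expected_tests_per_specimen(prevalence: float, pool_size: int) -> float:
    """One pooled reaction per pool, plus individual retests when the pool is positive."""
    return 1 / pool_size + p_pool_positive(prevalence, pool_size)

# Example: scan candidate pool sizes for a given prevalence (illustrative value).
prevalence = 0.01
best = min(range(2, 21), key=lambda n: expected_tests_per_specimen(prevalence, n))
print(best, round(expected_tests_per_specimen(prevalence, best), 3))
```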

Results: From January 1 to March 8, 2021, our laboratory processed 19,634 individual samples, averaging 59 pools/day (297 samples/day). 3-5% of pools per day test indeterminate or positive, requiring their individual samples to be retested. Using a pool size of 5, our laboratory can handle more than 6,000 pooled tests per day (30,000 individual samples/day). Assuming 5% of pools test positive, 1,500 samples would require individual retesting, for a total of 7,500 reactions to test 30,000 individual samples. Currently, pooling samples in batches of 5 provides us with savings of approximately 74% in PCR reactions per day.

Conclusions: Determining the “optimal pool size” is challenging, as it varies with infection prevalence, testing frequency, and assay sensitivity. Pool size is naturally limited because group testing that combines too many samples increases the risk of false negatives from low viral loads. Our table allows laboratories to determine the optimal pool size based on the prevalence of positive swabs in their community; it confirmed that a pool of 5 was optimal in our patient population.

J Pathol Inform. 2021 Nov 1;12:44.

An AI-Based Solution for Cancer Detection: First Deployment in Clinical Routine in a US Pathology Lab

Juan C Santa Rosario 1, Roei Harduf 2, Yuval Raz 2, Gev Decktor 2, Geraldine Sebag 2, Joseph Mossel 2

Background: Prostate cancer is a major cause of cancer-related deaths in men, with a complex diagnosis and insufficient diagnostic reproducibility, at a time of a growing shortage of pathologists. Thus, deployment of AI-based solutions that can accurately detect and grade cancer, as well as multiple additional features, can help support pathologists in their diagnostic tasks. CorePlus Servicios Clínicos y Patológicos in Puerto Rico, a leading pathology and clinical laboratory, handles 53,000 accessions annually, of which ~6.4% are prostate core needle biopsies (PCNBs), with ~46% diagnosed with cancer.

Methods: An AI-based solution (Galen Prostate) has been developed to identify tissue structures and morphological features within whole-slide images of prostate core needle biopsies, having been trained on >1M image samples from multiple institutes manually annotated by senior pathologists. The solution detects and grades prostate cancer in addition to other features, such as perineural invasion, high-grade PIN, and inflammation. At CorePlus, this AI-based solution was deployed and integrated into the digital pathology workflow as a quality control system. The system raises alerts for discrepancies between the algorithmic analysis and the pathologist, prompting a second human opinion.

Results: 101 retrospective prostate cases from CorePlus comprising a total of 1,279 H&E slides were analyzed by the algorithm, demonstrating high specificity and sensitivity for cancer detection (96.9% and 96.5%, respectively) and an AUC of 0.901 for differentiating low-grade (Gleason 6) from high-grade (Gleason 7+) cancer. Following 10 months of clinical use, >2,200 PCNB cases (>29,000 H&E slides) were processed, and the reports of 51 (2.25%) of these cases were revised as a result of alerts raised by the AI solution that were reviewed and accepted by pathologists.

Conclusions: An AI-based QC system is extremely useful for diagnostic accuracy and safety. To the best of our knowledge, this is the first AI-based digital pathology diagnostic system deployed in a US lab and used in routine clinical practice.

J Pathol Inform. 2021 Nov 1;12:44.

Forensic Teleneuropathology: Application of Telepathology for Neuropathologic Consultation in the Forensic Setting

Michelle Stram 1,2, Julian Samuel 1,2, Ben Criss 1,2, Sarah Thomas 1,2, Naeem Ullah 2, Chris Rainwater 2, Jason Graham 1,2, Rebecca Folkerth 1,2

Background: Access to neuropathologists (NPs) in the forensic pathology (FP) setting varies widely and can impact diagnostic adequacy as well as turn-around time. However, telepathology, as used for intraoperative diagnosis in centers without NP faculty, can be adapted for remote viewing of brain gross examinations (“brain cuttings”), using free software and equipment common to many FP offices, allowing real-time evaluation by an off-site NP.

Technology: A motorized photography copy stand with adjustable lights was fitted with a digital Canon EOS-80D camera with a fixed focal length lens. The camera was set up with an AC adapter for continuous usage and attached via USB to a networked computer running Windows 10. The free Canon EOS utility (v3.13.10) was run in remote shooting mode, providing a real-time display of the camera feed with options for 1x, 5x, and 10x digital zoom, manual and automatic focusing, adjustable exposure, and remote still image capture. The “virtual brain cuttings” were broadcast via the share screen function of Microsoft Teams.

Methods: The FP/NP fellows placed case specimens on the stand [Figure 1] and livestreamed images to the NP’s remote computer via Microsoft Teams. After discussion of the case history and differential diagnosis, specimen(s) were dissected according to standard protocols, with slabs again livestreamed to the NP. When desired, off-site FPs could also remotely participate in the consultation.

Figure 1. Forensic teleneuropathology setup for remote viewing of brain gross examination.

Results: Between 9/1/20 and 1/31/21, 165 cases were evaluated via teleneuropathology. We found the image resolution to be of high quality [Figure 2], allowing the NP to easily detect and point out subtle findings and make critical decisions on a case-by-case basis, including selection of histologic samples. When the FP participated, real-time discussions of specific diagnostic concerns enhanced the overall consultation.

Figure 2. Sample image from gross examination utilizing 10× magnification.

Conclusions: The application of telepathology for NP consultation can be adapted to FP offices, using technological infrastructure that may already be available, providing a high-quality mechanism for integrating consistent access to an offsite NP.

J Pathol Inform. 2021 Nov 1;12:44.

Inpatient Laboratory Test Utilization of Procalcitonin and C-reactive Protein during and before the COVID-19 Pandemic

Jayalakshmi Venkateswaran 1, Oluwaseyi Olayinka 1, Siaw Li Chan 1, Samuel P Barasch 1

Background: Procalcitonin (PCT) and C-reactive protein (CRP) laboratory testing were used as components of the practice recommended by clinical leadership to evaluate the severity and prognosis of patients admitted to the hospital for SARS-CoV-2 infection. Elevated serum PCT may be helpful in guiding antibiotic therapy for bacterial superinfection. Plasma CRP levels can also correlate positively with the severity of COVID-19, and higher CRP levels were associated with extended inpatient treatment. Our objective for this study was to understand the inpatient utilization of CRP and PCT tests during and before the COVID-19 pandemic at Danbury Hospital, CT.

Methods: Cerner Discern Analytics 2.0 (Version 3.28.6) was used to query test order volume and results from the laboratory information system for serum CRP (January 2019-December 2020) and PCT (January-December 2020). From the queried results, only inpatients (ICU and ED) were included in the data analysis, as this patient population was expected to have a significant impact on test utilization due to SARS-CoV-2 surges. Statistical graphs were generated using Microsoft Excel 2013.

Results: A total of 1228 CRP tests were ordered in 2019, and 8649 tests were ordered in 2020, a more than 6-fold increase. Figure 1 shows that utilization was conservative in 2019; however, as expected, there is a distinct exponential increase in ordering around March and April and another increase in November-December 2020, correlating with the surges in SARS-CoV-2 admissions during those time intervals. The CRP distribution based on concentration showed an increased number of patients with elevated CRP during the pandemic as compared to pre-pandemic. Interestingly, the ordering pattern of PCT revealed a larger increase at the end of 2020.

Figure 1. CRP order volume, year 2019 vs. 2020.

Conclusions: Discern Analytics is a powerful tool to analyze data on laboratory utilization of multiple assays and is essential for monitoring clinical laboratory performance. Increases in CRP and PCT utilization correlated with changes in recommendations from clinical leadership for management of SARS-CoV-2. The increases in CRP levels and utilization in 2020 correlate with increased severity of inpatient disease but are not specific. PCT utilization changes could be related to SARS-CoV-2 or to concurrently allowing unrestricted ordering by physicians.

J Pathol Inform. 2021 Nov 1;12:44.

Corista Virtual Slide Stage: Potential for Use in Daily Practice

Maxwell D Wang 1, Mustafa Yousif 1, Jacob T Abel 1, Jerome Cheng 1, David S McClintock 1

Background: With the increasing push for digitization in pathology, the way in which pathologists interact with whole slide images (WSI) must also undergo further development. While multiple interfacing devices have been proposed and some tested, there is still no single gold standard, with the vast majority of WSI viewers relying on the traditional mouse and keyboard combination. The aim of this study is to describe a novel WSI input device aimed at easing the transition for pathologists from analog (glass) to digital.

Technology: The Corista Virtual Slide Stage (Corista, Concord, MA) is an early prototype input device for WSI consisting of a thin rectangular “slide” sitting atop a rectangular stage with optical sensors for directional movement. The device also includes an attached keypad console used to change magnification levels, navigate between WSI in the virtual “tray,” and return to the case queue.

Methods: Five pathologists of varying levels of training participated in this study. Using the Corista Virtual Slide Stage, 20 possible cases were reviewed on Corista’s DP3 digital pathology platform (Corista, Concord, MA), and evaluation of the input method was performed on at least 5 cases. Free-text feedback was then solicited regarding the use of the novel prototype input device.

Results: Overall, the five pathologists were encouraged by the use of the prototype device, each commenting favorably on its ease of use and ergonomics. The use of dedicated hotkeys for magnification and slide switching was noted as a positive aspect of the device. The pathologists noted that while navigation was similar to that of a traditional microscope, there were some aspects of the device that could be fine-tuned in future iterations, e.g. the size of the stage relative to the slide and the ability to further customize keymappings and the sensitivity of the directional movement.

Conclusions: We present here an initial evaluation of an early prototype input device for WSI that mimics traditional slide navigation for digital pathology. Overall we agree that the Corista Virtual Slide Stage has great potential over the current standard mouse/keyboard default combination and look forward to additional developmental iterations.

J Pathol Inform. 2021 Nov 1;12:44.

Inter- and Intra-variability Assessment of Common Technical Specifications of Consumer-off-the-shelf (COTS) Displays

Mustafa Yousif 1, Jacob T Abel 1, David S McClintock 1

Background: The display (monitor) is currently defined and included as an integral part of the digital pathology pixel pathway. Medical-grade (MG) displays were chosen over consumer-off-the-shelf (COTS) or professional-grade (PG) displays for use within FDA validation studies. We present data comparing the technical criteria (to date) for multiple displays, including two FDA-cleared MG displays and three sets of COTS displays, as follows: 1) Dell MR2416 x 1 (Leica, MG), 2) Philips PP27QHD x 1 (PIPS, MG); 3) HP 1FH49A8#ABA x 4 (COTS, 24”, 1080p); 4) LG 27BN88U-B x 4 (COTS, 27”, 4K), and 5) LG 32BN88U-B x 4 (COTS, 32”, 4K).

Methods: Absolute luminance, luminance uniformity, and color measurements were taken with an X-Rite PANTONE i1Basic Pro2 spectrophotometer using DisplayCAL software. Display types (3-5) were calibrated to 250 cd/m², and 10-12 sets of luminance measurements were taken. Delta E 2000 values, which quantify the difference between the display’s outer edges and its center, were calculated to assess uniformity. A 490-color test panel was used to assess calibrated color accuracy. For context, these data were compared to 12 sets of measurement data taken for the MG displays (1 and 2) in a previous study. Statistical analysis was conducted with Tableau.
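As an illustration of the uniformity calculation, the sketch below computes Delta E 2000 between edge and center readings with scikit-image; the CIELAB values are placeholders and the median summary is an assumption, not the authors' exact procedure.

```python
import numpy as np
from skimage.color import deltaE_ciede2000

# Hypothetical calibrated readings in CIELAB: one at the display center and
# several near the outer edges (as measured by the spectrophotometer).
center_lab = np.array([70.0, 0.5, -1.2])
edge_labs = np.array([[68.9, 0.7, -1.0],
                      [69.5, 0.4, -1.5],
                      [70.8, 0.6, -0.9],
                      [69.1, 0.3, -1.1]])

# Delta E 2000 of each edge reading relative to the center; the median is one
# simple per-panel summary of uniformity (closer to 0 is more uniform).
dE = deltaE_ciede2000(np.tile(center_lab, (len(edge_labs), 1)), edge_labs)
print(np.median(dE))
```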

Results: The boxplots and 95% confidence intervals for median luminance deviation from 250 (not shown), uniformity [Figure 1a], and color accuracy [Figure 1b] were calculated and visualized. An ideal measurement for all characteristics is close to 0. Luminance on the Dell MR2416 and LG 27BN88U-B tended to be more uniform, while the LG 32BN88U-B and Philips PP27QHD were less uniform (i.e., higher Delta E 2000). All calibrated monitors had a median Delta E 2000 value < 1 in terms of color accuracy, which is generally considered to be indistinguishable by a human observer.

Figure 1. Uniformity and color accuracy of measured displays. Luminance uniformity (1A) and color accuracy (1B) were visualized using vertical boxplot data distributions for each of the displays tested (left panels), with corresponding 95% confidence intervals for median Delta E 2000 values for each display (right panels). The interquartile range is designated by the boxes, the median (50th percentile) value is indicated by the change in color within the box, and outliers are represented by circles outside the whiskers of the boxplot.

Conclusions: The COTS displays of the same model series but with different sized display panels showed significant variation in uniformity and luminance, demonstrating that one cannot assume equivalence based on a vendor’s model series/branding and that there is value in taking empiric measurements prior to display acceptance. An alternative hypothesis is that larger panels tend to be less uniform than smaller panels. Calibration has a marked effect on color accuracy differences between displays. Further evaluation of how these factors may affect a pathologist’s performance in a clinical setting may be warranted.

J Pathol Inform. 2021 Nov 1;12:44.

Why FDA Recommendations for COVID-19 Tests Vary with Patient Population: A Graphical Explanation

Yonah C Ziemba 1, Steven Gamss 2, Nina Haghi 1, Scott Duong 1

Background: FDA recommendations (see Reference) for molecular SARS-CoV-2 assays include sensitivity of 95%, and specificity that varies with patient population. Tests for asymptomatic individuals require specificity of 98%, while tests for symptomatic patients are acceptable with lower specificity of 95%. This may seem counterintuitive, especially since sicker patients with greater clinical risk are matched with less reliable tests. We present an intuitive visual explanation based on positive predictive values (PPV), and we share an interactive online interface that can be customized for any test.

Technology: Desmos is an online tool for interactive classroom activities, regression analyses, and graphs. We built an interactive graph on Desmos to illustrate how post-test probability changes in different populations. PPV is calculated from prevalence, sensitivity, and specificity based on the following formulae: PPV = TP/(TP + FP), where TP = prevalence × sensitivity and FP = (1 − prevalence) × (1 − specificity).
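These formulae translate directly into a short helper; the sketch below is illustrative, and the example values simply reproduce the 50% and 10% prevalence scenarios discussed in the Results.

```python
def ppv(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Positive predictive value from prevalence, sensitivity, and specificity."""
    tp = prevalence * sensitivity              # expected true-positive fraction
    fp = (1 - prevalence) * (1 - specificity)  # expected false-positive fraction
    return tp / (tp + fp)

# At 95% sensitivity and 75% specificity: PPV is ~0.79 at 50% prevalence
# but only ~0.30 at 10% prevalence.
print(ppv(0.50, 0.95, 0.75), ppv(0.10, 0.95, 0.75))
```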

Methods: Go to http://bit.ly/PPV-Graph-API-Summit-2021 on any computer or mobile device, and press “Play” to watch the simulation.

Results: Two unexpected observations become apparent, as shown in Figure 1. When sensitivity is fixed at 95% and specificity changes at a constant rate, the change in post-test probability is fast at high ranges of specificity and slow in low ranges. In addition, the change in PPV is always greatest where prevalence is lowest. In fact, when specificity moves from 100% down to 75%, a population with 50% prevalence would only see the PPV move down to 80%, while a population with 10% prevalence would see the PPV move to 30%. This illustrates that populations with low prevalence require higher specificity for an acceptable PPV. This is because low-prevalence populations have few true positives, and PPV is determined by the balance of true positives to false positives. Asymptomatic individuals have low pre-test probability, which is equivalent to low prevalence.

Figure 1. How does a change in specificity from 100% to 75% affect the PPV?

Conclusion: The simulation illustrates that different patient populations may need different diagnostic tests. Low-prevalence diseases, or common diseases in asymptomatic individuals, need very high specificity in order to have adequate PPV. This is why SARS-CoV-2 assays designed for asymptomatic individuals need specificity of 98%. These principles might not be apparent without mathematical reasoning, and interactive graphs are helpful.

Reference

J Pathol Inform. 2021 Nov 1;12:44.

Genetic Expression Profile Analysis Reveals Overexpression of CHEK1 and BRCA1 in High-Grade Follicular Lymphoma Relative to Low-Grade Follicular Lymphoma

Hassan Rizwan 1, Ali Umer 1

Background: Follicular lymphoma (FL) is the second most frequent non-Hodgkin B-cell lymphoma. Progressive genomic aberrations drive the progression of FL from low grade to high grade or transformation to diffuse large B-cell lymphoma (DLBCL). FL is difficult to treat, and relapses are common. Novel therapies such as monoclonal antibodies, immunomodulating drugs, and targeted therapy have recently been developed and approved for relapsed FL; however, a multifaceted approach to developing effective therapies is needed. The DNA damage response (DDR) resulting from genetic abnormalities is well known in solid tumors, but DDR activity in lymphoma is poorly characterized. Heightened DDR can compromise the therapeutic efficacy of DNA-damaging drugs. Inhibitors of DNA repair molecules are providing clinical benefits as adjuvants in solid tumours (synthetic lethality). This study investigated the distinct gene expression pattern (GEP) of DDR molecules in a series of low- and high-grade FL as well as in DLBCL. Our findings will pave the way for the adoption of synthetic lethality in lymphoma therapies and hence carry promise to improve clinical outcomes in FL patients.

Methods: Diagnostic FFPE RNA from 98 FL and 27 DLBCL patient samples was screened for genes implicated in carcinogenesis, the tumor microenvironment, and the immune response (n = 760) via the PanCancer IO360 platform (NanoString Technologies). Patients were categorized into high-grade FL (n = 22) and low-grade FL (n = 76). Gene Set Enrichment Analysis (GSEA) was performed for further correlation utilizing public data sets (p < 0.05, q < 0.1) using the GSEA software (v4.1.0, Broad Institute).

Results: BRCA1 and CHEK1 were differentially expressed DDR genes in high-grade FL relative to low-grade FL [Figure 1a]. CHEK1, an enzyme that facilitates the DDR through cell cycle checkpoint arrest, was differentially expressed in these tumors alongside the tumor suppressor gene BRCA1, which mediates double-stranded break repair [Figure 1b]. More DDR effectors were up-regulated in DLBCL compared to higher-grade lymphomas, suggesting CHEK1 inhibition could sensitize high-grade FL to therapies [Figure 1c].

Figure 1A. Heatmap illustrating DNA damage response (DDR) gene expression between high-grade (n=22) and low-grade (n=76) FL patients. 1B: Box-and-whiskers plot depicting relative expression of the CHEK1, WEE1, and BRCA1 genes among the high-grade FL, low-grade FL, and DLBCL (n=27) patients. 1C: Table displaying the statistical values of DDR effectors and checkpoint kinases exhibiting differential expression.

Conclusion: Our data indicate that DDR molecules are overexpressed in high-grade FL as well as in DLBCL when compared with low-grade FL. This enhanced DDR expression is likely to cause resistance to conventional therapies in high-grade FL. These observations indicate that specific inhibitors such as CHEK1/BRCA1 and WEE1 inhibitors can be explored as adjuvant therapeutic agents in high-grade FL and DLBCL.

J Pathol Inform. 2021 Nov 1;12:44.

Custom Web Applications for Continual Improvements to Laboratory Workflows

Michelle R Stoffel 1, Nathan Breit 1, Patrick C Mathias 1

Background: Busy clinical laboratories may still rely on spreadsheet-based data analysis workflows for highly specialized assays not yet compatible with standard commercial software solutions. Barriers to automating such workflows include complicated analysis requirements and processes spanning multiple software applications. While spreadsheet-based workflows are highly customizable and accessible, disadvantages include time-consuming formatting requirements, potential for copy-paste errors, and version control issues. We describe a case study of a workflow improvement using an initial cycle of focused custom software interventions as “building blocks” to attenuate several error-prone and time-consuming workflow steps, as part of a plan to eventually automate the entirety of the analysis process.

Technology: We targeted a complex immunoassay analysis workflow, working with laboratory leadership and clinical laboratory technologists to identify the highest-impact workflow steps for immediate improvement. Based on workflow analysis, two web applications were developed. One is a “lookup tool” data dashboard displaying patient data from the laboratory information system database and allowing transfer of search results to the spreadsheet analysis workbook, replacing the prior multi-step patient identification workflows. The second web application calculates a dose response curve from assay data and generates custom-formatted output which can be copy-pasted directly from the application to the workbook, eliminating multiple back-and-forth copy-paste steps between separate spreadsheet files.
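The abstract does not specify the curve model used by the second application; as one plausible sketch, a four-parameter logistic fit with SciPy followed by copy-ready output formatting is shown below, with hypothetical calibrator data.

```python
import numpy as np
from scipy.optimize import curve_fit

def four_pl(x, a, b, c, d):
    """Four-parameter logistic: a = response at zero dose, b = slope, c = EC50, d = response at saturation."""
    return d + (a - d) / (1.0 + (x / c) ** b)

# Hypothetical calibrator concentrations and instrument responses.
conc = np.array([0.1, 0.3, 1.0, 3.0, 10.0, 30.0])
resp = np.array([0.05, 0.12, 0.35, 0.71, 0.93, 0.99])

params, _ = curve_fit(four_pl, conc, resp, p0=[0.0, 1.0, 1.0, 1.0], maxfev=10000)

# Emit the fitted parameters as a single tab-delimited row, ready to paste
# into the analysis workbook in one step.
print("\t".join(f"{p:.4f}" for p in params))
```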

Results: Preliminary analysis of workflows with the two web applications decreased the assay analysis workflow from 46 to 36 steps, a nearly 22% improvement. Furthermore, the new workflow decreases the number of data modalities/separate open files/worksheets involved from 10 to 8. Finally, while the improved workflow still involves copy-paste steps, one-step copying of the customized output from the web applications is anticipated to be less error-prone than the prior workflow involving multiple point-and-click selection of data and pasting steps.

Conclusions: Custom software applications can be used to optimize portions of a laboratory data analysis workflow with automation in a modular fashion, while allowing clinical laboratory staff to adjust to each change before moving to the next. While the ultimate goal is to move workflows away from a spreadsheet-based approach, a stepwise implementation can be a practical mechanism to achieve improvements.

J Pathol Inform. 2021 Nov 1;12:44.

Age of Cloud-based Applications: Opportunities and Challenges

Ashish Mishra 1, J Mark Tuthill 1

Background: This presentation describes our initial experiences and challenges with the deployment of new thin client (browser-based) LIS and image management systems in the Department of Pathology and Laboratory Medicine at Henry Ford Hospital, Detroit, MI. A thin client is an application that runs from resources stored on a central server instead of a localized hard drive. Thin clients work by connecting remotely to a server-based computing environment where most applications, sensitive data, and memory are stored. Thin clients offer several benefits, including reduced cost, increased security, more efficient manageability, and scalability. A “fat client” refers to typical software installed on a personal computer/CPU that does all its own data processing.

Fat clients are more difficult to secure and manage, costlier to deploy, and can consume a great deal of energy. They generally require more powerful and costlier hardware than thin clients, but also typically have more features. At Henry Ford Hospital, we have adopted a browser-based approach to using thin clients, which means that an ordinary PC connected to the internet carries out its application functions within a web browser instead of on a remote server. Data processing is done on the thin client machine, but software and data are retrieved from the network. We have successfully transitioned to thin client versions of our AP LIS (Sunquest Vue), AP image management system (ARCC), biobank automation software (BTM), and document control system (MasterControl).

Advantages: No software installation; wide distribution; ease of license distribution; ease of scaling; easy maintenance; easy updates; lower costs; greater security; ability to use personal laptops/computers over VPN; Small Form Factor clients are sufficient for wide-scale deployment; supports Workstations on Wheels (WOWs) while remaining HIPAA compliant.

Challenges: Single point of failure (a server failure may affect every client); powerful servers are needed to deploy these applications; browser incompatibilities may prevent the application from working with a preferred browser; challenges in connectivity to cameras, printers, etc.

Technology: Servers running the database applications; Small Form Factor PCs running Windows 10 across the health system; browsers (Microsoft Edge, Google Chrome, etc.); microscope and gross station digital cameras with TWAIN drivers.

Conclusions: The adoption of thin-client computing is going to increase exponentially in the future as more and more software developers move toward it. Based on our experience, we have identified several challenges in implementing thin-client computing in pathology, as it is still in the early stages of development. These are teething troubles and will go away once this becomes the new standard of interacting with applications.

J Pathol Inform. 2021 Nov 1;12:44.

Application of Reverse Federated Database System for Clinical Laboratory Service

Keluo Yao 1, Christopher L Williams 2, Ulysses G J Balis 3, David S McClintock 3

Content: Traditionally, healthcare organizations utilize the electronic health record (EHR) system as a federated database system for end users to retrieve data from a multitude of separate dependent databases through a uniform interface. Increasingly, the dependent databases and their originating clinical services require more information from the EHR in order to function. To fill the gap, numerous workarounds with significant limitations have been devised to allow the originating clinical services to function. Here we have created a novel web service application (WSA) for protein electrophoresis that applies a reverse federated database system (RFDS) by interfacing the federated EHR as a dependent database to automatically gather information and streamline the laboratory workflow.

Technology: Web Application Programming Interface (API), JavaScript framework, Representational State Transfer (REST), Virtual Private Network (VPN), OS-level virtualization.

Design: Using a JavaScript framework and an OS-level virtualization container, we created a server that can query the EHR for essential laboratory values (e.g., total protein) through an EHR Web API REST interface. The queries are initiated by the WSA based on the patient identity and information pulled from the protein electrophoresis instrument database. Additional clinical information, including medications, documents, radiology, and other laboratory results, is also queried and compiled. Security and access are achieved through VPN, a Web API key, a dedicated machine account, and end-user authentication. Additionally, the use of a server to handle EHR queries reduces the exposure of the EHR Web API to end-user tampering.
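A minimal sketch of the kind of server-side EHR query the WSA performs is shown below; the endpoint path, parameter names, and authentication header are hypothetical placeholders rather than the institution's actual EHR Web API, and the key is assumed to be stored server-side only.

```python
import requests

# Hypothetical configuration; real values would come from secured server config.
EHR_BASE = "https://ehr.example.org/api"
API_KEY = "stored-securely-on-the-server"  # never exposed to the end user's browser

def fetch_total_protein(patient_id: str) -> dict:
    """Query the EHR for a patient's most recent total protein result."""
    resp = requests.get(
        f"{EHR_BASE}/lab-results",
        params={"patient": patient_id, "test": "total-protein", "limit": 1},
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=10,
    )
    resp.raise_for_status()  # surface HTTP errors to the WSA for handling
    return resp.json()
```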

Results: Figure 1 shows the WSA querying data from both the EHR (A) and the protein electrophoresis system (B), synthesizing a dashboard (C), and presenting other pertinent information, including medications (D), other laboratory results (E), and clinical documents (F). Preliminary alpha testing and feedback from the laboratory staff and pathologists indicate that the WSA will significantly improve the workflow.

Figure 1.

Figure 1

The EHR (a), as a federated database system, allows multiple dependent databases to be accessed uniformly. The protein electrophoresis workflow (b) requires some information from the EHR and can treat it as a dependent database to provide a dashboard (c) medication reference (d), additional laboratory result (e), and clinical documentation (f)

Conclusion: We have successfully designed a WSA that applies RFDS to streamline the workflow of a laboratory service. The secure design and the opportunities it offers can be applied to any clinical service that relies on low-throughput, error-prone manual information extraction from the federated EHR for routine operation.

J Pathol Inform. 2021 Nov 1;12:44.

Deep Learning Algorithm to Predict ERG Gene Fusion Status in Prostate Cancer

Daniel Gonzalez 2,3, Vipul Kumar Dadhania 1, Mustafa Yousif 1, Jerome Cheng 1, Saravana M Dhanasekaran 6, Arul M Chinnaiyan 1,4,5,6,7, Liron Pantanowitz 1,5,#, Rohit Mehra 1,5,6

Background: The TMPRSS2-ERG gene rearrangement contributes to the pathogenesis of prostate cancer and plays a role in tumor multifocality and metastatic potential. ERG rearrangement in prostate cancer currently cannot be reliably identified from hematoxylin and eosin (H&E) features. Current methods for detection include immunohistochemistry (IHC) and fluorescence in situ hybridization (FISH). We sought to develop a deep learning algorithm to identify ERG rearranged prostate adenocarcinoma based on digitized slides of H&E morphology alone.

Methods: Using the Python Keras API, we developed a deep learning model for distinguishing between ERG-rearranged and ERG-nonrearranged prostate adenocarcinoma. The model is based on the MobileNetV2 convolutional neural network architecture pre-trained on ImageNet. Weights were fine-tuned using datasets of in-house whole slide images (WSI) and The Cancer Genome Atlas (TCGA) database containing ERG-positive and ERG-negative cases. In-house WSI were scanned at 40x using a Leica Aperio AT2 whole slide scanner. WSIs were reviewed by two pathologists, annotated using QuPath v0.2.3, and exported as 224x224-pixel tiles at 10x, 20x, and 40x for input into the deep learning model. A separate model was trained for each magnification. Training and test sets for the model consisted of 268 cases (763,945 tiles) and 155 cases (246,060 tiles), respectively. The output of the model consisted of a prediction of ERG positive or ERG negative for each input tile. The ERG status for each case was determined by majority vote from the tiles comprising each case.
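A minimal sketch of this transfer-learning setup in Keras is shown below; tile loading, augmentation, and the exact fine-tuning schedule are omitted, and the binary classification head is an assumption based on the two-class task described.

```python
import tensorflow as tf

# MobileNetV2 backbone pre-trained on ImageNet, fine-tuned rather than frozen.
base = tf.keras.applications.MobileNetV2(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = True

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # ERG positive vs. negative per tile
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# train_ds / val_ds are assumed tf.data.Dataset objects yielding (tile, label) pairs.
# model.fit(train_ds, validation_data=val_ds, epochs=10)

# Case-level call by majority vote over that case's tile predictions, e.g.:
# case_is_erg_positive = (tile_probs > 0.5).mean() > 0.5
```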

Results: All three models showed similar ROC curves with area under curve (AUC) results ranging between 0.768 and 0.790 [Figure 1]. The sensitivity and specificity of these models were as high as 0.78 (20x model) and 1.00 (10x model), respectively. Overall case accuracy ranged between 0.75 and 0.78.

Figure 1.

Figure 1

Receiver operating characteristics for each model trained at 10×, 20×, and 40× magnification

Conclusions: We demonstrate that a deep learning-based AI model can successfully predict ERG fusion status in the majority of cases from H&E-stained digital slides. Such a model can eliminate the need for IHC or FISH testing and thereby improve turnaround time, provide economic savings, and conserve tumor tissue when assessing prostate tumor mutational status.

J Pathol Inform. 2021 Nov 1;12:44.

Quantitative Hematoxylin and Eosin (H and E) Tissue Staining in Digitized Whole Slide Images (WSI) Using Measures of Hue, Saturation, and Brightness (HSB)

David Kellough 1, Trina Shanks 1, David G Nohle 2, Satoshi Hamasaki 3, Leona W Ayers 2, Mark Lloyd 1, Anil V Parwani 2

Background: Badano A, et al., “Consistency and Standardization of Color in Medical Imaging: a Consensus Report,” J Digit Imaging. 2015 Feb;28(1):41-52, notes that “lack of a well-defined color framework is limiting the use of digital tools that could maximize the utilization of medical devices for improved diagnostics.” We seek to calibrate image analysis software based on the varying color image values generated by histopathology laboratories. To be viable, an inexpensive, rapid, and easy technique must be available.

Technology: Images of H&E-stained tissue slides scanned on a Philips UFS scanner at 20x were viewed at 5x magnification, smoothing out small variations in sampled areas. In the Philips Image Management System, 356² pixel sections of heavily eosin-stained areas (stroma/muscle) and heavily hematoxylin-stained, nuclear-rich areas (lymphocytes, ducts, lobules, tumor) were captured. Hue, saturation, and brightness were quantified using a commonly available color picker (Photoshop CC 19.1.6, Adobe, San Jose, CA) to measure staining; 101² pixel areas (the maximum allowed) were sampled. We measured slides prepared 3/30-10/14/2020 using H&E (Leica, Buffalo Grove, IL) at OSU lab 1 (n=133) with a Tissue-Tek Prisma Plus (Sakura, Torrance, CA) and at OSU lab 2 (n=140) with an ST4040 Linear Stainer (Leica, Buffalo Grove, IL).
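As a rough programmatic analogue of the color-picker measurement, the sketch below averages hue, saturation, and brightness over a sampled image region using Pillow and colorsys; the crop box stands in for the sample area, and this is an illustration rather than the workflow the authors used.

```python
import colorsys
import numpy as np
from PIL import Image

def mean_hsb(image_path: str, box: tuple) -> tuple:
    """Mean hue, saturation, brightness (each 0-1) over a rectangular sample.

    `box` is a (left, upper, right, lower) pixel crop. Note that naively
    averaging hue can misbehave for reds near the 0/1 wrap-around.
    """
    patch = np.asarray(Image.open(image_path).convert("RGB").crop(box)) / 255.0
    hsv = np.array([colorsys.rgb_to_hsv(*px) for px in patch.reshape(-1, 3)])
    return tuple(hsv.mean(axis=0))
```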

Methods: We measured staining in predominantly red and blue areas of 15 breast, 15 gastrointestinal, 15 skin, and 15 prostate biopsies and H&E slides from multiple outside laboratories (n=60). HSB values for sampled areas from images of slides from the two OSU labs were compared with each other and with those from outside institutions.

Results: The HSB color picker was effective in assessing variation in color staining. We found great consistency among specimens prepared in the same laboratory, similar but not as great consistency among specimens from our institution, and considerable variation among images prepared in consulting laboratories [Figure 1].

Figure 1. Hue, saturation, and brightness values are separated by whether they were sampled from a primarily hematoxylin/blue area (graphs in left column) or a primarily eosin/red area (right), as well as by cohort within each graph. There is great consistency among specimens prepared in the same laboratory, similar but not as great consistency among specimens from our institution (OSU 1 and OSU 2), and considerable variation among images prepared in consulting laboratories (Non-OSU).

Conclusions: An inexpensive, readily available HSB color picker allowed characterization of WSI H&E colors. The OSU histology labs had similar measures of central tendency (means, medians) and smaller ranges, whereas outside slides showed large standard deviations. Image analysis algorithms should be trained/tested with examples spanning the range of consulting laboratories served.

J Pathol Inform. 2021 Nov 1;12:44.

Barcode Location Information for Pathologists (BLIP), an Innovative Glass Slide Tracking Program

Edward Kang 1, Bartlomiej Radzik 1, Tushar Patel 1

Background: Slide tracking and management plays a major role in laboratory workflow efficiency, timeliness of diagnosis, and quality assurance. Laboratory information systems often lack an efficient way to track glass slides after delivery to the pathologist and require specialized equipment (e.g., a handheld scanner). Recognizing the need for more efficient slide management software, we prototyped a mobile glass slide management program designed to close the loop and enable tracking of barcoded glass slides in the possession of staff and faculty.

Technology: Barcode Location Information for Pathologists (BLIP) utilizes Data Matrix barcode reading technology based on the Scandit SDK (Zurich, Switzerland), running on an iPhone 8 with iOS 13.3 (Apple Inc., Cupertino, CA), and Google Sheets (Alphabet Inc., Mountain View, CA). A Honeywell 1900G-HD 2D barcode scanner was used as a reference.

Methods: We initially surveyed anatomic pathology laboratory staff, faculty, and residents on the time spent locating glass slides. We then developed BLIP as a website consisting of a DataMatrix barcode scanner from the Scandit Barcode Scanner SDK for the Web and an input field for the user to enter their name. BLIP’s HTML form transmits the user’s name, scanned barcode ID(s) and timestamps to a Google Form and ultimately into a Google Sheet. We then compared the relative scan speed of 1000 slides using BLIP versus traditional handheld barcode scanners.

Results: Surveyed end-users (N=19) endorsed inefficiency and difficulty in locating glass slides. Most users spent 1-3 hours/week locating slides. BLIP has a scan rate of 23.17 slides/minute, with an error rate of 5/1000 scans as compared to the traditional scan rate of 60.98 slides/minute with an error rate of 2/1000 scans using the Honeywell handheld scanner [Table 1].

Table 1.

Comparison of scanning methods

Scanning method Number of slides scanned Time (min) Rate (slides/min) Error rate
BLIP 1000 43.2 23.17 5/1000
Honeywell scanner 1000 16.4 60.98 2/1000

BLIP: Barcode location information for pathologists

Conclusion: BLIP demonstrates a proof-of-concept mobile platform for closed-loop tracking of barcoded assets in the laboratory, including glass slides. Its competitive performance and capability can be broadly deployed, even in low-resource settings, given the universality of mobile phones and networked computers. This platform has the potential to improve workflow efficiency and patient safety by aiding in the timely and complete tracking of glass slides, with cost-effective and easy-to-use hardware and software.

J Pathol Inform. 2021 Nov 1;12:44.

Prediction of HER2 Score in Breast Cancer using Synthetic IHC Derived from H&E Images

Xingwei Wang 1,*, Auranuch Lorsakul 1,*, Margaret Zhao 1,*, Yao Nie 1,*, Hartmut Koeppen 2,*, Hauke Kolster 3,*, Kandavel Shanmugam 4,*

Background: The current practice for diagnosing human epidermal growth factor receptor 2 (HER2)-positive breast cancer commonly relies on hematoxylin and eosin (H&E) staining. To confirm a breast cancer diagnosis, additional tissue sections for immunohistochemistry (IHC) slides are required. In this preliminary study, we explored the feasibility of predicting HER2 scores from H&E-stained tissue sections by generating synthetic HER2-IHC images based on the H&E images.

Methods: We used a deep-learning architecture, a conditional Generative Adversarial Network (cGAN), to generate the synthetic HER2-IHC images from H&E images. Using the IRISe platform, a pathologist first created the ground-truth HER2 scores (0+, 1+, 2+, and 3+) within annotated tumor regions. Then, image patches from corresponding tumor areas in the IHC and H&E images were extracted and aligned. Multiple experiments for training/testing were performed for the four HER2 scores, using 7,472 and 1,900 patches of size 128x128 and 256x256, respectively. We used an 80/20 split for training/testing, applying the Adam optimizer with a learning rate of 0.0002 for 100/200 epochs for the small/large patches.
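To make the training configuration concrete, below is a minimal PyTorch sketch of one cGAN update step using the reported optimizer settings (Adam, learning rate 0.0002). The tiny generator/discriminator, the L1 weight, and the random tensors are illustrative stand-ins, not the architecture used in this study.

```python
# Illustrative cGAN update step for H&E -> synthetic IHC translation.
# The networks and data below are stand-ins; only the optimizer settings
# mirror those reported above.
import torch
import torch.nn as nn

class TinyGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )
    def forward(self, x):
        return self.net(x)

class TinyDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        # Conditional: scores an (H&E, IHC) pair, concatenated to 6 channels.
        self.net = nn.Sequential(
            nn.Conv2d(6, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=2, padding=1),
        )
    def forward(self, he, ihc):
        return self.net(torch.cat([he, ihc], dim=1))

G, D = TinyGenerator(), TinyDiscriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
adv_loss, l1_loss = nn.BCEWithLogitsLoss(), nn.L1Loss()

he = torch.rand(4, 3, 128, 128)    # stand-in H&E patches (128x128)
ihc = torch.rand(4, 3, 128, 128)   # stand-in paired real IHC patches

# Discriminator step: real pairs scored as real, generated pairs as fake.
fake = G(he).detach()
pred_real, pred_fake = D(he, ihc), D(he, fake)
d_loss = adv_loss(pred_real, torch.ones_like(pred_real)) + \
         adv_loss(pred_fake, torch.zeros_like(pred_fake))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step: fool the discriminator while staying close to the real IHC (L1 term).
fake = G(he)
pred_fake = D(he, fake)
g_loss = adv_loss(pred_fake, torch.ones_like(pred_fake)) + 100.0 * l1_loss(fake, ihc)
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```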

Results: To test the feasibility of using synthetic IHC images to predict HER2 scores, we selected three synthetic and three real stained image patches for each of the four HER2 scores. The images were randomly presented to one pathologist, who evaluated the HER2 scores and determined whether each IHC image was a tissue-based stain (“Real”) or a synthetic (“Fake”) image. In this preliminary result, the pathologist reached an accuracy of 11/24 (45.8%) in identifying the correct image type, which suggests that the synthetic images were indistinguishable from real images to the pathologist. The accuracy in predicting HER2 scores for the “Synthetic” and “Real” image types was 10/12 (83.33%) and 11/12 (91.67%), respectively.

Conclusions: The synthetic HER2-stained IHC images created by a trained cGAN network were indistinguishable from real HER2-IHC images in our experiment. The HER2 scores evaluated based on the synthetic IHC images have a high degree of consistency with the scores based on the real IHC images. This suggests that the synthetic HER2-IHC images might be used as an early read before the availability of traditional tests. A further application can be the generation of large-scale image data for algorithm verification and network training.

J Pathol Inform. 2021 Nov 1;12:44.

Cloud-hosted Neural Networks for Cell Segmentation and Classification of Multiplexed Ion Beam Imaging Data

Cole Pavelchek 1, Noah Greenwald 2, William Graf 3, Erick Moen 3, Dylan Bannon 3, David Van Valen 3

Background: Multiplexed ion beam imaging (MIBI) is a novel variant of immunohistochemistry utilizing mass-tagged antibodies to enable the simultaneous visualization of up to one hundred independent targets. This is a crucial development in the fields of oncopathology and immuno-oncology, allowing for more accurate mapping of cell types and visualization of complex cellular interactions. However, these data are complex, and the high number of channels can overwhelm even seasoned pathologists. With potentially thousands of cells per image, hand-annotation is prohibitively time-consuming. Furthermore, current cell classification methods require significant post-processing and are not easily accessible to those less familiar with computation. While MIBI is groundbreaking, its application is bottlenecked by the lack of effective annotation methods. Work by the Van Valen lab has found that a combination of convolutional neural networks (CNNs) and cloud computing offers a promising solution.

Methods: An 8-channel CNN was trained to generate predictions for cell edge, interior, and background, which were subsequently converted into cell masks via a watershed transform. Semantic labels were generated by combining masks with class predictions from a 26-channel CNN. A custom Redis consumer was developed to enable predictions via a web interface for cloud-hosted segmentation models.
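A hedged sketch of the mask-generation step described above is shown below: per-pixel edge/interior/background probabilities are converted into labeled cell masks with a marker-based watershed. Random arrays stand in for the CNN output, and the thresholds are illustrative, not those of the deployed DeepCell pipeline.

```python
# Minimal sketch: turn per-pixel CNN predictions (edge / interior / background)
# into labeled cell masks via a marker-based watershed. Random probabilities
# stand in for the network output; thresholds are illustrative.
import numpy as np
from scipy import ndimage as ndi
from skimage.segmentation import watershed

rng = np.random.default_rng(0)
probs = rng.random((3, 256, 256))          # channels: 0=edge, 1=interior, 2=background
probs /= probs.sum(axis=0, keepdims=True)  # softmax-like normalization

interior = probs[1]
foreground = probs[2] < 0.5                # anything not confidently background

# Seeds: connected components of high-confidence interior pixels.
markers, _ = ndi.label(interior > 0.6)

# Flood from the seeds over the inverted interior map, constrained to the
# foreground, to recover one labeled mask per cell.
cell_masks = watershed(-interior, markers=markers, mask=foreground)
print("cells found:", cell_masks.max())
```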

Results: The pipeline for semantic segmentation is shown in Figure 1. Overall pixel accuracy was 93.3%. With an intersection-over-union threshold of 0.5, predicted masks achieved an instance accuracy of 91.6% with a dice score of 0.94, and a corresponding cell type classification accuracy of 87.6%. An improved segmentation model is currently cloud-hosted and available for predictions on the deepcell.org website.

Figure 1. Semantic segmentation pipeline

Conclusions: In concert, these CNNs perform cell classification and segmentation of MIBI data with accuracy comparable with or superior to current computational methods. Segmentation predictions are accessible within minutes, with no start-up time required, through the deepcell.org website. The full deployment of this pipeline to the cloud will provide researchers and medical centers with easy access to rapid, accurate MIBI annotations without requiring training in computational methods. Ultimately, the combination of CNNs and cloud computing will enable the widespread use of multiplexed ion beam imaging in ways that are currently impossible.

J Pathol Inform. 2021 Nov 1;12:44.

Efficacy of PanopticNets for 3-Dimensional Segmentation of Confocal Microscopy Imaging Data

Cole Pavelchek 1, Geneva Miller 2, Will Graf 2, Erick Moen 2, Dylan Bannon 2, David Van Valen 2

Background: Live-cell imaging is a widely-used modality of data collection. However, in order to quantify and analyze live-cell imaging data, it must first be annotated. State-of-the-art algorithmic methods suffer from a lack of robustness, and manual annotation can be prohibitively time-consuming. These issues compound when considering 3-dimensional datasets, such as confocal microscopy image stacks. We developed and trained neural networks (PanopticNets) that can accurately and reliably predict volumetric segmentations of cellular nuclei.

Methods: A 2-dimensional model was trained on existing labels and used to make initial predictions on a larger unlabeled set of DAPI-stained mouse brain nuclear data. Predicted 2D segmentations were algorithmically matched along the z-axis based on intersection-over-union and manually corrected to produce a training dataset of approximately 1200 distinct nuclei. 3-dimensional PanopticNet models were trained using a patch-based method to predict inner-distance and outer-distance transforms. Overlapping tiled predictions on test images were composited using 3D spline interpolation. Segmentation masks were generated from composited predictions using a 3D watershed transform, using local maxima of inner-distance predictions as seeds and thresholded outer-distance predictions to define cell boundaries. A model was hosted in the cloud, and a custom Redis consumer was developed to enable web-based predictions through the deepcell.org website.
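The post-processing can be sketched as follows: local maxima of the inner-distance prediction seed a 3D watershed whose extent is limited by the thresholded outer-distance prediction. Random volumes stand in for the PanopticNet output, and the thresholds are illustrative assumptions.

```python
# Minimal sketch: 3D watershed over predicted inner/outer distance transforms.
# Random volumes stand in for the model output; thresholds are illustrative.
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

rng = np.random.default_rng(1)
inner = ndi.gaussian_filter(rng.random((32, 128, 128)), sigma=3)  # inner-distance prediction
outer = ndi.gaussian_filter(rng.random((32, 128, 128)), sigma=3)  # outer-distance prediction

# Seeds: local maxima of the inner-distance prediction.
peaks = peak_local_max(inner, min_distance=5, exclude_border=False)
markers = np.zeros(inner.shape, dtype=int)
markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)

# Cell extent: thresholded outer-distance prediction.
foreground = outer > outer.mean()

labels = watershed(-inner, markers=markers, mask=foreground)
print("nuclei found:", labels.max())
```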

Results: The trained models achieved a net pixel intersection-over-union of > 0.8 for segmentation and > 95% precision and recall for nuclear identification, matching or outperforming state-of-the-art approaches on similar datasets. A sample test prediction is shown in Figure 1.

Figure 1. Sample test predictions

Conclusion: These models demonstrate the efficacy of PanopticNets for 3D nuclear segmentation, serving as an effective proof of concept and supporting the future development of a more varied dataset to train a more robust model intended for wider use. Combined with our cloud hosting and web interface, this will ultimately provide researchers worldwide with easy, rapid access to accurate 3-dimensional segmentations.

J Pathol Inform. 2021 Nov 1;12:44.

Histopathology Image Search for Lymphoma Diagnosis Directly from Hematoxylin and Eosin Slides

Areej Alsaafin 1, Morteza Babaie 1, Zhe Wang 1, Mahjabin Sajadi 1, Soma Sikdar 2, Adrian Batten 2, H R Tizhoosh 1

Background: The microscopic diagnosis of lymphoma remains challenging and complex due to the heterogeneity of its classification. Pathologists usually make an initial decision on a suspected lymphoma case by observing morphological features on hematoxylin and eosin (H&E) staining. However, the final decision is made after analyzing immunohistochemistry (IHC) staining, a more advanced technique that highlights a specific antigen in tissue. Although IHC is an expensive staining technique, the absence of definitive morphological characteristics on conventional light microscopy makes IHC evaluation essential to specify the lineage of lymphoma cells.

Methods: This work utilized previously diagnosed lymphoma cases to develop a diagnostic model that can retrieve histopathology images that had a diagnosis similar to the diagnosis found in the query image. The aim of this work is to investigate the ability of deep learning features to recognize different types of lymphoma from H&E staining without analyzing the IHC slides. The implementation of our model is divided into four main stages: 1) tissue segmentation, 2) mosaic representation (patching) based on color and location clustering, 3) deep feature extraction using KimiaNet, and 4) image matching based on a similar diagnosis.
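A hedged sketch of stages 3 and 4 (deep feature extraction and matching) appears below; a pretrained DenseNet121 backbone stands in for KimiaNet, whose weights would be loaded in practice, and cosine nearest-neighbor search over normalized patch features retrieves the most similar archived patches.

```python
# Minimal sketch of deep-feature extraction and similarity search for patch
# retrieval. A pretrained DenseNet121 backbone stands in for KimiaNet here;
# in practice KimiaNet weights would be loaded instead. Patch tensors are
# random stand-ins for the mosaic patches.
import torch
import torch.nn.functional as F
from torchvision import models

backbone = models.densenet121(weights="IMAGENET1K_V1")
backbone.classifier = torch.nn.Identity()   # keep 1024-d features, drop the classifier
backbone.eval()

@torch.no_grad()
def embed(patches: torch.Tensor) -> torch.Tensor:
    """Return L2-normalized feature vectors for a batch of 3x224x224 patches."""
    return F.normalize(backbone(patches), dim=1)

archive = embed(torch.rand(50, 3, 224, 224))   # stand-in: features of archived case patches
query = embed(torch.rand(1, 3, 224, 224))      # stand-in: features of a query H&E patch

similarity = query @ archive.T                 # cosine similarity (features are normalized)
top5 = similarity.topk(5).indices.squeeze(0)
print("most similar archived patches:", top5.tolist())
```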

Results: The performance of our model was evaluated using an unbalanced dataset provided by Grand River Hospital, which consists of 272 H&E slides spanning Hodgkin and non-Hodgkin lymphoma subtypes. The experiments showed promising results in recognizing different types of lymphoma from H&E slides. Furthermore, the results are comparable with the diagnostic accuracy of pathologists using both H&E and IHC stains. For instance, we achieved an accuracy of 71.07% using only H&E slides to predict B-cell lymphoma [see Figure 1], while the overall accuracy achieved by pathologists using both H&E and IHC stains was 70.17%. For Hodgkin lymphoma, we obtained 75.90% compared with 78.05% by pathologists.

Figure 1. Retrieval of images similar to a query image of an H&E slide with B-cell lymphoma

Conclusion: Our model helps pathologists take advantage of previously studied and analyzed lymphoma cases to better understand lymphoma patterns, using inexpensive H&E staining as part of the routine diagnostic process.

J Pathol Inform. 2021 Nov 1;12:44.

MuTILs: A Multiresolution Approach for Computational TILs Assessment using Clinical Guidelines

Mohamed Amgad 1, Lee A D Cooper 1

Background: Tumor-infiltrating lymphocytes (TILs) are an important diagnostic and predictive biomarker in solid tumors. Manual TILs assessment suffers from inter- and intra-rater variability, motivating the development of computational assessment tools. Most existing algorithms diverge from clinical recommendations, which call for focused assessment within intra-tumoral stroma and de-emphasis of TILs in distant stroma or hotspots. Previous work either relied on cellular classification alone or on naive combinations of independent region and nuclear models without enforcing biological compatibility.

Technology: MuTILs relies on jointly-trained U-Net deep learning models for segmentation of tissue regions at 10x and cell nuclei at 20x magnification. We rely on average pooling of the 10x tumor and stromal predictions to prioritize high-power fields for assessment. The 10x model is used to improve the 20x predictions by concatenating intermediate feature map representations, consistent with the HookNet architecture. Additionally, upsampled region predictions are used to impose biological constraints to improve training efficiency and predictive accuracy.
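One plausible way to impose such a region-nucleus compatibility constraint is sketched below: upsampled region-class predictions mask out nucleus classes that cannot occur in a given region. The class indices and compatibility table are illustrative assumptions, not the MuTILs implementation.

```python
# Illustrative sketch (not the MuTILs implementation): suppress nucleus-class
# scores that are biologically incompatible with the upsampled region prediction.
# Class indices and the compatibility table are assumptions for illustration.
import torch
import torch.nn.functional as F

REGIONS = {"tumor": 0, "stroma": 1, "other": 2}
NUCLEI = {"tumor_cell": 0, "fibroblast": 1, "til": 2}

# compatibility[region, nucleus] = 1 if that nucleus class may occur in that region
compatibility = torch.tensor([
    [1, 0, 1],   # tumor regions: tumor cells and TILs, no fibroblasts
    [0, 1, 1],   # stromal regions: fibroblasts and TILs, no tumor cells
    [1, 1, 1],   # other regions: unconstrained
], dtype=torch.float32)

region_logits_10x = torch.randn(1, 3, 64, 64)      # stand-in 10x region predictions
nucleus_logits_20x = torch.randn(1, 3, 128, 128)   # stand-in 20x nucleus predictions

# Upsample region predictions to 20x resolution and take the per-pixel region class.
region_up = F.interpolate(region_logits_10x, size=(128, 128), mode="bilinear", align_corners=False)
region_class = region_up.argmax(dim=1)             # (1, 128, 128)

# Build a per-pixel mask of allowed nucleus classes and apply it to the probabilities.
allowed = compatibility[region_class].permute(0, 3, 1, 2)   # (1, 3, 128, 128)
nucleus_probs = torch.softmax(nucleus_logits_20x, dim=1) * allowed
constrained_class = nucleus_probs.argmax(dim=1)
```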

Methods: We analyzed scans of FFPE H&E stained slides from 144 breast cancer patients from the TCGA dataset. Ground truth annotations were obtained from our public crowdsourcing datasets of tissue regions and nuclei (NuCLS). Segmented classes included tumor, stroma, TILs, normal acini, necrosis and others. 5-fold internal-external cross-validation was used to measure generalization performance.

Results: Figure 1 shows a sample hold-out set region prediction (C, D), as well as corresponding nuclear predictions before (E) and after imposition of the biological compatibility constraint (F). For reference, panel B shows the true region and nuclei classes. Tumor, stroma, and TILs are respectively colored violet, purple and blue. The biological constraint prevents misclassifications of large fibroblasts and activated lymphocytes as tumor cells. Pearson’s correlation between computational TILs scoring using the ground truth and model predictions is high, both when global scoring is done using the 10x predictions (R=0.761, p<0.001) as well as when focused scoring is done using 20x predictions (R=0.825, p<0.001).

Figure 1. Sample hold-out set region prediction (C, D), as well as corresponding nuclear predictions before (E) and after imposition of the biological compatibility constraint (F)

Conclusions: MuTILs enables highly accurate computational assessment of TILs in breast cancer in a fashion that is consistent with clinical scoring recommendations, paving the way for improved prognostication and therapeutic targeting.

J Pathol Inform. 2021 Nov 1;12:44.

Implementation of a Referral-Test Management System Using REDCap

Regina Kwon 1, Kevin Martin 1, Lori Sokoll 1, Claire Knezevic 1

Background: Each year, our laboratory Customer Service department handles more than 1,600 requests for non-formulary tests. Though they represent only a small fraction of the tests we perform, referral tests consume significant resources due to a paper-based management process. Tracking efforts for quality assurance and billing have, over time, resulted in duplicate (or even triplicate) data entry, multiple scanning steps, and large paper repositories. We describe here a project to design, test, and launch a referral-test management system using REDCap, a platform better known for collecting research data.

Technology: REDCap (Research Electronic Data Capture, v10.6.5, Nashville, Tenn.), Report Scheduler (Luke Stevens, Victoria, Australia).

Design: We based the functional requirements on an analysis of the workflow, from initial call receipt and pathologist review to sample send-out, result tracking, and periodic reporting. We identified five direct user roles (ordering provider, Customer Service representative, clinical lab scientist, resident/fellow, department administrator) and three indirect (attending, QA/QI staff, and Billing staff), each with different access and reporting needs. An iterative development process followed, driven by end-user interviews and component testing. The new system ultimately comprised one survey and two data-entry forms. The original data and several standalone repositories were cleaned and either loaded into the system or redistributed. The REDCap Alerts & Notifications module was used to generate e-mails and pages on completion of each of the workflow stages, and an external module was used to produce a scheduled report for billing. Simple dashboards were created to track requests in real time.

Results: The project was proposed in December 2020, and design and development began in January. Testing and training were completed in February, and the new system launched in early March 2021. It presently supports more than fifty users and handles all referral tests for five laboratories at a rate of five to seven requests per day.

Conclusions: Our unique implementation of REDCap demonstrates the usefulness of this secure platform as an adjunct to the laboratory information system. This system will help us capture the full referral-test workload, allowing us to evaluate test utilization, plan information system updates, and improve billing tasks.

J Pathol Inform. 2021 Nov 1;12:44.

Towards Digital Pathology: Monitoring Color Calibration in Monitors

Jerry J Lou 1,2, Sherwin Kuo 2,3, William H Yong 1,2, Ryan O’Connell 1,2

Background: Digital pathology relies heavily on computer monitors. Monitors present imaged slides to the pathologist for review but shift colors in the process. Variation in monitor chromaticity can alter the original colors of scanned slides, potentially influencing interpretation. We examine the hypothesis, which has not been previously investigated, that monitor chromaticity in a standard pathology practice varies sufficiently to alter the visible colors of digital slides.

Methods: Chromaticity coordinates on the 1931 CIE XY chromaticity space provide a simple method for measuring variation in the color output of monitors. The XY coordinate of blue is approximately (0.174, 0.010); red is (0.736, 0.261); green (0.114, 0.824); and white (0.3127, 0.329). Colorimetry was measured for a total of 7 monitors, including 2 attending desktops, 2 resident desktops, 1 personal laptop, and 2 public desktops. The XY chromaticity coordinates were obtained for each monitor, and variation in color was measured by the standard deviations of the x and y coordinates, respectively.
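The variation statistics reported below can be computed directly from the measured coordinates; in the short sketch that follows, the xy values are placeholders rather than the study's measurements, and the ~0.003 figure is the MacAdam just-noticeable threshold cited in the Results.

```python
# Sketch: summarize measured CIE 1931 xy chromaticity coordinates of monitors.
# The coordinate values below are placeholders, not the study's measurements.
import numpy as np

xy = np.array([
    [0.313, 0.329], [0.309, 0.325], [0.316, 0.331], [0.305, 0.322],
    [0.318, 0.335], [0.311, 0.327], [0.320, 0.338],
])  # one (x, y) white-point measurement per monitor

sd = xy.std(axis=0, ddof=1)             # sample standard deviation of x and y
rng = xy.max(axis=0) - xy.min(axis=0)   # range of x and y

print(f"SD  x={sd[0]:.5f}  y={sd[1]:.5f}")
print(f"Rng x={rng[0]:.5f}  y={rng[1]:.5f}")
# A deviation of roughly 0.003 near white is grossly noticeable (MacAdam ellipses),
# so SDs/ranges above that suggest visible color differences between monitors.
print("exceeds ~0.003 threshold:", bool((sd > 0.003).any()))
```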

Results: Among 7 monitors in our pathology practice, the standard deviation of X chromaticity coordinates was 0.00662 and of Y chromaticity coordinates was 0.00844 [Figure 1]. The range of X coordinates was 0.0190 and of Y coordinates was 0.0278. According to the MacAdam chromaticity ellipses study, a deviation of approximately 0.003 near white light produces a grossly noticeable difference in color. Therefore, the standard deviation and range of XY chromaticity coordinates we measured produce a noticeable variation in the red-green-blue color space.

Figure 1. Chromaticity coordinates of 7 monitors

Conclusions: Monitor colorimetry varies in a standard pathology practice. This variation is sufficient to alter the visible colors of digital slides. A move towards digital pathology necessitates standardization of monitor colorimetry to ensure the uniform appearance of digital slides across pathology practices. Similar to other instruments in the clinical pathology setting, monitors should be considered diagnostic instruments to be validated and maintained.

J Pathol Inform. 2021 Nov 1;12:44.

Data Analysis in Pathology: A Training Approach Geared to Pathology Residents

Nalan Yurtsever 1, Yonah Ziemba 1

Background: Data analysis is a valuable skill in Pathology. The formulas, pivot tables, and visualizations supported in Microsoft Excel are beneficial to all data-driven fields, and some functions are particularly useful for Pathology data, such as lookups to match a biopsy to the same patient’s resection and text functions to separate items in a synoptic summary into structured columns. Excel training courses are usually designed for business, and training for pathology applications is limited. Herein, we present an approach that was successfully employed at our institution.

Methods: Three workshops focused on project-based learning were held on Microsoft Teams. Two residents designed the course and functioned as instructors. Participants received datasets and goals prior to each session and were encouraged to perform each technique as they watched. Each workshop was structured around a capstone project that analyzed real data for a research or QA purpose; relevant techniques were introduced individually prior to the capstone by an instructor and then replicated by a participant who volunteered to share their screen with the audience. The first capstone project involved workflow analysis and division of workload in a pathology department, and the WEEKDAY function was used to compare workload across days of the week. The second capstone involved separating items in free-text synoptic summaries into separate columns using the MID and SEARCH functions. The last capstone used the LOOKUP function to compare biopsy diagnoses to resections. After capstone datasets were prepared, their underlying trends were studied using pivot tables and graphs.

Results: Three workshops were enough to cover the basics of formulas, pivot tables and graphs as applied to Pathology. Participants appreciated the sessions, volunteered to demonstrate exercises, and were able to handle them without much difficulty. Some trainees were involved in independent research projects in which they applied newly learned techniques and were encouraged to share their implementation with the group.

Conclusions: Data analysis in Pathology involves a unique set of techniques, and the basics can be learned in a few sessions. Project-based learning is an effective model, and a remote-meeting platform that allows each participant to access their computer during the meeting is ideal.

J Pathol Inform. 2021 Nov 1;12:44.

Application of Whole Tissue Imaging by Micro-Computed Tomography for the Evaluation of Endoscopic Submucosal Dissection Specimens

Hirotsugu Sakamoto 1,2, Makoto Nishimura 3, Alexei Teplov 1, Galen Leung 3, Peter Ntiamoah 1, Canan Firat 1, Takashi Ohnishi 1, Emine Cesmecioglu 1, Md Shakhawat Hossain 1, Ibrahim Kareem 1, Jinru Shia 1, Yukako Yagi 1

Background: The precise pathological diagnosis of endoscopic submucosal dissection (ESD) specimens is essential in determining subsequent therapy. However, the current pathological diagnosis involves the evaluation of two-dimensional images of cross-sections of resected specimens, which only evaluates a small part of the tumor. Micro-computed tomography (micro-CT) can non-destructively provide three-dimensional reconstructed whole specimen imaging. The aim of this study was to clarify whether micro-CT was able to provide sufficient pathological information in the evaluation of ESD specimens.

Methods: We scanned fresh or formalin-fixed ESD specimens for 10 to 15 minutes using a custom-made micro-CT scanner (Nikon Metrology NV, Leuven, Belgium) after staining them with 10% iodine for 60 to 180 seconds. All paraffin blocks were also subjected to micro-CT scanning after hematoxylin-and-eosin-stained slides had been prepared. Reconstructed imaging data were visualized and analyzed using Dragonfly (Object Research Systems Inc, Montreal, Quebec, Canada). We evaluated the extent of the lesion and the presence of the lesion at the resection margin by correlating the reconstructed images obtained from fresh or formalin-fixed specimens with those from whole block imaging (WBI) and whole slide imaging (WSI).

Results: A total of 9 ESD specimens [1 gastric intramucosal cancer, 3 colorectal cancers (1 intramucosal, 2 submucosal), and 5 colorectal adenomas] were scanned by micro-CT. The matching cross-section slices between the WSI, the fresh specimen, and the WBI are shown in Figure 1. All reconstructed micro-CT images allowed for clear visualization of tissue structure, differentiation between tumor and non-tumor tissue, and detection of the lesion at the resection margin, with the same findings observed on the WSI. However, the micro-CT image of the fresh specimen failed to detect the site of submucosal invasion in one case of submucosal-invasive cancer. The WBIs were able to detect the extent of submucosal invasion in more detail, and in three dimensions, compared with WSI.

Figure 1. The matching cross-section slices between the WSI, the fresh specimen, and the WBI

Conclusions: Our results suggest that a combination of whole tissue imaging by micro-CT and conventional histology could provide a more accurate diagnosis of ESD specimens. For clinical application, it is desirable to improve the visibility of fresh and formalin-fixed specimens.

J Pathol Inform. 2021 Nov 1;12:44.

Deep Learning in Automated Breast Cancer Diagnosis by Learning the Breast Histology from Microscopy Images

Qiangqiang Gu 1, Naresh Prodduturi 1, Steven N Hart 1, Chady Meroueh 2, Thomas J Flotte 2

Background: Breast cancer is one of the most common cancers in women. With early diagnosis, some breast cancers are highly curable. However, the concordance rate of breast cancer diagnosis from histology slides by pathologists is unacceptably low. Classifying normal versus tumor breast tissues from breast histology microscopy images is an ideal case to use for deep learning and could help to more reproducibly diagnose breast cancer.

Methods: Using 42 combinations of deep learning models, image data preprocessing techniques, and hyperparameter configurations, we tested the accuracy of tumor versus normal classification using the BreAst Cancer Histology (BACH) dataset. This approach included two steps. We first tested the patch-level validation accuracy of tumor versus normal classification for 16 combinations of nonspecialized deep learning models, image data preprocessing techniques, and hyperparameter configurations, and chose the models with the highest patch-level validation accuracy. Then we computed the slide-level validation accuracy of the selected models and compared them with 26 hyperparameter sets of a pathology-specific attention-based multiple-instance learning model.
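For illustration, one common way to derive a slide-level call from patch-level predictions is to pool patch probabilities per slide; the sketch below uses mean pooling with a 0.5 threshold, which is an illustrative rule rather than necessarily the aggregation used in this study.

```python
# Illustrative sketch: aggregate patch-level tumor probabilities into a
# slide-level prediction by mean pooling. This is one common rule, not
# necessarily the aggregation used in the study.
import pandas as pd

patch_preds = pd.DataFrame({
    "slide_id": ["s1"] * 4 + ["s2"] * 4,                      # stand-in patch records
    "p_tumor": [0.9, 0.8, 0.7, 0.95, 0.1, 0.2, 0.35, 0.05],   # patch-level probabilities
})

slide_scores = patch_preds.groupby("slide_id")["p_tumor"].mean()
slide_calls = (slide_scores >= 0.5).map({True: "tumor", False: "normal"})
print(slide_calls)
```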

Results: Two generic models (One-Shot Learning and the DenseNet201 with highly tuned parameters) achieved 94% slide-level validation accuracy compared to only 88% for the pathology-specific model.

Conclusions: The combination of image data preprocessing and hyperparameter configuration has a direct impact on the performance of deep neural networks for image classification. To identify a well-performing deep neural network to classify tumor versus normal breast histology, researchers should not focus only on developing new models specifically for digital pathology, since hyperparameter tuning of existing deep neural networks from the computer vision field can also achieve high prediction accuracy.

J Pathol Inform. 2021 Nov 1;12:44.

Blood Culture Positivity Trend Monitoring in COVID-19 Patients

Parsa Hodjat 1, S Wesley Long 1, Randall J Olsen 1, Paul Christensen 1

Background: Clinicians reported a perceived increase in blood culture positivity among COVID-19 patients in our 8-hospital system. We asked: 1) is the blood culture positivity rate truly increased in COVID-19 compared with non-COVID-19 patients; 2) if so, what are the causative organisms; and 3) how does this information guide the institutional response? We used a well-established data query infrastructure to answer these key questions bearing on patient care.

Methods: Since the beginning of the COVID-19 pandemic, we have implemented an automated data query system using SQL queries, Python scripting and cron scheduling to create COVID-19 dashboards. We leveraged our direct access to the Laboratory Information System (LIS) relational database to query all blood culture results from March 1, 2020 to December 31, 2020. We stratified blood culture positivity rates by COVID-19 or non-COVID-19 patient categories. Excel Power Query and pivot tools were used for fast and efficient data analysis including table joins, data cleaning and data visualization.
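As a sketch of the analysis step (column names and organisms are hypothetical), blood culture positivity can be stratified by COVID-19 status with pandas once the LIS query results are loaded:

```python
# Sketch of the stratification step after the LIS query: compute blood culture
# positivity rates by COVID-19 status. Column names and values are hypothetical.
import pandas as pd

cultures = pd.DataFrame({
    "covid_status": ["COVID-19", "COVID-19", "non-COVID-19", "non-COVID-19", "non-COVID-19"],
    "result": ["positive", "negative", "positive", "negative", "negative"],
    "organism": ["CoNS", None, "E. coli", None, None],
})

cultures["is_positive"] = cultures["result"].eq("positive")
positivity = cultures.groupby("covid_status")["is_positive"].mean().mul(100).round(2)
print(positivity)   # percent positive per patient category

# Among positives, tally likely contaminants such as skin commensals.
contaminants = {"CoNS", "Cutibacterium acnes", "Bacillus species"}
pos = cultures[cultures["is_positive"]]
print(pos.groupby("covid_status")["organism"].apply(lambda s: s.isin(contaminants).mean()))
```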

Results: Compared to non-COVID-19 patients, the overall blood culture positivity rate was lower in COVID-19 patients (4.61% compared to 7.36%); however, the absolute number of positive cultures increased from 770 in March 2020 to 962 and 933 in the peak COVID-19 hospitalization months of July 2020 and January 2021, respectively. We discovered that positive cultures in COVID-19 patients were disproportionately caused by contaminants, including skin commensal organisms (41.69% compared to 26.62%). The clinical informatics process took approximately 3 hours to design and run the SQL query and 2 hours to process with Excel Power Query and pivot tools [see Figure 1].

Figure 1. Direct LIS database access, and Excel Power Query and pivot tools (a), used for fast and efficient data acquisition and analysis, showing (b) an overall lower blood culture positivity rate in COVID-19 vs. non-COVID-19 patients, (c) increases in the numbers of positive blood cultures in the peak COVID-19 hospitalization months, and (d) a higher rate of contaminants in the positive cultures of the COVID-19 group

Conclusions: Despite an overall lower rate of blood culture positivity in COVID-19 patients, a disproportionate number of positive cultures were caused by commensal skin organisms. We hypothesized that these findings may be due to high rates of immune suppressive therapies such as corticosteroids and cytokine inhibitors, other immune abnormalities, or inadequate isolation procedures and central line maintenance. In response, our institution renewed efforts to emphasize infection control measures. The clinical informatics team’s direct access to the LIS database played a pivotal role in data acquisition, data analysis and patient care.

J Pathol Inform. 2021 Nov 1;12:44.

Fast and Efficient Data Utilization: The Benefits of Physician Access to Health and Laboratory System Databases

Parsa Hodjat 1, S Wesley Long 1, Paul Christensen 1

Background: Massive volumes of data are recorded in electronic health record (EHR) and laboratory information systems (LIS) that can be effectively used to improve patient management. Unnecessary roadblocks in data acquisition can delay providing actionable information in response to questions raised by clinicians and hospital leadership. During the COVID-19 pandemic, we leveraged the direct LIS database access of our MD informaticists to provide quick and accurate answers, allowing us to keep up with the fast-paced evolution of institutional strategic planning [see Figure 1]. Three examples of such inquiries were:

Figure 1. Fast and efficient report generation and data analysis with the clinical informaticist's direct access to the LIS database

  • Possible increased rate of blood culture positivity in COVID-19 patients

  • Change in incidence of rhinovirus and endemic coronaviruses during the COVID-19 pandemic

  • Perceived increase in HSV-1 positivity in respiratory specimens of hospitalized patients

Methods: We created ad-hoc SQL queries to retrieve the necessary data elements from our LIS relational database (Oracle® DBMS). When necessary, such as for COVID-19 PCR test results, these queries were embedded in a Python script and added to a Linux cron schedule for automated updates. The query results were saved as comma-separated values (.csv) or Excel (.xlsx) files. Excel Power Query was used to join tables and process the data. Excel Pivot Tables and Pivot Charts were used to analyze the data.
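The automation pattern can be sketched as follows; the connection details, SQL, and file paths are placeholders, and the python-oracledb driver is assumed as one way to reach an Oracle LIS database.

```python
# Sketch of the automated query pattern: pull results from the Oracle LIS
# database and save them for Excel Power Query. Connection details, the SQL,
# and file paths are placeholders.
import oracledb
import pandas as pd

QUERY = "SELECT order_id, test_code, result_value, result_dt FROM lab_results WHERE test_code = :code"

def export_results(test_code: str, out_path: str) -> None:
    with oracledb.connect(user="lis_reader", password="***", dsn="lisdb_host/LISPDB") as conn:
        df = pd.read_sql(QUERY, conn, params={"code": test_code})
    df.to_csv(out_path, index=False)

if __name__ == "__main__":
    export_results("COVIDPCR", "/data/dashboards/covid_pcr_results.csv")

# Example crontab entry for an hourly automated refresh:
#   0 * * * * /usr/bin/python3 /opt/lis_reports/export_results.py
```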

Results: In all instances, the time spent on designing and executing the SQL queries was less than 5 hours. The data processing (configuring table joins, cleaning the data) and analysis (creating charts and summary tables) took 2.5 hours on average. In each of the instances we had the complete analysis within 8 hours.

Conclusions: Our experience with direct access to the LIS database, with appropriate data governance, shows how physician training and appropriate access shorten the time necessary to acquire actionable information. When relying on others for a data pull, there is a risk of receiving suboptimal data in the initial queries and a need for multiple course corrections to achieve the best results, which further delays receiving the necessary information. We have found Excel Power Query and Excel Pivot Tables to be optimal tools for rapid, intermediate-level data processing, data exploration, and data visualization.

J Pathol Inform. 2021 Nov 1;12:44.

Automated Interpretation of Serum Protein Electrophoresis

Andrew E O Hughes 1, Christopher W Farnsworth 1, Ann M Gronowski 1

Background: Serum protein electrophoresis (SPE) is a common laboratory test that plays a critical role in diagnosing and monitoring patients with clonal plasma cell disorders. SPE data consist of one-dimensional traces with characteristic peaks corresponding to defined populations of proteins. Interpreting these profiles requires manual inspection to differentiate normal versus abnormal patterns, which is often labor-intensive and subjective. To address this, we developed a machine learning model to predict the diagnostic labels assigned to SPE traces from clinical samples.

Methods: 6,737 traces from SPE performed at Barnes Jewish Hospital on a Sebia Capillarys 3 were used for model development. The seven target labels were: no apparent monoclonal peak, abnormal alpha-2, abnormal beta-1, abnormal/possible abnormal beta-2, and abnormal/possible abnormal gamma. For each trace, candidate peaks were identified, and 107 morphological features were extracted. Samples were split 80/20 into training/testing sets, and extracted features were used to train the following models on binary (normal vs. abnormal) and multiclass (specific label) classification tasks: k-nearest neighbors, penalized logistic regression, random forest, and gradient boosting machine. Hyperparameters were tuned using repeated cross-validation or Bayesian optimization. Area under the receiver operating characteristic curve (AUC-ROC) and precision recall curve (AUC-PR) were calculated on the test set. Data processing and modeling were implemented in R (v4.0.3) using tidymodels (v0.1.2).
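The original pipeline was implemented in R with tidymodels; purely as an illustration of the same idea in Python, the sketch below detects candidate peaks on a simulated one-dimensional trace, derives a few simple morphological features, and fits a binary normal-versus-abnormal classifier. The traces, features, and labels are stand-ins, not the study's 107 features or data.

```python
# Illustration only: the study used R/tidymodels. This Python sketch mirrors the
# idea of extracting peak-based morphological features from 1D SPE traces and
# training a binary classifier. Traces, features, and labels are stand-ins.
import numpy as np
from scipy.signal import find_peaks, peak_widths
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

def simulate_trace(abnormal: bool) -> np.ndarray:
    x = np.linspace(0, 1, 300)
    trace = np.exp(-((x - 0.1) / 0.03) ** 2)               # albumin-like peak
    if abnormal:
        trace += 0.6 * np.exp(-((x - 0.8) / 0.01) ** 2)    # sharp gamma-region spike
    return trace + 0.02 * rng.standard_normal(x.size)

def features(trace: np.ndarray) -> list:
    peaks, props = find_peaks(trace, height=0.1, prominence=0.05)
    widths = peak_widths(trace, peaks)[0] if len(peaks) else np.array([0.0])
    return [len(peaks), props["prominences"].max() if len(peaks) else 0.0, widths.min()]

labels = rng.integers(0, 2, size=200)
X = np.array([features(simulate_trace(bool(y))) for y in labels])
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.2, random_state=0)

clf = LogisticRegression().fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```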

Results: The best binary classification was obtained with logistic regression, with an AUC-ROC of 0.985 and AUC-PR of 0.993 [Table 1]. At fixed sensitivities of 0.90, 0.95, or 0.99, the corresponding specificities of logistic regression were 0.97, 0.92, or 0.73, respectively. The best multiclass classification was obtained with the gradient boosting machine, with an AUC-ROC of 0.978 and AUC-PR of 0.895. Classification errors were predominantly traces with possible abnormal peaks that were predicted as normal (or vice versa).

Table 1.

Binary classification with area under the curve receiver operating characteristic of 0.985 and area under the curve-precision recall of 0.993

Classification Model AUC-ROC AUC-PR Log loss
Binary K-nearest neighbors 0.948 0.976 0.333
Penalized logistic regression 0.985 0.993 0.152
Random forest 0.981 0.991 0.170
Gradient boosting machine 0.985 0.992 0.154
Multiclass K-nearest neighbors 0.938 0.799 0.823
Penalized logistic regression 0.972 0.847 0.381
Random forest 0.974 0.867 0.384
Gradient boosting machine 0.978 0.895 0.314

AUC-ROC: Area under the curve receiver operating characteristic, AUC-PR: Area under the curve-precision recall

Conclusions: The features used by laboratorians to interpret SPE traces can be readily extracted and used by standard machine learning models to accurately label SPE traces. In practice, these tools have the potential to reduce the time required for manual review and standardize interpretations across reviewers and laboratories. In addition, this approach is applicable to one-dimensional data produced by different instruments or for other applications.

J Pathol Inform. 2021 Nov 1;12:44.

Three-Dimensional Blood Vessel Structure on Whole-Block Image: Benign vs. Malignant

Takashi Ohnishi 1, Alexei Teplov 1, Hirotsugu Sakamoto 1,2, Emine Cesmecioglu 1, Noboru Kawata 1,3, Kareem Ibrahim 1, Peter Ntiamoah 1, Canan Firat 1, Meera Hameed 1, Jinru Shia 1, Yukako Yagi 1

Background: We have developed a deep learning-based blood vessel segmentation method for whole-block images (WBIs) acquired by micro-computed tomography (micro-CT). It allows us to analyze the three-dimensional (3D) characteristics of blood vessels in a noninvasive manner. In this study, we applied the proposed method to WBIs of colorectal tissue and visually compared the 3D structure of the blood vessels between benign and malignant cases as a feasibility study.

Methods: 70 formalin-fixed paraffin-embedded colorectal tissues were scanned with a custom-built micro-CT scanner (Nikon Metrology NV, Leuven, Belgium). We implemented the blood vessel segmentation method using deep learning with the PyTorch API (Facebook, Menlo Park, CA). A V-Net was used as the basic network structure, and the convolution block was modified to a residual-inception module. The implementation environment was as follows: Intel Core i7-5960X with 8 cores, 128GB RAM, and a GeForce RTX 2080 Ti with 4352 cores and 11GB VRAM (Nvidia, Santa Clara, CA). To compare the 3D structure of blood vessels, we selected 5 WBIs each from benign and malignant cases.
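A hedged sketch of the architectural modification mentioned above follows: a 3D convolution block that combines inception-style parallel branches with a residual connection, of the kind that could replace a V-Net convolution block. Channel counts and kernel sizes are assumptions, not the authors' exact design.

```python
# Illustrative sketch of a residual-inception 3D convolution block of the kind
# that could replace a V-Net convolution block. Channel counts and kernel sizes
# are assumptions, not the authors' exact design.
import torch
import torch.nn as nn

class ResidualInception3d(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        branch_ch = channels // 2
        self.branch1 = nn.Sequential(                      # 1x1x1 branch
            nn.Conv3d(channels, branch_ch, kernel_size=1), nn.ReLU(inplace=True))
        self.branch3 = nn.Sequential(                      # 3x3x3 branch
            nn.Conv3d(channels, branch_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True))
        self.fuse = nn.Conv3d(2 * branch_ch, channels, kernel_size=1)

    def forward(self, x):
        y = torch.cat([self.branch1(x), self.branch3(x)], dim=1)
        return torch.relu(self.fuse(y) + x)                # residual connection

block = ResidualInception3d(32)
print(block(torch.rand(1, 32, 16, 64, 64)).shape)          # torch.Size([1, 32, 16, 64, 64])
```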

Results: Figure 1 shows examples of the WBI with segmented blood vessels. The shape of the segmented blood vessels was smooth, and we confirmed our method could clearly segment and visualize thin blood vessels as well as thick ones. The blood vessels were more densely distributed in the malignant case than in the benign case. As in Figure 1, we observed many blood vessels in other malignant cases.

Figure 1. Examples of the WBI with segmented blood vessels

Conclusions: We applied the blood vessel segmentation method to whole-block images (WBIs) acquired by micro-computed tomography. Our method could segment blood vessels from WBIs, and the three-dimensional structure and distribution of blood vessels could be visually compared between benign and malignant cases. The results showed that the density of blood vessels in malignant cases tends to be increased. In the future, we will develop additional automatic analysis methods to quantitatively measure and compare the density or number of blood vessels.

J Pathol Inform. 2021 Nov 1;12:44.

Federated Learning for Digital Pathology: Training Algorithms without Accessing Patient Data to Protect Patient Privacy

Fahime Sheikhzadeh 1, Ishan Shah 1, Srividya Doss 1, Yao Nie 1

Introduction: Herein, we explore the development of a Federated Learning (FL) system for Digital Pathology (DP) applications. Federated learning, a decentralized training approach, enables different hospitals to collaboratively train a machine learning model while keeping their patient data local. In this approach, multiple client devices collaborate and train a deep learning model (the global model) without sharing their training data. A server maintains and updates the global model in an iterative process. At each iteration, each client receives and retrains the global model on its local data and sends it back to the server. The models from all clients are aggregated at the server site, and the aggregated model is then sent back to the clients. The iterations continue until training converges.
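The server-side aggregation in this loop is typically a weighted average of client model parameters (federated averaging); the sketch below shows that step in PyTorch under the assumption that clients return full state dictionaries, with tiny linear models standing in for the U-Nets.

```python
# Minimal sketch of the server-side aggregation step (federated averaging):
# average client model parameters, weighted by local dataset size. Assumes
# clients return full PyTorch state dictionaries.
import copy
import torch
import torch.nn as nn

def federated_average(client_states, client_sizes):
    """Weighted average of client state_dicts -> new global state_dict."""
    total = float(sum(client_sizes))
    global_state = copy.deepcopy(client_states[0])
    for key in global_state:
        global_state[key] = sum(
            state[key].float() * (n / total)
            for state, n in zip(client_states, client_sizes)
        )
    return global_state

# Example with tiny stand-in models:
clients = [nn.Linear(4, 2) for _ in range(3)]          # stand-ins for locally trained U-Nets
new_global = federated_average([c.state_dict() for c in clients], client_sizes=[100, 80, 41])
global_model = nn.Linear(4, 2)
global_model.load_state_dict(new_global)               # broadcast back to clients next round
```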

Methods: In this study, we developed a U-Net based deep learning model for tumor, stroma, and other tissue type segmentation on H&E images. In total, 221 images of approximately 1024 by 1024 pixels at 10X magnification were extracted from 77 whole slide images and used in our experiments. A fixed dataset at the server was used for validation. Models generated at the client site were not controlled by the server and were shared with the server only in accordance with the client's privacy policy. Each client had an independent validation dataset and used it to examine the performance of the model against their quality standards; based on this validation, the client decided whether to deploy the global model. We performed three experiments: training the model with three clients with independent and identically distributed (IID) data, three clients with non-IID data, and six clients with partial participation in the FL process.

Results: We successfully trained the segmentation model in a federated manner, without accessing the client data. In our experiments, the global model outperformed, or performed comparably to, the model trained on centralized data.

Conclusion: Establishing an FL system enables us to capitalize on real-world data that must stay at hospitals or laboratories and to train a global model suitable for all users, supporting better diagnostic decisions.

J Pathol Inform. 2021 Nov 1;12:44.

Customizable Online Platform for Remote Asynchronous Medical Education

Matthew Anderson 1, Pouya Jamshidi 1, Thanh T Ha Lan 1, Thomas A Victor 1, Thomas J Gniadek 1

Background: The COVID-19 pandemic created an immediate need for remote learning in medical education. Online question banks are available; however, their content is static and not customized to each learner or instructor. Technology: The website (http://www.pathqbank.org) is deployed on an InMotion Hosting (Virginia Beach, VA) shared server and utilizes a LAMP stack. Yii Framework 2.0 was used as the backbone, which provides a model-view-controller style architecture. Development was done using cPanel applications, such as phpMyAdmin (The phpMyAdmin Project), included with the InMotion Hosting account.

Methods: A custom website was created using freely available software tools. User access control, image overlays, question and answer feedback, and performance analytics were added using custom programming. The user access portion of the website was developed to allow users to control who has access to their content while at the same time promoting an open access vision. Instructors can add overlays to images, which can be visible all the time or only when a specific answer is selected. The overlays are created using the Imagine PHP library and take the form of either an arrow or a circle. In addition to the displayed answer choice text, each answer can be linked with specific text data, which is displayed when that answer is selected; the intent is for this text to serve as an explanation. Instructors can access statistics on individual student performance and overall performance for each question.

Results: The platform successfully enabled the creation of asynchronous, customized image and problem-based learning activities as well as tracking of trainee performance over time and across keyword-defined topic areas. In addition to student and question specific performance analysis, the keyword search function allows instructors to search for question stems, correct answers, and incorrect answers.

Conclusions: Using customizable problem-based learning such as the platform described here, medical educators can deploy asynchronous remote learning while preserving the individuality of the educational curriculum and tailoring the educational experience to their trainees.

J Pathol Inform. 2021 Nov 1;12:44.

High Throughput Truthing (HTT): Pathologist Agreement from a Pilot Study

Brandon D Gallas 1, Katherine Elfer 1, Mohamed Amgad 2, Weijie Chen 1, Sarah Dudgeon 3, Rajarsi Gupta 4, Matthew Hanna 5, Steven Hart 6, Richard Huang 7, Evangelos Hytopoulos 8, Denis Larsimont 9, Xiaoxian Li 10, Anant Madabhushi 11, Hetal Marble 6, Roberto Salgado 12,13, Joel Saltz 4, Manasi Sheth 14, Rajendra Singh 15, Evan Szu 16, Darick Tong 16, Si Wen 1, Bruce Werness 16

Background: Artificial intelligence algorithms in digital pathology have enormous potential to increase diagnostic speed and accuracy. However, the performance of these algorithms must be validated against a reference standard before deployment in clinical practice. In this work, pathologists are considered as the reference standard. We studied interobserver variability in pathologists who evaluate stromal tumor-infiltrating lymphocytes (sTILs) in hematoxylin and eosin stained breast cancer biopsy specimens. Our ultimate goal is to create a validation dataset that is fit for a regulatory purpose.

Methods: Following an IRB exempt determination protocol, we obtained informed consent of volunteer pathologist annotators prior to completing data collection tasks via two modalities: an optical light microscope and two digital platforms (slides were scanned at 40X). Pathologists were trained on the clinical task of sTIL density estimation before annotating pre-specified regions of interest (ROIs) across multiple platforms. The ROI selection protocol sampled ROIs in the tumor, tumor boundary, and elsewhere. Inter-pathologist agreement was characterized with the root mean-squared difference, which is analogous to the root mean-squared error but doesn’t require ground truth.
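Concretely, for each pair of pathologists the root mean-squared difference is the square root of the mean squared difference between their sTIL density estimates over the ROIs both scored; the sketch below computes the reader-averaged value on random stand-in data.

```python
# Sketch: pairwise root mean-squared difference (RMSD) between pathologists'
# sTIL density estimates, computed over ROIs scored by both readers.
# The estimates below are random stand-ins.
from itertools import combinations
import numpy as np

rng = np.random.default_rng(7)
# rows = pathologists, columns = ROIs, values = sTIL density estimates (%)
estimates = rng.uniform(0, 60, size=(5, 20))

def rmsd(a, b):
    return np.sqrt(np.mean((a - b) ** 2))

pairwise = [rmsd(estimates[i], estimates[j]) for i, j in combinations(range(5), 2)]
print("reader-averaged RMSD: %.1f%%" % np.mean(pairwise))
```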

Results: The pilot study accumulated 6,257 sTIL density estimates from 34 pathologists evaluating 64 cases, with 10 ROIs per case. The variability of sTIL density estimates in an ROI increases with the mean; the reader-averaged root mean-squared differences were 8.3%, 17.7%, and 40.4% as the sTIL density reference score increased from 0-10%, 10-40%, and 40-100%, respectively. We also found that the root mean-squared differences for some pathologists were considerably larger than others (as much as 120% larger than the next largest root mean-squared difference).

Conclusions: Slides, images, and annotations were successfully provided by volunteer collaborators and participants, which created an innovative and thorough method for data collection and truthing. This pilot study will inform the development of statistical methods, simulation models, and sizing analyses for pivotal studies. The development and results of this validation dataset and analysis tools will be made publicly available to serve as an instructive tool for algorithm developers and researchers. Furthermore, the methods used to analyze pathologist agreement between density estimates are applicable to other quantitative biomarkers.

J Pathol Inform. 2021 Nov 1;12:44.

Delays among Laboratory Results Delivered Via EHR Notification Messages

James A Mays 1, Jason M Baron 1, Anand Dighe 1

Background: Although clinical laboratories alert clinicians to “critical” test results for their patients via telephone or other rapid means, such critical callback procedures typically apply only to the most immediately life-threatening results. Results that are clinically time-sensitive but not critical are typically not called and only reported using electronic health record (EHR) functionality, risking delayed result review. In addition to posting the result in the patient’s electronic chart, our EHR sends clinicians e-mail-like “In basket” messages with new test results on their patients. To enable the development of an improved result communication process, we analytically assessed current reporting protocols using creatinine testing as an example.

Methods: We evaluated meta-data from 6,025 outpatient In basket messages reporting high creatinine results sent through our hospital’s EHR (Epic Systems, Verona WI). Meta-data included recipient, result value, creation timestamp, message interaction timestamp, and message status at the time of query. We excluded patients with a history of renal failure. We calculated the time from result release until results were seen by providers.
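A sketch of the timing calculation with pandas follows; the column names and timestamps are hypothetical stand-ins for the In basket meta-data fields.

```python
# Sketch: compute time from result release to provider review from In basket
# message meta-data. Column names and timestamps are hypothetical.
import pandas as pd

messages = pd.DataFrame({
    "provider": ["dr_a", "dr_a", "dr_b"],
    "result_released": pd.to_datetime(["2021-01-04 08:00", "2021-01-05 09:30", "2021-01-04 10:00"]),
    "first_viewed": pd.to_datetime(["2021-01-06 07:15", None, "2021-01-04 12:45"]),
})

messages["hours_to_review"] = (
    messages["first_viewed"] - messages["result_released"]
).dt.total_seconds() / 3600

print("median hours to review:", messages["hours_to_review"].median())
print("90th percentile (days):", messages["hours_to_review"].quantile(0.9) / 24)
print("unread or unread > 3 days (fraction):",
      (messages["first_viewed"].isna() | (messages["hours_to_review"] > 72)).mean())
```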

Results: Among the 6,025 abnormal creatinine results, there was wide variation in notification time. The median time from result reporting to review was 46.1 hours (Nephrology: 48.3 hours; Transplant: 28.7 hours; Other: 47.8 hours). By contrast, the 90th percentile was 20.0 days (Nephrology: 29.3 days; Transplant: 8.1 days; Other: 22.25 days). These findings were not limited to a subset of providers: when providers were evaluated individually, the median provider-level 90th percentile “time to notification” was 6.1 days. Notifications of abnormal results remaining unread > 3 days represented 2,429 of 6,025 (40.3%) messages.

Conclusions: There is wide variation in the time to complete delivery of non-critical creatinine results to ordering providers. Some of these results likely represent sudden changes in patient status that are not seen promptly by providers and may have clinical consequences. One important caveat is that, in some cases, providers may have seen the results by a method other than the In basket message. We are developing an EHR-based intervention to follow up on high-risk results not reviewed in a timely fashion and will present this strategy.

J Pathol Inform. 2021 Nov 1;12:44.

AI in Prostate Pathology – Is this Becoming Reality?

Liron Pantanowitz 1

Background: There is high demand to develop clinically useful computer-assisted diagnostic tools to evaluate prostate pathology specimens. Multiple studies have examined the performance of AI-based tools in the detection and grading of adenocarcinoma within prostate core needle biopsies. These studies have employed different training and testing approaches and divergent study designs, and they have yielded varying degrees of accuracy. Early adopters have begun providing feedback about such AI systems deployed in clinical practice. In order to become part of routine clinical practice, these deep learning algorithms need to be assessed in terms of their scope, accuracy, and clinical relevance. The aim of this presentation was therefore to assess and compare the characteristics, performance, and validation of multiple AI-based diagnostic tools in prostate pathology.

Methods: Different peer-reviewed studies were evaluated in terms of algorithm training sets, test sets, training methodology, validation approach, scope of tasks, study design, performance characteristics and clinical utility.

Results: There is large variation among AI-based approaches applied to prostate pathology algorithm development, particularly with respect to validation and experience in routine clinical practice. Prostate AI-based tools deployed to date have revealed discrepancies in current prostate biopsy diagnosis, which demonstrates their utility in augmenting pathologists in clinical practice settings.

Conclusions: Although AI-based tools for evaluating prostate biopsies need to be closely appraised on an individual level for their design and performance, this analysis demonstrated clinical utility and strong evidence to support that implementation of these AI systems in routine prostate diagnosis is becoming a reality and has the potential to improve patient care.

J Pathol Inform. 2021 Nov 1;12:44.

Accuracy of AI Whole Slide Image Analysis is Adversely Affected by Pre-Analytical Factors such as Stained Tissue Slide and Paraffin Block Age

Ruhani Sardana 1, Satoshi Hamasaki 2, David G Nohle 1,3, Leona W Ayers 1,3, Anil V Parwani 1,3

Background: Personalized medicine and accurate quantification of tumor and biomarker expression have become the cornerstone of cancer diagnostics. This requires Quality Control of research tissue samples to confirm that adequate target tumor tissue is present in the tissue sample.

Digitization of pathology-stained tissue samples makes it easier to archive, preserve, and retrieve slides and paraffin blocks for review or study when needed. Can pre-analytic and analytic factors, such as digital image reproducibility, different machine algorithms, and the tissue age and condition of hematoxylin and eosin (H&E)-stained tumor tissue, be mediated so that morphometric algorithms can quantify the percentage of tumor?

Methods: H&E slides with whole-slide images from the CHTN-MWD Image Quality Control Repository were utilized. Rapidly processed, consented research tissues had been fixed, stained, and scanned contemporaneously (within one month). Two cohorts of malignant colorectal cancer slides with 20X WSI (ScanScope XT, Leica Biosystems, Illinois) were assembled. The recent cohort had 76 images created in 2018 or later; the aged cohort had 73 images from specimens procured 5-8 years ago. Twenty recent adenocarcinoma whole-slide images were used to construct image analysis algorithms (VIS, Visiopharm A/S, Denmark) using machine learning to produce morphometric maps and calculate tissue and tumor areas. Tumor areas in the images from the aged cohort were grouped by year (2012-2014).

Results: Algorithmic analysis of 69 images from rescanning aged slides, compared with that of the contemporaneous images, found 18 (28%) had similar tumor areas (within 10%), 56 (82%) had similar tissue areas, and 54 (79%) had a similar percentage of tumor. Figure 1 shows examples: (left to right) H&E images and classification maps; (top to bottom) original contemporaneous scan, rescan, and recut; scale 1 mm.

Figure 1. Example (left to right) H&E images, classification maps; (top to bottom) original contemporaneous, rescan, recut; scale 1 mm

Conclusions: Images of aged H&E slides and stained paraffin block re-cuts produce different tumor quantification compared to the original scans likely due to pre-analytical factors. The difference in the tumor area detected between original and later rescanned images trended upward from 2012 to 2014. Less tumor area is detected as slides age. Recut and H&E stained tissues from stored paraffin blocks may detect more tumor due to excess eosinophilia.

J Pathol Inform. 2021 Nov 1;12:44.

Toward Placental Region Identification and Blood Vessel Classification using Machine Learning

Brian Vadasz 1

Background: The placenta is a key driver of diseases of pregnancy, and abnormalities in the placenta reflect life-long risk of disease in the mother and infant. Correct identification of placental regions (region ID) is foundational for diagnosis, since observations have differing significance based on their location. For example, thick-walled arterioles (TWA) in the decidua are evidence of pathologic failure to remodel maternal vessels and are characteristic of diseases such as preeclampsia. Conversely, TWA in fetal stem villi are the norm. We sought to develop models that use region ID to correctly differentiate normal from abnormal blood vessels.

Methods: We trained a region identification (ID) network using 200 whole-slide images of placental disc and membrane rolls with TensorFlow 2 and Keras. Placental regions, including terminal villi, stem villi, decidua, fibrin, amnion, and background, were rectangularly annotated and extracted, and 128x128 pixel regions were classified using a ResNet50-derived classifier. We compared a series of blood vessel segmentation and/or classification networks, including a MobileNetV2-derived U-Net, active learning based on 32x32 pixel micropatches, simple patch classification with a VGG16-derived classifier, and a combined region ID/blood vessel classification network. For whole-slide region identification, 160x160 pixel fields with 32-pixel overlaps were generated across whole slides; each field was classified by averaging the logits of four 128x128 pixel crops, as sketched below. IRB approval was obtained (STU00211333).
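The whole-slide inference scheme is sketched below; the four corner crops are one plausible choice of the four 128x128 sub-patches, and the classifier is a random stand-in for the ResNet50-derived model.

```python
# Illustrative sketch: classify overlapping 160x160 whole-slide fields by
# averaging the logits of four 128x128 crops (here the four corners, as one
# plausible choice). The classifier is a random stand-in for the real model.
import torch
import torch.nn as nn

num_classes = 7
classifier = nn.Sequential(nn.Flatten(), nn.Linear(3 * 128 * 128, num_classes))  # stand-in

def classify_field(field: torch.Tensor) -> int:
    """field: (3, 160, 160) tensor -> predicted region class."""
    crops = [
        field[:, :128, :128], field[:, :128, -128:],
        field[:, -128:, :128], field[:, -128:, -128:],
    ]
    logits = torch.stack([classifier(c.unsqueeze(0)) for c in crops]).mean(dim=0)
    return int(logits.argmax(dim=1))

def iter_fields(slide: torch.Tensor, size=160, stride=128):
    """Yield overlapping fields across a (3, H, W) slide (32-pixel overlap)."""
    _, h, w = slide.shape
    for y in range(0, h - size + 1, stride):
        for x in range(0, w - size + 1, stride):
            yield (y, x), slide[:, y:y + size, x:x + size]

slide = torch.rand(3, 480, 480)                 # stand-in whole-slide region
preds = {pos: classify_field(f) for pos, f in iter_fields(slide)}
```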

Results: Region ID showed a categorical accuracy of 0.98 in a balanced test set, with accuracy ranging from 1 for terminal villi and glass to 0.96 for fibrin. Whole-slide image classification shows plausible classification from non-bespoke fields (Figure 1, left panels). Among vessel identification networks, a U-Net showed reasonable segmentation of blood vessels from background. For combined detection/classification, it showed high nominal accuracy (>0.95), driven by the large number of background pixels, while actual detection was poor. Micropatch active learning was unsuccessful. Simple classification showed an accuracy of 0.86. Integrating vessel classification with region ID gave an overall accuracy of 0.9, with accuracies of 0.68 and 0.66 for normal vessels and TWA, respectively (Figure 1, right panels). Normal vessels were liable to misclassification as decidua (12%), which is reasonable given that this is their normal location, or as TWA (10%). TWA were misclassified as stem villi (10%), exactly the problem we are trying to avoid, or as normal vessels (16%).

Figure 1. Whole-slide image classification shows plausible classification from non-bespoke fields (left panels). Integrating vessel classification with region ID gave an overall accuracy of 0.9 with accuracies of 0.68 and 0.66 for normal vessels and TWA, respectively (right panels)

Conclusion: These results demonstrate that placental microanatomy is readily machine-recognizable. We report findings from several strategies for vessel recognition and classification, using region ID as a basis.

J Pathol Inform. 2021 Nov 1;12:44.

EMR Retrieved-transfusion Records Analysis and Blood Utilization Management Optimization: The AUBMC Experience

Omar Z Baba 1, Rami Mahfouz 1

Background: The American University of Beirut Medical Center (AUBMC) is the largest tertiary care center in Lebanon, with a blood transfusion service that supplies around 14,000 units of blood products per year. A Blood Utilization Committee surveils blood transfusion practice, including transfusion indications, release, hand-off, documentation, and reporting of transfusion reactions, through monthly audit reports. With the implementation of the Epic EHR system (Epic Systems, Verona, WI) in 2018, new prospects arose for robust data retrieval, ensuring a higher level of surveillance and abiding by the minimum sampling size set by Joint Commission International standards (5-10% of population size >1000).

Methods: Using Cogito Data Warehouse, Epic’s enterprise intelligence suite, a Crystal Report was engineered to compute the percentage of compliance with documentation standards for blood components transfused in different clinical services. The reported indicators include records of (1) vital signs pre-, during, and post-transfusion, (2) occurrence of transfusion reactions, and (3) blood transfusion informed consents. A separately generated Excel sheet displays de-identified, stratified raw data with comprehensive details, ensures accountability by listing the users involved, and allows auditors to verify clinical validity by cross-checking transfusion indications against pre-transfusion laboratory results.
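
As an illustration only (not the actual Cogito/Crystal Report logic), per-service compliance percentages of this kind could be computed from a de-identified export with a few lines of pandas; all file and column names below are hypothetical.

```python
# Illustrative sketch: per-service documentation compliance from a de-identified
# transfusion export. Column names are assumptions, not the AUBMC schema.
import pandas as pd

df = pd.read_excel("transfusion_audit_export.xlsx")  # hypothetical export

# A unit counts as "vitals compliant" only if all three checkpoints were documented.
df["vitals_compliant"] = df[["vitals_pre", "vitals_during", "vitals_post"]].all(axis=1)

compliance = (
    df.groupby("clinical_service")[["vitals_compliant", "reaction_documented", "consent_on_file"]]
      .mean()                                    # fraction of compliant units per service
      .mul(100)                                  # convert to percentages
      .round(1)
      .rename(columns=lambda c: f"{c}_pct")
)
print(compliance)
```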

Results: Data retrieved from November 2018 to March 2019 revealed non-compliance percentages of 75% for vital signs documentation, 75% for transfusion reaction documentation, and 92% for informed consents. Subsequent circulation of educational tip-sheets through administrative and nursing channels, implementation of mechanical hard-stops in the transfusion administration interface, and improved data extraction algorithms led to improved non-compliance percentages over the subsequent 5-month period (April-August 2019): 65% for vital signs documentation, 19% for transfusion reaction documentation, and 36% for informed consents.

Conclusion: Comprehensive retrieval of electronic blood transfusion records provides Blood Utilization Committees with improved insight into patterns of non-compliance with acknowledged transfusion guidelines and documentation quality standards. It thereby allows targeted and effective interventions that bring practice into conformity with the highest quality standards governing transfusion medicine.

J Pathol Inform. 2021 Nov 1;12:44.

Leveraging Tracking System Database to Extract Gross Dissection Activity Data

Dibson Gondim 1

Background: Gross dissection is a critical step of pathology examination. Optimal staffing of the gross dissection operation prevents the creation of workflow bottlenecks. At academic medical centers with pathology residency programs, tracking gross dissection metrics is important not only to monitor the operation as a whole, but also to keep track of residents’ progress. Dissection time, the amount of time taken to gross a case, is a quantitative variable that can be used in isolation or as a component of a composite metric. Although case dissection time was not available in our laboratory information system, a script was created to extract this information from the tracking system database.

Methods: The tracking system database was accessed, and rows containing pathology case number, username, type of activity, and timestamps were downloaded. Jupyter Notebook, the Python 3.7 programming language, and the Pandas library were used for data exploration and creation of a script. Case dissection time was defined as the difference between the timestamp of the last cassette of a case scanned at the gross dissection station and the timestamp of the specimen container scan at the same station. A Tableau dashboard was created to show dissection time per individual, per type of specimen, and per case.
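
A minimal sketch of this timestamp calculation, assuming a flat export of scan events from the tracking system; the column names, activity labels, and the 4-hour cleaning threshold are assumptions for illustration, not the authors' actual script.

```python
# Sketch: dissection time per case from tracking-system scan events.
# Assumed columns: case_number, username, activity, timestamp.
import pandas as pd

events = pd.read_csv("tracking_events.csv", parse_dates=["timestamp"])
gross = events[events["activity"].isin(["container_scan_gross", "cassette_scan_gross"])]

def dissection_time(case: pd.DataFrame) -> pd.Timedelta:
    # Last cassette scan minus specimen container scan at the grossing station.
    start = case.loc[case["activity"] == "container_scan_gross", "timestamp"].min()
    end = case.loc[case["activity"] == "cassette_scan_gross", "timestamp"].max()
    return end - start

times = gross.groupby("case_number").apply(dissection_time).rename("dissection_time")

# Simple cleaning step: drop implausibly long intervals (e.g., spanning breaks).
times = times[times < pd.Timedelta(hours=4)]
```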

Results: The dissection time documented for most specimens was close to the real dissection time. This methodology requires no additional step in the workflow and therefore creates no documentation burden. Another advantage is the possibility of historical data extraction, in addition to quasi-real-time monitoring. One shortcoming is that delays related to breaks are not taken into account; additional data cleaning strategies may be used to filter out imprecise intervals due to such delays. The Tableau dashboard is used by management, residency program leadership, and individuals performing gross dissection for monitoring, documentation, and feedback.

Conclusion: Dissection time can be extracted from the database of tracking systems using timestamps of specific events. This piece of information can be potentially helpful for multiple stakeholders. There are opportunities for creators of pathology software solutions to design an easy workflow to capture dissection time with better accuracy.

J Pathol Inform. 2021 Nov 1;12:44.

WoC-Bots: A Multi-agent Approach to Predicting Lymph Node Metastasis from Primary Breast Tumors

Sean Grimes 1, Mark D Zarella 2, Fernando U Garcia 3, David E Breen 1

Background: Multiple methods for predicting lymph node metastasis in breast cancer patients have been presented previously, including the Mayo and Memorial Sloan-Kettering Cancer Center nomograms and the Stanford Online Calculator. We present a unique method using “Wisdom-of-Crowd Bots” (WoC-Bots), a multi-agent, social approach to binary classification of node-positive or node-negative disease, to predict lymph node status in order to avoid surgical dissection.

Technology: WoC-Bots are simple, modular agents that include a small multi-layer perceptron classifier and are trained on different, small subsets of the overall feature space. They interact with each other socially, sharing information about patients’ features (e.g., primary tumor stage, age, primary tumor size, and histologic grade) to form knowledge-diverse crowds. A swarm aggregation mechanism, based on honeybee foraging optimization, is used to elicit an overall prediction from the WoC-Bots. The distributed and social nature of WoC-Bots allows for the inclusion of additional features without the re-training requirement found in traditional deep neural networks. The swarm mechanism places each prediction into a confidence interval, giving each prediction a categorical confidence value.
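
A conceptual sketch of this architecture under assumed details (not the authors' implementation): each bot wraps a small MLP trained on its own feature subset, and a simple trust-weighted vote stands in for the honeybee-foraging swarm aggregation described above.

```python
# Conceptual sketch only: bots as small MLPs on random feature subsets, with a
# trust-weighted vote replacing the paper's swarm aggregation mechanism.
import numpy as np
from sklearn.neural_network import MLPClassifier

class WoCBot:
    def __init__(self, feature_idx, shared_idx):
        self.idx = np.concatenate([feature_idx, shared_idx])  # 4-8 random + 2 shared features
        self.clf = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=500)
        self.trust = 0.5  # stand-in for the bot's performance history

    def fit(self, X, y):
        self.clf.fit(X[:, self.idx], y)
        self.trust = self.clf.score(X[:, self.idx], y)  # crude proxy for trust

    def predict_proba(self, X):
        return self.clf.predict_proba(X[:, self.idx])[:, 1]

def crowd_predict(bots, X):
    # Trust-weighted average of bot outputs (simplified aggregation).
    weights = np.array([b.trust for b in bots])
    probs = np.stack([b.predict_proba(X) for b in bots])
    return (weights[:, None] * probs).sum(axis=0) / weights.sum()
```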

Methods: We used an existing dataset of 457 de-identified patients. Four to eight randomly selected features were distributed to each of ~250 WoC-Bots; all WoC-Bots also received two additional features, primary tumor size and histologic grade. Each WoC-Bot’s multi-layer perceptron classifier was initially trained on its feature subset, followed by interaction periods to share feature and prediction information and to build a performance history used to determine individual bot performance and trust. The prediction history of each WoC-Bot was used by the swarm aggregation mechanism to assign correctness probabilities. The dataset was split 80% training and 20% testing.

Results: WoC-Bots achieved an overall accuracy of 77%, with a best-case simulation at 81.2% accuracy. Additionally, predictions assigned to the “HIGH” confidence interval or above, 44 of 81 predictions (55%), achieved a combined accuracy of 83.77%.

Conclusion: We present a distributed approach to predicting lymph node metastasis that achieves results similar to existing methods, while allowing for the incorporation of additional features as they become available without the expensive re-training required by traditional deep neural networks.

J Pathol Inform. 2021 Nov 1;12:44.

Creating a Public Health Surveillance Dashboard Using Laboratory Analytics: Leveraging Visiun to Support the COVID-19 Pandemic Response

Mehrvash Haghighi 1, Maria Mcguire 1, Daya Adhimoolam 1, Ricky Kwan 1, Adolfo Firpo 1, Rao Mendu 1, Melissa Gitman 1, Russel McBride 1, Catherine Craven 1

Background: Pandemics are unpredictable and can rapidly spread worldwide. Proper planning and preparation for managing the impact of an outbreak is achievable only through continuous and systematic collection and analysis of health-related data. We describe our experience complying with reporting requirements and developing a robust, consistent platform for surveillance data during an outbreak.

Methods: We applied Visiun, a laboratory analytics dashboard, to support the main response activities. The Epic SlicerDicer module was used to develop clinical and research reports. We followed World Health Organization (WHO), federal, and state guidelines, departmental policies, and expert consultation to create the framework.

Results: The developed dashboard integrates data from scattered sources so that they can be seamlessly distributed to key stakeholders. The main report categories include federal, state, laboratory, clinical, employee health, and research. The federal and state reports meet federal and state government reporting requirements. The laboratory group is the most comprehensive category, including operational reports such as performance metrics, technician performance assessment, and analyzer metrics. Close monitoring of testing volumes and laboratory operational efficiency is essential to manage increasing demand and provide timely and accurate results. The clinical data and employee health reports are valuable for properly managing medical surge requirements such as the health care workforce and medical supplies. The reports included in the research category are highly variable and depend on the health care setting, research priorities, and available funding. Table 1 lists the developed reports, the information system from which each report’s data are extracted, the frequency of distribution, and the external and internal recipients of each report. This study also includes multiple reports designed for quality assurance, as well as additional clinical, employee health, and research reports to be added to the current dashboard to optimize the reporting system.

Table 1. Developed reports, the source system each is extracted from, frequency of distribution, and report recipients

Federal reports (distributed daily; external recipients: federal agencies such as the CDC; internal recipients: CMIOs, CNOs, infection control, bed management, laboratory leadership, hospital administrators)
- Number of tested samples and percent positive for COVID-19, grouped by age (CP-LIS)
- Number of tested samples and percent positive (CP-LIS)
- Overall percentage of patient visits for influenza-like illness (ILI) (Epic)
- Factors associated with hospitalization: by age, by sex, and age-adjusted COVID-19-associated hospitalization rates by race and ethnicity (Epic)

State reports (distributed daily; external recipient: New York State government; internal recipients: CMIOs, CNOs, infection control, bed management, hospital administrators)
- Viral testing (positive, negative, total) by patient zip code/county (CP-LIS)
- Heat map

Laboratory operational reports (source: CP-LIS; distributed daily; recipients: laboratory leadership/directorship)
Performance metrics:
- Cumulative COVID-19 tests ordered (number, percent) and result breakdown
- Cumulative antibody tests (number, percent) and result breakdown
- Percent of results reported within target TAT; average and range of TAT
- Average number of tests resulted per hour
Technician performance assessment:
- Number of tests resulted by each technician
- Average tests resulted per technician per hour compared with tests received per hour
Analyzer metrics:
- Batch size and frequency of viral testing vs. TAT (Roche)
- Batch size and frequency of viral testing vs. TAT (Cepheid)
- Batch size and frequency of antibody testing vs. TAT

Clinical reports (source: CP-LIS; distributed daily)
- Test result breakdown by age, trending (recipients: CMIOs, CNOs, infection control, bed management, laboratory leadership, hospital administrators)
- Test result breakdown by sex, trending (same recipients)
- Antibody testing numbers, result breakdown, and antibody titers (recipients: CNO, transfusion team)
- List of eligible antibody donors (recipients: CNO, transfusion team, infection control)

Employee health reports (source: CP-LIS; distributed daily; recipient: infection control)
- Employee test result breakdown per day
- Cumulative positive results among employees, trending

Research reports (distributed on demand; recipient: corresponding research team)
- List of pregnant patients with positive viral testing (Epic)
- Number of patients with a prior positive viral test and a first negative viral test, grouped by duration (CP-LIS)
- Number of patients with positive viral testing and a first antibody-positive test vs. days (CP-LIS)

Conclusion: In this paper, we reviewed the key components of a surveillance framework required for a robust response to the COVID-19 pandemic. We demonstrated how a laboratory analytics dashboard, Visiun, combined with Epic reporting tools can function as a public health surveillance system. The designed framework could also be used as a generic template for possible future outbreak events.

J Pathol Inform. 2021 Nov 1;12:44.

Robotic Process Automation: A Novel Method in Streamlining Digital Pathology Validation

Mehrvash Haghighi 1, Lam 2, Ricky Kwan 2

Background: There has been significant momentum in the adoption of digital pathology, since Philips’s announcement of FDA clearance for use of its technologies in primary diagnosis in 2017. Clinical validation is a key component of the preparation for this transition, and CAP guidelines recommend a minimum of 60 cases be evaluated using the digital platform. RPA (Robotic Process Automation) is an application of technology, governed by structured inputs, aimed at automating processes such as data entry. In this study, we attempt to introduce a novel method for streamlining the validation process, using RPA technology. We selected UiPath Studio (New York, NY) to design automated case accessioning processes visually, through diagrams and flowcharts, and optimized our digital pathology validation workflow.

Methods: The validation process can be a daunting and labor-intensive task. Most institutions use paper sheets or Excel files to distribute the required case information to pathologists and to document glass slides and corresponding diagnoses. All cases were accessioned in the LIS test environment, which was interfaced with the Philips scanner. We automated the accessioning process by recording a sample case accession in UiPath and then modifying the recorded diagrams and logic to further customize the process for automation across a variety of use cases. We selected 69 archived cases and extracted the required accessioning data into Excel format using a customized SQL report, which served as the structured input feeding the RPA logic.
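
Purely as an illustration of the structured-input pattern (not the UiPath workflow itself), the sketch below reads a hypothetical SQL-exported spreadsheet and assembles, row by row, the fields a robot would enter on the accessioning screens; all file and column names are assumptions.

```python
# Illustrative sketch of preparing the structured input that drives the RPA robot.
# In the real workflow, UiPath consumes rows like these and drives the LIS UI.
import pandas as pd

cases = pd.read_excel("validation_cases.xlsx")  # hypothetical SQL report export

for row in cases.itertuples(index=False):
    payload = {
        "case_number": row.case_number,      # assumed column names
        "patient_mrn": row.patient_mrn,
        "specimen_type": row.specimen_type,
        "block_count": row.block_count,
    }
    print(payload)  # here we only show the structured input being prepared
```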

Results: Using the UiPath RPA, we decreased the accessioning time of a validation set from 2 hours and 22 minutes to only 20 minutes. We also noticed that automation increased staff satisfaction and eliminated human errors in data entry. Accessioning the cases in the LIS using UiPath Studio allowed pathologists to experience improvements in workflow, made possible by implementing digital pathology, early on during the validation process. This accelerated the learning curve for participants and provided opportunities to further evaluate and improve the future workflow process.

Conclusion: RPA is an emerging technology practice applied by companies to streamline enterprise operations and reduce costs. Integrating RPA into digital pathology validation will enable pathology labs to include a more substantial number of cases without increasing human resource requirements and risk of errors.

J Pathol Inform. 2021 Nov 1;12:44.

Machine Learning Improves the Accuracy of Adequacy Evaluation of Cytology Specimens for Molecular Profiling

Mehrvash Haghighi 1, Max Signaevsky 2, Marcel Prastawa 1, Adnan Hasanovic 2, Brandon Veremis 3, Roshanak Alialy 4, XiangFu Zhang 2, Baidarbhi Chakraborty 2, Gerardo Fernandez 2

Background: Molecular testing has become the standard of care in the diagnosis and therapeutic management of lung carcinomas. Adequate sampling of tumor tissue is crucial for accurate molecular profiling. The percentage of tumor cells relative to the benign cellular component of the specimen is the main factor in determining sample suitability for accurate genetic testing. Performing gene sequencing on inadequate samples delays the result and imposes a cost on the molecular laboratory. Overestimation of cellularity could lead to false-negative results and withholding of therapies from patients; underestimation of cellularity with negative results could lead to unnecessary repeat biopsies. An inter-rater reliability study showed very little concordance between raters in assessing the percentage of malignant cells in the specimen, a motivating factor for the development of an AI approach.

Design: The percentage of tumor cells on H&E slides of 17 aspirates of lung lesions and 27 pleural fluids, all containing lung adenocarcinoma, was scored by a group of four pathologists. We trained a fully convolutional neural network that detects malignant and benign components in the specimen, using the SegNet architecture with modifications that add squeeze-excitation layers. The network was trained on 12,000 images of 512 x 512 pixels extracted from 44 slides. We randomly assigned 4 slides as a test dataset for measuring the performance of the neural network, obtaining a Dice overlap score of 0.73 on 2,800 annotated images of 512 x 512 pixels. The neural network enables automatic scoring, for example using the ratio of malignant nuclei to total epithelial cell nuclei.
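
A small sketch of the two quantities described above (the Dice overlap and the malignant-to-epithelial ratio), computed from predicted label masks; the label encoding (0 = background, 1 = benign epithelium, 2 = malignant) is an assumption for illustration.

```python
# Sketch: Dice overlap per class and tumor-cell fraction from predicted masks.
import numpy as np

def dice(pred_mask: np.ndarray, true_mask: np.ndarray, label: int) -> float:
    # Dice = 2*|intersection| / (|pred| + |true|) for the given class label.
    p, t = pred_mask == label, true_mask == label
    denom = p.sum() + t.sum()
    return 2.0 * np.logical_and(p, t).sum() / denom if denom else 1.0

def tumor_fraction(pred_mask: np.ndarray) -> float:
    # Ratio of malignant pixels to total epithelial (benign + malignant) pixels.
    malignant = (pred_mask == 2).sum()
    epithelial = malignant + (pred_mask == 1).sum()
    return malignant / epithelial if epithelial else 0.0
```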

Results: We used Krippendorff’s alpha to measure inter-rater agreement and obtained an alpha of 0.275, signifying high variability between raters and motivating our development of an automatic algorithm. The correlations (Spearman’s rho) between the score based on the network detection and the estimations by the four pathologists were Rho1 = .228, p = .158; Rho2 = .683, p < .001; Rho3 = .682, p < .001; and Rho4 = .350, p = .027, showing that the estimations of two of the four pathologists were significantly correlated with the AI-derived score.
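
For reference, the two statistics reported above can be computed with the krippendorff package and SciPy as sketched below; the rater and AI score arrays are placeholders, not the study's data.

```python
# Sketch: inter-rater agreement (Krippendorff's alpha) and rater-vs-AI correlation
# (Spearman's rho) on placeholder values.
import numpy as np
import krippendorff
from scipy.stats import spearmanr

# ratings: one row per pathologist, one column per slide (percent tumor cells).
ratings = np.array([
    [40, 55, 20, 70],   # pathologist 1 (placeholder values)
    [35, 60, 25, 65],   # pathologist 2
    [50, 50, 15, 80],   # pathologist 3
    [30, 45, 30, 60],   # pathologist 4
], dtype=float)

alpha = krippendorff.alpha(reliability_data=ratings, level_of_measurement="interval")

ai_scores = np.array([38, 52, 22, 68], dtype=float)  # network-derived scores (placeholder)
rho, p = spearmanr(ratings[1], ai_scores)
print(f"Krippendorff alpha={alpha:.3f}, Spearman rho={rho:.3f} (p={p:.3f})")
```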

Conclusion: An artificial intelligence approach could reliably improve the accuracy of evaluating the malignant component of cytology specimens, removing the effects of inter- and intra-observer variability on molecular testing, which in turn would improve the validity of molecular testing results and patient outcomes.

