Abstract
Predictive analytics can provide valuable support to the effective management of pathology facilities. The introduction of new tests and technologies in anatomical pathology will increase the volume of specimens to be processed, as well as the complexity of pathology processes. In order for predictive analytics to address managerial challenges associated with the volume and complexity increases, it is important to pinpoint the areas where pathology managers would most benefit from predictive capabilities. We illustrate common issues in managing pathology facilities with an analysis of the surgical specimen process at the Department of Pathology and Laboratory Medicine (DPLM) at The Ottawa Hospital, which processes all surgical specimens for the Eastern Ontario Regional Laboratory Association. We then show how predictive analytics could be used to support management. Our proposed approach can be generalized beyond the DPLM, contributing to a more effective management of pathology facilities and in turn to quicker clinical diagnoses.
Introduction
The introduction of new tests and technologies in anatomical pathology is bringing both new opportunities and new challenges for diagnostic pathologists. For example, the onset of personalized medicine requires that new methods for the detection of oncogenic pathways in tumors be integrated in routine diagnostic pathology [1]. This points to an increasingly important role for pathology in the management of surgical patients; however, it also implies a much higher volume of surgical specimens to be diagnosed, along with pathology processes more complex than what is currently the norm in pathology facilities. Given that timely pathology diagnostics are required to provide optimal patient management, changes in the clinical complexity of pathology processes create challenges at the managerial level. Indeed, clinical and operations managers of these facilities need to ensure that pathology processes continue to run safely, efficiently and effectively despite their growing size and complexity.
The use of advanced information technologies and analytics in pathology (otherwise known as pathology informatics) has grown in response to these challenges [2]. In healthcare, predictive analytics have now matured sufficiently to be used in addressing major issues, such as preventing early post-discharge readmissions, facilitating patient engagement, and detecting insurance and billing fraud [3]. These techniques now extend visual analytics systems such as dashboards that allow users to access relevant data through graphical and tabular displays, drill down in the data and form hypotheses about root causes. However, dashboards often lack the capability of predicting events [4]. In this paper, we discuss how advanced predictive analytics can be integrated at key points in pathology processes in order to facilitate workload and throughput planning, and intervention by managers of pathology facilities. For the purpose of this paper, we define predictive analytics rather broadly as a set of techniques, such as data mining, statistics, modeling, and machine learning, used to analyze current data and make predictions about future events. In that sense, we accept that predictive analytics is based on a fundamental assumption that patterns observed in the past are sufficiently stable for predicting future behavior. While this assumption is sometimes contested, its limitations can be mitigated by creating predictive models that can adjust to changes in past patterns through a feedback loop. It is important to stress that, despite these limitations, the use of predictive analytics in healthcare is being recognized as a reliable approach to improving outcomes, enhancing patients’ experience, and reducing the costs of delivery of health services [5].
We illustrate the potential contributions of predictive analytics using the problem and solution domains at the Department of Pathology and Laboratory Medicine (DPLM) at The Ottawa Hospital (TOH), Ottawa, Ontario, Canada. TOH is a teaching hospital affiliated with the University of Ottawa. The DPLM houses grossing, histology and cytology laboratories, and processes all pathology specimens for the Eastern Ontario Regional Laboratory Association (EORLA), a newly established association of all the laboratory and pathology departments of Eastern Ontario that currently includes facilities from eight hospitals in the region. The rapid growth in the volume of surgical and cytology specimens resulting from this centralized structure has created many challenges in ensuring a smooth workflow and the timely processing of cases and diagnostic reports. A number of research projects are underway to address these challenges and provide solutions that support decision-making for managers of the DPLM, including the development of optimized pathologists’ scheduling models [6], a real-time visual dashboard tracking the inventory to be processed by the DPLM [7], and predictive analytics algorithms for workload and throughput planning. We focus here on the latter.
The remainder of the paper is organized as follows. First, we describe the pathology processes currently implemented at the DPLM, and identify three key points within the surgical specimen process workflow where predictive analytics would generate the greatest benefits. We then present the systems architecture currently in place at the DPLM, and situate the use of predictive analytics within this architecture. We elaborate on the use of predictive analytics that is the most appropriate for each point within the specimen process flow. We conclude with future work and contributions.
Pathology processes at the DPLM
We briefly present the pathology process for surgical specimens implemented at the DPLM at TOH. The process revolves around pathology-specific entities being managed at the DPLM: cases, specimens, blocks, and slides. A case groups all specimens received from one patient-physician encounter. A specimen is one or more tissue fragments removed from an organ or specific site. A block is a container of one or multiple tissue fragments from the same specimen. A slide is a thin slice of tissue cut from the block and stained with a specific set of reagents. These elements thus form a hierarchy of the inventory to be processed by the DPLM.
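The hierarchy described above can be sketched as a small data model. This is only an illustrative sketch and not the DPLM's actual data schema: the class names, field names, and identifiers (such as `case_code` or the example code "S16-0001") are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Slide:
    slide_id: str
    stain: str  # hypothetical field: name of the stain applied


@dataclass
class Block:
    block_id: str
    slides: List[Slide] = field(default_factory=list)


@dataclass
class Specimen:
    specimen_id: str
    subcode: str  # hypothetical stand-in for the pathology subcode
    blocks: List[Block] = field(default_factory=list)


@dataclass
class Case:
    case_code: str
    specimens: List[Specimen] = field(default_factory=list)

    def inventory(self) -> Dict[str, int]:
        """Count the specimens, blocks, and slides nested under this case."""
        blocks = [b for s in self.specimens for b in s.blocks]
        slides = [sl for b in blocks for sl in b.slides]
        return {
            "specimens": len(self.specimens),
            "blocks": len(blocks),
            "slides": len(slides),
        }


# Hypothetical case with two specimens, two blocks, and three slides.
example_case = Case(
    "S16-0001",
    [
        Specimen("A", "BX", [Block("A1", [Slide("A1-1", "H&E"), Slide("A1-2", "H&E")])]),
        Specimen("B", "BX", [Block("B1", [Slide("B1-1", "H&E")])]),
    ],
)
print(example_case.inventory())
```

Counting the elements at each level of the hierarchy in this way gives the kind of inventory figure that a dashboard or a predictive model could take as input.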
The first step in the DPLM process is the accessioning of specimens, where specimens and clinical orders from physicians arrive and are given a uniquely identifying case code. Each specimen is also associated with a pathology subcode that determines how many blocks and slides will need to be created for this specimen. The second step is the grossing of specimens, where blocks are created from the tissue and automatically processed in a manner suitable for creating slides later on. In the third step, blocks are sent to the Histology laboratory, where each one is manually embedded in paraffin and then sliced and stained as required by the pathology subcode attached to it. In the fourth step, called dispatch, finished slides are checked to ensure that they are matched with the correct specimen and case. They are then organized by case and assigned to pathologists. The last step is the diagnosis, where pathologists receive complete cases to examine. During this diagnosis process, a pathologist may order additional slides or blocks. The process ends when the pathologist makes a definite diagnosis for a case and writes a report.
Figure 1 shows a simplified business process model developed for surgical specimens at the DPLM. As can be seen in the model, the process requires interactions among four key entities (Grossing lab, Histology lab, Immuno lab, and Pathologist) in order to be completed. This is not a fully linear process, since requests for additional slides or tests can originate from pathologists when they start to interpret slides, preventing the diagnostic process from being completed until these requests have been fulfilled and quality controlled. Key challenges in ensuring the timely completion of diagnostic reports are (i) the variability in the number and type of cases being received at any point in time; (ii) the optimal allocation of cases to individual pathologists given the variability in cases as well as the variability in their availability and sub-specialties; and (iii) the variability in levels of complexity across similar cases, which can only be ascertained once the diagnostic process has been initiated. These challenges highlight the potential benefits of predictive analytics at three specific points in the surgical specimen flow shown in Figure 1: predicting the number of cases and case types that will be received by the pathology facility, predicting the workload that will be generated from this volume, and predicting the actual diagnostic throughput of pathologists based on case volume and complexity.
Figure 1.
Pathology surgical specimen process at the DPLM
Current and upcoming information systems at the DPLM
The DPLM currently uses the commercial laboratory information system PowerPath® by Sunquest [8] to record and track individual cases. PowerJ, a dashboard application developed in-house, pulls data from this system in order to monitor key phases in pathology processes at the DPLM, namely: i) number of pending and grossed cases; ii) number of pending and cut blocks; and, iii) number of pending and routed slides. The application also records and presents individual pathologists’ workload, and overflow slides by sub-specialty. A business intelligence (BI) tool providing a fine-grained overview of the performance of the DPLM in terms of cases that are within or that exceed predetermined processing targets has been developed and is planned to be deployed in the near future. The latter combines data from PowerPath® with business rules that specify target processing times per specimen types and process steps. The BI tool supports monitoring, analysis, and reporting. It does not, however, offer the ability to predict future events. Therefore, a predictive analytics capability needs to provide this added level of support. Figure 2 relates these current and upcoming capabilities as a proposed system architecture for the DPLM.
Figure 2.
Proposed systems architecture at the DPLM
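To make the role of the business rules in the BI tool more concrete, the following minimal sketch checks whether a process step exceeded its target processing time. The specimen types, step names, and target values in `TARGET_HOURS` are hypothetical placeholders, not the DPLM's actual targets.

```python
from datetime import datetime, timedelta

# Hypothetical business rules: target processing time, in hours,
# per (specimen type, process step). Values are illustrative only.
TARGET_HOURS = {
    ("biopsy", "grossing"): 4,
    ("biopsy", "histology"): 24,
    ("resection", "grossing"): 24,
    ("resection", "histology"): 48,
}


def exceeds_target(specimen_type: str, step: str,
                   started: datetime, finished: datetime) -> bool:
    """Return True if the step took longer than its target processing time."""
    target = timedelta(hours=TARGET_HOURS[(specimen_type, step)])
    return (finished - started) > target


start = datetime(2016, 5, 2, 8, 0)
print(exceeds_target("biopsy", "grossing", start, start + timedelta(hours=6)))
print(exceeds_target("resection", "grossing", start, start + timedelta(hours=6)))
```

A monitoring tool of this kind reports on what has already happened; a predictive layer would instead estimate, ahead of time, which cases are likely to exceed their targets.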
Facilitating effective management in pathology through predictive analytics
The transition of pathology diagnosis into a context of digital pathology, where slide analysis is conducted using computer-based tools, represents a significant step forward in the pathological diagnosis process. This transition also creates new opportunities for the production, capture and use of large volumes of accurate data. In the DPLM, data capture is currently confined to selected stages of the process illustrated in Figure 1. Namely, data are collected at the start and completion of some, but not all, of the steps. With the advent of digital pathology, additional events can be captured, making it possible, for example, to determine the amount of time a pathologist actually spends on each slide during the diagnosis step. Once laboratory information systems have been adapted to capture these new data, they can be leveraged to provide pathology facilities managers with a more accurate picture of their facilities’ operations. Our current project with the DPLM is designed to take advantage of such emerging opportunities.
The real-time use of operational data has the potential to increase the quality of diagnosis and to reduce the costs of the diagnostic process, contributing both to an increase in facility performance and effective management [9]. However, the contribution of pathology informatics to effective management has received less attention in the academic community; given the growing complexity of pathology processes and the resulting challenges to the management of pathology facilities, applying predictive analytics to support managerial decision-making is timely and relevant. This is the focus of this paper and we explain here the benefits of using predictive analytics at three key points within the DPLM surgical specimens process flow: predicting the number of incoming cases by type (and associated diagnostic complexity), predicting the workload of the pathologists, and finally predicting the throughput of the facility. Next, we describe our proposed solutions for each of these areas.
Predicting the number of incoming cases per type
The DPLM operates in a constantly changing and seemingly unpredictable environment. The arrival of surgical cases is driven by the surgical schedule, the allocation of operating room blocks to surgical specialties (which determines the types of surgeries), and the complexity of patients’ cases. In most hospitals, there is a disconnect between the information infrastructure designed to support patient management and surgical services (patient registration, EHR, surgical information system) and the one supporting pathology services. Hence, pathologists and managers of pathology facilities are typically not aware of the surgeons’ schedules or of changes to the block scheduling of surgeries. This situation makes it difficult to correctly allocate resources within a pathology facility, for example in terms of scheduling the optimal number of laboratory technicians on a given day.
In our research, a long-term objective is to address this disconnect by developing an application that will capture relevant data about surgeries and store this information in a relational database (bottom of Figure 2) to be accessible by the pathology BI tools. While this will give access to reliable data on the number of surgical cases to be processed on any given day, it will not give a complete understanding of the level of complexity of incoming cases. However, historical data about past surgeries and types of incoming cases will allow us to develop predictive models using statistical techniques. Namely, we will use case complexity as the dependent variable and develop logistic regression models to predict the behavior of this variable in the future. Simultaneously, we will be mining pathology data in order to develop associations between case complexity and pathology specific attributes such as number of slides, stains used, etc. In this stage of our work, we expect to rely on unsupervised data mining methods such as association rule learning [10].
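As a minimal illustration of the association rule idea, the sketch below computes the support and confidence of a single candidate rule linking pathology attributes to case complexity. The attribute labels and the four example cases are invented for illustration; they are not DPLM data, and a real study would rely on a dedicated rule-mining algorithm over a much larger dataset.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(transactions: List[FrozenSet[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """Confidence of the rule antecedent -> consequent."""
    antecedent = frozenset(antecedent)
    consequent = frozenset(consequent)
    s_antecedent = support(transactions, antecedent)
    if s_antecedent == 0:
        return 0.0
    return support(transactions, antecedent | consequent) / s_antecedent


# Hypothetical historical cases, each described by a set of attributes.
cases = [
    frozenset({"slides>10", "special_stain", "complex"}),
    frozenset({"slides>10", "complex"}),
    frozenset({"slides>10", "routine"}),
    frozenset({"special_stain", "routine"}),
]

# Candidate rule: cases with more than 10 slides tend to be complex.
print(support(cases, {"slides>10", "complex"}))       # 0.5
print(confidence(cases, {"slides>10"}, {"complex"}))  # 2 of 3 such cases
```

Rules with high support and confidence could then be used to annotate incoming surgical data with an expected level of diagnostic complexity.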
Predicting pathologists’ workload
Measuring workload in a pathology facility is critical for management if appropriate service levels are to be maintained. The Canadian Association of Pathologists advocated an approach to workload measurement that is based on the L4E system [11]. The L4E system was developed from survey data collected from 27 pathology centres; the results suggest that a full-time pathologist working in a facility such as the DPLM should generate about 7560 L4E equivalents annually. While this measure can be used to evaluate individual pathologists’ performance, it is not very useful for predicting the workload of pathologists working in a teaching hospital; indeed, such predictions need to take into account their academic, teaching, research, and administrative responsibilities (which can account for up to 75% of their time), as well as dimensions such as each pathologist’s sub-specialties and the impact of each case’s complexity on the time needed for diagnosis.
Our research thus takes a different approach; specifically, applying the association rules for case complexity to historical surgical data will allow us to translate purely surgical information into data annotated with the complexity of pathological diagnosis. In turn, we will be able to use the annotated data as an input to a stochastic optimization model for predicting pathologists’ workload, taking into account their availability, sub-specialties, prioritization of the cases, and seasonal variations. Having such predictive power should allow managers of pathology facilities to better plan human resource allocation by sub-specialty, to develop flexible work schedules (including “floating” resources when available), and to prepare contingencies for unexpected events.
Predicting throughput
As Figure 1 shows, there are a number of potential bottlenecks in a pathology process, from the time cases are received to the time the pathology report is submitted. In order to help manage the process, and take full advantage of digital pathology, it is important to capture fine-grained data at each input and output point in each process step. This may, however, require a change in practices and equipment. At the DPLM, data about each case, specimen, block, or slide is usually captured when the element arrives at a workstation and at the time that the work is completed, but not necessarily at a fine-grained level. For example, when a pathologist receives slides to be interpreted for a case, data is captured at the beginning of the step (by scanning one of the slides), and again at the end of the process, once the report is finalized; however, events such as interruptions do not generate data. Hence, data about the duration of some steps does not differentiate between active work and waiting time.
Capturing data at the times when an element leaves one station and moves to another would give more precise information from which to predict future throughput. Moreover, while processing times in some steps are fixed and cannot really be changed (and thus act as fixed parameters), others can be influenced by adding resources. The resulting real-time throughput prediction model could monitor the actual volume of cases that are being processed, that are waiting for processing, that are being diagnosed, or that are waiting for a diagnosis, hence providing a general performance measure for a pathology facility. Going beyond performance measurement, simulation models could evaluate different strategies for dealing with changes in incoming cases and workload parameters, and be associated with the throughput prediction model. Once the model detects a deviation from predicted throughput (for a given context, described for example by the day of the month), it would automatically consult one of the simulated scenarios and apply it (or present it to a manager) so that the detected anomaly can be addressed.
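The deviation-detection step described above might look like the following minimal sketch. The 15% tolerance, the status labels, and the scenario descriptions in `SCENARIOS` are assumptions made for illustration; in practice the scenarios would come from the simulation models and the threshold would be set by facility managers.

```python
# Hypothetical mapping from a detected status to a pre-simulated scenario.
SCENARIOS = {
    "shortfall": "Reassign floating staff to the bottleneck step",
    "surplus": "Release floating staff to other duties",
    "on_track": "No action required",
}


def check_throughput(predicted: float, observed: float,
                     tolerance: float = 0.15) -> str:
    """Classify observed throughput relative to the prediction.

    `tolerance` is the relative deviation (as a fraction of the
    predicted value) that is accepted before a scenario is triggered.
    """
    if predicted <= 0:
        raise ValueError("predicted throughput must be positive")
    deviation = (observed - predicted) / predicted
    if deviation < -tolerance:
        return "shortfall"
    if deviation > tolerance:
        return "surplus"
    return "on_track"


# Example: 100 cases predicted for the day, 80 actually completed.
status = check_throughput(predicted=100, observed=80)
print(status, "->", SCENARIOS[status])
```

Whether the selected scenario is applied automatically or only presented to a manager is a design decision that this sketch leaves open.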
Implementing the solution
The proposed predictive analytics solution will be deployed as part of a multi-layer architecture that clearly separates data from the analytics tools and from the user interface. This logical, and hence application-neutral, architecture is illustrated in Figure 2. The advantages of such a multi-layer architecture include the ability to combine varied components, for example a hospital-wide data infrastructure in the data tier with a business intelligence application specifically developed for a pathology facility. While the proposed architecture can be implemented in a number of different (and customized) ways depending on the data infrastructure and suite of analytics solutions ultimately chosen, the DPLM implementation will use one of the industry predictive engines (such as those developed by IBM, SAS, or SAP). We are planning for the data tier to include both stream data coming from, for example, laboratory information systems such as PowerPath®, and historical pre-processed data, for example surgical block schedules that are updated occasionally. This tier will also include the association rules for determining case complexity and the business rules describing how specimens are to be processed in the case of unexpected events (developed as a result of the simulation described earlier). Pragmatic issues regarding the integration of these data streams, for example variations in data structures, will have to be addressed.
Application and presentation tiers will act as an interface between data and end-users. The application tier will include a host of modeling services together with a set of BI applications. The presentation tier is where the reports, alerts, etc. are to be configured. We will use ontology-driven design to create a repository of presentation widgets for easier re-use and customization. Finally, a client tier will implement a web portal that will represent a “one point entry” to all the functions our system offers to the DPLM managers. Access to the portal will be governed by role-based principles, with different levels of granularity of information being accessible to different types of end-users. We are currently working on finalizing the general architecture of the system, on identifying the relevant software tools, and on eliciting a final set of system requirements from different groups of potential end-users and other stakeholders.
Preliminary results of a user study conducted to evaluate a prototype of the business intelligence layer illustrated in Figure 2 show that pathology managers find the proposed solution relevant to their work and plan to use the final system to support daily decision-making once it is implemented at the DPLM. Moreover, the preliminary version of the proposed solution had a positive impact on DPLM operations, allowing, for example, more effective scheduling of pathology assistants and adjustments to pathologists’ schedules [7]. Given these positive results, it is expected that a fully implemented solution will increase managerial efficiency and effectiveness at the DPLM, allowing the facility to face new challenges in terms of the volume and complexity of cases to be processed.
Conclusion
At a time of growing complexity in pathology processes, it is of great importance that the management of pathology facilities be adequately supported in order to ensure that processing times and the quality of provided services remain within acceptable standards. Indeed, unresolved managerial issues could trump the benefits of the new tests and technologies that are currently being introduced to support new patient management paradigms such as personalized medicine. In this paper, we have shown that predictive analytics could usefully be applied in a pathology facility to address issues of variability in case volume and complexity that make it difficult to adequately plan resource allocation over short and long time horizons. Specifically, we proposed the use of techniques such as logistic regression models, data mining, and simulation models to predict the number of incoming cases per type, their complexity, workload requirements, and facility throughput. These techniques could provide novel and timely solutions to the managers of pathology facilities beyond the DPLM, contributing to quicker processing of pathology specimens and in turn to quicker clinical diagnoses.
References
1. Dietel M, Sers C. Personalized medicine and development of targeted therapies: the upcoming challenge for diagnostic molecular pathology. A review. Virchows Arch. 2006;448:744–55. doi: 10.1007/s00428-006-0189-2.
2. Gabril MY, Yousef GM. Informatics for practicing anatomical pathologists: marking a new era in pathology practice. Mod Pathol. 2010;23:349–58. doi: 10.1038/modpathol.2009.190.
3. Edelstein P. Emerging directions in analytics. Predictive analytics will play an indispensable role in healthcare transformation reform. Health Manag Technol. 2013 Jan;34(1):16–7.
4. Maciejewski R, Hafen R, Rudolph S, Larew SG, Mitchell MA, Cleveland WS, et al. Forecasting hotspots - a predictive analytics approach. IEEE Trans Vis Comput Graph. 2011 Apr;17(4):440–53. doi: 10.1109/TVCG.2010.82.
5. Amarasingham R, Patzer RE, Huesch M, Nguyen NQ, Xie B. Implementing electronic health care predictive analytics: considerations and challenges. Health Aff. 2014 Jul;33(7):1148–54. doi: 10.1377/hlthaff.2014.0352.
6. Montazeri A, Patrick J, Michalowski W, Banerjee D. Developing the pathologists’ monthly assignment schedule: a case study at the division of anatomical pathology of The Ottawa Hospital. AMIA Annu Symp Proc. 2015:933–942.
7. Halwani F, Li W, Banerjee D, Lessard L, Amyot D, Michalowski W, Giffen R. A real-time dashboard for managing pathology processes. J Pathol Inform. 2016 May;7(24). doi: 10.4103/2153-3539.181768.
8. Sunquest Information Systems. Anatomic Pathology [Internet]. [cited 2015 Sept 18]. Available from: http://www.sunquestinfo.com/products-solutions/anatomic-pathology.
9. Pantanowitz L, Tuthill JM, Balis UGJ. Pathology informatics: theory & practice. Chicago: American Society for Clinical Pathology Press. 2012.
10. Herrera F, Carmona CJ, González P, del Jesus MJ. An overview on subgroup discovery: foundations and applications. Knowledge and Information Systems. 2011 Dec;29(3):495–525.
11. Workload and Workforce Committee. Workload Measurement Guidelines. Canadian Association of Pathologists. 2014 Jun.


