Abstract
Despite the promise of a proactive approach to safety, a lack of resources and tangible measures has limited its implementation in organizations. We are exploring Joint Activity Monitoring (JAM) as one key component of a proactive safety program within the domain of infection prevention. However, despite a conceptual alignment with the requirements of a proactive monitoring capability, our experiences instrumenting daily work tools with the capabilities to support continuous, unobtrusive, real-time monitoring have revealed additional organizational and technological requirements. In this paper, we describe our strategies and challenges in developing this capability and discuss implications for supporting successful proactive safety implementations.
INTRODUCTION
When given a choice between being proactive and reactive, proactive is almost always preferred. This is as true for safety professionals as it is in any other setting (Provan et al., 2020; Woods et al., 2015). However, there are two daunting obstacles to a proactive safety practice: the lack of resources and the lack of meaningful proactive safety measures. We are developing Joint Activity Monitoring to address both problems and provide a pragmatic solution for organizations to meaningfully engage in proactive safety practices.
Monitoring is an essential ingredient for proactiveness. The literature on resilience engineering is closely associated with proactive approaches to safety and the necessity of monitoring features prominently in both (Provan et al., 2020). A resilient system must be poised to adapt, meaning that it must have the potential for adjusting patterns of activities for future adaptive action as conditions continually change (Woods, 2019). In order to provide this readiness to respond prior to disturbances, a resilient system needs to continuously monitor its own performance in relation to the environment for potential threats in the near future (Hollnagel, 2009). When combined with the ability to anticipate farther into the future, these capabilities form two of the fundamental building blocks for systems to proactively avoid threats, rather than simply react to them (Branlat & Woods, 2010).
Since proactive safety does not focus on adverse events or even near misses, the question of what and where to focus monitoring efforts is non-trivial. In ultrasafe industries, tracking work that results in non-events increases the scope of potential investigations by up to 10,000 times (Hollnagel, 2014; Hollnagel et al., 2015). Performing proactive safety in the same way as current incident investigations would be impossible without substantial upscaling of safety processes and resources. Many have proposed that this is a perfect application for artificial intelligence (AI) to assist with monitoring, but results from actual implementations have revealed that (1) phenomena of interest are difficult to track automatically or semi-automatically, and (2) this automation is prone to false alarms, where atypical but benign behaviors are misinterpreted as hazardous (Woods et al., 2015).
Even if a sustainable monitoring capability is implemented, there remains a lack of meaningful, forward-facing proactive safety measures. The vast majority of currently utilized safety measures are backward-facing, focusing on categories and frequencies of unwanted events and the inexorable desire to reduce them (Hollnagel et al., 2015; Woods et al., 2015). Data from current proactive safety efforts are largely qualitative; such data are often discounted when they conflict with more traditional performance or safety measures (Woods et al., 2015) and are inherently more ambiguous because they detect potential problems earlier (Klein et al., 2005).
From these challenges, we propose that a successful monitoring capability to support proactive safety must, at minimum:
1. Continuously surveil the system in near real-time
2. Collect, process, and analyze data in a consistent, sustainable, and low-cost manner
3. Generate forward-facing insight on the system’s capabilities to match future demands
4. Motivate actionable and feasible interventions
Joint Activity Monitoring (JAM) addresses each of these requirements to begin or sustain a proactive safety program. It extends our recent work on Joint Activity Testing (JAT), where periodic human-on-the-loop simulations are used to test system response (Morey et al., 2020), to be a continuous and low-cost process (i.e., requirements 1–2). JAM leverages continuous unobtrusive monitoring of daily operations to identify troubling trends or targeted vulnerabilities. By using the trends themselves as data, and orchestrating these data so that multiple trends across multiple data relationships can be seen, JAM mitigates both the risk posed by a paucity of meaningful measures and the brittleness that the majority of AI-enabled decision-support tools experience (Rayo et al., 2020).
We are exploring JAM within the domain of infection prevention. In current practice, the constraints of available tools limit infection preventionists (IPs) to predominantly reactive approaches. Clusters of hospital-acquired infections (HAIs) are determined through retrospective investigations, which trigger interventions. Furthermore, these IPs are evaluated on the number of recent infection clusters and outbreaks, both of which are reactive measures. Learning from prior infection clusters and outbreaks may generally inform how IPs can change or improve how diseases are prevented in the hospital, but there is little support for IPs to do this proactively in the context of an emerging situation.
We are currently working with IPs in a tertiary care hospital to instrument a JAM-enabled decision-support tool designed to facilitate the detection, confirmation, and anticipation of HAI clusters. With this tool, we are developing strategies to measure and store data about day-to-day work to proactively monitor the health of the system and generate foresight about changing risks. However, the process of instrumenting a proactive monitoring capability has revealed additional organizational and technical constraints barring effective implementation. In this paper, we describe these challenges and discuss the implications for establishing and sustaining a proactive real-time monitoring capability in real work settings.
METHODS
Joint Activity Monitoring (JAM) aims to identify, collect, organize, and analyze measures of day-to-day work in a continuous, low-cost, and unobtrusive manner. It does this both through instrumenting the work tools of relevant operators (Murphy et al., 2017) and collecting environmental data. Through similar mechanisms to Joint Activity Testing (JAT), JAM facilitates analyzing the relationships between these data to understand how the system is sustaining through current demands, trending over time, and potentially vulnerable to future scenarios (Morey et al., 2020). JAM extends JAT by tracking these critically important performance measures over time, paying particular attention to trends that predict sustained system performance and decompensation events. Although the real-time measures collected via JAM are inherently more ambiguous and uncertain than the measures collected via an experimental paradigm (e.g., JAT), JAM compensates for this limitation by collecting data continuously, establishing empirical expectations of system performance, and tracking trends over time, which closely mirrors how complex systems detect anomalies (Woods & Hollnagel, 2006).
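As a minimal sketch of what "establishing empirical expectations of system performance, and tracking trends over time" might look like in code, the following Python example builds a trailing baseline for a daily measure and flags observations that depart from it. The function name `flag_departures`, the seven-day window, the z-score threshold, and the sample values are all illustrative assumptions; they are not part of JAM as described in this paper.

```python
from statistics import mean, stdev


def flag_departures(values, window=7, threshold=2.0):
    """Return indices whose value departs from the trailing baseline.

    The expectation for index i is the mean and sample standard deviation
    of the previous `window` observations (an assumed, simple baseline).
    """
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: treat any change at all as a departure.
            if values[i] != mu:
                flagged.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical daily values of some continuously collected measure.
daily_measure = [10, 11, 9, 10, 12, 10, 11, 10, 25, 11]
print(flag_departures(daily_measure))  # [8]
```

A trailing window is used here so that the expectation is derived only from past observations, which matches the near real-time setting; a real deployment would likely need a baseline that is robust to the spike itself entering later windows.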
In transitioning from a periodic testing capability (JAT) to a continuous monitoring capability (JAM) within the domain of infection prevention, we followed several steps to ensure the resulting JAM measures would satisfy the requirements for proactive safety and smoothly integrate with real-time operations. First, we began with a cognitive task analysis (CTA) involving expert IPs to understand the work environment. From interviews and observations, we built an abstraction network including important goals, processes, components, and relationships in the work, which directly translated to key performance measures of system success. We then cross-referenced CTA findings with multiple macrocognitive functions to operationalize measures of meaningful challenges to the system (Patterson et al., 2010). With this subset of measures, we proposed a plan to instrument the daily work tools of IPs with methods to unobtrusively and automatically collect performance behaviors. These measures and plans were then reviewed and revised iteratively with subject matter experts and stakeholders from the department of infection prevention. Once agreement was reached, collection mechanisms for these measures were detailed and built into the code of the IPs’ daily tools.
IMPLEMENTATION & CONSTRAINTS
Original Measure Proposal
After several rounds of user and domain research, the human factors team, who specialized in JAT and JAM, proposed the set of measures in table 1. These measures aimed to characterize the performance of the system by both the accuracy and efficiency of decisions or actions taken to prevent the spread of HAIs, which was consistent with high-level goals of the organization (identified in the abstraction network). The human factors team chose these measures with the belief that they would be a low-cost way to gather continuous insight about how the system strains to meet various stresses (i.e., how well the system is able to match capabilities to environmental demands). By tracking these measures over time, establishing expectations, and extrapolating trends, these measures could help proactively generate insight about future threats and actionable interventions. Therefore, from the perspective of the human factors team, the proposed measures in table 1 satisfied the monitoring requirements to support proactive safety.
Table 1.
Performance Measure | Description |
---|---|
Time to cluster detection (i.e., cluster-confirming actions) | Elapsed time from diagnosis of a second infected patient (in a cluster) to confirmation of the cluster |
Completeness of clusters identified | Proportion of cases correctly identified of a total cluster |
Efficiency of UV cleans | Proportion of total rooms ordered to be cleaned that were contaminated by patients in the cluster |
Accuracy of UV cleans | Proportion of total rooms contaminated by patients in the cluster that were ordered to be cleaned |
Timeliness of UV cleans | Elapsed time from ordering a room cleaning to execution |
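To make the definitions in Table 1 concrete, the sketch below computes the two UV-clean proportions and an elapsed-time measure in Python. As defined, "efficiency" corresponds to what is often called precision and "accuracy" to recall. The function names, room identifiers, and timestamps are hypothetical, and the sketch assumes that the set of truly contaminated rooms is known, which is a strong assumption in operational settings.

```python
from datetime import datetime


def uv_clean_measures(ordered_rooms, contaminated_rooms):
    """Efficiency and accuracy of UV cleans, following Table 1.

    efficiency: share of ordered rooms that were contaminated
    accuracy:   share of contaminated rooms that were ordered to be cleaned
    A measure is None when its denominator is zero.
    """
    ordered = set(ordered_rooms)
    contaminated = set(contaminated_rooms)
    overlap = ordered & contaminated
    efficiency = len(overlap) / len(ordered) if ordered else None
    accuracy = len(overlap) / len(contaminated) if contaminated else None
    return efficiency, accuracy


def elapsed_hours(start, end):
    """Elapsed time in hours between two timestamps."""
    return (end - start).total_seconds() / 3600


# Hypothetical example data (not from the study).
ordered = ["4A-01", "4A-02", "4A-03", "4B-10"]
contaminated = ["4A-01", "4A-02", "4C-07"]
efficiency, accuracy = uv_clean_measures(ordered, contaminated)
print(efficiency)  # 0.5
print(accuracy)    # 2 of 3 contaminated rooms were ordered

# Time to cluster detection: second diagnosis to cluster confirmation.
second_diagnosis = datetime(2021, 3, 1, 8, 0)
cluster_confirmed = datetime(2021, 3, 3, 14, 0)
print(elapsed_hours(second_diagnosis, cluster_confirmed))  # 54.0
```

The same `elapsed_hours` helper would apply to timeliness of UV cleans, using the order time and the execution time as the two timestamps.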
Rejection of Original Measures
The initial set of five performance measures in table 1 was proposed to the department of infection prevention for review; however, all five measures were subsequently ruled out and replaced with the three measures in table 2. None of the original measures proposed by the human factors team were deemed both organizationally and technically feasible by the subject matter experts and stakeholders in the department of infection prevention. This was a surprising finding because of the considerable research conducted by the human factors team to understand the domain and organization of IPs. In reviewing the driving forces behind these rejected measures, we identified three key constraints by comparing bottom-up (e.g., memos and personal notes taken on the numerous communications between the human factors team and the department of infection prevention) and top-down (e.g., factors of organizational acceptance) sources.
Table 2.
Performance Measure | Description |
---|---|
Number of clusters discussed | Total number of clusters discussed by individual IPs at weekly department meetings |
Aggregate average time spent in the EHR by IPs | Mean screen time in EHR (in minutes per day) |
Total number of UV cleans | Number of UV cleans initiated per day |
The most prominent factor hindering a measure’s organizational acceptance was insufficient perceived yield. The perceived benefit of implementing a new measure needs to outweigh the perceived cost in order to “fit through the door” of an organization (Fitzgerald, 2019). Although the scientific benefit of the original performance measures appeared theoretically sound, the perceived yield was substantially eroded by uncertainties over the quality of the input data. Unlike in experimental settings where actions can be compared to some (verified or constructed) ground truth, determining the correctness of operational actions in near real-time is inherently uncertain. As uncertainty about the real-world ground truth increases, the perceived benefit substantially decreases. As a result, several measures were determined to be non-value-added even with a low-cost implementation. We observed this pattern contributing to the rationale of ruling out “time to cluster detection”, “completeness of clusters identified”, “efficiency of UV cleans”, and “accuracy of UV cleans” due to uncertainty over which infection cases and rooms were truly in a cluster and contaminated. Combined with the additional workload required to generate more confident ground truth, these measures were ruled out by the department.
Another factor that reduced a measure’s acceptance was perceived conflicts with organizational goals. The department of infection prevention valued preserving a culture where IPs can work independently, both in deciding their own tempo of work and using their expertise to make the most reasonable decisions with the data they have. Both “time to cluster detection” and “completeness of clusters identified” required retrospective reviews and evaluations of the IPs’ completed work. The department was concerned that these evaluations, given inevitable uncertainties in this domain, might be perceived as overly constraining oversight and therefore jeopardize the organizational culture of independence.
The last constraint we observed was a lack of directability between interdependent parties. This makes a measure technically infeasible because the department is unable to effect changes in performance. “Timeliness of UV cleans” was constrained by a lack of ability to direct when a room is UV-cleaned after a cleaning order was initiated. UV cleans could not be conducted in-house by the department, creating an interdependency between the department and the contractor through whom the work needed to be carried out. The timeliness of UV cleans was highly influenced by the contractor’s schedule and workload, which was not directable by others in the current setting. Therefore, even if the measure was useful for generating insight, the lack of control hindered any actionable interventions.
Revised Measure Proposal
The revised measures proposed by the department of infection prevention in table 2 were both organizationally acceptable and technically feasible; however, these measures were not alone sufficient to instrument a proactive monitoring capability. The revised measures had limited utility to generate forward-facing insight on the system’s capabilities to match future demands because the measures confound performance of the system with the difficulty of the situation. “Number of clusters discussed” does not differentiate the system’s simultaneous pursuit of both efficiency and accuracy in preventing ongoing infections. Increasing challenges could either increase or suppress the number of clusters discussed as IPs may experience the need to either collaborate more with others or allocate more time towards individual investigation activities. “Aggregate average time spent in the EHR by IPs” and “total number of UV cleans” share similar disadvantages. Consequently, while these revised measures might be sufficient summary metrics for retrospectively evaluating system performance, the inability to extrapolate future system behaviors limits their usefulness for proactive safety.
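The confounding described above can be illustrated with a small worked example. The Python sketch below compares two hypothetical days that produce the same "total number of UV cleans" even though the underlying situations differ sharply. The room identifiers, the `summarize` helper, and the assumption that the truly contaminated rooms are known are all illustrative and not drawn from the study data.

```python
def summarize(day):
    """Return (total UV cleans, share of contaminated rooms that were cleaned)."""
    ordered = day["ordered"]
    contaminated = day["contaminated"]
    total_cleans = len(ordered)
    coverage = len(ordered & contaminated) / len(contaminated)
    return total_cleans, coverage


# Two hypothetical days with identical counts but different demands.
days = {
    "low_demand_day": {
        "ordered": {"r1", "r2", "r3", "r4"},
        "contaminated": {"r1", "r2", "r3", "r4"},
    },
    "high_demand_day": {
        "ordered": {"r1", "r2", "r3", "r4"},
        "contaminated": {"r1", "r5", "r6", "r7", "r8", "r9", "r10", "r11"},
    },
}

for name, day in days.items():
    print(name, summarize(day))
# low_demand_day (4, 1.0)
# high_demand_day (4, 0.125)
```

Both days report four UV cleans, so the count alone cannot distinguish a system that is keeping pace with demand from one that is falling behind it.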
In addition to the measures proposed in table 2, the department of infection prevention also suggested pre- and post-implementation surveys for understanding system performance. Although such survey questions (like the examples in table 3) were accepted by the organization and provided feasible means to collect performance-related data, they too are insufficient for proactive monitoring. First, data collected from pre- and post-implementation surveys provide only two points of measurement instead of continuously tracking critical performance measures over time. While this may offer a good measure of whether the new system is generally better or worse than before, it offers little support to proactively guide adaptations, especially as novel and unexpected situations arise. Second, the reliance of surveys on self-reporting raises potential issues of reliability (Nisbett & Wilson, 1977). While self-reporting survey methods can be good indicators of probable acceptance, their unreliability and high cost make them poorly suited to a proactive monitoring capability.
Table 3.
Category | Question |
---|---|
Experiences | The decision-support tool has made time to cluster identification longer/shorter |
Experiences | The decision-support tool has helped me identify fewer/more clusters |
Skill confidence | I am confident that I can identify potential clusters quickly |
Skill confidence | I am confident that I can accurately identify all cases within a cluster |
DISCUSSION
Ultimately, the first round of iterations to instrument JAM performance measures yielded no measures that meet all the requirements of a successful proactive monitoring program. The initial set of five performance measures had the potential to increase the system’s proactiveness, but they were either not organizationally acceptable or not technically feasible. The three revised measures and survey questions proposed by the department of infection prevention were organizationally acceptable and technically feasible, but were insufficient to support the monitoring capabilities needed for proactive safety. However, these challenges illuminate some of the constraints that shape JAM measures throughout the life-cycle of the implementation process. For a proactive monitoring capability to have the greatest chance of success, it is equally important for measures to meet conceptual, organizational, and technological requirements.
The initial set of measures proposed by the human factors team, while satisfying the conceptual requirements for proactive safety, were all rejected as organizationally unacceptable or technologically infeasible. From this experience, we propose expanding the list of requirements beyond the conceptual to include organizational and technological requirements. Beyond the conceptual requirements, future measures must also:
- have sufficiently high perceived yield to offset the uncertainty about perceived benefits and anticipated costs (workload, financial, relationship, etc.)
- align with organizational goals and culture
- reflect the structure of interdependent work and mechanisms of control
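One way to keep the combined conceptual, organizational, and technological requirements visible during measure design is to record, for each candidate measure, which requirements it fails. The Python sketch below encodes the outcomes reported in this paper in that form. The short requirement labels and the helper names `passes_all` and `blocked_by` are illustrative shorthand introduced here, not terminology used by the department or the human factors team.

```python
# Unmet requirements per measure, as described in the text of this paper.
# Requirement labels are illustrative shorthand (assumption).
UNMET = {
    "Time to cluster detection": {"perceived yield", "organizational goals"},
    "Completeness of clusters identified": {"perceived yield", "organizational goals"},
    "Efficiency of UV cleans": {"perceived yield"},
    "Accuracy of UV cleans": {"perceived yield"},
    "Timeliness of UV cleans": {"directability"},
    "Number of clusters discussed": {"forward-facing insight"},
    "Aggregate average time spent in the EHR by IPs": {"forward-facing insight"},
    "Total number of UV cleans": {"forward-facing insight"},
}


def passes_all(measure):
    """A measure passes only if it has no unmet requirement."""
    return not UNMET[measure]


def blocked_by(requirement):
    """List the measures that fail a given requirement."""
    return sorted(m for m, unmet in UNMET.items() if requirement in unmet)


print([m for m in UNMET if passes_all(m)])  # []
print(blocked_by("directability"))  # ['Timeliness of UV cleans']
```

Screening candidates against all requirement types at once, rather than one type at a time, reflects the suggestion that these factors be considered in tandem rather than sequentially.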
Furthermore, the revised sets of measures generated by the team remained insufficient for supporting proactive safety on conceptual grounds. At minimum, this suggests gaps in team members’ mental models, despite adopting a teaming structure with explicit processes for continually realigning mental models (Li et al., 2021). The challenges detailed in this paper lend further support that the soundness of the scientific idea is just one small part of a successful implementation. The degree of mental model alignment, pragmatics of the implementation, and capabilities of the actors are just as, if not more, important for making an organizational impact (Fitzgerald, 2019). Implementations might have a greater chance of success if these various factors are considered in tandem rather than sequentially.
Proactive safety has been, and will likely continue to be, difficult to instill within real-world organizations navigating complex domains and trade-offs. In addition to requiring a foundational shift in mindset (Provan et al., 2020), proactive safety mechanisms must satisfy fundamental requirements of implementation. Our experiences instrumenting daily work tools with the capabilities to support unobtrusive, real-time, proactive monitoring have revealed non-trivial implementation challenges. Successful proactive safety mechanisms will need to navigate these same trade-offs and, in addition to the considerable conceptual challenges, satisfy organizational and technological requirements.
ACKNOWLEDGEMENTS
This project was supported by an award from the Agency for Healthcare Research and Quality (AHRQ: R01HS027200). The content is solely the responsibility of the authors and does not necessarily represent the official views of AHRQ.
REFERENCES
- Branlat M, & Woods DD (2010). How do systems manage their adaptive capacity to successfully handle disruptions? A resilience engineering perspective. In 2010 AAAI fall symposium series.
- Fitzgerald MC (2019). The IMPActS Framework: the necessary requirements for making science-based organizational impact. (Master’s thesis, The Ohio State University).
- Hollnagel E (2009). The four cornerstones of resilience engineering. In Resilience Engineering Perspectives, Volume 2 (pp. 139–156). CRC Press.
- Hollnagel E (2014). Safety-I and Safety-II. Ashgate Publishing, Ltd.
- Hollnagel E, Wears RL, & Braithwaite J (2015). From Safety-I to Safety-II: A White Paper. The Resilient Health Care Net: Published simultaneously by the University of Southern Denmark, University of Florida, USA, and Macquarie University, Australia.
- Klein G, Pliske R, Crandall B, & Woods DD (2005). Problem detection. Cognition, Technology & Work, 7(1), 14–28. doi: 10.1007/s10111-004-0166-y
- Li M, Morey DA, & Rayo MF (2021). Symbiotic Design Application in Healthcare: Preventing Hospital Acquired Infections. In Proceedings of the International Symposium on Human Factors and Ergonomics in Health Care (Vol. 10, No. 1, pp. 211–216). Sage CA: Los Angeles, CA: SAGE Publications.
- Morey DA, Marquisee JM, Gifford RC, Fitzgerald MC, & Rayo MF (2020). Predicting Graceful Extensibility of Human-Machine Systems: A New Analysis Method for Evaluating Extensibility Plots to Anticipate Distributed System Performance. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 64(1), 313–318. doi: 10.1177/1071181320641072
- Murphy T, Balkin A, Rayo M, Woods DD, & Zelik D (2017). Integrated Multi-Method Probes as a Research Method in Cognitive Systems Engineering. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 61(1), 207–211. doi: 10.1177/1541931213601536
- Nisbett RE, & Wilson TD (1977). Telling more than we can know: Verbal reports on mental processes. Psychological review, 84(3), 231.
- Patterson ES, Roth EM, & Woods DD (2010). Facets of complexity in situated work. In Miller JE & Patterson ES (Eds.). Macrocognition metrics and scenarios: Design and evaluation for real-world teams. Ashgate.
- Provan DJ, Woods DD, Dekker SWA, & Rae AJ (2020). Safety II professionals: How resilience engineering can transform safety practice. Reliability Engineering & System Safety, 195, 106740. doi: 10.1016/j.ress.2019.106740
- Rayo MF, Fitzgerald MC, Gifford RC, Morey DA, Reynolds ME, D’Annolfo K, & Jefferies CM (2020). The Need for Machine Fitness Assessment: Enabling Joint Human-Machine Performance in Consumer Health Technologies. Proceedings of the International Symposium of Human Factors and Ergonomics in Healthcare, 9(1), 40–42. doi: 10.1177/2327857920091041
- Woods DD, & Hollnagel E (2006). Joint cognitive systems: Patterns in cognitive systems engineering. CRC Press.
- Woods DD, Branlat M, Herrera I, & Woltjer R (2015). Where Is the Organization Looking in Order to Be Proactive about Safety? A Framework for Revealing whether It Is Mostly Looking Back, Also Looking Forward or Simply Looking Away. Journal of Contingencies and Crisis Management, 23(2). doi: 10.1111/1468-5973.12079
- Woods DD (2019). Essentials of resilience, revisited. Handbook on resilience of socio-technical systems.