Author manuscript; available in PMC: 2020 Aug 18.
Published in final edited form as: Opflow. 2017 Apr 1;43(4):30–32. doi: 10.5991/opf.2017.43.0025

CANARY Eases Water Quality Event Detection

John Hall 1, Sri Panguluri 2, Regan Murray 1, Jonathan Burkhardt 1
PMCID: PMC7433799  NIHMSID: NIHMS1540924  PMID: 32831432

Abstract

A free software tool from the US Environmental Protection Agency was put through its paces at a midsize water utility and bolstered the utility’s water quality event-detection capabilities.


Event-detection software can enhance detection of contamination incidents in distribution systems using typical water quality sensors, such as those for free chlorine, total organic carbon, and electrical conductivity. Data analytics software increases the likelihood and speed of detection by interpreting data from strategically placed sensors in near real time, identifying anomalies, and alerting operators to a potential contamination event.

CANARY, a free event-detection software tool the US Environmental Protection Agency (USEPA) and Sandia National Laboratories developed as a research prototype, has been used by large drinking water utilities and been in operation at some locations for more than five years. Event-detection software benefits water utilities’ daily operations by helping to ensure water quality throughout the distribution system is maintained within desired ranges and to rapidly detect operational problems (e.g., chemical overdose, pipe breaks, and cross connections) when they occur.

Initial feedback from users of early versions of CANARY event-detection software indicated that installation and set-up costs were high, the configuration process was complex, and there were no standard operating procedures for managing alarms. This has led to the perception that only larger utilities with large budgets for control-room software support should consider deploying event-detection software. Recent versions of CANARY have improved the technology to reduce false alarms, and a report is available describing a simple configuration process. During USEPA Water Security Initiative pilots, standard approaches were developed to help guide utilities through the steps to investigate an alarm. A new version of CANARY has been developed, using the Java programming language to improve integration with commercial software packages, so CANARY can be deployed alongside other helpful software tools and supported by commercial companies. Multiple vendors have integrated the new CANARY Java runtime into their software. A vendor’s implementation of the software for a water utility in Akron, Ohio, is discussed here.

Although the CANARY software is free to download from the Web, implementation costs have ranged from $30,000 to $90,000, according to reported data from three water utilities. The cost was high because each implementation had to be customized using CANARY’s non-Java version. The study reported here demonstrates how the CANARY event-detection software was implemented along with real-time hydraulic modeling software for a much lower cost (about $5,000) at Akron Water Works (~85,000 customers).

STUDY DETAILS

The city of Akron was looking for a new event-detection software to monitor distribution-system events for pumps, tanks, pressure, and water quality on a full-time basis. Akron had previously installed five sensor stations (11 sensors in total) throughout its distribution system. The sensor data were stored in the utility’s supervisory control and data acquisition (SCADA) system.

Akron was using an existing commercial hydraulic-modeling software but was considering upgrading to a “real-time” hydraulic model. A vendor entered into a memorandum of understanding with Akron to pilot its real-time hydraulic and water quality modeling software, which included the CANARY event-detection functionality in addition to the vendor’s anomaly detection functionality. USEPA collaborated with the vendor on the Akron case study to estimate the implementation costs of the new architecture and periodically analyze the CANARY outputs during the study.

The software platform used for real-time modeling of hydraulics and water quality in Akron’s distribution system comprised several components, including a real-time simulation engine (RMX) based on EPANET, a data transformation engine (XFX), and a CANARY-based event-detection engine (EDX). RMX uses SCADA data to set the model’s initial conditions (e.g., pump status, valve settings, and initial tank levels) and to change boundary conditions (e.g., flows and pressures at the entrance to the distribution system), with the exception of tank levels, at each simulation time-step. Incorporating actual field data greatly improves the model’s output accuracy.

XFX is mainly used to convert tank-level changes into tank demands using tank geometries and to reduce noise from the tank-level sensors in real time. XFX extracts the required data from a SCADA historian database, performs the computations, and stores the output into a local lightweight database. EDX uses the same database and the Java version of CANARY to detect events and generate alarms. Noise removal is an option for every signal, but the CANARY inputs weren’t smoothed in this study. Figure 1 shows the data flow between the modular software components.
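The tank-level conversion described above can be illustrated with a minimal sketch. It is not the vendor's XFX implementation. It assumes a vertical cylindrical tank, evenly spaced level readings, and a simple trailing moving average as the noise filter. The function names, units, and sign convention (a rising level gives a positive net inflow) are hypothetical choices made for illustration.

```python
import math


def tank_net_inflow(levels_m, dt_s, diameter_m):
    """Net inflow (m^3/s) between successive level readings.

    Assumes a vertical cylindrical tank, so volume change is simply
    cross-sectional area times level change. Positive means filling.
    """
    area_m2 = math.pi * (diameter_m / 2.0) ** 2
    return [area_m2 * (b - a) / dt_s for a, b in zip(levels_m, levels_m[1:])]


def moving_average(values, window):
    """Trailing moving average; early points use the samples available so far."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


if __name__ == "__main__":
    raw_levels = [5.00, 5.02, 5.01, 5.05, 5.04]  # hypothetical readings, metres
    levels = moving_average(raw_levels, window=3)
    flows = tank_net_inflow(levels, dt_s=300.0, diameter_m=20.0)
    print(flows)
```

Smoothing the levels before differencing matters because differencing amplifies sensor noise. A real tank with a non-uniform cross section would need a level-to-volume curve instead of a single area.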

Figure 1.

Data Flow. The Akron project’s data flowed from SCADA historian to EDX to RMX.

SOFTWARE DEPLOYMENT

The following four steps detail the Akron software deployment:

Installing the Software and Mapping to SCADA Data.

The software suite was deployed initially onto a city of Akron virtual machine on Feb. 5, 2015. Live connections with the city of Akron’s SCADA historian database were established with assistance from the city of Akron. The SCADA data and assets then were associated and mapped with the modeling and event-detection components. SCADA tags and descriptions were obtained for tank levels, pump station flows, pump status, and water quality. After mapping, the system was configured to display model-predicted data versus the system-reported SCADA data.

Data Handling and Data Quality.

To maximize functionality within EDX, the XFX module was configured to provide the CANARY algorithm with “training data” from the past to ensure it could immediately begin producing alarms (if applicable). CANARY uses a history window to train the algorithm, which during normal operations has no impact, but requires time to populate during initial startup. This new feature removed the lag in alarm generation and ensured EDX could provide meaningful alarms immediately if it needed to be restarted.
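The idea of supplying past data so the history window is full at startup can be sketched as follows. This is an illustration of the general technique only. The `HistoryWindow` class and `primed_window` function are hypothetical and do not reflect CANARY's or the vendor's actual interfaces.

```python
from collections import deque


class HistoryWindow:
    """Fixed-size sliding window of recent values (hypothetical helper)."""

    def __init__(self, size):
        self.size = size
        self.values = deque(maxlen=size)  # oldest values drop off automatically

    def add(self, value):
        self.values.append(value)

    def is_ready(self):
        """True once enough history exists to evaluate new data."""
        return len(self.values) == self.size


def primed_window(size, historical_values):
    """Build a window pre-filled with the most recent stored values."""
    window = HistoryWindow(size)
    for value in historical_values[-size:]:
        window.add(value)
    return window


if __name__ == "__main__":
    stored = [1.10, 1.12, 1.09, 1.11, 1.13]  # hypothetical chlorine data, mg/L
    cold = HistoryWindow(3)
    warm = primed_window(3, stored)
    print(cold.is_ready(), warm.is_ready())
```

A cold-started window cannot evaluate data until it has filled, whereas a primed window is ready on the first live reading after a restart.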

During data assessment, it was observed that some obviously bad-quality data still registered as good within the SCADA database. This included high/low water quality results, negative pump station flows, and unreliable turbidity data. To account for this behavior, the XFX module was set up to filter out bad-quality data when processing the data into the local database. For the city of Akron, only raw data with a quality value of 75 or higher are used, on a scale where 100 represents good quality and 0 represents poor quality. The curated data were supplied to the EDX module.

Tuning CANARY Algorithm Input Parameters.

Initially, CANARY was set up using default configuration parameters. During the initial four-month period, with the initial configuration values, CANARY alarmed 98 times (or about 0.87 alarms/day). After a simple process to improve the configuration, the number of alarms was reduced to 0.42 alarms/day. Although some of these alarms may have been false alarms, many indicated true changes in water quality.
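The quality-threshold filtering described in the data-handling step can be sketched as below. The threshold of 75 on a 0 to 100 scale comes from the text. The `Reading` record layout, tag names, and function name are assumptions made for illustration and are not taken from the XFX module.

```python
from typing import NamedTuple


class Reading(NamedTuple):
    """One SCADA sample (hypothetical layout)."""
    tag: str
    value: float
    quality: int  # scaled quality: 100 = good, 0 = poor


MIN_QUALITY = 75  # threshold reported for the Akron deployment


def filter_good(readings, min_quality=MIN_QUALITY):
    """Keep only readings whose quality is at or above the threshold."""
    return [r for r in readings if r.quality >= min_quality]


if __name__ == "__main__":
    samples = [
        Reading("TANK1_CL2", 1.12, 100),  # hypothetical tag names
        Reading("TANK1_CL2", 1.10, 75),
        Reading("TANK1_CL2", 9.99, 40),
    ]
    for reading in filter_good(samples):
        print(reading)
```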

Offline Testing, Evaluation, and Real-time Operation.

EDX was deployed offline in Akron and monitored for four months at four-week intervals. During this period, Akron didn’t respond directly to alarms, but simply gathered data to analyze why alarms were occurring. As a result of this evaluation, input parameters were further modified to reduce false alarms. Numerous CANARY alarms were found to relate to normal system operation and were accepted by the control room operators.

A few events were investigated to determine the physical causes. For example, one alarm resulted from a spike in the chlorine levels at one of the tanks, and the cause was determined to be a spike in the chlorine concentration at the treatment plant as a result of an operational change. The lag between the two events also provided a good estimate of the travel time for that particular day. Such valuable information gathered during normal operations can reduce the need for expensive tracer studies. Figure 2 shows another event in which a chlorine-concentration drop was detected in one of the tanks. The blue data points turn red when the event probability goes over the set threshold of 0.5. At the end of the case study, the software suite was fully configured and ready for real-time operation.
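The thresholding behind the color change in Figure 2 can be sketched as follows, assuming a series of event probabilities has already been computed. Only the 0.5 threshold comes from the text. The function names and sample values are hypothetical, and how CANARY derives the probabilities is not shown.

```python
EVENT_THRESHOLD = 0.5  # threshold stated in the case study


def flag_events(probabilities, threshold=EVENT_THRESHOLD):
    """Mark each time step whose event probability exceeds the threshold."""
    return [p > threshold for p in probabilities]


def event_spans(flags):
    """Group consecutive flagged steps into (start, end) index pairs, end inclusive."""
    spans = []
    start = None
    for i, flagged in enumerate(flags):
        if flagged and start is None:
            start = i
        elif not flagged and start is not None:
            spans.append((start, i - 1))
            start = None
    if start is not None:
        spans.append((start, len(flags) - 1))
    return spans


if __name__ == "__main__":
    probs = [0.1, 0.5, 0.6, 0.9, 0.2, 0.7]  # hypothetical values
    print(event_spans(flag_events(probs)))
```

Grouping flagged steps into spans gives one record per event rather than one per data point, which is closer to how an operator would review alarms.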

Figure 2.

Free Chlorine Drop Event. CANARY detected a chlorine concentration drop in one of Akron’s water storage tanks.

IMPLEMENTATION COSTS AND FINAL RESULTS

The cost to implement CANARY functionality at a utility may vary depending on the complexity of the sensor network and other factors. The Akron case study demonstrated that for a medium-sized utility, CANARY functionality could be accomplished for approximately $5,000. The vendor leveraged its previous experience with CANARY and the new Java-based application programming interface to integrate event-detection functionality into its real-time hydraulic-modeling software.

Implementing CANARY with real-time modeling software provided the city of Akron with a robust event-detection capability coupled with advanced modeling for a lower cost than had been previously feasible. The ability to run the software on a “plain-vanilla” virtual machine eliminated the need for expensive proprietary hardware. Also, reviewing the alarms and real-time modeling results provided insight into the city’s system operations.

The case study addressed the concerns that water utilities interested in event-detection software had expressed. The configuration process was simplified by using the default configuration parameters that have been developed and are now available for CANARY, and the project team’s prior experience with CANARY helped to reduce total effort. The overall costs of installing and running event-detection systems are low, in this case less than $5,000. By working with commercial partners, utilities can also expect quality customer support.

The case study also demonstrated that the XFX module could be used to transform SCADA data in a reliable manner for use in the RMX and EDX modules. The runtime performance of the new Java version of CANARY was excellent, and the software worked well even on a modest virtual machine. EPANET has been widely used in commercial software as a hydraulic and water quality modeling engine, and CANARY is now positioned for adoption as a stable event-detection engine.

RESOURCES