Author manuscript; available in PMC: 2020 Jan 1.
Published in final edited form as: Stud Health Technol Inform. 2019;257:473–478.

Analysis of Anesthesia Screens for Rule-based Data Quality Assessment Opportunities

Zhan WANG a, Melody PENNING b,1, Meredith ZOZUS b
PMCID: PMC6692112  NIHMSID: NIHMS1044273  PMID: 30741242

Abstract

A rule-based data quality assessment system for electronic health record (EHR) data was explored through compilation of over six thousand data quality rules and twenty-two rule templates. To overcome the lack of knowledge sources and to identify additional rules and rule templates, thirty-three anesthesia (perioperative period) EHR screens were reviewed. We analyzed the data elements appearing on the anesthesia screens and the relationships between them to identify new data quality rules and rule templates relevant to anesthesia care. We present the review process as well as the new rules and rule templates identified. We found decomposition and analysis of EHR screens to be a viable mechanism for acquisition of new data quality rules, and the results suggest that the number of rules is likely tractable and their management scalable.

Keywords: Electronic health record, data quality

1. Introduction

A rule-based data quality assessment system has been developed to identify and monitor data discrepancies in healthcare facility Electronic Health Record (EHR) systems [1][2]. In the system, the rules are categorized and managed according to rule templates. Rules categorized under the same rule template share the same topic and logical structure. An example of such a rule template is: flag the record if GENDER is equal to some invalid gender and DIAGNOSIS is equal to a corresponding invalid diagnosis. The clinical information in the rules (in the example, the list of gender–diagnosis incompatibilities) was extracted from existing rule or knowledge resources and compiled into a knowledge table against which the rule template runs [1][2]. The existing twenty-two rule templates identify likely errant data through incompatibility, physically impossible values, invalid temporal sequence of events, absence of expected co-occurrence, and impossible duplication.
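To make the separation between template logic and clinical knowledge concrete, the following is a minimal sketch (in Python; not the authors' implementation, and the incompatible pairs shown are illustrative only) of how the gender–diagnosis template might run against its knowledge table:

```python
# Minimal sketch: the rule template is generic logic; the clinical
# content lives in a knowledge table. Entries here are illustrative.
INCOMPATIBLE_GENDER_DIAGNOSIS = {
    ("male", "ovarian cyst"),
    ("female", "benign prostatic hyperplasia"),
}

def flag_gender_diagnosis(record: dict) -> bool:
    """Rule template: flag the record if GENDER equals an invalid gender
    and DIAGNOSIS equals the corresponding invalid diagnosis."""
    key = (record.get("GENDER", "").lower(),
           record.get("DIAGNOSIS", "").lower())
    return key in INCOMPATIBLE_GENDER_DIAGNOSIS

# A record like this would be flagged for review:
assert flag_gender_diagnosis({"GENDER": "Male", "DIAGNOSIS": "Ovarian cyst"})
```

Under this design, new clinical knowledge is added by extending the table rather than writing new rule code, which is what makes rule management scalable.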

As previously reported, identification of knowledge sources for data quality rules in healthcare was challenging. Through consolidation of existing rule sets we distilled a set of 6,357 rules across eleven of the rule templates [2]. Knowledge sources did not exist, and could not easily be created, for half of the twenty-two rule templates initially identified. This remains an impediment to rule-based data quality assessment and monitoring in healthcare. A second challenge lies in identification of the rule templates themselves. The work presented here addresses both challenges by assessing the extent to which new rules can be identified through methods drawn from clinical trials data cleaning.

2. Background

EHR adoption rates have reached 96% in US hospitals and 87% in clinics [3], yet the quality of healthcare data remains questionable. Reports of errors in healthcare data are numerous, and such errors impact secondary data use and potentially patient care and safety [4–14]. When error rates grow, the value of data for physicians as consumers of data may be diminished. Attempts to manage this problem with rule-based quality assessments have succeeded, but only on a small scale, while healthcare datasets are rapidly increasing in size [16]. In 2003, Brown and Warmington presented data quality probes to find data quality problems and improve data quality in EHRs [17]. Data errors can occur at every step in a clinical encounter, including assessment, data entry, data retrieval, information interpretation, and action. A data quality probe consists of a rule implemented as a query in a clinical information system to find inconsistencies between two or more associated data items. EHR information quality was then tracked using the number of flagged exceptions to the rules, and results were reported to clinicians to encourage improvement.
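A hypothetical sketch of such a probe, in the spirit of [17], follows; the schema, table, and code values are invented for illustration, and the exception count returned is the quantity that would be trended and reported back to clinicians:

```python
# Sketch of a data quality probe: a rule expressed as a query whose
# exception count is tracked over time. Schema and values are invented.
import sqlite3

PROBE_SQL = """
SELECT COUNT(*)
FROM encounters
WHERE diagnosis = 'pregnancy supervision'  -- one data item ...
  AND patient_sex = 'M'                    -- ... inconsistent with an associated item
"""

def run_probe(conn: sqlite3.Connection) -> int:
    # The returned count of flagged exceptions serves as the
    # information-quality metric for this probe.
    return conn.execute(PROBE_SQL).fetchone()[0]
```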

Recent studies have moved beyond the reporting of data quality to the correction of the errors found, and thus the improvement of the healthcare data within the system [18]. Hart and Kuo (2017) used rule-based discrepancy detection to measure, report, and resolve data quality problems [19]. Data quality rule results were used as feedback to healthcare providers, who made corrections. The work reported a greater than 50% decrease in data errors over six months.

Our approach goes a step further and provides a scalable framework with which data quality rules can be shared as templates and knowledge tables and used for improvement across health systems. In previous work we identified rules from the existing literature, grouped them into sets of logical impossibility and inconsistency, and devised a rules-management approach that allows for sharing and reuse at multiple institutions [1]. In this research we address the need to extend and test the set of rules to cover areas of the EHR not addressed by other rule sets. This work specifically focuses on anesthesia-related data elements in perioperative care.

3. Methods

New mechanisms are needed for the acquisition of data quality rules in healthcare. Toward this aim, we have taken advantage of EHR screens as a resource for identification of important data elements. We apply rule identification methods commonly used in clinical trial data cleaning [20]. Unlike clinical trials, which use a high-sensitivity screening approach and broadly identify discrepant data, we are concerned only with definite errors [21]. Data error identification can be approached using rules that focus on logical inconsistencies, physically impossible values, and nonsensical relationships between data values. Logical inconsistencies are discoverable by examining the electronic forms where data are captured; these issues are often readily apparent from a data collection form. To identify potential rules, each data element on a form is evaluated at several levels. The most basic level is the individual data element, where physically impossible values are identified. At the next level, relationships between data elements on a form are evaluated to identify any in which an illogical sequence or relative magnitude can be exploited to identify data errors. Finally, interactions between data elements on one form and those on other forms are examined for exploitable relationships. Each screen is carefully reviewed, considering all possible combinations of data elements, to determine which combinations are impossible, which are unlikely, and which must (or must not) co-occur. In this manner, thirty-three anesthesia screens from our institutional EHR were reviewed by two independent reviewers to identify new rules and rule templates.
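The three review levels can be expressed as executable checks. The sketch below is illustrative only; the field names are invented and do not correspond to our institutional EHR:

```python
# Illustrative checks for the three review levels (field names invented).

def element_level_ok(value: float, low: float, high: float) -> bool:
    """Level 1: a single data element against physically possible bounds."""
    return low <= value <= high

def intra_form_ok(form: dict) -> bool:
    """Level 2: relative magnitude within one form, e.g. a diastolic
    pressure cannot exceed its paired systolic pressure."""
    return form["diastolic_nibp"] <= form["systolic_nibp"]

def inter_form_ok(intraprocedure: dict, demographics: dict) -> bool:
    """Level 3: sequence across forms, e.g. anesthesia start cannot
    precede the date of birth recorded on another form."""
    return intraprocedure["anesthesia_start"] >= demographics["date_of_birth"]
```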

4. Results

4.1. Identification of Rules

Our institutional anesthesia and perioperative care EHR screens comprise five modules recording anesthesia procedures: orders, preprocedure, intraprocedure, postprocedure, and out-of-operating-room procedures. We reviewed all five modules. All of the new potential rules were identified among the lab data elements and the temporal sequence of events on the intraprocedure screens; no new rules were identified from the other four modules.

4.1.1. Intraprocedure Lab Results

Lab and other measured quantity results are recorded on intraprocedure screens and are important parameters that affect clinicians' decision making. Data errors in measured quantities can arise for several reasons, including problems with the sample, problems with the instrumentation, recording mistakes, or incorrect units.

These problems can be resolved for future data, and the existing errant values flagged. Thus, data quality rules to identify data errors in measured physical quantities may be beneficial. All identified measured physical quantity rules were accommodated by one of the existing twenty-two rule templates: numerical quantity out of range. After reviewing the intraprocedure screens, 20 new rules were identified (Table 1) and added to the knowledge table.

Table 1.

Additional Rule Knowledge Identified in the Intraprocedure EHR Screens

| Lab Test | Valid Low | Valid High | Units |
|---|---|---|---|
| infusion | 0 | 77 | ml/kg |
| blood loss | 0 | 77 | ml/kg |
| FiO2 (fraction of inspired oxygen) | 10 | 100 | % |
| ETCO2 (end-tidal CO2) | 20 | 60 | mmHg |
| PaO2 (partial pressure of oxygen) | 30 | 200 | mmHg |
| PRBC NR (270–350 ml/unit) | 0 | 2 | units |
| platelet (200 ml/unit) | 0 | 2 | units |
| plasma (200 ml/unit) | 0 | 2 | units |
| cell saver | 0 | 77 | ml/kg |
| systolic noninvasive blood pressure (NIBP) | 10 | 500 | mmHg |
| diastolic noninvasive blood pressure (NIBP) | 10 | 500 | mmHg |
| mean noninvasive blood pressure (NIBP) | 10 | 500 | mmHg |
| systolic invasive arterial blood pressure | 10 | 500 | mmHg |
| diastolic invasive arterial blood pressure | 10 | 500 | mmHg |
| mean invasive arterial blood pressure | 10 | 500 | mmHg |
| central venous pressure (CVP) | 0 | 20 | mmHg |
| pulmonary artery systolic pressure | 5 | 100 | mmHg |
| pulmonary artery diastolic pressure | 5 | 100 | mmHg |
| pulmonary artery mean pressure | 5 | 100 | mmHg |
| intracranial pressure (ICP) | 0 | 50 | mmHg |
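As a minimal sketch (in Python; illustrative, not the authors' system), the numerical-quantity-out-of-range template can run generically over a knowledge table such as Table 1, here with a small subset of the entries inlined:

```python
# Knowledge table subset from Table 1: test -> (low, high, expected units).
VALID_RANGES = {
    "ETCO2": (20.0, 60.0, "mmHg"),
    "FiO2":  (10.0, 100.0, "%"),
    "CVP":   (0.0, 20.0, "mmHg"),
}

def flag_out_of_range(test: str, value: float, units: str) -> bool:
    """Flag the result if the value falls outside the valid range or was
    recorded in unexpected units."""
    if test not in VALID_RANGES:
        return False  # no knowledge for this test; do not flag
    low, high, expected_units = VALID_RANGES[test]
    return units != expected_units or not (low <= value <= high)

# An end-tidal CO2 of 75 mmHg would be flagged as a likely data error.
assert flag_out_of_range("ETCO2", 75.0, "mmHg")
```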

4.1.2. Intraprocedure Temporal Sequence

Seventy-eight events can be added on the anesthesia procedure screens. Temporal relationships between them were analyzed to identify any two event dates occurring in an invalid order. For example, it is impossible for anesthesia start to occur after anesthesia stop. Similarly, most events cannot occur before the date of birth or after the date of death. This type of error was accommodated by our existing temporal sequence rule template. The analysis produced 58 new temporal sequence error rules (a truncated list is shown in Table 2) for addition to the knowledge table.

Table 2.

Temporal Sequence Rules

| Date 1 | Invalid Order | Date 2 |
|---|---|---|
| anesthesia start | after | anesthesia stop |
| start data collection | after | stop data collection |
| intubation | after | extubation |
| pause billing time | after | resume billing |
| handoff | after | transport to ICU |
| procedure start | after | procedure finish |
| intubation | after | one-lung ventilation |
| intubation | after | two-lung ventilation |
| LMA applied | after | airway removed |
| transport to ICU | after | procedure start |
| start in OR recovery | after | stop in OR recovery |
| date of birth | after | anesthesia start |
| date of birth | after | induction |
| anesthesia start | after | date of death |
| induction | after | date of death |
| deep sedation | after | date of death |
| intubation | after | date of death |
| anesthesia stop | after | date of death |
| emergence | after | date of death |
| LMA applied | after | date of death |
| extubation | after | date of death |
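The temporal sequence template admits the same generic treatment as the range checks. The following sketch is illustrative (event names and structure are ours, not the authors' implementation); each rule is a (Date 1, Date 2) pair from a knowledge table like Table 2:

```python
from datetime import datetime

# Knowledge table subset from Table 2: pairs whose reversal is invalid.
SEQUENCE_RULES = [
    ("anesthesia_start", "anesthesia_stop"),
    ("intubation", "extubation"),
    ("date_of_birth", "anesthesia_start"),
]

def flag_sequence_errors(events: dict[str, datetime]) -> list[tuple[str, str]]:
    """Return the (Date 1, Date 2) rule pairs violated by recorded times,
    skipping rules for which either event was not recorded."""
    return [
        (first, second)
        for first, second in SEQUENCE_RULES
        if first in events and second in events and events[first] > events[second]
    ]
```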

5. Discussion

This extracted set of rules demonstrates the large number of opportunities for errors to occur in the EHR. Thirty-three anesthesia screens were reviewed to identify new rules and rule templates of consequence to patient care. The rules established here apply to individual data elements and to relationships between elements on and among the electronic anesthesia forms used during patient care. The measured physical quantity constraints apply only to individual numerical results, whereas the temporal sequence rules evaluate relationships between pairs of data elements.

Because studies have shown that improvements in healthcare data quality can result from rule-based analysis and reporting processes [18], we undertook this work to evaluate rule identification methods from clinical trials for use in healthcare. When we analyzed the EHR screens, we discovered that less than 50% of the information on the screens was in structured form and conducive to checking. We identified no data elements outside of measured physical quantities, dates, and times that were conducive to such checking. Given the limited number of clinical specialties and services, application of the clinical trial rule identification methods appears tractable within EHRs.

As noted in our earlier work, structured knowledge sources for physically impossible measured physical quantities do not exist. Our approach was to consult clinicians and identify ranges outside of which a data value would almost certainly be in error. However, procedural errors, such as taking a blood draw downstream from a saline infusion, can and do cause values outside these ranges. Thus, we anticipate that the ranges will require refinement.

Data discrepancies in EHR data are potentially of concern for patient care as well as problematic for secondary data use. Reusable avenues are clearly needed for quality monitoring rules and tools. Currently there are no generic rules applicable across institutions, which means that data quality inspection must be built anew at each site. This research moves us toward that interoperability. The results of this work provide part of a continually growing ruleset resource that can be used and shared in a community manner. We plan to adopt the OMOP Common Data Model so that the rules and knowledge tables can be used at any institution willing and able to implement the data model. The cost and benefit to healthcare facilities of doing so remain to be demonstrated.

6. Conclusion

This work demonstrates the value of using existing EHR screens for the acquisition of new rules for use in data error identification. Based on the number of rules identified here, the approach appears feasible. The anesthesia component of the rules developed in this research is an important first step in assessing the viability of mechanisms for acquiring new rules. Future work is needed because it is crucial that these rulesets cover the data important to healthcare facilities. The anesthesia rule extension is the first step toward capturing logical impossibility and inconsistency in EHR data for general application.

References

  • [1]. Wang Zhan, et al. "Rule Templates and Linked Knowledge Sources for Rule-based Information Quality Assessment in Healthcare." MIT ICIQ (2017).
  • [2]. Wang Zhan, et al. "Rule-Based Data Quality Assessment and Monitoring System in Healthcare Facilities." ITCH (in review) (2019).
  • [3]. Office of the National Coordinator for Health Information Technology (ONC). Health IT Quick Stats 2018; https://dashboard.healthit.gov/quickstats/quickstats.php. Accessed February 27, 2018.
  • [4]. Stone MA, et al. "Incorrect and incomplete coding and classification of diabetes: a systematic review." Diabetic Medicine 27(5) (2010): 491–497.
  • [5]. De Lusignan S, et al. "A method of identifying and correcting miscoding, misclassification and misdiagnosis in diabetes: a pilot and validation study of routinely collected data." Diabetic Medicine 27(2) (2010): 203–209.
  • [6]. Sollie Annet, et al. "Reusability of coded data in the primary care electronic medical record: A dynamic cohort study concerning cancer diagnoses." International Journal of Medical Informatics 99 (2017): 45–52.
  • [7]. Sollie Annet, et al. "Do GPs know their patients with cancer? Assessing the quality of cancer registration in Dutch primary care: a cross-sectional validation study." BMJ Open 6(9) (2016): e012669.
  • [8]. Münch Carola, et al. "Quality of documented diagnosis in primary care – An analysis using the example of thyroid disorders." Zeitschrift für Evidenz, Fortbildung und Qualität im Gesundheitswesen 115 (2016): 56–62.
  • [9]. Bryant Gloryanne. "Aspects of data quality in the new millennium." Topics in Health Information Management 18(4) (1998): 81–88.
  • [10]. Iqbal Kashif, Klevens R Monina, and Jiles Ruth. "Comparison of acute viral hepatitis data quality using two methodologies, 2005–2007." Public Health Reports 127(6) (2012): 591–597.
  • [11]. Hasan Sharique, and Padman Rema. "Analyzing the effect of data quality on the accuracy of clinical decision support systems: a computer simulation approach." AMIA Annual Symposium Proceedings, Vol. 2006. American Medical Informatics Association, 2006.
  • [12]. Skyttberg Niclas, et al. "Exploring Vital Sign Data Quality in Electronic Health Records with Focus on Emergency Care Warning Scores." Applied Clinical Informatics 8(3) (2017): 880–892.
  • [13]. Weller Grant B., et al. "Leveraging electronic health records for predictive modeling of post-surgical complications." Statistical Methods in Medical Research (2017): 0962280217696115.
  • [14]. Hsiao Ju-Ling, Wu Wen-Chu, and Chen Rai-Fu. "Factors of accepting pain management decision support systems by nurse anesthetists." BMC Medical Informatics and Decision Making 13(1) (2013): 16.
  • [15]. Ancker Jessica S., et al. "The invisible work of personal health information management among people with multiple chronic conditions: qualitative interview study among patients and providers." Journal of Medical Internet Research 17(6) (2015).
  • [16]. Zozus Meredith, et al. "Factors Impacting Physician Use of Information Charted by Others." JAMIA Open (in review) (2018).
  • [17]. Brown Philip, and Warmington Victoria. "Info-tsunami: surviving the storm with data quality probes." Journal of Innovation in Health Informatics 11(4) (2003): 229–237.
  • [18]. Kahn Michael G., et al. "A harmonized data quality assessment terminology and framework for the secondary use of electronic health record data." eGEMs 4(1) (2016).
  • [19]. Hart Robert, and Kuo Mu-Hsing. "Better Data Quality for Better Healthcare Research Results – A Case Study." Studies in Health Technology and Informatics 234 (2017): 161–166.
  • [20]. Society for Clinical Data Management. Good Clinical Data Management Practices (2013). www.scdm.org
  • [21]. Zozus Meredith. The Data Book: Collection and Management of Research Data. CRC Press (2017).
