Applied Clinical Informatics. 2014 Jan 22;5(1):58–72. doi: 10.4338/ACI-2013-07-RA-0045

Towards Prevention of Acute Syndromes

Electronic Identification of At-Risk Patients During Hospital Admission

A Ahmed 1,2,4, C Thongprayoon 1,4, BW Pickering 1,3, A Akhoundi 1, G Wilson 1,3, D Pieczkiewicz 2, V Herasevich 1,3
PMCID: PMC3974248  PMID: 24734124

Summary

Background

Identifying patients at risk for acute respiratory distress syndrome (ARDS) before their admission to intensive care is crucial to prevention and treatment. The objective of this study is to determine the performance of an automated algorithm for identifying selected ARDS predisposing conditions at the time of hospital admission.

Methods

This secondary analysis of a prospective cohort study included 3,005 patients admitted to hospital between January 1 and December 31, 2010. The automated algorithm for five ARDS predisposing conditions (sepsis, pneumonia, aspiration, acute pancreatitis, and shock) was developed through a series of queries applied to institutional electronic medical record databases. The algorithm was derived and refined in a derivation cohort of 1,562 patients and subsequently validated in an independent cohort of 1,443 patients. The sensitivity, specificity, and positive and negative predictive values of the automated algorithm for identifying ARDS risk factors were compared with those of two other independent data extraction strategies: manual data extraction and ICD-9 code search. The reference standard was defined as the agreement between the ICD-9 code search, automated data extraction and manual data extraction.

Results

Compared to the reference standard, the automated algorithm had higher sensitivity than manual data extraction for identifying cases of sepsis (95% vs. 56%), aspiration (63% vs. 42%), acute pancreatitis (100% vs. 70%), pneumonia (93% vs. 62%) and shock (77% vs. 41%), with similar specificity except for sepsis and pneumonia (90% vs. 98% for sepsis and 95% vs. 99% for pneumonia). The PPV for identifying these five acute conditions using the automated algorithm ranged from 65% for pneumonia to 91% for acute pancreatitis, whereas the NPV for the automated algorithm ranged from 99% to 100%.

Conclusion

Rule-based electronic data extraction can reliably and accurately identify patients at risk of ARDS at the time of hospital admission.

Key words: Electronic search, EMR, risk factor, ARDS

Background

Since the landmark report from the Institute of Medicine more than a decade ago [1], health information technology has been identified as a potential solution to health care safety and its potential to improve patient care has been emphasized [2]. The rapidly increasing adoption of electronic medical records (EMR) provides an unprecedented opportunity to utilize the technology as a tool for syndrome surveillance and to enhance the safety of critically ill patients through development of “smart alarms” [3–5].

Acute respiratory distress syndrome (ARDS) is a common ICU syndrome with high mortality [6]. Accurate, early identification of patients at risk of ARDS at the time of initial Emergency Department (ED) assessment provides the opportunity to initiate effective prevention strategies [7, 8]. It is also critical for successful enrollment of patients in prevention trials. The recently developed and validated Lung Injury Prediction Score model (LIPS) is a score that identifies patients at high risk of ARDS early in the course of their illness and before ICU admission [8]. This score assigns points both for conditions that predispose patients to ARDS (e.g., shock, aspiration, sepsis, pancreatitis, pneumonia, high-risk surgery) and ARDS-modifying factors (e.g., sex, alcohol abuse, obesity, chemotherapy, diabetes mellitus, smoking) at the time of hospital admission. It has been shown that the cumulative score is a reliable predictor of the risk of developing ARDS during hospitalization [9]. Most of the variables used for calculating the score and defining the risk predictor are readily available during the first 24 hours of critical care.

However, the need for timely identification of these risk factors may limit the use of this prediction score. Traditionally, researchers calculate these kinds of scores by manually extracting data from a patient’s medical records [8]. This process is usually performed every day by study coordinators, and the data are then re-entered into research databases. These processes are time-consuming and inefficient, and they carry a risk of inaccuracies due to errors in manual data extraction and manual data entry [10, 11]. With the continued adoption of EMRs, the risks associated with manually collected data are substantially reduced [2], and more timely identification of at-risk patients is becoming possible [12, 13]. However, EMR data manipulation and secondary use have their own limitations [14, 15]. The quality of data can be suboptimal, and checking data for accuracy and validity is essential [16].

In this study, we aimed to develop and validate automated data-extraction strategies (automated algorithms) to identify selected ARDS risk factors that are required for LIPS calculation at the time of hospital admission. Our hypothesis was that automated data extraction strategies can reliably identify selected ARDS risk factors with sensitivity similar to or exceeding that achieved by manual data collection.

Methods and Study Population

This study is a secondary analysis of a subset of an ongoing prospective study for ALI/ARDS prevention [9]. Briefly, the study cohort included Olmsted County, Minnesota residents exhibiting risk factors (see below) for ARDS at the time of hospital admission. Exclusion criteria for this cohort were age <18 years, imprisonment, pregnancy, refusal of consent to use medical records in research, and a second or subsequent admission during the same year of the study period.

Trained investigators extracted data from the electronic medical records of patients and confirmed the presence of specific ARDS risk factors according to standardized definitions. From the above-mentioned cohort, we used the subset of patients who were admitted to the hospital during 2010. Patients admitted in the first half of 2010 (N = 1562) formed the derivation cohort, and patients admitted in the second half of 2010 (N = 1443) formed the validation cohort.

Only acute conditions occurring during the first 24 hours of hospital admission were considered. We included five acute conditions (sepsis, aspiration of gastric content into the lungs, pneumonia, acute pancreatitis and shock). ▶ Table 1 summarizes the acute conditions studied along with the actual medical definition used by manual data extraction and the definition used for EMR electronic search (pragmatic definition), in addition to the EMR tables or note sections used.

Table 1.

Definitions of included acute conditions

Columns: Condition; Medical definition (manual extraction); EHR definition (pragmatic definition); EHR section used (source table)

Pneumonia (new infiltrate + clinical suspicion)
Medical definition:
  1. New or progressive radiographic infiltrate + high clinical suspicion of pneumonia (new cough, sputum, fever or WBC >12)
  Or
  1. New abnormal chest radiograph of uncertain cause + microbiological or serological evidence of definite or probable pneumonia + low or moderate clinical suspicion of pneumonia
EHR definition: Text search for the word pneumonia.
EHR section used: Diagnosis, impression and plan note sections.

Sepsis (SIRS + infection)
Medical definition: Suspected or documented infection + more than one of the following clinical manifestations (any 2):
  1. Body temperature greater than 38°C or less than 36°C
  2. Heart rate greater than 90 beats per minute
  3. Respiratory rate greater than 20 breaths per minute, or hyperventilation, as indicated by a PaCO2 of <32 mm Hg
  4. White blood cell count greater than 12,000/cu mm or less than 4,000/cu mm, or the presence of more than 10 percent immature neutrophils (“bands”)
EHR definition: Lab and vital sign search for SIRS criteria. An abnormal heart rate or respiratory rate must be recorded twice within the same hour to be considered. Any two conditions must be present on the same day to be considered. Suspicion of infection was defined as an empiric antimicrobial order during the first 24 hours.
EHR section used: Lab table, vital signs table, medication table and diagnosis section of the note.

Shock
Medical definition: Suggested by any use of vasopressor OR history and examination and markers of inadequate perfusion such as:
  1. Central venous oxygen saturation (ScvO2) or mixed venous oxygen saturation (SvO2) less than 70%
  2. Blood lactate level greater than 4 mmol/L in the absence of known acute or chronic liver disease
  3. Increased base deficit < –4
  4. Blood pH less than 7.32
EHR definition: Shock index criterion (▶ Appendix-Table 1) met at least twice per hour, or use of vasopressor medications outside the operating room.
EHR section used: Vital signs table, medication administration table.

Aspiration
Medical definition: Witnessed or suggestive history of inhalation of food or regurgitated gastric contents.
EHR definition: Text search algorithm.
EHR section used: Diagnosis, impression and plan note sections.

Acute pancreatitis
Medical definition: Two of the following three features:
  1. Abdominal pain characteristic of acute pancreatitis
  2. Serum amylase and/or lipase ≥ 3 times the upper limit of normal
  3. Characteristic findings of acute pancreatitis on CT scan
EHR definition: Lab search for elevated serum amylase and/or lipase (lipase > 140 U/L, amylase > 150 U/L) along with text search algorithm.
EHR section used: Lab table; Diagnosis, impression and plan note sections.
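The pragmatic sepsis definition in Table 1 can be read as a small rule over structured data. The following is a minimal sketch of that reading, not the study's implementation: the function names and input shapes are hypothetical, all inputs are assumed to be already restricted to the first 24 hours of admission, and the PaCO2 and band-count criteria are omitted because the pragmatic definition does not list them. The cefazolin exclusion follows ▶ Appendix-Table 1.

```python
def abnormal_twice_in_one_hour(readings_by_hour: dict[int, list[float]],
                               threshold: float) -> bool:
    """True if any single hour contains at least two readings above threshold."""
    return any(
        sum(value > threshold for value in readings) >= 2
        for readings in readings_by_hour.values()
    )


def count_sirs_criteria(hr_by_hour: dict[int, list[float]],
                        rr_by_hour: dict[int, list[float]],
                        temps_c: list[float],
                        wbc_per_mm3: list[float]) -> int:
    """Count how many of the four SIRS criteria are met (0 to 4)."""
    criteria = [
        abnormal_twice_in_one_hour(hr_by_hour, 90),       # heart rate > 90/min
        abnormal_twice_in_one_hour(rr_by_hour, 20),       # respiratory rate > 20/min
        any(t < 36.0 or t > 38.0 for t in temps_c),       # temperature out of range
        any(w < 4000 or w > 12000 for w in wbc_per_mm3),  # WBC out of range
    ]
    return sum(criteria)


def sepsis_flag(sirs_count: int, antimicrobials: list[str]) -> bool:
    """SIRS (any 2 criteria) plus an empiric antimicrobial other than cefazolin."""
    empiric = [name for name in antimicrobials if name.lower() != "cefazolin"]
    return sirs_count >= 2 and len(empiric) > 0


# Hypothetical patient: readings are keyed by hour of admission.
example_count = count_sirs_criteria(
    hr_by_hour={8: [95.0, 102.0], 9: [88.0]},
    rr_by_hour={8: [18.0, 22.0]},
    temps_c=[38.4],
    wbc_per_mm3=[9000.0],
)
```

In this example the heart rate and temperature criteria are met, so the patient would be flagged only if an empiric antimicrobial other than cefazolin was also ordered.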

Data Extraction Methods

Manual data extraction

Risk factors were manually ascertained within 24 hours of hospital admission by the research coordinator using the recently developed and validated Lung Injury Prediction Score model (LIPS) [8]. All variables were collected from the electronic medical records of patients with risk factors admitted through the emergency department. Every morning the research coordinator would screen the patients admitted during the previous 24 hours.

Automated electronic note search strategies

In this study, we utilized data from the Mayo Clinic Life Sciences System (MCLSS). MCLSS is an exhaustive clinical data warehouse that stores patient demographics, diagnoses, hospital, laboratory and clinical notes, and pathology data gathered from various clinical and hospital source systems within the institution. MCLSS encompasses a near real-time (NRT) model of some of the institution’s EMR tables. Data Discovery and Query Builder (DDQB) was the tool used to access the data contained within the MCLSS database [17]. DDQB uses Boolean logic to build free-text searches as part of a natural language processing (NLP) strategy.

In addition to DDQB, our group also utilized data from a custom integrative relational research database that contains a near real-time copy of clinical and administrative data from EMRs. The Multidisciplinary Epidemiology and Translational Research in Intensive Care (METRIC) datamart accumulates pertinent multiple source data within an average of 15 minutes from its entry into the EMR and serves as the main data repository for rules development. More detailed structures and contents have been previously published [18].

The algorithm for each acute condition was developed and iteratively refined to improve sensitivity and specificity. ▶ Figure 1 illustrates the general flow of development and validation, and ▶ Appendix-Table 1 lists the search text used for each condition.

Fig. 1.

Study procedure

For automated extraction of acute medical conditions, MCLSS/DDQB and the METRIC datamart were used to interrogate the EMR of each study patient within the first 24 hours of hospital admission. Clinical note search queries were restricted to the following sections of the clinical notes: Diagnosis, Impression/Report/Planning and Assessment/Planning. Data on vital signs (SIRS criteria: respiratory rate, heart rate, temperature), laboratory values (leukocytes) and medication administration (antibiotics, used to indicate suspicion of infection) were extracted from the METRIC datamart. To optimize sensitivity, each query was designed to identify the condition of interest together with its common synonyms, acronyms and abbreviations; where appropriate, vital signs, laboratory values and medication administration were used to represent the condition. To improve specificity, we excluded negating terms (no, not, negative, unlikely) as well as terms referring to a history of the condition (history of, recent) mentioned in the clinical notes. After the initial queries were built, they were refined iteratively in the derivation cohort. Searches were performed and all false positive and false negative cases, relative to the risk factors manually ascertained by the research coordinator, were reviewed. On the basis of this analysis, more key terms, synonyms, acronyms and abbreviations were added and more negating terms were excluded. The algorithm was then rerun, the false positive and false negative cases were reviewed again, and the algorithm was modified as necessary; this cycle was repeated until satisfactory sensitivity and specificity were achieved. Once finalized in the derivation cohort, the queries were not changed further and were validated in an independent cohort in which the risk factors had also been ascertained manually by the research coordinators.
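The keyword search with same-sentence exclusions described above can be sketched in a few lines. This is an illustrative approximation, not DDQB itself: the term lists are a small subset of the pneumonia query in ▶ Appendix-Table 1, the sentence splitter is deliberately naive, and the reading of "contain AND NOT contain" as a note-level Boolean test is an assumption.

```python
import re

# Abbreviated, illustrative term lists (the full lists are in Appendix-Table 1).
INCLUDE_TERMS = ["pneumonia", "lung infiltrate", "lung infiltrates"]
EXCLUDE_SAME_SENTENCE = ["no", "not", "negative", "unlikely", "history of", "recent"]


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after sentence punctuation or at line breaks."""
    parts = re.split(r"(?<=[.!?;])\s+|\n+", text)
    return [part.strip() for part in parts if part.strip()]


def has_any(sentence: str, terms: list[str]) -> bool:
    """True if any term occurs in the sentence as a whole word or phrase."""
    return any(
        re.search(r"\b" + re.escape(term) + r"\b", sentence, flags=re.IGNORECASE)
        for term in terms
    )


def note_is_positive(note_text: str) -> bool:
    """Note mentions the condition and no mention shares a sentence with an exclusion term."""
    sentences = split_sentences(note_text)
    mentioned = any(has_any(s, INCLUDE_TERMS) for s in sentences)
    excluded = any(
        has_any(s, INCLUDE_TERMS) and has_any(s, EXCLUDE_SAME_SENTENCE)
        for s in sentences
    )
    return mentioned and not excluded
```

Under these assumptions, a note reading "Impression: right lower lobe pneumonia." is flagged, whereas "No evidence of pneumonia." is not. Reviewing false positives and false negatives, as in the derivation step, would amount to editing the two term lists and rerunning the search.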

ICD-9 code search

The MCLSS was used to identify selected acute conditions according to the ICD-9 diagnosis code as described in ▶ Appendix-Table 2.

Reference standard

The reference standard was defined as the agreement between the ICD-9 code search, automated data extraction and manual data extraction. Because the diagnosis of these acute conditions was often not obvious at the time of hospital admission, two study investigators, who were masked to the data extraction results, independently reviewed medical records charted within the first 24 hours of hospital admission and adjudicated all discordant results between the three search strategies. If the two reviewers disagreed, a third independent investigator, also blinded to the results, made the final adjudication.
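The adjudication logic described above can be written down as a short decision procedure. The sketch below is one plausible reading, with hypothetical names: each reviewer is modeled as a callable that returns a chart-review judgment, so that reviewers are consulted only when the three strategies disagree.

```python
from typing import Callable


def reference_standard(
    manual: bool,
    automated: bool,
    icd9: bool,
    reviewer_a: Callable[[], bool],
    reviewer_b: Callable[[], bool],
    reviewer_c: Callable[[], bool],
) -> bool:
    """Return the reference label for one patient and one condition."""
    # Concordant results across all three strategies need no chart review.
    if manual == automated == icd9:
        return manual
    # Discordant results: two masked reviewers judge independently.
    first, second = reviewer_a(), reviewer_b()
    if first == second:
        return first
    # Reviewers disagree: a third masked investigator decides.
    return reviewer_c()
```

For example, `reference_standard(True, False, True, lambda: True, lambda: False, lambda: True)` returns `True`, because the two reviewers disagree and the third reviewer's judgment is used.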

Statistical Analysis

Baseline characteristics of derivation and validation cohorts were summarized as mean and standard deviation or median and interquartile range for continuous variables, and number and percentage for categorical variables.

As our primary analysis, the sensitivity and specificity of each search strategy were calculated by comparing the search results with the reference standard in both the derivation and validation cohorts. Positive (PPV) and negative predictive values (NPV) were calculated as well. As our secondary analysis, percentage agreement and Cohen’s kappa statistics were used to compare manual and electronic data extraction. JMP statistical software (version 9.0, SAS, Cary, NC) was used for data analysis.
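For readers less familiar with these measures, the following sketch shows how the four primary statistics follow from a 2×2 comparison against the reference standard. The class and function names are hypothetical and the counts are invented for illustration; they are not study data, and the analysis itself was run in JMP.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """2x2 counts of one search strategy against the reference standard."""
    tp: int  # strategy positive, reference positive
    fp: int  # strategy positive, reference negative
    fn: int  # strategy negative, reference positive
    tn: int  # strategy negative, reference negative


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return sensitivity, specificity, PPV and NPV as proportions (0 to 1)."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# Hypothetical counts for illustration only.
example = ConfusionCounts(tp=40, fp=10, fn=10, tn=940)
metrics = diagnostic_metrics(example)
```

With these invented counts, sensitivity and PPV are both 0.80, and specificity and NPV are both 940/950. The sketch does not guard against empty rows or columns, which would need handling for very rare conditions.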

Results

A total of 3,005 patients admitted to hospital during the year 2010 were included in the study. The derivation cohort included 1,562 patients admitted during the first six months of 2010. The validation cohort consisted of 1,443 patients admitted during the second half of 2010. The demographic characteristics and baseline comorbidities of the derivation and validation cohorts are summarized in ▶ Table 2.

Table 2.

Baseline characteristics between derivation and validation cohorts

Characteristic* | Derivation cohort (N = 1562) | Validation cohort (N = 1443) | P value
Age, mean (SD) | 63 (20) | 63 (20) | 0.49
Female, n (%) | 735 (48) | 666 (47) | 0.65
White, n (%) | 1412 (90) | 1287 (89) | 0.27
APACHE III score, median (IQR) | 58 (42–74) | 59 (40–78) | 0.79
Diabetes mellitus, n (%) | 438 (28) | 375 (26) | 0.21
Hypertension, n (%) | 947 (61) | 857 (59) | 0.49
Chronic heart disease, n (%) | 183 (12) | 144 (9) | 0.13
COPD, n (%) | 80 (5) | 49 (3) | 0.02
ICU admission, n (%) | 452 (29) | 404 (27) | 0.55
LIPS score, median (IQR) | 2 (1–3) | 2 (1–2.5) | 0.33
ICU mortality, n (%)+ | 14 (3) | 15 (4) | 0.61

*Data are reported as number (%) or median (25%-75% interquartile range). + For those admitted to ICU only. APACHE = Acute Physiology and Chronic Health Evaluation; COPD = chronic obstructive pulmonary disease

▶ Table 3 summarizes the sensitivity and specificity of the automated algorithm, manual data extraction and ICD-9 code search for five acute conditions in the validation cohort. Compared to manual data extraction, the automated algorithm had higher sensitivity for identifying sepsis (95% vs. 56%), aspiration (63% vs. 42%), acute pancreatitis (100% vs. 70%), pneumonia (93% vs. 62%) and shock (77% vs. 41%), with similar specificity except for sepsis and pneumonia (90% vs. 98% for sepsis and 95% vs. 99% for pneumonia). Compared to ICD-9 code search, the automated algorithm had higher sensitivity for detecting cases of sepsis (95% vs. 51%), pneumonia (93% vs. 77%), acute pancreatitis (100% vs. 90%) and shock (77% vs. 55%) but lower sensitivity for detecting cases of aspiration (63% vs. 84%).

Table 3.

Sensitivity & specificity for automated digital algorithm, manual data extraction and ICD-9 code search in the validation cohort.

All values are % (95% CI).

Condition | Automated digital algorithm: sensitivity | Automated digital algorithm: specificity | Manual data extraction: sensitivity | Manual data extraction: specificity | ICD-9 code search: sensitivity | ICD-9 code search: specificity
Sepsis | 95 (85–99) | 90 (85–93) | 56 (43–68) | 98 (95–99) | 51 (38–64) | 99 (96–99)
Pneumonia | 93 (86–96) | 95 (94–96) | 62 (53–71) | 99 (98–99) | 77 (68–84) | 98 (97–99)
Aspiration | 63 (39–83) | 99 (99–99) | 42 (21–66) | 99 (99–99) | 84 (60–96) | 99 (98–99)
Acute pancreatitis | 100 (80–100) | 99 (99–99) | 70 (46–87) | 100 (99–100) | 90 (69–98) | 99 (99–99)
Shock | 77 (62–88) | 99 (99–99) | 41 (27–57) | 99 (99–99) | 55 (39–69) | 99 (99–99)

CI: confidence interval

The PPV for identifying these five acute conditions using the automated algorithm ranged from a minimum of 65% for pneumonia to a maximum of 91% for acute pancreatitis, whereas the NPV for the automated algorithm ranged from 99% to 100%. ▶ Table 4 summarizes the prevalence, PPV and NPV of the automated algorithm, manual data extraction and ICD-9 code search for five acute conditions in the validation cohort.

Table 4.

Prevalence, PPV and NPV for automated digital algorithm, manual data extraction and ICD-9 code search in the validation cohort.

PPV and NPV are % (95% CI).

Condition | Cases no. | Prevalence (%) | Automated digital algorithm: PPV | Automated digital algorithm: NPV | Manual data extraction: PPV | Manual data extraction: NPV | ICD-9 code search: PPV | ICD-9 code search: NPV
Sepsis | 61 | 19.1 | 69 (58–78) | 99 (96–99) | 87 (72–95) | 90 (86–93) | 91 (75–98) | 89 (85–93)
Pneumonia | 122 | 8.5 | 65 (58–72) | 99 (99–99) | 80 (70–87) | 97 (95–97) | 78 (69–85) | 98 (97–99)
Aspiration | 19 | 1.3 | 86 (56–97) | 99 (99–99) | 80 (44–96) | 99 (99–99) | 48 (31–66) | 99 (99–99)
Acute pancreatitis | 20 | 1.4 | 91 (69–98) | 100 (99–100) | 100 (73–100) | 99 (99–99) | 95 (72–99) | 99 (99–99)
Shock | 44 | 3.0 | 77 (62–88) | 99 (99–99) | 64 (44–81) | 98 (97–99) | 73 (54–86) | 99 (98–99)

NPV: Negative predictive value, PPV: Positive predictive value, CI: confidence interval

There was considerable agreement between electronic and manual data extraction (>80%), with low to high Cohen’s kappa statistics (range 0.40–0.78). ▶ Table 5 summarizes the agreement percentage and Cohen’s kappa statistics between manual and electronic data extraction in the validation cohort.

Table 5.

Agreement percentage and Cohen’s kappa statistics between manual and electronic data extraction in the validation cohort

Condition Agreement (%) Cohen’s kappa
Sepsis 83 0.46
Pneumonia 91 0.48
Aspiration 99 0.58
Pancreatitis 99 0.78
Shock 97 0.40

Discussion

The results of this single-center study using retrospective data suggest that rule-based electronic extraction of a number of acute conditions from the EMR is feasible and valid. Using readily available data, electronic identification of patients at risk of ARDS during their hospital stay may offer an opportunity to implement timely interventions to prevent the syndrome. Moreover, it could assist in the enrollment of participants in prevention, early treatment and intervention trials.

There is strong evidence of a benefit of early intervention for patients admitted with critical syndromes including acute coronary syndrome [19], severe sepsis and septic shock [20], and terminal cancer [21]. However, identifying these critically ill patients remains a considerable challenge. In research settings, scientists have long sought to replace time-consuming chart review with automated methods, as shown by the time span of the studies included in the systematic review by Stanfill et al. [22]. The electronic method used in this study can also be beneficial in research settings and in enrolling patients in clinical trials, particularly when enrolling minorities presents greater challenges [23]. Furthermore, many applications can benefit from early identification of acute conditions, including decision support systems, faster screening and enrollment in clinical trials, clinical research, syndrome surveillance, and efforts to enhance compliance with evidence-based practices.

Our results are similar to those of other studies that used the same concept of keyword electronic text search. For example, Hanauer et al. reached a sensitivity of 1.0 and a specificity of 0.93 for myocardial infarction [24]. To ascertain risk factors for post-operative pulmonary complications, Kor et al. used a similar approach to identify preoperative risk factors and used them to develop a lung injury prediction score for surgical patients [25]. Singh et al. used a similar tool to identify the chronic comorbidities required to calculate Charlson scores, and their results showed that the electronic algorithm was superior to the International Classification of Diseases, Ninth Revision (ICD-9) diagnostic code search [26].

In multiple studies, administrative data such as ICD-9/10 codes have proven inaccurate in research settings [26–28]; with the exception of aspiration, our results were no different. In a recent study by Bensley et al., ICD codes were unreliable and inadequately identified at-risk patients [29]. Another option is to use Systematized Nomenclature of Medicine – Clinical Terms (SNOMED-CT) codes. Studies have demonstrated the superior performance of SNOMED-CT over ICD-9/10 codes [30], but SNOMED-CT has its own limitations, such as the terminology representation of some care elements, particularly when scientific scales are used [31]. Moreover, some studies have suggested low reliability and the need for trained providers to apply SNOMED-CT concepts [32]. Finally, coding takes time, and codes are usually not available for screening within a few hours of patient admission.

Another method would be to add more advanced Natural Language Processing (NLP) techniques to discriminate high-value textual information. In a multicenter study, Fitzhenry et al. were able to identify post-operative complications using NLP, with some variability [33]. However, these NLP techniques require specific software and rigorous training as well as large training data sets [34]. Additionally, data extraction using NLP may not be robustly accurate [35]. As an alternative, our approach, which incorporates calculated rules similar to the actual disease definitions, promises more accuracy and portability and can be used by providers and researchers without extensive expertise in NLP. For example, sepsis was identified with greater sensitivity in the present study than in Fitzhenry’s NLP study (95% vs. 88%). For sepsis, the use of rules applied to the laboratory and vital sign tables to calculate systemic inflammatory response syndrome (SIRS) criteria might have contributed substantially to the superior performance of our algorithm. Nevertheless, our automated algorithm had lower PPV in sepsis and aspiration, which may be due to the lower prevalence of the two conditions in the studied sample.
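The dependence of PPV on prevalence mentioned above follows directly from Bayes' rule. The short sketch below (the function name is hypothetical) makes the relationship explicit; the first call uses the rounded sepsis values for the automated algorithm from Tables 3 and 4, and the second uses an arbitrary lower prevalence chosen purely for illustration.

```python
def ppv_from_prevalence(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value implied by sensitivity, specificity and prevalence."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Rounded sepsis values for the automated algorithm (Tables 3 and 4).
ppv_reported_prevalence = ppv_from_prevalence(0.95, 0.90, 0.191)

# Same test characteristics at a hypothetical prevalence of 5%.
ppv_low_prevalence = ppv_from_prevalence(0.95, 0.90, 0.05)
```

The first value is approximately 0.69, in line with the PPV of 69% reported for sepsis; at a prevalence of 5% the same sensitivity and specificity would give a PPV of about 0.33.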

Our study has a number of limitations. First, the number of events for some conditions is small, which reflects either low prevalence or underdetection; nevertheless, the electronic rules had higher sensitivity than manual data extraction. The low prevalence of the conditions of interest also leads to low kappa values despite a high percentage agreement. Second, the terms and phrases used in the text search may be specific to our institution and can change over time with the turnover of physicians in training. Nonetheless, encouraging the use of homogeneous language and more structured notes may facilitate data extraction from clinical notes. Third, all data utilized in this study depended solely on EMR content, which in some instances may not reflect actual events and may contain errors [16]. Furthermore, some of the discrepancies between the manual and electronic algorithms might have arisen because some EMR data were entered late, after the manual data abstractors had already reviewed the notes. However, the manual data collection was performed by trained providers and rechecked while adjudicating discrepancies to generate the reference standard. Also, the database used to run these algorithms may not have the desired consistency, which can limit the applicability of this approach in institutions using similar or other databases. With further advances in EMRs, indexing of clinical notes will generate more stable databases and facilitate better outcomes for this approach. Finally, the single-center, academic nature of our institution raises concerns about referral bias and overall generalizability.
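The point about low kappa despite high agreement can be illustrated numerically. The sketch below computes Cohen's kappa from a 2×2 agreement table for two raters; the function name is hypothetical and the counts are invented to mimic a rare condition, not taken from the study.

```python
def cohen_kappa(both_pos: int, only_a: int, only_b: int, both_neg: int) -> tuple[float, float]:
    """Return (observed agreement, Cohen's kappa) for two binary raters."""
    n = both_pos + only_a + only_b + both_neg
    observed = (both_pos + both_neg) / n
    a_pos = (both_pos + only_a) / n  # proportion rated positive by rater A
    b_pos = (both_pos + only_b) / n  # proportion rated positive by rater B
    # Agreement expected by chance if the two raters were independent.
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical rare condition among 1,000 patients.
agreement, kappa = cohen_kappa(both_pos=10, only_a=15, only_b=5, both_neg=970)
```

Here the two methods agree on 98% of patients, yet kappa is only about 0.49, because nearly all of the agreement consists of shared negatives that chance alone would also produce.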

Conclusion

Utilizing an existing EMR, an electronic rule-based search strategy was able to identify patients with risk factors for ARDS with more accuracy and higher sensitivity than manually collected data. The feasibility and ease of use of these electronic algorithms can facilitate the incorporation of such strategies into clinical decision support systems and screening processes for medical research and clinical trials.

Abbreviations and Acronyms

  • CI = Confidence interval

  • DDQB = Data Discovery and Query Builder

  • EMR = Electronic medical record

  • ICD-9 = International Classification of Diseases, Ninth Revision

  • ICU = Intensive care unit

  • IQR = Interquartile range

  • MCLSS = Mayo Clinic Life Sciences System

  • NPV = Negative predictive value

  • PPV = Positive predictive value

  • SNOMED-CT = Systematized Nomenclature of Medicine – Clinical Terms

Acknowledgments

AA contributed to the study design and conduct, analysis, and manuscript writing. AA, CT and GAW contributed to the data collection and the conduct of the study. BP contributed to the study design and critical revision of the manuscript. DP helped with the preparation and revision of the manuscript. VH supervised and was involved as a senior author in all critical parts of the study.

Appendix-Table 1. DDQB Search Text used for each condition

Condition | Sub-condition | Algorithm
Sepsis | Empiric antimicrobial | Any 2 of [(Heart rate >90/min at least 2 times in 1 hour), (Respiratory rate >20/min at least 2 times in 1 hour), (Body temperature <36°C OR >38°C), (WBC count <4000 cells/mm3 OR >12000 cells/mm3)] occurring within a 24-hour time window of each other AND antimicrobial administration* (excluding cefazolin)
Pneumonia | Pneumonia | Clinical Note contain (“pneumonia”, “aspiration pneumonia”, “community acquired pneumonia”, “hospital acquired pneumonia”, “hospital-acquired pneumonia”, “healthcare-associated pneumonia”, “healthcare acquired pneumonia”, “lung infiltrate”, “lung infiltrates”, “lung infiltr%”, “new lung infiltr%”) AND NOT contain (“not” Same sentence as “pneumonia”, “no” Same sentence as “pneumonia”, “no evidence” Same sentence as “pneumonia”, “rule out” Same sentence as “pneumonia”, “unlikely” Same sentence as “pneumonia”, “differential” Same sentence as “pneumonia”, “vaccine” Same sentence as “pneumonia”, “unremarkable” Same sentence as “pneumonia”, “history of hospitalization” Same sentence as “pneumonia”, “less likely” Same sentence as “pneumonia”, “no” Same sentence as “lung infiltrate”, “no” Same sentence as “lung infiltrates”, “hx of aspiration pneumonia”, “history of pneumonia”, “pneumonia risk”, “recently diagnosed pneumonia”, “recent pneumonia”, “history of recurrent pneumonia”) in section Diagnosis, Problem Oriented Hospital Course
Aspiration | Aspiration | Clinical Note contain (“aspiration”, “aspirated”, “possible aspiration”, “aspiration pneumonia”) AND NOT contain (“wound” Same sentence as “aspiration”, “wound” Same sentence as “aspirated”, “needle” Same sentence as “aspiration”, “ultrasound” Same sentence as “aspirated”, “ultrasound” Same sentence as “aspiration”, “recent” Same sentence as “aspiration”, “diagnostic” Same sentence as “aspiration”, “therapeutic” Same sentence as “aspiration”, “negative” Same sentence as “aspirated”, “negative” Same sentence as “aspiration”, “bone marrow” Same sentence as “aspiration”, “bone marrow” Same sentence as “aspirated”, “knee” Same sentence as “aspiration”, “knee” Same sentence as “aspirated”, “shoulder” Same sentence as “aspiration”, “ankle” Same sentence as “aspiration”, “ankle” Same sentence as “aspirated”, “joint” Same sentence as “aspiration”, “joint” Same sentence as “aspirated”, “arthritis” Same sentence as “aspiration”, “renal” Same sentence as “aspiration”, “cyst” Same sentence as “aspiration”, “no” Same sentence as “aspiration”, “not consistent” Same sentence as “aspiration”, “precaution” Same sentence as “aspiration”, “no evidence” Same sentence as “aspiration”, “no sign” Same sentence as “aspiration”, “increased risk of” Same sentence as “aspiration”, “abscess” Same sentence as “aspiration”, “rule out” Same sentence as “aspiration”, “concern” Same sentence as “aspiration”, “aspiration precaution”, “aspiration precautions”, “hx of aspiration”, “history of aspiration”, “history of aspiration pneumonia”) in section Diagnosis, Problem Oriented Hospital Course
Acute pancreatitis | Pancreatitis | (Clinical Note contain (“pancreatitis”, “acute pancreatitis”, “pancreatitis” Same paragraph as “abdominal pain”, “pancreas” Same sentence as “infection”) AND NOT contain (“history of” Same sentence as “acute pancreatitis”, “hx of” Same sentence as “acute pancreatitis”, “no laboratory evidence” Same sentence as “panc%”, “do not support” Same sentence as “panc%”, “transplant” Same sentence as “panc%”, “suspect” Same sentence as “panc%”, “prior” Same sentence as “panc%”, “not concerning” Same sentence as “panc%”, “less likely” Same sentence as “panc%”, “no obvious” Same sentence as “panc%”, “risk for” Same sentence as “panc%”, “mass” Same sentence as “panc%”, “no evidence” Same sentence as “pancreatitis”) in section Diagnosis, Problem Oriented Hospital Course) AND (serum amylase >150 U/L OR serum lipase >140 U/L)
Shock | Shock | ((Systolic blood pressure ≤90 AND Shock Index (HR/systolic blood pressure) >1 at least 2 times in 1 hour) AND this condition occurs in at least 2 hours in 1 day) OR need for continuous vasopressor infusion (any dose of norepinephrine, epinephrine or vasopressin; dopamine >5 mcg/kg/min)
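As an illustration of how a structured-data rule in this table might be evaluated, the following sketch encodes the shock rule. It is an interpretation under stated assumptions rather than the study's code: the function names and input shapes are hypothetical, readings are assumed to be grouped by clock hour within a single day, and the vasopressor clause is read as "any dose of norepinephrine, epinephrine or vasopressin, or dopamine above 5 mcg/kg/min".

```python
ANY_DOSE_PRESSORS = {"norepinephrine", "epinephrine", "vasopressin"}


def hour_meets_vitals_rule(readings: list[tuple[float, float]]) -> bool:
    """True if at least two (heart rate, systolic BP) pairs in the hour show
    systolic BP <= 90 together with a shock index (HR / SBP) above 1."""
    hits = sum(1 for hr, sbp in readings if 0 < sbp <= 90 and hr / sbp > 1)
    return hits >= 2


def pressor_rule(infusions: list[tuple[str, float]]) -> bool:
    """Infusions are (drug name, dose in mcg/kg/min) pairs."""
    return any(
        drug.lower() in ANY_DOSE_PRESSORS or (drug.lower() == "dopamine" and dose > 5)
        for drug, dose in infusions
    )


def shock_flag(readings_by_hour: dict[int, list[tuple[float, float]]],
               infusions: list[tuple[str, float]]) -> bool:
    """Vitals rule met in at least 2 hours of the day, or a qualifying infusion."""
    qualifying_hours = sum(
        hour_meets_vitals_rule(readings) for readings in readings_by_hour.values()
    )
    return qualifying_hours >= 2 or pressor_rule(infusions)
```

For example, `shock_flag({3: [(110, 85), (115, 88)], 4: [(120, 80), (118, 82)]}, [])` returns `True`, because two separate hours each contain two qualifying readings.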

Appendix-Table 2. ICD-9 code for acute condition

Condition ICD-9 Code
Sepsis 20.2, 22.3, 3.1, 38, 38, 38.1, 38.1, 38.11, 38.19, 38.2, 38.3, 38.4, 38.4, 38.41, 38.42, 38.43, 38.44, 38.49, 38.8, 38.9, 415.12, 422.92, 449, 54.5, 785.52, 995.91, 995.92
Pneumonia 480.0, 480.1, 480.2, 480.3, 480.8, 480.9, 481, 482.0, 482.1, 482.2, 482.30, 482.31, 482.32, 482.39, 482.40, 482.41, 482.49, 482.81, 482.82, 482.83, 482.89, 482.9, 483.0, 483.1, 483.8, 484.1, 484.3, 484.5, 484.6, 484.7, 484.8, 485, 486
Aspiration 507, 507.0, 507.1, 507.8
Acute pancreatitis 577.0
Shock 785.5, 785.5, 785.51, 785.52, 785.59, 958.4, 995, 995.4, 995.6, 995.6, 995.61, 995.62, 995.63, 995.64, 995.65, 995.66, 995.67, 995.68, 995.69, 998
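
An ICD-9 code search of the kind this table supports can be sketched as a simple set lookup. The snippet below is an illustration only, not the study's query: the dictionary holds a small subset of the codes listed above, and the names `ICD9_CODES` and `conditions_from_codes` are hypothetical. Codes are kept as strings so that entries such as 482.30 and 482.3 stay distinct.

```python
# Illustrative subset of Appendix-Table 2; not the full code lists.
ICD9_CODES = {
    "aspiration": {"507.0", "507.1", "507.8"},
    "acute pancreatitis": {"577.0"},
    "pneumonia": {"481", "485", "486"},
}


def conditions_from_codes(discharge_codes):
    """Return the sorted names of conditions with at least one matching code."""
    codes = {code.strip() for code in discharge_codes}
    return sorted(name for name, wanted in ICD9_CODES.items() if codes & wanted)
```

For example, `conditions_from_codes(["486", "577.0", "401.9"])` flags pneumonia and acute pancreatitis and ignores the unrelated code.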

Footnotes

Conflict Of Interest

The authors declare that they have no conflicts of interest in this project.

Protection Of Human And Animal Subjects

The Institutional Review Board approved the study protocol and waived the need for informed consent because there was no direct involvement of human subjects.

References

  • 1. Kohn L. To err is human: an interview with the Institute of Medicine’s Linda Kohn. Jt Comm J Qual Improv 2000; 26(4): 227–234
  • 2. Bates DW, Gawande AA. Improving safety with information technology. N Engl J Med 2003; 348(25): 2526–2534
  • 3. Escobar GJ, LaGuardia JC, Turk BJ, Ragins A, Kipnis P, Draper D. Early detection of impending physiologic deterioration among patients who are not in intensive care: development of predictive models using data from an automated electronic medical record. J Hosp Med 2012; 7(5): 388–395
  • 4. Haug PJ, Gardner RM, Tate KE, Evans RS, East TD, Kuperman G, Pryor TA, Huff SM, Warner HR. Decision support in medicine: examples from the HELP system. Comput Biomed Res 1994; 27(5): 396–418
  • 5. Herasevich V, Yilmaz M, Khan H, Hubmayr RD, Gajic O. Validation of an electronic surveillance system for acute lung injury. Intensive Care Med 2009; 35(6): 1018–1023
  • 6. Ahmed A, Kojicic M, Herasevich V, Gajic O. Early identification of patients with or at risk of acute lung injury. Neth J Med 2009; 67(9): 268–271
  • 7. Kor DJ, Erlich J, Gong MN, Malinchoc M, Carter RE, Gajic O, Talmor DS. Association of prehospitalization aspirin therapy and acute lung injury: results of a multicenter international observational study of at-risk patients. Crit Care Med 2011; 39(11): 2393–2400
  • 8. Gajic O, Dabbagh O, Park PK, Adesanya A, Chang SY, Hou P, Anderson H, 3rd, Hoth JJ, Mikkelsen ME, Gentile NT, Gong MN, Talmor D, Bajwa E, Watkins TR, Festic E, Yilmaz M, Iscimen R, Kaufman DA, Esper AM, Sadikot R, Douglas I, Sevransky J, Malinchoc M. Early identification of patients at risk of acute lung injury: evaluation of lung injury prediction score in a multicenter cohort study. Am J Respir Crit Care Med 2011; 183(4): 462–470
  • 9. Trillo-Alvarez C, Cartin-Ceba R, Kor DJ, Kojicic M, Kashyap R, Thakur S, Thakur L, Herasevich V, Malinchoc M, Gajic O. Acute lung injury prediction score: derivation and validation in a population-based sample. Eur Respir J 2011; 37(3): 604–609
  • 10. Overhage JM, Suico J, McDonald CJ. Electronic laboratory reporting: barriers, solutions and findings. J Public Health Manag Pract 2001; 7(6): 60–66
  • 11. Wurtz R, Cameron BJ. Electronic laboratory reporting for the infectious diseases physician and clinical microbiologist. Clin Infect Dis 2005; 40(11): 1638–1643
  • 12. Erickstad L, Reed G, Bhat D, Roehrborn CG, Lotan Y. Use of electronic medical records to identify patients at risk for prostate cancer in an academic institution. Prostate Cancer Prostatic Dis 2011; 14(1): 85–89
  • 13. Brownstein JS, Murphy SN, Goldfine AB, Grant RW, Sordo M, Gainer V, Colecchi JA, Dubey A, Nathan DM, Glaser JP, Kohane IS. Rapid identification of myocardial infarction risk associated with diabetes medications using electronic medical records. Diabetes Care 2010; 33(3): 526–531
  • 14. Keyhani S, Hebert PL, Ross JS, Federman A, Zhu CW, Siu AL. Electronic health record components and the quality of care. Med Care 2008; 46(12): 1267–1272
  • 15. Baron RJ. Quality improvement with an electronic health record: achievable, but not automatic. Ann Intern Med 2007; 147(8): 549–552
  • 16. Tse J, You W. How accurate is the electronic health record? – a pilot study evaluating information accuracy in a primary care setting. Stud Health Technol Inform 2011; 168: 158–164
  • 17. Alsara A, Warner DO, Li G, Herasevich V, Gajic O, Kor DJ. Derivation and validation of automated electronic search strategies to identify pertinent risk factors for postoperative acute lung injury. Mayo Clin Proc 2011; 86(5): 382–388
  • 18. Herasevich V, Kor DJ, Li M, Pickering BW. ICU data mart: a non-iT approach. A team of clinicians, researchers and informatics personnel at the Mayo Clinic have taken a homegrown approach to building an ICU data mart. Healthc Inform 2011; 28(11): 42, 4–5
  • 19. Julian DG. Treatment of cardiac arrest in acute myocardial ischaemia and infarction. Lancet 1961; 2(7207): 840–844
  • 20. Rivers E, Nguyen B, Havstad S, Ressler J, Muzzin A, Knoblich B, Peterson E, Tomlanovich M. Early goal-directed therapy in the treatment of severe sepsis and septic shock. N Engl J Med 2001; 345(19): 1368–1377
  • 21. Song JU, Suh GY, Park HY, Lim SY, Han SG, Kang YR, Kwon OJ, Woo S, Jeon K. Early intervention on the outcomes in critically ill cancer patients admitted to intensive care units. Intensive Care Med 2012; 38(9): 1505–1513
  • 22. Stanfill MH, Williams M, Fenton SH, Jenders RA, Hersh WR. A systematic literature review of automated clinical coding and classification systems. J Am Med Inform Assoc 2010; 17(6): 646–651
  • 23. Glickman SW, Anstrom KJ, Lin L, Chandra A, Laskowitz DT, Woods CW, Freeman DH, Kraft M, Beskow LM, Weinfurt KP, Schulman KA, Cairns CB. Challenges in enrollment of minority, pediatric, and geriatric patients in emergency and acute care clinical research. Ann Emerg Med 2008; 51(6): 775–780 e3
  • 24. Hanauer DA, Englesbe MJ, Cowan JA, Jr., Campbell DA. Informatics and the American College of Surgeons National Surgical Quality Improvement Program: automated processes could replace manual record review. J Am Coll Surg 2009; 208(1): 37–41
  • 25. Kor DJ, Warner DO, Alsara A, Fernandez-Perez ER, Malinchoc M, Kashyap R, Li G, Gajic O. Derivation and diagnostic accuracy of the surgical lung injury prediction model. Anesthesiology 2011; 115(1): 117–128
  • 26. Singh B, Singh A, Ahmed A, Wilson GA, Pickering BW, Herasevich V, Gajic O, Li G. Derivation and validation of automated electronic search strategies to extract Charlson comorbidities from electronic medical records. Mayo Clin Proc 2012; 87(9): 817–824
  • 27. Patrick SW, Davis MM, Sedman AB, Meddings JA, Hieber S, Lee GM, Stillwell TL, Chenoweth CE, Espinosa C, Schumacher RE. Accuracy of hospital administrative data in reporting central line-associated bloodstream infections in newborns. Pediatrics 2013; 131 (Suppl 1): S75–S80
  • 28. Stickler DE, Royer JA, Hardin JW. Accuracy and usefulness of ICD-10 death certificate coding for the identification of patients with ALS: results from the South Carolina ALS Surveillance Pilot Project. Amyotroph Lateral Scler 2012; 13(1): 69–73
  • 29. Bensley RP, Yoshida S, Lo RC, Fokkema M, Hamdan AD, Wyers MC, Chaikof EL, Schermerhorn ML. Accuracy of administrative data versus clinical data to evaluate carotid endarterectomy and carotid stenting. J Vasc Surg 2013; Mar 9
  • 30. Kottke TE, Baechler CJ. An algorithm that identifies coronary and heart failure events in the electronic health record. Prev Chronic Dis 2013 Feb; 10: E29
  • 31. van der Kooij J, Goossen WT, Goossen-Baremans AT, de Jong-Fintelman M, van Beek L. Using SNOMED CT codes for coding information in electronic health records for stroke patients. Stud Health Technol Inform 2006; 124: 815–823
  • 32. Chiang MF, Hwang JC, Yu AC, Casper DS, Cimino JJ, Starren JB. Reliability of SNOMED-CT coding by three physicians using two terminology browsers. AMIA Annu Symp Proc 2006: 131–135
  • 33. Fitzhenry F, Murff HJ, Matheny ME, Gentry N, Fielstein EM, Brown SH, Reeves RM, Aronsky D, Elkin PL, Messina VP, Speroff T. Exploring the frontier of electronic health record surveillance: the case of postoperative complications. Med Care 2013; 51(6): 509–516
  • 34. Yang H, Spasic I, Keane JA, Nenadic G. A text mining approach to the prediction of disease status from clinical discharge summaries. J Am Med Inform Assoc 2009; 16(4): 596–600
  • 35. Meystre S, Haug PJ. Natural language processing to extract medical problems from electronic clinical documents: performance evaluation. J Biomed Inform 2006; 39(6): 589–99

Articles from Applied Clinical Informatics are provided here courtesy of Thieme Medical Publishers
