Abstract
Clinical decision support system (CDSS) performance may vary with the quality of the input data. We assessed the impact of medical record completeness and accuracy on a CDSS that provides risk assessment for gastrointestinal bleeding and recommends therapy when prescribing NSAIDs. We examined the documentation of six data elements in the medical record and their impact on the performance of the CDSS. We reviewed 178 transcribed clinical encounters with standardized patients who had predefined clinical histories. The mean completeness score across all encounters was .34; the mean correctness score for those elements that were present was .94. When the available data were entered into the CDSS, the missing data elements resulted in inappropriate and unsafe recommendations in more than 77% of the encounters. These results show that important gaps in the medical record can affect the accuracy of a CDSS designed to improve safe prescribing.
Introduction
Clinical decision support systems (CDSS), when used in inpatient settings and linked to an electronic medical record, have been shown to reduce errors and improve the quality of care.1–3
Because CDSS depend on the quality of information in the record to “trigger” alerts, limitations in documentation can affect how these systems function.4–6 The studies examining the impact of information quality have come from institutions well known for their sophisticated electronic medical records and CDSS. As less experienced institutions begin to implement these kinds of systems, the issue of data accuracy is likely to become even more important. Furthermore, most of the research on electronic medical records and CDSS is based on inpatient studies. Less is known about how CDSS function in ambulatory settings, but missing or inaccurate information in the medical record is likely to be at least as significant a problem there as in inpatient settings. Even for something as important as medication lists, Wagner and Hogan found a substantial amount of error and estimated that these errors would affect the accuracy of CDSS that needed the information to provide accurate advice.7 Hogan and Wagner suggest that the ideal gold standard for judging accuracy is the true state of the patient, but recognize that this ideal is almost impossible to achieve.4 It may be especially difficult to achieve for patient history, where obtaining accurate data depends on both the clinician and the patient.
As part of a larger study examining the impact of CDSS in ambulatory settings, this study examined how a CDSS would perform if it relied on the data in an ambulatory medical record. We were able to approach the “gold standard” because we used Standardized Patients (SPs) who were trained to portray a set of cases with specific past medical histories.8 Thus, the data that should have been captured are known, and we could determine the impact of missing data on CDSS performance.
Methods
Data Source
The source of the data was transcribed dictations of clinical encounters in which internal medicine residents (PGY 1–3) examined the Standardized Patients (SPs). A total of 189 SP visits to 60 residents occurred, but dictations were available for only 178 visits. These 178 dictations are the source of the data.
CDSS
The CDSS used was the GI Risk Score, a clinical prediction rule licensed by Stanford University and used with permission of the developer, G. Singh.9 It assesses the risk of gastrointestinal bleeding with use of NSAIDs and makes recommendations on safe prescribing. The rule takes into account the patient’s age, the patient’s own assessment of his or her health, arthritis status, steroid use in the past year, history of hospitalization for ulcer or GI bleed, and symptoms with previous NSAID use. Patients whose GI scores are below 16 are considered at low risk for GI bleeding with non-selective NSAIDs. The rule has been validated in several studies.10–11
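To make the structure of such a threshold-based rule concrete, the Python sketch below shows how the six inputs and the cut-off of 16 might be wired together. The six inputs and the threshold come from the description above; the point values and field names are illustrative placeholders only, not the published weights of the Singh GI Score.

```python
# Illustrative sketch of a threshold-based GI risk rule.
# The point values below are placeholders, NOT the published Singh weights.
from dataclasses import dataclass
from typing import Optional

RISK_THRESHOLD = 16  # scores at or above this value indicate high GI bleed risk


@dataclass
class GIRiskInputs:
    age: Optional[int] = None                    # years
    self_rated_health: Optional[str] = None      # "excellent", "good", "fair", "poor"
    arthritis: Optional[str] = None              # e.g. "RA", "non-RA", "non-arthritic"
    steroid_use_past_year: Optional[bool] = None
    prior_ulcer_or_gi_bleed: Optional[bool] = None
    nsaid_gi_symptoms: Optional[bool] = None


def gi_risk_score(p: GIRiskInputs) -> int:
    """Sum illustrative point values; undocumented (None) inputs contribute
    nothing, which is how documentation gaps depress the score."""
    score = 0
    if p.age is not None:
        score += p.age // 10        # placeholder age weighting, not the real rule
    if p.self_rated_health in ("fair", "poor"):
        score += 3                  # placeholder weight
    if p.arthritis == "RA":
        score += 2                  # placeholder weight
    if p.steroid_use_past_year:
        score += 4                  # placeholder weight
    if p.prior_ulcer_or_gi_bleed:
        score += 8                  # placeholder weight
    if p.nsaid_gi_symptoms:
        score += 4                  # placeholder weight
    return score


def nsaid_recommendation(p: GIRiskInputs) -> str:
    if gi_risk_score(p) >= RISK_THRESHOLD:
        return "High GI risk: avoid non-selective NSAIDs or add gastroprotection."
    return "Low GI risk: a non-selective NSAID may be prescribed."
```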
Standardized Patients
Each SP was trained to portray one of four musculoskeletal complaints: pain in the hip, knee, shoulder, or foot. All are conditions for which non-selective NSAIDs might be appropriate if the patient were not at risk for gastrointestinal bleeding. However, each of these patients had one or more risk factors for GI bleeding, with GI risk scores equal to or greater than 16. Although different SPs portrayed each case, the basic clinical data were the same for each person portraying a given case. Table 1 shows the case data for each case as well as the total GI risk score. The GI risk score should have been 16 or higher if all the information relevant to the case had been properly collected and recorded. Had the decision support tool been used with a score below 16, it would have advised that regular non-selective NSAIDs without gastroprotection could be prescribed safely, which would have been potentially unsafe prescribing.
Table 1.
Case Descriptions and Correct Medical History as Developed for Each Case
| Chief Complaint | Hip Pain | Knee Pain | Shoulder Pain | Foot Pain |
|---|---|---|---|---|
| Age | 66 | 44 | 46 | 66 |
| Overall Health | Good | Fair | Good | Fair |
| Arthritis History | Non-RA | Non-RA | Non-Arthritic Pain | Non-Arthritic Pain |
| Steroid Use | No use | No Use | No Use | 1–3 months (steroid taper) |
| GI Bleed or Ulcer Hx | Ulcer | Ulcer | GI bleed | No GI History |
| NSAIDS Symptoms | Stomach upset | Stomach Pain | No Symptoms | No Symptoms |
| GI Risk Score | 22 | 16 | 17 | 16 |
| # of Encounters | 30 | 41 | 46 | 61 |
Procedure
The residents gave consent to unannounced clinic visits from the SPs. This meant that the SPs were woven into the residents’ normal ambulatory continuity clinic as new patients. As they would do with any patient, residents interviewed and examined the SPs, discussed the case with their attending physician, and dictated their encounter note. Once the resident’s dictation of the encounter with the SPs was transcribed and returned to the clinic, it was collected by the study team for analysis.
The dictations were reviewed for data relevant to the cases to see whether the residents documented the six data elements needed for the CDSS: age, overall health rating, history of arthritis, steroid use in the previous year, any documentation of ulcer or GI bleed, and symptoms with previous NSAID use. The presence or absence of each data element and its accuracy were recorded. To assess data abstraction accuracy, a second rater reviewed a random sample of approximately 10% of the dictations, stratified by chief complaint. A total of 273 judgments were made on the 21 dictations reviewed: 13 per dictation, comprising two judgments per data element (presence/absence and accuracy) for each of the six elements plus calculation of the GI risk score.
Data Analysis
There were 178 dictations available for analysis. Within each dictation, each data element was scored 1 if present and 0 if absent. If the element was present, it was scored as either correct or incorrect. For each of the key elements, data were compiled to describe the percentage of dictations containing that element. For each of the 178 dictations, correctness and completeness scores were computed using the six elements and the formulas proposed by Logan et al.12 The completeness score is the proportion of the six elements present in the dictation, and the correctness score is the proportion of the elements present that are correct. Mean completeness and correctness scores were also computed for each case and for the total sample. Finally, a GI risk score for each encounter was computed from the data in the medical record. Scores of 16 and above were considered “accurate,” in that the CDSS, had it been used with those data, would have recognized the patient as high risk and recommended safe prescribing. Scores below 16 were considered “inaccurate,” in that the rule would have classified the patient as low risk and allowed unsafe prescribing, although all cases were actually high risk.
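A minimal sketch of this scoring is shown below, assuming a simple dictionary representation in which each element is None when absent and a boolean (correct/incorrect) when present; the field names and the example record are hypothetical.

```python
# Minimal sketch of the completeness/correctness scoring described above.
from typing import Dict, Optional

ELEMENTS = ["age", "self_rated_health", "arthritis_history",
            "steroid_use", "ulcer_or_gi_bleed", "nsaid_symptoms"]


def completeness(record: Dict[str, Optional[bool]]) -> float:
    """Proportion of the six key elements documented in the dictation."""
    present = [e for e in ELEMENTS if record.get(e) is not None]
    return len(present) / len(ELEMENTS)


def correctness(record: Dict[str, Optional[bool]]) -> float:
    """Proportion of the documented elements that are correct."""
    present = [record[e] for e in ELEMENTS if record.get(e) is not None]
    return sum(present) / len(present) if present else 0.0


def cdss_would_warn(score_from_record: int, threshold: int = 16) -> bool:
    """True if the CDSS, fed only the documented data, would flag high risk."""
    return score_from_record >= threshold


# Example: a dictation documenting only age (correct) and ulcer history (correct).
example = {"age": True, "ulcer_or_gi_bleed": True}
print(completeness(example))  # 0.33 -> 2 of 6 elements present
print(correctness(example))   # 1.0  -> both documented elements correct
```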
Results
There were a total of 15 disagreements between raters, representing 5% of the total judgments on the sample of dictations. Further review by a third reviewer showed that 20% of the discrepancies were correct as initially recorded, and none of the discrepancies resulted in a change in the GI risk score from safe to unsafe or vice versa.
Age was a particularly important element in the analysis, as an older patient was assumed to be at greater risk than a younger patient; for the same set of symptoms, older patients should have a higher GI risk score. SPs aged 66 or above accounted for 91 visits, approximately half of the patient visits. Even so, using the birth date given on the registration form as the standard, age was incorrectly documented in the encounter note 10% of the time. Table 2 shows the presence of the remaining elements.
Table 2.
Completeness of Key Data Elements
| Data Elements | # of Dictations Containing Data Elements (n = 178) | % of Dictations |
|---|---|---|
| Patient rating of own health | 1 | 0.6 |
| Arthritis history | 48 | 27.0 |
| Steroid use | 41 | 23.0 |
| GI bleed or ulcer | 65 | 36.5 |
| NSAIDs symptoms | 27 | 15.2 |
The residents documented the patient’s self-appraised health in only one of the 178 dictations (0.6%) reviewed. Arthritis history and use of steroids were reported for only about one quarter of the encounters.
However, because the patient actually used steroids in only one of the four cases, we examined the documentation for that case separately and found that steroid use was recorded in two-thirds of the medical records for that case. The residents did a slightly better job of obtaining information about gastrointestinal symptoms. As shown in Table 2, the patient’s history of ulcer or GI bleed was obtained in 36.5% of the dictations across all cases, but for patients with a significant past GI history, it was obtained 50.4% of the time. Finally, a history of symptoms with use of NSAIDs was present approximately 15% of the time. The mean completeness and correctness scores for each case and for the total sample are shown in Table 3.
Table 3.
Completeness and Correctness Scores
| Case | Mean Completeness Score | Mean Correctness Score |
|---|---|---|
| Hip Pain | .39 | .91 |
| Shoulder Pain | .32 | .94 |
| Knee Pain | .37 | .96 |
| Foot Pain | .30 | .95 |
| All cases combined | .34 | .94 |
These data indicate that, on average, only 30–40% of the six data elements were recorded for each case, and the proportions were similar across cases. However, most of what was recorded was accurate. As Tables 2 and 3 show, completeness is the major problem, as the mean correctness scores are .91 or higher. Nevertheless, in terms of the impact on the CDSS, as shown below, both types of errors can matter, especially when an alert is triggered by a threshold score.
The most crucial question is whether these gaps in documentation would change the recommendation of the CDSS. A GI risk score of 15 or below was considered inaccurate, because it would lead to potentially unsafe prescribing for the specified encounter; a score of 16 or above was considered accurate, in that it would prompt safe prescribing recommendations for these high-risk patients. As Table 4 shows, using the data that were actually documented in the medical record, the CDSS would have failed to warn about the patient’s high-risk status more than 77% of the time.
Table 4.
GI Risk Score/Recommendation
| Recommendation | # of Dictations | % of Dictations |
|---|---|---|
| Inaccurate recommendation (GI Risk score < 16) | 138 | 77.5 |
| Accurate recommendation (GI Risk score ≥ 16) | 40 | 22.5 |
| Total | 178 | 100.0 |
Discussion
Our data show that important gaps in the medical record can affect the accuracy of the recommendations of a clinical decision support tool designed to improve safe prescribing. For many types of clinical data, one might easily assume that absent data mean that no problems were discovered. However, checking that assumption in most clinical situations would be nearly impossible. This study, using Standardized Patients, provided a “gold standard” against which to compare the recorded information. The results showed that critical pieces of information were either not obtained or not recorded. However, most of the data, when recorded, were accurate.
Residents in this study had the opportunity to use the CDSS on a handheld PDA. This study shows that a handheld computer using the NSAID GI risk rule could potentially reduce NSAID prescribing errors and improve patient safety if clinicians obtained accurate and complete information from their patients. Although information can be retrieved quickly on the handheld, the residents did not regularly access the handheld decision tool in the clinical setting. Had they done so, they might have recognized the need to gather additional information from the patient. Automating the rule and having it draw data from an electronic record, rather than relying on the clinician to invoke it, could improve decision support, but only if the record is complete and accurate enough to trigger the rule. The data from the present study show that, had there been an electronic file of the clinical encounter and had the rule been in place to access those data at the point of prescribing NSAIDs, it would have alerted the clinician less than 25% of the times it should have done so, since all of the SPs were high risk. The type of data may influence both the completeness of the record and the impact on the CDSS. Aronsky and Haug5 found that the clinician’s text notes were the least complete part of the medical record, whereas laboratory and medication data were more complete and accurate. Also, a CDSS that could be triggered by any one of several pieces of information might be less affected by a single piece of missing data. In the present study, there was little redundancy of data, and the types of data involved (risk factors, the patient’s self-assessment of overall health, and past medical history) may be less likely to be assessed or recorded.13
Our study adds to a growing literature examining the accuracy of data in both paper and electronic medical records. Hogan and Wagner reviewed a number of studies on the accuracy of data in electronic records and concluded that further study was needed to better understand the extent and impact of errors.4 Aronsky and Haug examined whether the data in the inpatient electronic record at LDS Hospital were accurate enough to support a CDSS for determining pneumonia severity.5 They concluded that the data quality was adequate, but that 64% of the errors came from the free-text portions of the medical record. More recently, Hsieh et al. found that 80% of the drug allergy alerts within an electronic medical record were overridden by clinicians for clinically justifiable reasons.6 One of the main reasons for overriding the alerts was that they were based on inaccurate information in the electronic medical record, such as allergy lists that had not been updated or were inaccurate. Although our study examined the impact of documentation on a particular CDSS in a single setting, the finding of significant gaps in documentation is consistent with other studies.
One might be able to program the rule in an automated system to prompt the appropriate gathering of information, rather than simply identifying the risk. However, this solution also has problems. It would work best if the data were already in an electronic record and if it prompted for more information only when something was missing, as sketched below. Reminding a clinician to collect data that were already in the record would likely be perceived as annoying rather than helpful. Such tailored alerts may be more complex to implement than simpler reminders. If the encounter note were not yet entered into the system (as is likely in outpatient encounters), entering the data into the CDSS, and then again in the normal encounter note, would involve double data entry. Furthermore, if both the prescription and the encounter note were entered after the patient left the office, it might be too late for any warning.
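The sketch below illustrates, using the same six-element representation, how a system might prompt only for the elements that are actually missing; the field names and prompt wording are assumptions for illustration, not part of any deployed tool.

```python
# Hedged sketch of the "prompt only for missing data" idea discussed above.
from typing import Dict, List, Optional

REQUIRED_FIELDS = ["age", "self_rated_health", "arthritis_history",
                   "steroid_use", "ulcer_or_gi_bleed", "nsaid_symptoms"]


def missing_fields(record: Dict[str, Optional[object]]) -> List[str]:
    """Return only the elements absent from the record, so the clinician is
    not reminded about data that are already documented."""
    return [f for f in REQUIRED_FIELDS if record.get(f) is None]


def prompt_or_score(record: Dict[str, Optional[object]]) -> str:
    gaps = missing_fields(record)
    if gaps:
        return "Before prescribing an NSAID, please document: " + ", ".join(gaps)
    return "All risk elements documented; the GI risk score can be calculated."
```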
One solution may be to have data related to risk factors, and other similar data, collected from the patient, and the CDSS used, prior to the physician encounter. This might be done by non-physician personnel or could be collected electronically directly from the patient. Currently many physicians utilize a paper checklist for past medical history and risk factors. If these data were collected electronically, decision support tools could be incorporated at the time of data collection. This process would provide the appropriate alerts in a timely manner when they could be best utilized to either prompt additional data gathering or changes in treatment recommendations. If the physician were alerted prior to constructing the encounter note, it might also lead to more complete and accurate documentation in the electronic record.
Limitations
While efforts were made to integrate the SPs seamlessly into the residents’ practice, residents reported detecting the SP in approximately a quarter of the encounters, which could have affected their data gathering. Knowing that the patient is not “real” could make a resident less attentive. On the other hand, SP detection could have the reverse effect, making residents more meticulous because they knew that study personnel would evaluate their patient care. The SPs reported that the residents generally treated them like real patients and did not do a cursory job. Residents were required to discuss all of their patients with their attending physician, so there was also some pressure to do a competent job. However, the performance of residents might differ from that of physicians in practice, either positively (because residents had more supervision) or negatively (because they were less experienced).
It is possible that some of the missing data resulted from the SP forgetting to convey the information accurately to the physician. However, based on the nature of the six key elements and anecdotal reports from the SPs, it is more likely that the physician neglected to inquire about the information: in most dictations the element was not inaccurate but simply absent, and there was no evidence that an inquiry had been made. Completeness of data collection might be less of an issue for a different CDSS that required routinely collected data elements, or if residents had more incentive to respond to CDSS prompts.
Conclusion
Clinical decision support systems (CDSS) have been shown to improve patient safety by alerting physicians to potentially dangerous drug interactions and suggesting appropriate treatment strategies. Such tools work most efficiently when they respond to data contained in electronic medical records, but for a CDSS to provide accurate advice there must be high-quality data in the medical record. This study showed that, without efforts to improve the accuracy and completeness of the clinical data in the medical record, these tools will fail to live up to their potential.
Acknowledgement
This research was funded in part by grants # R18 HS11820 and # 1-T32-HS013852 from the Agency for Healthcare Research and Quality.
References
1. Pestotnik SL, Classen DC, Evans RS, Burke JP. Implementing antibiotic practice guidelines through computer-assisted decision support: clinical and financial outcomes. Ann Intern Med. 1996;124:884–90. doi: 10.7326/0003-4819-124-10-199605150-00004.
2. Bates DW, Gawande AA. Improving safety with information technology. N Engl J Med. 2003;348:2526–34. doi: 10.1056/NEJMsa020847.
3. Garg AX, Adhikari NKJ, McDonald H, et al. Effects of computerized clinical decision support systems on physician performance and patient outcomes. JAMA. 2005;293:1339–46. doi: 10.1001/jama.293.10.1223.
4. Hogan WR, Wagner MM. Accuracy of data in computer-based patient records. J Am Med Inform Assoc. 1997;4:342–55. doi: 10.1136/jamia.1997.0040342.
5. Aronsky D, Haug PJ. Assessing the quality of clinical data in a computer-based record for calculating the pneumonia severity index. J Am Med Inform Assoc. 2000;7:55–65. doi: 10.1136/jamia.2000.0070055.
6. Hsieh TC, Kuperman GJ, Jaggi T, et al. Characteristics and consequences of drug-allergy alert overrides in a computerized physician order entry system. J Am Med Inform Assoc. 2004;11:482–91. doi: 10.1197/jamia.M1556.
7. Wagner MM, Hogan WR. The accuracy of medication data in an outpatient electronic medical record. J Am Med Inform Assoc. 1996;3:234–43. doi: 10.1136/jamia.1996.96310637.
8. Luck J, Peabody JW. Using standardised patients to measure physicians' practice: validation study using audio recordings. BMJ. 2002;325(7366):679. doi: 10.1136/bmj.325.7366.679.
9. Singh G, Ramey DR, Triadafilopoulus G, Brown BW, Balise RR. GI Score: a simple self-assessment instrument to quantify the risk of serious NSAID-related GI complications in RA and OA [abstract]. Arthritis Rheum. 1998;41(suppl):S75.
10. Fries JF, Bruce B. Rates of serious gastrointestinal events from low dose use of acetylsalicylic acid, acetaminophen, and ibuprofen in patients with osteoarthritis and rheumatoid arthritis. J Rheumatol. 2003;30:2226–33.
11. Cheatham TC, Levy G, Spence M. Predicting the risk of gastrointestinal bleeding due to nonsteroidal anti-inflammatory drugs: NSAID electronic assessment of risk. J Rheumatol. 2003;30:2241–4.
12. Logan JR, Gorman PN, Middleton B. Measuring the quality of medical records: a method for comparing completeness and correctness of clinical encounter data. Proc 2001 AMIA Fall Symposium:408–12.
13. Atkins D, Clancy C. Multiple risk factor interventions. Are we up to the challenge? Am J Prev Med. 2004;(2 Suppl):102–3. doi: 10.1016/j.amepre.2004.04.016.
