Summary
Despite the importance of adverse event (AE) reporting, AEs are under-reported on clinical trials. We hypothesized that electronic medical record (EMR) data can ascertain laboratory-based AEs more accurately than those ascertained manually. EMR data on 12 AEs for patients enrolled on two Children’s Oncology Group (COG) trials at one institution were extracted, processed and graded. When compared to gold standard chart data, COG AE report sensitivity and positive predictive values (PPV) were 0–21.1% and 20–100%, respectively. EMR sensitivity and PPV were >98.2% for all AEs. These results demonstrate that EMR-based AE ascertainment and grading substantially improves laboratory AE reporting accuracy.
Introduction
Side effects on oncology clinical trials are captured in adverse event (AE) reports using the National Cancer Institute (NCI) Common Terminology Criteria for Adverse Events (CTCAE).(https://ctep.cancer.gov/protocolDevelopment/electronic_applications/ctc.htm) Clinical research associates (CRAs) manually review the medical record to report AEs. Although comprehensive and accurate AE ascertainment is crucial for oncology trials, our prior work has demonstrated dramatic under-reporting of AEs.(Miller, et al 2016)
We and others have sought to improve AE reporting with automated data extraction from the electronic medical record (EMR).(Lencioni, et al 2015, Mandelblatt, et al 2014) Mandelblatt et al. (2014) ascertained AEs in breast cancer patients with an overall sensitivity of 89% (range 0–100%), but did not report positive predictive value (PPV). Lencioni et al. (2015) incorporated EMR AE capture in one hospital system to streamline AE collection, but did not provide a methodology for improving AE reporting on multi-centre trials.
This study sought to automate EMR data extraction and grading for laboratory-based AEs at a single institution and to compare this approach to manual AE assessment. We hypothesized that EMR-based AE ascertainment and grading would have a higher sensitivity and PPV than manual review.
Methods
Cohort and Data Source
Retrospective EMR data from all patients enrolled on two phase III Children’s Oncology Group (COG) trials for acute myeloid leukaemia (AML: AAML0531, AAML1031) between December 2006 and March 2015 at the Children’s Hospital of Philadelphia (CHOP) were included.(Gamis, et al 2014) COG permitted use of trial data. Prospectively and manually collected COG trial AE data were received from COG in a de-identified spreadsheet listing each reported AE by course. Data were extracted from the vendor-supplied EMR (Epic Systems, Inc., Verona, WI), in which laboratory data are stored in the Epic Clarity clinical data warehouse. For each patient, EMR data during the on-protocol period through August 2015 were extracted; this end date was determined from COG AE reports. Nine chemistry and haematology laboratory results were evaluated: potassium, glucose, sodium, alanine aminotransferase (ALT), aspartate aminotransferase (AST), bilirubin, haemoglobin, platelet count and absolute neutrophil count (ANC). Manual chart abstraction was performed by a paediatrician to determine the AE gold standard.
EMR Data Extraction
Structured Query Language queries extracted data on nine laboratory tests, corresponding to twelve CTCAEs, from Clarity into comma-separated values (CSV) files for each patient. Data included: medical record number; the ordered, collected and resulted times and dates; the test component (e.g. “potassium”) and name (e.g. “comprehensive metabolic panel”); the results, including units, narrative and comment fields; and the reference range. An iterative process of data extraction, review to verify completeness, and re-extraction ensured that all results were captured.
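The extraction step can be illustrated with a minimal sketch. The Clarity schema is not described here, so the table `lab_results`, its column names and the in-memory SQLite database below are hypothetical stand-ins rather than the queries actually used. The sketch selects only the laboratory components of interest and builds one CSV per patient.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in table: the real Clarity schema is not described in the
# text, so every table and column name below is an assumption.
SCHEMA = """
CREATE TABLE lab_results (
    mrn TEXT, ordered_at TEXT, collected_at TEXT, resulted_at TEXT,
    component TEXT, test_name TEXT, value TEXT, units TEXT,
    comment TEXT, ref_low REAL, ref_high REAL
)
"""

# The nine laboratory components evaluated in the study.
COMPONENTS = ("potassium", "glucose", "sodium", "ALT", "AST", "bilirubin",
              "haemoglobin", "platelet count", "ANC")


def extract_per_patient(conn, components=COMPONENTS):
    """Return {mrn: CSV text}, one CSV row per laboratory result."""
    placeholders = ",".join("?" for _ in components)
    cursor = conn.execute(
        "SELECT mrn, ordered_at, collected_at, resulted_at, component, "
        "test_name, value, units, comment, ref_low, ref_high "
        f"FROM lab_results WHERE component IN ({placeholders}) "
        "ORDER BY mrn, collected_at",
        tuple(components),
    )
    header = [column[0] for column in cursor.description]
    buffers, writers = {}, {}
    for row in cursor:
        mrn = row[0]
        if mrn not in writers:
            # First result for this patient: start a new CSV with a header row.
            buffers[mrn] = io.StringIO()
            writers[mrn] = csv.writer(buffers[mrn])
            writers[mrn].writerow(header)
        writers[mrn].writerow(row)
    return {mrn: buffer.getvalue() for mrn, buffer in buffers.items()}


# Toy data (invented for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO lab_results VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("A1", "2010-01-01 08:00", "2010-01-01 08:10", "2010-01-01 09:00",
         "potassium", "comprehensive metabolic panel", "2.9", "mmol/l", "", 3.5, 5.0),
        ("A1", "2010-01-01 08:00", "2010-01-01 08:10", "2010-01-01 09:00",
         "magnesium", "magnesium level", "0.8", "mmol/l", "", 0.7, 1.0),
        ("B2", "2010-01-02 06:00", "2010-01-02 06:05", "2010-01-02 07:00",
         "sodium", "basic metabolic panel", "128", "mmol/l", "", 135.0, 145.0),
    ],
)
csv_by_patient = extract_per_patient(conn)
```

Returning CSV text keyed by patient, rather than writing files directly, keeps the sketch side-effect free; each string could be written to a per-patient file in a real pipeline.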
Processing and Grading
Data extracted from different Clarity data tables were concatenated. Reference units were standardized across all types of test for the same laboratory component. Appropriate numerical data in comment fields were extracted programmatically. Each laboratory result was graded using automated code based on cut-off values in CTCAE v4 definitions.(https://ctep.cancer.gov/protocolDevelopment/electronic_applications/ctc.htm) CTCAE definitions were interpreted as follows. Non-fasting glucose values were used because fasting glucose is not routinely obtained in paediatric inpatients. Because Grade 1 “neutrophil count decreased” is defined as a value from below the lower limit of normal (LLN) down to 1.5 × 10⁹/l,(https://ctep.cancer.gov/protocolDevelopment/electronic_applications/ctc.htm) a value of 2.0 × 10⁹/l was used as the LLN when the ANC LLN was absent. Grade 4 anaemia is not numerically based and could not be graded.(https://ctep.cancer.gov/protocolDevelopment/electronic_applications/ctc.htm) Finally, total bilirubin was graded.
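As an illustration of threshold-based grading, the following minimal sketch grades a single ANC value against the CTCAE v4 numerical cut-offs for “neutrophil count decreased”, applying the 2.0 × 10⁹/l fallback when no LLN is available. The function name and the rule that a value at or above the LLN is treated as normal are assumptions made for the sketch, not the study's code.

```python
from typing import Optional

# Fallback LLN (x10^9/l) stated in the text for results without a reference range.
DEFAULT_ANC_LLN = 2.0


def grade_anc(anc: float, lln: Optional[float] = None) -> int:
    """Return the CTCAE v4 grade (0-4) for an ANC value; anc and lln in x10^9/l."""
    if lln is None:
        lln = DEFAULT_ANC_LLN
    # Assumption: a value at or above the LLN is normal, even if the LLN is below 1.5.
    if anc >= lln:
        return 0
    if anc < 0.5:
        return 4
    if anc < 1.0:
        return 3
    if anc < 1.5:
        return 2
    return 1  # below the LLN but at least 1.5


grades = [grade_anc(value) for value in (2.4, 1.8, 1.2, 0.7, 0.2)]
```

With the fallback LLN, the five example values above receive grades 0 through 4 in order.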
Electronic data quality filtering was performed after grading to ensure that abnormal results were not due to laboratory error. Filtering removed results with the following error types: 1) haemolysed specimens based on text fields; 2) tests with multiple laboratory abnormalities, indicating improper specimen collection (e.g. inadequate discard volume prior to collection); 3) results that normalized on repeat testing within one hour.
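The three filters can be sketched as follows. This is a simplified illustration under stated assumptions: the `Result` record, the `specimen_id` field and the haemolysis keyword match are hypothetical, and the second filter is restricted to the concurrent hyperkalaemia and hyperglycaemia pattern that Table II names as a sign of improper specimen collection.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# AE pair that Table II lists as indicating improper specimen collection.
CONTAMINATION_PAIR = {"hyperkalaemia", "hyperglycaemia"}


@dataclass
class Result:
    specimen_id: str            # hypothetical ID shared by tests on one specimen
    component: str              # e.g. "potassium"
    collected: datetime
    ae: Optional[str] = None    # graded AE name; None when the value is normal
    comment: str = ""


def is_haemolysed(result: Result) -> bool:
    """Crude text match covering both spellings (an assumption for this sketch)."""
    text = result.comment.lower()
    return "hemoly" in text or "haemoly" in text


def filter_results(results: List[Result]) -> List[Result]:
    """Drop abnormal results that match one of the three error types."""
    aes_by_specimen = {}
    for r in results:
        if r.ae is not None:
            aes_by_specimen.setdefault(r.specimen_id, set()).add(r.ae)

    kept = []
    for r in results:
        if r.ae is None:
            kept.append(r)  # normal results are never filtered
            continue
        if is_haemolysed(r):
            continue  # 1) haemolysed specimen
        if r.ae in CONTAMINATION_PAIR and CONTAMINATION_PAIR <= aes_by_specimen[r.specimen_id]:
            continue  # 2) concurrent abnormalities on the same specimen
        normalized = any(
            o.component == r.component and o.ae is None
            and timedelta(0) < o.collected - r.collected <= timedelta(hours=1)
            for o in results
        )
        if normalized:
            continue  # 3) normal repeat result within one hour
        kept.append(r)
    return kept


t0 = datetime(2010, 1, 1, 10, 0)
sample = [
    Result("S1", "potassium", t0, "hyperkalaemia", "Specimen hemolyzed"),
    Result("S2", "potassium", t0, "hyperkalaemia"),
    Result("S2", "glucose", t0, "hyperglycaemia"),
    Result("S3", "sodium", t0, "hyponatraemia"),
    Result("S4", "sodium", t0 + timedelta(minutes=30)),
    Result("S5", "potassium", t0 + timedelta(hours=2), "hypokalaemia"),
]
kept = filter_results(sample)
```

In this toy input only the normal sodium repeat and the unrepeated hypokalaemia result survive filtering.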
Statistical Analyses
The highest AE grade in each course was determined. The distribution (frequency, percentage) of results and of courses with each grade was tabulated for each AE at each processing step.
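A minimal sketch of this course-level summary is shown below. It assumes graded results are available as (patient, course, AE, grade) tuples; that layout and the function names are illustrative assumptions.

```python
from collections import Counter


def highest_grade_per_course(graded):
    """Map (patient, course, ae) to the highest grade seen in that course."""
    highest = {}
    for patient, course, ae, grade in graded:
        key = (patient, course, ae)
        highest[key] = max(grade, highest.get(key, 0))
    return highest


def grade_distribution(highest, ae):
    """Return {grade: (number of courses, percentage of courses)} for one AE."""
    counts = Counter(g for (_, _, name), g in highest.items() if name == ae)
    total = sum(counts.values())
    return {g: (n, 100.0 * n / total) for g, n in sorted(counts.items())}


# Toy input (invented): two patients, three courses in total.
graded = [
    ("P1", 1, "hypokalaemia", 1),
    ("P1", 1, "hypokalaemia", 3),
    ("P1", 2, "hypokalaemia", 0),
    ("P2", 1, "hypokalaemia", 3),
]
highest = highest_grade_per_course(graded)
distribution = grade_distribution(highest, "hypokalaemia")
```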
Sensitivity, specificity, PPV and negative predictive value (NPV) were determined for COG AE reports and for EMR-based AE ascertainment using chart data as the gold standard. As the trials did not require haematological AE reporting (Gamis, et al 2014), analyses for COG AE reports were restricted to non-haematological AEs. Chart review evaluated COG AE reporting error types. Institutional Review Board approval was obtained.
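For reference, the four measures can be computed per AE from course-level sets, as in the sketch below. The inputs (sets of course identifiers) and the function name are assumptions for illustration. A measure is returned as `None` when its denominator is zero, for example PPV when a source reports no events.

```python
def diagnostic_metrics(reported, gold, all_courses):
    """Compare courses flagged by one source against the gold standard."""
    reported, gold, all_courses = set(reported), set(gold), set(all_courses)
    tp = len(reported & gold)                 # flagged and truly present
    fp = len(reported - gold)                 # flagged but absent
    fn = len(gold - reported)                 # present but missed
    tn = len(all_courses - reported - gold)   # correctly not flagged

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Toy example (invented): 10 courses, 4 with the AE on chart review,
# 2 flagged by the source being evaluated (one correctly).
metrics = diagnostic_metrics(reported={1, 5}, gold={1, 2, 3, 4},
                             all_courses=range(1, 11))
no_reports = diagnostic_metrics(reported=set(), gold={1, 2},
                                all_courses=range(1, 11))
```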
Results
EMR data on 49 patients (173 chemotherapy courses) were extracted. Three courses were excluded because the patient received chemotherapy at another hospital prior to transfer to CHOP. The number of total tests for each AE ranged from 946 to 6,326 (944 to 6,322 after false positive results were removed). As expected, >94% of courses had grade 3 or 4 haematological AEs. The percentage of courses with each chemistry AE ranged from 0 to 20.6% for grade 3 and 0 to 16.5% for grade 4.
The sensitivity of COG AE reports ranged from 0% to 21.1% and PPV ranged from 20% to 100%, while the sensitivity and PPV of EMR-based AE ascertainment were >98.2% for all AEs (Table I). The median number of AEs missed per patient per course was one to two. COG AE reporting errors fell into one of three categories: incorrect reporting, reporting of false positive laboratory results, or incorrect grading (Table II).
Table I.
Adverse event | Direction | Chart abstraction N | COG N | COG sensitivity | COG PPV | COG specificity | COG NPV | EMR N | EMR sensitivity | EMR PPV | EMR specificity | EMR NPV |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Sodium | Hypo | 36 | 3 | 8.3 (2.2, 23.6) | 100 (31.0, 100) | 100 (96.7, 100) | 81.1 (74.4, 86.5) | 36 | 100 (88.0, 100) | 100 (88.0, 100) | 100 (96.6, 100) | 100 (96.6, 100) |
Sodium | Hyper | 2 | 1 | 50.0 (2.7, 97.3) | 100 (5.5, 100) | 99.4 (96.4, 100) | 99.4 (96.4, 100) | 2 | 100 (19.8, 100) | 100 (19.8, 100) | 100 (97.2, 100) | 100 (97.2, 100) |
Potassium | Hypo | 57 | 9 | 15.8 (7.9, 28.4) | 45.0 (23.8, 68.0) | 100 (96.2, 100) | 75.6 (68.1, 81.9) | 56 | 98.2 (89.4, 99.9) | 98.2 (89.4, 99.9) | 100 (96.0, 100) | 100 (96.0, 100) |
Potassium | Hyper | 57 | 1 | 1.8 (0.09, 10.6) | 20.0 (1.1, 70.1) | 97.2 (93.1, 98.9) | 98.8 (95.4, 99.8) | 2 | 100 (19.8, 100) | 100 (19.8, 100) | 100 (97.2, 100) | 100 (97.2, 100) |
Glucose | Hypo | 2 | 0 | 0 (0, 80.2) | -- | 100 (97.4, 100) | 99.4 (96.4, 100) | 1 | 100 (5.5, 100) | 100 (5.5, 100) | 100 (97.3, 100) | 100 (97.3, 100) |
Glucose | Hyper | 62 | 13 | 21.0 (12.1, 33.5) | 81.3 (53.7, 95.0) | 99.1 (94.6, 100) | 71.2 (63.5, 77.8) | 62 | 100 (92.7, 100) | 98.4 (90.3, 99.9) | 99.1 (94.3, 100) | 100 (95.8, 100) |
Total Bilirubin | Increased | 6 | 0 | 0 (0, 48.3) | -- | 100 (97.2, 100) | 96.6 (92.5, 98.6) | 6 | 100 (51.7, 100) | 100 (51.7, 100) | 100 (96.9, 100) | 100 (96.9, 100) |
ALT | Increased | 19 | 4 | 21.1 (7.0, 46.1) | 66.7 (24.1, 94.0) | 100 (97.1, 100) | 91.9 (86.5, 95.3) | 19 | 100 (79.1, 100) | 100 (79.1, 100) | 100 (96.6, 100) | 100 (96.6, 100) |
AST | Increased | 13 | 2 | 15.4 (2.7, 46.3) | 50.0 (9.2, 90.8) | 99.4 (96.2, 100) | 94.3 (89.4, 97.1) | 13 | 100 (71.7, 100) | 100 (71.7, 100) | 100 (96.7, 100) | 100 (96.7, 100) |
ANC | Decreased | 170 | NR | NR | NR | NR | NR | 169 | 99.4 (96.3, 100) | 98.3 (94.6, 99.5) | -- | -- |
Haemoglobin | Decreased | 160 | NR | NR | NR | NR | NR | 160 | 100 (97.1, 100) | 100 (97.1, 100) | 100 (69.9, 100) | 100 (69.9, 100) |
Platelets | Decreased | 170 | NR | NR | NR | NR | NR | 169 | 99.4 (96.3, 100) | 98.8 (95.4, 99.8) | 87.5 (46.7, 99.3) | 100 (56.1, 100) |
No hypoglycaemia or blood bilirubin increased AEs were reported in COG data.
NR: COG does not require reporting of Grade 3 and 4 haematology adverse events; ALT: alanine aminotransferase; ANC: absolute neutrophil count; AST: aspartate aminotransferase; COG: Children’s Oncology Group; EMR: electronic medical record; NPV: negative predictive value; PPV: positive predictive value.
Table II.
Area of Error | Types of Error |
---|---|
Incorrect reporting | Reported the wrong AE (e.g. hypokalaemia instead of hyperkalaemia) |
Incorrect reporting | No laboratory data were identified to match the AE report |
Reporting of false positive laboratory results | Reported AE was due to a haemolysed potassium specimen |
Reporting of false positive laboratory results | Reported hyperkalaemia or hyperglycaemia AE had concurrent hyperglycaemia or hyperkalaemia, indicating improper specimen collection |
Reporting of false positive laboratory results | Reported AE normalized when re-checked within 1 h |
Incorrect grading | Reported the first date of any grade 3 or 4 AE in the course rather than the highest grade for the course |
Incorrect grading | Incorrect grading according to numerical cut-offs |
Incorrect grading | Grading definition includes calculations relative to the reference range and was not correctly applied |
Discussion
This study demonstrates that EMR-based AE ascertainment at a single institution using an automated data extraction, cleaning and grading process dramatically improves the accuracy of laboratory AE ascertainment. Notably, the automated processing steps to remove potential laboratory errors, which were not previously reported by Lencioni et al. (2015), are crucial to ensuring accurate laboratory AE identification.
These results have several direct implications. First, automated laboratory AE ascertainment and grading may be able to replace the current manual ascertainment process. Given the extensive time required for manual ascertainment, the limited CRA time available for AE reporting, and the limited resources available to reimburse manual reporting, such an automated system may be of substantial benefit.(Nass, et al 2010, Roche, et al 2002)
Secondly, such course-level data can provide a more accurate estimate of true laboratory AE rates. Given that COG AE rates were substantially lower than in EMR data, published laboratory AE rates probably substantially underestimate the true laboratory AE rates with AML therapy. Accurate rates could serve as baseline estimates in the evaluation of new agents combined with standard AML chemotherapy backbones.
Finally, automated EMR-based laboratory AE reporting may substantially decrease the variation in clinical trial AE reports.(Huynh-Le, et al 2014, Lencioni, et al 2015, Thomas, et al 2002) Many CTCAE definitions use both “life-threatening consequences” and a numerical cut-off.(https://ctep.cancer.gov/protocolDevelopment/electronic_applications/ctc.htm) In our experience, some CRAs report laboratory AEs only when accompanied by life-threatening consequences, while others report solely based on numerical values. EMR-based AE ascertainment can standardize grading by enforcing a uniform interpretation of CTCAE numerical criteria. Furthermore, the processing code could transparently be adjusted to a desired sensitivity and specificity for a particular clinical trial. Moreover, multiple thresholds could be applied to evaluate a potential laboratory data toxicity signal. Additional processing incorporating other EMR components, such as medication interventions (e.g. kayexalate), may also improve AE ascertainment.
The primary limitation of this study is that it was performed at one hospital. Future work needs to demonstrate that EMR-based laboratory AE ascertainment is generalizable across institutions using Epic and other EMRs. This should be feasible based on the work of other investigators.(Deakyne, et al 2015, Fiks, et al 2015) Furthermore, this study did not incorporate non-numerical CTCAE definitions. However, automated extraction of medical intervention data should be possible. Finally, this study did not include laboratory results reported as free text that would require natural language processing (NLP) for extraction. Querying free text using NLP poses substantial challenges; however, we believe that processing AE results with NLP should be feasible.(DeLisle, et al 2010, Friedlin, et al 2008)
In summary, this single institution study shows that EMR-based AE ascertainment and grading substantially improves laboratory AE reporting. If incorporated into prospective clinical trials, EMR-based laboratory AE ascertainment may substantially improve laboratory AE reporting efficiency. This may free CRAs to ascertain complex AEs, which may improve the accuracy of complex AE reporting as well. Work is ongoing to test this EMR-based AE ascertainment at other institutions.
Acknowledgments
This work was supported by the National Institutes of Health R01 CA165277, National Institutes of Health K12 CA076931, Statistics and Data Center Grant U10CA098413, Chair's Grant U10CA098543, National Clinical Trials Network (NCTN) Operations Center Grant U10CA180886, NCTN Statistics & Data Center U10CA180899, and the St. Baldrick’s Foundation.
Footnotes
Author contributions: TPM, KDG, BTF, RB, AES, RG and RA designed the research study. TPM, JD, EB, AZ, YL and KDG performed the research. TPM, YL and KDG analyzed the data. JP and RG contributed essential research methods. TPM and RA wrote the paper.
Conflicts of Interest: The authors have no conflicts of interest to disclose.
References
- Deakyne SJ, Bajaj L, Hoffman J, Alessandrini E, Ballard DW, Norris R, Tzimenatos L, Swietlik M, Tham E, Grundmeier RW, Kuppermann N, Dayan PS; Pediatric Emergency Care Applied Research Network. Development, Evaluation and Implementation of Chief Complaint Groupings to Activate Data Collection: A Multi-Center Study of Clinical Decision Support for Children with Head Trauma. Appl Clin Inform. 2015;6:521–535. doi: 10.4338/ACI-2015-02-RA-0019.
- DeLisle S, South B, Anthony JA, Kalp E, Gundlapalli A, Curriero FC, Glass GE, Samore M, Perl TM. Combining free text and structured electronic medical record entries to detect acute respiratory infections. PLoS One. 2010;5:e13377. doi: 10.1371/journal.pone.0013377.
- Fiks AG, Grundmeier RW, Steffes J, Adams WG, Kaelber DC, Pace WD, Wasserman RC; Comparative Effectiveness Research Through Collaborative Electronic Reporting Consortium. Comparative Effectiveness Research Through a Collaborative Electronic Reporting Consortium. Pediatrics. 2015;136:e215–e224. doi: 10.1542/peds.2015-0673.
- Friedlin J, Grannis S, Overhage JM. Using natural language processing to improve accuracy of automated notifiable disease reporting. AMIA Annual Symposium Proceedings. 2008;2008:207–211.
- Gamis AS, Alonzo TA, Meshinchi S, Sung L, Gerbing RB, Raimondi SC, Hirsch BA, Kahwash SB, Heerema-McKenney A, Winter L, Glick K, Davies SM, Byron P, Smith FO, Aplenc R. Gemtuzumab ozogamicin in children and adolescents with De Novo acute myeloid leukemia improves event-free survival by reducing relapse risk: results from the randomized phase III Children's Oncology Group trial AAML0531. J Clin Oncol. 2014;32:3021–3032. doi: 10.1200/JCO.2014.55.3628.
- Huynh-Le MP, Zhang Z, Tran PT, DeWeese TL, Song DY. Low interrater reliability in grading of rectal bleeding using National Cancer Institute Common Toxicity Criteria and Radiation Therapy Oncology Group Toxicity scales: a survey of radiation oncologists. Int J Radiat Oncol Biol Phys. 2014;90:1076–1082. doi: 10.1016/j.ijrobp.2014.08.014.
- Lencioni A, Hutchins L, Annis S, Chen W, Ermisoglu E, Feng Z, Mack K, Simpson K, Lane C, Topaloglu U. An adverse event capture and management system for cancer studies. BMC Bioinformatics. 2015;16(Suppl 13):S6. doi: 10.1186/1471-2105-16-S13-S6.
- Mandelblatt JS, Huang K, Makgoeng SB, Luta G, Song JX, Tallarico M, Roh JM, Munneke JR, Houlston CA, McGuckin ME, Cai L, Clarke Hillyer G, Hershman DL, Neugut AI, Isaacs C, Kushi L. Preliminary Development and Evaluation of an Algorithm to Identify Breast Cancer Chemotherapy Toxicities Using Electronic Medical Records and Administrative Data. J Oncol Pract. 2014;11:e1–e8. doi: 10.1200/JOP.2013.001288.
- Miller TP, Li Y, Kavcic M, Troxel AB, Huang YS, Sung L, Alonzo TA, Gerbing R, Hall M, Daves MH, Horton TM, Pulsipher MA, Pollard JA, Bagatell R, Seif AE, Fisher BT, Luger S, Gamis AS, Adamson PC, Aplenc R. Accuracy of Adverse Event Ascertainment in Clinical Trials for Pediatric Acute Myeloid Leukemia. J Clin Oncol. 2016;34:1537–1543. doi: 10.1200/JCO.2015.65.5860.
- Nass SJ, Moses HL, Mendelsohn J. A National Cancer Clinical Trials System for the 21st Century: Reinvigorating the NCI Cooperative Group Program. Washington, DC: The National Academies Press; 2010.
- Roche K, Paul N, Smuck B, Whitehead M, Zee B, Pater J, Hiatt M-A, Walker H. Factors affecting workload of cancer clinical trials: results of a multicenter study of the National Cancer Institute of Canada clinical trials group. J Clin Oncol. 2002;20:545–556. doi: 10.1200/JCO.2002.20.2.545.
- Thomas EJ, Lipsitz SR, Studdert DM, Brennan TA. The reliability of medical record review for estimating adverse event rates. Ann Intern Med. 2002;136:812–816. doi: 10.7326/0003-4819-136-11-200206040-00009.