Abstract
Claims databases provide information on the effects of direct oral anticoagulants (DOACs) as used in routine care but may not contain important data on clinical characteristics which may be captured in electronic health records (EHR).
Within a US claims database, we identified patients initiating a DOAC or warfarin between 10/2010 and 12/2014. We used 1:1 propensity score (PS) matching to balance 78 claims-defined baseline characteristics. We then evaluated whether balance was also achieved in patient characteristics not measurable in the claims data by assessing the balance of clinical information from linked EHR data, using absolute standardized differences (aSD).
From a claims data cohort study of 140,187 patients, 5,935 (4.2%) patients were linked to EHR data. After PS-matching, almost all EHR-defined patient characteristics were well balanced (aSD<0.1). A new user active comparator design with 1:1 PS matching on many patient characteristics improved balance on clinical risk factors observed in EHR but not in claims data.
Keywords: Direct oral anticoagulants, warfarin, administrative data, claims data, linkage, electronic health records, confounding, sensitivity analysis
Introduction
A number of direct oral anticoagulants (DOACs) are being marketed for the prevention of stroke in patients with non-valvular atrial fibrillation (NVAF).1,2 Unlike vitamin K antagonists, DOACs do not require titration towards a narrow therapeutic range.
DOACs were tested for efficacy and safety in large randomized trials in controlled research settings.3,4,5 With their widespread use, concerns arose about how well these trial findings represent large patient populations. For example, the time in therapeutic range observed in the warfarin arm and the level of adherence observed in the DOAC arm of the trials may be overly optimistic for many patients in routine care. Large claims data studies were needed to fully understand the safety and effectiveness profile of DOACs given their growing use over time.
Pharmacoepidemiological studies based on longitudinal insurance claims data, routinely generated in the provision of healthcare for millions of patients, have increasingly been used to complement randomized controlled trial (RCT) findings6,7,8,9 and provide information on the comparative effectiveness and safety of anticoagulants in routine care settings. This has resulted in a range of claims data studies of varying quality.10 Even high-quality studies that employ the preferred new user active comparator cohort designs with substantial covariate adjustment11,12,13 have been criticized for potential confounding by factors not measured in claims data, including underlying bleeding risks, renal function, over-the-counter (OTC) aspirin use, body mass index (BMI), or smoking.14 Such broad criticisms, which are not empirically substantiated, could be refuted if the factors unmeasured in claims data studies were in fact balanced between treatment groups when measured in clinical data repositories, due to study design choices and high-dimensional proxy adjustment.7,15
With the widespread use of electronic medical records, subsets of patients identified in administrative claims data can be successfully linked to electronic health records (EHR), and the balance of clinical parameters not documented in claims can be assessed across exposure groups. We sought to evaluate the extent to which balance in clinical characteristics unobserved in claims data was achieved in a monitoring program of the safety and effectiveness of DOACs compared to warfarin.
Results
During the study period, we identified a total of 140,187 patients in the claims cohort (26,199 new dabigatran users, 32,595 new rivaroxaban users, 11,322 new apixaban users and 70,071 new warfarin users). From this claims-based cohort we successfully linked 1,130 dabigatran, 1,602 rivaroxaban, 637 apixaban and 2,566 warfarin users, giving a total EHR-linked subset of 5,935 anticoagulant initiators (4.2% of the total claims-based cohort). After 1:1 PS-matching within the EHR-linked subset, there were 846 dabigatran, 874 rivaroxaban, and 355 apixaban initiators (Figure 1). Patients were more often male (62%) and on average almost 70 years of age.
Claims-defined characteristics were well balanced between patients with and without available EHR data, with almost all aSD <0.1, suggesting the EHR-linked subset was representative of the overall study population (Table 1). However, patients in the linked cohort were slightly younger, had a lower prevalence of hemorrhagic stroke, and had slightly lower CHADS2 and CHA2DS2-VASc scores compared to the not-linked cohort. They also had a slightly higher number of distinct medications prescribed and a higher number of physician visits. Similarly high representativeness was found in each of the three linked DOAC cohorts (Table e1).
Table 1:
Claims-defined patient characteristics | Linked (N=5,935): N/Mean | Linked: %/SD | Not linked (N=134,252): N/Mean | Not linked: %/SD | Standardized difference (linked vs. not linked) |
---|---|---|---|---|---|
Age, years (mean, SD) | 67.4 | 11.4 | 69.5 | 12.3 | −0.17 |
Age group (N, %) | |||||
18–54 | 755 | 12.7% | 15,755 | 11.7% | 0.03 |
55–64 | 1,986 | 33.5% | 38,121 | 28.4% | 0.11 |
65–74 | 1,532 | 25.8% | 31,174 | 23.2% | 0.06 |
75+ | 1,662 | 28.0% | 49,202 | 36.6% | −0.19 |
Sex (N, %) | |||||
Male | 3,647 | 61.4% | 83,215 | 62.0% | −0.01 |
Comorbidities during baseline (N, %) | |||||
Acute renal disease | 364 | 6.1% | 9,084 | 6.8% | −0.03 |
Atherosclerosis | 1,805 | 30.4% | 40,904 | 30.5% | 0.00 |
Cancer | 1,190 | 20.1% | 24,416 | 18.2% | 0.05 |
Chronic renal insufficiency | 574 | 9.7% | 13,217 | 9.8% | −0.01 |
Miscellaneous renal insufficiency | 14 | 0.2% | 424 | 0.3% | −0.02 |
Coronary artery disease (CAD) | 1,999 | 33.7% | 45,511 | 33.9% | 0.00 |
Deep vein thrombosis (DVT) | 304 | 5.1% | 7,382 | 5.5% | −0.02 |
Diabetes | 1,585 | 26.7% | 33,724 | 25.1% | 0.04 |
Diabetic nephropathy | 154 | 2.6% | 2,919 | 2.2% | 0.03 |
Heart failure (CHF) | 977 | 16.5% | 25,043 | 18.7% | −0.06 |
Hemorrhagic stroke | 1,460 | 24.6% | 39,578 | 29.5% | −0.11 |
Hyperlipidemia | 3,120 | 52.6% | 68,747 | 51.2% | 0.03 |
Hypertension | 5,699 | 96.0% | 128,764 | 95.9% | 0.01 |
Hypertensive nephropathy | 346 | 5.8% | 8,189 | 6.1% | −0.01 |
Intracranial bleeding | 16 | 0.3% | 259 | 0.2% | 0.02 |
Ischemic stroke | 442 | 7.4% | 11,001 | 8.2% | −0.03 |
Lower/ unspecified GI bleed | 238 | 4.0% | 4,515 | 3.4% | 0.03 |
Upper GI bleed | 36 | 0.6% | 674 | 0.5% | 0.01 |
Urogenital bleed | 4 | 0.1% | 57 | 0.0% | 0.01 |
Other bleeds | 245 | 4.1% | 5,082 | 3.8% | 0.02 |
Peptic ulcer disease | 1,012 | 17.1% | 20,533 | 15.3% | 0.05 |
Peripheral vascular disease (PVD) or PVD surgery | 242 | 4.1% | 5,800 | 4.3% | −0.01 |
Previous TIA | 277 | 4.7% | 6,242 | 4.6% | 0.00 |
Prior liver disease | 277 | 4.7% | 5,182 | 3.9% | 0.04 |
Pulmonary embolism (PE) | 198 | 3.3% | 4,619 | 3.4% | −0.01 |
Recent MI | 281 | 4.7% | 6,589 | 4.9% | −0.01 |
Old MI | 243 | 4.1% | 5,765 | 4.3% | −0.01 |
Renal dysfunction | 869 | 14.6% | 20,132 | 15.0% | −0.01 |
Stroke | 512 | 8.6% | 12,835 | 9.6% | −0.03 |
Systemic embolism | 50 | 0.8% | 1,256 | 0.9% | −0.01 |
CHADS2 score (mean, SD) | 2.0 | 1.1 | 2.1 | 1.1 | −0.09 |
1 - Low risk (N, %) | 2,530 | 42.6% | 50,330 | 37.5% | 0.11 |
2 - Intermediate risk (N, %) | 1,912 | 32.2% | 45,676 | 34.0% | −0.04 |
3 - High risk (N, %) | 1,493 | 25.2% | 38,246 | 28.5% | −0.08 |
CHA2DS2-VASc score (mean, SD) | 3.1 | 1.6 | 3.3 | 1.7 | −0.11 |
1 - Low risk (N, %) | 0 | 0.0% | 0 | 0.0% | − |
2 - Intermediate risk (N, %) | 1,030 | 17.4% | 20,828 | 15.5% | 0.05 |
3 - High risk (N, %) | 4,905 | 82.6% | 113,424 | 84.5% | −0.05 |
HAS-BLED score (mean, SD) | 2.4 | 1.1 | 2.4 | 1.1 | −0.03 |
1 - Low risk (N, %) | 1,385 | 23.3% | 29,114 | 21.7% | 0.04 |
2 - Intermediate risk (N, %) | 2,200 | 37.1% | 49,876 | 37.2% | 0.00 |
3 - High risk (N, %) | 2,350 | 39.6% | 55,262 | 41.2% | −0.03 |
Medications during baseline (N, %) | |||||
Aspirin | 85 | 1.4% | 1,793 | 1.3% | 0.01 |
Aspirin/dipyridamole | 26 | 0.4% | 731 | 0.5% | −0.02 |
Clopidogrel | 649 | 10.9% | 15,727 | 11.7% | −0.02 |
Prasugrel | 49 | 0.8% | 807 | 0.6% | 0.03 |
Ticagrelor | 9 | 0.2% | 225 | 0.2% | 0.00 |
Other antiplatelet agents | 42 | 0.7% | 862 | 0.6% | 0.01 |
NSAIDs | 1,399 | 23.6% | 28,090 | 20.9% | 0.06 |
Heparin | 6 | 0.1% | 83 | 0.1% | 0.01 |
Low-molecular weight heparins | 377 | 6.4% | 9,246 | 6.9% | −0.02 |
PGP inhibitors | 3,411 | 57.5% | 74,399 | 55.4% | 0.04 |
ARB | 1,367 | 23.0% | 29,775 | 22.2% | 0.02 |
ACE inhibitor | 2,072 | 34.9% | 47,245 | 35.2% | −0.01 |
Beta blocker | 4,250 | 71.6% | 95,643 | 71.2% | 0.01 |
Calcium channel blocker | 2,415 | 40.7% | 54,623 | 40.7% | 0.00 |
Other hypertension drugs | 1,611 | 27.1% | 36,247 | 27.0% | 0.00 |
Antiarrhythmic drugs (other than amiodarone and dronedarone) | 954 | 16.1% | 16,970 | 12.6% | 0.10 |
Statin | 3,140 | 52.9% | 70,084 | 52.2% | 0.01 |
Other lipid-lowering drugs | 774 | 13.0% | 16,242 | 12.1% | 0.03 |
Diabetes medications | 1,446 | 24.4% | 31,074 | 23.1% | 0.03 |
Healthcare utilization | |||||
Hospitalization in 30 days prior to treatment initiation (N, %) | 2,126 | 35.8% | 54,237 | 40.4% | −0.09 |
Treating prescriber (N, %) | |||||
Cardiologist | 1,302 | 21.9% | 25,339 | 18.9% | 0.08 |
Primary care physician | 1,470 | 24.8% | 33,604 | 25.0% | −0.01 |
Other or unknown | 3,163 | 53.3% | 75,309 | 56.1% | −0.06 |
Number of laboratory tests ordered (mean, SD) | 17.2 | 27.8 | 14.6 | 25.6 | 0.10 |
Number of INR (prothrombin) tests ordered (mean, SD) | 1.5 | 4.0 | 1.4 | 3.9 | 0.03 |
Number of lipid tests ordered (mean, SD) | 0.9 | 1.4 | 0.8 | 1.3 | 0.11 |
Number of creatinine tests ordered (mean, SD) | 0.2 | 0.8 | 0.2 | 0.8 | −0.02 |
Number of medications (mean, SD) | 12.8 | 6.8 | 11.6 | 6.4 | 0.17 |
Number of hospitalizations (mean, SD) | 0.7 | 0.8 | 0.7 | 0.8 | −0.04 |
Number of hospital days (mean, SD) | 3.7 | 7.8 | 3.9 | 7.7 | −0.03 |
Number of office visits (mean, SD) | 14.6 | 11.2 | 12.6 | 10.3 | 0.19 |
Number of cardiologist visits (mean, SD) | 3.7 | 4.7 | 3.5 | 4.8 | 0.05 |
Number of neurologist visits (mean, SD) | 0.3 | 1.3 | 0.3 | 1.4 | 0.03 |
Indicators for region, calendar time of cohort entry and non-CV medications all had SDs of <0.1 and were omitted. PS: propensity score; SD: standard deviation; ACE: angiotensin converting enzyme; ARBs: angiotensin receptor blockers.
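The standardized differences reported in Table 1 can be illustrated with a short sketch. This section does not state the formula used, so the code below assumes the conventional definitions: the difference in proportions (or means) divided by the square root of the average of the two group variances. The function names are hypothetical and chosen for illustration only.

```python
from math import sqrt


def std_diff_binary(n1, total1, n0, total0):
    """Standardized difference for a binary characteristic.

    Assumed (conventional) definition, not confirmed by the source:
    (p1 - p0) / sqrt((p1*(1-p1) + p0*(1-p0)) / 2)
    """
    p1, p0 = n1 / total1, n0 / total0
    pooled_sd = sqrt((p1 * (1 - p1) + p0 * (1 - p0)) / 2)
    # Undefined when the characteristic is absent (or universal) in both groups.
    return (p1 - p0) / pooled_sd if pooled_sd > 0 else float("nan")


def std_diff_continuous(mean1, sd1, mean0, sd0):
    """Standardized difference for a continuous characteristic (same assumption)."""
    return (mean1 - mean0) / sqrt((sd1 ** 2 + sd0 ** 2) / 2)


# Age 75+ row of Table 1: 1,662 of 5,935 linked vs. 49,202 of 134,252 not linked.
d_age75 = std_diff_binary(1662, 5935, 49202, 134252)
print(f"{d_age75:.2f}")  # about -0.19, as tabulated

# Office visits row of Table 1, computed from the rounded means and SDs.
d_visits = std_diff_continuous(14.6, 11.2, 12.6, 10.3)
print(f"{d_visits:.2f}")
```

Because the continuous example starts from rounded means and SDs, small discrepancies in the second decimal relative to the tabulated values are to be expected.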
Even before PS-matching, reasonable balance had been achieved due to the new user active comparator design16,17 (Table e1). After PS-matching, all claims-based characteristics were very well balanced between DOAC initiators and warfarin initiators in the EHR-linked subset (Table 2).
Table 2.
Claims-defined patient characteristics | Dabigatran (N=846): N/Mean | Dabigatran: %/SD | Warfarin (N=846): N/Mean | Warfarin: %/SD | Standardized difference | Rivaroxaban (N=874): N/Mean | Rivaroxaban: %/SD | Warfarin (N=874): N/Mean | Warfarin: %/SD | Standardized difference | Apixaban (N=355): N/Mean | Apixaban: %/SD | Warfarin (N=355): N/Mean | Warfarin: %/SD | Standardized difference |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Age, years (mean, SD) | 66.8 | 11.5 | 66.9 | 11.4 | −0.01 | 69.2 | 10.6 | 68.6 | 11.4 | 0.05 | 68.7 | 11.3 | 69.1 | 10.7 | −0.03 |
Age group (N, %) | |||||||||||||||
18–54 | 114 | 13.5% | 117 | 13.8% | −0.01 | 78 | 8.9% | 93 | 10.6% | −0.06 | 43 | 12.1% | 33 | 9.3% | 0.09 |
55–64 | 284 | 33.6% | 298 | 35.2% | −0.03 | 266 | 30.4% | 265 | 30.3% | 0.00 | 99 | 27.9% | 111 | 31.3% | −0.07 |
65–74 | 218 | 25.8% | 219 | 25.9% | 0.00 | 248 | 28.4% | 241 | 27.6% | 0.02 | 95 | 26.8% | 95 | 26.8% | 0.00 |
75+ | 230 | 27.2% | 212 | 25.1% | 0.05 | 282 | 32.3% | 275 | 31.5% | 0.02 | 118 | 33.2% | 116 | 32.7% | 0.01 |
Sex (N, %) | |||||||||||||||
Male | 519 | 61.3% | 507 | 59.9% | 0.03 | 527 | 60.3% | 524 | 60.0% | 0.01 | 198 | 55.8% | 205 | 57.7% | −0.04 |
Comorbidities during baseline1 (N, %) | |||||||||||||||
Acute renal disease | 35 | 4.1% | 38 | 4.5% | −0.02 | 59 | 6.8% | 51 | 5.8% | 0.04 | 19 | 5.4% | 15 | 4.2% | 0.05 |
Atherosclerosis | 235 | 27.8% | 213 | 25.2% | 0.06 | 309 | 35.4% | 273 | 31.2% | 0.09 | 120 | 33.8% | 128 | 36.1% | −0.05 |
Cancer | 120 | 14.2% | 126 | 14.9% | −0.02 | 221 | 25.3% | 202 | 23.1% | 0.05 | 99 | 27.9% | 97 | 27.3% | 0.01 |
Chronic renal insufficiency | 53 | 6.3% | 45 | 5.3% | 0.04 | 98 | 11.2% | 90 | 10.3% | 0.03 | 43 | 12.1% | 35 | 9.9% | 0.07 |
Miscellaneous renal insufficiency | 1 | 0.1% | 1 | 0.1% | 0.00 | 2 | 0.2% | 1 | 0.1% | 0.03 | 1 | 0.3% | 1 | 0.3% | 0.00 |
Coronary artery disease (CAD) | 261 | 30.9% | 249 | 29.4% | 0.03 | 338 | 38.7% | 294 | 33.6% | 0.10 | 128 | 36.1% | 137 | 38.6% | −0.05 |
Deep vein thrombosis (DVT) | 16 | 1.9% | 19 | 2.2% | −0.02 | 53 | 6.1% | 45 | 5.1% | 0.04 | 6 | 1.7% | 7 | 2.0% | −0.02 |
Diabetes | 197 | 23.3% | 197 | 23.3% | 0.00 | 265 | 30.3% | 238 | 27.2% | 0.07 | 116 | 32.7% | 107 | 30.1% | 0.05 |
Diabetic nephropathy | 8 | 0.9% | 8 | 0.9% | 0.00 | 26 | 3.0% | 29 | 3.3% | −0.02 | 11 | 3.1% | 12 | 3.4% | −0.02 |
Heart failure (CHF) | 134 | 15.8% | 122 | 14.4% | 0.04 | 162 | 18.5% | 146 | 16.7% | 0.05 | 58 | 16.3% | 52 | 14.6% | 0.05 |
Hemorrhagic stroke | 312 | 36.9% | 307 | 36.3% | 0.01 | 114 | 13.0% | 118 | 13.5% | −0.01 | 1 | 0.3% | 0 | 0.0% | 0.08 |
Hyperlipidemia | 390 | 46.1% | 378 | 44.7% | 0.03 | 492 | 56.3% | 455 | 52.1% | 0.09 | 213 | 60.0% | 210 | 59.2% | 0.02 |
Hypertension | 811 | 95.9% | 815 | 96.3% | −0.02 | 842 | 96.3% | 841 | 96.2% | 0.01 | 348 | 98.0% | 346 | 97.5% | 0.04 |
Hypertensive nephropathy | 29 | 3.4% | 22 | 2.6% | 0.05 | 47 | 5.4% | 41 | 4.7% | 0.03 | 27 | 7.6% | 19 | 5.4% | 0.09 |
Intracranial bleeding | 4 | 0.5% | 4 | 0.5% | 0.00 | 0 | 0.0% | 0 | 0.0% | − | 1 | 0.3% | 0 | 0.0% | 0.08 |
Ischemic stroke | 53 | 6.3% | 56 | 6.6% | −0.01 | 70 | 8.0% | 69 | 7.9% | 0.00 | 25 | 7.0% | 23 | 6.5% | 0.02 |
Lower/ unspecified GI bleed | 27 | 3.2% | 24 | 2.8% | 0.02 | 38 | 4.3% | 33 | 3.8% | 0.03 | 16 | 4.5% | 16 | 4.5% | 0.00 |
Upper GI bleed | 3 | 0.4% | 5 | 0.6% | −0.03 | 4 | 0.5% | 4 | 0.5% | 0.00 | 2 | 0.6% | 4 | 1.1% | −0.06 |
Urogenital bleed | 0 | 0.0% | 0 | 0.0% | − | 0 | 0.0% | 0 | 0.0% | − | 0 | 0.0% | 0 | 0.0% | − |
Other bleeds | 19 | 2.2% | 16 | 1.9% | 0.02 | 44 | 5.0% | 51 | 5.8% | −0.04 | 14 | 3.9% | 10 | 2.8% | 0.06 |
Peptic ulcer disease | 117 | 13.8% | 113 | 13.4% | 0.01 | 174 | 19.9% | 166 | 19.0% | 0.02 | 67 | 18.9% | 69 | 19.4% | −0.01 |
Peripheral vascular disease (PVD) or PVD surgery | 30 | 3.5% | 29 | 3.4% | 0.01 | 37 | 4.2% | 34 | 3.9% | 0.02 | 18 | 5.1% | 14 | 3.9% | 0.05 |
Previous TIA | 32 | 3.8% | 36 | 4.3% | −0.02 | 41 | 4.7% | 35 | 4.0% | 0.03 | 12 | 3.4% | 11 | 3.1% | 0.02 |
Prior liver disease | 24 | 2.8% | 25 | 3.0% | −0.01 | 49 | 5.6% | 49 | 5.6% | 0.00 | 18 | 5.1% | 19 | 5.4% | −0.01 |
Pulmonary embolism (PE) | 11 | 1.3% | 9 | 1.1% | 0.02 | 30 | 3.4% | 24 | 2.7% | 0.04 | 2 | 0.6% | 2 | 0.6% | 0.00 |
Recent MI | 33 | 3.9% | 32 | 3.8% | 0.01 | 46 | 5.3% | 41 | 4.7% | 0.03 | 14 | 3.9% | 14 | 3.9% | 0.00 |
Old MI | 31 | 3.7% | 29 | 3.4% | 0.01 | 42 | 4.8% | 36 | 4.1% | 0.03 | 12 | 3.4% | 10 | 2.8% | 0.03 |
Renal dysfunction | 87 | 10.3% | 75 | 8.9% | 0.05 | 149 | 17.0% | 134 | 15.3% | 0.05 | 59 | 16.6% | 51 | 14.4% | 0.06 |
Stroke | 60 | 7.1% | 63 | 7.4% | −0.01 | 81 | 9.3% | 79 | 9.0% | 0.01 | 28 | 7.9% | 29 | 8.2% | −0.01 |
Systemic embolism | 4 | 0.5% | 4 | 0.5% | 0.00 | 10 | 1.1% | 9 | 1.0% | 0.01 | 3 | 0.8% | 3 | 0.8% | 0.00 |
CHADS2 score (mean, SD) | 1.8 | 1.0 | 1.8 | 1.0 | 0.01 | 2.1 | 1.2 | 2.0 | 1.1 | 0.07 | 2.1 | 1.2 | 2.0 | 1.1 | 0.07 |
1 - Low risk (N, %) | 389 | 46.0% | 388 | 45.9% | 0.00 | 329 | 37.6% | 340 | 38.9% | −0.03 | 138 | 38.9% | 128 | 36.1% | 0.06 |
2 - Intermediate risk (N, %) | 288 | 34.0% | 292 | 34.5% | −0.01 | 276 | 31.6% | 295 | 33.8% | −0.05 | 113 | 31.8% | 129 | 36.3% | −0.10 |
3 - High risk (N, %) | 169 | 20.0% | 166 | 19.6% | 0.01 | 269 | 30.8% | 239 | 27.3% | 0.08 | 104 | 29.3% | 98 | 27.6% | 0.04 |
CHA2DS2-VASc score (mean, SD) | 2.9 | 1.5 | 2.9 | 1.5 | 0.01 | 3.4 | 1.7 | 3.3 | 1.6 | 0.09 | 3.3 | 1.7 | 3.2 | 1.5 | 0.08 |
1 - Low risk (N, %) | 0 | 0.0% | 0 | 0.0% | − | 0 | 0.0% | 0 | 0.0% | − | 0 | 0.0% | 0 | 0.0% | − |
2 - Intermediate risk (N, %) | 160 | 18.9% | 162 | 19.1% | −0.01 | 120 | 13.7% | 125 | 14.3% | −0.02 | 42 | 11.8% | 43 | 12.1% | −0.01 |
3 - High risk (N, %) | 686 | 81.1% | 684 | 80.9% | 0.01 | 754 | 86.3% | 749 | 85.7% | 0.02 | 313 | 88.2% | 312 | 87.9% | 0.01 |
HAS-BLED score (mean, SD) | 2.2 | 1.0 | 2.2 | 1.0 | 0.02 | 2.5 | 1.1 | 2.5 | 1.1 | 0.04 | 2.5 | 1.1 | 2.4 | 1.0 | 0.06 |
1 - Low risk (N, %) | 237 | 28.0% | 230 | 27.2% | 0.02 | 166 | 19.0% | 159 | 18.2% | 0.02 | 69 | 19.4% | 60 | 16.9% | 0.07 |
2 - Intermediate risk (N, %) | 330 | 39.0% | 346 | 40.9% | −0.04 | 298 | 34.1% | 334 | 38.2% | −0.09 | 126 | 35.5% | 157 | 44.2% | −0.18 |
3 - High risk (N, %) | 279 | 33.0% | 270 | 31.9% | 0.02 | 410 | 46.9% | 381 | 43.6% | 0.07 | 160 | 45.1% | 138 | 38.9% | 0.13 |
Medications during baseline (N, %) | |||||||||||||||
Aspirin | 14 | 1.7% | 12 | 1.4% | 0.02 | 13 | 1.5% | 14 | 1.6% | −0.01 | 9 | 2.5% | 8 | 2.3% | 0.02 |
Aspirin/dipyridamole | 3 | 0.4% | 4 | 0.5% | −0.02 | 3 | 0.3% | 3 | 0.3% | 0.00 | 3 | 0.8% | 3 | 0.8% | 0.00 |
Clopidogrel | 93 | 11.0% | 89 | 10.5% | 0.02 | 115 | 13.2% | 99 | 11.3% | 0.06 | 39 | 11.0% | 44 | 12.4% | −0.04 |
Prasugrel | 6 | 0.7% | 7 | 0.8% | −0.01 | 7 | 0.8% | 8 | 0.9% | −0.01 | 6 | 1.7% | 4 | 1.1% | 0.05 |
Ticagrelor | 1 | 0.1% | 1 | 0.1% | 0.00 | 0 | 0.0% | 0 | 0.0% | − | 0 | 0.0% | 0 | 0.0% | − |
Other antiplatelet agents | 4 | 0.5% | 7 | 0.8% | −0.04 | 7 | 0.8% | 6 | 0.7% | 0.01 | 3 | 0.8% | 4 | 1.1% | −0.03 |
NSAIDs | 182 | 21.5% | 179 | 21.2% | 0.01 | 212 | 24.3% | 215 | 24.6% | −0.01 | 74 | 20.8% | 80 | 22.5% | −0.04 |
Heparin | 0 | 0.0% | 0 | 0.0% | − | 0 | 0.0% | 0 | 0.0% | − | 0 | 0.0% | 0 | 0.0% | − |
Low-molecular weight heparins | 17 | 2.0% | 25 | 3.0% | −0.06 | 24 | 2.7% | 21 | 2.4% | 0.02 | 4 | 1.1% | 6 | 1.7% | −0.05 |
PGP inhibitors | 492 | 58.2% | 475 | 56.1% | 0.04 | 496 | 56.8% | 499 | 57.1% | −0.01 | 215 | 60.6% | 204 | 57.5% | 0.06 |
ARB | 188 | 22.2% | 195 | 23.0% | −0.02 | 221 | 25.3% | 207 | 23.7% | 0.04 | 90 | 25.4% | 79 | 22.3% | 0.07 |
ACE inhibitor | 307 | 36.3% | 291 | 34.4% | 0.04 | 309 | 35.4% | 311 | 35.6% | 0.00 | 126 | 35.5% | 123 | 34.6% | 0.02 |
Beta blocker | 623 | 73.6% | 620 | 73.3% | 0.01 | 627 | 71.7% | 628 | 71.9% | 0.00 | 256 | 72.1% | 251 | 70.7% | 0.03 |
Calcium channel blocker | 345 | 40.8% | 347 | 41.0% | 0.00 | 386 | 44.2% | 389 | 44.5% | −0.01 | 162 | 45.6% | 157 | 44.2% | 0.03 |
Other hypertension drugs | 227 | 26.8% | 219 | 25.9% | 0.02 | 253 | 28.9% | 243 | 27.8% | 0.03 | 121 | 34.1% | 108 | 30.4% | 0.08 |
Antiarrhythmic drugs (other than amiodarone and dronedarone) | 136 | 16.1% | 137 | 16.2% | 0.00 | 114 | 13.0% | 132 | 15.1% | −0.06 | 54 | 15.2% | 45 | 12.7% | 0.07 |
Statin | 429 | 50.7% | 430 | 50.8% | 0.00 | 486 | 55.6% | 463 | 53.0% | 0.05 | 203 | 57.2% | 194 | 54.6% | 0.05 |
Other lipid-lowering drugs | 104 | 12.3% | 108 | 12.8% | −0.01 | 116 | 13.3% | 105 | 12.0% | 0.04 | 43 | 12.1% | 54 | 15.2% | −0.09 |
Diabetes medications | 192 | 22.7% | 191 | 22.6% | 0.00 | 231 | 26.4% | 210 | 24.0% | 0.06 | 98 | 27.6% | 83 | 23.4% | 0.10 |
Healthcare utilization | |||||||||||||||
Hospitalization in 30 days prior to treatment initiation (N, %) | 298 | 35.2% | 299 | 35.3% | 0.00 | 328 | 37.5% | 315 | 36.0% | 0.03 | 90 | 25.4% | 81 | 22.8% | 0.06 |
Treating prescriber (N, %) | |||||||||||||||
Cardiologist | 150 | 17.7% | 149 | 17.6% | 0.00 | 196 | 22.4% | 193 | 22.1% | 0.01 | 109 | 30.7% | 119 | 33.5% | −0.06 |
Primary care physician | 206 | 24.3% | 206 | 24.3% | 0.00 | 281 | 32.2% | 259 | 29.6% | 0.05 | 98 | 27.6% | 90 | 25.4% | 0.05 |
Other | 490 | 57.9% | 491 | 58.0% | 0.00 | 397 | 45.4% | 422 | 48.3% | −0.06 | 148 | 41.7% | 146 | 41.1% | 0.01 |
Number of laboratory tests ordered (mean, SD) | 15.0 | 22.8 | 15.8 | 27.7 | −0.03 | 14.2 | 19.7 | 15.0 | 26.4 | −0.03 | 15.4 | 32.8 | 14.2 | 24.2 | 0.04 |
Number of INR (prothrombin) tests ordered (mean, SD) | 0.9 | 2.7 | 1.1 | 2.5 | −0.06 | 0.6 | 2.2 | 0.8 | 2.2 | −0.07 | 0.7 | 2.7 | 0.8 | 1.9 | −0.03 |
Number of lipid tests ordered (mean, SD) | 0.9 | 1.3 | 0.9 | 1.3 | −0.05 | 0.8 | 1.2 | 0.8 | 1.3 | −0.01 | 0.9 | 1.5 | 0.8 | 1.4 | 0.07 |
Number of creatinine tests ordered (mean, SD) | 0.2 | 0.8 | 0.2 | 0.8 | 0.01 | 0.1 | 0.6 | 0.1 | 0.7 | 0.00 | 0.1 | 0.4 | 0.1 | 0.4 | 0.13 |
Number of medications (mean, SD) | 12.3 | 6.7 | 12.1 | 6.4 | 0.03 | 13.0 | 6.8 | 12.8 | 6.7 | 0.02 | 13.2 | 6.3 | 13.2 | 6.9 | 0.01 |
Number of hospitalizations (mean, SD) | 0.6 | 0.7 | 0.6 | 0.8 | −0.02 | 0.6 | 0.7 | 0.6 | 0.8 | 0.01 | 0.5 | 0.7 | 0.5 | 0.8 | 0.02 |
Number of hospital days (mean, SD) | 2.7 | 4.4 | 2.7 | 3.8 | −0.01 | 3.4 | 6.5 | 3.4 | 6.0 | 0.00 | 2.6 | 5.8 | 2.6 | 5.8 | 0.01 |
Number of office visits (mean, SD) | 13.4 | 9.9 | 13.6 | 10.6 | −0.01 | 14.2 | 9.9 | 14.3 | 10.7 | −0.01 | 14.0 | 10.2 | 13.5 | 9.2 | 0.05 |
Indicators for region, calendar time of cohort entry and non-CV medications all had SDs of <0.1 and were omitted. PS: propensity score; SD: standard deviation; ACE: angiotensin converting enzyme; ARBs: angiotensin receptor blockers.
Most EHR-based clinical patient characteristics that were unobserved in the claims data study were well balanced within each 1:1 PS-matched exposure group (Table 3). Although a relatively large proportion of patients had missing EHR information (ranging from approximately 25% to 98% depending on the variable) (Table 3 and Table e2), the missingness appeared to be non-differential between exposure groups for all variables except INR. Because INR is a component of the HAS-BLED score, its large and differential amount of missing information limits the interpretation of the HAS-BLED score derived from EHR data.
Table 3.
EHR-observed measurements | Dabigatran (N=846): N/Mean | Dabigatran: %/SD | Warfarin (N=846): N/Mean | Warfarin: %/SD | Standardized difference | Rivaroxaban (N=874): N/Mean | Rivaroxaban: %/SD | Warfarin (N=874): N/Mean | Warfarin: %/SD | Standardized difference | Apixaban (N=355): N/Mean | Apixaban: %/SD | Warfarin (N=355): N/Mean | Warfarin: %/SD | Standardized difference |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Body mass index (BMI; kg/m2) | |||||||||||||||
Missing BMI data (N, %) | 205 | 24.2% | 187 | 22.1% | 0.05 | 121 | 13.8% | 136 | 15.6% | −0.05 | 36 | 10.1% | 43 | 12.1% | −0.06 |
Patients with BMI data (N, %) | 641 | 75.8% | 659 | 77.9% | −0.05 | 753 | 86.2% | 738 | 84.4% | 0.05 | 319 | 89.9% | 312 | 87.9% | 0.06 |
BMI (mean, SD) | 32.4 | 8.3 | 32.5 | 8.0 | −0.01 | 31.8 | 7.6 | 31.9 | 7.9 | −0.02 | 31.9 | 7.4 | 32.4 | 8.1 | −0.06 |
BMI category (N, % of pts w/ BMI data) | |||||||||||||||
Underweight: BMI <18.5 | 5 | 0.8% | 3 | 0.5% | 0.04 | 9 | 1.2% | 6 | 0.8% | 0.04 | 4 | 1.3% | 3 | 1.0% | 0.03 |
Healthy weight: BMI 18.5 to <25 | 89 | 13.9% | 102 | 15.5% | −0.05 | 117 | 15.5% | 114 | 15.4% | 0.00 | 46 | 14.4% | 49 | 15.7% | −0.04 |
Overweight: BMI 25 to <30 | 179 | 27.9% | 197 | 29.9% | −0.04 | 220 | 29.2% | 227 | 30.8% | −0.03 | 97 | 30.4% | 80 | 25.6% | 0.11 |
Obese: BMI >30 | 368 | 57.4% | 357 | 54.2% | 0.07 | 407 | 54.1% | 391 | 53.0% | 0.02 | 172 | 53.9% | 180 | 57.7% | −0.08 |
Class 1 obese: BMI 30 to <35 | 188 | 29.3% | 153 | 23.2% | 0.14 | 198 | 26.3% | 187 | 25.3% | 0.02 | 76 | 23.8% | 87 | 27.9% | −0.09 |
Class 2 obese: BMI 35 to <40 | 86 | 13.4% | 99 | 15.0% | −0.05 | 122 | 16.2% | 108 | 14.6% | 0.04 | 51 | 16.0% | 48 | 15.4% | 0.02 |
Class 3 obese: BMI >40 | 94 | 14.7% | 105 | 15.9% | −0.04 | 87 | 11.6% | 96 | 13.0% | −0.04 | 45 | 14.1% | 45 | 14.4% | −0.01 |
Smoking | |||||||||||||||
Missing smoking data (N, %) | 758 | 89.6% | 749 | 88.5% | 0.03 | 767 | 87.8% | 771 | 88.2% | −0.01 | 318 | 89.6% | 315 | 88.7% | 0.03 |
Patients with smoking data (N, %) | 88 | 10.4% | 97 | 11.5% | −0.03 | 107 | 12.2% | 103 | 11.8% | 0.01 | 37 | 10.4% | 40 | 11.3% | −0.03 |
Never smoked (N, % of pts w/ smoking data) | 44 | 50.0% | 48 | 49.5% | 0.01 | 51 | 47.7% | 48 | 46.6% | 0.02 | 21 | 56.8% | 16 | 40.0% | 0.34 |
Current/past (N, % of pts w/ smoking data) | 44 | 50.0% | 49 | 50.5% | −0.01 | 56 | 52.3% | 55 | 53.4% | −0.02 | 16 | 43.2% | 24 | 60.0% | −0.34 |
Alcohol consumption | |||||||||||||||
Missing alcohol consumption data (N, %) | 832 | 98.3% | 828 | 97.9% | 0.03 | 852 | 97.5% | 853 | 97.6% | −0.01 | 350 | 98.6% | 345 | 97.2% | 0.10 |
Patients with alcohol consumption data (N, %) | 14 | 1.7% | 18 | 2.1% | −0.03 | 22 | 2.5% | 21 | 2.4% | 0.01 | 5 | 1.4% | 10 | 2.8% | −0.10 |
No consumption (N, % of pts w/ alcohol data) | 7 | 50.0% | 11 | 61.1% | −0.23 | 9 | 40.9% | 13 | 61.9% | −0.43 | 0 | 0.0% | 5 | 50.0% | −1.41 |
Light to moderate consumption2 (N, % of pts w/ alcohol data) | 1 | 7.1% | 0 | 0.0% | 0.39 | 2 | 9.1% | 1 | 4.8% | 0.17 | 0 | 0.0% | 0 | 0.0% | − |
Heavy consumption3 (N, % of patients w/ alcohol data) | 6 | 42.9% | 6 | 33.3% | 0.20 | 11 | 50.0% | 6 | 28.6% | 0.45 | 5 | 100.0% | 5 | 50.0% | 1.41 |
Consumption of unknown quantity (N, % of pts w/ alcohol data) | 0 | 0.0% | 1 | 5.6% | −0.34 | 0 | 0.0% | 1 | 4.8% | −0.32 | 0 | 0.0% | 0 | 0.0% | − |
Glomerular filtration rate (GFR; ml/min/1.73m2) | |||||||||||||||
Missing GFR data (N, %) | 405 | 47.9% | 405 | 47.9% | 0.00 | 432 | 49.4% | 431 | 49.3% | 0.00 | 161 | 45.4% | 179 | 50.4% | −0.10 |
Patients with GFR data (N, %) | 441 | 52.1% | 441 | 52.1% | 0.00 | 442 | 50.6% | 443 | 50.7% | 0.00 | 194 | 54.6% | 176 | 49.6% | 0.10 |
GFR (mean, SD) | 85.4 | 21.9 | 83.4 | 23.5 | 0.09 | 81.8 | 24.5 | 80.0 | 24.7 | 0.07 | 81.1 | 25.2 | 79.4 | 23.9 | 0.07 |
GFR category (N, % of pts w/ GFR data) | |||||||||||||||
G1: GFR >90 | 202 | 45.8% | 189 | 42.9% | 0.06 | 177 | 40.0% | 164 | 37.0% | 0.06 | 78 | 40.2% | 63 | 35.8% | 0.09 |
G2: GFR 60 to 89 | 175 | 39.7% | 178 | 40.4% | −0.01 | 175 | 39.6% | 184 | 41.5% | −0.04 | 76 | 39.2% | 80 | 45.5% | −0.13 |
G3a: GFR 45 to 59 | 43 | 9.8% | 42 | 9.5% | 0.01 | 47 | 10.6% | 52 | 11.7% | −0.04 | 21 | 10.8% | 15 | 8.5% | 0.08 |
G3b: GFR 30 to 44 | 19 | 4.3% | 17 | 3.9% | 0.02 | 35 | 7.9% | 28 | 6.3% | 0.06 | 13 | 6.7% | 14 | 8.0% | −0.05 |
G4: GFR 15 to 29 | 2 | 0.5% | 14 | 3.2% | −0.20 | 7 | 1.6% | 11 | 2.5% | −0.06 | 4 | 2.1% | 2 | 1.1% | 0.07 |
G5: GFR <15 | 0 | 0.0% | 1 | 0.2% | −0.07 | 1 | 0.2% | 4 | 0.9% | −0.09 | 2 | 1.0% | 2 | 1.1% | −0.01 |
Abnormal renal function (N, %) | 55 | 6.5% | 68 | 8.0% | −0.06 | 83 | 9.5% | 88 | 10.1% | −0.02 | 35 | 9.9% | 35 | 9.9% | 0.00 |
Duration of atrial fibrillation | |||||||||||||||
Months (mean, SD) | 22.7 | 35.2 | 25.4 | 33.0 | −0.08 | 22.7 | 34.0 | 24.0 | 33.6 | −0.04 | 18.2 | 28.6 | 23.1 | 35.0 | −0.15 |
<1 year (N, % of pts w/ atrial fibrillation) | 224 | 58.0% | 174 | 49.7% | 0.17 | 206 | 56.1% | 185 | 55.4% | 0.01 | 113 | 64.6% | 84 | 57.1% | 0.15 |
1 to <3 years | 78 | 20.2% | 85 | 24.3% | −0.10 | 74 | 20.2% | 65 | 19.5% | 0.02 | 26 | 14.9% | 31 | 21.1% | −0.16 |
3 to <5 years | 43 | 11.1% | 45 | 12.9% | −0.05 | 49 | 13.4% | 34 | 10.2% | 0.10 | 21 | 12.0% | 13 | 8.8% | 0.10 |
5+ years | 41 | 10.6% | 46 | 13.1% | −0.08 | 38 | 10.4% | 50 | 15.0% | −0.14 | 15 | 8.6% | 19 | 12.9% | −0.14 |
Duration of stroke | |||||||||||||||
Months (mean, SD) | 34.6 | 38.6 | 34.4 | 34.0 | 0.00 | 32.3 | 34.9 | 31.3 | 32.1 | 0.03 | 32.7 | 40.6 | 36.2 | 41.5 | −0.09 |
<1 year (N, % of pts w/ stroke) | 16 | 34.0% | 19 | 35.8% | −0.04 | 27 | 39.1% | 19 | 33.3% | 0.12 | 12 | 46.2% | 9 | 45.0% | 0.02 |
1 to <3 years | 17 | 36.2% | 16 | 30.2% | 0.13 | 19 | 27.5% | 22 | 38.6% | −0.24 | 7 | 26.9% | 4 | 20.0% | 0.16 |
3 to <5 years | 4 | 8.5% | 7 | 13.2% | −0.15 | 12 | 17.4% | 7 | 12.3% | 0.14 | 1 | 3.8% | 2 | 10.0% | −0.24 |
5+ years | 10 | 21.3% | 11 | 20.8% | 0.01 | 11 | 15.9% | 9 | 15.8% | 0.00 | 6 | 23.1% | 5 | 25.0% | −0.05 |
Use of antiplatelets or NSAIDs, incl. OTC use (N, %) | 145 | 17.1% | 158 | 18.7% | −0.04 | 183 | 20.9% | 184 | 21.1% | 0.00 | 74 | 20.8% | 75 | 21.1% | −0.01 |
Bleeding history or predisposition (N, %) | 31 | 3.7% | 39 | 4.6% | −0.05 | 28 | 3.2% | 34 | 3.9% | −0.04 | 8 | 2.3% | 14 | 3.9% | −0.10 |
International normalized ratio (INR) | |||||||||||||||
Missing INR data (N, %) | 747 | 88.3% | 671 | 79.3% | 0.25 | 809 | 92.6% | 691 | 79.1% | 0.39 | 327 | 92.1% | 282 | 79.4% | 0.37 |
Patients with INR data (N, %) | 99 | 11.7% | 175 | 20.7% | −0.25 | 65 | 7.4% | 183 | 20.9% | −0.39 | 28 | 7.9% | 73 | 20.6% | −0.37 |
INR (mean, SD) | 1.3 | 0.5 | 1.8 | 0.7 | −0.76 | 1.4 | 0.7 | 1.7 | 0.7 | −0.44 | 1.2 | 0.5 | 1.6 | 0.8 | −0.62 |
INR category (N, % of pts w/ INR data) | |||||||||||||||
<1 | 15 | 15.2% | 14 | 8.0% | 0.22 | 10 | 15.4% | 14 | 7.7% | 0.24 | 5 | 17.9% | 4 | 5.5% | 0.39 |
1 to <2 | 74 | 74.7% | 83 | 47.4% | 0.58 | 46 | 70.8% | 99 | 54.1% | 0.35 | 22 | 78.6% | 45 | 61.6% | 0.38 |
2 to <3 | 8 | 8.1% | 68 | 38.9% | −0.78 | 5 | 7.7% | 62 | 33.9% | −0.68 | 0 | 0.0% | 20 | 27.4% | −0.87 |
>3 | 2 | 2.0% | 10 | 5.7% | −0.19 | 4 | 6.2% | 8 | 4.4% | 0.08 | 1 | 3.6% | 4 | 5.5% | −0.09 |
HAS-BLED score | |||||||||||||||
HAS-BLED (mean, SD) | 1.5 | 1.0 | 1.5 | 1.0 | −0.06 | 1.6 | 1.0 | 1.6 | 1.0 | −0.05 | 1.6 | 1.0 | 1.7 | 1.0 | −0.04 |
<1 (N, % of all pts) | 127 | 15.0% | 115 | 13.6% | 0.04 | 118 | 13.5% | 103 | 11.8% | 0.05 | 41 | 11.5% | 36 | 10.1% | 0.05 |
1 to 2 | 607 | 71.7% | 599 | 70.8% | 0.02 | 599 | 68.5% | 610 | 69.8% | −0.03 | 246 | 69.3% | 248 | 69.9% | −0.01 |
>2 | 112 | 13.2% | 132 | 15.6% | −0.07 | 157 | 18.0% | 161 | 18.4% | −0.01 | 68 | 19.2% | 71 | 20.0% | −0.02 |
PS: propensity score; IQR: interquartile range; BMI: body mass index; HbA1c: hemoglobin A1c; eGFR: estimated glomerular filtration rate; HDL: high-density lipoprotein; LDL: low-density lipoprotein; BP: blood pressure; HAS-BLED: Labile INR defined as most recent INR <2 or >3 prior to cohort entry.
Smoking status and BMI categories were well balanced with aSDs mostly below 0.05, although smoking status was largely missing (~90%) in the EHR. Where patient numbers fell below 20 per treatment group (e.g., smoking in the apixaban cohort), chance variation is likely to have led to slightly higher imbalances, although aSDs remained below 0.40. Alcohol use was recorded in less than 5% of patients, and its balance could not be interpreted. Estimated GFR was available for about 50% of patients, and mean eGFR was well balanced across all groups. eGFR categories also showed very good balance, with aSDs all <0.10 except for a few categories with small numbers of patients. The mean duration of existing AF, which is difficult to fully assess in claims data due to left censoring, was well balanced (aSD <0.10) for dabigatran and rivaroxaban, with some imbalance (aSD=0.15) observed for the apixaban group. NSAID use, including recorded over-the-counter (OTC) use, though seemingly less completely captured in EHR than in prescription databases, was very well balanced (aSDs <0.05). INR values were recorded in less than 20% of patients, and INR measurements are more likely to be performed among warfarin users than among DOAC users. For those patients for whom INR was recorded, there was evidence of some imbalance. The HAS-BLED score for bleeding defined by EHR information was very well balanced (Table 3).
Any residual imbalances in EHR-defined clinical variables could potentially cause confounding bias. However, this bias was negligible for plausible scenarios of unmeasured confounder-outcome (RRCD) associations (Figure 2). For example, if current or past smoking truly tripled the risk of stroke (RRCD=3.0), the HR corrected for the residual imbalance would remain unchanged from the observed 0.79; similarly, if eGFR<45 truly tripled the risk of stroke (RRCD=3.0), the corrected HR would be 0.83 instead of the observed 0.79 (Figure 2, Panel A). These minimal biases would be counteracted by residual imbalances in INR values, although bias estimates are unreliable for INR due to the high proportion of missing values. The impact on the analysis of major hemorrhage is similar. If HAS-BLED >2 truly tripled the risk of major hemorrhage (RRCD=3.0), the corrected HR would be 0.77 instead of the observed 0.74 (Figure 2, Panel B).
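The sensitivity analysis above can be sketched in a few lines. This section does not give the formula behind Figure 2, so the code assumes the commonly used external-adjustment expression for a single binary unmeasured confounder: the bias factor is [P_C1(RRCD − 1) + 1] / [P_C0(RRCD − 1) + 1], where P_C1 and P_C0 are the confounder prevalences in the DOAC and warfarin groups, and the corrected HR is the observed HR divided by that factor. Function names are hypothetical, and using the dabigatran and warfarin columns of Table 3 as inputs is likewise an assumption about which comparison the examples refer to.

```python
def confounding_bias(p_c1, p_c0, rr_cd):
    """Multiplicative bias from one binary unmeasured confounder.

    Assumed formula (not stated in the source):
    (p_c1 * (rr_cd - 1) + 1) / (p_c0 * (rr_cd - 1) + 1)
    p_c1, p_c0: confounder prevalence in exposed and comparator groups.
    rr_cd: assumed confounder-outcome relative risk.
    """
    return (p_c1 * (rr_cd - 1) + 1) / (p_c0 * (rr_cd - 1) + 1)


def corrected_hr(observed_hr, p_c1, p_c0, rr_cd):
    """Observed hazard ratio divided by the assumed bias factor."""
    return observed_hr / confounding_bias(p_c1, p_c0, rr_cd)


# eGFR < 45 (G3b + G4 + G5 in Table 3): 21/441 dabigatran vs. 32/441 warfarin.
hr_stroke = corrected_hr(0.79, 21 / 441, 32 / 441, rr_cd=3.0)
print(f"{hr_stroke:.2f}")

# HAS-BLED > 2 in Table 3: 112/846 dabigatran vs. 132/846 warfarin.
hr_bleed = corrected_hr(0.74, 112 / 846, 132 / 846, rr_cd=3.0)
print(f"{hr_bleed:.2f}")
```

Under these assumptions the two calls give values that round to 0.83 and 0.77, in line with the figures quoted in the text. A prevalence that is equal in both groups yields a bias factor of 1 regardless of RRCD, which is why well-balanced EHR variables leave the estimate unchanged.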
Discussion
In a claims data cohort study of 140,187 NVAF patients newly using DOACs or warfarin, of whom 5,935 (4.2%) were linked with clinical EHR data, we found little evidence of residual confounding by EHR-recorded variables that would meaningfully bias studies on the effectiveness of stroke prevention or incidence of major hemorrhage. Furthermore, the 4% patient sample that successfully linked to EHR data was broadly representative of the much larger claims data study population, although moderate differences in terms of age and some resource use were observed. Indeed, the EHR information considered in this study is derived from the repositories of physician practices that opted to use the GE Centricity EMR system. Because that choice is not expected to be associated with the clinical characteristics of the served patient population, the availability of EHR for patients can essentially be considered “at random”. Among the patient sample linked to EHR data, even before 1:1 PS matching, claims-defined characteristics were well balanced between the respective DOAC and the warfarin groups, and balance was further improved with PS matching. After PS-matching, almost all EHR-defined patient characteristics were well balanced, although it should be recognized that some EHR variables had a substantial amount of missing data. Nevertheless, our findings indicate that it is highly unlikely that residual confounding by EHR-recorded variables would meaningfully bias the claims data analysis in the DOAC monitoring program. While this question has previously been examined, with similar conclusions, in the context of studies of glucose-lowering medications,17 this is the first time it has been examined in the context of oral anticoagulants.
We explain this reassuring conclusion by the choices made in study design, analysis planning and the specific clinical setting. We restricted the study population to patients with NVAF. It is well described that restriction can be a powerful tool to remedy confounding.18,19 Additional restriction to new users and active comparators further reduces confounding.19,20 In a new user active comparator cohort study, all patients are at a point where their treatment is initiated or escalated by starting a new drug treatment after their prescriber has evaluated their disease state and concluded it is time to change the disease management strategy.7 This design, therefore, makes patients in the comparison groups more similar in measured and unmeasured characteristics. We then used PS matching, which allowed us to adjust for a number of claims-defined covariates. Such comprehensive covariate adjustment will balance a long list of observed confounders and proxy measures of unobserved confounders even if outcome events are infrequent.15 This approach has found widespread use in healthcare database analyses because proxy measures can be defined in those data and the threat of residual confounding is high.
Implications for future claims data analyses are two-fold. First, this study confirms that with the right study design and analytic strategy, confounding can be well controlled in database studies.21,22,23 We recognize that this is context specific and the preference for an active comparison group may not match the clinical question.19 We further want to stress that our conclusion relates to confounding bias and not other biases caused by misclassification of the outcome or exposure. Second, this study illustrates a scalable approach for checking whether sufficient balance was achieved in important but unmeasured clinical parameters. As EHR databases mature in data quality and completeness, they also become increasingly linkable to large claims databases. Having a standing mechanism in place to do this type of linkage in representative patient subsets with all electronic data will expedite the process compared with medical records abstraction.21 We have previously recommended checking covariate balance in the main study and linked subsets before moving forward with the main outcome analysis.24,25 Findings of such interim validity checks, blinded towards the study results, will increase decision-makers’ confidence in the eventual study findings.9 They may also lead to the conclusion that a study should not go forward because it cannot produce valid findings, a decision equally important to avoid polluting the literature with “evidence” that is not fit for purpose.25
Conclusion:
In database studies of anticoagulation for stroke prevention, a new user active comparator design with 1:1 PS matching on many patient characteristics improved balance on important risk factors not available in claims data but measured in EHR, making confounding bias unlikely. Linking EHR data to a subset of patients in a larger claims database study is a worthwhile and scalable strategy for instilling confidence in findings from database studies.
Methods
Dabigatran monitoring program
This study was conducted in the context of a multi-year program to monitor the safety and effectiveness of dabigatran (NCT02081807, EUPAS5855). The primary outcomes of the monitoring program were stroke and major hemorrhage. The monitoring program involved repeated outcome evaluations over time, starting on October 1, 2010, coinciding with the marketing of dabigatran for stroke prevention in patients with NVAF in the US, and ending in September 2015.
Data source
The data source was the Truven MarketScan healthcare database,26 a longitudinal claims database of commercial U.S. health plans including patients enrolled in Medicare Advantage plans, employer-sponsored coverage of seniors, and Medicare supplemental insurance. The MarketScan database contains patient-level information on demographics, health plan enrollment status, records of reimbursed medical services, including inpatient and outpatient encounters with diagnosis and procedure information, and dispensed prescription drugs. For a subset of the population, claims data were linkable to EHRs from select clinics and other outpatient settings providing care to MarketScan beneficiaries. The Institutional Review Board of the Brigham and Women’s Hospital approved the study, and signed licensing agreements for use of the Truven data were in place.
Formation of claims-based study population
The study population included three pairwise cohorts of patients aged 18 years or older who initiated dabigatran versus warfarin, rivaroxaban versus warfarin, or apixaban versus warfarin, between October 1, 2010 and December 31, 2014. Patients entered the cohort on the day of their first filled prescription for any of the drugs above. Initiation was defined for each pairwise cohort as no use of any anticoagulant in the previous twelve months, and patients were required to have at least 12 months of continuous enrollment before cohort entry.
We restricted the cohort to patients with a diagnosis of NVAF, defined as an inpatient or outpatient ICD-9 CM diagnosis code of 427.31 at any point prior to drug initiation. We excluded patients with a preexisting diagnosis of valvular comorbidity, CHA2DS2-VASc score <1, or a nursing home admission in the previous twelve months.
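The new-user and enrollment criteria described above can be illustrated with a minimal Python sketch. It is not the study's actual implementation: the function name `is_new_user`, the 365-day approximation of "twelve months", and the simplification that enrollment is summarized by a single start date with no later gaps are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Assumption: "twelve months" is approximated as 365 days.
LOOKBACK = timedelta(days=365)


def is_new_user(index_date, prior_anticoagulant_fills, enrollment_start):
    """Check the new-user and continuous-enrollment criteria for one patient.

    index_date: date of the first filled prescription of a study drug.
    prior_anticoagulant_fills: dates of any anticoagulant fills before index.
    enrollment_start: start of the current enrollment span (hypothetical
        simplification: no enrollment gaps after this date).
    """
    window_start = index_date - LOOKBACK
    # No anticoagulant fill inside the look-back window before cohort entry.
    no_prior_use = all(
        not (window_start <= fill < index_date)
        for fill in prior_anticoagulant_fills
    )
    # Enrollment must cover the whole look-back window.
    continuously_enrolled = enrollment_start <= window_start
    return no_prior_use and continuously_enrolled
```

A patient with an anticoagulant fill inside the look-back window, or with enrollment starting less than 365 days before the index date, would be excluded under this sketch.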
Claims-EHR linkage and EHR-based clinical characteristics
For a subset of patients enrolled in the claims data study, insurance claims were enriched with additional data obtained through linkage with EHRs. EHR information was contributed by select clinics and other outpatient settings providing care to MarketScan beneficiaries. Probabilistic linkage was performed by Truven Health Analytics® to preserve patient privacy, including features such as year of birth, sex, three-digit ZIP code, and dates of office visits.27 EHR-defined covariates were captured prior to cohort entry and included, amongst others, health behaviors (smoking status and BMI), duration of AF (the earliest record for an AF diagnosis in the EHR), laboratory test results (INR, estimated glomerular filtration rate [eGFR]),28 and the HAS-BLED score. The HAS-BLED score was computed including labile INR, defined as the most recent INR <2 or >3 prior to cohort entry. In a sensitivity analysis, we computed the HAS-BLED score in the subgroup of patients with complete information on all HAS-BLED components. If multiple recordings of an EHR-defined covariate were available, we considered the value closest to the day of cohort entry (see eTable2 for a comprehensive list of the EHR-defined variables). Binary EHR variables capturing the presence or absence of a condition were considered to be truly absent if not recorded in the EHR.
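As an illustration of the two rules just described (taking the recording closest to cohort entry, and flagging labile INR when the most recent value is <2 or >3), a minimal Python sketch follows. The function names and the choice to count a measurement taken on the day of cohort entry as "prior" are assumptions, not details given in the text.

```python
from datetime import date


def closest_prior_value(measurements, cohort_entry):
    """Return the value recorded closest to cohort entry, looking backwards.

    measurements: list of (date, value) tuples from the EHR.
    Assumption: a measurement on the cohort entry date itself is eligible.
    Returns None when no eligible measurement exists.
    """
    prior = [(d, v) for d, v in measurements if d <= cohort_entry]
    if not prior:
        return None
    # The latest eligible date is the one closest to cohort entry.
    return max(prior, key=lambda pair: pair[0])[1]


def is_labile_inr(inr):
    """Labile INR as defined in the text: most recent INR <2 or >3."""
    if inr is None:
        return None  # INR not recorded; handling of missing values is not modeled here
    return inr < 2.0 or inr > 3.0
```

For example, with hypothetical INR values of 2.4 (January), 3.4 (March) and 2.1 (June) and cohort entry in April, the March value is selected and flagged as labile.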
PS-matching within Claims-EHR linked subset
To control for imbalances in patient characteristics between treatment groups in the EHR-linked subset, we estimated exposure propensity scores (PS) in three separate multivariable logistic regression models. The PS was the predicted probability of receiving the treatment of interest, i.e. dabigatran, rivaroxaban, or apixaban vs. warfarin, conditional on 78 claims-defined baseline characteristics (Table 1),29 identified during the twelve months before and including the cohort entry date. Emphasis was placed on the identification of claims-defined proxies of stroke and bleeding risks, including the HAS-BLED score, the CHA2DS2-VASc score, and prior history of stroke or bleeding. Other patient characteristics included demographics, presence of other comorbidities, use of medications, and indicators of health care utilization as proxies for overall disease state and care intensity. Comorbidities were defined using ICD-9 codes and CPT-4 codes. Exposure groups were 1:1 matched on their PS using nearest neighbor matching without replacement with a maximum caliper of 0.05.30 The PS was re-estimated every 6 months with matching performed within calendar quarters to account for quickly changing prescribing behaviors of newly marketed medications over time.31
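The matching step can be sketched in a few lines of Python. This is a simple greedy nearest-neighbor matcher, shown only to make the idea concrete; it is not the cited matching algorithm, and the processing order (treated patients in input order) and the assumption that the 0.05 caliper is applied on the PS probability scale are choices made for this sketch. The PS values are taken as given.

```python
def match_one_to_one(ps_treated, ps_comparator, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    ps_treated, ps_comparator: dicts mapping patient id -> propensity score.
    caliper: maximum allowed PS distance within a matched pair
        (assumed here to be on the probability scale).
    Returns a list of (treated_id, comparator_id) pairs.
    """
    available = dict(ps_comparator)  # comparators not yet used
    pairs = []
    for treated_id, treated_ps in ps_treated.items():
        if not available:
            break
        # Closest remaining comparator for this treated patient.
        best_id = min(available, key=lambda cid: abs(available[cid] - treated_ps))
        if abs(available[best_id] - treated_ps) <= caliper:
            pairs.append((treated_id, best_id))
            del available[best_id]  # without replacement
    return pairs
```

Treated patients with no comparator inside the caliper stay unmatched and drop out of the matched cohort, which is why matched cohorts are usually smaller than the initiator population.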
Statistical analysis
To assess whether the claims-EHR linked subset was representative of the overall study population, we compared claims-defined characteristics between patients for whom EHR data were available and patients without available EHR data using absolute standardized differences (aSD). To assess the potential for residual confounding after PS matching on claims-based variables, caused by clinical characteristics unobserved in claims data, we cross-tabulated the EHR-defined covariates by exposure groups and evaluated imbalances by computing aSD. An aSD greater than 0.1 was considered a meaningful imbalance.23
We quantified the potential bias associated with any residual imbalances in EHR-defined variables based on realistic scenarios of varying exposure-outcome and confounder-outcome associations.32,33 Findings from these bias models were applied to the dabigatran-stroke (HR=0.79) and dabigatran-major hemorrhage (HR=0.74) associations observed in the monitoring system.12,34
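For readers unfamiliar with the metric, the sketch below computes the absolute standardized difference using the usual definitions for binary and continuous covariates (difference in proportions or means divided by the square root of the average of the two group variances). The function names and the example numbers are hypothetical and are not taken from the study tables.

```python
import math


def asd_binary(p_treated, p_comparator):
    """aSD for a binary covariate, given its prevalence in each group."""
    pooled_var = (p_treated * (1 - p_treated)
                  + p_comparator * (1 - p_comparator)) / 2
    if pooled_var == 0:
        # Both prevalences are exactly 0 or 1.
        return 0.0 if p_treated == p_comparator else math.inf
    return abs(p_treated - p_comparator) / math.sqrt(pooled_var)


def asd_continuous(mean_treated, sd_treated, mean_comparator, sd_comparator):
    """aSD for a continuous covariate, given group means and SDs."""
    pooled_var = (sd_treated ** 2 + sd_comparator ** 2) / 2
    return abs(mean_treated - mean_comparator) / math.sqrt(pooled_var)


def is_meaningful_imbalance(asd, threshold=0.1):
    """Apply the 0.1 threshold used in the text."""
    return asd > threshold
```

With hypothetical prevalences of 30% and 25%, `asd_binary(0.30, 0.25)` is about 0.11 and would be flagged, whereas prevalences of 30% and 29% give an aSD of about 0.02.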
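The kind of bias model referred to above can be illustrated with the commonly used formula for a single binary unmeasured confounder, in which the apparent relative risk equals the true relative risk multiplied by [P_C1(RR_CD − 1) + 1] / [P_C0(RR_CD − 1) + 1], where P_C1 and P_C0 are the confounder prevalences in the two exposure groups. The Python sketch below assumes this formula; whether it matches every detail of the study's bias models is not stated in the text, and the prevalences in the usage note are hypothetical.

```python
def confounding_bias_factor(p_c1, p_c0, rr_cd):
    """Multiplicative bias from one binary unmeasured confounder.

    p_c1: confounder prevalence among the exposed (e.g., DOAC initiators).
    p_c0: confounder prevalence among the comparators (e.g., warfarin).
    rr_cd: confounder-outcome relative risk (RRCD in the text).
    """
    return (p_c1 * (rr_cd - 1) + 1) / (p_c0 * (rr_cd - 1) + 1)


def externally_adjusted_rr(observed_rr, p_c1, p_c0, rr_cd):
    """Relative risk after dividing out the assumed confounding bias."""
    return observed_rr / confounding_bias_factor(p_c1, p_c0, rr_cd)
```

When the confounder is perfectly balanced (P_C1 = P_C0), the bias factor is 1 and the observed estimate is unchanged regardless of RR_CD. With hypothetical prevalences of 20% versus 25% and RR_CD = 3.0, the bias factor is 1.4/1.5 ≈ 0.93, so an observed HR of 0.79 would be adjusted to about 0.85.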
Supplementary Material
Study Highlights
What is the current knowledge on the topic?
Claims data studies of comparative effectiveness and safety of anticoagulants are often criticized because of the lack of information on critical clinical characteristics, such as underlying bleeding risks, renal function, over-the-counter (OTC) aspirin use, body mass index (BMI), or smoking. Such criticisms could be refuted if the factors unmeasured in claims data studies were in fact balanced between treatment groups when measured in clinical data repositories, due to study design choices and high-dimensional proxy adjustment.
What question did this study address?
With the widespread use of electronic medical records, subsets of patients identified in administrative claims data can be successfully linked to electronic health records (EHR), and the balance of clinical parameters not documented in claims can be assessed across exposure groups. We sought to evaluate the extent to which balance in unmeasured patient characteristics was achieved in claims data studies by comparing against detailed clinical information available in EHR data.
What does this study add to our knowledge?
In the context of database studies of anticoagulation for stroke prevention, a new user active comparator design with 1:1 propensity score matching on many patient characteristics improved balance on important clinical risk factors not available in claims data, making confounding bias unlikely.
How might this change clinical pharmacology or translational science?
Our manuscript provides evidence that linking EHR data to a subset of patients in a larger claims database study is a worthwhile and scalable strategy for instilling confidence in findings from database studies.
Acknowledgements:
We thank Debra Irwin, Paul Juneau and Kristin Evans for their support and input at various stages of this research.
Funding: This research was supported by a research contract from Boehringer-Ingelheim to the Brigham and Women’s Hospital. The research contract granted Brigham and Women’s Hospital rights to publish all findings as well as final wording of the manuscript.
This research was supported by a research grant from Boehringer Ingelheim.
SS is consultant to WHISCON, LLC and to Aetion, Inc., a software manufacturer of which he also owns equity. He is principal investigator of investigator-initiated grants to the Brigham and Women’s Hospital from Bayer, Genentech and Boehringer Ingelheim unrelated to the topic of this study.
KH reports grant support from the National Institute of Mental Health, and is investigator on grants to the Brigham and Women’s Hospital from Eli Lilly, GlaxoSmithKline and Pfizer unrelated to the topic of this study
Footnotes
CG, JF and JL have no relevant disclosures
DB is an employee at BI X GmbH
KZ and LRF are employed at Boehringer Ingelheim International GmbH
Conflict of Interest:
References
- 1. Ageno W, Gallus AS, Wittkowsky A, Crowther M, Hylek EM, Palareti G. Oral anticoagulant therapy: Antithrombotic Therapy and Prevention of Thrombosis, 9th ed: American College of Chest Physicians Evidence-Based Clinical Practice Guidelines. Chest. 2012;141(2 Suppl):e44S–e88S.
- 2. You JJ, Singer DE, Howard PA, et al. Antithrombotic therapy for atrial fibrillation: Antithrombotic Therapy and Prevention of Thrombosis, 9th ed: American College of Chest Physicians Evidence-Based Clinical Practice Guidelines. Chest. 2012;141(2 Suppl):e531S–e575S.
- 3. Connolly SJ, Ezekowitz MD, Yusuf S, et al. Dabigatran versus warfarin in patients with atrial fibrillation. N Engl J Med. 2009;361(12):1139–51.
- 4. Patel MR, Mahaffey KW, Garg J, et al. Rivaroxaban versus warfarin in nonvalvular atrial fibrillation. N Engl J Med. 2011;365(10):883–91.
- 5. Granger CB, Alexander JH, McMurray JJ, et al. Apixaban versus Warfarin in Patients with Atrial Fibrillation. N Engl J Med. 2011.
- 6. Schneeweiss S, Avorn J. A review of uses of health care utilization databases for epidemiologic research on therapeutics. J Clin Epidemiol. 2005;58(4):323–337.
- 7. Schneeweiss S. A basic study design for expedited safety signal evaluation based on electronic healthcare data. Pharmacoepidemiol Drug Saf. 2010;19(8):858–868.
- 8. Schneeweiss S, Seeger JD, Jackson JW, Smith SR. Methods for comparative effectiveness research/patient-centered outcomes research: from efficacy to effectiveness. J Clin Epidemiol. 2013;66(8 Suppl):S1–4.
- 9. Schneeweiss S. Improving therapeutic effectiveness and safety through big healthcare data. Clin Pharmacol Ther. 2016;99(3):262–265.
- 10. Schneeweiss S, Huybrechts K, Gagne J. Interpreting the quality of health care database studies on the comparative effectiveness of oral anticoagulants in routine care. Comp Eff Res. September 2013:33.
- 11. Graham DJ, Reichman ME, Wernecke M, et al. Cardiovascular, bleeding, and mortality risks in elderly Medicare patients treated with dabigatran or warfarin for nonvalvular atrial fibrillation. Circulation. 2015;131(2):157–164.
- 12. Seeger JD, Bykov K, Bartels DB, Huybrechts K, Zint K, Schneeweiss S. Safety and Effectiveness of Dabigatran and Warfarin in Routine Care of Patients with Atrial Fibrillation: Thromb Haemost. 2015;114(6):1277–1289.
- 13. Graham DJ, Reichman ME, Wernecke M, et al. Stroke, Bleeding, and Mortality Risks in Elderly Medicare Beneficiaries Treated With Dabigatran or Rivaroxaban for Nonvalvular Atrial Fibrillation. JAMA Intern Med. 2016;176(11):1662–1671.
- 14. Potpara TS. Dabigatran in “real-world” clinical practice for stroke prevention in patients with non-valvular atrial fibrillation. Thromb Haemost. 2015;114(6):1093–1098.
- 15. Schneeweiss S, Rassen JA, Glynn RJ, Avorn J, Mogun H, Brookhart MA. High-dimensional propensity score adjustment in studies of treatment effects using health care claims data. Epidemiology. 2009;20(4):512–22.
- 16. Ray WA. Evaluating medication effects outside of clinical trials: new-user designs. Am J Epidemiol. 2003;158(9):915–920.
- 17. Patorno E, Gopalakrishnan C, Franklin JM, et al. Claims-based studies of oral glucose-lowering medications can achieve balance in critical clinical parameters only observed in electronic health records. Diabetes Obes Metab. December 2017.
- 18. Rothman KJ, Greenland S, Lash TL. Modern Epidemiology. Lippincott Williams & Wilkins; 2008.
- 19. Schneeweiss S, Patrick AR, Stürmer T, et al. Increasing levels of restriction in pharmacoepidemiologic database studies of elderly and comparison with randomized trial results. Med Care. 2007;45(10 Supl 2):S131–142.
- 20. Johnson ES, Bartman BA, Briesacher BA, et al. The incident user design in comparative effectiveness research. Pharmacoepidemiol Drug Saf. 2013;22(1):1–6.
- 21. Eng PM, Seeger JD, Loughlin J, Clifford CR, Mentor S, Walker AM. Supplementary data collection with case-cohort analysis to address potential confounding in a cohort study of thromboembolism in oral contraceptive initiators matched on claims-based propensity scores. Pharmacoepidemiol Drug Saf. 2008;17(3):297–305.
- 22. Schneeweiss S, Glynn RJ, Tsai EH, Avorn J, Solomon DH. Adjusting for unmeasured confounders in pharmacoepidemiologic claims data using external information: the example of COX2 inhibitors and myocardial infarction. Epidemiol Camb Mass. 2005;16(1):17–24.
- 23. Austin PC. Balance diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples. Stat Med. 2009;28(25):3083–3107.
- 24. Franklin JM, Rassen JA, Ackermann D, Bartels DB, Schneeweiss S. Metrics for covariate balance in cohort studies of causal effects. Stat Med. 2014;33(10):1685–1699.
- 25. Franklin JM, Schneeweiss S. When and How Can Real World Data Analyses Substitute for Randomized Controlled Trials? Clin Pharmacol Ther. 2017;102(6):924–933.
- 26. Hansen L. The Truven Health MarketScan Databases for life sciences researchers. 2017.
- 27. Huse DM. Linking insurance claims and medical records for outcome research. White paper available at: Dan.Huse@truvenhealth.com
- 28. Rule AD, Larson TS, Bergstralh EJ, Slezak JM, Jacobsen SJ, Cosio FG. Using serum creatinine to estimate glomerular filtration rate: accuracy in good health and in chronic kidney disease. Ann Intern Med. 2004;141(12):929–937.
- 29. Rubin DB. Estimating causal effects from large data sets using propensity scores. Ann Intern Med. 1997;127(8 Pt 2):757–763.
- 30. Rassen JA, Shelat AA, Myers J, Glynn RJ, Rothman KJ, Schneeweiss S. One-to-many propensity score matching in cohort studies. Pharmacoepidemiol Drug Saf. 2012;21 Suppl 2:69–80.
- 31. Gopalakrishnan C, Huybrechts KF, Shash D, et al. Characteristics of Patients Initiating Oral Anticoagulants in Routine Care: Pharmacoepidemiol Drug Saf. 2015;24:405–406.
- 32. Schneeweiss S. Sensitivity analysis and external adjustment for unmeasured confounders in epidemiologic database studies of therapeutics. Pharmacoepidemiol Drug Saf. 2006;15(5):291–303.
- 33. Psaty BM, Kuller LH, Bild D, et al. Methods of assessing prevalent cardiovascular disease in the Cardiovascular Health Study. Ann Epidemiol. 1995;5(4):270–277.
- 34. Gopalakrishnan C, Huybrechts KF, Ortiz AS, Zint K, Bartels DB, Schneeweiss S. Abstract 16498: Monitoring of Safety and Effectiveness of Dabigatran Relative to Warfarin in Routine Care. Circulation. 2017;136(Suppl 1):A16498.