Author manuscript; available in PMC: 2025 Feb 1.
Published in final edited form as: Circ Cardiovasc Qual Outcomes. 2024 Jan 19;17(2):e009986. doi: 10.1161/CIRCOUTCOMES.123.009986

A Classification Algorithm to Distinguish Between Type 1 and Type 2 Myocardial Infarction in Administrative Claims Data

Jason H Wasfy 1, Mary Price 2, Sharon-Lise T Normand 3, James L Januzzi Jr 1, Cian P McCarthy 1, John Hsu 2
PMCID: PMC11087697  NIHMSID: NIHMS1989321  PMID: 38240159

Abstract

Background:

Type 2 myocardial infarction (T2MI) and type 1 myocardial infarction (T1MI) differ with respect to demographics, comorbidities, treatments, and clinical outcomes. Reliable quality and outcomes assessment depends on the ability to distinguish between T1MI and T2MI in administrative claims data. As such, we aimed to develop a classification algorithm to distinguish between T1MI and T2MI that could be applied to claims data.

Methods:

Using data for beneficiaries in a Medicare accountable care organization contract in a large healthcare system in New England, we examined the distribution of MI diagnosis codes between 2018 and 2021 and the patterns of care and coding for beneficiaries with a hospital discharge diagnosis ICD-10 code for T2MI, compared with those for T1MI. We then assessed the probability that each hospitalization was for a T2MI vs. T1MI and examined care occurring in 2017, before the introduction of the T2MI code.

Results:

After application of inclusion and exclusion criteria, 7,759 myocardial infarctions remained (46.5% T1MI and 53.5% T2MI; mean age 79 ± 10.3 years; 47% female). In the classification algorithm, female gender (OR = 1.26, 95% CI: 1.11–1.44), Black race relative to White race (OR = 2.48, 95% CI: 1.76–3.48), and diagnoses of Covid-19 (OR = 1.74, 95% CI: 1.11–2.71) or hypertensive emergency (OR = 1.46, 95% CI: 1.00–2.14) were associated with higher odds of the hospitalization being for T2MI vs. T1MI. When applied to the testing sample, the C-statistic of the full model was 0.83. Comparison of classified T2MI and observed T2MI suggests the possibility of substantial misclassification both before and after the introduction of the T2MI code.

Conclusions:

A simple classification algorithm appears to be able to differentiate between hospitalizations for T1MI and T2MI before and after the T2MI code was introduced. This could facilitate more accurate longitudinal assessments of acute myocardial infarction quality and outcomes.

Keywords: Outcomes Assessment, Quality of Care, Myocardial Infarction

Introduction

Type 2 myocardial infarction (T2MI) is caused by a mismatch in myocardial oxygen supply and demand.1 Specific causes of T2MI include coronary artery spasm, dissection, or embolism, anemia, arrhythmias, hypertension, or hypotension.1 Over time, recognition has increased that T2MI is a phenotypically different syndrome with distinctive risk factors, pathophysiology, patient characteristics, presentation, and outcomes.2–4 Unlike traditional type 1 myocardial infarction (T1MI), the underlying cause of T2MI is not atherothrombotic disruption of coronary plaque.1 Community-based cohorts with clinical adjudication suggest the incidence of T2MI is now roughly similar to the incidence of T1MI, since rates of T1MI have declined far more rapidly than rates of T2MI.5 Superimposing these clinically-adjudicated data onto national estimates of 805,000 new and recurrent acute myocardial infarctions (AMI) each year,6 we estimate that roughly 400,000 Americans suffer T1MI and roughly 400,000 suffer T2MI each year. Unadjusted all-cause mortality is roughly double after T2MI (~60% at 5 years) compared with T1MI, with similar cardiovascular mortality (~20% at 5 years).5

Prior to the introduction of an administrative code for T2MI in October 2017, T1MI and T2MI were essentially indistinguishable in administrative claims data. Studies with clinical adjudication suggest a 50%/50% split in actual incidence in the United States.5 Indeed, within a few months of the new code, there was a comparable 50%/50% split in our study system. After the new T2MI code, the measured T1MI population nationally decreased by 13.7%, and the new population was more commonly male, younger, and less likely to have comorbidities such as heart failure or peripheral arterial disease.7 These shifts suggest that patient characteristics, treatments, and clinical outcomes differ for the 2 phenotypically distinct syndromes. In addition, even after the introduction of the T2MI code, there could continue to be misclassification of T1MI and T2MI, and the extent of misclassification between those 2 syndromes is unclear. Accurate reporting of longitudinal quality and outcomes for AMI before and after October 2017 depends on an ability to reliably distinguish T1MI and T2MI in national administrative claims data. Moreover, nearly all current AMI-related quality measures relate to T1MI and have unknown relevance for T2MI.

The development of reliable longitudinal quality and outcomes metrics for AMI first requires a better ability to distinguish between T1MI and T2MI in administrative claims data, both before and after 2018. In the context of that need, this work develops and applies a classification algorithm to distinguish between T1MI and T2MI. The overall analytic goal is to develop a model to classify the probability that each hospitalization represents “likely T2MI” when using administrative or insurance claims data, including data from before October 2017, when the T2MI code was introduced. As such, the strategy is (1) to exploit the time period after the introduction of the T2MI code to identify clinical conditions associated with T2MI as opposed to T1MI using the new International Statistical Classification of Diseases and Related Health Problems, version 10 (ICD-10) codes as a reference standard, (2) to develop a classification algorithm using the available hospital discharge information combined with demographic information to distinguish likely T2MI from T1MI, and (3) to retrospectively apply the algorithm to periods both before and after the T2MI code was introduced.

This work is our first effort at developing an algorithm that can distinguish between “likely T1MI” and “likely T2MI” in administrative claims data. Eventually, we plan to develop larger clinically adjudicated datasets linked to local claims data to validate these algorithms. Then, we plan to apply these algorithms to national claims data to create more accurate and actionable quality metrics for a more precise claims-based definition of T1MI.

Methods

The analytic methods, statistical code, and study materials will be made available to other researchers for purposes of reproducing the results or replicating the procedure. These requests can be made via email to the corresponding author. The underlying data cannot be shared because of patient privacy restrictions as well as contractual restrictions from Medicare.

Data Definitions

We defined codes now attributable to T1MI as I21.0, I21.1, I21.2, I21.3, I21.4, and I21.9, the code attributable to T2MI as I21.A1, and the code attributable to type 3–5 MI as I21.A9. We also created indicators for each of these diagnosis codes (primary or secondary) along with an indicator for primary versus secondary AMI diagnosis code. We then identified other variables including year of discharge and length of stay, plus sociodemographic characteristics such as age, age squared, biological sex, race/ethnicity, and the original reason for Medicare entitlement (age, disability, end-stage renal disease, or both disability and end-stage renal disease). We also identified codes for other conditions used in the primary position when T2MI was used as a secondary code.

Study Population

The source data came from the Mass General Brigham (MGB) enterprise data warehouse (EDW), which contains claims data on patients attributed to MGB Medicare accountable care organization (ACO) contracts. We have previously used these data for a range of research questions that require data on patients receiving care in different health systems.8–10 The claims data include all paid care, including care received outside the MGB system. The time frame of the query was January 2017 – December 2021. The specific inclusion criteria were inpatient hospitalizations with codes now attributable to T1MI, T2MI, and type 3–5 MI. For these hospitalizations, the position of the AMI code (primary or secondary diagnosis) and all other discharge diagnoses were recorded. The sample was limited to patients enrolled in the MGB Medicare ACO, and we used enrollment tables to find the number of Medicare beneficiaries assigned to the ACO in each month between 2017 and 2021. Hospitalizations with only type 3–5 MI codes were described, to give a sense of the full range of AMI codes, but then excluded from the main regression analyses. The investigators had access to the claims data, so no data cleaning was necessary, and there was no data linkage across multiple databases.

Analytic Assumptions

Developing the classification algorithms required several key assumptions. First, the analysis assumes that the new ICD-10 subtype codes accurately differentiate between type 1 and type 2 myocardial infarction within the study health system. To increase the plausibility of this assumption, the analysis includes several steps and builds upon prior work.7,11

First, the analysis assumes a potentially gradual uptake of the new ICD-10 subtype codes. For model derivation and validation, the analysis focuses on the time period after observed T1MI and T2MI event rates visually appear to reach at least a short-term equilibrium. Second, the analysis builds upon prior work11 that found a very low proportion of clinically-adjudicated T1MI among patients with T2MI codes, i.e., limited misclassification of T1MIs as T2MIs. However, the frequency of other types of potential misclassification is unknown, including the true number of clinical T2MI among patients with T1MI codes. Future studies with adjudication of clinical data will need to assess this possibility. Third, the analysis assumes that the introduction of the new T2MI code did not influence the documentation of other types of information that would be used in the classification algorithms. Variables such as the presence of serious medical conditions during the hospitalization (e.g., sepsis) also should not be influenced by the presence of the new T2MI code. Other information, however, such as the numbers of patients with any AMI codes or the positioning of the myocardial infarction ICD-10 general code, i.e., primary or secondary code, might be influenced by the introduction of the ICD-10 myocardial infarction subtype codes. In fact, prior work has shown that characteristics of patients now attributed to T1MI shifted after the development of a T2MI code.7 To address this issue, the event rates for both T1MI and T2MI after the introduction of the T2MI code are plotted over time. Based on prior results, smooth proportions of T1MI and T2MI codes over time were expected after an initial period of disequilibrium.

Statistical Analysis

This was an analysis at the level of the patient-hospitalization. First, to distinguish between T1MI and T2MI, we categorized hospitalizations with a T1MI code in any position as T1MI while hospitalizations with a T2MI code in any position and no T1MI codes present on the claim were categorized as T2MI.
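The categorization rule above can be sketched in Python as follows. This is a minimal illustration and not the authors' Stata code: the function names, the representation of a claim as a list of dotted ICD-10 code strings with the primary diagnosis first, and the prefix matching of more specific child codes (for example I21.09 under I21.0) are all assumptions made for the example. The code groups themselves are those listed under Data Definitions.

```python
# ICD-10 code groups as listed under Data Definitions.
T1MI_CODES = {"I21.0", "I21.1", "I21.2", "I21.3", "I21.4", "I21.9"}
T2MI_CODES = {"I21.A1"}
OTHER_MI_CODES = {"I21.A9"}  # type 3-5 MI


def _matches(code, group):
    """True if `code` is in `group` or is a more specific child of a group code.

    Prefix matching is an assumption of this sketch, not stated in the paper.
    """
    return any(code.startswith(g) for g in group)


def classify_hospitalization(diagnosis_codes):
    """Label one claim as 'T1MI', 'T2MI', 'Type 3-5 MI only', or None."""
    has_t1 = any(_matches(c, T1MI_CODES) for c in diagnosis_codes)
    has_t2 = any(_matches(c, T2MI_CODES) for c in diagnosis_codes)
    has_other = any(_matches(c, OTHER_MI_CODES) for c in diagnosis_codes)
    if has_t1:
        return "T1MI"  # a T1MI code in any position takes precedence
    if has_t2:
        return "T2MI"  # T2MI code present and no T1MI code on the claim
    if has_other:
        return "Type 3-5 MI only"  # described, then excluded from regression
    return None  # no AMI code on the claim


def mi_is_primary(diagnosis_codes):
    """Indicator for an AMI code in the primary (first) position."""
    all_mi = T1MI_CODES | T2MI_CODES | OTHER_MI_CODES
    return bool(diagnosis_codes) and _matches(diagnosis_codes[0], all_mi)


# Example: sepsis as the primary diagnosis with T2MI as a secondary code.
label = classify_hospitalization(["A41.9", "I21.A1"])
```

Checking for a T1MI code first encodes the precedence in the rule: a claim carrying both a T1MI and a T2MI code is labeled T1MI.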

After examining trends in T1MI and T2MI codes (Figure 1), which demonstrated rapid uptake of the new T2MI code in the final months of 2017, the dataset was split into 2 time periods by discharge date, 2017 and 2018–2021. Using the 2018–2021 discharges only, the diagnoses occurring in the primary position were then examined when a T2MI diagnosis was a secondary diagnosis.

Figure 1.


Monthly percent of Type 1 versus Type 2 MIs after the introduction of the MI subtype codes in Oct 2017

Data are presented with 95% confidence intervals. Please note that this figure is to illustrate changes immediately after the introduction of the T2MI code and this time interval does not correspond to the entire span of the study.

With the introduction of new codes, we would anticipate that some time is required before use of the codes reaches a new equilibrium. From these data, we cannot differentiate between actual clinical changes and changes in coding practice, either to explain the separation in late 2018 through early 2019 or more generally during this early time period.

The next step was to develop and validate a classification algorithm for T2MI as opposed to T1MI. We split the 2018–2021 data randomly into a training sample (80%) and a testing sample (20%). As a sensitivity analysis, we also performed the analysis with a non-random split by time, using 2018–2020 data to train the model and 2021 data to test the model. In the primary analysis, using the training sample and T2MI ICD-10 codes as the reference standard, we used logistic regression to predict the reference standard using the discharge diagnosis information. In selecting variables for the model, we initially identified the most common codes used in the primary position when T2MI is used as a secondary code. We chose these codes because we expected them to be associated with T2MI in any position, and as such, they would be expected to be influential in distinguishing between T1MI and T2MI. We also included the indicators for each of these AMI codes (primary or secondary) along with an indicator for primary versus secondary AMI diagnosis code, year of discharge, and length of stay, plus sociodemographic characteristics and the original reason for Medicare entitlement defined above. As a sensitivity analysis, we also performed the logistic regression with a backward selection procedure with a p value threshold of p ≥ 0.2 (stepwise command in Stata). We also performed a sensitivity analysis explicitly not including Covid-19 as a predictor. Given that the analysis used claims data, there were essentially no missing data aside from “unknown” for race/ethnicity, and no imputation was performed. We assessed the discrimination and calibration of the model in the testing sample by examining the C-statistic and calibration plots.
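The two ways of splitting the 2018–2021 data described above can be sketched as follows. This is a minimal Python illustration, not the authors' Stata code; the record layout (a dictionary with a `discharge_year` key), the function names, and the fixed random seed are assumptions made for the example, and the records are toy data.

```python
import random


def random_split(records, train_fraction=0.8, seed=0):
    """Shuffle a copy of `records` and split it into (train, test)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def temporal_split(records, last_training_year=2020):
    """Train on discharges through `last_training_year`, test on later ones."""
    train = [r for r in records if r["discharge_year"] <= last_training_year]
    test = [r for r in records if r["discharge_year"] > last_training_year]
    return train, test


# Toy records standing in for 2018-2021 hospitalizations (not study data).
records = [{"id": i, "discharge_year": 2018 + i % 4} for i in range(100)]

train, test = random_split(records)        # primary analysis: 80% / 20%
train_t, test_t = temporal_split(records)  # sensitivity: 2018-2020 vs 2021
```

The temporal split tests whether a model fitted on earlier years still performs on a later year, which a random split cannot show because both samples draw from the same period.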

We identified a probability threshold by plotting sensitivity and specificity within the training sample visually and picking a probability threshold that maximized both. Using that probability threshold, we then classified each hospitalization as being for T1MI or T2MI. Within the testing sample, we calculated the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Finally, we plotted the predicted rates of T1MI and T2MI for each month for the entire study time period (including both 2017 and 2018–2021), along with the observed rates for each AMI type after the introduction of the code in October 2017, using the number of beneficiaries enrolled in the Medicare ACO in each month as the denominator. We used Stata/MP 14.0. The MGB Institutional Review Board approved the study.
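The threshold selection and the four accuracy measures can be sketched as follows. This is a minimal Python illustration under stated assumptions: the paper chose the threshold visually from a plot, whereas this sketch substitutes a numeric rule (the candidate cut point that maximizes the smaller of sensitivity and specificity), and the function names and toy data are hypothetical.

```python
def confusion_metrics(y_true, y_prob, threshold):
    """Sensitivity, specificity, PPV and NPV with T2MI (1) as the positive class."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p > threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def pick_threshold(y_true, y_prob, candidates):
    """Numeric stand-in for visual selection (an assumption of this sketch):
    choose the cut point that maximizes min(sensitivity, specificity)."""
    def score(t):
        m = confusion_metrics(y_true, y_prob, t)
        return min(m["sensitivity"], m["specificity"])
    return max(candidates, key=score)


# Toy data: 1 = coded T2MI, 0 = coded T1MI; probabilities are illustrative.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_prob = [0.1, 0.3, 0.45, 0.8, 0.4, 0.65, 0.75, 0.9]

best = pick_threshold(y_true, y_prob, [0.2, 0.35, 0.5, 0.7, 0.85])
metrics = confusion_metrics(y_true, y_prob, best)
```

In practice the threshold would be chosen on the training sample and the four measures then computed on the testing sample, as described above.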

Results

Between 2017 and 2021, there were 10,479 hospitalizations with an AMI discharge diagnosis code for beneficiaries within the Medicare ACO. Of those, 946 were excluded because they did not appear in the enrollment file, and 9,533 hospitalizations remained for analysis. After initial inclusion and exclusion criteria were applied, there were 2,997 T1MI, 273 T2MI, and 2 type 3–5 MI when used as a primary diagnosis (Figure 2). There were also 6,261 cases of AMI being used as a secondary code with a non-myocardial infarction code used as the primary code. Of those, 2,137/6,261 (34.1%) had type 1 MI as a secondary code, 4,106/6,261 (65.5%) had type 2 MI as a secondary code, and 18/6,261 (0.3%) had type 3–5 MI (“other myocardial infarction”) as a secondary code.

Figure 2. Distribution of myocardial infarction subtypes by position of diagnosis code, 2017–2021.


Classification of T2MI

Using the claims-based definitions of T1MI and T2MI defined in methods and limiting to hospitalizations with discharges in 2018–2021, 3,609 T1MI hospitalizations and 4,150 T2MI hospitalizations were included in the development and validation of the classification algorithm. Demographic characteristics separated by type of AMI appear in Table 1. The most common non-MI primary diagnoses among patients with T2MI as a secondary diagnosis (i.e., candidate clinical variables for the model) as well as baseline characteristics stratified by training and testing samples appear in the Supplementary Appendix (Table S1, Table S2).

Table 1.

Patient demographics associated with T1MI and T2MI hospitalizations (2018–2021).

Characteristic | Total | Type 1 | Type 2 | SMD
Total, n | 7,759 | 3,609 | 4,150 |
Mean age (SD) | 79 (10.3) | 77 (9.9) | 80 (10.5) | −0.21
Female, n (%) | 3,639 (47%) | 1,555 (43%) | 2,084 (50%) | 0.14
Race/ethnicity, n (%) | | | | 0.18
  Unknown | 110 (1%) | 65 (2%) | 45 (1%) |
  White | 6,964 (90%) | 3,299 (91%) | 3,665 (88%) |
  Black | 380 (5%) | 112 (3%) | 268 (6%) |
  Other | 109 (1%) | 57 (2%) | 52 (1%) |
  Asian | 72 (1%) | 25 (1%) | 47 (1%) |
  Hispanic | 124 (2%) | 51 (1%) | 73 (2%) |
Original reason for entitlement, n (%) | | | | 0.02
  Age | 6,153 (79%) | 2,877 (80%) | 3,276 (79%) |
  Disabled | 1,490 (19%) | 678 (19%) | 812 (20%) |
  ESRD/ESRD & Disabled | 116 (1%) | 54 (1%) | 62 (1%) |
Mean length of stay (SD) | 8.3 (8.6) | 8.0 (8.2) | 8.5 (9.0) | −0.06
Other diagnoses on claim, n (%)
  Sepsis, unspecified organism | 798 (10%) | 232 (6%) | 566 (14%) | 0.24
  Hypertensive heart and chronic kidney disease with heart failure and stage 1 through stage 4 chronic kidney disease, or unspecified chronic kidney disease | 1,703 (22%) | 725 (20%) | 978 (24%) | 0.08
  Hypertensive heart disease with heart failure | 866 (11%) | 408 (11%) | 458 (11%) | 0.01
  Paroxysmal atrial fibrillation | 549 (7%) | 270 (7%) | 279 (7%) | 0.03
  Hypertensive heart and chronic kidney disease with heart failure and with stage 5 chronic kidney disease, or end stage renal disease | 320 (4%) | 132 (4%) | 188 (5%) | 0.04
  Sepsis due to Escherichia coli [E. coli] | 134 (2%) | 31 (1%) | 103 (2%) | 0.13
  Pneumonitis due to inhalation of food and vomit | 515 (7%) | 154 (4%) | 361 (9%) | 0.18
  COVID-19 | 190 (2%) | 43 (1%) | 147 (4%) | 0.16
  Pneumonia, unspecified organism | 742 (10%) | 272 (8%) | 470 (11%) | 0.13
  Acute kidney failure, unspecified | 2,218 (29%) | 882 (24%) | 1,336 (32%) | 0.17
  Hypertensive emergency | 237 (3%) | 83 (2%) | 154 (4%) | 0.08
  Other specified sepsis | 96 (1%) | 25 (1%) | 71 (2%) | 0.09
  Chronic obstructive pulmonary disease with (acute) exacerbation | 318 (4%) | 106 (3%) | 212 (5%) | 0.11
  Unspecified atrial fibrillation | 464 (6%) | 196 (5%) | 268 (6%) | 0.04

ESRD = end-stage renal disease

SMD = standardized mean difference
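For binary characteristics, a standardized mean difference is conventionally computed as the difference in proportions divided by the pooled standard deviation. The paper does not state which formula was used, so the version below is an assumption; the function name is hypothetical. Applied to the Table 1 counts, it gives values that round to the reported 0.14 for female sex and 0.24 for unspecified sepsis.

```python
import math


def smd_binary(events_1, total_1, events_2, total_2):
    """Standardized mean difference for a binary variable (group 2 minus group 1).

    Uses the pooled standard deviation sqrt((p1(1-p1) + p2(1-p2)) / 2).
    This formula is an assumption; the paper does not specify one.
    """
    p1 = events_1 / total_1
    p2 = events_2 / total_2
    pooled_sd = math.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / 2)
    return (p2 - p1) / pooled_sd


# Counts from Table 1: Type 1 (n = 3,609) versus Type 2 (n = 4,150).
smd_female = smd_binary(1555, 3609, 2084, 4150)
smd_sepsis = smd_binary(232, 3609, 566, 4150)
```

The sign depends on which group is subtracted from which, so only the magnitude should be compared with the table.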

In the training sample (see Table 2, Table S3, Table S4, Table S5, Table S6 for odds ratios of the full model including sensitivity analyses), female gender (OR = 1.26, 95% CI: 1.11–1.44) and Black race relative to White race (OR = 2.48, 95% CI: 1.76–3.48) were associated with higher odds of T2MI. In terms of clinical diagnoses, Covid-19 (OR = 1.74, 95% CI: 1.11–2.71) and hypertensive emergency (OR = 1.46, 95% CI: 1.00–2.14) were associated with higher odds of T2MI. The pseudo-R2 of the model was 0.32 and the deviance was 5847.85 with 6218 degrees of freedom (p = 0.9996).

Table 2.

Odds ratios in model to predict T2MI relative to T1MI

Model 1: With Primary Dx indicator

Variable | OR | 95% CI
Indicator for primary MI dx | 0.034 | 0.029–0.041
Age | 1.01 | 1.00–1.02
Age^2 | 1.00 | 1.0002–1.0010
Female vs. male | 1.26 | 1.11–1.44
Original reason for entitlement code = Aged vs. other | 0.84 | 0.69–1.01
Length of Stay | 0.99 | 0.98–1.00
Sepsis, unspecified organism | 0.99 | 0.80–1.21
Hypertensive heart and chronic kidney disease with heart failure and stage 1 through stage 4 chronic kidney disease, or unspecified chronic kidney disease | 0.87 | 0.74–1.02
Hypertensive heart disease with heart failure | 0.90 | 0.72–1.11
Paroxysmal atrial fibrillation | 1.05 | 0.81–1.36
Hypertensive heart and chronic kidney disease with heart failure and with stage 5 chronic kidney disease, or end stage renal disease | 1.14 | 0.82–1.59
Sepsis due to Escherichia coli [E. coli] | 1.24 | 0.77–1.99
Pneumonitis due to inhalation of food and vomit | 1.13 | 0.88–1.46
COVID-19 | 1.74 | 1.11–2.71
Pneumonia, unspecified organism | 1.02 | 0.82–1.27
Acute kidney failure, unspecified | 1.09 | 0.94–1.26
Hypertensive emergency | 1.46 | 1.00–2.14
Other specified sepsis | 0.85 | 0.50–1.47
Chronic obstructive pulmonary disease with (acute) exacerbation | 1.23 | 0.89–1.69
Unspecified atrial fibrillation | 1.32 | 0.99–1.76
Discharge year (ref. 2018)
  2019 | 0.92 | 0.77–1.10
  2020 | 0.75 | 0.63–0.90
  2021 | 0.71 | 0.59–0.86
Race/ethnicity (ref. White)
  Black | 2.48 | 1.76–3.48
  Hispanic | 1.11 | 0.66–1.88
  Asian | 2.06 | 0.98–4.34
  Other/Unknown race/ethnicity | 0.69 | 0.48–0.98
Constant | 2.85 | 2.27–3.57

Discharges between 2018–2021
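Because logistic regression is additive on the log-odds scale, the odds ratios in Table 2 combine multiplicatively when all other covariates are held fixed. The sketch below illustrates this in Python. It assumes the model has no interaction terms (none are reported), the function names are hypothetical, and the 50% baseline probability is purely illustrative rather than a model output.

```python
import math

# Selected odds ratios copied from Table 2.
TABLE2_OR = {
    "female": 1.26,
    "black_vs_white": 2.48,
    "covid19": 1.74,
    "primary_mi_dx": 0.034,
}


def combined_odds_ratio(*factors):
    """Multiply odds ratios by summing their logs (no interactions assumed)."""
    return math.exp(sum(math.log(TABLE2_OR[f]) for f in factors))


def shift_probability(baseline_prob, odds_ratio):
    """Apply an odds ratio to a baseline probability and return the new probability."""
    odds = baseline_prob / (1 - baseline_prob) * odds_ratio
    return odds / (1 + odds)


# Odds of T2MI for a Black woman relative to a White man, all else equal.
or_black_female = combined_odds_ratio("female", "black_vs_white")

# Effect of a primary-position MI code on an illustrative 50% baseline.
p_primary = shift_probability(0.5, TABLE2_OR["primary_mi_dx"])
```

The very small odds ratio for the primary-position indicator shows why code position dominates the classification: starting from even odds, a primary MI code alone moves the probability of T2MI to roughly 3%.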

The probability threshold that maximized sensitivity and specificity within the training sample was 70% (Figure S1). When applied to the testing sample, the C-statistic of the full model was 0.83 (95% CI = 0.81–0.85), consistent with excellent discrimination (Figures S2 and S3). Calibration plots were consistent with excellent calibration (Figure 3). Model fit was similar for the sensitivity analyses using stepwise selection and explicitly excluding Covid-19 as a predictor (Figures S4 and S5).
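The C-statistic can be read as the probability that a randomly chosen T2MI hospitalization receives a higher predicted probability than a randomly chosen T1MI hospitalization, with ties counted as one half. A minimal pairwise Python sketch follows; the function name and toy data are hypothetical, and the quadratic loop is chosen for clarity rather than speed.

```python
def c_statistic(y_true, y_prob):
    """Concordance probability over all (positive, negative) pairs."""
    positives = [p for y, p in zip(y_true, y_prob) if y == 1]
    negatives = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0  # pair ranked correctly
            elif p == n:
                concordant += 0.5  # tie counts as half
    return concordant / (len(positives) * len(negatives))


# Toy data: 1 = coded T2MI, 0 = coded T1MI; probabilities are illustrative.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_prob = [0.1, 0.3, 0.45, 0.8, 0.4, 0.65, 0.75, 0.9]
auc = c_statistic(y_true, y_prob)
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation of the two groups.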

Figure 3. Performance of classification algorithms for differentiating between type 1 and type 2 myocardial infarction.


In the training model, chi-squared p-value was 0.80, C-statistic was 0.83, cut point was > 0.7, sensitivity was 74.4%, specificity was 74.9%, positive predictive value was 77.4%, and negative predictive value was 71.7%.

In the testing model, chi-squared p-value was 0.49, C-statistic was 0.83, sensitivity was 74.4%, specificity was 74.8%, positive predictive value was 76.9%, and negative predictive value was 72.2%.
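Positive and negative predictive values depend on prevalence as well as on sensitivity and specificity. As a rough consistency check, the Python sketch below applies Bayes' rule to the reported training-sample sensitivity (74.4%) and specificity (74.9%). It assumes the T2MI prevalence in the training sample equals the overall 2018–2021 proportion (4,150/7,759), which the paper does not state; the function name is hypothetical. Under that assumption the results are close to the reported training-sample values of 77.4% and 71.7%.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed prevalence: overall share of T2MI among 2018-2021 hospitalizations.
prevalence = 4150 / 7759
ppv, npv = predictive_values(0.744, 0.749, prevalence)
```

Small differences from the reported figures are expected, because the prevalence within each random sample need not equal the overall proportion exactly.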

Observed and classified counts of T2MI per 1000 hospitalizations

Observed counts of T1MI and T2MI per 1,000 hospitalizations are shown in Figure 4, superimposed on classified counts of T1MI and T2MI. The classified counts of T1MI and T2MI can be used to infer counts of T1MI and T2MI before those diagnoses were distinguishable in claims data (October 2017). After October 2017, any differences between observed and predicted T1MI and T2MI rates in Figure 4 could suggest potential misclassification of the T1MI and T2MI codes.

Figure 4. Classified versus observed monthly myocardial infarction subtype rates per 1,000 hospitalizations over time.


This graph shows observed type 1 and type 2 MI counts over time, alongside type 1 and type 2 MI counts as classified by this model. The blue solid line represents the type 1 MI counts and the orange solid line represents the type 2 MI counts as defined in the methods section. The corresponding dotted lines represent the estimated type 1 and type 2 MI counts when using the model to estimate which cases are type 1 and type 2 (i.e., not relying on the primary billing code alone). The model suggests that the counts of the 2 syndromes may have been similar before the introduction of the type 2 myocardial infarction billing code in October 2017.

The sensitivity analysis using 2018–2020 for model development and 2021 for model testing produced similar results, which are presented in the Supplementary Appendix (Figures S5 and S6).

Discussion

In this work, we have created a classification algorithm for T1MI vs T2MI with excellent discrimination and calibration. This approach provides a key step to potentially improving quality and outcomes assessment for AMI generally and both T1MI and T2MI individually. Superimposing model-derived estimates compared with observed data suggests that many “AMI” coded before a T2MI code was available may have in fact been T2MI. Furthermore, a similar comparison from 2018 onward suggests some residual misclassification of T1MI and T2MI even after the code to distinguish them was available.

This work confirms and extends prior work suggesting potential misclassification of subtypes of AMI in administrative data. Any valid longitudinal assessment of quality and outcomes needs to minimize differences in data definitions and/or risk-adjustment over time and identify clinically homogeneous groups consistent with populations from the source trials. We have previously demonstrated a sharp discontinuity in characteristics, demographics, treatments, and outcomes associated with observed T1MI when the new T2MI code was introduced.7 We now demonstrate that, with a validated predictive model applied to administrative data before and after that transition, the predicted counts of both T1MI and T2MI are smoother, potentially consistent with more accurate identification of these 2 syndromes.

Improving the ability to detect subtypes of AMI in administrative data is critically important for several reasons. First, T2MI is common and deadly. T2MI and T1MI have similar mortality, although patients with T2MI specifically due to hypotension, anemia, or hypoxemia have worse mortality than those with T1MI.12 Second, T1MI has been declining much more quickly than T2MI as assessed in clinically-adjudicated local datasets.5 At this point, the incidence of T2MI appears similar to the incidence of T1MI.5 This is difficult to assess in national data, which entirely mix the 2 syndromes prior to October 2017. Given the increase in the proportion of AMI that are T2MI, this predictive model offers the opportunity to study quality and outcomes in large national data sets with more rigor.

Refining the detection of AMI subtypes in administrative data is also important because it starts a pathway to more accurate measurements of hospital quality. Measurement and reporting of quality and outcomes for AMI is central to national hospital rankings and value-based payment mechanisms. For example, AMI mortality is included in the Hospital Value-Based Purchasing program (HVBP)13 and AMI readmission is included in the Hospital Readmission Reduction Program (HRRP).14 In addition, the federal Care Compare public hospital star rankings incorporate both AMI mortality and readmission rates.15 Measuring overall hospital quality and adjusting payments based on claims-based measures for AMI quality, however, is potentially unreliable. AMI includes heterogeneous syndromes with different characteristics, outcomes, and pathophysiology. Better distinctions between these 2 pathophysiologically distinct syndromes create the potential for better risk-adjustment methodologies for AMI outcomes.

Disentangling T1MI from T2MI in quality metrics is particularly important, since many therapies exist for T1MI but no known therapy exists for T2MI. Conceptually, hospitals cannot improve performance on syndromes like T2MI for which there are no known effective therapies. Newer quality metrics more precisely isolating T1MI might be easier to manage. Overall, the tremendous potential of quality measurement (and then public reporting and adjustment of hospital payments) to improve quality nationally has been underrealized. For example, reductions in risk-standardized readmissions following AMI under the HRRP16 appear mostly related to increased detection of co-morbidities used in risk-adjustment17 rather than an actual decrease in AMI readmissions. HVBP is not associated with a reduction in 30-day AMI mortality.18,19 Better quality metrics that more rigorously disentangle T1MI outcomes (which hospitals can conceptually control) from T2MI outcomes (which they cannot control) may catalyze the unrealized potential of quality reporting to drive large-scale quality improvement. Ultimately, however, which hospitals would improve measured performance and which would not with a switch to more specific T1MI metrics is not yet clear.

In this work, different types of AMI in both primary and secondary positions are intentionally included, to maximally detect the incidence of AMI. It is important to be mindful, however, that inclusion in national quality metrics for readmission and mortality depends only on AMI codes as the primary diagnosis. Patients with T2MI as the primary diagnosis are not included in these metrics, although patients with T2MI misclassified as T1MI could be included.20 The model from this analysis could be used to reclassify patients with T1MI in the primary diagnosis position. Then, new mortality and readmission models could be developed, assessed, and compared to traditional models that were derived on a potentially more heterogeneous population including misclassified T2MI patients.

Since claims-based definitions of T1MI and T2MI were used as the reference standard to derive and validate the model, misclassified administrative codes for T2MI after the code was introduced could threaten the validity of this work. This includes classifying a T1MI as T2MI, incorrectly classifying a true T2MI as T1MI, or further classifying myocardial injury as an MI. It is worth noting that prior work that included clinical review and adjudication found a very low proportion of clinically-adjudicated T1MI among patients with T2MI codes.11 We do not know the frequency with which true clinical T2MI may be misclassified as T1MI even after the T2MI code was introduced. This would be an important topic for future work with linked clinical and claims data. A more substantial issue is clinical codes for T2MI being used when patients have myocardial injury, a non-AMI syndrome. This is a particularly significant issue with the advent of high sensitivity cardiac troponin assays, where detection of non-coronary myocardial injury in those with critical medical illness might be conflated with T2MI. In actual clinical practice, rigorous distinctions between myocardial injury and T2MI may not be applied consistently, leading to this form of misclassification. The T2MI code itself has previously been shown to include both T2MI and myocardial injury,11 so the prediction algorithm for T2MI may to some extent actually be a prediction algorithm for either myocardial injury or T2MI as opposed to T1MI. This is potentially consistent with the prominence of sepsis and Covid-19 among the “T2MI” hospitalizations. Additional work clinically adjudicating myocardial injury and T2MI among patients in claims data with different combinations of codes might clarify these issues. However, we are overall pessimistic that administrative data could distinguish between T2MI and myocardial injury, given the lack of clear distinctions in clinical practice. We are reassured, however, that this known type of misclassification does not affect the distinction between T1MI and T2MI, which is the main goal of this model.

The results of this study should be interpreted in the setting of important limitations. First, these methods explicitly use a claims-based reference to derive and validate a claims-based prediction model. This is an important step to establishing that claims data can differentiate phenotypes of AMI, but is not a clinical “gold standard.” Eventually, clinical adjudication of cases would be required to assess this prediction model. Second, as an analysis of claims from beneficiaries within a single accountable care organization, the extent to which the findings may be generalized is unclear. However, since these data represent care delivered in all settings to an insured population, these results are more generalizable than results from care within a single health care system. Even so, patterns may be different in more rural areas or areas with fewer cardiologists. As such, external validation of the model is planned in other settings to evaluate performance throughout the Medicare population. Furthermore, it is unclear how these results might extend outside the Medicare population (for example, in younger patients) or outside the United States. Third, claims-based definitions of AMI depend on accurate identification of clinical syndromes and coding. For example, a patient admitted with sepsis without troponin or electrocardiographic assessment may have been having T2MI without the syndrome being identified. If such patients were more likely to have conditions or demographic characteristics included in the model, this could cause a type of ascertainment bias. Finally, we have used conventional methods and we have not used more advanced methods such as natural language processing, machine learning, or artificial intelligence. We would make the distinction between available data, structure of those data, and methods to analyze those data. Conceptually, we could eventually improve the structure of the data using natural language processing and other methods. Ideally from an analytic perspective, we would have identical, detailed, repeated measurements for all patients, but in a real-world setting this would be inefficient and potentially unethical (performing tests that patients do not need). For many patients, we are using administrative claims as proxies for physiological and clinical data. Ultimately, more advanced statistical techniques might improve predictive ability, but these approaches still depend on data quality and quantity.

Conclusions

This work develops and validates a classification algorithm to distinguish between T1MI and T2MI in administrative claims, including claims from before the T2MI code was available. This algorithm could potentially be used to reformulate widely used readmission and mortality metrics for AMI that are essential to national quality assessment and value-based payments. Whether such models perform better than existing measures, or measure hospital quality more accurately, remains to be determined.

Supplementary Material

Supplemental materials

FIGURE 3A – Sensitivity and specificity of algorithm based on probability threshold.

This represents the sensitivity and specificity of the algorithm for T2MI based on probability threshold. As probability threshold approaches 100%, the algorithm approaches perfect specificity for T2MI (but approaches zero sensitivity). Conversely, as the algorithm probability threshold declines, specificity for T2MI declines but sensitivity increases.
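
The trade-off described in this caption can be illustrated with a minimal Python sketch. The function name `sensitivity_specificity` and the toy probabilities and labels are hypothetical and are not study data. The sketch assumes that a hospitalization is classified as T2MI when its predicted probability meets or exceeds the threshold.

```python
def sensitivity_specificity(probs, labels, threshold):
    """Return (sensitivity, specificity) for T2MI at a probability threshold.

    probs  : predicted probabilities that each hospitalization is T2MI
    labels : 1 for coded T2MI, 0 for coded T1MI (claims-based reference)
    Assumption: predicted T2MI when probability >= threshold.
    """
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        predicted_t2mi = p >= threshold
        if predicted_t2mi and y == 1:
            tp += 1
        elif predicted_t2mi and y == 0:
            fp += 1
        elif not predicted_t2mi and y == 1:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy data for illustration only (not from the study).
probs = [0.05, 0.20, 0.35, 0.40, 0.60, 0.65, 0.80, 0.95]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

for t in (0.1, 0.5, 0.9):
    sens, spec = sensitivity_specificity(probs, labels, t)
    print(f"threshold={t:.1f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

On these toy data, raising the threshold from 0.1 to 0.9 moves sensitivity from 1.00 to 0.25 and specificity from 0.25 to 1.00, which mirrors the pattern described for the figure.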

FIGURE 3B – Discrimination in the testing set of the model derived in the derivation set.

This shows the discrimination, in the testing set, of the model that was derived in the derivation set.
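
Discrimination is summarized in the Results by the C-statistic (0.83 for the full model in the testing sample). As an illustration only, the C-statistic can be computed as the proportion of T2MI/T1MI pairs in which the T2MI hospitalization receives the higher predicted probability, with ties counted as one half. The function name `c_statistic` and the toy data below are hypothetical, and this is a sketch of the general definition rather than the authors' code.

```python
def c_statistic(probs, labels):
    """Pairwise concordance (equivalent to the area under the ROC curve).

    probs  : predicted probabilities of T2MI
    labels : 1 for coded T2MI, 0 for coded T1MI
    """
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one T2MI and one T1MI observation")
    concordant = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                concordant += 1.0   # T2MI case ranked above T1MI case
            elif p_pos == p_neg:
                concordant += 0.5   # ties count as half
    return concordant / (len(pos) * len(neg))


# Hypothetical toy data for illustration only (not from the study).
probs = [0.05, 0.20, 0.35, 0.40, 0.60, 0.65, 0.80, 0.95]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

print(f"C-statistic = {c_statistic(probs, labels):.4f}")
```

The pairwise loop is quadratic in sample size, so for a dataset of several thousand hospitalizations a rank-based formula or an established statistics package would be the more practical choice.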

FIGURE 3C – Calibration in the test dataset.

This is the calibration of the model in the testing dataset.
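
The caption does not state how calibration was assessed. One common approach, sketched here as an assumption rather than as the authors' method, is to sort hospitalizations by predicted probability, split them into equal-sized groups, and compare the mean predicted probability of T2MI with the observed proportion of T2MI in each group. The function name `calibration_table`, the number of groups, and the toy data are hypothetical.

```python
def calibration_table(probs, labels, n_bins=2):
    """Return (mean predicted, observed proportion, n) for each risk group.

    Observations are sorted by predicted probability and split into
    n_bins groups of (nearly) equal size.
    """
    pairs = sorted(zip(probs, labels))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # skip empty groups when n < n_bins
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_pred, observed, len(chunk)))
    return rows


# Hypothetical toy data for illustration only (not from the study).
probs = [0.05, 0.20, 0.35, 0.40, 0.60, 0.65, 0.80, 0.95]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

for mean_pred, observed, size in calibration_table(probs, labels):
    print(f"n={size} mean predicted={mean_pred:.2f} observed={observed:.2f}")
```

A well-calibrated model shows mean predicted and observed values that are close within each group; in practice more groups (for example, deciles) would typically be used with a sample of this study's size.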

What is Known:

  • Quality and outcomes assessment for type 1 myocardial infarction is common and consequential.

  • However, administrative claims data has limited ability to distinguish between type 1 myocardial infarction and type 2 myocardial infarction after October 2017, and no ability before October 2017.

What the Study Adds:

  • Here, we develop a claims-based classification algorithm that is able to distinguish between type 1 and type 2 myocardial infarction with excellent discrimination.

  • This algorithm could be used to develop more accurate quality and outcomes assessment for myocardial infarction over time.

Sources of funding:

This work has been supported by a grant from the American Heart Association (18 CDA 34110215) awarded to Dr. Wasfy. Dr. McCarthy is supported by a grant from the National Institutes of Health (K23HL167659). Dr. Hsu is supported by U01AG076478.

Non-standard abbreviations and Acronyms:

T2MI: Type 2 myocardial infarction
T1MI: Type 1 myocardial infarction
AMI: Acute myocardial infarction
ICD-10: International Statistical Classification of Diseases and Related Health Problems, version 10
ACO: Accountable care organization
NPV: Negative Predictive Value
HVBP: Hospital Value Based Purchasing
HRRP: Hospital Readmission Reduction Program

Footnotes

Disclosures: Dr. McCarthy reports consulting for Abbott Laboratories. All remaining authors have no pertinent disclosures.

Supplemental Materials: Tables S1–S6, Figures S1–S6

References
