PLOS One. 2021 Jun 4;16(6):e0252903. doi: 10.1371/journal.pone.0252903

External validation of a claims-based model to predict left ventricular ejection fraction class in patients with heart failure

Mufaddal Mahesri, Kristyn Chin, Abheenava Kumar, Aditya Barve, Rachel Studer, Raquel Lahoz, Rishi J Desai
Editor: Gianluigi Savarese
PMCID: PMC8177622  PMID: 34086825

Abstract

Background

Ejection fraction (EF) is an important prognostic factor in heart failure (HF), but administrative claims databases lack information on EF. We previously developed a model to predict EF class from Medicare claims. Here, we evaluated the performance of this model in an external validation sample of commercial insurance enrollees.

Methods

Truven MarketScan claims linked to electronic medical records (EMR) data (IBM Explorys) containing EF measurements were used to identify a cohort of US patients with HF between 01-01-2012 and 10-31-2019. By applying the previously developed model, patients were classified into HF with reduced EF (HFrEF) or preserved EF (HFpEF). EF values recorded in EMR data were used to define gold-standard HFpEF (LVEF ≥45%) and HFrEF (LVEF <45%). Model performance was reported in terms of overall accuracy, positive predictive value (PPV), and sensitivity for HFrEF and HFpEF.

Results

A total of 7,001 HF patients with an average age of 71 years were identified, 1,700 (24.3%) of whom had HFrEF. An overall accuracy of 0.81 (95% CI: 0.80–0.82) was seen in this external validation sample. For HFpEF, the model had a sensitivity of 0.96 (95% CI: 0.95–0.97) and a PPV of 0.81 (95% CI: 0.81–0.82), while for HFrEF, the sensitivity was 0.32 (95% CI: 0.30–0.34) and the PPV was 0.72 (95% CI: 0.69–0.76). These results were consistent with what was previously published in US Medicare claims data.

Conclusions

The successful validation of the Medicare claims-based model provides evidence that this model may be used to identify patient subgroups with specific EF class in commercial claims databases as well.

Introduction

Ejection fraction (EF) is an important prognostic factor in heart failure (HF). HF with reduced ejection fraction (HFrEF) is well characterized and there are a number of evidence-based therapies available [1]. In contrast, HF with preserved EF (HFpEF) is more heterogeneous, poorly characterized and there are no approved therapies that improve outcomes [1].

Insurance claims databases allow for longitudinal follow-up at the patient level and are very useful for evaluating disease epidemiology and treatment outcomes in routine care [2]. However, a major limitation of claims databases in studying HF is the lack of results from procedures such as echocardiograms, which are used to measure EF. Consequently, one cannot directly distinguish between HFrEF and HFpEF based on administrative claims. To address this limitation, we previously developed a model to predict EF class using Medicare claims and validated it using electronic medical record (EMR) data from two large healthcare provider networks in the Boston metropolitan area [3]. The primary objective of the current study was to evaluate the performance of this prediction model in an external validation cohort.

Methods

Data source

Claims data derived from the Truven MarketScan database linked to EMRs from the IBM Explorys database were used. Truven MarketScan covers 235 million lives of US citizens and consists of two core claims databases: 1) MarketScan Commercial Claims and Encounters, which contains healthcare data of commercially insured individuals from the United States, encompassing employees, their spouses, and their dependents; and 2) Medicare Supplemental and Coordination of Benefits, which contains the healthcare experiences of Medicare-eligible retirees with employer-sponsored Medicare Supplemental plans. Both data sources contain longitudinally traceable information on their enrollees' medical diagnoses recorded with International Classification of Diseases, 9th and 10th Revision, Clinical Modification (ICD-9-CM/ICD-10-CM) codes, medical procedures recorded as Current Procedural Terminology (CPT) or ICD-9 procedure codes, and medication dispensing recorded using National Drug Codes (NDC). The IBM Explorys data platform is a data network that comprises integrated information from 360 hospitals and approximately 31,700 providers, covering approximately 50 million patient lives. The Explorys data have been used in multiple prior observational studies [4–7] and contain data derived from ambulatory electronic medical records (EMRs), inpatient EMRs, laboratory, pharmacy, health plans, billing and accounting, data warehouses, patient portals, satisfaction surveys, and care management systems. The MarketScan and Explorys linked population represents approximately 10% of the total MarketScan population.

Study design

Adult patients were included in the study if they had ≥1 diagnosis code for HF (ICD-9 or ICD-10) in the Truven MarketScan claims database after 6 months of continuous enrollment in their health plans and ≥1 recorded EF result in the IBM Explorys EMR database within 6 months before or 1 month after the HF diagnosis date. The study period ran from January 1, 2012 to October 31, 2019, and the HF diagnosis date successfully paired with a qualifying EF measurement was defined as the cohort entry date. The study protocol was approved by the Brigham and Women’s Hospital Institutional Review Board. The Institutional Review Board committee waived the requirement for informed consent. This is a retrospective cohort study using a HIPAA de-identified dataset and individuals cannot be identified from the data.
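The EF-pairing rule above can be illustrated with a minimal sketch. It assumes the day-based version of the window reported in the Results (180 days before to 30 days after the HF diagnosis date) and treats both boundaries as inclusive, which the text does not specify. The function name `has_qualifying_ef` and the example dates are hypothetical.

```python
from datetime import date, timedelta

# Window bounds as reported in the Results; inclusive boundaries are an assumption.
LOOKBACK = timedelta(days=180)
LOOKAHEAD = timedelta(days=30)


def has_qualifying_ef(hf_dx_date, ef_dates):
    """Return True if any EF measurement date falls inside the pairing window."""
    start = hf_dx_date - LOOKBACK
    end = hf_dx_date + LOOKAHEAD
    return any(start <= d <= end for d in ef_dates)


if __name__ == "__main__":
    dx = date(2016, 3, 15)  # hypothetical HF diagnosis date
    print(has_qualifying_ef(dx, [date(2015, 12, 1)]))  # EF about 3.5 months earlier
    print(has_qualifying_ef(dx, [date(2016, 6, 1)]))   # EF more than 30 days later
```

An HF diagnosis date for which this check succeeds would become a candidate cohort entry date; enrollment and age criteria would be applied separately.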

Model validation

A patient-level analytic data file with information on the predictor variables (S1 Table) was created for the whole cohort of eligible HF patients from the MarketScan-Explorys linked dataset. All predictors were measured in the period from 6 months before to 1 month after cohort entry. Using the regression coefficient for each individual predictor variable and the y-intercept of the model (as reported in Desai et al. [3]), we estimated the probability of each patient belonging to HFrEF or HFpEF and classified patients into one of these two classes using the recommended cut-off. We used EF data from IBM Explorys to define the gold-standard classification into HFpEF (LVEF ≥45%) and HFrEF (LVEF <45%). When more than one EF result was available, the value recorded on the day closest to the cohort entry date was used to define the gold standard. The predicted classification was compared against the gold standard to complete this validation exercise.
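The scoring and gold-standard steps can be sketched as follows. This is an illustration under stated assumptions, not the published model: the intercept, coefficients, predictor names, and cut-off below are hypothetical placeholders (the real values are reported in Desai et al. [3] and S1 Table), and the sketch assumes the modeled outcome is HFrEF. Only the 45% LVEF threshold and the closest-measurement rule come from the text.

```python
import math
from datetime import date

# Hypothetical values for illustration only; not the published coefficients.
INTERCEPT = -1.0
COEFS = {"male": 0.5, "cardiomyopathy": 1.5, "diastolic_hf_code": -1.2}
CUTOFF = 0.5  # hypothetical stand-in for the recommended cut-off


def predict_hfref_probability(features):
    """Logistic model: inverse logit of intercept plus weighted predictors."""
    linear = INTERCEPT + sum(COEFS[name] * features.get(name, 0) for name in COEFS)
    return 1.0 / (1.0 + math.exp(-linear))


def predicted_class(features):
    """Classify as HFrEF when the predicted probability reaches the cut-off."""
    return "HFrEF" if predict_hfref_probability(features) >= CUTOFF else "HFpEF"


def gold_standard_class(ef_results, entry_date):
    """Use the EF value closest in time to cohort entry; LVEF >= 45% is HFpEF.

    ef_results is a list of (measurement_date, lvef_percent) tuples.
    Ties are broken by list order, which is an assumption.
    """
    _, lvef = min(ef_results, key=lambda r: abs((r[0] - entry_date).days))
    return "HFpEF" if lvef >= 45 else "HFrEF"


if __name__ == "__main__":
    patient = {"male": 1, "cardiomyopathy": 1}
    ef = [(date(2016, 1, 1), 30), (date(2016, 3, 10), 50)]
    print(predicted_class(patient), gold_standard_class(ef, date(2016, 3, 15)))
```

Comparing `predicted_class` with `gold_standard_class` for every patient yields the 2×2 table from which the performance measures are computed.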

Statistical analysis

Patient characteristics, including demographics, HF-related variables (e.g., diagnosis code recorded for HF, HF-related hospitalizations), HF-related medications, and various co-morbid conditions (e.g., hyperlipidemia, hypertension, cardiomyopathy), were described stratified by HFrEF or HFpEF for this validation cohort. We calculated overall accuracy (correct classification rate = number of accurate predictions/number of total predictions), positive predictive value (the probability of being a true case, given the algorithm's prediction), and sensitivity (the probability that a true case of a specific HF class is identified as such by the algorithm), along with 95% confidence intervals. Further, the performance of this model was also tested in the following pre-specified subgroups: males and females, age <65 and ≥65 years, and index date prior to October 2015 (ICD-9 period) and after October 2015 (ICD-10 period). It should be noted that we allowed multiple entries in the cohort; therefore, some patients may contribute to both the ICD-9 and ICD-10 period subgroups. We also described patient characteristics in categories of patients accurately and inaccurately classified by our model to characterize misclassified populations.
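The three performance measures defined above can be computed from a 2×2 table as in the sketch below. The counts in the example are hypothetical and are not taken from this study. The text does not state which confidence interval method was used, so the Wilson score interval here is an assumption chosen for illustration.

```python
import math


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed method, ~95% for z=1.96)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def performance(tp, fp, fn, tn):
    """Accuracy, PPV, and sensitivity for one class, each with its interval.

    tp/fp/fn/tn are counts for the class of interest (e.g., HFrEF as 'positive').
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": ((tp + tn) / total, wilson_ci(tp + tn, total)),
        "ppv": (tp / (tp + fp), wilson_ci(tp, tp + fp)),
        "sensitivity": (tp / (tp + fn), wilson_ci(tp, tp + fn)),
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    for name, (estimate, (low, high)) in performance(80, 20, 120, 780).items():
        print(f"{name}: {estimate:.2f} ({low:.2f}-{high:.2f})")
```

Swapping which class is treated as "positive" gives the corresponding PPV and sensitivity for the other EF class, while overall accuracy is unchanged.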

Results

Study cohort

We identified 157,203 patients with at least 1 HF diagnosis following 6 months of continuous eligibility for medical and pharmacy benefits. Of these patients, we included 7,001 who were at least 18 years old at the cohort entry date and who had at least one EF result available between 180 days before and 30 days after the index date. Details of the cohort construction are provided in Fig 1.

Fig 1. Cohort consort diagram.

Table 1 contains data on baseline characteristics by EF class identified via the gold-standard criteria using EMR-recorded EF values. We identified 5,301 patients as HFpEF and 1,700 patients as HFrEF. The average age was similar across the two groups (HFpEF = 71 years vs HFrEF = 69 years), while males comprised 68% of the HFrEF group compared to 51% of the HFpEF group. The mean (SD) EF was 59% (7) in the HFpEF group and 32% (9) in the HFrEF group.

Table 1. Baseline characteristics of HF patients stratified by ejection fraction class (HFrEF, < 0.45; or HFpEF, ≥ 0.45).

| Variable | Gold standard HFrEF (N = 1,700), N (%) | Gold standard HFpEF (N = 5,301), N (%) |
| --- | --- | --- |
| Mean LVEF in % (SD) | 32 (9) | 59 (7) |
| Demographics | | |
| Male | 1152 (67.8) | 2687 (50.7) |
| Age in years, mean (SD) | 69.2 (14.0) | 70.6 (13.7) |
| HF-related variables | | |
| HF-specific ICD-9 and ICD-10 codes | | |
| Systolic HF | 657 (38.6) | 476 (9.0) |
| Diastolic HF | 83 (4.9) | 1360 (25.7) |
| Left HF | 94 (5.5) | 239 (4.5) |
| Unspecified HF | 790 (46.5) | 2930 (55.3) |
| HF hospitalizations, mean (SD) | 0.2 (0.4) | 0.08 (0.3) |
| Implantable cardioverter-defibrillator | 245 (14.4) | 111 (2.1) |
| HF diagnosis identified in outpatient claims | 886 (52.1) | 3146 (59.3) |
| HF-related medication use | | |
| ACE inhibitors | 968 (56.9) | 2108 (39.8) |
| Mineralocorticoid receptor antagonists | 389 (22.9) | 467 (8.8) |
| Beta blockers | 998 (58.7) | 2587 (48.8) |
| Digoxin | 101 (5.9) | 118 (2.2) |
| Loop diuretics | 952 (56.0) | 2489 (46.9) |
| Nitrates | 285 (16.8) | 519 (9.8) |
| Thiazide diuretics | 629 (37.0) | 1581 (29.8) |
| Comorbidities | | |
| Atrial fibrillation or flutter | 723 (42.5) | 1956 (36.9) |
| Anemia | 583 (34.3) | 2121 (40.0) |
| Coronary artery bypass graft | 132 (7.8) | 292 (5.5) |
| Cardiomyopathy | 789 (46.4) | 572 (10.8) |
| Chronic obstructive pulmonary disease | 422 (24.8) | 1539 (29.0) |
| Depression | 209 (12.3) | 941 (17.7) |
| Hypertensive nephropathy | 241 (14.2) | 772 (14.6) |
| Hyperlipidemia | 1063 (62.5) | 3356 (63.3) |
| Hypertension | 1365 (80.3) | 4375 (82.5) |
| Hypotension | 293 (17.2) | 811 (15.3) |
| Myocardial infarction | 436 (25.6) | 608 (11.5) |
| Obesity | 324 (19.1) | 1277 (24.1) |
| Other dysrhythmias | 1002 (58.9) | 2469 (46.6) |
| Psychosis | 539 (31.7) | 1964 (37.0) |
| Rheumatic heart disease | 260 (15.3) | 994 (18.7) |
| Sleep apnea | 235 (13.8) | 950 (17.9) |
| Stable angina | 215 (12.6) | 540 (10.2) |
| Valve disorders | 278 (16.3) | 1148 (21.7) |

Performance of the HF model

The model showed an overall accuracy of 0.81 (95% CI: 0.80–0.82). For HFpEF, the model had sensitivity of 0.96 (95% CI, 0.95–0.97) and PPV of 0.81 (95% CI, 0.81–0.82); while for HFrEF, the sensitivity was 0.32 (95% CI, 0.30–0.34) and PPV was 0.72 (95% CI, 0.69–0.76).

The overall accuracy was similar across the different subgroups; however, some variation was observed between the sex subgroups. The overall accuracy was higher among female patients than among male patients, due to a higher sensitivity and PPV for HFpEF, while the model performed better for HFrEF in the male subgroup. The model demonstrated very similar performance when using ICD-9 coded HF diagnoses compared to ICD-10 coded HF diagnoses. This was an important finding, as the original model was developed using ICD-9 codes only, and these findings support its use in data containing both ICD-9 and ICD-10 diagnosis codes. Details of the performance of the primary model as well as the subgroup analyses are presented in Table 2. Patient characteristics in categories of patients accurately and inaccurately classified by our model are summarized for both HFrEF and HFpEF in S2 Table.

Table 2. Primary analysis and subgroup-specific performance.

| Analysis | Overall accuracy (95% CI) | HFrEF: positive predictive value (95% CI) | HFrEF: sensitivity (95% CI) | HFpEF: positive predictive value (95% CI) | HFpEF: sensitivity (95% CI) |
| --- | --- | --- | --- | --- | --- |
| Primary analysis | 0.81 (0.80–0.82) | 0.72 (0.69–0.76) | 0.32 (0.30–0.34) | 0.81 (0.81–0.82) | 0.96 (0.95–0.97) |
| Subgroup 1: Age 65–75 y | 0.80 (0.78–0.82) | 0.73 (0.66–0.80) | 0.32 (0.28–0.37) | 0.81 (0.79–0.83) | 0.96 (0.95–0.97) |
| Subgroup 2: Age 75 y and older | 0.80 (0.79–0.82) | 0.73 (0.66–0.79) | 0.20 (0.17–0.23) | 0.81 (0.79–0.82) | 0.98 (0.97–0.98) |
| Subgroup 3: Males | 0.77 (0.75–0.78) | 0.73 (0.69–0.77) | 0.35 (0.32–0.38) | 0.77 (0.76–0.79) | 0.95 (0.94–0.95) |
| Subgroup 4: Females | 0.85 (0.84–0.86) | 0.70 (0.63–0.76) | 0.27 (0.23–0.30) | 0.86 (0.85–0.88) | 0.98 (0.97–0.98) |
| Subgroup 5: Entry HF diagnosis in inpatient claims | 0.80 (0.78–0.81) | 0.76 (0.72–0.80) | 0.37 (0.34–0.40) | 0.80 (0.78–0.82) | 0.96 (0.95–0.96) |
| Subgroup 6: Entry HF diagnosis in outpatient claims | 0.81 (0.80–0.82) | 0.68 (0.63–0.73) | 0.28 (0.25–0.31) | 0.83 (0.81–0.84) | 0.96 (0.96–0.97) |
| Subgroup 7: ICD-9 coded HF | 0.80 (0.78–0.81) | 0.72 (0.66–0.77) | 0.28 (0.25–0.32) | 0.80 (0.79–0.82) | 0.96 (0.96–0.97) |
| Subgroup 8: ICD-10 coded HF | 0.79 (0.78–0.80) | 0.72 (0.68–0.75) | 0.34 (0.31–0.36) | 0.80 (0.79–0.81) | 0.95 (0.95–0.96) |

Discussion

As EF information is unavailable in administrative claims databases, it is important to develop claims-based models that can be used as a proxy to identify EF classes in patients with HF. In this external validation study, we assessed the accuracy of a claims-based model to predict EF class developed in Medicare data, by applying it to commercial claims data to establish generalizability of this model outside of Medicare claims.

The performance with commercial claims was comparable to the performance previously reported for the internal validation sample using Medicare claims [3]. In this study, we observed a sensitivity of 0.96 and a PPV of 0.81 in identifying HFpEF patients. This is very similar to what was reported by Desai et al. in Medicare claims data (sensitivity of 0.97, PPV of 0.84). For HFrEF patients, a substantially lower sensitivity (0.32) and a relatively lower PPV (0.72) were seen, which is also consistent with what was previously published (sensitivity of 0.29, PPV of 0.73).

We want to emphasize certain cautions that must be weighed carefully when using this model to identify EF classes in HF. First, the low sensitivity in identifying HFrEF would result in a considerable proportion of the sample being lost. Further, the group that is identified as HFrEF may differ systematically from the group that is misclassified by the model. On comparing the accurately classified HFrEF patients (547) with the misclassified HFrEF patients (1,153), we observed that EF was lower in accurately classified patients (average of 29% versus 33%, S2 Table). Compared to the misclassified HFrEF patients, the accurately classified HFrEF patients showed a higher prevalence of HF-related comorbidities, such as cardiomyopathy (85% versus 28%), myocardial infarction (34% versus 21%), and other dysrhythmias (67% versus 55%). Thus, patients identified as HFrEF by this model represent a sicker group. However, despite the low sensitivity, a recent study describing the epidemiology of HFrEF patients identified in claims data based on this algorithm showed characteristics and outcome trajectories for HFrEF patients that closely resembled other well-characterized population-based cohorts [8].
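As a worked check of the first caution, using only the counts quoted above (547 accurately classified and 1,153 misclassified gold-standard HFrEF patients), the HFrEF sensitivity and the share of true HFrEF patients the model misses are:

```latex
\[
\text{Sensitivity}_{\text{HFrEF}} = \frac{547}{547 + 1153} = \frac{547}{1700} \approx 0.32,
\qquad
1 - \text{Sensitivity}_{\text{HFrEF}} = \frac{1153}{1700} \approx 0.68 .
\]
```

In other words, roughly two of every three gold-standard HFrEF patients would not be captured in a cohort selected by the model's HFrEF prediction.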

Determining EF class based on information routinely available in electronic data sources has received increasing recognition in recent years as a way to address the frequent unavailability of EF values. A retrospective cohort study among Minnesota residents evaluated a claims-based approach to identify HFpEF patients based on HF diagnosis codes in combination with laboratory orders for BNP/NT-proBNP and achieved a PPV of 84% [9]. The higher PPV for HFpEF reported in our study is likely explained by the inclusion of frequent comorbid conditions and demographics in addition to diagnosis codes. A second study by Uijl et al., based on data from the Swedish Heart Failure Registry, used an approach similar to ours, in which 22 predictors, including laboratory results such as NT-proBNP and renal function, demographics such as age and sex, and comorbid conditions, were used to classify patients into HFpEF and HFrEF [10]. The authors noted discrimination of 0.73 for this model in an external validation cohort. Results from our study, along with the study by Uijl et al., suggest that model-based computable phenotyping of HF patients may provide a useful way to identify and study HF subtypes from electronic healthcare sources where EF values are not available. Our model may be more applicable in the context of US insurance claims databases, which often lack information such as laboratory test results.

Some limitations deserve mention. Although EMR data include rich clinical information, a high amount of missing data is to be expected. Consequently, generalizability may be limited if the patients with recorded EF values are not representative of the full HF population. Further, in the clinical setting, the diagnosis of HFpEF is typically a diagnosis of exclusion and may require confirmatory information about structural changes of the heart, beyond EF alone. Consequently, even though EF improves the accuracy of the diagnosis, there might be false positive HFpEF patients. Finally, we used a cutpoint of 45% to differentiate between HFrEF and HFpEF, which meant we did not attempt to identify moderately reduced (mr) EF (40–49%). As noted by Desai et al., the EF cutpoint of 45% implemented in our model is based on two major considerations. Firstly, a cutoff at either end of the 40% to 50% EF range would place all patients with mrEF (40–49%) within a single group, which could meaningfully change the distribution of patient characteristics selectively in one of the two groups and make prediction difficult. Secondly, as EF changes over time are common in both pEF and rEF, a cutpoint at the mid-point of the mrEF range (45%) might more accurately capture those mrEF patients who either have recovered EF from initial rEF or reduced EF from initial pEF. Moreover, the 45% cutoff has also been used in multiple pivotal trials of HF patients (e.g., the TOPCAT trial for pEF patients and the VICTORIA trial for rEF patients) [11, 12].

In conclusion, results from this study provide evidence regarding the generalizability of an approach using claims data to identify EF classes in HF patients outside of Medicare claims. This will aid future studies evaluating health outcomes, healthcare utilization as well as cost of care among HF patients in routine care when EF measurements are not available.

Supporting information

S1 Fig. Receiver operating characteristics (ROC) curve for the model.

(TIF)

S1 Table. Operational definitions for the variables included in the EF class prediction algorithm.

(PDF)

S2 Table. Baseline characteristics of HF patients correctly and incorrectly classified by algorithm compared to gold standard classification.

(PDF)

Data Availability

Data for these analyses were made available to the authors through third-party license from IBM Truven, a commercial data provider in the US. As such, the authors cannot make these data publicly available due to data use agreement. Other researchers can access these data by purchasing a license through Truven. Inclusion criteria specified in the Methods section would allow other researchers to identify the same cohort of patients we used for these analyses. Interested individuals may see https://marketscan.truvenhealth.com/marketscanportal/ for more information on accessing Truven data.

Funding Statement

This study was supported by a collaborative research grant from Novartis Inc. The study was conducted by the authors independent of the sponsor. The funder provided support in the form of salaries for authors AK, AB, RS and RL but did not have any additional role in the study design, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘Author contributions’ section.

References

  • 1. Ponikowski P, Voors AA, Anker SD, et al. 2016 ESC Guidelines for the diagnosis and treatment of acute and chronic heart failure: The Task Force for the diagnosis and treatment of acute and chronic heart failure of the European Society of Cardiology (ESC). Developed with the special contribution of the Heart Failure Association (HFA) of the ESC. Eur Heart J. 2016;37(27):2129–2200. doi: 10.1093/eurheartj/ehw128
  • 2. Schneeweiss S, Avorn J. A review of uses of health care utilization databases for epidemiologic research on therapeutics. Journal of clinical epidemiology. 2005;58(4):323–337. doi: 10.1016/j.jclinepi.2004.10.012
  • 3. Desai RJ, Lin KJ, Patorno E, et al. Development and Preliminary Validation of a Medicare Claims-Based Model to Predict Left Ventricular Ejection Fraction Class in Patients With Heart Failure. Circ Cardiovasc Qual Outcomes. 2018;11(12):e004700. doi: 10.1161/CIRCOUTCOMES.118.004700
  • 4. Kaelber DC, Foster W, Gilder J, Love TE, Jain AK. Patient characteristics associated with venous thromboembolic events: a cohort study using pooled electronic health record data. J Am Med Inform Assoc. 2012;19(6):965–972. doi: 10.1136/amiajnl-2011-000782
  • 5. Maradey-Romero C, Prakash R, Lewis S, Perzynski A, Fass R. The 2011–2014 prevalence of eosinophilic oesophagitis in the elderly amongst 10 million patients in the United States. Aliment Pharmacol Ther. 2015;41(10):1016–1022. doi: 10.1111/apt.13171
  • 6. Sheyn D, James RL, Taylor AK, Sammarco AG, Benchek P, Mahajan ST. Tobacco use as a risk factor for reoperation in patients with stress urinary incontinence: a multi-institutional electronic medical record database analysis. Int Urogynecol J. 2015;26(9):1379–1384. doi: 10.1007/s00192-015-2721-x
  • 7. Van Fossen VL, Wilhelm SM, Eaton JL, McHenry CR. Association of thyroid, breast and renal cell cancer: a population-based study of the prevalence of second malignancies. Ann Surg Oncol. 2013;20(4):1341–1347. doi: 10.1245/s10434-012-2718-3
  • 8. Desai RJ, Mahesri M, Chin K, Levin R, Lahoz R, Studer R, et al. Epidemiologic Characterization of Heart Failure with Reduced or Preserved Ejection Fraction Populations Identified Using Medicare Claims. Am J Med. 2020 Oct 27:S0002-9343(20)30924-4. doi: 10.1016/j.amjmed.2020.09.038
  • 9. Cohen JB, Schrauben SJ, Zhao L, Basso MD, Cvijic ME, Li Z, et al. Clinical Phenogroups in Heart Failure With Preserved Ejection Fraction: Detailed Phenotypes, Prognosis, and Response to Spironolactone. JACC Heart Fail. 2020 Mar;8(3):172–184. doi: 10.1016/j.jchf.2019.09.009
  • 10. Uijl A, Lund LH, Vaartjes I, Brugts JJ, Linssen GC, Asselbergs FW, et al. A registry-based algorithm to predict ejection fraction in patients with heart failure. ESC Heart Fail. 2020 Oct;7(5):2388–2397. doi: 10.1002/ehf2.12779
  • 11. Pitt B, Pfeffer MA, Assmann SF, Boineau R, Anand IS, Claggett B, et al. Spironolactone for heart failure with preserved ejection fraction. N Engl J Med. 2014 Apr 10;370(15):1383–92. doi: 10.1056/NEJMoa1313731
  • 12. Pieske B, Patel MJ, Westerhout CM, Anstrom KJ, Butler J, Ezekowitz J, et al. Baseline features of the VICTORIA (Vericiguat Global Study in Subjects with Heart Failure with Reduced Ejection Fraction) trial. Eur J Heart Fail. 2019 Dec;21(12):1596–1604. doi: 10.1002/ejhf.1664

Decision Letter 0

Gianluigi Savarese

19 Feb 2021

PONE-D-20-39176

External validation of a claims-based model to predict left ventricular ejection fraction class in patients with heart failure

PLOS ONE

Dear Dr. Desai,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Apr 02 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

We look forward to receiving your revised manuscript.

Kind regards,

Gianluigi Savarese

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at

https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and

https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please provide additional details regarding participant consent. In the ethics statement in the Methods and online submission information, please ensure that you have specified (1) whether consent was informed and (2) what type you obtained (for instance, written or verbal, and if verbal, how it was documented and witnessed). If your study included minors, state whether you obtained consent from parents or guardians. If the need for consent was waived by the ethics committee, please include this information.

If you are reporting a retrospective study of medical records or archived samples, please ensure that you have discussed whether all data were fully anonymized before you accessed them and/or whether the IRB or ethics committee waived the requirement for informed consent. If patients provided informed written consent to have data from their medical records used in research, please include this information.

Once you have amended this/these statement(s) in the Methods section of the manuscript, please add the same text to the “Ethics Statement” field of the submission form (via “Edit Submission”).

For additional information about PLOS ONE ethical requirements for human subjects research, please refer to http://journals.plos.org/plosone/s/submission-guidelines#loc-human-subjects-research.

3. Thank you for stating the following in the Financial Disclosure section:

"This study was supported by a collaborative research grant from Novartis Inc. The study was conducted by the authors independent of the sponsor. The sponsor was given the opportunity to make nonbinding comments on a draft of the manuscript, but the authors retained the right of publication and determined the final wording."

We note that one or more of the authors have an affiliation to the commercial funders of this research study : [Novartis Healthcare Pvt. Ltd, Novartis Ireland Pvt. Ltd, Novartis Pharma AG].

3.1. Please provide an amended Funding Statement declaring this commercial affiliation, as well as a statement regarding the Role of Funders in your study. If the funding organization did not play a role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript and only provided financial support in the form of authors' salaries and/or research materials, please review your statements relating to the author contributions, and ensure you have specifically and accurately indicated the role(s) that these authors had in your study. You can update author roles in the Author Contributions section of the online submission form.

Please also include the following statement within your amended Funding Statement.

“The funder provided support in the form of salaries for authors [insert relevant initials], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.”

If your commercial affiliation did play a role in your study, please state and explain this role within your updated Funding Statement.

3.2. Please also provide an updated Competing Interests Statement declaring this commercial affiliation along with any other relevant declarations relating to employment, consultancy, patents, products in development, or marketed products, etc.  

Within your Competing Interests Statement, please confirm that this commercial affiliation does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: "This does not alter our adherence to  PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If this adherence statement is not accurate and  there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared.

Please include both an updated Funding Statement and Competing Interests Statement in your cover letter. We will change the online submission form on your behalf.

Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests

4. We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially identifying or sensitive patient information) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. Please see http://www.bmj.com/content/340/bmj.c181.long for guidelines on how to de-identify and prepare clinical data for publication. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: No

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Mahesri and coworkers describe the external validation of a previously developed prediction model for LV ejection fraction in a claims-based dataset. The paper is topical and of interest to improve the 'researchability' of claims datasets. I have some points of concern.

1. Heart failure and left ventricular ejection fraction are intertwined for historical reasons. The cut-off of either below or above 45% is questioned, at least in European ESC/HFA guidelines and/or position papers in recent years. I understand it is not feasible to predict other LVEF ranges since this is a validation of a previously developed model. However, it deserves attention in the discussion since many peers view the LVEF range of 40-50% as mildly reduced, with more resemblance to HFrEF <40%, at least in response to drugs.

2. In addition to the first point, could there be many patients within LVEF 40-50%, which resulted in decreased sensitivity and PPV? The model could discriminate more easily between the two extremes, as stated by the authors. What I am missing in the discussion are explanations for the poor model performance for HFrEF. The question of interest for this paper is not whether it is feasible to externally validate the model (i.e. the present conclusion of this ms), but, given the sensitivity and PPV, what is the usability of this model in daily research practice? I struggle to be convinced that a sensitivity of 30% for HFrEF, in a highly selected population (i.e. 17 924 out of 157 203 patients for which LVEF was available), will yield sensible results if this model were to be used on claims datasets.

3. Consider giving just one decimal in table 1 for the percentages.

4. Please shortly mention the variables used by the model to predict the two phenotypes of HFrEF and HFpEF.

5. Minor; introduction section: cardiac catheterization is rarely used to estimate LV ejection fraction nowadays, at least in Western European countries.

Reviewer #2: In this interesting analysis the authors confirm the discriminatory ability of the model they have constructed to classify patients captured in administrative claims databases into HFrEF (herein defined as left ventricular ejection fraction <45%) and HFpEF (herein defined as left ventricular ejection fraction ≥45%). I have several objections regarding the value of the study. I am skeptical about whether a HF classification with 45% used as an EF cut-off is clinically meaningful. Furthermore, the low sensitivity of the model to identify HFrEF (32%) and the considerable misclassification seen with the model for both HFpEF (19%) and HFrEF (27%) increase the degree of uncertainty and bias, which are already an issue in the case of administrative databases. I therefore have doubts as to the potential utility of this model to support reliable clinical research.

Nonetheless, the study presents the results of original research, which has not been published elsewhere. The methods are described in detail and the analyses are performed to a high technical standard and described in sufficient detail. The conclusions are supported by the data and presented appropriately. The article is presented in an intelligible fashion and is written in standard English.

As the latter are the sole criteria based on which a manuscript submitted to this journal should be evaluated, I find that the paper fulfils all of them and merits publication.

Reviewer #3: The aim of this study was to externally validate a model which predicts reduced and preserved ejection fraction (EF) in heart failure in a sample of commercial insurance enrollees. This represents a different domain of patients which includes those <65 years old and beyond patients who were under Medicare coverage (in which the original model was developed).

The research is relevant to the prediction of EF status in electronic health or claims databases, which lack information on EF values.

It is good to see that overall accuracy, sensitivity, specificity, and PPV for HFrEF and HFpEF here remained similar to the original population when tested in a heart failure population with lower disease severity and fewer comorbid conditions such as previous myocardial infarction, anaemia and atrial fibrillation. While these metrics are commonly used to indicate performance of prediction models, their values are dependent on the health setting and prevalence of the condition of interest in the study population. For instance, higher disease prevalence will raise the PPV and decrease the NPV.
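The prevalence dependence noted above follows from Bayes' theorem. As a minimal sketch, the Python snippet below recomputes PPV and NPV for HFrEF from sensitivity, specificity, and prevalence. It uses the rounded values reported in the abstract (HFrEF sensitivity 0.32, HFrEF prevalence 24.3%) and assumes that, for a binary classifier, specificity for HFrEF equals the reported sensitivity for HFpEF (0.96). The function names are illustrative, and the outputs are approximations rather than the study's own calculations.

```python
def ppv(sens, spec, prev):
    """Positive predictive value from sensitivity, specificity, prevalence."""
    true_pos = sens * prev
    false_pos = (1.0 - spec) * (1.0 - prev)
    return true_pos / (true_pos + false_pos)


def npv(sens, spec, prev):
    """Negative predictive value from sensitivity, specificity, prevalence."""
    true_neg = spec * (1.0 - prev)
    false_neg = (1.0 - sens) * prev
    return true_neg / (true_neg + false_neg)


# Rounded inputs taken from the abstract; specificity is an assumption
# (set equal to the reported HFpEF sensitivity in a two-class problem).
SENS_HFREF = 0.32
SPEC_HFREF = 0.96

# Hold sensitivity and specificity fixed and vary prevalence only.
for prevalence in (0.10, 0.243, 0.50):
    print(
        f"prevalence={prevalence:.3f}  "
        f"PPV={ppv(SENS_HFREF, SPEC_HFREF, prevalence):.2f}  "
        f"NPV={npv(SENS_HFREF, SPEC_HFREF, prevalence):.2f}"
    )
```

With these inputs the PPV at 24.3% prevalence comes out at roughly 0.72, close to the reported 0.73, and it rises as prevalence increases while the NPV falls.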

What would be necessary to assess the performance of this model compared to the original one is the model discrimination, i.e. the trade-off between sensitivity and specificity. The c-statistic reported in the model development paper was 0.86 (35-predictor binomial model). What is the c-statistic, and what does the discrimination plot/ROC curve look like, in this external validation cohort?

Also, it would be important to know the calibration of the model, which refers to the extent of agreement between the predicted probabilities of the model versus the observed frequencies. A calibration plot would be helpful for readers to determine how close the model predictions are to the observed gold standard frequencies.

The following references may come in handy:

-DOI: 10.1136/heartjnl-2011-301247

-doi: 10.1093/ckj/sfaa188

-http://dx.doi.org/10.1136/bmj.i3140

Please describe the predictors used in the model in the Methods. I counted 34 predictors in the Appendix, but the original model uses 35. If predictors were not available in the new dataset, it would be helpful to explain this in the Methods.

Please also discuss the study findings in relation to existing/previous work by others

In conclusion, the findings from this study are of interest to future work on phenotyping HF patients in large administrative databases.

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2021 Jun 4;16(6):e0252903. doi: 10.1371/journal.pone.0252903.r002

Author response to Decision Letter 0


5 May 2021

Response to Reviewers – Manuscript ID: PONE-D-20-39176 – PLOS ONE

External validation of a claims-based model to predict left ventricular ejection fraction class in patients with heart failure

We greatly appreciate the opportunity to revise this manuscript. Please see below for our point-by-point responses and the revised manuscript with tracked changes.

Journal Requirements:

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming.

Response: Thank you. The manuscript was revised to meet the PLOS ONE style requirements. The files have been renamed as appropriate.

2. Please provide additional details regarding participant consent.

Response: Thank you. We have included the following statement in the Methods section:

“The Institutional Review Board committee waived the requirement for informed consent. This is a retrospective cohort study using a HIPAA de-identified dataset and individuals cannot be identified from the data.”

3.1. Please provide an amended Funding Statement declaring this commercial affiliation, as well as a statement regarding the Role of Funders in your study.

Response: Thank you for your comment. Based on the instructions, we have updated our Funding Statement to declare affiliation to the commercial funders for some of the authors and updated the authors’ roles. We have added the following:

“This study was supported by a collaborative research grant from Novartis Inc. The study was conducted by the authors independent of the sponsor. The funder provided support in the form of salaries for authors AK, AB, RS and RL but did not have any additional role in the study design, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the ‘author contributions’ section.”

3.2. Please also provide an updated Competing Interests Statement declaring this commercial affiliation along with any other relevant declarations relating to employment, consultancy, patents, products in development, or marketed products, etc.

Response: We have updated our Competing Interest Statement to confirm that this commercial affiliation does not alter our adherence to PLOS ONE policies on sharing data and materials. We have added the following text in the Competing Interest Statement:

“Dr. Desai has received research grants from Merck and Bayer to the Brigham and Women’s Hospital for projects outside the submitted work. Dr. Studer and Ms. Lahoz are employees of Novartis Pharma AG. Mr. Kumar is an employee of Novartis Healthcare Pvt. Ltd., India and Dr. Barve is an employee of Novartis Ireland Pvt. Ltd., Ireland. There are no conflicts of interest to disclose for the other co-authors. This does not alter our adherence to PLOS ONE policies on sharing data and materials.”

4. We note that you have indicated that data from this study are available upon request.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially identifying or sensitive patient information) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

We will update your Data Availability statement on your behalf to reflect the information you provide.

Response: Data for these analyses were made available to the authors through third-party license from Truven, a commercial data provider in the US. As such, the authors cannot make these data publicly available due to data use agreement. Other researchers can access these data by purchasing a license through Truven. Inclusion criteria specified in the Methods section would allow other researchers to identify the same cohort of patients we used for these analyses. Please see https://marketscan.truvenhealth.com/marketscanportal/ for more information on accessing Truven data.

Response to comments from Reviewers:

Reviewer #1: Mahesri and coworkers describe the external validation of a previously developed prediction model for LV ejection fraction in a claims-based dataset. The paper is topical and of interest to improve the 'researchability' of claims datasets. I have some points of concern.

1. Heart failure and left ventricular ejection fraction are intertwined for historical reasons. The cut-off of either below or above 45% is questioned, at least in European ESC/HFA guidelines and/or position papers in recent years. I understand it is not feasible to predict other LVEF ranges since this is a validation of a previously developed model. However, it deserves attention in the discussion since many peers view the LVEF range of 40-50% as mildly reduced, with more resemblance to HFrEF <40%, at least in response to drugs.

Response: Thank you, and this is an important question. First, we would reiterate that, given the somewhat distinct patterns of characteristics and outcomes in patients with moderately reduced (mr) EF (40-49%) observed in previous research [Nadruz et al. Circ Heart Fail. 2016 Apr;9(4):e002826], it would be ideal to have an algorithm that can identify these patients separately from rEF and pEF patients. However, as noted in the original publication that describes the development of the claims-based model [Desai et al. Circ Cardiovasc Qual Outcomes. 2018;11(12):e004700.], the algorithm did not succeed in identifying these patients with reliable accuracy. As a result, we had to determine an EF cutpoint for the binary model. We chose the cutpoint of 45% for the following two reasons: 1) we reasoned that placing the cutpoint at either end of the mrEF range (40% or 50%) could combine all mrEF patients into a single group (with pEF for a 40% cutoff or with rEF for a 50% cutoff), which could meaningfully change the distribution of patient characteristics selectively in one of these two groups and make prediction difficult, and 2) as noted in previous research [Dunlay et al. Circ Heart Fail. 2012;5:720-726], EF changes over time are common in both pEF and rEF. We thought that placing the cutpoint at the mid-point of the mrEF range (45%) might capture, with some accuracy in their EF class, those mrEF patients who have either recovered EF from initial rEF or reduced EF from initial pEF. Additionally, as noted in the study by Desai et al., the 45% cutoff has also been used in multiple pivotal trials of HF patients (e.g., the TOPCAT trial for pEF patients and the VICTORIA trial for rEF patients). Therefore, this choice is not completely arbitrary. We have added the following text to summarize this rationale in the discussion (page 11):

“As noted by Desai et al., the EF cutpoint of 45% implemented in our model is based on two major considerations. Firstly, a cutoff at either end of the 40% to 50% EF range could combine all patients with moderately reduced (mr) EF (40-49%) within a single group, which could meaningfully change the distribution of patient characteristics selectively in one of these two groups and make prediction difficult. Secondly, as EF changes over time are common in both pEF and rEF, a cutpoint at the mid-point of the mrEF range (45%) might capture more accurately those mrEF patients who have either recovered EF from initial rEF or reduced EF from initial pEF. Moreover, the 45% cutoff has also been used in multiple pivotal trials of HF patients (e.g., the TOPCAT trial for pEF patients and the VICTORIA trial for rEF patients).”

2. In addition to the first point, could there be many patients within LVEF 40-50%, which resulted in decreased sensitivity and PPV? The model could discriminate more easily between the two extremes, as stated by the authors. What I am missing in the discussion are explanations for the poor model performance for HFrEF. The question of interest for this paper is not whether it is feasible to externally validate the model (i.e. the present conclusion of this ms), but, given the sensitivity and PPV, what is the usability of this model in daily research practice? I struggle to be convinced that a sensitivity of 30% for HFrEF, in a highly selected population (i.e. 17 924 out of 157 203 patients for which LVEF was available), will yield sensible results if this model were to be used on claims datasets.

Response: Thank you. We recognize the limitation and are transparent in describing the low sensitivity for rEF patients in the discussion section. Having said this, in claims data, a large proportion of HF patients are coded as having “unspecified HF”. This inherent limitation of the data contributes to the low model performance for HFrEF patients in claims data. In a recent study, we have characterized the epidemiology of claims-identified HFrEF patients based on this algorithm and have noted that despite the low sensitivity, characteristics and outcome trajectories of patients identified as HFrEF based on this model closely resemble well-characterized population-based cohorts. We have added the following to the discussion section:

“However, despite the low sensitivity, a recent study describing the epidemiology of HFrEF patients, identified in claims data based on this algorithm, showed characteristics and outcome trajectories for HFrEF patients that closely resembled other well-characterized population-based cohorts [8].”

Reference:

[8] Desai RJ, Mahesri M, Chin K, Levin R, Lahoz R, Studer R, Vaduganathan M, Patorno E. Epidemiologic Characterization of Heart Failure with Reduced or Preserved Ejection Fraction Populations Identified Using Medicare Claims. Am J Med. 2020 Oct 27:S0002-9343(20)30924-4. doi: 10.1016/j.amjmed.2020.09.038. Epub ahead of print. PMID: 33127370.

3. Consider giving just one decimal in table 1 for the percentages

Response: Thank you for your comment. We have made this change in the manuscript.

4. Please shortly mention the variables used by the model to predict the two phenotypes of HFrEF and HFpEF

Response: Thank you. The definitions of the predictor variables are included in the Supporting information, S-1 Table.

5. Minor; introduction section: cardiac catheterization is rarely used to estimate LV ejection fraction nowadays, at least in Western European countries.

Response: Thank you for your comment. We have removed “cardiac catheterization” from the second paragraph in the introduction.

Reviewer #2: In this interesting analysis the authors confirm the discriminatory ability of the model they have constructed to classify patients captured in administrative claims databases into HFrEF (herein defined as left ventricular ejection fraction <45%) and HFpEF (herein defined as left ventricular ejection fraction ≥45%). I have several objections regarding the value of the study. I am skeptical about whether a HF classification with 45% used as an EF cut-off is clinically meaningful. Furthermore, the low sensitivity of the model to identify HFrEF (32%) and the considerable misclassification seen with the model for both HFpEF (19%) and HFrEF (27%) increase the degree of uncertainty and bias, which are already an issue in the case of administrative databases. I therefore have doubts as to the potential utility of this model to support reliable clinical research.

Nonetheless, the study presents the results of original research, which has not been published elsewhere. The methods are described in detail and the analyses are performed to a high technical standard and described in sufficient detail. The conclusions are supported by the data and presented appropriately. The article is presented in an intelligible fashion and is written in standard English.

As the latter are the sole criteria based on which a manuscript submitted to this journal should be evaluated, I find that the paper fulfils all of them and merits publication.

Response: Thank you for your comment. Please refer to our response to Reviewer 1, comment 1 and comment 2 for a detailed discussion of the points raised.

Reviewer #3: The aim of this study was to externally validate a model which predicts reduced and preserved ejection fraction (EF) in heart failure in a sample of commercial insurance enrollees. This represents a different domain of patients which includes those <65 years old and beyond patients who were under Medicare coverage (in which the original model was developed).

The research is relevant to the prediction of EF status in electronic health or claims databases, which lack information on EF values.

It is good to see that overall accuracy, sensitivity, specificity, and PPV for HFrEF and HFpEF here remained similar to the original population when tested in a heart failure population with lower disease severity and fewer comorbid conditions such as previous myocardial infarction, anaemia and atrial fibrillation. While these metrics are commonly used to indicate performance of prediction models, their values are dependent on the health setting and prevalence of the condition of interest in the study population. For instance, higher disease prevalence will raise the PPV and decrease the NPV.

1. What would be necessary to assess the performance of this model compared to the original one is the model discrimination, i.e. the trade-off between sensitivity and specificity. The c-statistic reported in the model development paper was 0.86 (35-predictor binomial model). What is the c-statistic, and what does the discrimination plot/ROC curve look like, in this external validation cohort?

Response: We appreciate the comment. The c-statistic for our model was 0.83 (95% CI: 0.81-0.84) which is comparable to the c-statistics reported in the model development paper by Desai et al. We have included the c-statistic along with the following ROC curve graph in the appendix of the manuscript:

Appendix S-3 Figure. Receiver operating characteristics (ROC) curve for the model.
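For readers less familiar with the c-statistic, the sketch below shows one common way to compute it: as the proportion of (positive, negative) patient pairs in which the positive patient receives the higher predicted probability, with ties counted as one half. The function name and the toy scores are hypothetical and are not taken from the study data; this is an illustration of the definition, not the authors' analysis code.

```python
def c_statistic(scores, labels):
    """Concordance (area under the ROC curve) by pairwise comparison.

    scores: predicted probabilities of the positive class
    labels: 1 for the positive class, 0 for the negative class
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0   # positive ranked above negative
            elif p == n:
                concordant += 0.5   # ties count as half
    return concordant / (len(positives) * len(negatives))


# Toy example with made-up values: three positives and two negatives.
toy_scores = [0.90, 0.80, 0.35, 0.60, 0.20]
toy_labels = [1, 1, 1, 0, 0]
print(round(c_statistic(toy_scores, toy_labels), 3))
```

The pairwise loop is quadratic in the number of patients, which is fine for illustration; for a cohort of several thousand patients a rank-based implementation would normally be preferred.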

2. Also, it would be important to know the calibration of the model, which refers to the extent of agreement between the predicted probabilities of the model versus the observed frequencies. A calibration plot would be helpful for readers to determine how close the model predictions are to the observed gold standard frequencies.

Response: Thank you for the comment. We have included the calibration plot below for the reviewer. However, as our model is describing a classification exercise and not a prediction exercise, we do not feel the calibration plot adds any additional value here since the predicted probability values are only used to inform the cutoff for classifying rEF and pEF. Therefore, we have opted to not include it in the manuscript.

Calibration plot for the model
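As a rough illustration of what a calibration plot summarizes, the sketch below groups predicted probabilities into equal-width bins and compares the mean predicted probability in each bin with the observed frequency of the outcome. The function name, the bin count, and the toy values are assumptions made for illustration only; they do not reproduce the plot referred to here.

```python
def calibration_table(probs, outcomes, n_bins=5):
    """Return (n, mean predicted, observed frequency) for each non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        # Equal-width bins on [0, 1]; a probability of exactly 1.0 goes in the last bin.
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, y))

    rows = []
    for members in bins:
        if not members:
            continue
        mean_predicted = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        rows.append((len(members), mean_predicted, observed))
    return rows


# Toy example with made-up values and two bins.
toy_probs = [0.05, 0.15, 0.45, 0.55, 0.85, 0.95]
toy_outcomes = [0, 0, 0, 1, 1, 1]
for n, predicted, observed in calibration_table(toy_probs, toy_outcomes, n_bins=2):
    print(f"n={n}  mean predicted={predicted:.2f}  observed={observed:.2f}")
```

A well-calibrated model would show mean predicted and observed values that are close to each other in every bin.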

3. Please describe the predictors used in the model in the Methods. I counted 34 predictors in the Appendix, but the original model uses 35. If predictors were not available in the new dataset, it would be helpful to explain this in the Methods.

Response: Thank you. All 35 predictor variables in the original model are included in the model presented here. The y-intercept value was included as the 35th predictor variable in both the original and this model. We have clarified this in the text as well.
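The parameter count described above can be illustrated with a generic logistic model, in which the intercept is estimated alongside the covariate coefficients, so that 34 covariates plus an intercept give 35 parameters. The sketch below uses three made-up coefficients rather than the actual model; the coefficient values, the 0.5 cutoff, and the choice of HFrEF as the predicted class are all assumptions for illustration.

```python
import math


def predicted_probability(intercept, coefficients, covariates):
    """Logistic model: inverse logit of intercept + sum(coefficient * covariate)."""
    if len(coefficients) != len(covariates):
        raise ValueError("one covariate value is needed per coefficient")
    linear = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    return 1.0 / (1.0 + math.exp(-linear))


def classify(probability, cutoff=0.5):
    """Assign an EF class by thresholding the predicted probability (hypothetical cutoff)."""
    return "HFrEF" if probability >= cutoff else "HFpEF"


# Hypothetical model with three covariates; the real model is not reproduced here.
intercept = -1.0
coefficients = [0.8, -0.5, 1.2]
n_parameters = 1 + len(coefficients)   # intercept counted as one parameter

patient = [1, 0, 1]                    # made-up binary covariate values
probability = predicted_probability(intercept, coefficients, patient)
print(n_parameters, round(probability, 3), classify(probability))
```

Counting the intercept in this way is what reconciles a list of 34 covariates with a 35-parameter model.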

4. Please also discuss the study findings in relation to existing/previous work by others

In conclusion, the findings from this study are of interest to future work on phenotyping HF patients in large administrative databases.

Response: Thank you for your comment. The following discussion was added:

“Determining EF class based on information routinely available in electronic data sources has received increasing recognition in recent years to address frequent unavailability of EF values. A retrospective cohort study among Minnesota residents evaluated a claims-based approach to identify HFpEF patients based on HF diagnosis codes in combination with laboratory orders for BNP/NT-proBNP and achieved PPV of 84% [9]. The higher PPV for HFpEF reported in our study is likely explained by the inclusion of frequent comorbid conditions and demographics in addition to diagnosis codes. A second study by Uijl et al. based on data from the Swedish Heart Failure Registry used an approach similar to ours, where 22 predictors including laboratory results such as NT-proBNP and renal function; demographics such as age and sex; and comorbid conditions were used to classify patients into HFpEF and HFrEF [10]. The authors noted discrimination of 0.73 for this model in an external validation cohort. Results from our study along with the study by Uijl et al. suggest that model-based computable phenotyping of HF patients may provide a useful way to identify and study HF subtypes from electronic healthcare sources where EF values are not available. Our model may be more applicable in the context of US insurance claims databases, which often lack information such as laboratory test results.”

References:

[9] Cohen JB, Schrauben SJ, Zhao L, Basso MD, Cvijic ME, Li Z, Yarde M, Wang Z, Bhattacharya PT, Chirinos DA, Prenner S, Zamani P, Seiffert DA, Car BD, Gordon DA, Margulies K, Cappola T, Chirinos JA. Clinical Phenogroups in Heart Failure With Preserved Ejection Fraction: Detailed Phenotypes, Prognosis, and Response to Spironolactone. JACC Heart Fail. 2020 Mar;8(3):172-184. doi: 10.1016/j.jchf.2019.09.009. Epub 2020 Jan 8.

[10] Uijl A, Lund LH, Vaartjes I, Brugts JJ, Linssen GC, Asselbergs FW, Hoes AW, Dahlström U, Koudstaal S, Savarese G. A registry-based algorithm to predict ejection fraction in patients with heart failure. ESC Heart Fail. 2020 Oct;7(5):2388-2397. doi: 10.1002/ehf2.12779. Epub 2020 Jun 17.

Attachment

Submitted filename: Response to Reviewers_4_16_2021.docx

Decision Letter 1

Gianluigi Savarese

25 May 2021

External validation of a claims-based model to predict left ventricular ejection fraction class in patients with heart failure

PONE-D-20-39176R1

Dear Dr. Desai,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Gianluigi Savarese

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #2: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The revised manuscript has been considerably improved and is well balanced. I have no further comments.

Reviewer #2: All previous comments have been addressed in detail by the authors. I have no further changes to request.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: No

Acceptance letter

Gianluigi Savarese

28 May 2021

PONE-D-20-39176R1

External validation of a claims-based model to predict left ventricular ejection fraction class in patients with heart failure

Dear Dr. Desai:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Gianluigi Savarese

Academic Editor

PLOS ONE

Associated Data

    Supplementary Materials

    S1 Fig. Receiver operating characteristics (ROC) curve for the model.

    (TIF)

    S1 Table. Operational definitions for the variables included in the EF class prediction algorithm.

    (PDF)

    S2 Table. Baseline characteristics of HF patients correctly and incorrectly classified by algorithm compared to gold standard classification.

    (PDF)

    Attachment

    Submitted filename: Response to Reviewers_4_16_2021.docx

    Data Availability Statement

    Data for these analyses were made available to the authors through a third-party license from IBM Truven, a commercial data provider in the US. As such, the authors cannot make these data publicly available due to the data use agreement. Other researchers can access these data by purchasing a license through Truven. The inclusion criteria specified in the Methods section would allow other researchers to identify the same cohort of patients we used for these analyses. Interested individuals may see https://marketscan.truvenhealth.com/marketscanportal/ for more information on accessing Truven data.


    Articles from PLoS ONE are provided here courtesy of PLOS
