PLOS ONE. 2021 Nov 18;16(11):e0260036. doi: 10.1371/journal.pone.0260036

Development and validation of a prognostic tool: Pulmonary embolism short-term clinical outcomes risk estimation (PE-SCORE)

Anthony J Weekes 1,*, Jaron D Raper 1,¤a, Kathryn Lupez 1,¤b, Alyssa M Thomas 1,¤c, Carly A Cox 1,¤d, Dasia Esener 2, Jeremy S Boyd 3, Jason T Nomura 4, Jillian Davison 5, Patrick M Ockerse 6, Stephen Leech 5, Jakea Johnson 3, Eric Abrams 2, Kathleen Murphy 4, Christopher Kelly 6, H James Norton 7
Editor: Christophe Leroyer 8
PMCID: PMC8601564  PMID: 34793539

Abstract

Objective

Develop and validate a prognostic model for clinical deterioration or death within days of pulmonary embolism (PE) diagnosis using point-of-care criteria.

Methods

We used prospective registry data from six emergency departments. The primary composite outcome was death or deterioration (respiratory failure, cardiac arrest, new dysrhythmia, sustained hypotension, and rescue reperfusion intervention) within 5 days. Candidate predictors included laboratory and imaging right ventricle (RV) assessments. The prognostic model was developed from 935 PE patients. Univariable analysis of 138 candidate variables was followed by penalized and standard logistic regression on 26 retained variables, and then tested with a validation database (N = 801).

Results

Logistic regression yielded a nine-variable model, then simplified to a nine-point tool (PE-SCORE): one point each for abnormal RV by echocardiography, abnormal RV by computed tomography, systolic blood pressure < 100 mmHg, dysrhythmia, suspected/confirmed systemic infection, syncope, medico-social admission reason, abnormal heart rate, and two points for creatinine greater than 2.0 mg/dL. In the development database, 22.4% had the primary outcome. Prognostic accuracy of logistic regression model versus PE-SCORE model: 0.83 (0.80, 0.86) vs. 0.78 (0.75, 0.82) using area under the curve (AUC) and 0.61 (0.57, 0.64) vs. 0.50 (0.39, 0.60) using precision-recall curve (AUCpr). In the validation database, 26.6% had the primary outcome. PE-SCORE had AUC 0.77 (0.73, 0.81) and AUCpr 0.63 (0.43, 0.81). As points increased, outcome proportions increased: a score of zero had 2% outcome, whereas scores of six and above had ≥ 69.6% outcomes. In the validation dataset, PE-SCORE zero had 8% outcome [no deaths], whereas all patients with PE-SCORE of six and above had the primary outcome.

Conclusions

PE-SCORE model identifies PE patients at low- and high-risk for deterioration and may help guide decisions about early outpatient management versus need for hospital-based monitoring.

Introduction

An important indicator of acute pulmonary embolism (PE) of moderate to high severity is an acute increase in right ventricular pressure or size or decreased systolic function. PE-provoked right ventricle (RV) abnormality is commonly assessed in two ways: 1) laboratory surrogates of myocardial stretch and injury, and 2) imaging assessments for RV dilatation, pressure increases, and decreased systolic function. The most common diagnostic tests are natriuretic peptide and troponin, and imaging by computed tomography (CT) and echocardiography. Assessments for abnormal RV (abnlRV) are absent from validated clinical prognostic models, such as the original and simplified Pulmonary Embolism Severity Index (PESI and sPESI) and Hestia [1–3]. These prognostic prediction models utilized a limited set of candidate variables without pertinent imaging and laboratory measurements [4]. Risk of early clinical deterioration from worsening RV function is not captured in current prediction models [5–7].

The newer anticoagulants offer efficacy and safety in PE treatment, yet there is hesitancy to discharge those with acute PE. Hospitalization for PE is as high as 90%–95% in the U.S. and Europe, yet 41%–51% of PE patients are classified as low-risk by existing clinical prediction models [8–12]. Clinical algorithms, checklists, and prognostic models are being developed and updated to optimize the safety of outpatient management, improve prognostic accuracy for outcome(s), and provide guidance to reduce practice variation. Imaging and laboratory assessments for PE-provoked abnlRV have now been incorporated into hybrid clinical algorithms [1, 7, 13–16], and some meta-analyses now support use of one or multiple RV assessment methods [4, 17, 18]. A consistent definition of PE-provoked abnlRV, however, is lacking [19–22].

Acute care providers are thus challenged to identify PE patients who are considered low-risk (and safe for early discharge) and those at greater risk of clinical deterioration without a clear guideline on RV assessment in acute PE. Providers must make disposition decisions driven by concerns for acute deterioration (respiratory failure, cardiac arrest, new dysrhythmia, sustained hypotension, and rescue reperfusion intervention) within the first days of PE diagnosis rather than events at 30 days or later. Thus, we aimed to develop and validate a prediction model for the probability of deterioration or death within days of acute PE diagnosis in acute care settings.

Materials and methods

Study design and setting

This was a prospective, observational, multicenter study using two registry databases. The first database was the Pulmonary Embolism Short-term Clinical Outcomes Registry (PESCOR; clinicaltrials.gov NCT02883491), a prospective registry of patients who presented to six urban, academic, emergency departments (EDs) in the following locations during the pilot: San Diego, California; Newark, Delaware; Orlando, Florida; Charlotte, North Carolina; Nashville, Tennessee; and Salt Lake City, Utah. The cohort was chosen to allow for broad generalizability. By enrolling patients from a diverse set of EDs with geographic spread, we expected to capture the full spectrum of demographics and acute PE severity at presentation. The second registry was created after federal funding was secured for development of the prediction model (Short-term Clinical Deterioration After Acute Pulmonary Embolism; clinicaltrials.gov NCT03915925). The unfunded initial registry (PESCOR) was used for the validation. Both registries were populated by the same six EDs and had similar variables, data recording instruments, and outcome variables.

The development database was prospectively accrued between September 18, 2018 and December 14, 2020. The validation database was built between August 2016 and March 2019. The central site (located in Charlotte, North Carolina) prospectively enrolled consecutive patients; the other five sites prospectively enrolled on a convenience basis. During the early stages of the unfunded registry, the central site enrolled patients with written informed consent until its institutional review board (Atrium Health IRB) approved waiver of informed consent. The other five sites enrolled with written informed consent with approval from each of their institutional review boards. Once federal funding was secured, all sites used the central IRB (Advarra IRB), which approved the study protocol and waiver of written informed consent for enrollments at all sites (Advarra approval code PRO-00029256). The reporting of results adheres to the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) reporting criteria [23, 24].

Participants

Inclusion and exclusion criteria were the same for both development and validation databases. Men and women 18 years or older with image-confirmed acute PE diagnosed within 12 hours of ED presentation were eligible for enrollment. Patients were excluded for any of the following reasons: age 17 years old and younger at the time of screening; refusal to participate in study; radiologist’s determination that filling defects were chronic, resolving, or unchanged after comparison to previous CT, if available; empiric anticoagulation or escalated intervention initiated more than 12 hours before PE diagnosis; incidental identification of either segmental or subsegmental intraluminal filling defects on CT or unrelated to primary diagnostic workup or ED presentation.

Data collection

The electronic case report included over 400 variable entry fields for prognostic model testing and other aims of the registry. For the prognostic tool, we collected 138 data elements on each patient, including vital signs at presentation, risk factors for PE, comorbidities, contemporaneous measurements of cardiac biomarkers [troponin and brain natriuretic peptide (BNP)], and CT and goal-directed echocardiography evaluations performed early in ED management of the index PE event.

Outcome measures

The primary composite outcome had morbidity and mortality outcomes of interest to emergency providers, which require hospital-based monitoring or time-sensitive interventions. We used a composite of death (all cause and PE-related) and clinical deterioration within five days of index PE confirmation. We incorporated and adapted a composite primary outcome previously used by researchers and considered to be important to providers and pulmonary embolism response teams in the USA and other countries [5, 7, 13, 25–28]. The individual components of the composite outcome have previously been reported on [5, 27]. Deaths were classified as PE-related when the site investigator reviewed the case and determined death was not likely to be due to another cause, such as septic shock or acute myocardial infarction. Elements of clinical deterioration included respiratory failure, cardiac arrest, new dysrhythmia, sustained hypotension requiring intravenous volume expansion or adrenergic medication, and rescue reperfusion intervention.

Respiratory failure was defined as respiratory distress associated with emergent interventions with mechanical ventilation (intubation, non-invasive positive pressure ventilation, or surgical cricothyrotomy). Cardiac arrest was defined as any unstable cardiac rhythm or absent electrical activity requiring cardiopulmonary resuscitation or advanced cardiac life support for asystole, pulseless electrical activity, ventricular fibrillation, or unstable ventricular tachycardia. New dysrhythmia was defined as the identification of atrial fibrillation with rapid ventricular response, atrial flutter, supraventricular tachycardia, stable ventricular tachycardia, or bradycardia that was not evident at ED presentation. Hypotension was defined as systolic blood pressure less than 90 mmHg (or a 40 mmHg decrease from baseline) or shock index >1 associated with administration of greater than 500 mL of intravenous fluids within 15 minutes for volume expansion or administration of norepinephrine, dopamine, or epinephrine infusion.

Major bleeding was attributed to treatment with anticoagulation or thrombolysis and not as a primary outcome of clinical deterioration due to PE severity. The presence of death or any clinical deterioration element within five days of hospitalization was considered to be positive for the primary outcome. The absence of death or clinical deterioration within five days post-PE confirmation was considered negative for primary outcome. Each patient could have more than one element of clinical deterioration.

Although not the focus of this report, our secondary outcome included the same events in 5 days with the addition of major bleeding, recurrence of venous thromboembolism (VTE), or subsequent hospitalization within 30 days.

Predictor variables

We considered 138 candidate variables available at the point-of-care, including laboratory and imaging tests relevant to assessment of abnlRV, and those previously vetted by PE registries, sPESI, Hestia, and European Society of Cardiology (ESC) [3, 4, 15, 29, 30]. Predictor variables were measured and assessed while blinded to outcomes. We included symptoms, signs, and findings likely to represent higher PE severity. As an example, we chose syncope instead of shortness of breath or chest pain based on clinical experience and evidence in the literature [31–34]. We added a variable that factored in initial heart rate < 50 or > 100 bpm [35]. We included a component of Hestia that employed provider gestalt of medical and social support reasons for hospitalization as social determinants of health. Variables that addressed the safety risk of PE treatment (including predispositions to bleeding) were not included as candidate variables for the primary outcomes of clinical deterioration. We report on the missingness of variables in the final prognostic model and associated outcomes of those with missing variable responses.

Definitions of key predictor variables

Echocardiography

Goal-directed echocardiography (GDE) was performed by emergency physicians within four hours of PE confirmation with qualitative interpretation by site investigators. We reported the training level of providers performing the GDE (i.e., residency year, fellowship, or attending). We defined RV anatomy and physiology in PE using previously reported interpretation guidelines for abnormal RV [36–38]. We used a criteria grading system with high inter-rater reliability (kappa = 0.84) for severe RV dilatation between emergency physicians and cardiologists [37]. Quality assurance reviews were performed by experienced clinical ultrasound leaders.

Severe RV dilatation was defined as RV:LV basal diameter ≥ 1.0 or basal RV diameter > 42 mm with blunting of the RV apex on two or more different windows. Severe RV systolic dysfunction was defined as a visual estimate of tricuspid annular planar systolic excursion (TAPSE) being 10 mm or less and RV free wall hypokinesis [38]. GDE was also assessed for flattening or deviation of interventricular septum (IVS) towards the left ventricle. The GDE score for PE-provoked RV dysfunction was assigned scores of zero to three. RV dilatation was considered to be a requirement for visual identification of PE-provoked RV dysfunction. The absence of RV dilatation was scored as zero, whereas one point each was assigned for RV dilatation, septal flattening or leftward deviation, and RV systolic dysfunction. When GDE was considered abnormal (scores of 1 to 3), a determination of whether the RV abnormality was acute, chronic, or indeterminate was included. We determined RV abnormality to be chronic based on the presence of RV free wall thickness ≥ 7 mm or accompanying signs of LV abnormalities or previous echocardiography records reporting pre-existing RV abnormalities. We also noted if GDE image quality was inadequate for interpretation.
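The GDE scoring rule described above can be sketched in a few lines. This is only an illustrative reading of the text, not the registry's actual case report logic; the class and field names (`GdeFindings`, `rv_dilatation`, and so on) are hypothetical, and the sketch assumes that septal and systolic findings earn no points when RV dilatation is absent, since dilatation is stated to be a requirement.

```python
from dataclasses import dataclass


@dataclass
class GdeFindings:
    """Qualitative GDE findings (hypothetical field names)."""
    rv_dilatation: bool
    septal_flattening: bool          # flattening or leftward deviation of the IVS
    rv_systolic_dysfunction: bool


def gde_score(findings: GdeFindings) -> int:
    """Return a GDE score from 0 to 3.

    Absence of RV dilatation scores zero. Otherwise one point each is
    given for dilatation, septal flattening, and systolic dysfunction.
    """
    if not findings.rv_dilatation:
        return 0
    return 1 + int(findings.septal_flattening) + int(findings.rv_systolic_dysfunction)


def gde_abnormal(findings: GdeFindings) -> bool:
    """GDE is considered abnormal for scores of 1 to 3."""
    return gde_score(findings) >= 1
```

For example, a study showing RV dilatation and septal flattening but preserved systolic function would score 2 under this reading.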

Cardiac biomarkers

Serum measurements were obtained within six hours of PE diagnosis to test for myocardial stretch and injury. For myocardial stretch, we used BNP with i-STAT BNP test cartridge (Abbott Point of Care, Abbott Park, IL), with a cut-off value of > 90 pg/mL. For sites that used N terminal BNP, the threshold cut-off value was 500 pg/mL. For myocardial injury, we used troponin i-STAT cTnI test cartridges (Abbott Point of Care, Abbott Park, IL), with cut-off value ≥ 0.07 ng/mL. In December 2019, the central site had an institution-wide replacement of point-of-care troponin I with high-sensitivity troponin, for which we used cut-off values of 20 ng/L for males and 12 ng/L for females. We created binary categorical variables for natriuretic peptide and troponin elevation.
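As a rough illustration, the binary biomarker variables could be derived as follows. The cut-off values come from the paragraph above; everything else is an assumption made for this sketch: the function and parameter names are hypothetical, the comparison direction for N-terminal BNP and high-sensitivity troponin (strictly greater than) is not stated in the text, and combining two available assays with a logical "or" is a simplification.

```python
from typing import Optional

BNP_CUTOFF_PG_ML = 90.0            # elevated if > 90 pg/mL (stated)
NT_PROBNP_CUTOFF_PG_ML = 500.0     # direction of comparison assumed (>)
CTNI_CUTOFF_NG_ML = 0.07           # elevated if >= 0.07 ng/mL (stated)
HS_TROPONIN_CUTOFF_NG_L = {"male": 20.0, "female": 12.0}  # direction assumed (>)


def natriuretic_peptide_elevated(bnp_pg_ml: Optional[float] = None,
                                 nt_probnp_pg_ml: Optional[float] = None) -> Optional[bool]:
    """Return True/False for elevation, or None when no assay is available."""
    flags = []
    if bnp_pg_ml is not None:
        flags.append(bnp_pg_ml > BNP_CUTOFF_PG_ML)
    if nt_probnp_pg_ml is not None:
        flags.append(nt_probnp_pg_ml > NT_PROBNP_CUTOFF_PG_ML)
    return any(flags) if flags else None


def troponin_elevated(ctni_ng_ml: Optional[float] = None,
                      hs_troponin_ng_l: Optional[float] = None,
                      sex: Optional[str] = None) -> Optional[bool]:
    """Return True/False for elevation, or None when no assay is available."""
    flags = []
    if ctni_ng_ml is not None:
        flags.append(ctni_ng_ml >= CTNI_CUTOFF_NG_ML)
    if hs_troponin_ng_l is not None:
        if sex not in HS_TROPONIN_CUTOFF_NG_L:
            raise ValueError("sex must be 'male' or 'female' for high-sensitivity troponin")
        flags.append(hs_troponin_ng_l > HS_TROPONIN_CUTOFF_NG_L[sex])
    return any(flags) if flags else None
```

Returning `None` for an unmeasured biomarker keeps missingness visible, so it can be counted before any later recoding.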

Computed tomography pulmonary angiography (CTPA)

CTPA images were reviewed by board-certified radiologists unaffiliated with the research and blinded to clinical condition of patients. Using transverse 1 mm CT cuts, RV:LV basal diameter ≥ 1.0 was considered indicative of RV dilatation. The proximal location of thrombus on CTPA was also reported (saddle, proximal portion of main right or left pulmonary arteries, lobar, segmental or subsegmental).

Abstractor training

Before the study started, the principal investigator (PI) led detailed in-person discussions with site investigators to clearly define variables with field notes in REDCap case report forms. Monthly communication updates, central site monitoring, and in-person training sessions at national conferences were all employed for data cleaning. Before enrollment ended, the central site performed univariable analyses to determine completeness and sensibility of entries. Verification queries were performed, with corrections made if necessary.

Sample size

We used Peduzzi’s rule for logistic regression to guide determination of sample size [39]. This rule declares the maximum number of independent (predictor) variables is no more than N/10, where N is the number of observations (subjects) in the smaller of the two groups (outcome dichotomous yes/no). We were prepared to accommodate up to 22 final variables. So, 220 subjects (220/10 = 22) were needed in the smaller (clinical deterioration yes) subgroup. Using an estimated 25% occurrence of clinical deterioration within several days (based on previously cited literature), sample size of 880 was required for the development database [6, 27, 40].
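The arithmetic behind this sample size can be reproduced with a short sketch. The function names are hypothetical; the only inputs are the figures given above (10 events per variable, 22 variables, and an estimated 25% outcome rate).

```python
import math


def max_predictors(events_in_smaller_group: int, events_per_variable: int = 10) -> int:
    """Largest number of predictors allowed by the N/10 rule."""
    return events_in_smaller_group // events_per_variable


def required_sample_size(n_predictors: int, outcome_rate: float,
                         events_per_variable: int = 10) -> int:
    """Total subjects needed so the outcome group supports n_predictors.

    Assumes the outcome group is the smaller group (outcome_rate <= 0.5).
    """
    events_needed = n_predictors * events_per_variable
    return math.ceil(events_needed / outcome_rate)


print(max_predictors(220))             # 22 predictors for 220 events
print(required_sample_size(22, 0.25))  # 880 subjects at a 25% outcome rate
```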

Missing values

We reported the percentage of missing observations for each variable. Missing categorical data were marked as absent [41].

Statistical analysis methods

Data cleaning

We performed three interim data cleans during the enrollment phase before importing to SAS for the final data clean after the final enrollment. During the enrollment phase, important variables were assessed for missingness and discrepancies during data cleaning. For example, we reported outliers in vital signs or laboratory measurement values to the site investigators. At the close of enrollments, descriptive statistics were used to examine predictor and outcome variables for sensibility and missingness. Instructions for corrective actions were assigned to the site investigator and clinical research team by referring to source documents within the electronic health records. After missingness was mitigated and sensibility of data optimized, the database was used for analysis. We then imported the development and external validation databases to SAS Enterprise Guide 7.1 (SAS Institute Inc., Cary, NC, USA).

We computed overall descriptive statistics on all variables in each dataset. We reported on the number of non-missing and missing values, the mean, median, standard deviation, minimum and maximum values for continuous variables. We used frequencies and percentages of each value (including missing values) for categorical variables. The PI and biostatistician inspected reports and made queries to verify and correct data as needed.

We created additional dichotomous variables using previously established cut-offs in validated prediction models or clinical guidelines (e.g., age > 80, systolic blood pressure < 100 mmHg, heart rate < 50 or > 100 bpm, initial oxygen saturation < 90%).
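A minimal sketch of this dichotomization step, using the cut-offs listed above, might look like the following. The function and key names are hypothetical. The shock index helper uses the conventional definition (heart rate divided by systolic blood pressure), which the text does not spell out and is therefore an assumption here.

```python
def dichotomize_vitals(age_years: float, sbp_mmhg: float,
                       heart_rate_bpm: float, spo2_percent: float) -> dict:
    """Convert continuous presentation values into binary indicators."""
    return {
        "age_over_80": age_years > 80,
        "sbp_under_100": sbp_mmhg < 100,
        "abnormal_heart_rate": heart_rate_bpm < 50 or heart_rate_bpm > 100,
        "spo2_under_90": spo2_percent < 90,
    }


def shock_index(heart_rate_bpm: float, sbp_mmhg: float) -> float:
    """Heart rate divided by systolic blood pressure (conventional definition)."""
    if sbp_mmhg <= 0:
        raise ValueError("systolic blood pressure must be positive")
    return heart_rate_bpm / sbp_mmhg
```

Note that values exactly at a cut-off (for example, a heart rate of 100 bpm) are classified as normal, because every inequality is strict.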

Model development

Fig 1 shows the steps taken to derive the prognostic model. We screened 138 candidate variables with bivariate analyses of the primary outcome in the development dataset. We used Student’s t-test for continuous variables, Cochran-Armitage test for trend for ordinal variables, and the chi-square test for categorical variables. We chose a significance level of 5% or clinical importance as preliminary screening criteria for the full model testing and filtering of candidate variables for subsequent regression model testing. The rule for retaining variables was not simply p < 0.05. Rather, the decision whether to retain a variable was based on a combination of factors, including strength of association, prior research findings, and clinical importance as determined by investigators. Below, we outline the subsequent steps taken to optimize its clinical utility. Full descriptions of each step follow the outline.

Fig 1. Deriving the nine-variable prognostic model.


  1. We used a least absolute shrinkage and selection operator (LASSO) logistic regression model for variable selection [42].

  2. To further assess the predictor variables selected by the LASSO procedure, we included them in a standard logistic regression with the primary outcome as the response on the development data. We excluded predictor variables with p > 0.10 from further analysis.

  3. We ran a generalized linear mixed model (GLMM) on the development database, with the primary outcome as the response and the reduced set of predictor variables identified by the LASSO and standard logistic regression models. The GLMM included a random intercept term for the clinical site to adjust for intra-site clustering.

  4. To facilitate real-time clinical use by providers, we simplified the final 9-variable logistic regression model to a 9-variable points model that we named PE-SCORE.

Per the outline, we first used a LASSO logistic regression model for variable selection. LASSO is a type of penalized regression method that minimizes collinearity and avoids overfitting the model [43]. In addition, LASSO partitioned the development database such that two-thirds (67%) of data were used to train (or fit) the model, while 33% of the data were used for the first stage of internal validation of the model [44]. We selected the optimal level of penalization by using average squared error between responses and predictions in the internal validation data [44].

To further assess the predictor variables selected by the LASSO procedure, we included them in a standard logistic regression with the primary outcome as the response on the development data. We excluded predictor variables with p > 0.10 to create a more parsimonious model. Because of possible intra-site clustering, we considered the clinical research site to have a potentially important random effect in modeling the primary outcome. To assess site differences on the primary outcome and selected predictor variables, we used one-way analysis of variance for continuous variables, the Kruskal-Wallis test for ordinal variables, and the chi-square test for categorical variables. Informed by these findings, we ran a generalized linear mixed model (GLMM) on the development database, with the primary outcome as the response and the reduced set of predictor variables identified by the LASSO and standard logistic regression models. The GLMM included a random intercept term for the clinical research site to adjust for intra-site clustering. To determine the importance of the site effect in the model, we assessed its variance using a test based on the ratio of residual pseudo-likelihoods. We tested odds ratios of retained variables and used their confidence intervals (CIs) to determine significance as predictors of the primary composite outcome.

Presentation of prediction model

For the logistic regression, we reported coefficients for the variables in the final model, p-values, likelihood ratios, and odds ratios with confidence intervals. [The logistic regression equation is available for calculation of the probabilities.] Next, we assigned whole points and weights to the final variables of the tool, which were proportional to each variable’s odds ratio for the primary outcome. We developed the points tool, called Pulmonary Embolism Short-term Clinical Outcome Risk Estimator (PE-SCORE), for real-world usefulness to providers at the point of decision-making [2, 30, 45–47].
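To make the points tool concrete, the sketch below tallies a PE-SCORE from binary findings, using the point weights listed in the Abstract (one point for each of eight findings and two points for creatinine greater than 2.0 mg/dL). The dictionary keys and function name are hypothetical labels chosen for this illustration, and treating an unreported finding as absent is an assumption that mirrors how missing categorical data were handled in this study.

```python
# Point weights as listed in the Abstract; key names are hypothetical.
PE_SCORE_POINTS = {
    "abnormal_rv_echo": 1,
    "abnormal_rv_ct": 1,
    "sbp_under_100": 1,
    "dysrhythmia": 1,
    "systemic_infection": 1,       # suspected or confirmed
    "syncope": 1,
    "medico_social_admission_reason": 1,
    "abnormal_heart_rate": 1,      # < 50 or > 100 bpm
    "creatinine_over_2": 2,        # creatinine > 2.0 mg/dL
}


def pe_score(findings: dict) -> int:
    """Sum the points for every finding marked True.

    Findings not present in the dict are treated as absent (assumption).
    Unknown keys raise an error so that typos are not silently ignored.
    """
    unknown = set(findings) - set(PE_SCORE_POINTS)
    if unknown:
        raise KeyError(f"unknown findings: {sorted(unknown)}")
    return sum(points for name, points in PE_SCORE_POINTS.items()
               if findings.get(name, False))


example = {"syncope": True, "abnormal_rv_ct": True, "creatinine_over_2": True}
print(pe_score(example))  # 1 + 1 + 2 = 4
```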

External validation

We used the external validation database to test the PE-SCORE model for reproducibility of results and to measure performance of the model on an entirely different sample. Site investigators and data extractors were blinded to the selection of development and validation databases.

We reported descriptive statistics to determine similarities and differences between the databases and compared them with t-test and chi-square analyses for predictor variables. We ran the points model on the validation database.

Prognostic model performance

We reported on the prognostic performance of both the logistic model and the points model (PE-SCORE) on the development and validation databases. We measured and assessed sensitivity, specificity, and positive and negative predictive values for the primary outcome (yes/no) using two thresholds (low-risk and high-risk) for the PE-SCORE model. To report on discrimination, we used sensitivity, 1 minus specificity, and the receiver operating characteristic (ROC) curve to derive the area under the curve (AUC) and the area under the precision-recall curve (AUCpr), with 95% confidence intervals, along with F1 scores and curves for visualization. For calibration, we 1) reported the proportion of observed actual events versus predicted probabilities, and 2) assessed goodness of fit between individuals with and without the outcome of interest with the Spiegelhalter z test and its p-value [48]. We reported measurements of calibration slope for overestimation and underestimation of risk prediction and the intercept for calibration-in-the-large [49, 50]. We provided figures of calibration curves for visualization [50, 51]. We used the following interpretation guideline: a slope < 1.0 suggests estimated risks are exaggerated, whereas a slope > 1.0 suggests risks are underestimated. The calibration intercept was used for overall calibration-in-the-large. Using an optimal value of 0, negative values indicated overestimation, whereas positive values suggested underestimation.
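For readers less familiar with these measures, the sketch below computes sensitivity, specificity, and predictive values at a score threshold, plus the ROC AUC as the probability that a randomly chosen patient with the outcome has a higher score than one without it (ties counted as one half). It is a plain-Python illustration rather than the SAS procedures used in this study; the function names, the threshold, and the toy data are hypothetical.

```python
def confusion_metrics(scores, outcomes, threshold):
    """Treat score >= threshold as a positive prediction."""
    pairs = list(zip(scores, outcomes))
    tp = sum(1 for s, y in pairs if s >= threshold and y)
    fp = sum(1 for s, y in pairs if s >= threshold and not y)
    fn = sum(1 for s, y in pairs if s < threshold and y)
    tn = sum(1 for s, y in pairs if s < threshold and not y)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def roc_auc(scores, outcomes):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    pos = [s for s, y in zip(scores, outcomes) if y]
    neg = [s for s, y in zip(scores, outcomes) if not y]
    if not pos or not neg:
        raise ValueError("need at least one patient with and one without the outcome")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: six hypothetical patients, not study results.
toy_scores = [0, 1, 2, 6, 3, 0]
toy_outcomes = [0, 0, 1, 1, 0, 0]
print(confusion_metrics(toy_scores, toy_outcomes, threshold=2))
print(roc_auc(toy_scores, toy_outcomes))  # 0.875
```

The pairwise form is quadratic in the number of patients, which is acceptable for cohorts of this size; a rank-based implementation would scale better.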

To compare model performance, we compared the AUC of the full logistic model with that of the PE-SCORE in the development dataset. For this comparison, we used the method described by DeLong [52]. To compare the AUC of PE-SCORE in the development and validation databases, we used the chi-square test presented by Gonen [53].

Results

Participants

We enrolled 1008 patients into the development database, with 73 post-enrollment exclusions, leaving 935 records for analysis. We enrolled 815 patients in the validation database, with 14 post-enrollment exclusions, leaving 801 records for analysis. As shown in Table 1, patient characteristics in both databases were similar, as was the incidence of primary composite outcome and each of its components. Recurrence of VTE, major bleeding, and death within 30 days were higher in the development database.

Table 1. Descriptive statistics of development and validation databases.

Development (N = 935) Validation (N = 801)
Age
    Mean (SD) 60.3 (16.5) 58.5 (16.7)
    Median [Min, Max] 62.0 [18.0, 104] 60.0 [19.0, 101]
Age > 80 92 (9.8%) 69 (8.6%)
Initial Systolic Blood Pressure
    Mean (SD) 132 (24.8) 132 (23.7)
    Median [Min, Max] 133 [55.0, 223] 131 [60.0, 210]
    Missing 0 (0%) 4 (0.5%)
Systolic Blood Pressure < 100 mmHg 82 (8.8%) 54 (6.8%)
Initial Heart Rate (beats/min)
    Mean (SD) 98.8 (21.5) 97.0 (21.3)
    Median [Min, Max] 98.0 [35.0, 184] 96.0 [45.0, 182]
    Missing 1 (0.1%) 4 (0.5%)
Abnormal Heart Rate (< 50 or > 100 beats/min) 435 (46.6%) 335 (42.0%)
Shock Index Calculation
    Mean (SD) 0.776 (0.248) 0.764 (0.249)
    Median [Min, Max] 0.700 [0.300, 2.00] 0.700 [0.300, 2.50]
    Missing 1 (0.1%) 4 (0.5%)
Initial Respiratory Rate
    Mean (SD) 20.0 (4.66) 19.9 (4.55)
    Median [Min, Max] 18.0 [10.0, 48.0] 18.0 [11.0, 47.0]
    Missing 4 (0.4%) 4 (0.5%)
Initial Pulse Oximetry (%)
    Mean (SD) 95.4 (4.88) 95.6 (4.26)
    Median [Min, Max] 96.0 [37.0, 100] 96.0 [67.0, 100]
    Missing 3 (0.3%) 4 (0.5%)
Initial Temperature (F)
    Mean (SD) 98.1 (0.973) 98.3 (0.886)
    Median [Min, Max] 98.1 [89.0, 103] 98.2 [93.8, 103]
    Missing 20 (2.1%) 7 (0.9%)
Body Mass Index (kg/m²)
    Mean (SD) 31.2 (8.68) 31.4 (9.22)
    Median [Min, Max] 29.8 [14.1, 79.9] 30.0 [14.1, 87.6]
    Missing 18 (1.9%) 59 (7.4%)
Point-of-care BNP Level, pg/mL
    Mean (SD) 245 (487) 303 (704)
    Median [Min, Max] 71.0 [4.00, 4670] 76.0 [7.00, 8280]
    Missing 334 (35.7%) 256 (32.0%)
NT Pro BNP
    Mean (SD) 1710 (5960) 1030 (3000)
    Median [Min, Max] 159 [5.00, 70000] 109 [6.00, 32500]
    Missing 644 (68.9%) 577 (72.0%)
Point-of-care Troponin ng/L
    Mean (SD) 0.860 (21.8) 0.0928 (0.289)
    Median [Min, Max] 0.0300 [0, 648] 0.0200 [0, 4.19]
    Missing 52 (5.6%) 10 (1.2%)
Length of Stay, days
    Mean (SD) 4.7 (6.2) 4.8 (5.0)
    Missing 40 (4.3%) 27 (3.4%)
Hospital Length of Stay less than 24 hours (discharged) 137 (14.6%) 91 (11.3%)
Clinical Research Site
    Carolinas Medical Center 312 (33.4%) 409 (51.1%)
    San Diego 189 (20.2%) 66 (8.2%)
    Vanderbilt University Medical Center 134 (14.3%) 78 (9.7%)
    University of Utah 78 (8.3%) 85 (10.6%)
    Orlando Regional Medical Center 105 (11.2%) 89 (11.1%)
    Christiana Care 117 (12.5%) 74 (9.2%)
Gender
    Female 455 (48.7%) 392 (48.9%)
    Male 480 (51.3%) 409 (51.1%)
Race
    Black 253 (27.1%) 255 (31.8%)
    White 638 (68.2%) 497 (62.0%)
    American Indian/Alaskan Native 7 (0.7%) 6 (0.7%)
    Asian 10 (1.1%) 7 (0.9%)
    Pacific Islander/Native Hawaiian 2 (0.2%) 1 (0.1%)
    Unknown/Other 25 (2.7%) 35 (4.4%)
Ethnicity
    Hispanic or Latino 73 (7.8%) 45 (5.6%)
    Not Hispanic or Latino 825 (88.2%) 744 (92.9%)
    Unknown 36 (3.9%) 12 (1.5%)
    Missing 1 (0.1%) 0 (0%)
Preceding Episode Syncope
    Yes 92 (9.8%) 68 (8.5%)
Transient Hypotension
    Yes 68 (7.3%) 72 (9.0%)
Preceding Bradycardia
    Yes 16 (1.7%) 12 (1.5%)
Preceding Pulselessness
    Yes 12 (1.3%) 13 (1.6%)
Prior diagnosis of PE or DVT
    Yes 230 (24.6%) 205 (25.6%)
Family History of VTE
    Yes 59 (6.3%) 57 (7.1%)
Hormone Replacement Therapy
    Yes 28 (3.0%) 29 (3.6%)
    Missing 1 (0.1%) 0 (0%)
Recent Pregnancy
    Yes 8 (0.9%) 12 (1.5%)
    Missing 2 (0.2%) 0 (0%)
Creatinine > 2.0 mg/dL
    Yes 81 (8.7%) 44 (5.5%)
    Missing 0 (0%) 1 (0.1%)
Moderate or severe liver disease
    Yes 20 (2.1%) 20 (2.5%)
Clotting Disorders
    Yes 27 (2.9%) 24 (3.0%)
    Missing 0 (0%) 5 (0.6%)
Recent Hospitalization
    Yes 286 (30.6%) 300 (37.5%)
Recent Trauma
    Yes 66 (7.1%) 60 (7.5%)
    Missing 0 (0%) 2 (0.2%)
Indwelling Vascular Catheter
    Yes 56 (6.0%) 60 (7.5%)
    Missing 0 (0%) 3 (0.4%)
Chronic Obstructive Pulmonary Disease
    Yes 136 (14.5%) 121 (15.1%)
Any cancer
    Yes 230 (24.6%) 200 (25.0%)
Heart Failure
    Yes 55 (5.9%) 71 (8.9%)
    Missing 0 (0%) 1 (0.1%)
Hemiplegia
    Yes 22 (2.4%) 26 (3.2%)
Diabetes without end organ damage
    Yes 109 (11.7%) 114 (14.2%)
    Missing 0 (0%) 2 (0.2%)
Diabetes with end organ damage
    Yes 58 (6.2%) 43 (5.4%)
Suspected Hypovolemia
    Yes 44 (4.7%) 39 (4.9%)
Total Charlson Index
    0 397 (42.5%) 326 (40.7%)
    1 159 (17.0%) 140 (17.5%)
    2 132 (14.1%) 119 (14.9%)
    3 71 (7.6%) 55 (6.9%)
    4 23 (2.5%) 26 (3.2%)
    5 26 (2.8%) 19 (2.4%)
    6 57 (6.1%) 63 (7.9%)
    7 31 (3.3%) 17 (2.1%)
    8 20 (2.1%) 13 (1.6%)
    9 10 (1.1%) 8 (1.0%)
    10 4 (0.4%) 12 (1.5%)
    11 2 (0.2%) 2 (0.2%)
    12 2 (0.2%) 1 (0.1%)
    13 1 (0.1%) 0 (0%)
Other Medical or Social reason for treatment in Hospital >24 hours
    Yes 526 (56.3%) 310 (38.7%)
    Missing 5 (0.5%) 14 (1.7%)
Natriuretic peptide elevation
    Yes 349 (37.3%) 308 (38.5%)
    Missing 42 (4.5%) 31 (3.9%)
Troponin Elevation
    Yes 269 (28.8%) 185 (23.1%)
    Missing 12 (1.3%) 9 (1.1%)
CT RV:LV Ratio
    Yes 309 (33.0%) 249 (31.1%)
    Missing 17 (1.8%) 20 (2.5%)
Most Proximal Location thrombus on CTPA
    Saddle 106 (11.6%) 92 (11.6%)
    Proximal pulmonary artery 192 (21.0%) 102 (12.9%)
    Lobar 324 (35.5%) 281 (35.6%)
    Segmental 245 (26.8%) 256 (32.4%)
    Subsegmental 46 (5.0%) 59 (7.5%)
GDE Score
    0 604 (64.6%) 549 (68.5%)
    1 72 (7.7%) 62 (7.7%)
    2 122 (13.0%) 104 (13.0%)
    3 116 (12.4%) 59 (7.4%)
    Missing 21 (2.2%) 27 (3.4%)
Poor LV Function
    Yes 68 (7.3%) 63 (7.9%)
    Missing 19 (2.0%) 47 (5.9%)
GDE Showing abnlRV
    Yes 310 (33.2%) 225 (28.1%)
    Missing 21 (2.2%) 27 (3.4%)
If GDE >0, is it acute?
    Yes 278 (63.2%) 191 (65.0%)
    No 121 (27.5%) 69 (23.5%)
    Indeterminate 41 (9.3%) 34 (11.6%)
Low-risk sPESI
    Yes 314 (33.6%) 297 (37.1%)
Low-risk ESC
    Yes 77 (8.2%) 106 (13.2%)
Primary Composite Outcome
    Yes 209 (22.4%) 213 (26.7%)
Secondary Outcome
    Yes 331 (35.4%) 313 (39.1%)
    Missing 5 (0.5%) 0 (0%)
Death within 5 days
    Yes 24 (2.6%) 16 (2.0%)
Cardiac Arrest within 5 days
    Yes 15 (1.6%) 17 (2.1%)
Respiratory Failure within 5 days
    Yes 78 (8.3%) 71 (8.9%)
Dysrhythmia within 5 days
    Yes 60 (6.4%) 53 (6.6%)
Major Bleeding within 5 days
    Yes 23 (2.5%) 26 (3.2%)
Reperfusion intervention within 5 days
    Yes 60 (6.4%) 63 (7.9%)
Hypotension Pressors within 5 days
    Yes 46 (4.9%) 35 (4.4%)
Hypotension Fluid within 5 days
    Yes 72 (7.7%) 101 (12.6%)
Hypoxia within 5 days
    Yes 423 (45.2%) 364 (45.4%)
Recurrence of VTE within 30 days
    Yes 12 (1.3%) 10 (1.2%)
    Missing 13 (1.4%) 1 (0.1%)
Major Bleeding within 30 days
    Yes 30 (3.2%) 37 (4.6%)
    Missing 12 (1.3%) 1 (0.1%)
Death within 30 days
    Yes 68 (7.3%) 56 (7.0%)
    Missing 3 (0.3%) 0 (0%)
Active Bleeding
    Yes 0 (0%) 105 (13.1%)
    Missing 935 (100%) 14 (1.7%)
Anticoagulation Initiated
    Yes 868 (92.8%) 714 (89.1%)
    Missing 1 (0.1%) 12 (1.5%)

Abbreviations: BNP = brain natriuretic peptide; PE = pulmonary embolism; DVT = deep vein thrombosis; VTE = venous thromboembolism; ESC = European Society of Cardiology Pulmonary Embolism Management guidelines (2019)[15]; CT = computed tomography; LV = left ventricle; RV = right ventricle; GDE = goal-directed echocardiography; sPESI = simplified Pulmonary Embolism Severity Index

There was low missingness for candidate variables. The variable with the most frequent missing responses (marked as absent) was GDE score at 2.2% and 3.4% in the development and validation databases, respectively. GDE missingness, however, was expected. Our assessment showed the impact of missing GDE values on outcomes was minimal: the percentages of patients experiencing the primary outcome among those with negative, positive, and missing GDE responses for abnlRV were 14.4%, 38.4%, and 28.6%, respectively (S1 Table). Twenty-one (2.2%) patients in the development database were missing GDE. Most of these missing GDE scores were marked as inadequate for interpretation. Six of the 21 had a positive primary outcome. For the combined databases, 1706 GDE were performed by faculty (29.1%), fellows (9.7%), third-year emergency medicine (EM) residents (23.6%), second-year EM residents (29.1%), and first-year EM residents (15.1%). There were 48 patients (2.8%) without GDE scores: in 18 (37.5%), GDE was performed but was technically difficult and not interpretable, whereas in 30 (62.5%), GDE was not performed before ED discharge.

Enrollments were not evenly distributed for the six sites in both databases. The central site enrolled 33.4% of the development database and 51.1% of the validation database. The other sites enrolled 8.3%–20.2% and 8.2%–11.1% in the two databases, respectively.

Model development

S1 Table shows the main results of univariable analyses of candidate variables on the development database. Notably, any cancer (p = 0.987) and heart failure (p = 0.285) had non-significant p-values. Twenty-six of the 138 candidate variables vetted by univariable analyses had p-values below 0.05 and were retained for subsequent LASSO regression. We re-entered variables for chronic obstructive pulmonary disease (COPD), cancer, and oxygen saturation below 90% because these were variables in the validated sPESI and Hestia models. LASSO retained 13 variables; cancer was again not retained. We next ran a standard logistic regression with the 13 retained variables, nine of which had p < 0.10 in the logistic model and were retained for further analysis.

In the univariable comparisons of clinical research sites, we found statistically significant differences between sites for the variables shown in S2 Table (primary composite outcome, race, age, ethnicity, abnormal heart rate, creatinine greater than 2.0 mg/dL, abnormal RV by imaging, and medical/social reasons for hospitalization). Moreover, the random intercept term for the clinical research site was statistically significant (p < 0.01) in the GLMM. Accordingly, we retained ‘clinical research site’ as a random effect in the model. [Although clotting disorder was statistically significant, it was uncommon; thus, it was not included in the final prognostic model.] Table 2 shows the nine variables used in the final logistic regression equation.

Table 2. Final variables of logistic regression model.

Predictor | Adjusted odds ratio | Odds ratio 95% CI lower | Odds ratio 95% CI upper | Coefficient | Coefficient 95% CI lower | Coefficient 95% CI upper
Creatinine > 2.0 mg/dL | 5.37 | 2.49 | 11.58 | 1.68 | 0.911 | 2.45
Dysrhythmia | 4.00 | 2.07 | 7.73 | 1.39 | 0.730 | 2.04
Suspected/confirmed systemic infection | 3.47 | 1.64 | 7.37 | 1.24 | 0.491 | 2.00
Systolic blood pressure < 100 mmHg | 2.87 | 1.63 | 5.07 | 1.05 | 0.486 | 1.62
Abnormal heart rate | 2.26 | 1.52 | 3.35 | 0.813 | 0.418 | 1.21
Preceding episode of syncope | 1.97 | 1.15 | 3.38 | 0.680 | 0.141 | 1.22
Medical social reason for hospitalization | 1.91 | 1.21 | 3.03 | 0.649 | 0.190 | 1.11
Echocardiography RV abnormal | 1.81 | 1.12 | 2.91 | 0.592 | 0.115 | 1.07
CT RV:LV ratio elevated | 1.73 | 1.05 | 2.84 | 0.548 | 0.050 | 1.05
Intercept |  |  |  | -2.91 | -4.01 | -1.80

Abbreviations: CI = confidence interval; RV = right ventricle; LV = left ventricle; CT = computed tomography

Model specification

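The logistic regression equation to determine probability of the primary outcome is $P = \left[1 + \exp\left(-\left(\alpha_{RE} + \sum_{i=1}^{9} \beta_i x_i\right)\right)\right]^{-1}$, where $\alpha_{RE}$ is the fixed intercept (-2.91) and $\sum_{i=1}^{9} \beta_i x_i$ is the sum product of the nine fixed regression coefficients $\beta_i$ of the random effects model and the corresponding predictor values $x_i$.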
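As an illustration of the equation above, the following minimal Python sketch evaluates the probability from the Table 2 coefficients. It assumes each predictor is coded 1 when present and 0 when absent; the function name and dictionary keys are hypothetical labels chosen for this example, not identifiers from the study, and the site-level random effect is left out so that only the fixed intercept is used.

```python
import math

# Fixed-effect coefficients copied from Table 2 (log-odds scale).
# Key names are illustrative assumptions, not identifiers from the study.
COEFFICIENTS = {
    "creatinine_gt_2": 1.68,
    "dysrhythmia": 1.39,
    "systemic_infection": 1.24,
    "sbp_lt_100": 1.05,
    "abnormal_heart_rate": 0.813,
    "syncope": 0.680,
    "medical_social_reason": 0.649,
    "echo_rv_abnormal": 0.592,
    "ct_rv_lv_elevated": 0.548,
}
INTERCEPT = -2.91  # fixed intercept only; no site-specific random effect


def outcome_probability(findings):
    """Return P = 1 / (1 + exp(-(intercept + sum(beta_i * x_i)))).

    `findings` maps predictor names to True/False; names that are
    absent from the mapping are treated as not present (x_i = 0).
    """
    linear = INTERCEPT + sum(
        beta * (1 if findings.get(name, False) else 0)
        for name, beta in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


no_findings = outcome_probability({})
creatinine_only = outcome_probability({"creatinine_gt_2": True})
```

Under these assumptions, a patient with none of the nine findings has an estimated probability of roughly 0.05, and each additional finding shifts the log-odds by its coefficient.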

To convert the 9-variable logistic regression prognostic model into a simpler format for ease of use, we used the odds ratios shown in Table 3.

Table 3. Primary outcome probability for final model variables.
Final predictor variable | Adjusted odds ratio | Relative risk (development database) | Relative risk (validation database) | Points assigned
Creatinine > 2.0 mg/dL | 5.37 | 2.48 | 2.16 | 2
Dysrhythmia | 4.00 | 2.39 | 3.67 | 1
Suspected/confirmed systemic infection | 3.47 | 2.63 | 3.67 | 1
Systolic blood pressure < 100 mmHg | 2.87 | 2.65 | 2.85 | 1
Abnormal heart rate (<50 or >100 beats/min) | 2.26 | 2.17 | 1.67 | 1
Syncope | 1.97 | 2.00 | 2.25 | 1
Medical or social reason for hospitalization | 1.91 | 2.00 | 1.76 | 1
Echocardiography with abnormal RV | 1.81 | 2.67 | 3.16 | 1
CT RV:LV ratio elevated | 1.73 | 2.23 | 2.38 | 1
Total points: PE-SCORE score (minimum = 0; maximum = 10 points)

Abbreviations: CT = computed tomography; LV = left ventricle; RV = right ventricle

The odds ratios of most of the nine predictor variables were similar, and each was assigned 1 point, except for creatinine > 2.0 mg/dL, which was assigned 2 points on the basis of its adjusted odds ratio of 5.37. This odds ratio was more than double that of 5 variables in the model and about 34% higher than that of dysrhythmia, which had the second highest adjusted odds ratio (4.00). We recognized that by awarding 2 points for creatinine elevation, the range for our point system would be 0–10, which is standard for many similar scales. The weights assigned to each variable in the final PE-SCORE model are listed in Table 3.

Presentation of points prognostic model

S1 Fig illustrates a useful presentation and application platform of the points model. With PE-SCORE, a provider can list the 9 variables and whether the findings for each are present or not (yes or no). If creatinine greater than 2.0 mg/dL is present, 2 points are awarded. For the other 8 variables, 1 point is awarded if the condition is present. If any provider wants to use a finer scale, we have supplied the coefficients derived from the logistic regression model. With the logistic regression equation, a computer program would be required to calculate the probability of a positive primary outcome.
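The point assignment described above can be sketched as a short Python function. This is an illustrative sketch rather than the S1 Fig platform: the key names are hypothetical, and the label "intermediate" for scores of 1 to 4 is an assumption of this example (the study defines only a low-risk threshold at 0 points and a high-risk threshold at 5 points).

```python
# Points per finding, following Table 3. Key names are illustrative.
POINTS = {
    "creatinine_gt_2": 2,
    "dysrhythmia": 1,
    "systemic_infection": 1,
    "sbp_lt_100": 1,
    "abnormal_heart_rate": 1,
    "syncope": 1,
    "medical_social_reason": 1,
    "echo_rv_abnormal": 1,
    "ct_rv_lv_elevated": 1,
}


def pe_score(findings):
    """Sum the points for every finding marked present (True)."""
    unknown = set(findings) - set(POINTS)
    if unknown:
        raise ValueError(f"unrecognized findings: {sorted(unknown)}")
    return sum(
        points for name, points in POINTS.items() if findings.get(name, False)
    )


def risk_band(score):
    """Apply the study thresholds: 0 points is low risk, 5 or more is high risk.

    The "intermediate" label for 1-4 points is an assumption of this sketch.
    """
    if score == 0:
        return "low"
    if score >= 5:
        return "high"
    return "intermediate"


example = pe_score({"creatinine_gt_2": True, "syncope": True})
```

Keeping the weights in one table makes it straightforward to check that the total ranges from 0 to 10.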

External validation

Table 4 shows the actual versus predicted events of PE-SCORE on the validation database. Predicted events were derived from the logistic regression model estimations. At the low end of the risk estimation, actual events in the validation database were higher (8% compared to 2% in the development database). There were no deaths within 5 days for patients with PE-SCORE of zero. There was one death among patients with PE-SCORE of 4, but it was not considered PE-related (segmental PE with coexisting perforated intestinal ulcer and gastrointestinal bleeding). The patient did not have CT or GDE finding of RV abnormalities, although both troponin and BNP were elevated. In this case, the PE-SCORE was elevated (although the sPESI was zero) because of other medical conditions, a heart rate of 105 bpm, and creatinine greater than 2.0 mg/dL.

Table 4. Number of predicted and actual events in validation database.

PE-SCORE | % positive primary outcome (development) | % positive primary outcome (validation) | Predicted events (from frequency in development database) | Actual events
0 points | 2.05 | 8.11 | 3.79 | 15
1 point | 7.31 | 16.72 | 13.89 | 31
2 points | 18.59 | 23.72 | 29.00 | 37
3 points | 38.00 | 42.43 | 39.52 | 44
4 points | 35.58 | 58.57 | 24.21 | 41
5 points | 63.83 | 85.71 | 13.40 | 18
6+ points | 69.60 | 100 | 7.66 | 11
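One way to read the predicted-events column, assuming (as the Table 4 heading suggests) that it applies the development-database outcome proportion to the number of validation patients at the same score, is sketched below. The patient counts are not listed in Table 4; the 185 patients with a score of 0 are taken from Table 6 (15 + 170), and the 11 patients with 6 or more points follow from 11 events at 100%. The function name is hypothetical.

```python
def expected_events(n_patients, reference_proportion):
    """Expected event count if the reference outcome proportion applied."""
    if n_patients < 0 or not 0.0 <= reference_proportion <= 1.0:
        raise ValueError("need n_patients >= 0 and a proportion in [0, 1]")
    return n_patients * reference_proportion


# Score 0: 185 validation patients, 2.05% outcome in development.
zero_points = expected_events(185, 0.0205)
# Score 6+: 11 validation patients, 69.60% outcome in development.
six_plus = expected_events(11, 0.6960)
```

Under that assumption, these two rows reproduce the tabulated 3.79 and 7.66 predicted events after rounding.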

Prognostic model performance

All 9 components of the prognostic model were available for full scoring of PE-SCORE for 888 of 935 patients (95%) in the development database and 737 of 801 patients (92%) in the validation database. In the development database, the proportion with the primary composite outcome was 2% at the minimum score of zero and 69.6% among those with scores of 6 or higher. Outcome proportions rose with each added point, with one exception: 38.0% for a score of 3 versus 35.6% for a score of 4. In the validation dataset, the proportion with the primary outcome was 8% at the minimum score of 0 and 100% among those with scores of 6 or higher, and the discrepancy in the middle range was absent. Based on these results, we set a low-risk threshold for PE-SCORE at 0 points and a high-risk threshold at 5 points.

Table 5 shows performance of a) the logistic regression model on the development database, and b) the PE-SCORE model on the development and validation databases (AUC 0.78 and 0.77, respectively). Fig 2 shows AUC for the logistic regression model on the development database, followed by AUC for PE-SCORE on the development and validation databases. In the development dataset, DeLong’s test comparing the AUC of the full logistic model (0.83) with that of PE-SCORE (0.78) gave a p-value < 0.01, indicating a significant difference between the two ROC curves. Although the AUC of PE-SCORE was lower than that of the logistic regression model, the prognostic performance of both is in the good range. Next, the chi-square comparison of the ROC curves of PE-SCORE in the development dataset (AUC 0.78) versus the validation dataset (AUC 0.77) resulted in a p-value of 0.49, indicating no statistically significant difference between the two ROC curves. The prognostic performance of PE-SCORE was similar in the development and validation databases.
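For an integer score such as PE-SCORE, the AUC can be read as the probability that a randomly chosen patient with the outcome has a higher score than a randomly chosen patient without it, with ties counted as one half. The sketch below computes that quantity directly. It is not the software used in the study, it does not implement DeLong’s test, and the toy data are hypothetical values invented for this example rather than registry data.

```python
def auc_from_scores(scores, outcomes):
    """AUC via pairwise comparison of outcome-positive and outcome-negative
    patients (equivalent to the Mann-Whitney U statistic, ties count 0.5)."""
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one patient in each outcome group")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (scores and 0/1 outcomes), not from the registry.
toy_scores = [0, 1, 1, 2, 3, 4, 5, 6]
toy_outcomes = [0, 0, 1, 0, 1, 0, 1, 1]
toy_auc = auc_from_scores(toy_scores, toy_outcomes)
```

The pairwise form is quadratic in the number of patients, which is acceptable for a sketch; rank-based implementations are preferable for large datasets.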

Table 5. Discrimination and calibration metrics.

Discrimination columns: AUC and AUCpr. Calibration columns: Spiegelhalter z test, slope, intercept, and Hosmer-Lemeshow.
Model | AUC (95% CI), sensitivity vs 1 - specificity plot | AUCpr (95% CI), precision-recall curve | Spiegelhalter z test and p-value | Slope (95% CI) | Intercept (95% CI) | Hosmer-Lemeshow p-value
Logistic regression (development database) | 0.83 (0.80, 0.86) | 0.61 (0.57, 0.64) | 0.2933, 0.7693 | 1.029 (0.920, 1.138) | -0.006 (-0.040, 0.027) | 0.08
PE-SCORE (development database) | 0.78 (0.75, 0.82) | 0.50 (0.39, 0.60) | -0.071, 0.9431 | 0.966 (0.829, 1.102) | 0.008 (-0.031, 0.047) | 0.01
PE-SCORE (validation database) | 0.77 (0.73, 0.81) | 0.63 (0.43, 0.81) | 0.3283, 0.7427 | 1.006 (0.867, 1.146) | -0.002 (-0.049, 0.045) | 0.76

Abbreviations: AUC = area under the curve, CI = confidence interval

Fig 2. Area under receiver operating characteristic curve of: A) logistic regression model on the development database; B) PE-SCORE model on development database; and C) PE-SCORE model on validation database.

We report on the AUCpr due to the imbalance in outcomes on both development and validation databases [54]. Fig 3 shows AUCpr for logistic regression model on the development database, followed by the AUCpr for PE-SCORE on the development and validation databases.

Fig 3. Precision recall curves of: A) logistic regression model on the development database; B) PE-SCORE model on the development database; and C) PE-SCORE model on the validation database.

In Table 5, we provide four metrics for calibration of PE-SCORE on the development and validation datasets and for the logistic regression model on the development database. Fig 4 shows the calibration curves. Calibration was excellent on three of the four metrics: 1) the Spiegelhalter z test did not indicate lack of fit (p > 0.05); 2) calibration curve slope values were close to 1.0; and 3) linear regression intercept values were close to zero on both databases. Although the Hosmer-Lemeshow test suggested lack of fit (p < 0.1) for the full regression model and points model on the development database, those results were offset by the three calibration metrics that indicated excellent calibration [50]. Fig 5 shows the proportion experiencing the primary composite outcome (death or clinical deterioration event) at each total PE-SCORE, and the number of patients with the primary composite outcome alongside the number who died, in both databases.

Fig 4. Calibration curves of logistic regression on development and PE-SCORE model on both databases.

Fig 5. Proportions with primary outcome by calculated PE-SCORE.

Legend: Panels A and B show 2D stacked column charts stratified by the proportions of patients with primary composite outcome positive (lower column) and the outcome negative groups (upper column) for each PE-SCORE calculation in the development and validation databases. Panels C and D show column charts for the number of patients with primary outcome positive next to the number with death for each PE-SCORE calculation in the development and validation databases.

Prognostic values [positive predictive value (PPV) and negative predictive value (NPV)] are affected by prevalence of the outcome of interest. Of 935 patients in the development database, 888 (95%) had complete responses for all nine components of the PE-SCORE tool. Of 801 patients in the validation database, 737 (92%) had complete responses for all nine components of the PE-SCORE tool. GDE was the only variable missing; none of the other 8 variables used to calculate the PE-SCORE were missing. For patients missing GDE, a modified PE-SCORE that did not include GDE was calculated with a reduced potential range of 0–9 points. Their modified scores had an actual range of 0–4. The percentages of patients experiencing the primary outcome among those with modified PE-SCORE scores of 0, 1, 2, 3, and 4 were 16.7%, 16.7%, 50.0%, 50.0%, and 0%, respectively. In comparison, the percentages of patients without missing GDE who experienced the primary outcome were 2.1%, 7.3%, 18.6%, 38.0%, and 35.6% among those with a PE-SCORE of 0, 1, 2, 3, and 4, respectively. Except for a modified score of 4, these percentages were higher in each point category for the patients missing GDE than for the patients not missing GDE with the same score.

Table 6 shows traditional prognostic accuracy performance metrics for PE-SCORE (at two different risk thresholds) on the development and validation databases. We used a threshold of zero points for PE-SCORE to address low-risk stratification. A threshold of 5 for PE-SCORE indicates high-risk for clinical deterioration. At the lower-risk threshold, providers are most interested in the negative predictive value (NPV) of a prognostic model. We report on the model’s performance in low- versus high-risk stratification because the decisions made are quite different. Low-risk stratification increases consideration for immediate outpatient clinical management, whereas high-risk stratification increases the intensity of monitoring.

Table 6. Performance of PE-SCORE model at two risk thresholds on both databases.

Low-risk threshold: PE-SCORE cut-off = 0 points
Development database: Accuracy = 37.8% (34.6%–41.1%)
    Score+ (1–9 points): sensitivity 98.5% (95.2%–99.6%); PPV 26.0% (22.9%–29.3%)
    Score- (0 points): specificity 20.7% (17.7%–23.9%); NPV 97.9% (93.6%–99.5%)
    A = 193; B = 3; C = 549; D = 143
    Precision = A/(A+C) = 0.26; Recall = A/(A+B) = 0.9847; F1 = (Precision*Recall)/(Precision + Recall) = 0.2056
Validation database: Accuracy = 47.8% (44.1%–51.4%)
    Score+ (1–9 points): sensitivity 92.4% (87.5%–95.5%); PPV 33.0% (29.1%–37.1%)
    Score- (0 points): specificity 31.5% (27.6%–35.6%); NPV 91.9% (86.7%–95.2%)
    A = 182; B = 15; C = 370; D = 170
    Precision = A/(A+C) = 0.33; Recall = A/(A+B) = 0.92; F1 = (Precision*Recall)/(Precision + Recall) = 0.24
High-risk threshold: PE-SCORE cut-off = 5+ points
Development database: Accuracy = 80.4% (77.6%–83.0%)
    Score+ (5–9 points): sensitivity 23.5% (17.9%–30.1%); PPV 65.7% (53.3%–76.4%)
    Score- (0–4 points): specificity 96.5% (94.8%–97.7%); NPV 81.7% (78.8%–84.2%)
    A = 46; B = 147; C = 24; D = 671
    Precision = A/(A+C) = 0.65; Recall = A/(A+B) = 0.24; F1 = (Precision*Recall)/(Precision + Recall) = 0.17
Validation database: Accuracy = 76.8% (73.6%–79.8%)
    Score+ (5–9 points): sensitivity 14.7% (10.2%–20.6%); PPV 90.6% (73.8%–97.5%)
    Score- (0–4 points): specificity 99.4% (98.3%–99.9%); NPV 76.2% (72.8%–79.2%)
    A = 29; B = 168; C = 3; D = 537
    Precision = A/(A+C) = 0.91; Recall = A/(A+B) = 0.15; F1 = (Precision*Recall)/(Precision + Recall) = 0.13

Abbreviations for number of predicted events versus actual events: A = True positive, B = False negative (type II error), C = False positive (type I error), D = True negative.

Other abbreviations: PPV = positive predictive value; NPV = negative predictive value.
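The Table 6 metrics follow from the four cell counts. The sketch below recomputes them for the low-risk threshold on the development database, reading B as false negatives (consistent with Recall = A/(A+B)). The function name is hypothetical. Note one assumption that differs from the table: the code uses the conventional F1, the harmonic mean 2 × precision × recall / (precision + recall), which is twice the value given by the formula printed in Table 6.

```python
def two_by_two_metrics(a, b, c, d):
    """Metrics from a 2x2 table.

    a = true positives, b = false negatives,
    c = false positives, d = true negatives.
    """
    precision = a / (a + c)  # same quantity as PPV
    recall = a / (a + b)     # same quantity as sensitivity
    return {
        "sensitivity": recall,
        "specificity": d / (c + d),
        "ppv": precision,
        "npv": d / (b + d),
        "accuracy": (a + d) / (a + b + c + d),
        # Conventional F1 (harmonic mean of precision and recall).
        "f1": 2 * precision * recall / (precision + recall),
    }


# Low-risk threshold, development database (cell counts from Table 6).
dev_low = two_by_two_metrics(a=193, b=3, c=549, d=143)
for name, value in dev_low.items():
    print(f"{name}: {value:.3f}")
```

Because PPV and NPV depend on outcome prevalence, the same sensitivity and specificity would give different predictive values in a population with a different event rate.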

In the development and validation databases, low-risk PE-SCORE had sensitivity 98.5% (95.2%–99.6%) and 92.4% (87.5%–95.5%), specificity 20.7% (17.7%–23.9%) and 31.5% (27.6%–35.6%), PPV 26.0% (22.9%–29.3%) and 33.0% (29.1%–37.1%), and NPV 97.9% (93.6%–99.5%) and 91.9% (86.7%–95.2%), respectively. In addition, precision, recall, and F1 metrics are presented at both risk thresholds.

S3 Table shows the performance of sPESI and the 2019 version of the European Society of Cardiology (ESC) PE management guidelines at the low-risk threshold on the two databases [15]. The sPESI represents a prognostic model without RV assessment variables, and ESC represents an updated risk stratification model that combines a previously validated clinical prediction model (sPESI) with imaging RV assessment variables (using our definitions). We evaluated both against the primary composite outcome of death and pre-defined clinical deterioration outcomes within 5 days. We identified low-risk ESC criteria, which incorporated the low-risk sPESI threshold and absence of RV abnormalities (using our definitions). In the development and validation databases, low-risk sPESI had sensitivity 85.2% (79.6%–89.7%) and 80.4% (74.4%–85.5%), specificity 39.0% (35.4%–42.6%) and 43.4% (39.4%–47.6%), PPV 28.7% (27.0%–30.4%) and 34.1% (32.0%–36.3%), and NPV 90.1% (86.7%–92.8%) and 85.9% (82.0%–89.0%), respectively. In the development and validation databases, low-risk ESC had sensitivity 99.5% (97.4%–99.9%) and 97.7% (94.6%–99.2%), specificity 10.5% (8.3%–12.9%) and 17.2% (14.2%–20.5%), PPV 24.2% (23.8%–24.7%) and 30.1% (29.2%–31.0%), and NPV 98.7% (91.4%–99.8%) and 95.3% (89.3%–98.0%), respectively. Overall, both ESC and PE-SCORE models, which involved RV assessments, outperformed low-risk sPESI when focused on the primary composite outcomes that matter to point-of-care decision makers.

Discussion

We used prospective registry databases and developed and validated an original prognostic model from a field of 138 candidate variables. The registry involved contemporaneous and early assessments for PE provoked RV abnormalities with predefined laboratory and imaging assessments, and focused on outcomes of interest to providers at the point of decision-making and to pulmonary embolism response teams [5, 7, 13, 28]. The final variables in the prognostic model are readily available during ED evaluation, including interview questions, witnessed events (syncope), vital signs at presentation or on a cardiac monitor (heart rate and systolic blood pressure), past medical history, routine laboratory findings, and imaging. The two imaging variables were CT RV:LV ratio, which is determined from CT images, and goal directed echocardiography, which is performed at the patient’s bedside and provides multiple dynamic images of the RV.

In a meta-analysis of 71 prognostic model reports, 17 were original prognostic models like our study [4]. The other 54 reports were validating, updating, or investigating the impact of prognostic models. For the 17 original prognostic studies, the number of candidate variables ranged from seven to greater than 30. In five studies, the number of candidate variables was either unclear or not reported [16, 55–58]. Few studies included imaging findings as candidate variables: echocardiography finding of abnlRV (one study), RV:LV ratio (two studies), CT findings of RV abnormality (one study), and ultrasound for venous thrombosis (four studies) [1, 16, 57–61].

Most reports on prognostic models for acute PE focus on outcomes of death, recurrent VTE, and bleeding at a time point of 30 days or longer [4, 18, 62]. In contrast, our study focused on death or clinical deterioration within five days of PE diagnosis, as outcomes that are important to providers and researchers [5, 7, 13, 25].

Prognostic performance of the logistic regression and PE-SCORE models was strong for discrimination and calibration. The logistic regression model had an AUC of 0.83 in the development database. The user-friendly PE-SCORE points tool had AUCs of 0.78 and 0.77 in the development and validation databases, respectively. When decision-making priority is focused on patient candidacy for outpatient treatment, PE-SCORE set to a low-risk threshold has high negative predictive value. When the decision-making priority is determining whether increased intensity of monitoring or increased considerations for escalated PE treatment may be indicated, PE-SCORE set to a high-risk threshold has moderate accuracy.

Regression analyses provide plausible ranking of importance of RV imaging variables in PE risk stratification: GDE had greater odds ratio than CT. We used GDE instead of comprehensive echocardiography to visually detect PE-provoked abnormalities of RV size, pressure, and systolic function. To assign a GDE score of 1 or more, providers were required to detect RV dilatation (not severe RV systolic dysfunction or septal shift alone). The ordinal nature of GDE scoring was itself calibrated, showing increased odds of clinical deterioration as GDE scores increased.

The absence of certain variables from our final prognostic model deserves discussion. Troponin and natriuretic peptides are considered influential PE prognostic predictors in meta-analyses [18, 22, 63–65]. Although our study’s univariable analyses showed significant differences in both cardiac biomarkers between outcome groups, neither troponin nor natriuretic peptide elevation was retained after regression analyses. Our findings rank the predictive accuracy of laboratory RV assessments lower than imaging RV assessments in a restricted prognostic model. It is plausible that natriuretic peptide and troponin do not directly identify the cardiac chamber experiencing acute myocardial stretch and myocardial injury. In contrast, GDE directly identifies RV dilatation and abnormal RV systolic function. Age and cancer (predictors featured in models like PESI, sPESI, Hestia, and ESC) were not significant in univariable analyses or with penalized regression analysis. In the original PESI study, those aged > 65 years accounted for 52%–59% of the development and validation cohorts [2]. In our study, the proportion of patients aged 65 or older in both databases was lower at 39.3%. In the original PESI derivation and validation report, cancer was present in 19% and 16% of the databases, respectively. In our PE-SCORE study, cancer was present in 24.8% of PE patients, but did not reach statistical significance for prognosis of acute clinical deterioration.

The absence of advanced age or cancer as discrete variables in the PE-SCORE does not prevent these features from being considered at the provider’s discretion as social or medical reasons for hospitalization or an increased level of monitoring, which is an important component of PE-SCORE. Original clinical scores or guidelines, which were developed for outcomes of death, recurrent VTE, or major bleeding 30 days or later, tend to be pragmatically modified or adapted to consider other social/medical conditions or laboratory or imaging findings instead of being used in isolation during clinical practice [7, 13, 15, 66]. With PE-SCORE, variables for provider discretion on social or medical reasons for hospitalization for increased monitoring and RV imaging assessment are built in.

Unlike our findings, some studies found troponin and echocardiography findings of abnlRV did not have prognostic value in determining in-hospital adverse events [67]. Zondag et al. reported that although 35% classified as low-risk by Hestia criteria had coexisting RV abnormality by CTPA, there was no difference in outcomes compared to patients without abnlRV [68].

Our study has several limitations. Although the validation was performed on a different database with data collected during a different time period, external validation should be conducted at sites outside the current registry. Our study focused on clinical deterioration and early mortality due to PE severity. We did not assess outcomes due to PE treatment (e.g., bleeding, bleeding risk, compliance with treatment), which would influence disposition decisions and need for safety outcomes. The study setting was focused on emergency department patients and ambulatory care settings where the cadence and feasibility of testing may not be generalizable to patients developing acute PE while already in the inpatient setting. Already hospitalized patients who develop acute PE may have different risk factors or susceptibilities to PE-associated deterioration from those diagnosed in an outpatient setting.

Our a priori study design included using troponin measurements as continuous data; however, institutional change in troponin assay at the central site interrupted plans to perform linear regression on the troponin variable. Similarly, two of the six sites used NT proBNP, while others used point-of-care BNP assay measurements. Therefore, we used institutional assay cut-offs to create categorical variables (troponin and natriuretic peptide elevation). Univariable analyses showed significant differences in mean troponin, point-of-care BNP, and NT proBNP measurements between outcome groups in both databases. Valuable information, however, may have been lost by converting a continuous variable into a categorical variable [69].

Univariable analysis identified the clinical site itself as a variable of importance. The logistic regression model therefore has a random effects intercept for clinical sites. The random intercept cannot be used in a risk calculation on patients at sites outside of the six sites of this study, as the random effect of the new site is unknown. Thus, only the fixed intercept of the random effects model is used in the risk calculation. Model performance at a clinical site outside the six sites in this study may differ. Other discrete variables that may be of interest (e.g., median income, insurance status, other social determinants of health) were not included in this study. Despite significant differences in patient characteristics between sites, the prognostic model performed well on patients.

Another possible limitation of our report is that machine-based learning derivation techniques may offer better management of multiple variables (including those with interactions); however, our preliminary steps with classification tree analysis were not helpful. The logistic regression model we developed had an AUC of 0.83 (95% CI 0.80–0.86), whereas the PE-SCORE yielded an AUC of 0.78 on the development database. Although PE-SCORE had lower prognostic performance than the logistic regression model, PE-SCORE performed similarly by AUC on both databases and offers real-world usefulness at the site-of-care.

It is plausible that definitions of candidate variables may be modified or optimized in future updates or revisions of prognostic tools. For example, other CT-derived variables, such as contrast reflux or the pulmonary arterial occlusion index, may be considered as candidate variables for a prognostic model. We only used CT RV:LV ratio from the CT. Initial oxygen saturation by pulse oximetry and initial respiratory rate at presentation were not retained. Both clinical variables were measured with patients at rest. Because exertional shortness of breath is a common symptom of PE, oxygen saturation and respiratory rate (measured after fixed and defined exertion) may yield different results when developing a prognostic tool. Even retained variables, like initial heart rate, can be optimized by measuring heart rate after fixed exertion or by using highest heart rate within a fixed time interval as a candidate variable.

Our prognostic model included creatinine elevation (greater than 2.0 mg/dL) as a parameter of renal function. Other reports have identified acute renal injury as a prognostic factor [70, 71]. Acute renal injury was not included as a candidate variable in PESI/sPESI. In our study, we did not attempt to differentiate renal injury from renal failure. We did not use glomerular filtration rate to assess renal function, and we used a modest cut-off for creatinine level evaluation for provider use. It is possible that a different creatinine cut-off value or a different renal function parameter may offer better prognostic performance.

Another possible limitation is that we used an ordinal GDE score of visually estimated severe RV dilatation (absolute or relative to left ventricle) and severe RV systolic dysfunction. Use of echocardiographic measurements on two-dimensional modality or other echocardiographic modalities may increase risk stratification stringency or provide recommendations for optimal cut-offs.

Although this study was performed at academic centers, competency in GDE has been expected of those emerging from EM residency training for the past decade. Our results may indicate an opportunity to study the impact of incorporating GDE into PE risk stratification. Upon external validation, any real-world application of PE-SCORE would include a recommendation that technically difficult or uninterpretable GDE images limit full use of PE-SCORE. None of the other eight variables used to calculate PE-SCORE were missing during development. When faced with absent GDE scores, providers should use available clinical information, recognizing that the worst-case scenario (that GDE is abnormal) has not been ruled out. Providers may either add a point or consider the partial PE-SCORE a minimum score. The other real-world option is to consider comprehensive echocardiography (by cardiology service).
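The handling of an absent GDE described above can be sketched as a score range rather than a single value. The function below is a hypothetical illustration, not part of the published tool: it takes the points from the other eight variables (0 to 9) and returns the minimum and maximum possible PE-SCORE, treating a missing GDE as possibly abnormal.

```python
def pe_score_bounds(points_without_gde, gde_abnormal=None):
    """Return (minimum, maximum) PE-SCORE.

    points_without_gde: points from the eight non-GDE variables (0-9).
    gde_abnormal: True or False when GDE was interpretable, or None
    when GDE is missing or uninterpretable.
    """
    if not 0 <= points_without_gde <= 9:
        raise ValueError("points from the other eight variables must be 0-9")
    if gde_abnormal is None:
        # Partial score is the minimum; an abnormal GDE would add 1 point.
        return points_without_gde, points_without_gde + 1
    total = points_without_gde + (1 if gde_abnormal else 0)
    return total, total


missing_gde = pe_score_bounds(0)
abnormal_gde = pe_score_bounds(3, gde_abnormal=True)
```

Reporting both bounds makes explicit that a partial score of 0 without GDE does not establish the low-risk category.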

Most of the clinical outcomes were determined during hospitalization and may not have been recognized outside of the hospital setting. The study design did not directly assess the impact or safety of implementing the prognostic prediction or its PE-SCORE on provider decisions regarding disposition, level of monitoring needed, or escalation of treatment.

Potential benefits of PE-SCORE include early detection of deterioration and avoiding misclassification of patients who experience the outcome but would have been classified as low-risk by another prognostic tool. Potential harms may include unnecessary testing or interventions in those who did not experience any clinical deterioration outcomes despite higher risk classification, subjecting them to potential adverse events of the interventions, and increased lengths of stay and medical costs. After external validation, we anticipate use of the PE-SCORE tool in acute care settings with similar prevalence of early clinical deterioration to identify PE patients likely to benefit from early discharge and those who may need higher level monitoring and escalated PE interventions. However, incorporation of any new prognostic tool into clinical practice requires implementation and impact studies to better understand the clinical consequences [72].

Conclusions

We have summarized the development and validation of a new prognostic tool that uses readily available imaging findings from CT and GDE, vital signs, and interview information. A PE-SCORE of zero conferred a low probability, and a score of ≥ 6 predicted a high probability, of clinical deterioration/death within days of PE diagnosis. External validation may support use of this prognostic tool to inform decisions about early outpatient management versus the need for hospital-based monitoring and considerations for escalated PE interventions.

Supporting information

S1 Table. Univariable analysis of 138 candidate variables for primary outcome on development database.

(DOCX)

S2 Table. Comparison of clinical research sites in development database.

(DOCX)

S3 Table. Prognostic performance of sPESI and ESC at low-risk threshold.

(DOCX)

S1 Fig. Assignment of points to each of the nine variables in the PE-SCORE model.

(DOCX)

S1 Data

(CSV)

S2 Data

(CSV)

S3 Data

(CSV)

Acknowledgments

The authors thank Pilar Tochiki for central database acquisition and management, Melanie Hogg for central site research project management, Kelly Goonan for scientific writing assistance, and Michael Runyon for mentoring.

Data Availability

All relevant data are within the manuscript and its Supporting Information files.

Funding Statement

This project was supported by grant number R01HS025979 to AJW from the Agency for Healthcare Research and Quality (ahrq.gov). The content is solely the responsibility of the authors and does not necessarily represent the official views of the Agency for Healthcare Research and Quality. The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Jiménez D, Kopecna D, Tapson V, Briese B, Schreiber D, Lobo JL, et al. Derivation and validation of multimarker prognostication for normotensive patients with acute symptomatic pulmonary embolism. Am J Respir Crit Care Med. 2014 Mar 15;189(6):718–26. doi: 10.1164/rccm.201311-2040OC
  • 2. Aujesky D, Obrosky DS, Stone RA, Auble TE, Perrier A, Cornuz J, et al. Derivation and validation of a prognostic model for pulmonary embolism. Am J Respir Crit Care Med. 2005;172(8):1041–6. doi: 10.1164/rccm.200506-862OC
  • 3. Zondag W, Mos IC, Creemers-Schild D, Hoogerbrugge AD, Dekkers OM, Dolsma J, et al. Outpatient treatment in patients with acute pulmonary embolism: the Hestia Study. J Thromb Haemost. 2011 Aug;9(8):1500–7. doi: 10.1111/j.1538-7836.2011.04388.x
  • 4. Elias A, Mallett S, Daoud-Elias M, Poggi J-N, Clarke M. Prognostic models in acute pulmonary embolism: a systematic review and meta-analysis. BMJ Open. 2016 Apr 29;6(4):e010324. doi: 10.1136/bmjopen-2015-010324
  • 5. Kabrhel C, Sacco W, Liu S, Hariharan P. Outcomes considered most important by emergency physicians when determining disposition of patients with pulmonary embolism. Int J Emerg Med. 2010;3(4):239–64. doi: 10.1007/s12245-010-0206-8
  • 6. Kabrhel C, Okechukwu I, Hariharan P, Takayesu JK, MacMahon P, Haddad F, et al. Factors associated with clinical deterioration shortly after PE. Thorax. 2014 Sep;69(9):835–42. doi: 10.1136/thoraxjnl-2013-204762
  • 7. Vinson DR, Drenten CE, Huang J, Morley JE, Anderson ML, Reed ME, et al. Impact of relative contraindications to home management in emergency department patients with low-risk pulmonary embolism. Ann Am Thorac Soc. 2015 May;12(5):666–73. doi: 10.1513/AnnalsATS.201411-548OC
  • 8. Singer AJ, Thode HC Jr, Peacock WF 4th. Admission rates for emergency department patients with venous thromboembolism and estimation of the proportion of low risk pulmonary embolism patients: a US perspective. Clin Exp Emerg Med. 2016 Sep;3(3):126–31. doi: 10.15441/ceem.15.096
  • 9. Dentali F, Di Micco G, Giorgi Pierfranceschi M, Gussoni G, Barillari G, Amitrano M, et al. Rate and duration of hospitalization for deep vein thrombosis and pulmonary embolism in real-world clinical practice. Ann Med. 2015 Sep 30;47(7):546–54. doi: 10.3109/07853890.2015.1085127
  • 10. Jiménez D, de Miguel-Díez J, Guijarro R. Trends in the management and outcomes of acute pulmonary embolism: analysis from the RIETE registry. Journal of the American [Internet]. 2016; Available from: https://www.jacc.org/doi/abs/10.1016/j.jacc.2015.10.060
  • 11. Konstantinides SV. Trends in Pulmonary Embolism Outcomes: Are We Really Making Progress? J Am Coll Cardiol. 2016 Jan 19;67(2):171–3. doi: 10.1016/j.jacc.2015.10.062
  • 12. Mastroiacovo D, Dentali F, di Micco P, Maestre A, Jiménez D, Soler S, et al. Rate and duration of hospitalisation for acute pulmonary embolism in the real-world clinical practice of different countries: analysis from the RIETE registry [Internet]. Vol. 53, European Respiratory Journal. 2019. p. 1801677. Available from: doi: 10.1183/13993003.01677-2018
  • 13. Kabrhel C, Rosovsky R, Baugh C, Connors J, White B, Giordano N, et al. Multicenter Implementation of a Novel Management Protocol Increases the Outpatient Treatment of Pulmonary Embolism and Deep Vein Thrombosis [Internet]. Academic Emergency Medicine. 2018. Available from: doi: 10.1111/acem.13640
  • 14. Bledsoe JR, Woller SC, Stevens SM, Aston V, Patten R, Allen T, et al. Management of Low-Risk Pulmonary Embolism Patients Without Hospitalization: The Low-Risk Pulmonary Embolism Prospective Management Study. Chest. 2018 Aug;154(2):249–56. doi: 10.1016/j.chest.2018.01.035
  • 15. Konstantinides S, Meyer G, Becattini C. 2019 ESC Guidelines for the diagnosis and management of acute pulmonary embolism developed in collaboration with the European Respiratory Society (ERS …. Eur Heart J [Internet]. 2020; Available from: https://researchportal.helsinki.fi/en/publications/2019-esc-guidelines-for-the-dignosis-and-management-of-acute-pulm
  • 16. Bova C, Sanchez O, Prandoni P, Lankeit M, Konstantinides S, Vanni S, et al. Identification of intermediate-risk patients with acute symptomatic pulmonary embolism. Eur Respir J. 2014 Sep;44(3):694–703. doi: 10.1183/09031936.00006114
  • 17. Andrade I, Mehdipoor G, Le Mao R, García-Sánchez A, Pintado B, Pérez A, et al. Prognostic significance of computed tomography-assessed right ventricular enlargement in low-risk patients with pulmonary embolism: Systematic review and meta-analysis. Thromb Res. 2021 Jan 1;197:48–55. doi: 10.1016/j.thromres.2020.10.034
  • 18. Barco S, Mahmoudpour SH, Planquette B, Sanchez O, Konstantinides SV, Meyer G. Prognostic value of right ventricular dysfunction or elevated cardiac biomarkers in patients with low-risk pulmonary embolism: a systematic review and meta-analysis. Eur Heart J. 2019;40(11):902–10. doi: 10.1093/eurheartj/ehy873
  • 19. Lahm T, Douglas IS, Archer SL, Bogaard HJ, Chesler NC, Haddad F, et al. Assessment of Right Ventricular Function in the Research Setting: Knowledge Gaps and Pathways Forward. An Official American Thoracic Society Research Statement. Am J Respir Crit Care Med. 2018 Aug 15;198(4):e15–43. doi: 10.1164/rccm.201806-1160ST
  • 20. Huang SJ, Nalos M, Smith L, Rajamani A, McLean AS. The use of echocardiographic indices in defining and assessing right ventricular systolic function in critical care research. Intensive Care Med [Internet]. 2018; Available from: https://www.ncbi.nlm.nih.gov/pubmed/29789861 doi: 10.1007/s00134-018-5211-z
  • 21. Cho JH, Kutti Sridharan G, Kim SH, Kaw R, Abburi T, Irfan A, et al. Right ventricular dysfunction as an echocardiographic prognostic factor in hemodynamically stable patients with acute pulmonary embolism: a meta-analysis. BMC Cardiovasc Disord. 2014;14:64. doi: 10.1186/1471-2261-14-64
  • 22. Sanchez O, Trinquart L, Colombet I, Durieux P, Huisman MV, Chatellier G, et al. Prognostic value of right ventricular dysfunction in patients with haemodynamically stable pulmonary embolism: a systematic review. Eur Heart J. 2008 Jun;29(12):1569–77. doi: 10.1093/eurheartj/ehn208
  • 23. Collins GS, Reitsma JB, Altman DG, Moons KGM. Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD): the TRIPOD Statement. Br J Surg. 2015 Feb;102(3):148–58. doi: 10.1002/bjs.9736
  • 24. Green SM, Schriger DL, Yealy DM. Methodologic standards for interpreting clinical decision rules in emergency medicine: 2014 update. Ann Emerg Med. 2014 Sep;64(3):286–91. doi: 10.1016/j.annemergmed.2014.01.016
  • 25. Meyer G, Vicaut E, Danays T, Agnelli G, Becattini C, Beyer-Westendorf J, et al. Fibrinolysis for patients with intermediate-risk pulmonary embolism. N Engl J Med. 2014;370(15):1402–11. doi: 10.1056/NEJMoa1302097
  • 26. Hariharan P, Takayesu JK, Kabrhel C. Association between the Pulmonary Embolism Severity Index (PESI) and short-term clinical deterioration. Thromb Haemost. 2011 Apr;105(4):706–11. doi: 10.1160/TH10-09-0577
  • 27. Weekes AJ, Johnson AK, Troha D, Thacker G, Chanler-Berat J, Runyon M. Prognostic Value of Right Ventricular Dysfunction Markers for Serious Adverse Events in Acute Normotensive Pulmonary Embolism. J Emerg Med. 2017 Feb;52(2):137–50. doi: 10.1016/j.jemermed.2016.09.002
  • 28. Rosovsky R, Zhao K, Sista A, Rivera‐Lebron B, Kabrhel C. Pulmonary embolism response teams: Purpose, evidence for efficacy, and future research directions [Internet]. Vol. 3, Research and Practice in Thrombosis and Haemostasis. 2019. p. 315–30. Available from: doi: 10.1002/rth2.12216
  • 29. Pollack CV, Schreiber D, Goldhaber SZ, Slattery D, Fanikos J, O’Neil BJ, et al. Clinical characteristics, management, and outcomes of patients diagnosed with acute pulmonary embolism in the emergency department: initial report of EMPEROR (Multicenter Emergency Medicine Pulmonary Embolism in the Real World Registry). J Am Coll Cardiol. 2011;57(6):700–6. doi: 10.1016/j.jacc.2010.05.071
  • 30. Jimenez D, Aujesky D, Moores L, Gomez V, Lobo JL, Uresandi F, et al. Simplification of the pulmonary embolism severity index for prognostication in patients with acute symptomatic pulmonary embolism. Arch Intern Med. 2010;170(15):1383–9. doi: 10.1001/archinternmed.2010.199
  • 31. Barco S, Ende-Verhaar YM, Becattini C, Jimenez D, Lankeit M, Huisman MV, et al. Differential impact of syncope on the prognosis of patients with acute pulmonary embolism: a systematic review and meta-analysis. Eur Heart J. 2018;39(47):4186–95. doi: 10.1093/eurheartj/ehy631
  • 32. Vinson DR, Engelhart DC, Bahl D, Othieno AA, Abraham AS, Huang J, et al. Presyncope Is Associated with Intensive Care Unit Admission in Emergency Department Patients with Acute Pulmonary Embolism. West J Emerg Med. 2020;21(3):703–13. doi: 10.5811/westjem.2020.2.45028
  • 33. Konstantinides S, Geibel A, Olschewski M, Heinrich F, Grosser K, Rauber K, et al. Association between thrombolytic treatment and the prognosis of hemodynamically stable patients with major pulmonary embolism: results of a multicenter registry. Circulation. 1997;96(3):882–8. doi: 10.1161/01.cir.96.3.882
  • 34. Duplyakov D, Kurakina E, Pavlova T, Khokhlunov S, Surkova E. Value of syncope in patients with high-to-intermediate risk pulmonary artery embolism. Eur Heart J Acute Cardiovasc Care. 2015 Aug;4(4):353–8. doi: 10.1177/2048872614527837
  • 35. Jaff MR, McMurtry MS, Archer SL, Cushman M, Goldenberg N, Goldhaber SZ, et al. Management of massive and submassive pulmonary embolism, iliofemoral deep vein thrombosis, and chronic thromboembolic pulmonary hypertension: a scientific statement from the American Heart Association. Circulation. 2011 Apr 26;123(16):1788–830. doi: 10.1161/CIR.0b013e318214914f
  • 36. Weekes AJ, Thacker G, Troha D, Johnson AK, Chanler-Berat J, Norton HJ, et al. Diagnostic Accuracy of Right Ventricular Dysfunction Markers in Normotensive Emergency Department Patients With Acute Pulmonary Embolism. Ann Emerg Med [Internet]. 2016; Available from: doi: 10.1016/j.annemergmed.2016.01.027
  • 37. Weekes AJ, Oh L, Thacker G, Johnson AK, Runyon M, Rose G, et al. Interobserver and Intraobserver Agreement on Qualitative Assessments of Right Ventricular Dysfunction with Echocardiography in Patients with Pulmonary Embolism. J Ultrasound Med. 2016 Oct;35(10):2113–20. doi: 10.7863/ultra.15.11007
  • 38. Lang RM, Badano LP, Mor-Avi V, Afilalo J, Armstrong A, Ernande L, et al. Recommendations for cardiac chamber quantification by echocardiography in adults: an update from the American Society of Echocardiography and the European Association of Cardiovascular Imaging. Eur Heart J Cardiovasc Imaging. 2015 Mar;16(3):233–70. doi: 10.1093/ehjci/jev014
  • 39. Peduzzi P, Concato J, Kemper E, Holford TR, Feinstein AR. A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol. 1996 Dec;49(12):1373–9. doi: 10.1016/s0895-4356(96)00236-3
  • 40. Vanni S, Nazerian P, Pepe G, Baioni M, Risso M, Grifoni G, et al. Comparison of two prognostic models for acute pulmonary embolism: clinical vs. right ventricular dysfunction-guided approach. J Thromb Haemost. 2011 Oct;9(10):1916–23. doi: 10.1111/j.1538-7836.2011.04459.x
  • 41. Schafer JL, Graham JW. Missing data: our view of the state of the art. Psychol Methods. 2002 Jun;7(2):147–77.
  • 42. Yuan M, Lin Y. Model selection and estimation in regression with grouped variables. J R Stat Soc Series B Stat Methodol. 2006 Feb;68(1):49–67.
  • 43. Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Series B Stat Methodol [Internet]. 1996; Available from: https://rss.onlinelibrary.wiley.com/doi/abs/10.1111/j.2517-6161.1996.tb02080.x
  • 44. SAS Institute Inc. SAS/STAT® 14.2 User’s Guide: High-Performance Procedures [Internet]. SAS Institute Inc.; 2016. Available from: https://support.sas.com/documentation/onlinedoc/stat/142/stathpug.pdf
  • 45. Stiell IG, Wells GA, Vandemheen KL, Clement CM, Lesiuk H, De Maio VJ, et al. The Canadian C-spine rule for radiography in alert and stable trauma patients. JAMA. 2001 Oct 17;286(15):1841–8. doi: 10.1001/jama.286.15.1841
  • 46. Stiell IG, Greenberg GH, McKnight RD, Nair RC, McDowell I, Reardon M, et al. Decision rules for the use of radiography in acute ankle injuries. Refinement and prospective validation. JAMA. 1993 Mar 3;269(9):1127–32. doi: 10.1001/jama.269.9.1127
  • 47. Kline JA, Mitchell AM, Kabrhel C, Richman PB, Courtney DM. Clinical criteria to prevent unnecessary diagnostic testing in emergency department patients with suspected pulmonary embolism. J Thromb Haemost. 2004 Aug;2(8):1247–55. doi: 10.1111/j.1538-7836.2004.00790.x
  • 48. Spiegelhalter DJ. Probabilistic prediction in patient management and clinical trials. Stat Med. 1986 Sep;5(5):421–33. doi: 10.1002/sim.4780050506
  • 49. Walsh CG, Sharman K, Hripcsak G. Beyond discrimination: A comparison of calibration methods and clinical usefulness of predictive models of readmission risk. J Biomed Inform. 2017 Dec;76:9–18. doi: 10.1016/j.jbi.2017.10.008
  • 50. Van Calster B, McLernon DJ, van Smeden M, Wynants L, Steyerberg EW, Topic Group “Evaluating diagnostic tests and prediction models” of the STRATOS initiative. Calibration: the Achilles heel of predictive analytics. BMC Med. 2019 Dec 16;17(1):230. doi: 10.1186/s12916-019-1466-7
  • 51. Steyerberg EW, Vergouwe Y. Towards better clinical prediction models: seven steps for development and an ABCD for validation. Eur Heart J. 2014 Aug 1;35(29):1925–31. doi: 10.1093/eurheartj/ehu207
  • 52. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988 Sep;44(3):837–45.
  • 53. Gonen M. Analyzing receiver operating characteristic curves with SAS. 2007; Available from: https://dl.acm.org/doi/abs/10.5555/1554830
  • 54. Saito T, Rehmsmeier M. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS One. 2015 Mar 4;10(3):e0118432. doi: 10.1371/journal.pone.0118432
  • 55. Kostrubiec M, Pruszczyk P, Bochowicz A, Pacho R, Szulc M, Kaczynska A, et al. Biomarker-based risk assessment model in acute pulmonary embolism. Eur Heart J. 2005 Oct;26(20):2166–72. doi: 10.1093/eurheartj/ehi336
  • 56. Lankeit M, Friesen D, Schäfer K, Hasenfuß G, Konstantinides S, Dellas C. A simple score for rapid risk assessment of non-high-risk pulmonary embolism. Clin Res Cardiol. 2013 Jan;102(1):73–80. doi: 10.1007/s00392-012-0498-1
  • 57. Jo JY, Lee MY, Lee JW, Rho BH, Choi W-I. Leukocytes and systemic inflammatory response syndrome as prognostic factors in pulmonary embolism patients. BMC Pulm Med. 2013 Dec 10;13:74. doi: 10.1186/1471-2466-13-74
  • 58. Zhu L, Wang C, Yang Y-H, Wu Y-F, Zhai Z-G. Prognostic value of right ventricular dysfunction and derivation of a prognostic model for patients with acute pulmonary thromboembolism. Zhonghua Liu Xing Bing Xue Za Zhi. 2009 Feb;30(2):184–8.
  • 59. Sanchez O, Trinquart L, Caille V, Couturaud F, Pacouret G, Meneveau N, et al. Prognostic factors for pulmonary embolism: the prep study, a prospective multicenter cohort study. Am J Respir Crit Care Med. 2010 Jan 15;181(2):168–73. doi: 10.1164/rccm.200906-0970OC
  • 60. Wicki J, Perrier A, Perneger TV, Bounameaux H, Junod AF. Predicting adverse outcome in patients with acute pulmonary embolism: a risk score. Thromb Haemost. 2000 Oct;84(4):548–52.
  • 61. Yamaki T, Nozaki M, Sakurai H, Takeuchi M, Soejima K, Kono T. Presence of lower limb deep vein thrombosis and prognosis in patients with symptomatic pulmonary embolism: preliminary report. Eur J Vasc Endovasc Surg. 2009 Feb;37(2):225–31. doi: 10.1016/j.ejvs.2008.08.018
  • 62. Maughan BC, Frueh L, McDonagh MS, Casciere B, Kline JA. Outpatient Treatment of Low-risk Pulmonary Embolism in the Era of Direct Oral Anticoagulants: A Systematic Review. Acad Emerg Med. 2021 Feb;28(2):226–39. doi: 10.1111/acem.14108
  • 63. Klok FA, Mos IC, Huisman MV. Brain-type natriuretic peptide levels in the prediction of adverse outcome in patients with pulmonary embolism: a systematic review and meta-analysis. Am J Respir Crit Care Med. 2008;178(4):425–30. doi: 10.1164/rccm.200803-459OC
  • 64. Becattini C, Vedovati MC, Agnelli G. Prognostic value of troponins in acute pulmonary embolism: a meta-analysis. Circulation. 2007 Jul 24;116(4):427–33. doi: 10.1161/CIRCULATIONAHA.106.680421
  • 65. Lankeit M, Gomez V, Wagner C, Aujesky D, Recio M, Briongos S, et al. A strategy combining imaging and laboratory biomarkers in comparison with a simplified clinical score for risk stratification of patients with acute pulmonary embolism. Chest. 2012 Apr;141(4):916–22. doi: 10.1378/chest.11-1355
  • 66. Roy P-M, Penaloza A, Hugli O, Klok FA, Arnoux A, Elias A, et al. Triaging acute pulmonary embolism for home treatment by Hestia or simplified PESI criteria: the HOME-PE randomized trial. Eur Heart J. 2021 Aug 31;42(33):3146–57. doi: 10.1093/eurheartj/ehab373
  • 67. Bova C, Pesavento R, Marchiori A, Palla A, Enea I, Pengo V, et al. Risk stratification and outcomes in hemodynamically stable patients with acute pulmonary embolism: a prospective, multicentre, cohort study with three months of follow-up. J Thromb Haemost. 2009 Jun;7(6):938–44. doi: 10.1111/j.1538-7836.2009.03345.x
  • 68. Zondag W, Vingerhoets LM, Durian MF, Dolsma A, Faber LM, Hiddinga BI, et al. Hestia criteria can safely select patients with pulmonary embolism for outpatient treatment irrespective of right ventricular function. J Thromb Haemost. 2013 Apr;11(4):686–92. doi: 10.1111/jth.12146
  • 69. Royston P, Altman DG, Sauerbrei W. Dichotomizing continuous predictors in multiple regression: a bad idea. Stat Med. 2006 Jan 15;25(1):127–41. doi: 10.1002/sim.2331
  • 70. Murgier M, Bertoletti L, Darmon M, Zeni F, Valle R, Del Toro J, et al. Frequency and prognostic impact of acute kidney injury in patients with acute pulmonary embolism. Data from the RIETE registry. Int J Cardiol. 2019 Sep 15;291:121–6. doi: 10.1016/j.ijcard.2019.04.083
  • 71. Chopard R, Jimenez D, Serzian G, Ecarnot F, Falvo N, Kalbacher E, et al. Renal dysfunction improves risk stratification and may call for a change in the management of intermediate- and high-risk acute pulmonary embolism: results from a multicenter cohort study with external validation. Crit Care. 2021 Feb 9;25(1):57. doi: 10.1186/s13054-021-03458-z
  • 72. Vickers AJ, Van Calster B, Steyerberg EW. Net benefit approaches to the evaluation of prediction models, molecular markers, and diagnostic tests. BMJ. 2016 Jan 25;352:i6. doi: 10.1136/bmj.i6

Decision Letter 0

Robert Ehrman

4 Aug 2021

PONE-D-21-19015

Development and validation of a prognostic tool: pulmonary embolism short-term clinical outcomes risk estimation (PE-SCORE)

PLOS ONE

Dear Dr. Weekes,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

==============================

I applaud the authors for undertaking this large and important task: a tool to help stratify patients for short-term adverse outcomes in PE is sorely needed, as is incorporation of an acute measure of RV function on echocardiography.

In addition to the comments by the reviewers, please consider the following when preparing a revision.

-It appears that acute and chronic RV dysfunction were treated equally, but I am not certain this is appropriate, since the goal of the paper is to include PE-related RV dysfunction (RVD). For example, patients with RVD from pre-existing post-capillary pHTN may have outcomes different from those with pre-existing RVD from pre-capillary pHTN, which may in turn differ from those with acute-onset pHTN from a PE: is it the RVD that leads to the bad outcome, or is it the underlying lung or heart disease? Compounding this is the fact that size and location of PE are not reported, which potentially further distorts the relationship between the PE, RV function, and outcomes (particularly since PE severity is emphasized in the discussion). I would be interested in seeing an analysis of only those with acute RVD, as I suspect that patients with chronic RVD have a greater likelihood of poor outcomes regardless of PE size.

-I wonder about the extent to which the presence of sepsis accounts for some of the adverse outcomes as opposed to the PE?

-Lines 467-469: This does not really make sense to me, as the putative pathway for right heart dysfunction leading to adverse outcomes is via reduced left heart function from reduced right-sided output. In addition, I think this statement needs a reference. Also, can you include the number of patients that had troponin and BNP levels drawn?

-While the NPV point estimate is quite high, the lower end of the CI may be too low for some providers to feel safe discharging these patients. Can you discuss this?

-Was there any measurement of inter-rater reliability or central adjudication of RV findings on echo?

-Feasibility of GDE is really outside the scope of this paper and I would suggest removing this section from the discussion.

-Overall, the statistical methods are quite robust. I would consider, however, moving some of the technical details to a supplement, as I suspect that many readers who will find this paper most interesting will be clinicians for whom the statistical details might be overwhelming or distracting.

==============================

Please submit your revised manuscript by Sep 18 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Robert Ehrman, MD, MS

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Interesting paper. There are some issues, although the methodological aspects are very strong and accurate.

1) Deterioration should be better defined in abstract and in paper

2) Abstract: CI for AUC should be added

3) methods: Authors stated that they performed sensitivity analysis for each candidate. This should be better specified

4) Data cleaning is not clear

Reviewer #2: Overall

This paper represents a valiant effort to create a new discriminating decision-making tool for patients with pulmonary embolism including RV echo findings as a variable. There is a need for PE management decision-making tools that consider this variable. Overall, though, this tool does not appear to reliably perform at the high level of some of the existing decision-making tools, and therefore its utility in practice is at this time limited. Perhaps validation in a larger cohort of patients outside of the original participating institutions would strengthen the argument for the utility of this decision-making tool, but as it stands this tool’s performance appears mediocre. The concept of a risk-stratifying PE tool is not novel, as there are several other well-validated decision-making tools in use for determining low- vs high-risk PE. PESI, HESTIA, ESC and sPESI consistently have very good, reliably reproducible negative predictive value in multiple validation cohorts and do not require users to have any ultrasound training.

While it is unique (and likely important) to include POCUS RV abnormalities as a variable in the tool, and there is some literature that supports including this variable in the determination of low- vs high-risk PE (https://academic.oup.com/eurheartj/article/40/11/902/5263773), inclusion of this variable further limits the generalizability of this tool, as there are still a great number of EM-trained physicians and midlevel providers who are not POCUS trained or are only minimally POCUS trained. Were this tool to demonstrate exceptional discrimination with the addition of the RV echo findings variable, there would be an argument for using it; however, it underperforms and is more complex to use than existing tools.

As it stands, this paper represents a good idea that needs further reliably reproducible results (particularly in terms of its negative predictive value for low risk PE) before it would be a novel and worthwhile contribution to the current body of evidence of risk stratifying patients with PE. The result differences between the database and validation groups in some key areas (differences in false negatives between the database and validation group and the underpredicting of the tool for the outcome of interest at all scores 0-9) make this tool at face value seem to perform unreliably. Where it may be somewhat novel (but still requires further validation) is as a risk stratification tool for short term outcomes specifically for patients with ESRD and a PE.

Abstract

Line 17: This statement is particularly misleading since the validation group had an 8% proportion of the primary outcome in the zero score group.

Intro

Well written. Identifies the need for a decision tool that considers RV echo findings. Supports the need for a decision tool that can assess risk of short term deterioration and not risk of 30 day deterioration.

Methods

Enrollment criteria appropriate.

This study is aided by the diversity of its multiple institutions.

Line 110-112, 121-124: Using the need for volume expansion or pressors for documented hypotension as a marker of deterioration from PE may be confounded by the fact that your score includes septic patients. Worsening sepsis may cause the same deterioration which in the model of this study may be incorrectly attributed to the PE when really the hypotension is from the infection.

Methods for selection of variables seems appropriate.

Line 169-177: This change is unfortunate and as you cover in your discussion, may be a confounder.

Sample size calculations are appropriate.

Blinding to the development of different databases is a strength.

Line 232: Explanation of LASSO is appreciated and helpful to the reader.

Overall, this section is too long and in places too complex. As written, some of it is difficult to understand due to the use of some statistics that are not frequently encountered (e.g., Line 283). It may benefit from a simpler explanation of why these tests were utilized. This comes up again in the data section with the use of F1 and Hosmer-Lemeshow. The statistics chosen do seem appropriate for the data set and aim of the study.

Results

Hosmer-Lemeshow for the regression model development database and particularly the validation database indicates poor fit.

F1 score for the high-risk pool seems poor given that this group should be skewed to see more subjects with the primary outcome (I’d assume individuals with a high risk score should experience an increased number of poor outcomes).
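For readers less familiar with the F1 score referred to above, a minimal Python sketch of how it is derived from a confusion matrix may help. The counts below are hypothetical and chosen only for illustration; they are not taken from the manuscript, and the function name `f1_score` is an assumption of this sketch.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1 score: the harmonic mean of precision (PPV) and recall (sensitivity)."""
    if tp == 0:
        # With no true positives, precision and recall are zero (or undefined), so report 0.
        return 0.0
    precision = tp / (tp + fp)  # share of predicted positives that had the outcome
    recall = tp / (tp + fn)     # share of actual outcomes that were predicted positive
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for a high-risk threshold (assumption, not study data).
tp, fp, fn = 20, 5, 60
print(f"precision (PPV) = {tp / (tp + fp):.2f}")
print(f"recall          = {tp / (tp + fn):.2f}")
print(f"F1              = {f1_score(tp, fp, fn):.2f}")
```

Because F1 is a harmonic mean, it is pulled toward the lower of precision and recall. A high threshold that flags few patients can therefore show a high PPV and still have a low F1.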

The confidence interval for the PPV of the high-risk group is quite wide, so while the reported PPV looks good, the certainty that it is actually close to 90% is not great (according to that confidence interval it could be less than 80%, at which point how much utility does this test have?).

It appears the tool significantly underpredicted the primary outcome at every score level in the validation group, particularly in the categorized low risk groups (0-4), which is concerning if this tool is being advertised to discriminate between patients that have a high likelihood of deterioration and require admission vs low likelihood of deterioration and suitability for outpatient management (unless you plan on advertising this tool as dichotomous (less than or equal to zero is low risk, anything else isn’t)).

While the NPV in the low-risk database group with the cut-off set to “0” looks great (NPV 97.9% (93.6-99.5%)), it underperforms in the validation group (91.9%). Some of the existing PE tools consistently perform at an NPV >97% over multiple validation studies. The PPV of the high-risk PE threshold group appears to suffer similarly, in that it performs well in the validation group but suffers in the database group. Your attached supplement shows sPESI underperforming as compared to your tool, but sPESI was not validated for use in patients with severe renal disease, which your database includes, so it is a somewhat misleading comparison. ESC appears to outperform your tool in both the low-risk database and validation cohorts.
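The dependence of NPV (or PPV) certainty on group size can be sketched in Python as follows. This is a minimal illustration using the Wilson score interval, which is an assumption of this sketch; the interval method used in the manuscript is not stated in this review. All counts are hypothetical and are not study data.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def npv(true_neg: int, false_neg: int) -> float:
    """Negative predictive value: share of predicted-low-risk patients without the outcome."""
    return true_neg / (true_neg + false_neg)


# Hypothetical low-risk groups with the same NPV but different sizes (assumption).
for tn, fn in [(19, 1), (190, 10)]:
    lo, hi = wilson_interval(tn, tn + fn)
    print(f"n = {tn + fn:3d}  NPV = {npv(tn, fn):.3f}  95% CI = ({lo:.3f}, {hi:.3f})")
```

The same point estimate carries a much wider interval in the smaller group, so a threshold applied to few patients can look good while still being compatible with a clinically unacceptable miss rate.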

Line 408-414: Though several tests indicate good calibration, your data show that this tool underpredicted the primary outcome at every score level.

Discussion

While the AUC values for this tool are fair to good, the precision-recall AUC values are not, which is reflected in the less-than-robust PPV values for this score at the high threshold.

Lines 462-470: This is definitely an interesting thought.

Line 483-485: Sums up why this score (if proven to be more reliable in the future) would be a clinical difference maker.

Line 492-494: This is exactly what I believe needs to happen to make this more publishable.

Line 505-506: Agree.

Line 548-550: Agree and this should be a separate publication from this dataset.

Conclusion

Line 577-579: Depends on what you are considering a cutoff for low or high probability. Having clinical deterioration occur within 5 days in 8% of the zero score individuals of the validation group seems uncomfortably high when many of our other decision tools set the threshold at 2% (which was seen in the database group).

Reviewer #3: I would like to start by saying that I am not an expert on the pathology in question, but the considerations brought by the Authors seem to me to be well argued and with adequate bibliographic references.

The aim of the study is exposed in a very clear way.

I really appreciated the effort to maintain a high degree of rigor in the statistical analysis.

Overall, the study is interesting and well conducted. However, I believe that some methodological aspects can be improved. I have provided a detailed list of suggestions and references in an attached file.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Fabrizio D'Ascenzo

Reviewer #2: No

Reviewer #3: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

Attachment

Submitted filename: PONE-D-21-19015_revision.pdf

Decision Letter 1

Christophe Leroyer

18 Oct 2021

PONE-D-21-19015R1

Development and validation of a prognostic tool: pulmonary embolism short-term clinical outcomes risk estimation (PE-SCORE)

PLOS ONE

Dear Dr. Weekes,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by Dec 02 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Christophe Leroyer

Academic Editor

PLOS ONE

Journal Requirements:

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: All comments have been addressed

Reviewer #4: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: Yes

Reviewer #4: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: Yes

Reviewer #4: I Don't Know

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

Reviewer #4: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

Reviewer #4: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: Overall, this manuscript has significantly improved since initial submission. While some of my initial concerns about the overall usefulness of this decision tool still stand, in its current iteration this manuscript provides better explanations of how the tool variables were derived and validated and provides increased clarity about the authors’ intent for how the tool is to be used. This manuscript makes a much better argument for why efforts should be made to externally validate (PE-SCORE) and evaluate its performance outside of the institutions in which it was created. The efforts put forth by the authors to address commentary and concerns in my initial review were exhaustive. At this point in time I am satisfied with the revisions to this manuscript and the authors’ responses to my questions. No further revisions are recommended at this time.

Reviewer #4: Comments to the authors:

The authors used data from a prospective registry in which 6 emergency centers in the USA participated, to propose a prognostic score validated secondarily in a second registry.

Their score is based on 9 items, and is claimed to be able to predict a population at low risk of deterioration compared to a population at higher risk of deterioration.

The desire to develop a score with a pragmatic interest (promoting safe outpatient treatment versus keeping patients who may benefit from more intensive treatment in hospital) is commendable and the authors should be warmly congratulated.

However, I have several comments which for the moment limit the acceptability of this work.

Major comments:

- Methods: The authors chose as the primary endpoint an unusual composite endpoint compared with other studies (1–3). The authors should justify the construction of this criterion more strongly. For example, if the objective is to identify a population at risk of death, we would expect death and death from pulmonary embolism as the main endpoint, as was the case with the PESI score. Conversely, if the objective is to identify a population at greater risk of deterioration despite well-conducted anticoagulant treatment, the authors must justify why they did not follow criteria such as those proposed in the PEITHO trial, for example.

- Population: It is surprising to find so few elderly people in the population included in the study, even though the incidence of pulmonary embolism rises after 75 years and advanced age is a major prognostic factor (4). Do you have a reason for this? Is it explained by a selection bias?

- Results: Many studies find renal failure to be a factor of poor prognosis (5), including in the short term (6). The authors should discuss this thoroughly, in particular the possibility that their high creatinine levels may cause a loss of sensitivity.

- Discussion: I think that the authors should explain more strongly what their score adds that is not provided by the multitude of scores already proposed and those included in the international recommendations. Even if this is discussed by the authors, it is troubling not to find advanced age and cancer, which are well-known factors associated with a poor prognosis, among the factors integrated in the model.

Minor comments:

- Population: One limitation is the difficulty of extrapolating the results to inpatients diagnosed with PE, who represent about a third of PE cases. Can the authors discuss this briefly?

- Overall, the draft is of interest and well written, but some of the information could be moved to an appendix in order to shorten it.

1. Roy P, Penaloza A, Hugli O, Klok FA, Arnoux A, Elias A, et al. Triaging acute pulmonary embolism for home treatment by Hestia or simplified PESI criteria: the HOME-PE randomized trial. Eur Heart J [Internet]. 2021 Aug 7;1–13. Available from: https://academic.oup.com/eurheartj/advance-article/doi/10.1093/eurheartj/ehab373/6345003

2. Konstantinides S V, Meyer G, Becattini C, Bueno H, Geersing G-J, Harjola V-P, et al. 2019 ESC Guidelines for the diagnosis and management of acute pulmonary embolism developed in collaboration with the European Respiratory Society (ERS). Eur Heart J [Internet]. 2020 Jan 21;41(4):543–603. Available from: https://academic.oup.com/eurheartj/advance-article/doi/10.1093/eurheartj/ehz405/5556136

3. Meyer G, Vicaut E, Danays T, Agnelli G, Becattini C, Beyer-Westendorf J, et al. Fibrinolysis for patients with intermediate-risk pulmonary embolism. N Engl J Med [Internet]. 2014;370(15):1402–11. Available from: http://www.ncbi.nlm.nih.gov/pubmed/24716681

4. Delluc A, Tromeur C, Ven F Le, Gouillou M, Paleiron N, Bressollette L, et al. Current incidence of venous thromboembolism and comparison with 1998 : a community-based study in Western France. Thromb Haemost. 2016;3–10.

5. Murgier M, Bertoletti L, Darmon M, Zeni F, Valle R, Del Toro J, et al. Frequency and prognostic impact of acute kidney injury in patients with acute pulmonary embolism. Data from the RIETE registry. Int J Cardiol [Internet]. 2019;291:121–6. Available from: https://doi.org/10.1016/j.ijcard.2019.04.083

6. Chopard R, Jimenez D, Serzian G, Ecarnot F, Falvo N, Kalbacher E, et al. Renal dysfunction improves risk stratification and may call for a change in the management of intermediate- and high-risk acute pulmonary embolism: results from a multicenter cohort study with external validation. Crit Care [Internet]. 2021;25(1):57. Available from: http://www.ncbi.nlm.nih.gov/pubmed/33563311

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

Reviewer #4: No

Decision Letter 2

Christophe Leroyer

2 Nov 2021

Development and validation of a prognostic tool: pulmonary embolism short-term clinical outcomes risk estimation (PE-SCORE)

PONE-D-21-19015R2

Dear Dr. Weekes,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Christophe Leroyer

Academic Editor

PLOS ONE

Acceptance letter

Christophe Leroyer

9 Nov 2021

PONE-D-21-19015R2

Development and validation of a prognostic tool: pulmonary embolism short-term clinical outcomes risk estimation (PE-SCORE)

Dear Dr. Weekes:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Christophe Leroyer

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Table. Univariable analysis of 138 candidate variables for primary outcome on development database.

    (DOCX)

    S2 Table. Comparison of clinical research sites in development database.

    (DOCX)

    S3 Table. Prognostic performance of sPESI and ESC at low-risk threshold.

    (DOCX)

    S1 Fig. Assignment of points to each of the nine variables in the PE-SCORE model.

    (DOCX)

    S1 Data

    (CSV)

    S2 Data

    (CSV)

    S3 Data

    (CSV)

    Attachment

    Submitted filename: PONE-D-21-19015_revision.pdf

    Attachment

    Submitted filename: Response to Reviewers.docx

    Attachment

    Submitted filename: ITEMIZED responses for 2nd revision PLOSONE.docx

    Data Availability Statement

    All relevant data are within the manuscript and its Supporting Information files.


    Articles from PLoS ONE are provided here courtesy of PLOS
