Abstract
Objective:
To evaluate informal physician judgment versus pretest probability scores in estimating risk in patients with suspected coronary artery disease (CAD).
Methods:
We included 4533 patients from the PROMISE (Prospective Multicenter Imaging Study for Evaluation of Chest Pain) trial. Physicians categorized a priori the pretest probability of obstructive CAD (≥70% stenosis, or ≥50% left main stenosis); Diamond-Forrester (D-F) and European Society of Cardiology (ESC) pretest probability estimates were calculated. Agreement was calculated using the κ statistic; logistic regression evaluated the association between estimates of pretest CAD probability and actual CAD (as determined by CT coronary angiography), and clinical outcomes were modelled using Cox proportional hazard models.
Results:
Physician estimates agreed poorly with D-F (κ, 0.16; 95% CI, 0.14–0.18) and ESC (κ, 0.04; 95% CI, 0.02–0.05). Actual obstructive CAD was significantly more prevalent in both the high-likelihood (OR, 3.30; 95% CI, 2.30–4.74) and intermediate-likelihood (OR, 1.43; 95% CI, 1.16–1.76) physician-estimated groups versus the low-likelihood group; ESC similarly differentiated between the three groups (OR, 9.07; 95% CI, 2.87–28.70; and OR 3.87; 95% CI, 1.22–12.28). However, using D-F, only the high-probability group differed (OR, 2.49; 95% CI, 1.74–3.54). Only physician estimates were associated with a higher incidence of adjusted death/myocardial infarction/unstable angina hospitalization in the high- versus low-probability group (HR, 2.68; 95% CI, 1.52–4.74); neither pretest probability score provided prognostic information.
Conclusions:
Compared to D-F and ESC estimates, physician judgment more accurately identified obstructive CAD and worse patient outcomes. Integrating physician judgment may improve risk prediction for stable chest pain patients.
Keywords: Chest pain, risk scores, physician judgment
When evaluating patients with stable chest pain in clinical practice, most physicians formulate estimates of patient status to aid in decision making, including the probability of obstructive coronary artery disease (CAD) for potential revascularization and overall long-term risk of adverse events. These characteristics have been quantified by the Diamond-Forrester (D-F) risk score,1 which has been further validated in the Coronary Artery Surgery Study (CASS) registry and is used in current U.S. guidelines.2 However, recent studies suggest that some contemporary risk scores overestimate actual risk.3,4 To this end, the European Society of Cardiology (ESC) has proposed a revised pretest probability score in its 2019 guidelines.4
Physician estimation is a possible substitute, but little is known about how physician estimation compares to a formal risk score, or about the relationships between physician- and risk-score-generated estimates and major adverse cardiovascular events. Some groups have evaluated the relationship between either race or sex and the treating physician’s pretest estimation of obstructive CAD, but these evaluations are limited to older, single-center registry-based studies.5,6 To our knowledge, no study has systematically and prospectively assessed physicians’ estimates of the pretest probability of CAD in comparison to either formal risk score, or the relationship between these estimates and long-term outcomes.
As part of standard baseline clinical data collection, site investigators in the Prospective Multicenter Imaging Study for Evaluation of Chest Pain (PROMISE) trial were asked to estimate the probability of obstructive disease.7 A formal risk score was calculated for each trial participant using D-F/CASS,2 but it was not provided to physicians. D-F/CASS was the most up-to-date score available at the time of the study, and is still the score recommended by current US guidelines. ESC pretest probability (ESC-PTP) estimates were calculated after the score became available in 2019.4
Understanding whether physician estimation differs and/or provides incremental predictive value over and above traditional risk scores could provide valuable insights into future risk algorithms. The objectives of the current analysis were to (1) determine the agreement between physician and D-F and ESC-PTP estimates of pretest probability in PROMISE, (2) report the frequency of actual obstructive CAD found on coronary computed tomographic angiography (CCTA) or cardiac catheterization by physician versus D-F and ESC-PTP estimates, and (3) describe the association of physician versus D-F and ESC-PTP estimates with the clinical outcomes of death, myocardial infarction, and unstable angina hospitalization over a median 25 months of follow-up.
METHODS
PROMISE datasets are available at https://biolincc.nhlbi.nih.gov/studies/promise. There are no commercial use data restrictions, and no data restrictions based on area of research.
Study Population and Design
PROMISE was a pragmatic comparative effectiveness trial that enrolled 10,003 patients at 193 sites in North America representing both community practices and academic medical centers. The PROMISE study design and primary results have been described in detail.7,8 The study enrolled stable symptomatic outpatients without known CAD referred to noninvasive testing for further evaluation, who were randomized to an initial anatomical testing strategy with CCTA or a functional testing strategy (exercise treadmill testing, stress echocardiography, or stress nuclear imaging), and who were then followed for a median of 25 months for outcome events. The local or central institutional review board at each coordinating center and at each of the 193 enrolling sites in North America approved the study protocol. All participants provided written informed consent.
The analytic cohort included 4533 patients with stable chest pain or equivalent who underwent CCTA in the PROMISE trial (Figure 1). Before non-invasive testing, the attending physician categorized a priori the pretest probability of obstructive CAD (≥70% stenosis of a major epicardial artery or ≥50% stenosis of the left main artery) for each patient into one of five categories according to his or her clinical judgment: very low (<10%), low (10–30%), intermediate (31–70%), high (71–90%), or very high (>90%). Categorization was left to the discretion of the physician. For the purposes of this secondary analysis, very low and low were categorized as low, and high and very high were categorized as high. D-F pretest probability was calculated based on age, sex, and chest pain typicality as previously reported,7 and D-F estimates were categorized as low (<30%), intermediate (30% to <70%), or high (≥70%). The D-F probability cutoffs were chosen to remain consistent with the physician-categorized pretest probabilities listed above. While D-F was the contemporaneous and therefore appropriate comparator for this analysis at the time of PROMISE enrollment, cutoffs were also applied using the 2019 ESC-PTP estimates of <5% for low, 5–15% for indeterminate, and >15% for high pretest probability.4 The ESC-PTP estimate of 5–15% is considered indeterminate (vs. intermediate) since the guidelines advocate that non-invasive testing be considered after assessing the overall clinical likelihood based on clinical modifiers. After exclusions (Figure 1), the final study cohort consisted of 4533 patients.
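The three-level groupings described above can be sketched in code. This is only an illustrative sketch: the function and variable names are hypothetical, the inputs are assumed to be probabilities expressed in percent, and the study itself performed its analyses in SAS.

```python
# Illustrative sketch of the three-level categorization described in the text.
# All names are hypothetical; probabilities are assumed to be in percent.

# Physician five-level estimate collapsed to three levels for the analysis.
PHYSICIAN_COLLAPSE = {
    "very low": "low",            # <10%
    "low": "low",                 # 10-30%
    "intermediate": "intermediate",  # 31-70%
    "high": "high",               # 71-90%
    "very high": "high",          # >90%
}


def categorize_df(prob_pct: float) -> str:
    """D-F cutoffs: low (<30%), intermediate (30% to <70%), high (>=70%)."""
    if prob_pct < 30:
        return "low"
    if prob_pct < 70:
        return "intermediate"
    return "high"


def categorize_esc(prob_pct: float) -> str:
    """ESC-PTP cutoffs: low (<5%), indeterminate (5-15%), high (>15%)."""
    if prob_pct < 5:
        return "low"
    if prob_pct <= 15:
        return "indeterminate"
    return "high"


if __name__ == "__main__":
    # A hypothetical patient with a 20% pretest probability falls in
    # different groups under the two score cutoffs.
    print(categorize_df(20.0), categorize_esc(20.0))
```

Because the ESC-PTP cutoffs are much lower than the D-F cutoffs, the same numeric probability can be "low" under one scheme and "high" under the other, which is worth keeping in mind when reading the agreement statistics.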
Figure 1.

Cohort derivation. CCTA, coronary computed tomographic angiography.
Patient and Public Involvement
Neither patients nor the public were involved in the design, conduct, reporting, or dissemination plans of our research.
Statistical Analysis
Continuous baseline characteristics were summarized using means and standard deviations or medians with 25th and 75th percentiles. Categorical variables were summarized using frequencies and percentages. Group comparisons with respect to continuous baseline variables were performed using the Wilcoxon rank sum test; Pearson’s chi-square or Fisher’s exact test was used for comparisons involving categorical variables. Agreement was calculated using the κ statistic. Logistic regression models were used to identify the association between physician, D-F, or ESC-PTP estimates with CAD prevalence. Cox proportional hazard models were used to calculate adjusted hazard ratios between physician, D-F, or ESC-PTP estimates and clinical outcomes of death, myocardial infarction, and unstable angina hospitalization over a median 25 months of follow-up. The proportional hazard assumption was assessed and met. We adjusted a priori for race; body mass index (BMI); hypertension; metabolic syndrome; dyslipidemia; history of carotid, peripheral vascular, or cerebrovascular disease; smoking (ever vs never); family history of premature CAD; depression; physical activity; and CAD. Age, sex, and chest pain typicality were not included in the model since they are part of the D-F classification. Stepwise selection with conservative entry and exit criteria (entry criterion: p-value < 0.1; exit criterion: p-value > 0.2) was used to select the “best” subset of predictors of test positivity. The Hosmer-Lemeshow goodness-of-fit test was used to assess the calibration of the final model, and the area under a receiver operating characteristic curve was used to assess the final model’s discriminatory capacity. Statistical significance was set at α=.05. All analyses were performed in SAS version 9.4 (SAS Institute, Cary, NC).
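As a notational aid, the two model families named above can be written in their standard textbook form. The symbols here are our own and are not taken from the study: the low-probability group is assumed to be the reference category, $I_{\text{int}}$ and $I_{\text{high}}$ are indicator variables for the intermediate (or indeterminate) and high groups, and $\mathbf{x}$ is the vector of adjustment covariates.

```latex
% Logistic regression for obstructive CAD (low-probability group as reference)
\operatorname{logit}\,\Pr(\text{obstructive CAD})
  = \beta_0 + \beta_I\, I_{\text{int}} + \beta_H\, I_{\text{high}}
    + \boldsymbol{\gamma}^{\top}\mathbf{x},
\qquad \mathrm{OR}_I = e^{\beta_I},\quad \mathrm{OR}_H = e^{\beta_H}

% Cox proportional hazards model for the composite clinical outcome
h(t \mid I_{\text{int}}, I_{\text{high}}, \mathbf{x})
  = h_0(t)\,\exp\!\left(\alpha_I\, I_{\text{int}} + \alpha_H\, I_{\text{high}}
    + \boldsymbol{\delta}^{\top}\mathbf{x}\right),
\qquad \mathrm{HR}_I = e^{\alpha_I},\quad \mathrm{HR}_H = e^{\alpha_H}
```

Under this notation, each reported odds ratio or hazard ratio is the exponentiated coefficient of the corresponding group indicator, relative to the low-probability group.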
RESULTS
Study Population and Cohorts
The majority of health care providers participating in the PROMISE trial were cardiologists (86.9%), followed by internal medicine specialists (5.3%), physician assistants/nurse practitioners (3.7%), and family medicine physicians (1.3%).
Among 4533 patients included in the analyses, physicians categorized 209 (4.6%) as having a high probability, 2630 (58.0%) as having an intermediate probability, and 1694 (37.4%) as having a low probability of obstructive CAD. In contrast, D-F categorized 1197 (26.4%) patients as having a high probability, 2854 (63.0%) as having an intermediate probability, and 482 (10.6%) as having a low probability of obstructive CAD; ESC-PTP categorized 2115 (46.7%) patients as having a high probability, 2275 (50.2%) as having an indeterminate probability, and 143 (3.2%) as having a low probability. Baseline clinical characteristics stratified by physician, D-F, and ESC-PTP estimates of obstructive CAD are shown in Table 1. As shown, there were several significant differences among cardiac risk factors and symptoms between physician estimates, D-F, and ESC-PTP in the high/intermediate or indeterminate/low groups.
Table 1.
Baseline clinical characteristics stratified by physician, Diamond-Forrester (D-F), and European Society of Cardiology pretest probability (ESC-PTP) estimates of the likelihood or probability of obstructive coronary artery disease (CAD).
| | Physician Estimate | | | | D-F Estimate | | | | ESC-PTP Estimate | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Characteristic | High (N=209) | Intermediate (N=2630) | Low (N=1694) | P-Value | High (N=1197) | Intermediate (N=2854) | Low (N=482) | P-Value | High (N=2115) | Indeterminate (N=2275) | Low (N=143) | P-Value |
| Demographics | ||||||||||||
| Age, mean (SD), y | 62.0 (8.42) | 60.7 (8.24) | 59.8 (7.94) | <.001 | 64.2 (7.53) | 58.8 (7.85) | 60.5 (8.34) | <.001 | 62.5 (8.40) | 58.9 (7.48) | 52.9 (4.21) | <.001 |
| Female Sex | 70 (33.5%) | 1321 (50.2%) | 949 (56.0%) | <.001 | 262 (21.9%) | 1822 (63.8%) | 256 (53.1%) | <.001 | 336 (15.9%) | 1901 (83.6%) | 103 (72.0%) | <.001 |
| Race | | | | .252 | | | | <.001 | | | | 0.012 |
| White | 176 (84.6%) | 2190 (84.2%) | 1411 (83.9%) | | 1031 (86.7%) | 2331 (82.5%) | 415 (87.2%) | | 1792 (85.5%) | 1863 (82.7%) | 122 (85.9%) | |
| Black | 16 (7.7%) | 287 (11.0%) | 180 (10.7%) | | 114 (9.6%) | 323 (11.4%) | 46 (9.7%) | | 191 (9.1%) | 276 (12.3%) | 16 (11.3%) | |
| Other | 16 (7.7%) | 125 (4.8%) | 90 (5.4%) | | 44 (3.7%) | 172 (6.1%) | 15 (3.2%) | | 113 (5.4%) | 114 (5.1%) | 4 (2.8%) | |
| Hispanic Ethnicity | 14 (6.7%) | 218 (8.3%) | 111 (6.6%) | .094 | 97 (8.1%) | 199 (7.0%) | 47 (9.8%) | .070 | 156 (7.4%) | 173 (7.6%) | 14 (9.8%) | 0.588 |
| Cardiac Risk Factors | ||||||||||||
| BMI, mean (SD), kg/m2 | 31.8 (6.16) | 30.5 (5.93) | 30.1 (5.89) | <0.01 | 30.2 (5.61) | 30.6 (6.08) | 30.0 (5.81) | .090 | 30.0 (5.36) | 30.7 (6.39) | 30.7 (6.32) | 0.016 |
| Hypertension | 155 (74.2%) | 1779 (67.6%) | 977 (57.7%) | <.001 | 811 (67.8%) | 1812 (63.5%) | 288 (59.8%) | .003 | 1370 (64.8%) | 1456 (64.0%) | 85 (59.4%) | 0.416 |
| Diabetes | 82 (39.2%) | 603 (22.9%) | 255 (15.1%) | <.001 | 298 (24.9%) | 566 (19.8%) | 76 (15.8%) | <.001 | 444 (21.0%) | 478 (21.0%) | 18 (12.6%) | 0.051 |
| Dyslipidemia | 159 (76.1%) | 1798 (68.4%) | 1092 (64.5%) | <.001 | 806 (67.3%) | 1931 (67.7%) | 312 (64.7%) | .447 | 1398 (66.1%) | 1553 (68.3%) | 98 (68.5%) | 0.295 |
| Smoking (ever vs never) | 125 (59.8%) | 1352 (51.4%) | 826 (48.8%) | .007 | 657 (54.9%) | 1418 (49.7%) | 228 (47.3%) | .003 | 1142 (54.0%) | 1090 (47.9%) | 71 (49.7%) | <.001 |
| Family history of premature CAD | 74 (35.4%) | 897 (34.2%) | 502 (29.7%) | .005 | 334 (28.0%) | 1003 (35.3%) | 136 (28.3%) | <.001 | 602 (28.6%) | 818 (36.1%) | 53 (37.3%) | <.001 |
| History of depression | 45 (21.5%) | 503 (19.1%) | 341 (20.1%) | .557 | 177 (14.8%) | 607 (21.3%) | 105 (21.8%) | <.001 | 285 (13.5%) | 567 (24.9%) | 37 (25.9%) | <.001 |
| Participate in physical activity | 93 (44.5%) | 1329 (50.6%) | 911 (54.0%) | .011 | 669 (55.9%) | 1410 (49.5%) | 254 (52.7%) | <.001 | 1180 (55.9%) | 1082 (47.6%) | 71 (49.7%) | <.001 |
| Peripheral arterial disease or cerebrovascular disease | 18 (8.6%) | 149 (5.7%) | 60 (3.5%) | <.001 | 81 (6.8%) | 130 (4.6%) | 16 (3.3%) | .003 | 122 (5.8%) | 103 (4.5%) | 2 (1.4%) | 0.022 |
| Primary Presenting Symptom | ||||||||||||
| Arm or shoulder pain | 7 (3.3%) | 57 (2.2%) | 39 (2.3%) | .541 | 28 (2.3%) | 58 (2.0%) | 17 (3.5%) | .124 | 55 (2.6%) | 44 (1.9%) | 4 (2.8%) | 0.306 |
| Back pain | 2 (1.0%) | 23 (0.9%) | 13 (0.8%) | .916 | 8 (0.7%) | 25 (0.9%) | 5 (1.0%) | .707 | 17 (0.8%) | 19 (0.8%) | 2 (1.4%) | 0.752 |
| Chest pain | 146 (69.9%) | 1871 (71.1%) | 1333 (78.8%) | <.001 | 860 (71.9%) | 2127 (74.6%) | 363 (75.3%) | .162 | 1554 (73.5%) | 1673 (73.6%) | 123 (86.0%) | 0.004 |
| Aching/dull | 33 (22.6%) | 466 (24.9%) | 354 (26.6%) | .412 | 228 (26.5%) | 518 (24.4%) | 107 (29.5%) | .084 | 409 (26.3%) | 401 (24.0%) | 43 (35.0%) | 0.015 |
| Burning/pins and needles | 21 (14.4%) | 154 (8.2%) | 130 (9.8%) | .026 | 73 (8.5%) | 199 (9.4%) | 33 (9.1%) | .757 | 140 (9.0%) | 152 (9.1%) | 13 (10.6%) | 0.845 |
| Crushing/pressure/squeezing/tightness | 82 (56.2%) | 967 (51.7%) | 638 (47.9%) | .037 | 432 (50.2%) | 1089 (51.2%) | 166 (45.7%) | .156 | 757 (48.7%) | 872 (52.1%) | 58 (47.2%) | 0.118 |
| Other | 34 (23.3%) | 556 (29.7%) | 426 (32.0%) | .066 | 233 (27.1%) | 661 (31.1%) | 122 (33.6%) | .036 | 445 (28.6%) | 529 (31.6%) | 42 (34.1%) | 0.118 |
| Fatigue or weakness | 9 (4.3%) | 93 (3.5%) | 24 (1.4%) | <.001 | 34 (2.8%) | 76 (2.7%) | 16 (3.3%) | .713 | 63 (3.0%) | 58 (2.6%) | 5 (3.5%) | 0.600 |
| Neck or jaw pain | 1 (0.5%) | 26 (1.0%) | 17 (1.0%) | .757 | 10 (0.8%) | 30 (1.1%) | 4 (0.8%) | .771 | 16 (0.8%) | 26 (1.1%) | 2 (1.4%) | 0.371 |
| Palpitations | 3 (1.4%) | 62 (2.4%) | 42 (2.5%) | .642 | 22 (1.8%) | 75 (2.6%) | 10 (2.1%) | .290 | 48 (2.3%) | 55 (2.4%) | 4 (2.8%) | 0.893 |
| Dyspnea | 36 (17.2%) | 415 (15.8%) | 183 (10.8%) | <.001 | 192 (16.1%) | 393 (13.8%) | 49 (10.2%) | .006 | 288 (13.6%) | 346 (15.2%) | 0 (0.0%) | <.001 |
| Other | 5 (2.4%) | 83 (3.2%) | 40 (2.4%) | .288 | 42 (3.5%) | 68 (2.4%) | 18 (3.7%) | .063 | 73 (3.5%) | 52 (2.3%) | 3 (2.1%) | 0.058 |
| Physician Characterization of Chest Pain | ||||||||||||
| Site characterization of chest pain | | | | <.001 | | | | <.001 | | | | <.001 |
| Typical | 119 (56.9%) | 358 (13.6%) | 54 (3.2%) | | 531 (44.4%) | 0 (0.0%) | 0 (0.0%) | | 395 (18.7%) | 136 (6.0%) | 0 (0.0%) | |
| Atypical | 88 (42.1%) | 2207 (83.9%) | 1225 (72.3%) | | 666 (55.6%) | 2854 (100.0%) | 0 (0.0%) | | 1629 (77.0%) | 1891 (83.1%) | 0 (0.0%) | |
| Non-cardiac | 2 (1.0%) | 65 (2.5%) | 415 (24.5%) | | 0 (0.0%) | 0 (0.0%) | 482 (100.0%) | | 91 (4.3%) | 248 (10.9%) | 143 (100.0%) | |
| Medication Use | ||||||||||||
| Aspirin | 132 (63.8%) | 1177 (46.7%) | 654 (40.7%) | <.001 | 608 (52.7%) | 1179 (43.5%) | 176 (37.6%) | <.001 | 1021 (50.7%) | 897 (41.1%) | 45 (32.4%) | <.001 |
| Statin | 114 (55.1%) | 1173 (46.5%) | 693 (43.1%) | .002 | 575 (49.8%) | 1202 (44.3%) | 203 (43.4%) | .004 | 959 (47.6%) | 969 (44.4%) | 52 (37.4%) | 0.015 |
| Beta-blocker | 72 (34.8%) | 660 (26.2%) | 339 (21.1%) | <.001 | 317 (27.5%) | 651 (24.0%) | 103 (22.0%) | .026 | 496 (24.6%) | 546 (25.0%) | 29 (20.9%) | 0.542 |
| ACEi or ARB | 109 (52.7%) | 1157 (45.9%) | 606 (37.7%) | <.001 | 567 (49.1%) | 1129 (41.6%) | 176 (37.6%) | <.001 | 936 (46.5%) | 888 (40.7%) | 48 (34.5%) | <.001 |
ACEi indicates angiotensin-converting enzyme inhibitor; ARB, angiotensin receptor blocker; BMI, body mass index.
Data shown are no. (%) or no./total no. (%), except where indicated. Variables with missing values: race (n=42), ethnicity (n=24), BMI (n=38), smoking (n=1), family history of premature CAD (n=15), physical activity (n=9), any PAD (n=1), primary presenting symptom (n=3 for each symptom), medication use (n=199 for each medication).
Agreement of Physician Estimates Versus D-F or ESC-PTP for the Presence of Obstructive CAD
Agreement on the pretest probability of obstructive CAD was poor both between physician and D-F estimates [51.2% (2322/4533); κ, 0.16; 95% CI, 0.14–0.18] and between physician and ESC-PTP estimates [34.9% (1580/4533); κ, 0.04; 95% CI, 0.02–0.05] (Supplemental Table 1). Physicians generally estimated a lower probability of obstructive CAD than D-F or ESC-PTP (Figure 2). For example, physicians rated very few patients as more likely to have obstructive CAD than did D-F (2.7%; 122/4533) or ESC-PTP (1.3%; 61/4533).
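To illustrate how a κ statistic relates raw agreement to chance agreement, the sketch below computes an unweighted Cohen's κ from the number of concordant patients and the group sizes reported in this article. Two points are assumptions on our part: that the reported κ is unweighted, and that the ESC-PTP "indeterminate" group is paired with the physician "intermediate" group. The function name is hypothetical, and confidence intervals are not reproduced because they require the full cross-tabulation (Supplemental Table 1).

```python
def cohen_kappa(n_agree, totals_a, totals_b):
    """Unweighted Cohen's kappa from the number of concordant ratings and
    the per-category totals of two raters (categories in the same order)."""
    n = sum(totals_a)
    if n != sum(totals_b):
        raise ValueError("both raters must classify the same number of patients")
    p_observed = n_agree / n
    # Chance agreement: product of the two raters' marginal proportions,
    # summed over categories.
    p_expected = sum(a * b for a, b in zip(totals_a, totals_b)) / n**2
    return (p_observed - p_expected) / (1 - p_expected)


# Group sizes (high, intermediate/indeterminate, low) as reported in the text.
physician = (209, 2630, 1694)
d_f = (1197, 2854, 482)
esc_ptp = (2115, 2275, 143)

kappa_df = cohen_kappa(2322, physician, d_f)
kappa_esc = cohen_kappa(1580, physician, esc_ptp)
print(f"physician vs D-F:     kappa = {kappa_df:.2f}")
print(f"physician vs ESC-PTP: kappa = {kappa_esc:.2f}")
```

Under these assumptions the computed values round to the reported κ of 0.16 and 0.04. The example also shows why 51.2% raw agreement still counts as poor: roughly 42% agreement would be expected by chance alone given how heavily both raters use the intermediate category.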
Figure 2.

Proportion of participants estimated to be at low, intermediate/indeterminate and high pretest probability of obstructive coronary artery disease (CAD) by physician, Diamond-Forrester (D-F), or European Society of Cardiology pretest probability (ESC-PTP) estimates.
Prevalence of Observed Obstructive CAD by Physician Estimates Versus D-F or ESC-PTP
Among physician-estimated groups, obstructive CAD was most commonly found in the high-probability group (27.3%) compared to the intermediate (12.6%) or low (8.7%) groups (Table 2). Among D-F estimates, obstructive CAD was also most commonly found in the high-probability group (20.0%), but with similar rates of obstructive CAD in the intermediate (8.9%) or low (8.9%) groups. Among ESC-PTP estimates, obstructive CAD was also most commonly found in the high-probability group (16.6%), with lower rates of obstructive CAD in the indeterminate (8.0%) or low (2.1%) groups. In a multivariable model adjusting for important baseline characteristics, the physician-estimated high- and intermediate-risk groups were significantly more likely (OR, 3.30; 95% CI, 2.30–4.74; and OR, 1.43; 95% CI, 1.16–1.76, respectively) (Figure 3 and Supplemental Table 2) to have actual obstructive CAD on CCTA compared to the low-probability group. When compared to the low-probability group, groups estimated by D-F as having high probability were significantly more likely to have actual obstructive CAD (OR, 2.49; 95% CI, 1.74–3.54), but not the intermediate group; when estimated by ESC-PTP, both the high- and indeterminate-probability groups were significantly more likely to have actual obstructive CAD (OR, 9.07; 95% CI, 2.87–28.70; and OR, 3.87; 95% CI, 1.22–12.28, respectively).
Table 2.
Frequency of observed obstructive coronary artery disease (CAD) by physician or risk score estimate of obstructive CAD.
| CCTA finding | High estimate | Intermediate/indeterminate estimate | Low estimate |
|---|---|---|---|
| Physician Estimate | |||
| Obstructive CAD | 57/209 (27.27%) | 331/2630 (12.59%) | 148/1694 (8.74%) |
| Nonobstructive CAD | 119/209 (56.94%) | 1555/2630 (59.13%) | 952/1694 (56.20%) |
| Normal | 33/209 (15.79%) | 744/2630 (28.29%) | 594/1694 (35.06%) |
| D-F Estimate | |||
| Obstructive CAD | 239/1197 (19.97%) | 254/2854 (8.90%) | 43/482 (8.92%) |
| Nonobstructive CAD | 750/1197 (62.66%) | 1586/2854 (55.57%) | 290/482 (60.17%) |
| Normal | 208/1197 (17.38%) | 1014/2854 (35.53%) | 149/482 (30.91%) |
| ESC-PTP Estimate | |||
| Obstructive CAD | 351/2115 (16.60%) | 182/2275 (8.00%) | 3/143 (2.10%) |
| Nonobstructive CAD | 1360/2115 (64.30%) | 1193/2275 (52.44%) | 73/143 (51.05%) |
| Normal | 404/2115 (19.10%) | 900/2275 (39.56%) | 67/143 (46.85%) |
Obstructive CAD: Most severe stenosis on coronary computed tomographic angiography (CCTA) ≥ 70% (or 50% left main) and any coronary artery calcium (CAC) value = 0 or > 0. Nonobstructive CAD: Most severe stenosis on CCTA > 0% but < 70% (or 50% left main) OR [all CCTA stenosis = 0% and CAC > 0]. Normal: No stenosis > 0% and CAC = 0. If CAC is missing, then use stenosis only.
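The counts in Table 2 are enough to compute crude (unadjusted) odds ratios against each low-probability group, which can help a reader see where the adjusted estimates come from. The sketch below is not the authors' method: it uses a simple 2×2 odds ratio with a Woolf (log-scale) 95% confidence interval, whereas the article reports odds ratios from multivariable logistic regression, so the numbers will differ somewhat. The function name is hypothetical.

```python
import math


def crude_odds_ratio(events, total, ref_events, ref_total, z=1.96):
    """Crude odds ratio of `events/total` versus a reference group, with a
    Woolf (log-scale) confidence interval. Unadjusted; illustrative only."""
    a, b = events, total - events              # group of interest: CAD / no CAD
    c, d = ref_events, ref_total - ref_events  # reference group: CAD / no CAD
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# (obstructive CAD, group size) taken from Table 2; the low group is the reference.
table2 = {
    "Physician": {"high": (57, 209), "mid": (331, 2630), "low": (148, 1694)},
    "D-F": {"high": (239, 1197), "mid": (254, 2854), "low": (43, 482)},
    "ESC-PTP": {"high": (351, 2115), "mid": (182, 2275), "low": (3, 143)},
}

for estimate, groups in table2.items():
    for level in ("high", "mid"):
        or_, lo, hi = crude_odds_ratio(*groups[level], *groups["low"])
        print(f"{estimate:9s} {level:4s} vs low: OR {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

The wide interval for ESC-PTP reflects the small low-probability reference group (3 of 143 patients with obstructive CAD), a pattern also visible in the adjusted estimates.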
Figure 3.

Association between physician, Diamond-Forrester (D-F), or European Society of Cardiology pretest probability (ESC-PTP) estimates of the pretest probability of obstructive coronary artery disease (CAD) and observed prevalence of obstructive CAD.
Clinical Outcomes According to Physician Versus D-F or ESC-PTP Estimates of Obstructive CAD
Among physician-estimated groups, the combined outcome of all-cause death/myocardial infarction/unstable angina hospitalization occurred most frequently in the high-probability group (8.1%) compared to the intermediate (2.8%) or low (2.7%) groups over a median of 25 months of follow-up (Supplemental Table 3). Following adjustment, patients in the high-probability group (HR, 2.68; 95% CI, 1.52–4.74) were more likely to experience the combined outcome compared with the low-probability group. However, there was no difference between the intermediate- and low-probability groups (Figure 4A and Supplemental Table 4). Among D-F estimates, the combined endpoint also occurred most frequently in the high-probability group (4.1%) compared to the intermediate (2.5%) and low (3.3%) groups; however, there were no significant differences following adjustment between either the high- or intermediate-probability groups (Figure 4B and Supplemental Table 4) compared to the low-probability group. Among ESC-PTP estimates, the combined endpoint also occurred most frequently in the high-probability group (4.1%) compared to the indeterminate (2.0%) and low (3.5%) groups; however, there were also no significant differences following adjustment between either the high-probability or indeterminate-probability groups compared to the low-probability group (Figure 4C and Supplemental Table 4). Similar findings were observed for any of the estimates for the combined endpoints of cardiovascular death/myocardial infarction/unstable angina.
Figure 4.



Association between (A) physician, (B) Diamond-Forrester (D-F) or (C) European Society of Cardiology pretest probability (ESC-PTP) estimates of the pretest probability of obstructive coronary artery disease (CAD) and clinical outcomes. CV, cardiovascular; MI, myocardial infarction; UAH, unstable angina hospitalization.
DISCUSSION
While pretest probability scores have historically been used to guide risk stratification and clinical assessment of patients with suspected CAD, this study supports recent guideline recommendations that see a limited role for formal pretest probability scoring in the clinical assessment of these patients. We found that physician, D-F, and ESC-PTP estimates were all able to stratify the probability of obstructive CAD in patients with stable chest pain, although agreement between physician estimates and either the D-F or ESC-PTP algorithm in predicting the presence of obstructive CAD was poor. Patients considered high or intermediate/indeterminate risk by physicians or ESC-PTP were significantly more likely to have obstructive CAD than patients considered low risk. In contrast, only high-risk patients per D-F were significantly more likely to have obstructive CAD compared to low-risk patients. While physician estimates of high risk were associated with worse outcomes, there was no significant association with future events for D-F or ESC-PTP at any risk level. Taken together, these findings suggest that clinical judgment may be a better determinant of risk than a single pretest probability risk score.
Clinical judgment is a central element of the medical profession and essential for physician performance, but relatively little is known about the relationship between clinical judgment and subsequent patient management.9 Physician judgment has been evaluated previously in the diagnosis of such conditions as pulmonary embolism and pneumonia.10,11 In the evaluation of acute coronary syndrome, physician experience may lead to a more accurate diagnosis, but data are limited to smaller, single-center studies.12,13 Although the relationship between either race or sex and the treating physician’s pretest estimation of obstructive CAD has been explored in stable chest pain, physician estimates were not captured systematically.5,6 Concurrently, mounting evidence has found that the large majority of functional stress tests on outpatients with a clinical syndrome of possible ischemia are normal, and very few of these patients will experience an untoward clinical event.2–5,7 As a result, there is substantial interest in developing strategies to identify both the lowest risk patients who may not require testing14–16 and those with the highest risk who should potentially proceed directly to cardiac catheterization.17 Understanding the role of physician estimates of obstructive disease relative to traditional risk scores is clinically appealing and could identify those patients most likely to benefit from testing.
We found that in nearly half of patients with stable chest pain, physicians typically felt that obstructive disease was less likely than D-F suggested. This extends contemporary thinking that D-F tends to overestimate the degree of obstructive disease, subsequent event rates, or both.3,4,18 We further found that clinical judgment or ESC-PTP estimates were able to discriminate obstructive CAD likelihood in high- and intermediate/indeterminate-risk groups compared to the low-risk group, while only the high-risk group per D-F estimate was associated with a higher prevalence of obstructive CAD. The highest proportion of obstructive CAD was found in the physician high-risk group (27.3%) while the lowest proportion of obstructive disease was in the ESC-PTP low-risk group (2.1%), suggesting that clinical judgment may be most useful in reclassifying higher-risk patients while ESC-PTP may be most beneficial at identifying those at lowest risk. In addition to better predicting CAD, the physician-determined high-risk subgroup was associated with a significantly greater risk of major adverse events, including all-cause death/myocardial infarction/unstable angina hospitalization, while none of the risk groups as determined by D-F or ESC-PTP were associated with events, although we acknowledge that the latter scores were not designed specifically to predict outcomes.
Compared to D-F or ESC-PTP, physician estimates are not fixed or singular, but a “mixture” (i.e., varying by age, sex, training, experiences) of different physician estimates that could produce different levels of predictive performance. However, our results suggest that even with a mixture of physician estimates, clinical judgment better delineates both disease prevalence and adverse event risk than a standard risk score. Similar findings have been previously described in acute chest pain where unstructured clinical impression performed better than some contemporary risk scores in excluding myocardial infarction.19 We speculate that this is because physicians likely integrate multiple factors into risk assessment in addition to traditional demographics, risk factors, signs, and symptoms, including severity of risk factors and symptoms rather than merely their presence, as well as subjective impressions that may be difficult to quantify.
This analysis provides several important and novel insights. First, to our knowledge, this is the first systematic evaluation of clinical judgment in the a priori evaluation of a large heterogeneous group of patients with stable chest pain across multiple sites, making the results highly generalizable. Incorporating clinical judgment may be an optimal strategy to determine high risk in this population, a strategy employed routinely in the evaluation of possible pulmonary embolism.10 Second, these results extend recent studies emphasizing clinical ‘gestalt’ in the current National Institute for Health and Care Excellence (NICE) guidelines to perform noninvasive testing solely by physician judgment of the typicality of chest pain, and to eschew traditional risk factors such as age and sex used in other guidelines, including the recent ESC guidelines.4,20 Third, in an era of electronic medical records and protocolized health care, the results of this analysis bolster the importance of clinical thinking and patient-centered care in determining risk for cardiovascular medicine despite advances in technology. Fourth, since this analysis demonstrates the superiority of physician judgment over a traditional risk score in the assessment of chest pain, it provides further impetus to develop strategies to improve clinical judgment among trainees.21 Finally, although the D-F score has served the medical community well for decades, this work extends prior evidence that the D-F does not provide any incremental benefit during the contemporary evaluation of chest pain, suggesting it may be avoided in future research/clinical work22–24, and, for the first time, documents that the contemporary ESC-PTP score similarly adds limited benefit over physician estimates.
This study does have some limitations. First, our cohort was relatively low risk for both obstructive CAD and events. It is unclear how these results might apply to a higher risk group with a greater prevalence of disease or incidence of events (including the original D-F derivation cohort), or those who did not consent to participate in a clinical trial. However, the results are generated from the largest, contemporary real-world evaluation of the role of noninvasive testing among patients with stable chest pain, which strengthens its generalizability. Second, in this analysis, the vast majority of physicians were cardiologists, followed by much smaller proportions of internal medicine specialists and other providers, making analysis by specialty difficult. It would be further informative to understand the relationship between other physician characteristics (i.e., age, sex, and years in practice) and outcomes, but these data were not captured in PROMISE. Previous work in chronic coronary disease has not definitively demonstrated a clear link between clinical experience and clinical outcomes.25 Notwithstanding, given the number of sites and physicians involved, our results would be expected to be consistent across a range of physician characteristics and experiences. Third, the diagnosis of CAD in our study was obtained via CCTA. While acceptance of CCTA as a first-line modality among chest pain patients with low to intermediate pretest probability of obstructive CAD is growing, invasive coronary angiography is still considered the gold standard. Nevertheless, the most recent European Society of Cardiology guidelines4 based pretest probability on a pooled analysis3 that includes CCTA data, including an analysis from the PROMISE trial.26 Finally, the relationship between physician estimates and various risk scores or PTP cutoffs used in this analysis may differ, particularly given recent recognition that D-F overestimates the actual presence of obstructive CAD3,4,26. While D-F was the contemporaneous comparator for this analysis, we strengthened this analysis by comparing physician estimates to ESC-PTP. Compared to the D-F intermediate PTP group, ESC-PTP estimates better delineated the presence of obstructive disease among patients at indeterminate pretest probability.
Conclusions
Physician, D-F, and ESC-PTP estimates were able to stratify the probability of obstructive CAD in patients with stable chest pain; however, agreement between physician estimates and either score algorithm was poor. Compared to the D-F score, physician judgment and ESC-PTP more accurately identified obstructive CAD. However, only physician judgment was able to stratify patient adverse outcomes. These results support the development of improved approaches to predict CAD prevalence and risk among patients with stable chest pain, including formal integration of physician judgment and risk estimation.
What is already known about this subject?
Pretest probability scores have historically been used to guide risk stratification and clinical assessment of patients with suspected CAD. Informal physician judgment is also used to estimate risk in patients with stable chest pain with suspected CAD, but it is unclear how judgment alone compares to traditional risk scores such as Diamond-Forrester (D-F) or European Society of Cardiology (ESC) pretest estimates.
What does this study add?
Compared to D-F estimates, physician judgment and ESC more accurately identified patients with stable chest pain who had obstructive CAD. However, only physician judgment (and not ESC or D-F) predicted worse patient outcomes in patients with the highest probability of CAD. Clinical judgment may be a better determinant of risk than a single risk score.
How might this impact on clinical practice?
These results support efforts to integrate physician judgment to improve risk prediction among patients with stable chest pain, and support recent guideline recommendations that see a limited role for formal pretest probability scoring in the clinical assessment of patients with suspected CAD.
Acknowledgments:
We thank Adrian Coles, PhD, for his statistical assistance, and Peter Hoffmann for his editorial contributions to this manuscript. Dr. Coles and Mr. Hoffmann did not receive compensation for their assistance, apart from their employment at the Duke Clinical Research Institute.
Funding/Support:
The PROMISE trial was funded by National Heart, Lung, and Blood Institute grants R01 HL098237, R01 HL098236, R01 HL098305, and R01 HL098235.
Role of the Funding Source:
The funding source had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication. The views expressed in this article do not necessarily represent the official views of the National Heart, Lung, and Blood Institute.
Disclosures:
Dr. Fordyce: consulting fees/honoraria from Bayer, Novo Nordisk, Sanofi, Boehringer Ingelheim, Pfizer; research support from Bayer; Steering Committee service for HeartFlow. Dr. Mark: consultant fees/honoraria from Medtronic; research support from AGA Medical, AstraZeneca, Bayer Healthcare Pharmaceuticals, BMS, Eli Lilly, Gilead, Merck & Co., Inc. Dr. Hoffmann: research support from HeartFlow. Dr. Patel: consultant fees/honoraria from Bayer Healthcare, Genzyme, Medscape - theheart.org, Merck; research support from AHRQ, AstraZeneca, Janssen, Johnson & Johnson, Maquet, National Heart Lung and Blood Institute, PCORI. Dr. Douglas: research support from HeartFlow. No other disclosures were reported.
Footnotes
Trial Registration: ClinicalTrials.gov identifier: NCT01174550.
References
- 1. Diamond GA, Forrester JS. Analysis of probability as an aid in the clinical diagnosis of coronary-artery disease. N Engl J Med 1979;300(24):1350–1358.
- 2. Fihn SD, Gardin JM, Abrams J, et al.; American College of Cardiology Foundation/American Heart Association Task Force. 2012 ACCF/AHA/ACP/AATS/PCNA/SCAI/STS guideline for the diagnosis and management of patients with stable ischemic heart disease: a report of the American College of Cardiology Foundation/American Heart Association Task Force on Practice Guidelines, and the American College of Physicians, American Association for Thoracic Surgery, Preventive Cardiovascular Nurses Association, Society for Cardiovascular Angiography and Interventions, and Society of Thoracic Surgeons. Circulation 2012;126(25):e354–471.
- 3. Juarez-Orozco LE, Saraste A, Capodanno D, et al. Impact of a decreasing pre-test probability on the performance of diagnostic tests for coronary artery disease. Eur Heart J Cardiovasc Imaging 2019;20:1198–1207.
- 4. Knuuti J, Wijns W, Saraste A, et al.; ESC Scientific Document Group. 2019 ESC Guidelines for the diagnosis and management of chronic coronary syndromes. Eur Heart J 2019 Aug 31. pii: ehz425. doi: 10.1093/eurheartj/ehz425. [Epub ahead of print].
- 5. Mark DB, Shaw LK, DeLong ER, Califf RM, Pryor DB. Absence of sex bias in the referral of patients for cardiac catheterization. N Engl J Med 1994;330(16):1101–1106.
- 6. Whittle J, Kressin NR, Peterson ED, et al. Racial differences in prevalence of coronary obstructions among men with positive nuclear imaging studies. J Am Coll Cardiol 2006;47(10):2034–2041.
- 7. Douglas PS, Hoffmann U, Patel MR, et al.; PROMISE Investigators. Outcomes of anatomical versus functional testing for coronary artery disease. N Engl J Med 2015;372(14):1291–1300.
- 8. Douglas PS, Hoffmann U, Lee KL, et al.; PROMISE investigators. PROspective Multicenter Imaging Study for Evaluation of chest pain: rationale and design of the PROMISE trial. Am Heart J 2014;167(6):796–803.e791.
- 9. Kienle GS, Kiene H. Clinical judgement and the medical profession. J Eval Clin Pract 2011;17(4):621–627.
- 10. Wells PS, Anderson DR, Rodger M, et al. Derivation of a simple clinical model to categorize patients probability of pulmonary embolism: increasing the models utility with the SimpliRED D-dimer. Thromb Haemost 2000;83(3):416–420.
- 11. van Vugt SF, Verheij TJ, de Jong PA, et al.; GRACE Project Group. Diagnosing pneumonia in patients with acute cough: clinical judgment compared to chest radiography. Eur Respir J 2013;42(4):1076–1082.
- 12. Domanovits H, Schillinger M, Paulis M, et al. Acute chest pain-a stepwise approach, the challenge of the correct clinical diagnosis. Resuscitation 2002;55(1):9–16.
- 13. Carlton EW, Than M, Cullen L, Khattab A, Greaves K. “Chest pain typicality” in suspected acute coronary syndromes and the impact of clinical experience. Am J Med 2015;128(10):1109–1116.e1102.
- 14. Adamson PD, Newby DE, Hill CL, Coles A, Douglas PS, Fordyce CB. Comparison of international guidelines for assessment of suspected stable angina: insights from the PROMISE and SCOT-HEART. JACC Cardiovasc Imag 2018;11(9):1301–1310.
- 15. Fordyce CB, Douglas PS, Roberts RS, et al.; Prospective Multicenter Imaging Study for Evaluation of Chest Pain (PROMISE) Investigators. Identification of patients with stable chest pain deriving minimal value from noninvasive testing: the PROMISE Minimal-Risk Tool, a secondary analysis of a randomized clinical trial. JAMA Cardiol 2017;2(4):400–408.
- 16. Blankstein R, Divakaran S, Shaw L. When can we defer testing for patients with stable chest pain? JACC Cardiovasc Imag 2018;11(9):1311–1314.
- 17. Jang JJ, Bhapkar M, Coles A, et al.; PROMISE Investigators. Predictive model for high-risk coronary artery disease. Circ Cardiovasc Imag 2019;12(2):e007940.
- 18. Genders TS, Steyerberg EW, Alkadhi H, et al.; CAD Consortium. A clinical prediction rule for the diagnosis of coronary artery disease: validation, updating, and extension. Eur Heart J 2011;32(11):1316–1330.
- 19. Singer AJ, Than MP, Smith S, et al. Missed myocardial infarctions in ED patients prospectively categorized as low risk by established risk scores. Am J Emerg Med 2017;35(5):704–709.
- 20. National Institute for Health and Care Excellence. Chest pain of recent onset: assessment and diagnosis. March 2010 (updated Nov. 2016). https://www.nice.org.uk/guidance/cg95.
- 21. Brush JE Jr, Lee M, Sherbino J, Taylor-Fishwick JC, Norman G. Effect of teaching Bayesian methods using learning by concept vs learning by example on medical students’ ability to estimate probability of a diagnosis: a randomized clinical trial. JAMA Netw Open 2019;2(12):e1918023.
- 22. Cheng VY, Berman DS, Rozanski A, et al. Performance of the traditional age, sex, and angina typicality-based approach for estimating pretest probability of angiographically significant coronary artery disease in patients undergoing coronary computed tomographic angiography: results from the multinational coronary CT angiography evaluation for clinical outcomes: an international multicenter registry (CONFIRM). Circulation 2011;124(22):2423–32, 1–8.
- 23. Genders TS, Steyerberg EW, Hunink MG, et al. Prediction model to estimate presence of coronary artery disease: retrospective pooled analysis of existing cohorts. BMJ 2012;344.
- 24. Sekhri N, Perel P, Clayton T, et al. A 10-year prognostic model for patients with suspected angina attending a chest pain clinic. Heart 2016;102(11):869–875.
- 25. Kong DF, Lee KL, Harrell FE Jr, et al. Clinical experience and predicting survival in coronary disease. Arch Intern Med 1989;149(5):1177–1181.
- 26. Foldyna B, Udelson JE, Karády J, et al. Pretest probability for patients with suspected obstructive coronary artery disease: re-evaluating Diamond-Forrester for the contemporary era and clinical implications: insights from the PROMISE trial. Eur Heart J Cardiovasc Imaging 2019;20(5):574–581.
