Abstract
We examined physician diagnostic certainty as one reason for cross-national medical practice variation. Data are from a factorial experiment conducted in the United States, the United Kingdom, and Germany, estimating 384 generalist physicians’ diagnostic and treatment decisions for videotaped vignettes of actor patients depicting a presentation consistent with coronary heart disease (CHD). Despite identical vignette presentations, we observed significant differences across health care systems, with US physicians being the most certain and German physicians the least certain (p < .0001). Physicians were least certain of a CHD diagnosis when patients were younger and female (p < .0086), and there was additional variation by health care system (as represented by country) depending on patient age (p < .0100) and race (p < .0021). Certainty was positively correlated with several clinical actions, including test ordering, prescriptions, referrals to specialists, and time to follow-up.
Keywords: clinical decision making, medical practice variation, health disparities
Variations in the provision of healthcare and associated health disparities are a topic of longstanding concern for policymakers, clinical providers, and health researchers alike. While differences in clinical decision making (CDM) have been observed for many conditions (1), coronary heart disease has received particular attention and has been shown repeatedly to result in differential diagnostic and treatment decisions by physicians (2, 3). Differences have been observed in the use of coronary revascularization services (4), hospitalization for hypertension (5), history taking (6), as well as gender differences in attributions of cardiac-related symptoms (7).
These variations in medical practice occur as a function of patient characteristics such as race (4, 5), age (5, 7, 8), socioeconomic status (9–11), gender (12–14), and comorbidity status (15), as well as provider and system attributes such as gender (16), attitudes toward aging (17), perceptions of pressure from patients (18), and practice culture (19). Studies of the United States, including especially the RAND Health Services Utilization Study (20) and the Dartmouth Atlas of Health Care (21, 22) project, have consistently documented geographic variations in healthcare and worked over the last two decades to understand why they occur. Similar differences have been observed cross-nationally among the US, France, and England (23); the US and Canada (24); the United States and United Kingdom (25); among Eastern European countries (26); and in the SYMPHONY trial, a study of 37 countries (27).
How physicians process complex and varied sources of information has been a topic of interest for decades. Classic studies underscore the importance of threshold models for conceptualizing the triggers that prompt physicians to do further testing, and if warranted, provide treatment (28, 29). More recently, related literature in social psychology and economics examines how physicians’ cognitive processing—particularly prejudice, stereotyping, discrimination, and uncertainty—may bias their assessments of patients and decisions about their treatment (30–32). Low health literacy also contributes to difficulties in chronic illness management, and such challenges to doctor-patient communication may exacerbate physician uncertainty (33, 34). This literature, which is quite focused on providing micro-level explanations for observed variation, is often not equipped to also provide information about cross-national differences.
We build on these two literatures by examining how physicians’ diagnostic certainty functions in clinical decision making for coronary heart disease (CHD). Using cross-national experiment data, we are able to simultaneously (1) address the role of diagnostic certainty in observed variations and (2) estimate relative contributions of patient, provider, and health care system influences on clinical decision making.
From a Bayesian decision theory perspective, these variations in clinical decision making should be determined in part by the prevalence of the relevant condition in the larger population. In cross-national studies, for example, women have slightly higher prevalence of angina relative to men (despite higher mortality rates among men), a difference that should influence individual clinical decisions (35–37). Research on the socially constructed aspects of health statistics, however, suggests that biases in clinical decision making and medical treatment contribute independently to differences in some types of health statistics (38). Additional work has shown women have poorer outcomes after acute myocardial infarction (39) and in cardiovascular and diabetes care (13, 14), both after adjusting for covariates. From this perspective, differences in rates may not only reflect epidemiologic differences in underlying disease, but also cumulative interactions between patients and physicians.
Using data from a video vignette experiment, we examine the magnitude and sources of variation in CDM for identical presentations of CHD in three different health care systems: (1) the free market medical system in the United States, (2) the government-based National Health Service of the United Kingdom (40), and (3) the non-profit insurer based system used in Germany (41). This work expands existing social psychological research on CDM to include cross-national comparisons, as well as providing specific evidence for how uncertainty operates in CDM. Building on earlier work showing physicians in different countries had comparable rates of correct diagnoses, we find that diagnostic certainty has independent and unique effects on clinical decision making (25). We address the following research questions: (1) How certain were physicians of their diagnoses of CHD, and how did that vary by health care system? (2) Which types of physicians were the most certain? (3) Which types of patients elicited the highest certainty levels among physicians? (4) How did these patient and provider effects vary across three countries? (5) How was diagnostic certainty associated with subsequent patient management, such as information seeking, test ordering, prescribing, lifestyle recommendations, and referrals/follow-up?
DATA AND METHODS
We conducted a factorial experiment to simultaneously measure the effects of: (a) patient attributes (age, gender, race and socioeconomic status); (b) physician characteristics (gender and years of clinical experience); and (c) separate healthcare systems (the US, the UK, or Germany) on physician diagnostic certainty and subsequent medical decision making when providers are presented with identical signs and symptoms indicative of CHD. Experiments were conducted in the US (Massachusetts), the UK (the West Midlands, SE London and Surrey), and Germany (Northern Rhine/Westfalia region) (42). A full factorial of combinations of patient age (55 vs. 75), gender, race (white vs. black in the US; white vs. Afro Caribbean in the UK; and white only in Germany) and SES (lower vs. higher social class, depicted by occupation as a cleaner/janitor vs. a teacher) was used for the videotaped vignette scenarios (2^4 = 16 unique vignettes). The decision to omit the race factor from the German experiment was based on discussions with our German colleagues, who advised us that the physicians in our sampling area saw few Black patients in their everyday practices. Audio for the vignettes was dubbed into German and backtranslated to ensure accuracy. One of the 16 combinations was shown to each physician.
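The full factorial of patient attributes can be enumerated directly. The following is a minimal Python sketch, not the study's own procedure: the dictionary keys, level labels, and the function name `full_factorial` are illustrative assumptions, and the race factor is listed with the US labels even though it was not varied in Germany.

```python
from itertools import product

# Two-level patient factors described in the text.
# Keys and level labels are illustrative, not the study's own coding.
FACTORS = {
    "age": (55, 75),
    "gender": ("male", "female"),
    "race": ("white", "black"),  # white vs. Afro Caribbean in the UK; not varied in Germany
    "ses": ("lower", "higher"),
}


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


vignettes = full_factorial(FACTORS)
print(len(vignettes))  # 2**4 = 16 combinations
```

Each physician would then be assigned exactly one element of `vignettes`, which is what allows the patient effects to be estimated without confounding.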
CHD was selected for the vignettes because: a) it is among the most common and costly problems presented by older patients to primary care providers (43); b) it represents a clinically well-defined medical condition; c) it admits a range of diagnostic, therapeutic and lifestyle actions; and d) its reported prevalence differs among the US, the UK, and Germany. Scripts for the vignettes were developed from several tape-recorded role-playing sessions with experienced clinical advisors, and professional actors were trained (under experienced physician supervision) to realistically portray a patient presenting with the signs/symptoms of disease to a primary care provider. Patients in the vignette presented with signs and symptoms that were consistent with CHD, including chest pain worsening with exertion, pain in the back between the shoulder blades, stress, and elevated blood pressure. All patients included a non-verbal “Levine’s fist,” a well-known gesture indicative of cardiac pain. Because CHD is a spectrum condition, and live patients do not typically present as clear-cut textbook cases of specific conditions, the vignette also built in several red herring symptoms potentially indicative of a gastrointestinal diagnosis. To this end, the patient also complained of indigestion, feeling worse after a large or spicy meal, having pain similar to heartburn but unresponsive to antacids, and feeling full and “gassy.” This was done not to specifically make the physicians’ diagnostic task more difficult, but to more accurately represent how actual patients present, based on advice from our clinical advisors. The vignette also incorporated references to the patient’s mood, including the spouse’s report that the patient has been difficult to be around and the patient’s self-report of feeling irritated and having decreased energy (see Appendix A for an illustrative excerpt from the vignette).
After viewing the videotaped vignette, physicians were asked, “What do you think is going on with this patient?”, and for each possibility, they were asked for their level of certainty on a scale of 0–100 (where 0 = no certainty and 100 = complete certainty). Physicians were asked through an open-ended question from the interviewer to list their full set of differential diagnoses, and if CHD was present anywhere on that list they were counted as having considered the condition and the accompanying certainty level was included in the present analysis. Physicians were also asked how they would manage the patient in terms of asking for additional information, performing a physical examination, ordering tests, prescribing medications, giving lifestyle advice, and referring to other physicians.
Based on the theoretical approach outlined above and our concerns about the social construction of epidemiologic base rates, we do not appeal to population rates of disease to ascertain the “correctness” of a given diagnosis. Rather, we assume that while a physician’s diagnostic priors (among many other sources of learning) may inform decision making, patient-specific clinical information should be used above and beyond those pre-existing base rates. Therefore, the vignette for this study was purposely designed to present a set of signs and symptoms sufficient to trigger a CHD diagnosis, regardless of the epidemiologic prevalence in any sub-group of the population. In this sense, the vignette provides sufficient information to suspect the condition, regardless of the social characteristics of the patient.
The study used a probability sample of physicians selected from within each of four strata within each country. To be eligible for selection, physicians had to: (a) be internists or family practitioners in the US and in Germany, or general practitioners in the UK (to most accurately capture the types of non-specialist physicians most likely to treat undiagnosed cases of CHD in each country); (b) be trained at an accredited medical school in the country in which they practiced (no international medical graduates were included); and (c) be currently in clinical practice more than half-time. Within each country, physicians were stratified into four equal cells by gender and level of experience, with “less” experience defined as those with ≤12 years since graduation from medical school in the US or UK (≤7 years since licensure in Germany) and those with “more” experience having ≥22 years since graduation from medical school in the US or UK (≥17 years since licensure in Germany). These cutoff dates were chosen to act as a proxy for clinical experience (which has been shown to affect clinical decision making, as discussed above); to standardize as much as possible the amounts of clinical experience across countries; and to create a clear separation between the strata. Twelve strata of physician characteristics (gender, years of clinical experience [≤12 or ≥22 years] and health care system [US/UK/Germany]) were defined, with 32 physicians included in each stratum. This configuration generated a total of 384 physicians required to complete the design of the experiment (16 vignettes x 12 physician strata x 2 replications = 384) (Table 1).
Table 1.
Country | US (N=128 in Massachusetts) | | UK (N=64 from The Midlands; N=64 from Surrey/SE London) | | Germany (N=128) | | |
---|---|---|---|---|---|---|---|
Level of experience | Less* | More** | Less | More | Less | More | Total |
Male | 32 | 32 | 32 | 32 | 32 | 32 | 192 |
Female | 32 | 32 | 32 | 32 | 32 | 32 | 192 |
Total | 64 | 64 | 64 | 64 | 64 | 64 | 384 |
Response rate | 64.9% | | 59.6% | | 65.0% | | |
* “Less” experience is defined as those with ≤12 years since graduation from medical school in the US or UK (≤7 years since licensure in Germany).
** “More” experience is defined as having ≥22 years since graduation from medical school in the US or UK (≥17 years since licensure in Germany).
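The cell counts in Table 1 follow from the design arithmetic. The sketch below, in Python, reproduces those counts from the constants given in the text; the variable names are illustrative assumptions.

```python
# Design constants taken from the text; variable names are illustrative.
N_VIGNETTES = 16                        # 2**4 patient combinations
COUNTRIES = ("US", "UK", "Germany")
PHYSICIAN_GENDERS = ("male", "female")
EXPERIENCE_LEVELS = ("less", "more")
REPLICATIONS = 2

# One stratum per (country, physician gender, experience level) combination.
strata = [
    (country, gender, experience)
    for country in COUNTRIES
    for gender in PHYSICIAN_GENDERS
    for experience in EXPERIENCE_LEVELS
]

per_stratum = N_VIGNETTES * REPLICATIONS          # physicians per stratum
total = len(strata) * per_stratum                 # physicians overall
per_country = total // len(COUNTRIES)             # physicians per country

print(len(strata), per_stratum, per_country, total)
```

Because every stratum sees each of the 16 vignettes twice, the physician factors are balanced against the patient factors by construction.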
Screening telephone calls were conducted to identify eligible subjects and an hour-long appointment was scheduled for the interviewer to come to the physician’s place of work to show the vignette and administer related questions. The interviews were conducted in 2001–2 (128 throughout Massachusetts, 64 from The Midlands and 64 throughout Surrey and SE London, England) and in 2005–6 in Germany; no more than one physician was selected from each practice. Each physician subject was provided a modest stipend to partially offset lost revenue [$100 (US), £50 (UK), 100 Euros (Germany)] and to acknowledge his or her participation. Quality control interviews and site visits were conducted and selected tape-recorded interviews were reviewed by supervisors on a regular basis.
Analysis of variance was used to test the main effects and two-way interactions of the design variables (patient gender, race, age, and SES, and physician gender and level of experience) on the diagnostic certainty (0–100, with 0 for not at all certain and 100 for completely certain). If the physician did not make a CHD diagnosis, his or her certainty for these diagnoses was set to 0. To determine the effect of certainty on clinical decision making, logistic regression was used for dichotomous variables (e.g., whether or not an EKG was ordered) and analysis of covariance was used for continuous variables (e.g., number of days to next appointment). Each model included as explanatory variables the design variables, certainty, and the interaction of the design variables and certainty. Using backwards elimination, non-significant effects (at the 0.05 level) other than certainty were removed from the model, leaving a parsimonious model. Due to the challenges of multiple testing, we emphasize consistency across results and focus on identifying general patterns of physician certainty and treatment decisions. Furthermore, the results we observe at the p < .01 level are unlikely to change. To facilitate interpretation, we present actual p-values, unadjusted for multiple testing. To further facilitate interpretation of results, we indicate in Tables 2 and 3 the number of expected and observed significant results.
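The expected number of significant results reported in the tables can be read as the number of tests multiplied by the significance level. The Python sketch below illustrates that reading under stated assumptions: the count of 28 tests for Table 2 (7 main effects plus 21 two-way interactions among seven factors) is inferred from the reported value of 1.4 rather than stated in the text, and the binomial tail assumes independent tests, which is a simplification.

```python
from math import comb

ALPHA = 0.05  # significance level used in the text


def expected_significant(n_tests, alpha=ALPHA):
    """Expected number of significant results if every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least(k, n_tests, alpha=ALPHA):
    """P(X >= k) for X ~ Binomial(n_tests, alpha); assumes independent tests."""
    return sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )


# Assumed test count for Table 2: 7 main effects + C(7, 2) = 21 interactions.
n_tests_table2 = 7 + comb(7, 2)
print(expected_significant(n_tests_table2))   # about 1.4, matching Table 2
print(prob_at_least(7, n_tests_table2))       # chance of 7 or more by luck alone
```

A small tail probability here would suggest that the observed count of significant effects is unlikely to be an artifact of multiple testing alone.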
Table 2.
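Experimental Factor | Level | Mean Certainty | p-value |
---|---|---|---|
Patient Gender | Male | 51.8 | .0058 |
Patient Gender | Female | 43.7 | |
Patient Age, years | 55 | 42.4 | .0003 |
Patient Age, years | 75 | 53.1 | |
Patient Race (UK/US only) | Black | 50.7 | .4024 |
Patient Race (UK/US only) | White | 53.6 | |
Patient SES | Lower | 45.9 | .1998 |
Patient SES | Upper | 49.6 | |
Physician Gender | Male | 48.3 | .7221 |
Physician Gender | Female | 47.2 | |
Physician Experience | Less | 45.7 | .1575 |
Physician Experience | More | 49.8 | |
Practice Setting (Country) | Germany | 38.9 | <.0001 |
Practice Setting (Country) | UK | 46.4 | |
Practice Setting (Country) | US | 57.9 | |
Patient Gender x Patient Age | Male, 55 | 50.3 | .0086 |
Patient Gender x Patient Age | Male, 75 | 53.3 | |
Patient Gender x Patient Age | Female, 55 | 34.5 | |
Patient Gender x Patient Age | Female, 75 | 52.9 | |
Patient Race x Practice Setting (Country) (UK/US only for p-value) | Germany, White | 38.9 | .0021 |
Patient Race x Practice Setting (Country) (UK/US only for p-value) | UK, Black | 50.3 | |
Patient Race x Practice Setting (Country) (UK/US only for p-value) | UK, White | 42.5 | |
Patient Race x Practice Setting (Country) (UK/US only for p-value) | US, Black | 51.2 | |
Patient Race x Practice Setting (Country) (UK/US only for p-value) | US, White | 64.7 | |
Patient Race x Level of Experience (UK/US only) | Black, less | 44.5 | .0500 |
Patient Race x Level of Experience (UK/US only) | Black, more | 56.9 | |
Patient Race x Level of Experience (UK/US only) | White, less | 54.1 | |
Patient Race x Level of Experience (UK/US only) | White, more | 53.0 | |
Patient Age x Practice Setting (Country) | 55, Germany | 27.6 | .0100 |
Patient Age x Practice Setting (Country) | 55, UK | 42.2 | |
Patient Age x Practice Setting (Country) | 55, US | 57.3 | |
Patient Age x Practice Setting (Country) | 75, Germany | 50.3 | |
Patient Age x Practice Setting (Country) | 75, UK | 50.5 | |
Patient Age x Practice Setting (Country) | 75, US | 58.5 | |
Expected significant: 1.4
Observed significant: 7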
Table 3.
3a. Continuous Variables (Change per 10 point increase in certainty) | |||
---|---|---|---|
Variable | Change per 10 point increase in certainty | 95% confidence interval | P |
Number of questions | −0.04 | −0.15, 0.07 | .4178 |
Number of examinations | 0.06 | −0.02, 0.14 | .1137 |
Number of tests for CHD | |||
Germany | 0.02 | −0.06, 0.10 | .5815 |
UK | 0.56 | 0.48, 0.64 | <.0001 |
US | 0.39 | 0.31, 0.47 | <.0001 |
Time to next appointment (days) | −0.44 | −0.68, −0.20 | .0003 |
Number of pieces of lifestyle advice | −0.02 | −0.07, 0.03 | .5360 |
| |||
Expected significant | .25 | ||
Observed significant | 3 |
3b. Dichotomous Variables (Odds ratio per 10 point increase in certainty) | |||
---|---|---|---|
Variable | OR per 10 point increase in certainty | 95% confidence interval | p |
Information seeking
| |||
4 or more questions | 1.00 | 0.92, 1.08 | .9332 |
Questions about: | |||
pathology | 0.96 | 0.89, 1.03 | .2704 |
medical history | 1.06 | 0.98, 1.14 | .1675 |
pain | 1.03 | 0.96, 1.11 | .4437 |
smoking: | |||
lower SES patient | 1.14 | 1.04, 1.25 | .0065 |
upper SES patient | 1.00 | 0.91, 1.09 | .9417 |
alcohol | 0.96 | 0.87, 1.05 | .3215 |
psychological state | 0.91 | 0.84, 0.98 | .0163 |
social questions | 0.88 | 0.81, 0.96 | .0062 |
general questions | 1.10 | 1.01, 1.19 | .0232 |
| |||
Physical Examination
| |||
Complete physical | 0.99 | 0.90, 1.09 | .8828 |
| |||
Test ordering
| |||
Order tests for CHD | |||
Germany | 1.06 | 0.97, 1.17 | .1714 |
UK | 1.86 | 1.53, 2.24 | <.0001 |
US | 1.91 | 1.56, 2.33 | <.0001 |
Stress test | 1.34 | 1.22, 1.47 | <.0001 |
ECG/EKG | |||
Germany | 1.46 | 1.28, 1.66 | <.0001 |
UK | 1.31 | 1.18, 1.45 | <.0001 |
US | 1.95 | 1.61, 2.36 | <.0001 |
| |||
Prescription writing
| |||
CHD appropriate prescription | 1.53 | 1.38, 1.70 | <.0001 |
| |||
Referrals
| |||
Referral to cardiologist | 1.23 | 1.12, 1.36 | <.0001 |
Referral to other medical professional | 0.93 | 0.85, 1.00 | .0641 |
| |||
Advice giving
| |||
Advice about: | |||
Diet | 0.97 | 0.91, 1.04 | .4291 |
Smoking | 1.00 | 0.92, 1.09 | .9896 |
Alcohol | 0.94 | 0.86, 1.02 | .1446 |
Relaxation | 0.91 | 0.81, 1.03 | .1434 |
Exercise | 1.07 | 0.94, 1.22 | .3009 |
Weight | 0.99 | 0.85, 1.17 | .9424 |
| |||
Expected significant | 1.1 | ||
Observed significant | 12 |
RESULTS
1. How certain were physicians and how did that vary by practice setting (country)?
Across all three countries, the vast majority of physicians correctly considered CHD (74.2% in Germany, 88.3% in the UK, 95.3% in the US, and 85.9% overall, p < .0001), yet there were also significant differences in how many physicians failed to consider CHD in each health care system, with a difference of 21.1 percentage points between Germany and the US.
Physicians’ certainty levels for CHD diagnosis varied from 0 to 100, with an average of 52.1 (see Figure 1). Again, there was significant variation across countries, with the US physicians having the highest average certainty (57.9), followed by the UK (46.4) and Germany (38.9) (p<.0001)(Table 2). Using Tukey’s Method of multiple comparisons, we found that the level of certainty in the US was significantly higher than that in either the UK or Germany, while the certainty levels were statistically comparable in the UK and Germany.
2. Which physicians were the most certain?
We next examined whether physician characteristics (gender and years of experience) were associated with certainty, independent of the health care system in which providers practiced (Table 2). While there were no significant main effects for physician characteristics, we did observe an interaction between patient race and physician level of experience (Figure 2a). With white patients, physicians had comparable certainty levels (54.1 vs. 53.0, less and more experienced, respectively). With black patients, however, more experienced physicians had increased certainty (56.9) while those with less experience were less certain (44.5).
3. With which patients were physicians most certain?
Independent of health care system differences, physician certainty varied significantly according to the gender of the patient, with physicians reporting higher average certainty levels with male versus female patients (51.8 vs. 43.7, p = .0058) and by age with higher certainty for older patients (53.1 vs. 42.4, p = .0003). We also observed an interaction between patient age and gender, such that physicians were much less certain in making a CHD diagnosis for younger women (34.5) with otherwise identical symptom presentation (p = .0086) (Figure 2b). There were no main effects of patient race or SES on certainty (Table 2).
4. How did patient and provider effects vary across countries?
In addition to the main effect differences between countries, we also observed variation between countries according to characteristics of the presenting patient (Table 2). First, we observed an interaction between patient age and health care system, with US physicians having comparable diagnostic certainty for younger (55-year-old) and older (75-year-old) patients (57.3 vs. 58.5), German physicians having the greatest difference in certainty for the two types of patients (27.6 vs. 50.3), and UK physicians falling in between (42.2 vs. 50.5) (p = .0100) (Figure 2c). Second, there was an interaction between health care system and patient race. While physicians in the US and UK had similar certainty levels for black patients (51.2 vs. 50.3), their certainty levels diverged when the patient was white, with US physicians having increased certainty and UK physicians being less certain (64.7 vs. 42.5, p = .0021) (Figure 2d).
5. What was the effect of certainty on clinical decision making?
In turn, physicians’ diagnostic certainty for CHD significantly influenced their subsequent diagnostic and therapeutic clinical actions (Tables 3a and 3b). Logistic regression results showed that for each ten point increase in certainty for a CHD diagnosis, physicians were less likely to ask questions about the patient’s psychological state (OR 0.91, p = .0163) or social environment (OR 0.88, p = .0062), but they were more likely to ask about other general information (OR 1.10, p = .0232) (Table 3b). We also observed an interaction of patient SES and certainty on the odds of asking questions about smoking. As the certainty of the CHD diagnosis increased, physicians were more likely to ask lower SES patients about smoking (OR 1.14, p = .0065), but certainty had no effect on the likelihood of physicians asking upper SES patients about smoking.
Certainty was also significant for test ordering behavior. For example, as certainty of a CHD diagnosis increased, physicians from the US and the UK were significantly more likely to order at least one diagnostic test for CHD (p < .0001), while this relationship did not hold for German physicians (Table 3b). Furthermore, as certainty increased, UK and US physicians ordered greater numbers of CHD-related tests (p < .0001), while certainty did not significantly affect the number of CHD-related tests that the German physicians ordered (Table 3a). However, increased certainty was associated with increased likelihood of physicians from all three countries ordering stress tests and ECG/EKG tests (p < .0001), and the ECG/EKG effect was the strongest in the US (OR 1.95, p < .0001) (Table 3b).
Increased certainty was associated with higher odds of writing a CHD-appropriate prescription (beta blockers, calcium channel blockers, aspirin, and short acting nitrates) (OR 1.53, p < .0001) in all three countries. Higher certainty of a CHD diagnosis also increased the odds of a physician referring the patient to a cardiologist (OR 1.23, p < .0001). Finally, increased certainty was significant for predicting how soon a physician would request to see the patient again (p = .0003), with a repeat visit requested 0.44 days sooner per 10 point increase in certainty.
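Because the odds ratios are reported per 10 point increase in certainty, they can be rescaled to other increments. The Python sketch below shows that conversion under the usual logistic regression assumption that the log-odds are linear in certainty; the function name `rescale_odds_ratio` is a hypothetical helper, not part of the original analysis.

```python
def rescale_odds_ratio(or_per_10, points):
    """Odds ratio for a `points`-point certainty increase.

    Assumes log-odds are linear in certainty, so the odds ratio per 10
    points compounds multiplicatively over larger or smaller increments.
    """
    return or_per_10 ** (points / 10)


# Reported OR for a CHD-appropriate prescription: 1.53 per 10 points.
print(round(rescale_odds_ratio(1.53, 30), 2))  # 1.53 ** 3 = 3.58
```

Under this assumption, a physician who is 30 points more certain has roughly 3.6 times the odds of writing a CHD-appropriate prescription.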
DISCUSSION
We observed significant differences between health care systems, with US physicians having the highest levels of certainty. As expected, physicians were least certain of their CHD diagnoses with younger female patients (12, 44). In addition, there was racial variation depending on the physician’s level of experience and variation by health care system depending on patient age and race. Increased certainty was associated with differences in information seeking as well as increased test ordering, prescriptions, referrals to cardiologists, and shorter time to follow-up.
Previous studies of medical practice variation have largely focused on either system-level social and economic patterns in CDM or on physicians’ individual-level cognitive processing as mechanisms that may generate such variation. The factorial design of our experiment allowed for unconfounded estimates of the simultaneous effects of patient characteristics, provider attributes, and healthcare systems (as represented by country) on physicians’ certainty of CHD diagnoses and their subsequent clinical actions. With our analytic approach, we were able to capture physicians at all points on the certainty spectrum rather than excluding those who did not list CHD among their differential diagnosis selections. While those who did not consider CHD among their differential diagnosis selections did not provide an explicit certainty value, we know their certainty concerning the presence of CHD was low because they were allowed to list a full set of diagnoses (we recognize that physicians may have high certainty that CHD is absent from the vignette, but this is beyond the scope of the present analysis). Therefore, if they were treated as missing they would not be missing at random. By increasing the variability, our estimates are rendered more conservative due to a decrease in power.
The sample size of 384 allows us to detect a difference in certainty of 12 points with 97% power, assuming a standard deviation of 30 (which, assuming a unimodal Beta distribution of 0–1, is the upper bound of a standard deviation). That is, a true 12 point difference in certainty between two groups will be detected 97 percent of the time at α = 0.05. Because the experiment was replicated, a pure error term with 192 degrees of freedom was used to test all effects using analysis of variance. Due to the omission of the race factor for the German experiment (explained above), only the US and UK data are considered for the effect of patient race on certainty. For all other analyses, all data were considered.
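The stated power can be checked approximately. The Python sketch below uses a normal approximation for a two-group comparison of means rather than the F test from the analysis of variance, and it assumes the contrast is a main effect that splits the 384 physicians into two groups of 192; both choices are simplifying assumptions, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def two_group_power(diff, sd, n_per_group, alpha=0.05):
    """Approximate two-sided power for comparing two group means.

    Uses a normal approximation: the standard error of the difference is
    sd * sqrt(2 / n_per_group), and power is the probability that the
    test statistic falls outside the two-sided critical values.
    """
    z = NormalDist()
    se = sd * sqrt(2 / n_per_group)
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = diff / se
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# 12 point difference, SD of 30, two groups of 192 (assumed split of 384).
power = two_group_power(diff=12, sd=30, n_per_group=192)
print(round(power, 3))
```

Under these assumptions the approximation gives a value close to the 97% stated in the text.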
The main effects related to patient characteristics (decreased certainty with female and younger patients) partially corroborate Bayesian perspectives and studies of uncertainty and statistical discrimination (31, 45) suggesting that when physicians are uncertain, they are likely to make diagnostic decisions that are consistent with existing epidemiologic base rates. In this case prior assumptions overwhelm the presenting patient-specific data, thereby contributing to the reification of some types of existing health statistics. Most importantly, these findings extend previous work by showing that certainty—not simply identifying a diagnosis, but having diagnostic certainty about that condition—has an independent effect on clinical actions (46). Therefore, these results suggest that having CHD on the differential diagnosis list is necessary, but not sufficient, for physicians to take appropriate therapeutic actions; this result is consistent with the notion that physicians need to pass certainty thresholds in order to order more tests or treat a patient (28). By extension, improving disparities in CHD outcomes is not just a matter of physicians learning to more appropriately consider CHD in specific populations (for example, women), but also to be able to do so with sufficient certainty to trigger appropriate clinical actions to improve morbidity and mortality outcomes.
However, these perspectives do not fully explain our results and the persistence of some between-health care system differences implies that features of the broader sociological, cultural, and organizational environments are also relevant to decision making. Beyond diagnostic certainty, observed cross-national variation in CHD diagnosis and treatment may also be a function of differences in a series of influences that are beyond the scope of the present analysis or our study more generally. These include both patients’ and physicians’ cultural expectations for medical practice and treatment as well as biological variations in the prevalence of CHD and related conditions at a population level, such that physicians in different healthcare systems may be differentially equipped to identify CHD with certainty. For example, previous research has shown cross-cultural differences in the relationship between symptoms and underlying conditions (47, 48). Other possible explanations include funding mechanisms, expectations for physicians to achieve diagnostic certainty in a brief period of time, modes of practice, and access to resources across the three countries. For example, economic reimbursement policies may translate to more pressure on US physicians to achieve a firm diagnosis and management plan during the initial patient consultation so they can be paid, whereas in the UK and Germany physicians may tend to make these decisions over a series of consultations close together in time. Similarly, ready availability of technological equipment in Massachusetts in conjunction with a fear of lawsuits for missed diagnoses may lead to increased testing in the US relative to physicians in the UK or Germany, while increased rates of referral among UK and German physicians (25) may explain lower rates of testing and prescription treatment relative to the US. Increased regulation in the UK with pay-for-performance may also contribute to the UK physicians’ higher certainty relative to their German counterparts, where professional pressures and regulations are not yet as explicit as in the US (49). Most generally, physician learning is known to be related to the local context of practice, such that physicians may either self-select into environments with practice styles like their own, or they may adapt to the local culture; either type of pattern could exist within the local contexts that were selected from each country for this study.
Every study represents a balance between internal and external validity. We recognize that our vignette-based approach has some limitations compared to studies of behavior in natural interaction, the most obvious being that physicians do not directly interact with the patient in the vignette. For present purposes, vignettes offer several key advantages over alternative methods: (1) they allow for the manipulation of several variables at once and the measurement of unconfounded effects, thereby “isolating physicians’ decision making from other factors in the environment” (50); (2) they standardize case mix; (3) they allow for the collection of a large amount of information simultaneously from a large number of subjects; (4) they make efficient use of time; and (5) they are cost-effective (for example, standardized patients would have been prohibitively expensive in this context). In a direct comparison of vignettes, standardized patients, and chart abstraction, Peabody and colleagues (51) validated the use of vignettes for studying quality of outpatient care, and studies comparing vignettes with standardized patients and other methods corroborate the result that vignettes are ecologically valid for studies of medical decision making (50, 52, 53).
We took four precautionary steps in an attempt to minimize possible threats to external validity and compensate for the artificial aspects of the experimental situation (i.e., that physicians may behave differently with a videotaped patient under experimental conditions compared with real patients in an everyday clinical setting). First, considerable effort was devoted to ensuring the clinical authenticity of the videotaped presentation. Expert clinical consultants were actively involved in all stages of the process, from early stages of role-playing and script development to final stages of film shooting, where they oversaw vignette filming to determine face and content validity. Second, the doctors viewed the tapes in the context of their practice day (often during their lunch periods) to maximize the likelihood that they encountered real patients before and after they viewed the patient in the videotape, and also so they were in a physical setting they associated with decision making. Third, the doctors were specifically instructed at the outset to view the patient as one of their own patients and to respond as they would typically respond in their own practice. Fourth, we assessed how typical the doctors found the vignettes: 90.6% of the doctors in the US, 91.4% in the UK, and 81.3% in Germany considered them to be very or reasonably typical.
Our study has limitations that underscore the need for additional research. First, questions remain about how physicians cognitively process these cues from patients. Previous research in cognitive psychology has suggested that physicians often rely on pattern recognition as well as more analytic types of processing, such as Bayesian decision making, when evaluating patient cues (54, 55). However, these questions are beyond the scope of the present study, and these data do not allow us to specify the exact cognitive and psychological processes physicians use when interpreting information from the vignettes. A similar study of US physicians (with the same CHD vignette) primed physicians in order to determine whether the under-diagnosis of CHD in some patient populations was due to physicians not considering that diagnosis, or to their considering it and then eliminating it from their differential diagnosis (56). Also beyond the scope of this study, but potentially related, is the question of differences in clinical decision making practices between family practitioners and internists, despite both groups being likely to treat the type of patient depicted in the vignette.
While the cross-national component of our study identified persistent differences between countries, there remains limited generalizability from each group of physicians to the entire population of physicians in their respective countries. In terms of statistical and clinical significance, our results are relatively modest and therefore limited in their ability to explain the wide range of cross-national variations that have been identified in existing literature. However, the possible explanations outlined above build on our current results and are promising avenues for future inquiry.
In summary, our findings underscore the role of uncertainty during the clinical decision making process in contributing to, or amplifying, CHD-related disparities. To the extent that inequalities are generated from within healthcare systems, researchers and policy makers should continue to develop interventions targeted at the level of the patient-physician encounter while also considering which broader, system-level factors influence individual physician behaviors. Because diagnostic certainty is so important for understanding subsequent clinical actions, our results also highlight the need for interventions that not only increase diagnostic accuracy but also increase certainty, in order to lead to optimal therapeutic actions.
Acknowledgments
Financial support for this study was provided entirely by a grant from the National Institutes of Health, National Institute on Aging (Grant #AG16747). The study sponsors had no involvement in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the paper for publication.
Appendix A. Illustrative excerpt from vignette script for CHD
Doctor: So, what brings you here today?
Patient: Actually, I’ve been having a fair amount of indigestion.
Doctor: Indigestion?
Patient: Yes.
Doctor: Can you describe it?
Patient: Yes, I just sort of get this feeling right here (rubs chest/stomach area) … it’s not really a pain, actually. It’s just really uncomfortable. It’s usually after a big meal, if I’ve eaten too much, or if the food’s particularly spicy, maybe. I sometimes feel kind of sick, too.
Doctor: Spicy foods, mostly, or does this happen after other foods too?
Patient: Actually, it’s probably more common after a big meal, and especially if I have to rush to get somewhere afterwards. I get sort of queasy. It’s really uncomfortable.
Doctor: How long has this been going on?
Patient: About three months.
Doctor: Would you say this is something new, or has this kind of thing happened before?
Patient: No, this is new. Now, sometimes I do get heartburn, but that usually goes away with antacids.
Doctor: How is this similar or different from the heartburn?
Patient: Well, the heartburn’s right here too (massaging the same spot), but it burns. This other stuff is just, um, just a bad feeling. It’s hard to describe.
Doctor: Tell me about the last time this happened. Where were you?
Patient: (pauses, thinking) The last time? We had just gone out for a meal and were walking to the car, quickly I remember, it was chilly. And I started to get the feeling.
Doctor: What did you do?
Patient: I stopped. And tried to get a deep breath in.
Doctor: Did that seem to help?
Patient: Yes, a bit. My [wife/husband] came over and asked if I was OK. I said, sure, I’m fine and we got into the car. After a few minutes of driving, the feeling began to go away.
References
1. Institute of Medicine. Unequal Treatment: Confronting Racial and Ethnic Disparities in Healthcare. Washington, DC: The National Academies Press; 2003.
2. Ayanian JZ, Epstein AM. Differences in the use of procedures between women and men hospitalized for coronary heart disease. N Engl J Med. 1991;325(4):221–5. doi: 10.1056/NEJM199107253250401.
3. Schwartz LM, Fisher ES, Tosteson NA, Woloshin S, Chang CH, Virnig BA, et al. Treatment and health outcomes of women and men in a cohort with coronary artery disease. Arch Intern Med. 1997;157(14):1545–51.
4. Popescu I, Vaughan-Sarrazin MS, Rosenthal GE. Differences in mortality and use of revascularization in black and white patients with acute MI admitted to hospitals with and without revascularization services. JAMA. 2007;297(22):2489–95. doi: 10.1001/jama.297.22.2489.
5. Holmes JS, Arispe IE, Moy E. Heart disease and prevention: race and age differences in heart disease prevention, treatment, and mortality. Med Care. 2005;43(3 Suppl):I33–41.
6. James TL, Feldman J, Mehta SD. Physician variability in history taking when evaluating patients presenting with chest pain in the emergency department. Acad Emerg Med. 2006;13(2):147–52. doi: 10.1197/j.aem.2005.08.007.
7. Martin R, Gordon EE, Lounsbury P. Gender disparities in the attribution of cardiac-related symptoms: contribution of common sense models of illness. Health Psychol. 1998;17(4):346–57. doi: 10.1037//0278-6133.17.4.346.
8. McKinlay JB, Potter DA, Feldman HA. Non-medical influences on medical decision-making. Soc Sci Med. 1996;42(5):769–76. doi: 10.1016/0277-9536(95)00342-8.
9. Armstrong DL, Strogatz D, Wang R. United States coronary mortality trends and community services associated with occupational structure, among blacks and whites, 1984–1998. Soc Sci Med. 2004;58(11):2349–61. doi: 10.1016/j.socscimed.2003.08.030.
10. Fincher C, Williams JE, MacLean V, Allison JJ, Kiefe CI, Canto J. Racial disparities in coronary heart disease: a sociological view of the medical literature on physician bias. Ethn Dis. 2004;14(3):360–71.
11. McKinlay JB, Link CL, Freund KM, Marceau LD, O’Donnell AB, Lutfey KE. Sources of Variation in Physician Adherence with Clinical Guidelines: Results from a Factorial Experiment. Journal of General Internal Medicine. 2007;22(3):289–96. doi: 10.1007/s11606-006-0075-2.
12. Arber S, McKinlay JB, Adams A, Marceau LD, Link CL, O’Donnell AB. Patient Characteristics and Inequalities in Doctors’ Diagnostic and Management Strategies relating to CHD: A Video-simulation Experiment. Social Science and Medicine. 2006;62(1):103–15. doi: 10.1016/j.socscimed.2005.05.028.
13. Bird CE, Fremont AM, Bierman AS, Wickstrom S, Shah M, Rector T, et al. Does quality of care for cardiovascular disease and diabetes differ by gender for enrollees in managed care plans? Womens Health Issues. 2007;17(3):131–8. doi: 10.1016/j.whi.2007.03.001.
14. Fremont AM, Correa-de-Araujo R, Hayes SN. Gender disparities in managed care: it’s time for action. Womens Health Issues. 2007;17(3):116–9. doi: 10.1016/j.whi.2007.04.001.
15. Wexler DJ, Grant RW, Meigs JB, Nathan DM, Cagliero E. Sex disparities in treatment of cardiac risk factors in patients with type 2 diabetes. Diabetes Care. 2005;28(3):514–20. doi: 10.2337/diacare.28.3.514.
16. Britt H, Bhasale A, Miles DA, Meza A, Sayer GP, Angelis M. The sex of the general practitioner: a comparison of characteristics, patients, and medical conditions managed. Med Care. 1996;34(5):403–15. doi: 10.1097/00005650-199605000-00003.
17. Collins E, Katona C, Orrell M. Management of depression in the elderly by general practitioners: II. Attitudes to ageing and factors affecting practice. Fam Pract. 1995;12(1):12–7. doi: 10.1093/fampra/12.1.12.
18. Armstrong D, Fry J, Armstrong P. Doctors’ perceptions of pressure from patients for referral. BMJ. 1991;302(6786):1186–8. doi: 10.1136/bmj.302.6786.1186.
19. Curoe A, Kralewski J, Kaissi A. Assessing the cultures of medical group practices. J Am Board Fam Pract. 2003;16(5):394–8. doi: 10.3122/jabfm.16.5.394.
20. Brook RH, Park RE, Chassin MR, Kosecoff J, Keesey J, Solomon D. Do Patient, Physician, and Hospital Characteristics Affect Appropriateness and Outcome of Selected Procedures? Santa Monica, CA: RAND; 1991.
21. Goodman DC, Grumbach K. Does having more physicians lead to better health system performance? JAMA. 2008;299(3):335–7. doi: 10.1001/jama.299.3.335.
22. Wennberg JE, Fisher ES, Skinner JS. Geography and the debate over Medicare reform. Health Aff (Millwood). 2002;(Suppl Web Exclusives):W96–114. doi: 10.1377/hlthaff.w2.96.
23. Weisz D, Gusmano MK, Rodwin VG. Gender and the treatment of heart disease in older persons in the United States, France, and England: a comparative, population-based view of a clinical phenomenon. Gend Med. 2004;1(1):29–40. doi: 10.1016/s1550-8579(04)80008-1.
24. Pilote L, Saynina O, Lavoie F, McClellan M. Cardiac procedure use and outcomes in elderly patients with acute myocardial infarction in the United States and Quebec, Canada, 1988 to 1994. Med Care. 2003;41(7):813–22. doi: 10.1097/00005650-200307000-00005.
25. McKinlay J, Link C, Arber S, Marceau L, O’Donnell A, Adams A, et al. How do Doctors in Different Countries Manage the Same Patient? Results of a Factorial Experiment. Health Services Research. 2006;41(6):2182–2200. doi: 10.1111/j.1475-6773.2006.00595.x. [Erratum in 41(6):2303.]
26. Gurjeva OS, Bukhman G, Murphy S, Cannon CP. Treatment and outcomes of eastern Europeans with coronary syndromes in OPUS-TIMI 16. Int J Cardiol. 2005;100(1):101–7. doi: 10.1016/j.ijcard.2004.08.075.
27. Kramer JM, Newby LK, Chang WC, Simes RJ, Van de Werf F, Granger CB, et al. International variation in the use of evidence-based medicines for acute coronary syndromes. Eur Heart J. 2003;24(23):2133–41. doi: 10.1016/j.ehj.2003.09.018.
28. Pauker SG, Kassirer JP. The threshold approach to clinical decision making. N Engl J Med. 1980;302(20):1109–17. doi: 10.1056/NEJM198005153022003.
29. Pauker SG, Kassirer JP. Decision analysis. N Engl J Med. 1987;316(5):250–8. doi: 10.1056/NEJM198701293160505.
30. van Ryn M. Research on the provider contribution to race/ethnicity disparities in medical care. Med Care. 2002;40(1 Suppl):I140–51. doi: 10.1097/00005650-200201001-00015.
31. Balsa AI, McGuire TG, Meredith LS. Testing for statistical discrimination in health care. Health Serv Res. 2005;40(1):227–52. doi: 10.1111/j.1475-6773.2005.00351.x.
32. Lutfey KE, Ketcham JD. Patient and provider assessments of adherence and the sources of disparities: evidence from diabetes care. Health Serv Res. 2005;40(6 Pt 1):1803–17. doi: 10.1111/j.1475-6773.2005.00433.x.
33. Villaire M, Mayer G. Chronic illness management and health literacy: an overview. J Med Pract Manage. 2007;23(3):177–81.
34. Safeer RS, Keenan J. Health literacy: the gap between physicians and patients. Am Fam Physician. 2005;72(3):463–8.
35. Bittner V. Angina pectoris: reversal of the gender gap. Circulation. 2008;117(12):1505–7. doi: 10.1161/CIRCULATIONAHA.108.764217.
36. Hemingway H, Langenberg C, Damant J, Frost C, Pyorala K, Barrett-Connor E. Prevalence of angina in women versus men: a systematic review and meta-analysis of international variations across 31 countries. Circulation. 2008;117(12):1526–36. doi: 10.1161/CIRCULATIONAHA.107.720953.
37. Shaw LJ, Lewis JF, Hlatky MA, Hsueh WA, Kelsey SF, Klein R, et al. Women’s Ischemic Syndrome Evaluation: current status and future research directions: report of the National Heart, Lung and Blood Institute workshop: October 2–4, 2002: Section 5: gender-related risk factors for ischemic heart disease. Circulation. 2004;109(6):e56–8. doi: 10.1161/01.CIR.0000116210.70548.2A.
38. McKinlay JB. Some contributions from the social system to gender inequalities in heart disease. J Health Soc Behav. 1996;37(1):1–26.
39. Marrugat J, Gil M, Sala J. Sex differences in survival rates after acute myocardial infarction. J Cardiovasc Risk. 1999;6(2):89–97. doi: 10.1177/204748739900600205.
40. Light DW. Universal health care: lessons from the British experience. Am J Public Health. 2003;93(1):25–30. doi: 10.2105/ajph.93.1.25.
41. Altenstetter C. Insights from health care in Germany. Am J Public Health. 2003;93(1):38–44. doi: 10.2105/ajph.93.1.38.
42. Fisher RA. Statistical Methods, Experimental Design and Scientific Inference. New York: Oxford University Press; 1990.
43. Cohen JW, Krauss NA. Spending and Service Use Among People with the Fifteen Most Costly Medical Conditions. Health Affairs. 2003;22(2):129–38. doi: 10.1377/hlthaff.22.2.129.
44. Arber S, McKinlay JB, Adams A, Marceau LD, Link CL, O’Donnell AB. Influence of Patient Characteristics on Doctors’ Questioning and Lifestyle Advice for Coronary Heart Disease: A UK/US Video Experiment. British Journal of General Practice. 2004;54(506):673–678.
45. Balsa AI, McGuire TG. Statistical discrimination in health care. J Health Econ. 2001;20(6):881–907. doi: 10.1016/s0167-6296(01)00101-1.
46. Lutfey KE, Link CL, Grant RW, Marceau LD, McKinlay JB. Is certainty more important than diagnosis for understanding race and gender disparities?: An experiment using coronary heart disease and depression case vignettes. Health Policy. 2008. doi: 10.1016/j.healthpol.2008.06.007.
47. Staniland JR, Clamp SE, de Dombal FT, Solheim K, Hansen S, Ronsen K, et al. Presentation and diagnosis of patients with acute abdominal pain: comparisons between Leeds, U.K. and Akershus county, Norway. Ann Chir Gynaecol. 1980;69(6):245–50.
48. Wigton RS. Social Judgement Theory and Medical Judgement. Thinking and Reasoning. 1996;2(2/3):175–90.
49. Campbell S, Reeves D, Kontopantelis E, Middleton E, Sibbald B, Roland M. Quality of primary care in England with the introduction of pay for performance. N Engl J Med. 2007;357(2):181–90. doi: 10.1056/NEJMsr065990.
50. Veloski J, Tai S, Evans AS, Nash DB. Clinical vignette-based surveys: a tool for assessing physician practice variation. Am J Med Qual. 2005;20(3):151–7. doi: 10.1177/1062860605274520.
51. Peabody JW, Luck J, Glassman P, Dresselhaus TR, Lee M. Comparison of vignettes, standardized patients, and chart abstraction: a prospective validation study of 3 methods for measuring quality. JAMA. 2000;283(13):1715–22. doi: 10.1001/jama.283.13.1715.
52. Dresselhaus TR, Peabody JW, Lee M, Wang MM, Luck J. Measuring compliance with preventive care guidelines: standardized patients, clinical vignettes, and the medical record. J Gen Intern Med. 2000;15(11):782–8. doi: 10.1046/j.1525-1497.2000.91007.x.
53. Robra BP, Kania H, Kuss O, Schonfisch K, Swart E. Determinants of hospital admission--investigation by case vignettes. Gesundheitswesen. 2006;68(1):32–40. doi: 10.1055/s-2005-858903.
54. Brooks LR, LeBlanc VR, Norman GR. On the difficulty of noticing obvious features in patient appearance. Psychol Sci. 2000;11:112–7. doi: 10.1111/1467-9280.00225.
55. Eva KW, Brooks LR. The underweighting of implicitly generated diagnoses. Acad Med. 2000;75(Supplement):81–83. doi: 10.1097/00001888-200010001-00026.
56. Lutfey KE, Eva KW, Gerstenberger E, Link CL, McKinlay JB. The Cognitive Basis of Diagnostic and Treatment Disparities in Coronary Heart Disease: Results of a Factorial Experiment. Unpublished manuscript. 2008.