Abstract
Objective
To identify styles of physician decision making (as opposed to singular clinical actions) and to analyze their association with variations in the management of a vignette presentation of coronary heart disease (CHD).
Data Source
Primary data were collected from primary care physicians in North and South Carolina.
Study Design
In a balanced factorial experimental design, primary care physicians viewed one of 16 (2⁴) video vignette presentations of CHD and provided detailed information about how they would manage the case.
Data Collection Method
256 MD primary care physicians were interviewed face-to-face in North and South Carolina.
Principal Findings
We identify three clusters depicting unique styles of CHD management that are robust to controls for physician (gender and level of experience) and patient characteristics (age, gender, socioeconomic status, and race) as well as key organizational features of physicians' work settings. Physicians in Cluster 1 “Cardiac” (N = 92) were more likely to focus on cardiac issues compared with their counterparts; physicians in Cluster 2 “Talkers” (N = 93) were more likely to give advice and take additional medical history; whereas physicians in Cluster 3 “Minimalists” (N = 71) were less likely than their counterparts to take action on any of the types of management behavior.
Conclusions
Variations in styles of decision making, which encompass multiple outcome variables and extend beyond individual-level demographic predictors, may add to our understanding of disparities in health quality and outcomes.
Keywords: Medical decision making, medical practice variation, cluster analysis, vignettes, coronary heart disease
Medical practice variation has a long history as a topic of interest for social science and health services researchers, with recent empirical investigations (McKinlay et al. 2006; Bernheim et al. 2008; Franks and Fiscella 2008; Lee et al. 2008; Muroff et al. 2008; Shackelton et al. 2009) and health policy strategies (Institute of Medicine 2001, 2003; Icks et al. 2007) increasingly turning to provider decision making as one potential contributor to observed health disparities. This literature generally seeks to understand well-documented patterns wherein physicians make different diagnostic and treatment decisions based on nonmedical factors, including patient characteristics (such as race, gender, age, socioeconomic status) (Arber et al. 2006), but also providers' individual attributes (gender, level of experience, specialty, place of training) (Shackelton-Piccolo et al. 2011) and the characteristics of the health care settings in which they work, such as practice culture (Kralewski et al. 2005a,b), work stress (Siegrist et al. 2010), presence of health information technology (Ketcham et al. 2009), and country (von dem Knesebeck et al. 2008). In some cases, this variation amounts to differences in quality as measured by health care processes, such as whether a patient receives a specific test (e.g., hemoglobin A1c) or meets guideline criteria (e.g., average glucose or lipid levels). However, to the extent that medical decision making reinforces health disparities that accrue as a function of broader social differences, and reinforces bias in the epidemiologic base rates used to justify medical decisions, decision making has the potential to amplify or exacerbate broader differentials in health outcomes and therefore requires further investigation.
For example, much of this research focuses on variation in diagnostic rather than management decisions, with an overall effect that has been criticized as too narrow to accurately or fully understand medical decision making (Cook 2010). Furthermore, technical health care process outcomes are typically measured by single measures, either the presence of a diagnostic label or the implementation of a specific treatment defined by practice guidelines as appropriate (e.g., hospitalization, revascularization), and these processes are assumed to have predictive value for health outcomes of individuals. Variation in these process outcomes, by extension, is predicted based on demographic characteristics of physicians and patients, while less is known about how modifiable factors such as attitudinal and behavioral differences predict differential diagnosis and management. As a result, the growing literature on the role of clinical decision making in the generation or exacerbation of health disparities, while rich in these aspects, is increasingly fragmented and descriptive. There is a need for research that integrates and synthesizes broader patterns of decision making, not only in terms of understanding what providers do across a range of process outcomes but also what kind of providers engage in which styles of management, and how they think through those decisions.
We build on earlier research using cluster analysis to examine how physicians' styles of management influence differential treatment decisions. We use data from a video vignette factorial experiment wherein a wide range of outcome data were collected, ranging from additional information physicians would seek in a medical history, to tests they would order, to (among others) referrals they would make. Rather than examining how individual patient or provider characteristics would influence specific outcomes (e.g., whether male physicians are more likely to order EKG tests for female patients), we conducted cluster analysis to identify types or styles of management behavior physicians exhibited. Because these outcomes are generated in a balanced factorial experiment, we are able to adjust for patient and physician characteristics, all in the context of a standardized vignette presentation for coronary heart disease (CHD). The cluster analysis therefore accounts for variation that is unexplained in ANOVA analysis where outcomes are treated separately.
Methods
Data were collected from primary care physicians in the United States as part of a larger factorial experiment designed to simultaneously measure the effects of: (1) patient attributes (age, gender, race, and socioeconomic status); (2) physician characteristics (gender and years of clinical experience); and (3) cognitive priming status on medical decision making for an actor “patient” presenting with CHD in a videotaped vignette. The priming manipulation was done to determine whether previously observed gender differences in the diagnosis and management of CHD were the result of physicians not actively considering CHD for women patients, or considering it and then discounting its likelihood. To examine this issue, half of all physicians were primed (i.e., explicitly directed) to consider a CHD diagnosis. For those who were selected at random for priming at the point when the physician was enrolled in the study, the interviewer read a cover story indicating the patient was recently on vacation and sought medical advice for his/her symptoms, at which point she/he was advised about the possibility of CHD and told to see her/his regular primary care provider upon returning home. A full factorial of 2⁴ = 16 combinations of patient age (55 vs. 75), gender, race (black vs. white), and socioeconomic status (SES) (lower vs. higher, depicted by current or former employment as a janitor or school teacher) was used for the video scenarios. One of the 16 combinations was shown to each physician.
To be eligible for selection, physicians had to (1) be internists or family practitioners with M.D. degrees; (2) have graduated from medical school in 1996–2001 or 1960–1987 (to obtain clear separation between higher and lower levels of experience so we could stratify physicians accordingly); and (3) be currently working in primary care in North or South Carolina more than half-time. A letter of introduction was mailed to prospective participants and screening telephone calls were conducted to identify eligible physicians. An appointment was scheduled with each eligible participant for a one-on-one, structured interview, lasting 1 hour. Physicians were recruited into four strata, including two genders and two levels of experience (2² = 4 combinations of physician characteristics). Due to logistical constraints and the challenges in recruiting sufficient numbers of physicians in all cells, we were unable to include physician race as a factor in the study design. The required 256 interviews (16 vignettes × 4 combinations of physician attributes × 2 priming conditions × 2 replications) were conducted over a period of 10 months in 2006–2007 via purposive sampling from state-wide physician lists. Purposive sampling was employed because the imperative was to fill experiment design cells to preserve orthogonality, which was a priority over population generalizability. Generalizability results from the randomization of the vignettes. IRB approval for the study was obtained, and signed informed consents were collected from each participating physician. Each physician subject was provided a stipend ($200) to partially offset lost revenue and to acknowledge their participation.
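The cell structure of the design described above can be checked with a short enumeration. This is only an illustrative sketch: the factor levels come from the text, but the variable names and the level labels for physician experience are assumptions made for the example, and the code does not reproduce the study's actual randomization procedure.

```python
from itertools import product
from collections import Counter

# Patient attributes varied across the video vignettes (levels from the text).
patient_factors = {
    "age": (55, 75),
    "gender": ("female", "male"),
    "race": ("black", "white"),
    "ses": ("lower", "higher"),
}
# Physician strata; the experience labels are illustrative, not the study's.
physician_factors = {
    "gender": ("female", "male"),
    "experience": ("less", "more"),
}
priming = (False, True)
replications = (1, 2)

vignettes = list(product(*patient_factors.values()))    # 2^4 combinations
strata = list(product(*physician_factors.values()))     # 2^2 combinations
cells = list(product(vignettes, strata, priming, replications))

print(len(vignettes), len(strata), len(cells))  # 16 4 256

# Balance check: each patient age appears in exactly half of the interviews.
age_counts = Counter(vignette[0] for vignette, _, _, _ in cells)
print(age_counts[55], age_counts[75])  # 128 128
```

Because every factor level is crossed with every other, each main effect can be estimated independently of the others, which is the orthogonality the sampling strategy was designed to preserve.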
The medical condition (CHD) was selected because (1) it is among the most common and costly problems presented by older patients to primary care providers (Cohen and Krauss 2003); (2) it represents an example of a well-defined organic medical condition; (3) it admits a range of diagnostic, therapeutic, and lifestyle actions; and (4) it is an extensively studied condition in which variations in diagnostic and treatment decisions have been repeatedly demonstrated. The script was developed from several tape-recorded role-playing sessions with experienced clinical advisors. Patients in the vignette presented with signs and symptoms strongly suggestive of CHD, including chest pain worsening with exertion, pain in the back between the shoulder blades, stress, and elevated blood pressure. The vignette scripts, including nonverbal gestures, were identical for each vignette condition. Because live patients do not typically present as clear-cut textbook cases of specific conditions, the vignette also built in several red herring symptoms potentially indicative of a gastrointestinal (GI) diagnosis. To this end, the patient also complained of indigestion, feeling worse after a large or spicy meal, having pain similar to heartburn but unresponsive to antacids, and feeling full and “gassy.” This was done not specifically to make the physicians' diagnostic task more difficult, but to more accurately represent how actual patients present, based on advice from our clinical advisors. The vignette also incorporated references to the patient's mood, including the spouse's report that the patient has been difficult to be around and the patient's self-report of feeling irritated and having decreased energy.
We took several precautionary steps in an attempt to minimize possible threats to external validity, or the possibility that physicians would behave differently in the experiment than in everyday life. First, considerable effort was devoted to ensuring the clinical authenticity of the videotaped presentation. This was achieved by basing the scripts on clinical experience, filming with experienced clinicians present, and using professional actors/actresses. Second, the doctors viewed the vignette in the context of their practice day (not at a professional meeting, a course update, or in their home) so that it was likely they encountered real patients before and after they viewed the patient in the videotape. Third, the doctors were specifically instructed at the outset to view the patient as one of their own patients and to respond as they would typically respond in their own practice. For present purposes, a critical benefit of vignettes is that they allow for the manipulation of several variables at once and for the measurement of unconfounded effects, thereby “isolating physicians' decision making from other factors in the environment” (Veloski et al. 2005).
To assess management decision making, physicians were asked a series of structured interview questions regarding how they would treat the patient in terms of additional information they would request, physical examinations they would perform, tests they would order, medications they would prescribe, lifestyle advice they would provide, and other physicians to whom they would refer. Questions were asked in an open-ended format, and responses, which were generally provided in a list format (e.g., “I would prescribe X, Y, Z medications”), were recorded verbatim by interviewers. These responses were subsequently coded in-house by a senior project manager and the first author, based on a coding scheme developed with clinical consultants and with their input when clarification was required. Because the information provided was somewhat technical, our coding scheme was developed with a combination of inductive and deductive approaches, combining existing templates for test and medication lists, compiled from sources such as insurance and electronic medical records, with an examination of the responses we found in the data to make sure everything fit in a logical category. We further refined the detail levels of coding categories based on the number of responses so that we had sufficient numbers in each group to conduct quantitative analyses (e.g., collapsing medications into classes). The project manager then coded as much as she was able and identified any questions she had; the first author reviewed the codes and resolved outstanding questions where possible; and any remaining queries were directed to physician consultants for final disposition. Any disagreements were discussed in person and resolved when both parties agreed.
In the final portion of the interview, physicians completed a 10-minute self-administered questionnaire about their practice settings and medical education, including questions about practice culture, time constraints, and use of health information technology.
The previously published results of the main effects (Lutfey et al. 2010) focused on certainty of a CHD diagnosis, measured on a scale of 0–100 (where 0 indicated a complete lack of certainty and 100 was absolute certainty). That analysis showed that patient gender and age (but not race) differences in CHD diagnosis persist even after physicians have been primed to consider a CHD diagnosis (that is, they consider CHD as a possibility but still discount the likelihood of CHD for women and younger patients, although these differences do not exist in our data for race). Based on these patterns, the present analysis is focused on cluster analysis using clinical management actions, rather than CHD diagnostic certainty, to explain why we see systematic differences in how these vignette patients are managed. The focus on management and the use of clusters allow for an examination of broader styles of action rather than fragmented reports of how demographic characteristics are associated with single process measures.
Analytic Strategy
A physician's decision-making process is known in these data to be influenced by the type of patient seen and the physician's gender and level of experience (Lutfey et al. 2010). For example, female physicians were almost twice as likely as male physicians to ask questions about cardiac symptoms. Patient and physician factors explained 46–62 percent of the variation in patient management. After accounting for patient and physician factors, approximately 40–50 percent of the variation in patient management remains unexplained. The goal of this study is not to describe previously reported findings showing that patient management depends on the type of patient seen and the characteristics of the examining physician, but rather to use a novel technique to group similar physicians into clusters based on the remaining variation in patient management strategy and to attempt to explain the variation that remains after accounting for patient and physician factors. The clustering analysis was performed using the residuals after accounting for any predicted differences between physicians based on the study's design variables and standardizing these residuals to have a standard deviation of 1 (to account for differences in the scales used between the items). As the effects of the design factors have already been accounted for prior to performing the clustering analysis, any differences between clusters are, by definition, independent of the experimental design variables.
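The residualize-then-standardize step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' SAS code: it assumes a linear main-effects model on ±1-coded design factors, and it relies on the design being balanced and orthogonal so that each least-squares coefficient reduces to a simple average. The function name and the toy data are hypothetical.

```python
from itertools import product
from statistics import mean, stdev

def standardized_residuals(y, design):
    """Residualize y on +/-1 coded design factors, then scale residuals to SD 1.

    Assumes a balanced, orthogonal design (every factor column sums to zero
    and columns are mutually orthogonal), so each least-squares main-effect
    coefficient is simply sum(x * y) / n.
    """
    n = len(y)
    intercept = mean(y)
    n_factors = len(design[0])
    betas = [sum(row[j] * yi for row, yi in zip(design, y)) / n
             for j in range(n_factors)]
    fitted = [intercept + sum(b * x for b, x in zip(betas, row))
              for row in design]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    sd = stdev(resid)  # sample standard deviation of the residuals
    return [r / sd for r in resid]

# Hypothetical toy data: 3 two-level factors, 2 replicates per cell (16 rows).
design = [cell for cell in product((-1, 1), repeat=3) for _ in range(2)]
y = [float((7 * i) % 5) + 2.0 * row[0] for i, row in enumerate(design)]
z = standardized_residuals(y, design)
```

After this step the values in `z` are uncorrelated with every design factor, which is why clusters built on such residuals cannot simply restate the design effects.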
The clusters were defined using the physicians' responses to the items that pertained to their management of the patient, including all clinical actions they would take. As many of these items are correlated and measure unobserved, or latent, phenomena, an alpha factor analysis (Kaiser and Caffrey 1965) (PROC FACTOR, SAS version 9.2) with varimax rotation was used to determine the latent structure of the data. Items that did not load onto a single factor were excluded from further analysis.
To identify an appropriate number of clusters for inclusion, a hierarchical clustering approach (PROC CLUSTER, SAS version 9.2) was used in which each physician begins as his/her own cluster and physicians are grouped into clusters according to the similarity of their responses. In this procedure, the average linkage was used to measure distances between responses. The pseudo F and pseudo t² statistics were used to select the appropriate number of clusters for the analysis. After determining the number of clusters, physicians were assigned to clusters by means of a nonhierarchical, k-means method (PROC FASTCLUS, SAS version 9.2), which minimizes within-cluster distances between individuals relative to between-cluster distances; cluster seeds were determined using the procedure's default method. Subjects were assigned to clusters iteratively, until the solution with the minimum squared deviance was obtained.
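The k-means assignment and the pseudo F criterion can be illustrated with a small standard-library sketch. It is not a reimplementation of PROC CLUSTER or PROC FASTCLUS: the hierarchical average-linkage step is omitted, seeds are passed in explicitly rather than chosen by any SAS default, and the two-dimensional points are hypothetical. The pseudo F statistic is computed here as the ratio of between-cluster to within-cluster mean squares.

```python
import math

def kmeans(points, seeds, max_iter=100):
    """Lloyd's algorithm: assign each point to its nearest center, then
    recompute centers, until assignments stop changing."""
    centers = [list(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def pseudo_f(points, labels, centers):
    """Between-cluster mean square divided by within-cluster mean square."""
    n, k = len(points), len(centers)
    grand = [sum(col) / n for col in zip(*points)]
    total_ss = sum(math.dist(p, grand) ** 2 for p in points)
    within_ss = sum(math.dist(p, centers[lab]) ** 2
                    for p, lab in zip(points, labels))
    between_ss = total_ss - within_ss
    return (between_ss / (k - 1)) / (within_ss / (n - k))

# Hypothetical standardized residuals on two items for six physicians.
points = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (-4.0, 4.0), (-4.2, 4.1)]
labels3, centers3 = kmeans(points, seeds=[points[0], points[2], points[4]])
labels2, centers2 = kmeans(points, seeds=[points[0], points[2]])

print(labels3)  # [0, 0, 1, 1, 2, 2]
print(pseudo_f(points, labels3, centers3) > pseudo_f(points, labels2, centers2))  # True
```

In this toy example the three-cluster solution has the larger pseudo F, which is the kind of comparison used to choose the number of clusters before the final k-means assignment.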
The clusters are described by reporting the mean responses within each cluster to each of the variables used to construct the cluster. p-values comparing the mean residual response for each variable using an analysis of variance are reported, and the clusters are compared using Tukey's method of multiple comparisons. Additional variables are examined using similar methods in an attempt to explain the differences found between the clusters.
Results
To increase validity and further justify the variables we included in the cluster analysis, we first conducted a factor analysis to identify underlying structures in the data related to how the numerous outcome measures we collected were associated with one another. Twenty-two variables were included in the factor analysis (Table 1). Five latent factors were identified, and sixteen variables loaded cleanly onto one of the five factors (defined as an absolute factor loading of 0.3 or greater on a single factor; these loadings are indicated in bold in Table 1). The remaining six variables were cross-loaded and therefore excluded from ongoing analysis because each of these variables measures more than one concept, implying that they are psychometrically weak (the last six variables in Table 1 were excluded from cluster analysis for this reason). The five factors are as follows: (1) Advice, which included items concerning physicians giving advice on diet, exercise, smoking, alcohol, and psychosocial issues such as stress management; (2) Cardiac-focused items, which included physicians asking questions pertaining to the patient's cardiac risk factors, questions about smoking behavior, prescribing a cardiac medication, and ordering a cardiac-related test; (3) History, which included three items: questions about alcohol usage, questions about the patient's gastrointestinal history, and questions about cardiac history; (4) Symptoms, which included physicians asking questions about cardiac symptoms and pain reported in the vignette; and (5) Referrals, which included two items: referral to a cardiologist or to a physician in the “other” category.
Table 1.
Varimax Rotated Factor Loadings for 22 Variables Included in Factor Analysis
| Variable | Factor 1 (“Advice”) | Factor 2 (“Cardiac”) | Factor 3 (“History”) | Factor 4 (“Symptoms”) | Factor 5 (“Referrals”) |
|---|---|---|---|---|---|
| Advice about diet | **0.60** | −0.09 | 0.00 | 0.04 | −0.01 |
| Advice about exercise | **0.51** | 0.14 | 0.01 | −0.01 | 0.10 |
| Advice about smoking | **0.49** | 0.20 | 0.10 | 0.13 | −0.19 |
| Advice about alcohol | **0.47** | 0.05 | 0.20 | 0.18 | −0.30 |
| Advice about psych | **0.37** | −0.11 | 0.11 | −0.04 | 0.09 |
| Questions about cardiac risk factors | −0.06 | **0.59** | 0.00 | −0.22 | −0.11 |
| Prescribed a cardiac medication | 0.05 | **0.55** | −0.22 | 0.19 | 0.21 |
| Questions about smoking | 0.05 | **0.48** | 0.25 | −0.03 | −0.04 |
| Ordered a cardiac test | −0.03 | **0.35** | 0.08 | 0.03 | 0.05 |
| Questions about alcohol | 0.14 | 0.13 | **0.50** | 0.14 | −0.19 |
| Questions about GI history | 0.08 | −0.03 | **0.41** | 0.05 | 0.00 |
| Questions about cardiac history | 0.04 | 0.17 | **0.36** | −0.13 | 0.01 |
| Questions about cardiac symptoms | −0.08 | 0.10 | −0.06 | **0.66** | −0.04 |
| Questions about pain | 0.09 | 0.03 | 0.07 | **0.31** | 0.06 |
| Referral to a cardiologist | 0.01 | 0.15 | −0.05 | −0.10 | **0.56** |
| Referral to another physician | 0.01 | −0.05 | −0.00 | 0.10 | **0.31** |
| Time to next appointment | 0.29 | −0.25 | −0.16 | 0.00 | −0.22 |
| Questions about medications | 0.02 | 0.15 | 0.21 | 0.04 | −0.18 |
| Questions about diet | −0.04 | 0.01 | 0.22 | 0.01 | 0.17 |
| Prescribed a GI medication | 0.14 | −0.08 | −0.09 | 0.24 | −0.15 |
| Ordered a GI test | −0.02 | −0.12 | 0.23 | 0.30 | 0.06 |
| Number of examinations | 0.18 | 0.30 | 0.25 | 0.19 | −0.04 |
| Eigenvalues | 7.15 | 5.39 | 3.50 | 3.04 | 2.93 |
Note. Bold entries indicate an absolute loading of 0.3 or higher on a single factor.
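The retention rule stated for Table 1 (an absolute loading of 0.3 or higher on exactly one factor) can be written as a small helper. This is an illustrative sketch: the function name is hypothetical, and only three rows of Table 1 are used. Because the published loadings are rounded to two decimals, rows with a printed value of exactly 0.30 cannot be classified reliably from the table alone and are left out of the example.

```python
def assigned_factor(loadings, threshold=0.3):
    """Return the 1-based factor number if exactly one absolute loading
    reaches the threshold; otherwise return None (item not retained)."""
    hits = [i for i, value in enumerate(loadings, start=1)
            if abs(value) >= threshold]
    return hits[0] if len(hits) == 1 else None

# Three rows copied from Table 1 (Factors 1-5, in column order).
rows = {
    "Advice about diet": (0.60, -0.09, 0.00, 0.04, -0.01),
    "Questions about cardiac symptoms": (-0.08, 0.10, -0.06, 0.66, -0.04),
    "Time to next appointment": (0.29, -0.25, -0.16, 0.00, -0.22),
}
results = {name: assigned_factor(loadings) for name, loadings in rows.items()}
print(results["Advice about diet"])                 # 1
print(results["Questions about cardiac symptoms"])  # 4
print(results["Time to next appointment"])          # None
```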
Among the 256 physicians, three clusters of management decisions were identified. The prevalence of management decisions in each cluster is presented in Table 2. Radar plots are used to illustrate the simultaneous distribution of decisions within each cluster (Figure 1).
Table 2.
Percent of Physicians Reporting Specific Management Decisions within Each Cluster (Difference from Expectation Based on Experimental Design Factors) for Variables Considered for Inclusion in the Cluster Analysis. The Column Labeled Tukey's Multiple Comparisons Lists Any Significant Differences between the Clusters
| Variable | Cluster 1 “Cardiac” | Cluster 2 “Talkers” | Cluster 3 “Minimalists” | Tukey's Multiple Comparisons |
|---|---|---|---|---|
| N | 92 | 93 | 71 | – |
| Asked questions about | ||||
| Cardiac risk factors | 72% (11%) | 70% (8%) | 27% (−25%) | Cluster 3 less likely than clusters 1 and 2 |
| Cardiac symptoms | 49% (12%) | 34% (−2%) | 21% (−14%) | All three clusters are different from one another |
| Pain | 34% (5%) | 33% (4%) | 17% (−11%) | Cluster 3 less likely than clusters 1 and 2 |
| Smoking | 78% (9%) | 84% (12%) | 32% (−27%) | Cluster 3 less likely than clusters 1 and 2 |
| Alcohol | 24% (−7%) | 51% (17%) | 14% (−13%) | Cluster 2 more likely than clusters 1 and 3 |
| Cardiac history | 35% (−1%) | 53% (16%) | 13% (−20%) | All three clusters are different from one another |
| GI history | 16% (−12%) | 45% (16%) | 14% (−6%) | Cluster 2 more likely than clusters 1 and 3 |
| Ordered testing | ||||
| Cardiac | 100% (1%) | 100% (3%) | 92% (−4%) | Cluster 3 less likely than clusters 1 and 2 |
| Prescribed | ||||
| A cardiac medication | 84% (16%) | 57% (−9%) | 58% (−9%) | Cluster 1 more likely than clusters 2 and 3 |
| Offered advice about | ||||
| Smoking | 29% (−12%) | 62% (17%) | 27% (−6%) | Cluster 2 more likely than clusters 1 and 3 |
| Alcohol | 11% (−13%) | 41% (17%) | 10% (−6%) | Cluster 2 more likely than clusters 1 and 3 |
| Diet | 50% (−18%) | 85% (12%) | 76% (8%) | Cluster 1 less likely than clusters 2 and 3 |
| Exercise | 30% (−10%) | 48% (9%) | 34% (1%) | Cluster 1 less likely than clusters 2 and 3 |
| Psych | 12% (−12%) | 37% (9%) | 24% (4%) | Cluster 1 less likely than clusters 2 and 3 |
| Offered a referral to | ||||
| A cardiologist | 44% (7%) | 29% (−7%) | 36% (0%) | Clusters 1 and 2 are different |
| Some other health care professional | 17% (6%) | 4% (−5%) | 9% (−1%) | Clusters 1 and 2 are different |
Figure 1.

Radar Plots of Management Decisions by Cluster. [A = Advice, Q = Questions asked, M = Medications prescribed, T = Tests ordered, R = Referrals]
Note. The dotted circle denotes the expected response based on experimental design factors. A point inside the circle represents a lower response than predicted from the model, and a point outside the circle represents a higher response than predicted from the model.
Cluster 1 (“Cardiac”) represents 35.9 percent of the sample (N = 92), with physicians in this cluster reporting management decisions more focused on cardiac issues relative to their counterparts in the other two clusters. Physicians in Cluster 1 were more likely to ask for additional information on cardiac risk factors and cardiac symptoms, to prescribe a cardiac medication, and to refer the patient to a specialist. For advice about diet, exercise, and psychosocial issues, as well as prescribing cardiac medications, Cluster 1 is significantly different from Clusters 2 and 3.
Cluster 2 (“Talkers”) represents 36.3 percent of the sample (N = 93), and physicians in this cluster asked significantly more questions and provided more advice than their counterparts in the other two clusters. They asked more questions about smoking, alcohol, medication usage, and both cardiac and GI medical histories. At the same time, they also gave more advice regarding smoking, alcohol, diet, exercise, and psychological issues (such as the role of stress in the patient's life). For advice about smoking and alcohol, and for questions about alcohol and GI history, Cluster 2 is significantly different from Clusters 1 and 3.
Cluster 3 (“Minimalists”) represents 27.7 percent of the sample (N = 71), and these physicians were the least likely to engage in a range of management decisions compared with physicians in the other two clusters. They were less likely to ask for additional information on a range of patient health behaviors, symptoms, and medical histories that were addressed by physicians in the other clusters (cardiac risk factors, cardiac symptoms, pain, smoking, alcohol, and medications), while also being less likely to order a cardiac-related test. For questions about cardiac risk, smoking, and pain, and cardiac test ordering, Cluster 3 is significantly different from Clusters 1 and 2.
In analyses (not presented here), we attempted to explain differences among the clusters by considering additional variables collected during the interview, including, among others: additional physician characteristics (location of medical school training and practice, type, and ownership of practice); practice culture (clinical and administrative autonomy, organizational trust, adoption of new technologies, stress); clinical time needed, time allotted, and additional time physician reported needing for patient visits; and use of a range of information technologies (such as electronic medical records, tracking reports, and access to clinical guidelines). We see no sustained significant differences by these variables in terms of their predictive ability to differentiate clusters. We also considered physicians' diagnostic decisions, and although physicians in the Cardiac cluster reported slightly higher diagnostic certainty for CHD (62, 56, and 53 on a scale of 0–100, p < .05), there are no other significant differences by cluster in how physicians diagnosed the vignette.
Discussion
We used cluster analysis to identify three types of management styles exhibited by primary care physicians for vignette patients presenting with CHD, based on 16 different clinical actions physicians reported they would take to manage the patient presented in the vignette. Based on these behaviors, physicians belonged to one of the following “style” clusters: (1) “Cardiac focused,” who were most likely to ask about cardiac symptoms and pursue cardiac testing and medications; (2) “Talkers,” who were most likely to ask for additional medical history and provide advice; and (3) “Minimalists,” who were the least likely to take action on any of the variables. These clusters were robust to the physician and patient characteristics that were manipulated as part of the balanced factorial experiment study design, as well as a range of additional potential confounders. This approach therefore accounts for variation that is unexplained in earlier models in which study design factors and clinical actions are all considered separately in ANOVA analysis. In terms of disparities in CHD management for this vignette, the Minimalists were least likely to take action to treat the condition directly, followed by Talkers, with Cardiac Focused being most likely to treat CHD. For other conditions, we hypothesize that these same clusters could have differential effectiveness (for example, “Talkers” may be more effective with diabetes patients, wherein education is a priority).
Previous research suggests that physicians may be prone to different approaches to treatment based on their ideologies about their roles in patient care (Lutfey 2005; Welch and McKinlay 2011) or their specialty training in internal medicine versus family practice (Shackelton-Piccolo et al. 2011). These conceptualizations are based on assumptions about pathophysiological or biopsychosocial models versus biophysiological or biomedical approaches to understanding health, with the former implying more holistic attention to the patient while the latter implies a more strictly medical intervention (Welch and McKinlay 2011). By contrast, the clusters that emerge from our analysis reveal three distinct approaches. Two of our clusters (“Talkers” and “Cardiac”) may coincide with biopsychosocial and biomedical models, respectively. However, our third cluster (“Minimalists”) suggests an additional style that is not clearly associated with specialty training, and as shown in Table 2, this group took many fewer clinical actions than would have been predicted by the experimental factors in the study design. Furthermore, two of the three clusters we identify (“Minimalists” and “Talkers”) are not consistent with a Bayesian model of decision making based on epidemiologic base rates and symptom presentation, as the clusters do not disappear when we adjust for patient characteristics, and the essential signs and symptoms of CHD embedded in the vignette should be sufficient to trigger a treatment of the condition. If physicians were making treatment decisions based in part on epidemiologic base rates to predict the likelihood of various outcomes, then we would expect patient characteristics to remain significant.
Finally, the clusters are robust to controls for physicians' reported certainty of a CHD diagnosis, which suggests that, beyond the sources of medical practice variation discussed above, differences in management behavior persist even conditional on a correct diagnosis and have significant implications for management decisions.
In terms of understanding the potential role of clinical decision making in generating or amplifying health disparities, these clusters suggest an additional “upstream” source of variation when compared with individual demographic characteristics as proxies for understanding how physicians might treat patients. Rather than trying to intervene based on patient or provider demographics, or by teaching physicians to conduct certain types of tests or follow condition-specific guidelines, an additional point of intervention may be to adjust physicians' treatment styles more holistically, instead of focusing on specific points of change. For example, physicians in the “minimalists” cluster may be taught to be mindful of atypical CHD presentations for women and to order EKG tests for those patients, but the benefits of that intervention may be limited if those doctors are prone to undertreating patients across the board. Similarly, physicians who spend a great deal of time “talking” with patients may be less time efficient if, despite that time investment, they do not accurately treat the presenting problem. Interventions designed to modify general approaches may be more fruitful for a range of outcomes than interventions intended to change one type of behavior for one outcome. Specialized training for physicians according to their style may be appropriate, such that “minimalists” are coached to explore possible treatment actions more fully, “talkers” are trained in the limits of trying to modify patient health behavior through advice, and “cardiac” physicians are trained not to miss other potential diagnoses because of their focus on CHD. Regardless of the substance of the interventions designed, the target would be based on the physicians' management styles rather than treating all physicians as identical and trying to encourage them to implement the most recent evidence-based clinical practice. These clusters imply that implementation of such clinical guidelines will meet with limited, or at least highly unstandardized, success, despite the larger goal of standardizing quality of care (Griffin et al. 2004).
This study has limitations that should be considered, particularly variables that were unmeasured in this work but may have affected our results and could inform future research. Vignettes have been used successfully in clinical settings for more than 30 years (Williamson 1965), and as described above, we took several steps to minimize threats to external validity. As a result, when physicians in our study were asked how typical the patient viewed on the videotape was compared with patients they encounter in everyday practice, 89.8 percent considered the patient either very typical or reasonably typical. Despite this high level of face validity, important aspects of real-time patient–provider interaction are not captured by a vignette study design. Some aspects of conversational interaction, such as whether physicians are open to patients being active in the decision-making process or whether they would ask about patient preferences before making a treatment decision, are not captured here and may be relevant for understanding the clusters we observe (Kaplan, Greenfield, and Ware 1989; Krupat et al. 2000; Heritage and Maynard 2006). Similarly, we are not able to address racial concordance within our balanced factorial design due to logistical constraints in recruitment. Racial/ethnic concordance is a well-established influence (Cooper et al. 2003, 2006; Lutfey and Ketcham 2005) in patient–provider interaction and may also be a productive avenue for explaining the clusters we identified. Studies that measure physician behavior in real-time patient encounters (Roter et al. 1997; Bensing, Roter, and Hulsman 2003) may dovetail nicely with our results and move the field forward by clarifying how physician thought processes interact with patient–provider communication patterns.
We had limited data on physician attitudinal and personality characteristics that may explain our clusters in ways that our other organizational-level and occupational controls did not. For example, implicit attitudes toward patients, as well as attitudes about uncertainty, threats of malpractice, and worries about job security or burnout, could be modifiable targets for intervention if they are related to management style. Similarly, community-level variations, such as concentration of health care resources, poverty levels, and racial/ethnic diversity of the general population and patient loads, may help explain our results and should be further investigated. Our dataset is small, and the data were collected only in North and South Carolina; the findings are therefore limited in their generalizability to the broader population, despite randomization within the experiment. However, in the only significant result of the 10 physician characteristics for which we adjusted (in addition to the design factors of gender and experience), we observed that physicians in South Carolina were disproportionately likely to fall into the Minimalist cluster (p = .002), lending support to the notion that geographic variation may also be important, though not the full story, for understanding differences in management styles (Newman 2000; Prigerson and Maciejewski 2012).
Despite these limitations, the presence of these clusters highlights a need for further investigation into additional sources of variation in styles of managing clinical cases. The robust presence of these clusters provides new information about how process outcomes group in ways that are not observable in analyses that focus on single outcomes, or when single outcomes are measured serially. The fact that the clusters are not related to patient characteristics suggests that they are caused by something else, such as unobserved provider, organization, or community characteristics. Rather than chasing after “smoking gun” variables that will account for observed differences, we think productive next steps involve continuing to integrate processes and measures, such as batteries of physician attitudinal measures as well as local geographic information. To the extent that physicians select into certain locations and are also socialized to practice in ways that are commensurate with their local environments, there is a need for psychosocial and geographic variables to be considered in concert. If medical practice variation is more a function of providers and their environments than of patients, then ongoing investigation of disparities stemming from practice variation may be well served by focusing in these ways to develop targeted policy and clinical interventions.
Acknowledgments
Joint Acknowledgment/Disclosure Statement: Financial support for this study was provided by a grant from NHLBI (HL079174).
Disclosures: None.
Disclaimers: None.
SUPPORTING INFORMATION
Additional supporting information may be found in the online version of this article:
Appendix SA1: Author Matrix.
References
- Arber S, McKinlay JB, Adams A, Marceau LD, Link CL, O'Donnell AB. “Patient Characteristics and Inequalities in Doctors' Diagnostic and Management Strategies Relating to CHD: A Video-Simulation Experiment”. Social Science and Medicine. 2006;62(1):103–15. doi: 10.1016/j.socscimed.2005.05.028.
- Bensing JM, Roter DL, Hulsman RL. “Communication Patterns of Primary Care Physicians in the United States and the Netherlands”. Journal of General Internal Medicine. 2003;18(5):335–42. doi: 10.1046/j.1525-1497.2003.10735.x.
- Bernheim SM, Ross JS, Krumholz HM, Bradley EH. “Influence of Patients' Socioeconomic Status on Clinical Management Decisions: A Qualitative Study”. The Annals of Family Medicine. 2008;6(1):53–9. doi: 10.1370/afm.749.
- Cohen JW, Krauss NA. “Spending and Service Use among People with the Fifteen Most Costly Medical Conditions”. Health Affairs. 2003;22(2):129–38. doi: 10.1377/hlthaff.22.2.129.
- Cook DA. “Medical Decision Making: What Do We Trust?”. Journal of General Internal Medicine. 2010;25(4):282–3. doi: 10.1007/s11606-010-1293-1.
- Cooper LA, Roter DL, Johnson RL, Ford DE, Steinwachs DM, Powe NR. “Patient-Centered Communication, Ratings of Care, and Concordance of Patient and Physician Race”. The Annals of Internal Medicine. 2003;139(11):907–15. doi: 10.7326/0003-4819-139-11-200312020-00009.
- Cooper LA, Beach MC, Johnson RL, Inui TS. “Delving below the Surface: Understanding How Race and Ethnicity Influence Relationships in Health Care”. Journal of General Internal Medicine. 2006;21(suppl 1):S21–7. doi: 10.1111/j.1525-1497.2006.00305.x.
- Franks P, Fiscella K. “Reducing Disparities Downstream: Prospects and Challenges”. Journal of General Internal Medicine. 2008;23(5):672–7. doi: 10.1007/s11606-008-0509-0.
- Griffin SJ, Kinmonth AL, Veltman MW, Gillard S, Grant J, Stewart M. “Effect on Health-Related Outcomes of Interventions to Alter the Interaction between Patients and Practitioners: A Systematic Review of Trials”. The Annals of Family Medicine. 2004;2(6):595–608. doi: 10.1370/afm.142.
- Heritage J, Maynard DW. “Problems and Prospects in the Study of Physician-Patient Interaction: 30 Years of Research”. The Annual Review of Sociology. 2006;32(15):15.1–15.24.
- Icks A, Haastert B, Rathmann W, Schroder-Bernhardi D, Giani G. “Cost Comparison Analysis: Pentaerythrithyl Tetranitrate (PETN) and Isosorbide Dinitrate (ISDN) Prescribed to Diabetic Patients in Primary Care Practices in Germany”. International Journal of Clinical Pharmacology and Therapeutics. 2007;45(9):516–23. doi: 10.5414/cpp45516.
- Institute of Medicine. Crossing the Quality Chasm: A New Health System for the 21st Century. Washington, DC: The National Academies Press; 2001.
- Institute of Medicine. Unequal Treatment: Confronting Racial and Ethnic Disparities in Healthcare. Washington, DC: The National Academies Press; 2003.
- Kaiser H, Caffrey J. “Alpha Factor Analysis”. Psychometrika. 1965;30(1):1–14. doi: 10.1007/BF02289743.
- Kaplan SH, Greenfield S, Ware JE, Jr. “Assessing the Effects of Physician-Patient Interactions on the Outcomes of Chronic Disease”. Medical Care. 1989;27(3 suppl):S110–27. doi: 10.1097/00005650-198903001-00010.
- Ketcham JD, Lutfey KE, Gerstenberger E, Link CL, McKinlay JB. “Physician Clinical Information Technology and Health Care Disparities”. Medical Care Research and Review. 2009;66(6):658–81. doi: 10.1177/1077558709338485.
- von dem Knesebeck O, Bönte M, Siegrist J, Marceau L, Link C, Arber S, Adams A, McKinlay J. “Country Differences in the Diagnosis and Management of Coronary Heart Disease – A Comparison between the US, the UK and Germany”. BMC Health Services Research. 2008;8:198. doi: 10.1186/1472-6963-8-198.
- Kralewski J, Dowd BE, Kaissi A, Curoe A, Rockwood T. “Measuring the Culture of Medical Group Practices”. Health Care Management Review. 2005a;30(3):184–93. doi: 10.1097/00004010-200507000-00002.
- Kralewski JE, Dowd BE, Heaton A, Kaissi A. “The Influence of the Structure and Culture of Medical Group Practices on Prescription Drug Errors”. Medical Care. 2005b;43(8):817–25. doi: 10.1097/01.mlr.0000170419.70346.b8.
- Krupat E, Rosenkranz SL, Yeager CM, Barnard K, Putnam SM, Inui TS. “The Practice Orientations of Physicians and Patients: The Effect of Doctor-Patient Congruence on Satisfaction”. Patient Education and Counseling. 2000;39(1):49–59. doi: 10.1016/s0738-3991(99)00090-7.
- Lee SJ, Joffe S, Artz AS, Champlin RE, Davies SM, Jagasia M, Kernan NA, Loberiza FR, Jr, Soiffer RJ, Eapen M. “Individual Physician Practice Variation in Hematopoietic Cell Transplantation”. Journal of Clinical Oncology. 2008;26(13):2162–70. doi: 10.1200/JCO.2007.15.0169.
- Lutfey K. “On Practices of ‘Good Doctoring’: Reconsidering the Relationship between Provider Roles and Patient Adherence”. Sociology of Health and Illness. 2005;27(4):421–47. doi: 10.1111/j.1467-9566.2005.00450.x.
- Lutfey KE, Ketcham JD. “Patient and Provider Assessments of Adherence and the Sources of Disparities: Evidence from Diabetes Care”. Health Services Research. 2005;40(6 pt 1):1803–17. doi: 10.1111/j.1475-6773.2005.00433.x.
- Lutfey KE, Eva KW, Gerstenberger E, Link CL, McKinlay JB. “Physician Cognitive Processing as a Source of Diagnostic and Treatment Disparities in Coronary Heart Disease: Results of a Factorial Priming Experiment”. Journal of Health and Social Behavior. 2010;51(1):16–29. doi: 10.1177/0022146509361193.
- McKinlay J, Link C, Marceau L, O'Donnell A, Arber S, Adams A, Lutfey K. “How Do Doctors in Different Countries Manage the Same Patient? Results of a Factorial Experiment”. Health Services Research. 2006;41(6):2182–200. doi: 10.1111/j.1475-6773.2006.00595.x.
- Muroff J, Edelsohn GA, Joe S, Ford BC. “The Role of Race in Diagnostic and Disposition Decision Making in a Pediatric Psychiatric Emergency Service”. General Hospital Psychiatry. 2008;30(3):269–76. doi: 10.1016/j.genhosppsych.2008.01.003.
- Newman L. “New Dartmouth Atlas: Improving US Cardiac Care?”. Lancet. 2000;356(9230):660. doi: 10.1016/S0140-6736(05)73809-5.
- Prigerson HG, Maciejewski PK. “Dartmouth Atlas: Putting End-Of-Life Care on the Map but Missing Psychosocial Detail”. The Journal of Supportive Oncology. 2012;10(1):25–8. doi: 10.1016/j.suponc.2011.06.001.
- Roter DL, Stewart M, Putnam SM, Lipkin M, Jr, Stiles W, Inui TS. “Communication Patterns of Primary Care Physicians”. Journal of the American Medical Association. 1997;277(4):350–6.
- Shackelton R, Marceau L, McKinlay JB. Variations in the Diagnosis and Management of the Same Patient with Heart Disease in Primary Care: Do Internists and Family Practitioners Use Different Explanatory Models? Watertown, MA: New England Research Institutes; 2010.
- Shackelton RJ, Marceau LD, Link CL, McKinlay JB. “The Intended and Unintended Consequences of Clinical Guidelines”. The Journal of Evaluation in Clinical Practice. 2009;15(6):1035–42. doi: 10.1111/j.1365-2753.2009.01201.x.
- Shackelton-Piccolo R, McKinlay JB, Marceau LD, Goroll AH, Link CL. “Differences between Internists and Family Practitioners in the Diagnosis and Management of the Same Patient with Coronary Heart Disease”. Medical Care Research and Review. 2011;68(6):650–66. doi: 10.1177/1077558711409047.
- Siegrist J, Shackelton R, Link C, Marceau L, von dem Knesebeck O, McKinlay J. “Work Stress of Primary Care Physicians in the US, UK and German Health Care Systems”. Social Science and Medicine. 2010;71(2):298–304. doi: 10.1016/j.socscimed.2010.03.043.
- Veloski J, Tai S, Evans AS, Nash DB. “Clinical Vignette-Based Surveys: A Tool for Assessing Physician Practice Variation”. American Journal of Medical Quality. 2005;20(3):151–7. doi: 10.1177/1062860605274520.
- Welch L, McKinlay JB. Physicians Treatment Ideologies and Organizational Constraints. Philadelphia, PA: Eastern Sociological Society Annual Meeting; 2011.
- Williamson JW. “Assessing Clinical Judgment”. Journal of Medical Education. 1965;40:180–7.
