Abstract
Rationale.
Historic and present-day racism and inequity in the United States (U.S.) have resulted in diminished trust in health care among many populations. A key barrier to improving trust in health care is a dearth of well-validated measures appropriate for diverse populations. Indeed, systematic reviews indicate a need to develop and test updated trust measures that are multidimensional and inclusive of relevant domains (e.g., fairness).
Objective.
We developed three trust measures: the Trust in My Doctor (T-MD), Trust in Doctors in General (T-DiG), and Trust in the Health Care Team (T-HCT) scales.
Methods.
After developing an initial item pool, expert reviewers (n=6) provided feedback on the face validity of each scale. We conducted cognitive interviews (n=21) with a convenience sample of adults to ensure items were interpreted as intended. In 2020, we administered an online survey to a convenience sample of U.S. adults recruited through the Qualtrics Panel (n=801) to assess scale reliability and validity.
Results.
Exploratory and confirmatory factor analyses indicated acceptable model fit for second order latent factor models for each scale (root mean square error of approximation: <0.07, comparative fit index: ≥0.98, and standardized root mean square residual: ≤0.03). The T-MD contained 25 items and six subscales: communication competency, fidelity, systems trust, confidentiality, fairness, and global trust. The T-DiG and T-HCT each contained 29 items and seven subscales (the same subscales in the T-MD plus an additional subscale related to stigma-based discrimination). Each scale was strongly correlated with existing trust measures and perceived racism in health care and was significantly associated with delayed health care seeking and receipt of a routine health exam.
Conclusions.
The multidimensional T-MD, T-DiG, and T-HCT scales have sound psychometric properties and may be useful for researchers evaluating trust-related interventions or conducting studies where trust is an important construct or main outcome.
Keywords: Trust, mistrust, health care, doctors, scale development, scale validation, health equity, United States
Introduction
The coronavirus disease 2019 (COVID-19) pandemic has highlighted the importance of patient trust in and trustworthiness of the U.S. health care system as health officials aim to prevent infections (e.g., by increasing public trust in vaccination) and provide medical care to infected individuals (Daly et al., 2021; Thompson et al., 2021). Trust has been defined in multiple ways, but one useful description is “the optimistic acceptance of a vulnerable situation in which the trustor believes the trustee will care for their interests,” (Hall et al., 2002a). In the medical care context, a patient is in a particularly vulnerable position, making trust a vital aspect of clinical care. Related to trust is the separate construct of medical mistrust. Various definitions of medical mistrust also exist, but Griffith and colleagues (2021) recently clarified medical mistrust as “a general sense of unease or suspicion toward someone or something that is predicated… on the notion that the provider or healthcare entity may not act in the patient’s best interest and… may actively work against the patient.” Importantly, medical mistrust is a rational response to ongoing and historical racism and inequity (Griffith et al., 2021). Unlike trust, conceptualizations of medical mistrust may not necessarily name what or who is mistrusted. Accordingly, the trust-related literature often focuses on specific health provider or system attributes, such as competence, whereas the medical mistrust literature often focuses on the larger systems and structures that create suspicion and health inequities (e.g., racism). In this study, we focus on trust but aim to reduce silos between the trust and mistrust literature and push the trust field to consider larger factors like racism that affect trust in specific entities, such as a patient’s primary care doctor, doctors in general, or the larger health care system (Bazargan et al., 2021; Feagin & Bennefield, 2014).
Evidence suggests that the U.S. performs poorly compared to other countries in public trust in physicians, ranking 24th out of 29 high-income countries (Blendon et al., 2014). Additionally, numerous health outcomes are associated with lower trust in and mistrust of health care entities, such as delays in health care seeking, lapses in cancer screening, and reduced COVID-19 vaccine uptake (Adams et al., 2017; Benkert et al., 2019; Birkhauer et al., 2017; Thompson et al., 2021; Yang et al., 2011). Furthermore, trust has been linked to health disparities as racial and ethnic minority individuals report lower trust in doctors and the larger health care system than white individuals (Armstrong et al., 2008; Armstrong et al., 2013; Halbert et al., 2006).
Given the importance of trust in health care entities, public health researchers and practitioners have long been interested in developing interventions that foster patient trust. Yet, a Cochrane review of randomized controlled trials on this topic found that there is insufficient evidence to suggest any specific intervention improves trust in doctors because the trials produced inconsistent and conflicting results (Rolfe et al., 2014). The authors of this review highlighted measurement issues as a contributing factor for the conflicting results and lack of evidence about the effectiveness of trust-related interventions. Recent systematic reviews of measures used to assess trust in and mistrust of health care entities echo concerns about the qualitative and quantitative validity of commonly used measures (Müller et al., 2014; Ozawa & Sripad, 2013; Williamson & Bigman, 2018). Prior studies have not consistently reported scale psychometrics or employed qualitative cognitive interview methods to understand how individuals perceive and respond to scale items (Hillen et al., 2011; Müller et al., 2014; Ozawa & Sripad, 2013; Williamson & Bigman, 2018). Additionally, many existing measures were developed over 20 years ago in a different era of medical care, potentially limiting current-day applicability (Ozawa & Sripad, 2013).
A separate concern is that existing measures may not sufficiently capture all relevant dimensions of trust. Based on a systematic review of measures used to assess trust in a health system, the Health Systems Trust Content Area Framework was developed to conceptualize trust (Ozawa & Sripad, 2013). This framework identified eight key content areas: competence, fidelity, honesty, confidentiality, confidence, communication, systems trust, and fairness (defined in Table 1). The review concluded that some key constructs, such as fairness, are rarely included in trust measures. For example, the widely used Trust in Physician Scale and Wake Forest Physician Trust Scale do not include items related to fairness (Hall et al., 2002b; Thom et al., 1999). Excluding the fairness content area from trust scales may be particularly problematic as trust may be broken, in part, due to unfair and unjust treatment of populations (such as in the U.S. Public Health Service Study of Syphilis at Tuskegee) or due to racism in the health care system (Alsan & Wanamaker, 2016; Armstrong et al., 2013; Gamble, 1997; Hammond, 2010; Scharff et al., 2010). Indeed, centuries of racism and exploitation within the U.S. health care system make fairness particularly important to assess (Feagin & Bennefield, 2014). Similarly, recent literature suggests patient-provider communication is a critical component of trust, but communication is rarely a focus of existing measures (Greene & Ramos, 2021). To our knowledge, no existing measure of trust has been developed deductively by applying the Health Systems Trust Content Area Framework to ensure that key constructs, such as fairness and communication, are represented in the measure. Furthermore, most existing scales measure trust as a unidimensional construct, but prior research suggests that trust is complex and comprises multiple dimensions (Greene & Ramos, 2021; Ozawa & Sripad, 2013). Therefore, existing measures do not offer researchers the opportunity to assess specific dimensions of trust that may be particularly related to their research question (e.g., fairness in a study related to health equity).
Table 1.
Domains in the Health Systems Trust Content Area Framework and Factor Names/Definitions for the Trust in My Doctor (T-MD), Trust in Doctors in General (T-DiG), and Trust in the Health Care Team (T-HCT) Scales
Domain | Definition |
---|---|
Domains in the Health Systems Trust Content Area Framework | |
Competence | Perceived ability to provide health-related services including qualifications and reputation. |
Fidelity | Duty of health sector workers to help patients beyond self-interest. |
Honesty | Level of integrity and openness in a relationship. |
Systems Trust | Belief in institutions, processes, and policies of the health system. |
Communication | Quality and nature of information exchange between all involved parties. |
Confidence | Belief in the reliability of involved parties. |
Confidentiality | Maintaining privacy of patient information. |
Fairness | Perceived treatment of disadvantaged and vulnerable groups. |
Factor Names and Definitions for the T-MD, T-DiG, and T-HCT Scales | |
Communication Competency | Quality and reliability of information exchanged between involved parties. Note: This factor combines items from the competence, communication, and confidence domains of the Health Systems Trust Content Area Framework. |
Fidelity | Duty of health sector workers to help patients beyond self-interest and facilitate open and honest relationships. Note: This factor combines items from the fidelity and honesty domains of the Health Systems Trust Content Area Framework. |
Systems Trust | Belief in institutions, processes, and policies of the health system. |
Confidentiality | Maintaining privacy of patient information. |
Fairness | Fair treatment of all patients, including patients who are minoritized and experience disadvantage. |
Stigma-Based Discrimination (Factor is present only in the T-DiG and T-HCT scales) | Discriminatory treatment of patients diagnosed with health conditions that carry stigma or who engage in behaviors that carry stigma (e.g., HIV and drug use). Note: This factor is only included in the T-DiG and T-HCT scales. Items related to stigma-based discrimination were removed from the T-MD scale after cognitive interview participants found them hard to answer when thinking of their primary care doctor. |
Global Trust | Overall trust in and trustworthiness of doctors and health care staff. Note: This domain was added based on feedback from experts and was not part of the original Health Systems Trust Content Area Framework. |
Note: The domains in the top half of the table are the original domains presented in the Health Systems Trust Content Area Framework. The domains in the bottom half of the table represent factor names and definitions for the T-MD, T-DiG, and T-HCT Scales that were derived from the exploratory and confirmatory factor analyses.
Assessing trust also requires measures that are meaningful across populations. Researchers often compare average trust scores between different racial and ethnic groups, which assumes the scale measures the same construct in the same way across groups (i.e., measurement invariance) (Putnick & Bornstein, 2016). This assumption may not always be appropriate as different populations have varying contextual and cultural experiences with the health care system. Yet, the development and validation of trust measures rarely include an assessment of measurement invariance. Given these concerns, researchers have called for studies to improve the measurement of trust in health care (Müller et al., 2014; Ozawa & Sripad, 2013). We aimed to address these gaps in the field by developing updated, multidimensional measures of trust in three distinct entities—trust in my doctor (T-MD), trust in doctors in general (T-DiG), and trust in the health care team (T-HCT).
Methods
Below, we describe the methods for developing and validating the T-MD, T-DiG, and T-HCT scales using a series of cognitive interviews and an online survey of U.S. adults. All study participants provided informed consent, and the University of North Carolina at Chapel Hill Institutional Review Board deemed this study exempt.
Scale Development
Informed by DeVellis’ scale development guidelines (2016), a comprehensive review of the trust literature, and previously developed trust measures, we created an initial item pool for each scale. We used the Health Systems Trust Content Area Framework as the conceptual foundation to guide the scale development process and inform each scale’s initial item pool (i.e., a deductive approach) (Hinkin, 1998; Ozawa & Sripad, 2013). More specifically, we drafted new items and adapted items from previously developed measures to assess each of the Framework’s eight content areas, which are defined in Table 1. All items were declarative statements and used a five-point Likert scale ranging from strongly disagree to strongly agree. We also drafted instructions for each scale to clarify the object of trust, such as the respondent’s primary care doctor for the T-MD or people/team members who work in the larger health care system for the T-HCT. In the T-MD, the instructions ask respondents without a primary care doctor to think about the doctor they last saw. We used the Flesch-Kincaid Grade Level Readability Test to calculate the average U.S. grade level required for comprehension of each scale (Ley & Florio, 1996).
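For readers unfamiliar with the readability test named above, the following is a minimal sketch of the standard Flesch-Kincaid Grade Level formula. The function name and the word, sentence, and syllable counts are hypothetical and chosen only for illustration; they are not values from this study, and the sketch does not reproduce whatever software was used to score the scales.

```python
def flesch_kincaid_grade(total_words: int, total_sentences: int, total_syllables: int) -> float:
    """Return the Flesch-Kincaid Grade Level for a passage.

    Standard formula:
        0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Hypothetical counts for a short block of scale items (not study data).
grade = flesch_kincaid_grade(total_words=100, total_sentences=10, total_syllables=150)
print(round(grade, 2))
```

The counts are passed in directly because syllable counting is itself heuristic; any tokenizer that supplies the three counts could feed this function.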
Six experts (four of whom were external to the authorship team) reviewed the initial item pools of all three scales to assess clarity, conciseness, relevance, and face validity. Experts also suggested new items, as needed. Two experts were selected as reviewers because they led the development of existing trust-related measures. Two additional experts were selected because they had strong publication records in the field. The final two experts were selected because they had expertise in psychometrics and scale development. We revised the draft item pools to incorporate expert feedback.
Next, we conducted a series of 21 cognitive interviews to understand how individuals responded to the scale instructions and draft items, with the goal of ensuring the scales were interpreted as intended. Concordant with common practice, each participant reviewed one of the three scales, yielding a total of seven interviews per scale (Willis & Artino, 2013). To mirror eligibility criteria for the online survey (see below), participants were eligible for cognitive interviews if they were 18 years of age or older, identified as Black or white, and were comfortable reading and speaking English. Participants were recruited from the Research Triangle Park area of North Carolina through flyers posted in the local community and in online spaces (e.g., public libraries and Facebook). We did not post recruitment materials in doctors’ offices or other places with explicit connections to the health care system to help recruit participants with diverse perspectives instead of only those currently engaged with (and perhaps most trusting of) the health care system. In recruitment materials, participants were invited to share their opinions to help researchers design a better way to gather public opinions about doctors and the health care system. Interviews were conducted in person in December 2019.
During the interview, participants were asked to read each survey item aloud and describe their thoughts while responding (a think-aloud approach) (Willis & Artino, 2013). The interviewer followed up with probes as necessary to elicit context about participant responses and specific items that caused confusion. When participants found an item confusing, the interviewer asked for suggestions about how the item could be revised or if the item should be removed. Each interview lasted approximately 60–90 minutes and was audio-recorded and professionally transcribed. Field notes and interview transcripts were reviewed to summarize participant responses and recommendations to revise, clarify, or drop items. We dropped items that multiple participants found repetitive of other well-worded items in the scale. We also incorporated suggestions to revise or drop items that multiple participants found confusing.
Scale Validation
During March and April 2020, we recruited a nationwide convenience sample of U.S. adults to complete an online questionnaire assessing the psychometric properties of the T-MD, T-DiG, and T-HCT. We recruited participants using the Qualtrics Panel—a demographically diverse platform used by universities and researchers throughout the world and shown to be more representative of the U.S. population than other panels (Boas et al., 2018). Participants were eligible if they were 18 years of age or older, comfortable reading and conversing in English, and resided in the U.S. We used a quota-based recruitment approach to obtain a diverse sample that approximately represented the U.S. population in terms of gender, age, education level, and diagnosis of a chronic disease. For purposes of later measurement invariance testing, participants had to identify as either Black or white, and we used quota-based recruitment to achieve a balanced proportion of Black and white participants in the sample.
Participants completed an online questionnaire that included the item pools for the T-MD, T-DiG, and T-HCT scales. We formatted the questionnaire so that participants could complete it on a mobile device or computer. To address potential order effects, each scale and the items within each scale were presented in a random order. The questionnaire itself included items designed to assess construct and predictive validity. For construct validity, we hypothesized that the T-MD, T-DiG, and T-HCT would be strongly correlated with previously developed and validated measures of trust or mistrust in the same entities. We measured trust in one’s individual physician using the Trust in Physician Scale (Cronbach’s α = 0.92) and trust in doctors in general using the General Physician Trust Scale (Cronbach’s α = 0.93) (Hall et al., 2002a; Thom et al., 1999). We used the Medical Mistrust Index (Cronbach’s α = 0.89) to assess mistrust of health care teams/organizations (LaVeist et al., 2009). For the T-MD and T-DiG, we hypothesized that these scales would be positively correlated with the aforementioned measures of trust in the same entities. We hypothesized that the T-HCT would be negatively correlated with the Medical Mistrust Index. We also assessed standard demographics, such as age, income, and health insurance status.
Prior studies suggest that experiences of racism within the health care context may result in lower trust in health care providers and systems (Bazargan et al., 2021; Hammond, 2010). We therefore hypothesized that the T-MD, T-DiG, and T-HCT would negatively correlate with perceived racism in health care. Perceived racism in health care was measured using the Racism in Health Care Index (Cronbach’s α = 0.88) (LaVeist et al., 2000).
An important consideration when conceptualizing trust in health care providers and the U.S. health care system is whether the underlying cognitive processes can be largely explained by individuals’ more general perceptions of trust in other people or the government. Given longstanding and ongoing racism and exploitation in the U.S. (Bailey et al., 2017; Feagin & Bennefield, 2014), it is conceivable that lower trust in health care providers or teams stems from a general lack of trust in the structures that purportedly protect people in the U.S. (i.e., the government and goodwill of other people). We contend that, although these experiences are likely related to trust in health care providers or teams, they do not fully account for it. Hence, we hypothesized that the T-MD, T-DiG, and T-HCT would be modestly correlated with trust in others and trust in the federal government (an assessment of discriminant validity). We assessed trust in others by asking participants if they would say that most people can be trusted or that you can’t be too careful in dealing with people, an item from the General Social Survey. We also asked participants how much trust they have in the federal government. Responses were measured on a 5-point scale ranging from “none at all” to “a great deal.”
Based on prior evidence, we also hypothesized that the T-MD, T-DiG, and T-HCT scales would be negatively associated with delayed health care seeking and positively associated with receipt of a routine health exam in the past 12 months (LaVeist et al., 2009; Musa et al., 2009; Powell et al., 2019). We measured delayed health care seeking by asking participants if they postponed or delayed seeking care they felt they needed in the past 12 months. We also asked participants if they received a routine health exam in the past 12 months. Both of these items were adapted from the Behavioral Risk Factor Surveillance System survey (Centers for Disease Control and Prevention, 2018).
Data Analysis
We used Stata version 15 to compute univariate statistics and to randomly split the dataset into two halves (one derivation sample and one holdout sample) to facilitate exploratory and confirmatory factor analyses. Negatively worded items in each scale were reverse coded so that higher scores indicated greater trust. Using the derivation sample, we conducted an exploratory factor analysis in Mplus version 8 and applied the oblique Promax rotation (Goretzko et al., 2019). We used the mean- and variance-adjusted weighted least squares estimator (WLSMV) because it accommodates the ordinal nature of the scale items. We used the Health Systems Trust Content Area Framework to guide decisions about extracting factors and to ensure the factor structure was theoretically coherent (Ozawa & Sripad, 2013). We also identified factor solutions to test by examining the scree plot of eigenvalues (Goretzko et al., 2019). We removed items that performed poorly in the exploratory factor analysis (i.e., items with loadings below 0.30 on every factor or items that loaded onto factors that were not theoretically meaningful).
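The data preparation steps described above can be sketched in a few lines. This is an illustrative sketch only: the analyses were run in Stata and Mplus, and the function names, the seed, and the example loadings below are hypothetical. It shows reverse coding on a five-point scale, a random split into derivation and holdout halves, and flagging of items whose loadings fall below 0.30 on every factor.

```python
import random


def reverse_code(score: int, low: int = 1, high: int = 5) -> int:
    """Reverse a Likert response so that higher scores indicate greater trust."""
    return low + high - score


def split_halves(participant_ids, seed: int = 0):
    """Randomly split participants into derivation and holdout samples."""
    shuffled = list(participant_ids)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility (assumption)
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]


def weak_items(loadings: dict, cutoff: float = 0.30) -> list:
    """Return items whose absolute loading is below the cutoff on every factor."""
    return [
        item
        for item, row in loadings.items()
        if max(abs(value) for value in row) < cutoff
    ]


# Hypothetical loadings for three items on two factors (not study results).
example_loadings = {
    "item_a": [0.72, 0.05],
    "item_b": [0.12, 0.21],
    "item_c": [0.08, 0.64],
}

derivation, holdout = split_halves(range(801))
print(len(derivation), len(holdout))
print(weak_items(example_loadings))
```

Theoretical coherence, the second removal criterion, is a judgment call and is deliberately not encoded here.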
We used the separate holdout sample to conduct confirmatory factor analysis in Mplus and verify the factor structure identified from the exploratory factor analysis. We assessed model fit using the scaled chi-square test (p>0.05), comparative fit index (CFI; ≥0.95), root-mean-square error of approximation (RMSEA; ≤0.06), and standardized root mean square residual (SRMR; ≤0.08) per criteria recommended by Hu and Bentler (1999). We first assessed model fit for a first order factor structure of each scale. We then assessed fit for a second order model of the T-MD, T-DiG, and T-HCT because we designed each scale with the notion that the distinct but related domains from the Health Systems Trust Content Area Framework (e.g., confidentiality, fidelity, and fairness) could be accounted for by the second/higher level constructs of trust in my doctor, trust in doctors in general, and trust in the health care team, respectively. Second order confirmatory factor analysis is a theory driven approach to develop a more parsimonious account of the correlations among first order factors (Brown, 2015). After assessing model fit using the holdout sample, we assessed model fit using the full sample. Consistent with prior recommendations and studies, we used the full sample for subsequent validation analyses (Ehrhart et al., 2014; Kyriazos, 2018; McClair et al., 2021). We assessed internal consistency reliability for the second order scale models using an approach described by Raykov and colleagues (2018) to account for each scale’s multidimensional structure. We assessed internal consistency for each unidimensional subscale using Cronbach’s alpha.
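Two of the computations mentioned above are simple enough to sketch directly: checking fit indices against the cutoffs listed in the text, and computing Cronbach's alpha for a unidimensional subscale. The sketch below is illustrative only; the function names and example data are hypothetical, and the reliability approach used for the second order models (Raykov and colleagues, 2018) is not shown.

```python
from statistics import variance


def check_fit(cfi: float, rmsea: float, srmr: float, chi2_p=None) -> dict:
    """Compare fit indices with the cutoffs stated in the text."""
    checks = {
        "cfi": cfi >= 0.95,
        "rmsea": rmsea <= 0.06,
        "srmr": srmr <= 0.08,
    }
    if chi2_p is not None:
        checks["chi_square"] = chi2_p > 0.05
    return checks


def cronbach_alpha(item_scores: list) -> float:
    """Cronbach's alpha from a list of items, each a list of respondent scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("alpha requires at least two items")
    totals = [sum(responses) for responses in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical two-item subscale answered by four respondents (not study data).
subscale = [[1, 2, 3, 4], [2, 1, 4, 3]]
print(round(cronbach_alpha(subscale), 2))
print(check_fit(cfi=0.98, rmsea=0.05, srmr=0.03))
```

Returning one flag per index, rather than a single pass or fail, leaves the reader free to weigh the indices together, as is common when the chi-square test rejects a model in large samples.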
Next, we conducted multiple groups testing to assess whether the scales measured trust in a consistent fashion across Black and white participants (i.e., measurement invariance). We assessed three levels of measurement invariance for each scale: configural, weak/metric, and strong/scalar. Measurement invariance testing was conducted on the first order factor structure of each scale (i.e., each scale was modeled as correlated factors). We evaluated configural invariance to assess whether the same factor dimensions held in each participant group. We then evaluated weak/metric invariance to assess whether the latent variables had the same metric (i.e., equal factor loadings) across groups. We also evaluated strong/scalar invariance to assess whether the latent variables had equal factor loadings and thresholds across groups. We used a ‘build up’ approach to progressively fit more restrictive models starting with a configural invariance model and ending with a strong invariance model. We used chi-square difference testing and goodness-of-fit indices to examine whether model fit worsened with each successive model. Of note, the chi-square difference test frequently rejects models when violations are minor, especially when the sample size is large (i.e., greater than 300) (Chen, 2007). Therefore, we used other fit indices in addition to the chi-square difference test to assess measurement invariance. We followed guidelines from Chen (2007) that suggest a change of no more than 0.01 in CFI, 0.015 in RMSEA, or 0.03 in SRMR indicates invariance.
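The comparison between successive invariance models can be sketched as follows. This is a minimal illustration under stated assumptions: the cutoffs are those quoted above from Chen (2007), "change" is interpreted as worsening of fit (a drop in CFI, a rise in RMSEA or SRMR), and the fit values in the example are hypothetical rather than results from this study.

```python
# Maximum worsening in each index that is still treated as invariance.
CHEN_CUTOFFS = {"cfi": 0.01, "rmsea": 0.015, "srmr": 0.03}


def fit_worsening(less_restrictive: dict, more_restrictive: dict) -> dict:
    """Return how much each index worsens when constraints are added."""
    return {
        "cfi": less_restrictive["cfi"] - more_restrictive["cfi"],
        "rmsea": more_restrictive["rmsea"] - less_restrictive["rmsea"],
        "srmr": more_restrictive["srmr"] - less_restrictive["srmr"],
    }


def within_cutoffs(less_restrictive: dict, more_restrictive: dict) -> dict:
    """Flag, per index, whether the worsening stays within the cutoff."""
    deltas = fit_worsening(less_restrictive, more_restrictive)
    return {name: deltas[name] <= CHEN_CUTOFFS[name] for name in CHEN_CUTOFFS}


# Hypothetical fit values for a configural and a metric model.
configural = {"cfi": 0.980, "rmsea": 0.050, "srmr": 0.030}
metric = {"cfi": 0.975, "rmsea": 0.055, "srmr": 0.035}
print(within_cutoffs(configural, metric))
```

In a "build up" sequence the same function would be applied twice: configural against metric, then metric against scalar.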
After identifying the measurement model for each scale, we assessed construct and predictive validity. Using the WLSMV estimator in Mplus, we assessed whether each scale correlated with existing trust measures, trust in others, trust in the federal government, and perceived racism in health care to assess construct validity. Finally, we assessed whether each scale significantly predicted delayed health care seeking and receipt of a routine medical exam in the past 12 months after adjusting for covariates to examine predictive validity. Predictive validity analyses adjusted for age, sex, race, education, income, marital status, health insurance status over the past 12 months, and chronic disease diagnosis. We used one model per trust scale to test each construct and predictive validity hypothesis. Data were missing for <1% of study variables. Missing data were accounted for using pairwise deletion.
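To illustrate what pairwise deletion means in practice, the sketch below computes a correlation using only respondents with observed values on both variables. It uses a plain Pearson correlation for simplicity, which is an assumption made for illustration and not the WLSMV-based estimation run in Mplus; the function name and data are hypothetical.

```python
from math import sqrt


def pairwise_pearson(x: list, y: list) -> float:
    """Pearson correlation using only cases observed on both variables."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    ss_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    ss_y = sum((b - mean_y) ** 2 for _, b in pairs)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return covariance / sqrt(ss_x * ss_y)


# Hypothetical scores; None marks a missing response (not study data).
trust_score = [1, 2, 3, None, 5]
other_score = [2, 4, 6, 8, None]
print(pairwise_pearson(trust_score, other_score))
```

Each pair of variables keeps its own set of complete cases, so the effective sample size can differ from one correlation to the next.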
Results
Cognitive Interview Results
The initial item pool contained 45 items for each of the T-MD, T-DiG, and T-HCT scales. After incorporating expert input, each scale’s item pool contained 60 items. In addition to the eight domains in the Health Systems Trust Content Area Framework, we added one additional domain—global trust—to assess overall trust in, and trustworthiness of, doctors and health care staff based on expert recommendations. An example item in the global trust domain is “Doctors are trustworthy.” We also revised the language in the T-HCT after receiving early feedback. Initially, all T-HCT items asked questions about the “health care system” (e.g., “The health care system listens to people’s concerns”). However, we learned that the term “health care system” may be confusing as it often requires applying humanistic traits to an inanimate system (e.g., people listen, not systems). Therefore, we revised the T-HCT language to refer to people who work in health care and updated the scale instructions to define this terminology (shown in supplementary file 1).
We conducted cognitive interviews (n=21) to assess whether the scale instructions and draft items were understandable and assessed the intended psychosocial constructs. The mean age of cognitive interview participants was 44.8 years (SD=14.3). Ten participants (48%) identified as male, 10 (48%) identified as female, and one identified as non-binary. Eleven participants (52%) identified as white, and 10 participants (48%) identified as Black. Nine participants (43%) had a high school education or less. Sixteen participants (76%) had a primary care provider.
Cognitive interview participants found the measures and scale instructions “straightforward” but suggested clarifying or removing certain items that were repetitive. Participants generally reported no problems with the terminology used throughout the scales to identify the object of trust (e.g., “doctor” and “people who work in health care”) and interpreted these terms as described in the scale instructions. Several participants commented that these were “good questions” and expressed appreciation that researchers are examining issues around trust in health care. For example, one participant commented, “some people would be like wow, I’m glad somebody finally asked… and they would have been brutally honest, a lot of them because they’ve never been asked this. They never knew they…had a right to an opinion…”
Based on participant feedback during cognitive interviews, we dropped 16 items from the T-MD scale and 11 items from the T-DiG and T-HCT scales. We removed more items from the T-MD than the other two scales because participants reported difficulty answering some questions when referring to themselves and their specific primary care doctor. For example, we removed items from the T-MD related to stigma-based discrimination (e.g., “My doctor might treat me unfairly if I was diagnosed with HIV”) because participants found them overly hypothetical and therefore hard to answer. In contrast, participants reported no issues when answering similar questions about doctors and patients in general (e.g., “Doctors treat patients diagnosed with HIV unfairly”). We also modified the wording of four items in the T-MD and five items in the T-DiG and T-HCT to improve clarity. For example, we revised the item “Doctors treat patients fairly, regardless of their sexual orientation” to instead read “Doctors treat patients fairly, regardless of their sexual orientation (e.g., straight, gay, lesbian, or bisexual)” because multiple participants asked for clarification about the meaning of “sexual orientation” or considered the term a synonym of gender. After the cognitive interviews, the T-MD scale contained 44 items and the T-DiG and T-HCT scales each contained 49 items.
Online Questionnaire Results
Study sample.
The analytic dataset contained responses from 801 participants who completed the online questionnaire. Participant ages ranged from 18–86 years (M=46 years, SD=17.0). Approximately 50% of participants identified as male, 49% identified as female, and 1% identified as non-binary or another gender. About 51% of the sample identified as white, and 49% identified as Black. Most participants were unmarried (61%), had health insurance continuously for the past 12 months (84%), and had not previously been diagnosed with a chronic disease (57%). Approximately 40% of the sample had a high school education or less, and 35% of the sample had an annual income of less than $30,000. About 74% of the sample had a primary care doctor. Table 2 presents participant characteristics.
Table 2.
Online Questionnaire Participant Characteristics
Characteristic | Black (n=392): n or Mean | Black (n=392): % or SD | White (n=409): n or Mean | White (n=409): % or SD | Total (n=801): n or Mean | Total (n=801): % or SD |
---|---|---|---|---|---|---|
Age | ||||||
Mean, SD | 44.9 | 17.2 | 47.2 | 16.7 | 46.1 | 17.0 |
Gender | ||||||
Male | 102 | 26% | 299 | 73% | 401 | 50% |
Female | 286 | 73% | 106 | 26% | 392 | 49% |
Non-binary or Gender Not Listed | 4 | 1% | 4 | 1% | 8 | 1% |
Education | ||||||
High School or Less | 127 | 32% | 193 | 47% | 320 | 40% |
Some College/Associate’s Degree | 160 | 41% | 95 | 23% | 255 | 32% |
Bachelor’s Degree | 78 | 20% | 78 | 19% | 156 | 19% |
Graduate Degree | 27 | 7% | 43 | 11% | 70 | 9% |
Income | ||||||
$29,999 or Less | 156 | 40% | 123 | 30% | 279 | 35% |
$30,000–59,999 | 122 | 31% | 128 | 31% | 250 | 31% |
$60,000–89,999 | 60 | 15% | 55 | 14% | 115 | 14% |
$90,000 or More | 54 | 14% | 103 | 25% | 157 | 20% |
Marital Status | ||||||
Married | 107 | 27% | 205 | 50% | 312 | 39% |
Unmarried | 285 | 73% | 204 | 50% | 489 | 61% |
Has a Primary Care Provider | ||||||
No | 104 | 27% | 102 | 25% | 206 | 26% |
Yes | 288 | 73% | 307 | 75% | 595 | 74% |
Health Insurance Status | ||||||
Continuously Insured for Past 12 Months | 326 | 83% | 343 | 84% | 669 | 84% |
Had Insurance Gap in Past 12 Months | 66 | 17% | 66 | 16% | 132 | 16% |
Chronic Disease Diagnosis | ||||||
None | 207 | 53% | 252 | 62% | 459 | 57% |
One or More Chronic Diseases | 185 | 47% | 157 | 38% | 342 | 43% |
Delayed Care in Past 12 Months | ||||||
No | 305 | 78% | 316 | 77% | 621 | 78% |
Yes | 87 | 22% | 93 | 23% | 180 | 22% |
Received Check Up in Past 12 Months | ||||||
No | 96 | 24% | 133 | 33% | 229 | 29% |
Yes | 296 | 76% | 276 | 67% | 572 | 71% |
Note: SD = Standard deviation
Exploratory and confirmatory factor analyses.
We conducted exploratory factor analysis using data from the derivation sample (n=400). Given that the Health Systems Trust Content Area Framework contains eight domains and expert reviewers suggested adding one additional domain (global trust), we initially explored a nine-factor solution for each scale. However, this solution resulted in several factors with eigenvalues below 1.0 in each scale, suggesting that fewer factors should be extracted. The nine-factor solution also did not fully correspond with the guiding framework. As such, we examined the scree plots for each scale, and assessed solutions with 4–7 factors for the T-MD and 5–8 factors for the T-DiG and T-HCT.
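The derivation sample (n=400) used here and the holdout sample (n=401) used later for confirmatory analysis partition the 801 respondents. The following is a minimal sketch of how such a partition could be produced, assuming a simple random split; the splitting procedure is not described in this section, and the respondent IDs, function name, and seed are hypothetical.

```python
import random


def split_sample(ids, n_derivation, seed=0):
    """Randomly partition respondent IDs into derivation and holdout sets.

    Assumption: a simple random split. The source does not state how the
    derivation and holdout samples were formed.
    """
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    return shuffled[:n_derivation], shuffled[n_derivation:]


# Hypothetical respondent IDs 1..801 (the real dataset is not reproduced here).
respondent_ids = range(1, 802)
derivation, holdout = split_sample(respondent_ids, n_derivation=400, seed=2020)
print(len(derivation), len(holdout))
```

Fitting the exploratory model on one half and confirming it on the other keeps the confirmatory test from being evaluated on the same responses that suggested the factor structure.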
We identified that a six-factor solution for the T-MD and a seven-factor solution for the T-DiG and T-HCT had clear conceptual meanings and resulted in consistently strong factor loadings. The four- and five-factor solutions for the T-MD were not conceptually meaningful (e.g., the solutions combined items from domains that should be distinct according to the guiding framework, expert reviewers, and cognitive interview participants). Additionally, the RMSEA from the four- and five-factor T-MD models (0.10 and 0.08) suggested less-than-ideal model fit. The six-factor solution for the T-MD resulted in an RMSEA value indicative of good fit (0.06) and identified factors with clear conceptual meanings based on the guiding framework. The seven-factor T-MD solution added a factor that did not make conceptual sense, and the strongest loading on this factor was low (0.24). Accordingly, we moved forward with a six-factor solution for the T-MD. A similar pattern of results emerged when we examined solutions with 5–8 factors for the T-DiG and T-HCT. During this stage, we removed 19 items from the T-MD and 20 items from the T-DiG and T-HCT that had poor factor loadings (<0.30), loaded onto factors that were not theoretically coherent, or were effectively collinear with similar items. The items removed from each of the scales are presented in supplementary file 2.
The resulting T-MD scale contained 25 items and six factors: communication competency, fidelity, systems trust, confidentiality, fairness, and global trust. These factors largely mapped onto the Health Systems Trust Content Area Framework with a few exceptions. First, items from the competence, communication, and confidence domains of the framework formed a factor that could be more generally characterized as communication competency. The grouping of these items was conceptually coherent as all items related to the ability of health care providers to communicate in a competent, high quality, and reliable manner. This grouping of items was also supported by recent literature (Greene & Ramos, 2021) and our cognitive interviews, which found significant overlap between patient-provider communication and provider competence as providers often demonstrate competence while communicating with patients. Second, several items from the fidelity and honesty domains of the framework were combined into a factor named fidelity (i.e., the duty of health sector workers to help patients beyond self-interest). The grouping of these items was conceptually coherent as physician fidelity requires honesty. For example, physicians cannot recommend expensive treatments to make money (an item assessing fidelity) without being dishonest.
The T-DiG and T-HCT scales each contained 29 items and seven factors (the same six factors as the T-MD plus one additional factor, stigma-based discrimination). The additional factor related to the discriminatory and unfair treatment of patients diagnosed with health conditions or who engage in behaviors that carry stigma (e.g., HIV and drug use). Although items in the stigma-based discrimination factor are similar to items in the fairness factor, the grouping of these items into a separate factor made conceptual sense because, separate from fairness more broadly, these items refer to the larger discriminatory treatment that people diagnosed with certain health conditions may experience when seeking care. Table 1 presents an overview of the factor names and definitions for the T-MD, T-DiG, and T-HCT scales and how they compare to the Health Systems Trust Content Area Framework.
Next, we used the holdout sample (n=401) to conduct confirmatory factor analysis and test the factor solutions identified from exploratory factor analyses. Results for the six-factor, first order model for the T-MD scale and the seven-factor first order models for the T-DiG and T-HCT scales suggested acceptable model fit (see Table 3). Standardized factor loadings were strong (range 0.61–0.98) and all statistically significant at p<0.001 in the models for each scale. We also assessed the fit of second order models for each of the three scales using the holdout sample, and results suggested acceptable model fit (Table 3). We moved forward with a second order model for each scale given that 1) the second order models suggested acceptable model fit; 2) there was minimal to no change in RMSEA, CFI, and SRMR when comparing the first and second order models for each scale; and 3) there is theoretical justification that the second order constructs (i.e., T-MD, T-DiG, and T-HCT) have direct effects on the first order domains (e.g., respondent perceptions of physician/health care team fidelity, fairness, and confidentiality). Supplementary Figures S1–S3 visually present the second order factor structure of each scale.
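The second order structure described above can be written compactly as two sets of equations. This is a generic sketch of a second order factor model in assumed notation, not the authors' exact specification: item j loads on one first order domain k, and each domain in turn loads on the single overall trust construct.

```latex
% Assumed notation (not from the source): generic second order factor model.
\begin{align}
  x_{j}    &= \lambda_{jk}\,\eta_{k} + \varepsilon_{j}
           && \text{item } j \text{ measures first order domain } \eta_{k} \\
  \eta_{k} &= \gamma_{k}\,\xi + \zeta_{k}
           && \text{domain } \eta_{k} \text{ is driven by overall trust } \xi
\end{align}
```

Under this reading, the loadings reported for each domain on the overall scale correspond to the standardized \gamma_{k}, and the item-level loadings correspond to the standardized \lambda_{jk}.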
Table 3.
Model Fit for First and Second Order Models of the T-MD, T-DiG, and T-HCT in Confirmatory Factor Analysis Using Holdout Sample (n=401) and Full Sample (n=801)
Model | χ²(df) | RMSEA (90% CI) | CFI | SRMR |
---|---|---|---|---|
T-MD | ||||
First Order (holdout sample) | 798.9(260)* | 0.072 (0.066 – 0.078) | 0.99 | 0.03 |
Second Order (holdout sample) | 750.9(269)* | 0.067 (0.061 – 0.073) | 0.99 | 0.03 |
Second Order (full sample) | 811.3(269)* | 0.050 (0.046 – 0.054) | 0.99 | 0.02 |
T-DiG | ||||
First Order (holdout sample) | 789.2(356)* | 0.055 (0.050 – 0.060) | 0.98 | 0.03 |
Second Order (holdout sample) | 925.6(370)* | 0.061 (0.056 – 0.066) | 0.97 | 0.04 |
Second Order (full sample) | 1362.3(370)* | 0.058 (0.055 – 0.061) | 0.98 | 0.03 |
T-HCT | ||||
First Order (holdout sample) | 976.3(356)* | 0.066 (0.061 – 0.071) | 0.98 | 0.03 |
Second Order (holdout sample) | 1027.5(370)* | 0.067 (0.062 – 0.071) | 0.98 | 0.04 |
Second Order (full sample) | 1613.6(370)* | 0.065 (0.062 – 0.068) | 0.98 | 0.03 |
*Significant at p<0.001
Note: T-MD = Trust in My Doctor; T-DiG = Trust in Doctors in General; T-HCT = Trust in the Health Care Team; RMSEA = root mean square error of approximation; CFI = comparative fit index; SRMR = standardized root mean square residual
We also assessed the fit of each second order model using the full sample (n=801) since we planned to use the full sample for subsequent analyses. Confirmatory factor analysis results of the second order solution from the full sample similarly suggested acceptable model fit for each scale (shown in Table 3). All factor loadings were statistically significant at p<0.001. Internal consistency reliability for each scale and subscale was acceptable (>0.70). Tables 4–6 present the confirmatory factor analysis results for the second order factor solution of each scale using the full sample. The item pools for each scale required an 8th grade reading level or less (7th grade for the T-MD, 8th grade for the T-DiG, and 7th grade for the T-HCT). Supplementary file 1 presents the final item pools and participant instructions for each scale.
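As a worked check on the fit statistics, the RMSEA point estimate can be recovered from the chi-square value, its degrees of freedom, and the sample size. The sketch below uses the common textbook formula, sqrt(max(χ² − df, 0) / (df × (N − 1))). This is an assumption, since the estimator and software used are not stated in this section, and robust estimators may apply scaling corrections. The function name is illustrative.

```python
import math


def rmsea(chi2, df, n):
    """Textbook RMSEA point estimate from chi-square, df, and sample size n.

    Assumption: the unscaled formula; robust estimators may adjust chi2.
    """
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# Full-sample second order model fit reported in Tables 4-6 (n=801).
full_sample_fit = {
    "T-MD": (811.32, 269),
    "T-DiG": (1362.34, 370),
    "T-HCT": (1613.62, 370),
}

for scale, (chi2, df) in full_sample_fit.items():
    print(f"{scale}: RMSEA = {rmsea(chi2, df, n=801):.3f}")
```

With these inputs the formula gives 0.050, 0.058, and 0.065, matching the full-sample RMSEA values reported in Table 3.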
Table 4.
Confirmatory Factor Analysis Results for the Trust in My Doctor (T-MD) Scale Using Full Sample (n=801)ᵃ
Latent Variable | Factor/Item Name and Descriptionᵇ | Factor Loadingᶜ | Standard Error | Scale Reliabilityᵈ |
---|---|---|---|---|
Trust in My Doctor (T-MD) | Communication Competency | 0.97 | 0.005 | 0.90 |
Trust in My Doctor (T-MD) | Fidelity | 0.77 | 0.019 | |
Trust in My Doctor (T-MD) | Systems Trust | 0.72 | 0.019 | |
Trust in My Doctor (T-MD) | Confidentiality | 0.91 | 0.009 | |
Trust in My Doctor (T-MD) | Fairness | 0.90 | 0.009 | |
Trust in My Doctor (T-MD) | Global Trust | 0.93 | 0.007 | |
Communication Competency | My doctor knows how to treat my medical problems | 0.87 | 0.011 | 0.89 |
Communication Competency | My doctor explains the benefits and risks of treatments to me | 0.87 | 0.010 | |
Communication Competency | My doctor listens to me (Bova et al., 2006) | 0.90 | 0.009 | |
Communication Competency | My doctor believes me when I say something is wrong | 0.81 | 0.014 | |
Communication Competency | My doctor follows up with me when needed | 0.83 | 0.012 | |
Fidelityᵉ | My doctor puts making money above my needs (Shea et al., 2008) | 0.81 | 0.016 | 0.83 |
Fidelityᵉ | My doctor recommends expensive treatments to make money | 0.87 | 0.014 | |
Fidelityᵉ | My doctor might experiment on me without my knowledge (Shea et al., 2008) | 0.77 | 0.020 | |
Fidelityᵉ | My doctor rushes through appointments | 0.80 | 0.018 | |
Systems Trust | My doctor would be held accountable if they made a mistake | 0.91 | 0.008 | 0.92 |
Systems Trust | My doctor would be held accountable if they treated me unfairly | 0.95 | 0.007 | |
Systems Trust | My doctor would be held accountable if they discriminated against me | 0.91 | 0.009 | |
Confidentiality | My doctor keeps my medical records private (LaVeist et al., 2009; Rose et al., 2004) | 0.90 | 0.010 | 0.88 |
Confidentiality | My doctor uses secure systems to store medical records | 0.87 | 0.011 | |
Confidentiality | My doctor respects my privacy | 0.93 | 0.010 | |
Fairness | My doctor would treat me fairly, regardless of my ability to pay | 0.79 | 0.015 | 0.95 |
Fairness | My doctor would treat me fairly, regardless of my race or ethnicity (Shea et al., 2008) | 0.93 | 0.006 | |
Fairness | My doctor would treat me fairly, regardless of my gender (e.g., male, female, or nonbinary) | 0.93 | 0.008 | |
Fairness | My doctor would treat me fairly, regardless of my sexual orientation (e.g., straight, gay, lesbian, or bisexual) | 0.90 | 0.009 | |
Fairness | My doctor would treat me fairly, regardless of my weight | 0.92 | 0.008 | |
Fairness | My doctor would treat me fairly, regardless of my religion | 0.92 | 0.008 | |
Fairness | My doctor would treat me fairly, regardless of my education level | 0.94 | 0.006 | |
Global Trust | All things considered, I trust my doctor (Safran et al., 1998) | 0.98 | 0.003 | 0.95 |
Global Trust | I put my trust in my doctor (Carver et al., 1989) | 0.95 | 0.005 | |
Global Trust | My doctor is trustworthy | 0.95 | 0.006 | |
ᵃ Model Fit: χ²(269) = 811.32, p<0.001; RMSEA = 0.050, 90% CI: 0.046 – 0.054; CFI = 0.99; SRMR = 0.02
ᵇ Citations depict items that were adapted from existing scales
ᶜ All factor loadings are standardized and significant at p<0.001
ᵈ Reliability for the second order model of the T-MD scale was calculated using an approach described by Raykov and colleagues (2018); reliability for the unidimensional subscales was calculated using Cronbach’s alpha
ᵉ Items in this domain/factor are reverse coded
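The table notes mention two scoring steps: reverse coding negatively worded items and computing Cronbach's alpha for each subscale. A minimal sketch of both follows. The 1–5 response range and the example responses are hypothetical, since the response format is not given in this section, and the function names are illustrative rather than taken from the authors' analysis code.

```python
from statistics import variance


def reverse_code(score, low=1, high=5):
    """Flip a response so that higher values indicate more trust.

    Assumption: a 1-5 response scale; adjust low/high to the actual format.
    """
    return low + high - score


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item-score lists."""
    k = len(rows[0])  # number of items in the subscale
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses from five people to three negatively worded items.
raw = [[1, 2, 1], [2, 2, 3], [4, 5, 4], [5, 4, 5], [3, 3, 2]]
recoded = [[reverse_code(x) for x in row] for row in raw]
alpha = cronbach_alpha(recoded)
print(f"alpha = {alpha:.2f}")
```

Reverse coding every item in a subscale leaves its alpha unchanged. The step matters for scoring direction and for combining these items with positively worded ones.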
Table 6.
Confirmatory Factor Analysis Results for the Trust in the Health Care Team (T-HCT) Scale Using Full Sample (n=801)ᵃ
Latent Variable | Factor/Item Name and Descriptionᵇ | Factor Loadingᶜ | Standard Error | Scale Reliabilityᵈ |
---|---|---|---|---|
Trust in the Health Care Team (T-HCT) | Communication Competency | 0.95 | 0.007 | 0.89 |
Trust in the Health Care Team (T-HCT) | Fidelity | 0.75 | 0.016 | |
Trust in the Health Care Team (T-HCT) | Systems Trust | 0.70 | 0.019 | |
Trust in the Health Care Team (T-HCT) | Confidentiality | 0.88 | 0.010 | |
Trust in the Health Care Team (T-HCT) | Fairness | 0.89 | 0.008 | |
Trust in the Health Care Team (T-HCT) | Stigma-Based Discrimination | 0.51 | 0.024 | |
Trust in the Health Care Team (T-HCT) | Global Trust | 0.86 | 0.011 | |
Communication Competency | People who work in health care have good judgment (Thom et al., 1999) | 0.81 | 0.015 | 0.87 |
Communication Competency | People who work in health care explain the benefits and risks of treatments to patients | 0.80 | 0.015 | |
Communication Competency | People who work in health care listen to patients (Bova et al., 2006) | 0.91 | 0.008 | |
Communication Competency | People who work in health care believe patients when they say something is wrong | 0.76 | 0.016 | |
Communication Competency | People who work in health care follow up with patients when needed | 0.80 | 0.014 | |
Fidelityᵉ | People who work in health care put making money above patient needs (Shea et al., 2008) | 0.86 | 0.014 | 0.87 |
Fidelityᵉ | People who work in health care recommend expensive treatments to make money | 0.84 | 0.014 | |
Fidelityᵉ | People who work in health care hide mistakes (Rose et al., 2004) | 0.81 | 0.017 | |
Fidelityᵉ | People who work in health care might experiment on patients without their knowledge (Shea et al., 2008) | 0.77 | 0.020 | |
Fidelityᵉ | People who work in health care rush through appointments | 0.76 | 0.020 | |
Systems Trust | People who work in health care are held accountable if they make a mistake | 0.90 | 0.010 | 0.92 |
Systems Trust | People who work in health care are held accountable if they treat patients unfairly | 0.93 | 0.009 | |
Systems Trust | People who work in health care are held accountable if they discriminate against patients | 0.92 | 0.009 | |
Confidentiality | People who work in health care keep medical records private (LaVeist et al., 2009; Rose et al., 2004) | 0.90 | 0.009 | 0.87 |
Confidentiality | People who work in health care use secure systems to store medical records | 0.88 | 0.011 | |
Confidentiality | People who work in health care respect patient privacy | 0.89 | 0.013 | |
Fairness | People who work in health care treat patients fairly, regardless of their ability to pay | 0.79 | 0.015 | 0.93 |
Fairness | People who work in health care treat patients of all races and ethnicities fairly (Shea et al., 2008) | 0.88 | 0.008 | |
Fairness | People who work in health care treat patients fairly, regardless of their gender (e.g., male, female, or nonbinary) | 0.89 | 0.008 | |
Fairness | People who work in health care treat patients fairly, regardless of their sexual orientation (e.g., straight, gay, lesbian, or bisexual) | 0.89 | 0.008 | |
Fairness | People who work in health care treat patients fairly, regardless of their weight | 0.88 | 0.009 | |
Fairness | People who work in health care treat patients fairly, regardless of their religion | 0.88 | 0.009 | |
Fairness | People who work in health care treat patients fairly, regardless of their education level | 0.88 | 0.010 | |
Stigma-Based Discriminationᵉ | People who work in health care treat patients with a history of mental illness unfairly | 0.71 | 0.031 | 0.71 |
Stigma-Based Discriminationᵉ | People who work in health care treat patients diagnosed with HIV unfairly | 0.72 | 0.033 | |
Stigma-Based Discriminationᵉ | People who work in health care treat patients who abuse drugs unfairly | 0.72 | 0.030 | |
Global Trust | All things considered, I trust people who work in health care (Safran et al., 1998) | 0.93 | 0.006 | 0.92 |
Global Trust | I put my trust in people who work in health care (Carver et al., 1989) | 0.92 | 0.008 | |
Global Trust | People who work in health care are trustworthy | 0.95 | 0.006 | |
ᵃ Model Fit: χ²(370) = 1613.62, p<0.001; RMSEA = 0.065, 90% CI: 0.062 – 0.068; CFI = 0.98; SRMR = 0.03
ᵇ Citations depict items that were adapted from existing scales
ᶜ All factor loadings are standardized and significant at p<0.001
ᵈ Reliability for the second order model of the T-HCT scale was calculated using an approach described by Raykov and colleagues (2018); reliability for the unidimensional subscales was calculated using Cronbach’s alpha
ᵉ Items in this domain/factor are reverse coded
Measurement invariance.
We used a model ‘build up’ approach to assess configural, weak/metric, and strong/scalar invariance for each scale across Black and white participants. When comparing the weak/metric invariance model to the configural invariance model for the T-MD, the chi-square difference test was non-significant (p>0.05) and the other fit indices (i.e., RMSEA, CFI, and SRMR) did not significantly worsen. When comparing the strong/scalar invariance model to the weak/metric invariance model for the T-MD, the chi-square difference test was significant (p<0.05), but the other fit indices did not significantly worsen. These results suggested that strong/scalar invariance held for the first order T-MD model across Black and white participants. This pattern was similar when testing measurement invariance for the T-DiG and T-HCT, suggesting that each scale measured trust the same way across Black and white participants. Supplementary Table S1 presents results from measurement invariance testing.
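The three nested models in this ‘build up’ sequence differ only in which parameters are constrained to be equal across groups. The block below is a sketch in assumed notation rather than the authors' specification; if items are modeled as ordinal, thresholds take the place of the intercepts.

```latex
% Assumed notation (not from the source): respondent i in group g, g in {Black, white}.
\begin{align}
  x_{ig} &= \tau_{g} + \Lambda_{g}\,\eta_{ig} + \varepsilon_{ig} \\
  \text{Configural:} &\quad \text{same pattern of zero and nonzero loadings in each } \Lambda_{g} \\
  \text{Weak/metric:} &\quad \Lambda_{\text{Black}} = \Lambda_{\text{white}} \\
  \text{Strong/scalar:} &\quad \Lambda_{\text{Black}} = \Lambda_{\text{white}}, \;\;
                               \tau_{\text{Black}} = \tau_{\text{white}}
\end{align}
```

Each step adds constraints to the previous model, so a chi-square difference test or a change in RMSEA, CFI, and SRMR indicates whether the added equality constraints meaningfully worsen fit.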
Construct validity.
We hypothesized that the T-MD, T-DiG, and T-HCT would be strongly correlated with previously developed and validated measures of trust in or mistrust of the same entities. The T-MD scale was strongly and positively correlated with the Trust in Physicians Scale (r = 0.84, p<0.001). The T-DiG was strongly and positively correlated with the General Physician Trust Scale (r = 0.86, p<0.001). Furthermore, the T-HCT was strongly and negatively correlated with the Medical Mistrust Index (r = −0.62, p<0.001). These results supported our hypotheses. The T-MD, T-DiG, and T-HCT were also strongly and positively correlated with each other as expected given that they include similarly worded items while referring to different objects of trust (T-MD with T-DiG: r = 0.72, p<0.001; T-MD with T-HCT: r = 0.71, p<0.001; T-DiG with T-HCT: r = 0.88, p<0.001).
As hypothesized, the T-MD was modestly correlated with trust in others (r = 0.17, p<0.001) and trust in the federal government (r = 0.20, p<0.001). The T-DiG was also modestly correlated with trust in others (r = 0.27, p<0.001) and trust in the federal government (r = 0.35, p<0.001). Similarly, the T-HCT was modestly correlated with trust in others (r = 0.25, p<0.001) and trust in the federal government (r = 0.32, p<0.001). These results supported our hypothesis that each scale would be modestly correlated with trust in others and trust in the federal government. Furthermore, the T-MD, T-DiG, and T-HCT scales were negatively correlated with perceived racism in health care, as hypothesized (T-MD: r = −0.34, p<0.001; T-DiG: r = −0.54, p<0.001; T-HCT: r = −0.55, p<0.001).
Predictive validity.
To assess predictive validity, we tested the hypotheses that the T-MD, T-DiG, and T-HCT scales would be negatively associated with delayed health care seeking and positively associated with receipt of a routine health exam in the past 12 months after adjusting for covariates. As hypothesized, all three scales were negatively associated with delayed health care seeking (T-MD: β = −0.24, 95% CI −0.33, −0.15; T-DiG: β = −0.24, 95% CI −0.33, −0.15; T-HCT: β = −0.27, 95% CI −0.35, −0.18). Similarly, all three scales were positively associated with receipt of a routine health exam, as hypothesized (T-MD: β = 0.23, 95% CI 0.14, 0.31; T-DiG: β = 0.14, 95% CI 0.05, 0.23; T-HCT: β = 0.11, 95% CI 0.03, 0.20).
Discussion
The development and validation of the T-MD, T-DiG, and T-HCT scales build on the growing body of work that aims to understand and measure trust in health care settings (Müller et al., 2014; Ozawa & Sripad, 2013; Williamson & Bigman, 2018). Our results suggest that each scale demonstrates sound psychometric properties. Based on exploratory and confirmatory factor analyses, we identified factor structures for each scale that aligned with the Health Systems Trust Content Area Framework (Ozawa & Sripad, 2013). We found evidence of construct validity as each scale was strongly correlated with previously developed and validated measures of trust in health care entities. Each scale was also negatively correlated with perceived racism in health care, mirroring previous research and suggesting that trust decreases as people perceive and experience racism in health care (Bazargan et al., 2021; Hammond, 2010). The scales were modestly correlated with trust in others and trust in the federal government, suggesting that the scales distinguish between trust in health care and trust in these other entities (i.e., other people and the government).
We also found evidence of predictive validity as all three scales were significantly associated with delayed health care seeking and receipt of a routine health exam. These results echo the larger literature, suggesting that trust in health care providers predicts health services use (LaVeist et al., 2009). Future research is needed to understand the mechanisms by which trust may be associated with health services use outcomes (e.g., health care seeking delays and receipt of preventive health care).
Given evidence suggesting that trust influences health outcomes, researchers are interested in designing interventions and making structural changes within the health care system to build trust. In a commentary describing strategies to improve trust in health care, Lee and colleagues suggested that trust become a standard metric for evaluating and potentially rewarding health care organizations (Lee et al., 2019). Such important ideas for prioritizing patient trust cannot be implemented effectively without understanding and reliably measuring trust. Yet, measuring trust has historically been challenging, and many researchers employ single-item or brief measures of trust that may not sufficiently capture the complex nature of this construct (Ozawa & Sripad, 2013). The T-MD, T-DiG, and T-HCT scales offer new approaches to measure and assess trust while considering its multidimensional nature.
Using a multidimensional measure of trust and the Health Systems Trust Content Area Framework may help researchers understand why previous trust interventions produced inconsistent and largely null results (Ozawa & Sripad, 2013; Rolfe et al., 2014). Previous interventions have generally targeted only one domain of trust (Rolfe et al., 2014), and it is perhaps unsurprising that such interventions may not improve overall trust scores given the multidimensional nature of trust. For example, an intervention that aims to improve patient-provider communication may improve patient ratings on the communication competency domain of trust, but there is no theoretical reason that improved communication would affect most other domains (e.g., confidentiality and fidelity). Interventionists should carefully consider which domains of trust to prioritize and choose a measure of trust that assesses the appropriate domains.
Accordingly, researchers assessing interventions designed to increase trust in medical providers and teams may find the T-MD, T-DiG, and T-HCT a useful evaluation tool. In studies where trust is not the main outcome of interest, researchers may prefer to use existing scales that are unidimensional and/or contain fewer items. However, as health systems tackle the challenge of earning trust across diverse populations, multidimensional measures—such as the T-MD, T-DiG, and T-HCT—may be particularly relevant. Capturing the complex, multidimensional nature of trust may be especially important when trust is the main intervention outcome. For example, researchers may find that their intervention improves some dimensions of trust but not others. Indeed, an intervention focused on holding health care providers accountable for unfair treatment may increase scores on the fairness and systems trust domains, but not other domains related to communication and confidentiality. The full effect of such interventions may be masked when using a unidimensional measure or one that does not sufficiently capture a wide range of trust dimensions. Furthermore, health systems may find these scales useful to identify patients with lower trust scores who may benefit from outreach efforts that help patients remain engaged in care.
Limitations
Our study has several limitations. Although we recruited a diverse national sample of U.S. participants to assess the psychometric properties of each scale, this was a convenience sample of online participants who identified as Black or white. Future research is needed to assess the psychometric properties of these scales among other populations, such as other racial and ethnic groups and people living in other countries. We also used a cross-sectional design. Future studies would benefit from using longitudinal designs to further assess the psychometric properties of each scale (e.g., test-retest reliability) and evaluate meaningful changes in people’s level of trust across the developmental lifespan. Future studies would also benefit from assessing the predictive validity of these scales for outcomes that do not rely on self-report (e.g., by using electronic medical records to capture health services utilization). Although we recruited participants from diverse sources not necessarily connected to health care organizations, it is possible that participants who consented to this study were more trusting of medical researchers than the general population. Additionally, given the similar wording of the items in the T-DiG and T-HCT scales, we may have observed distinctions between the T-DiG and T-HCT because they were tested together in one study. Researchers who use these scales separately may not find these distinctions. Finally, survey data collection occurred in March–April 2020, when concerns about the COVID-19 pandemic began to rise in the U.S. Results may have differed if data had been collected prior to the emergence of this pandemic. Future research should explore how trust in doctors and health care teams may change as the nation watches how medical and public health leaders respond to the pandemic.
Strengths
A key strength of this study is the deductive development of multidimensional trust scales using the Health Systems Trust Content Area Framework (Ozawa & Sripad, 2013). To our knowledge, this is the first study to develop measures of health care-related trust that cover the dimensions identified in this framework. Overall, our study found strong support for this framework as the factors in each scale largely mapped onto it. However, we found notable exceptions; for example, the communication and competence domains overlapped and were grouped in the same factor (communication competency). This study joins recent evidence finding overlap between communication and competence in the assessment of trust (Greene & Ramos, 2021). Future studies should further explore how domains in the Health Systems Trust Content Area Framework may overlap and/or be uniquely important in the assessment of trust. Additionally, our study focused on creating multidimensional measures of trust based on the Health Systems Trust Content Area Framework, but future work is needed to develop a similar framework conceptualizing medical mistrust to advance research about and measurement of medical mistrust. Such a framework could explicitly name factors like racism, homophobia, and classism that cause a general unease about and mistrust of health care entities.
Although the comprehensive and multidimensional nature of our scales is a strength, it may represent a challenge to researchers aiming to manage participant burden. As such, future research is needed to develop shorter versions of each scale that maintain the multidimensional structure. Additionally, researchers may wish to use only one of these scales per study depending on their research question. For example, a researcher intending to measure patient trust in the medical doctors of a practice may prefer the T-DiG, whereas researchers aiming to assess trust in people who work in health care may prefer the T-HCT. Similarly, a researcher measuring trust in a population where many people do not have a primary care doctor may prefer the T-DiG or T-HCT instead of the T-MD. Because the T-DiG and T-HCT are highly correlated, researchers should use either the T-DiG or the T-HCT in their studies, depending on which one best fits their research objectives. In each of these situations, researchers should carefully consider which object of trust is most relevant to their research question and select an appropriate measure. Indeed, we designed new measures for researchers who wish to study trust in these three entities while considering critical concepts like fairness that are often excluded from existing trust measures. Another key strength is that we designed our scales to require an 8th grade reading level or less and conducted in-depth cognitive interviews to help ensure our items were understandable and achieved their desired purpose.
Conclusions
The T-MD, T-DiG, and T-HCT are valid measurement options to assess individuals’ trust in their personal or regular doctor, doctors in general, and the larger health care team. These multidimensional scales may be particularly useful for researchers who aim to design trust-related interventions, measure trust in national surveys to track trends over time, or conduct studies where trust is an important construct or main outcome of interest. These scales may also support the rigorous measurement and understanding of trust in health care settings, which is critical to achieving health equity.
Supplementary Material
Table 5.
Confirmatory Factor Analysis Results for the Trust in Doctors in General (T-DiG) Scale Using Full Sample (n=801)a
Latent Variable | Factor/Item Name and Descriptionb | Factor Loadingc | Standard Error | Scale Reliabilityd |
---|---|---|---|---|
Trust in Doctors in General (T-DiG) | Communication Competency | 0.95 | 0.007 | 0.89 |
Fidelity | 0.79 | 0.015 | ||
Systems Trust | 0.74 | 0.017 | ||
Confidentiality | 0.84 | 0.012 | ||
Fairness | 0.88 | 0.010 | ||
Stigma-based Discrimination | 0.51 | 0.025 | ||
Global Trust | 0.87 | 0.011 | ||
Communication Competency | Doctors have good judgment (Thom et al., 1999) | 0.86 | 0.012 | 0.87 |
Doctors explain the benefits and risks of treatments to patients. | 0.82 | 0.013 | ||
Doctors listen to patients (Bova et al., 2006) | 0.85 | 0.011 | ||
Doctors believe patients when they say something is wrong | 0.78 | 0.015 | ||
Doctors follow up with patients when needed | 0.82 | 0.013 | ||
Fidelitye | Doctors put making money above patient needs (Shea et al., 2008) | 0.79 | 0.017 | 0.85 |
Doctors recommend expensive treatments to make money | 0.82 | 0.015 | ||
Doctors hide mistakes (Rose et al., 2004) | 0.82 | 0.017 | ||
Doctors might experiment on patients without their knowledge (Shea et al., 2008) | 0.72 | 0.021 | ||
Doctors rush through appointments | 0.73 | 0.021 | ||
Systems Trust | Doctors are held accountable if they make a mistake | 0.88 | 0.010 | 0.92 |
Doctors are held accountable if they treat patients unfairly | 0.94 | 0.006 | ||
Doctors are held accountable if they discriminate against patients | 0.94 | 0.008 | ||
Confidentiality | Doctors keep medical records private (LaVeist et al., 2009; Rose et al., 2004) | 0.88 | 0.011 | 0.88 |
Doctors use secure systems to store medical records | 0.86 | 0.011 | ||
Doctors respect patient privacy | 0.93 | 0.009 | ||
Fairness | Doctors treat patients fairly, regardless of their ability to pay | 0.79 | 0.014 | 0.93 |
Doctors treat patients of all races and ethnicities fairly (Shea et al., 2008) | 0.90 | 0.008 | ||
Doctors treat patients fairly, regardless of their gender (e.g., male, female, or nonbinary) | 0.89 | 0.008 | ||
Doctors treat patients fairly, regardless of their sexual orientation (e.g., straight, gay, lesbian, or bisexual) | 0.88 | 0.009 | ||
Doctors treat patients fairly, regardless of their weight | 0.88 | 0.010 | ||
Doctors treat patients fairly, regardless of their religion | 0.89 | 0.010 | ||
Doctors treat patients fairly, regardless of their education level | 0.88 | 0.009 | ||
Stigma-Based Discrimination (e) | Doctors treat patients with a history of mental illness unfairly | 0.70 | 0.026 | 0.74 |
Doctors treat patients diagnosed with HIV unfairly | 0.81 | 0.027 | ||
Doctors treat patients who abuse drugs unfairly | 0.72 | 0.025 | ||
Global Trust | All things considered, I trust doctors (Safran et al., 1998) | 0.95 | 0.006 | 0.92 |
I put my trust in doctors (Carver et al., 1989) | 0.92 | 0.007 | ||
Doctors are trustworthy | 0.91 | 0.009 |
Model Fit: χ2(370) = 1362.34, p<0.001; RMSEA = 0.058, 90% CI: 0.055 – 0.061; CFI = 0.98; SRMR = 0.03
Citations indicate items that were adapted from existing scales
All factor loadings are standardized and significant at p<0.001
Reliability for the second order model of the T-DiG scale was calculated using an approach described by Raykov and colleagues (2018); reliability for the unidimensional subscales was calculated using Cronbach’s alpha
(e) Items in this domain/factor are reverse coded
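The fit and reliability notes above can be illustrated with a short sketch. It is not the authors' analysis code. It assumes the common (N − 1) form of the RMSEA point estimate, the survey sample size reported in the abstract (n = 801), and a hypothetical 1 to 5 response scale for reverse coding; the function names are illustrative only.

```python
import math
from statistics import variance


def rmsea(chi2: float, df: int, n: int) -> float:
    """RMSEA point estimate from a model chi-square, using the (N - 1) form."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def reverse_code(scores, low=1, high=5):
    """Flip negatively worded items; the 1-5 range is an assumption."""
    return [low + high - s for s in scores]


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of scores per item,
    with respondents in the same order in every list."""
    k = len(items)
    totals = [sum(resp) for resp in zip(*items)]       # per-respondent sum score
    item_var = sum(variance(col) for col in items)      # sum of item variances
    return k / (k - 1) * (1 - item_var / variance(totals))


# Reported T-DiG fit: chi-square(370) = 1362.34; n = 801 is assumed from the abstract.
print(round(rmsea(1362.34, 370, 801), 3))  # 0.058

# Toy data (not study data): a reverse-coded item that mirrors a positive item.
positive = [1, 2, 3, 4, 5]
negative = reverse_code([5, 4, 3, 2, 1])
print(cronbach_alpha([positive, negative]))
```

Under these assumptions the recomputed RMSEA agrees with the reported value of 0.058. The estimator actually used for the model may compute RMSEA differently, so the agreement is a plausibility check rather than a reproduction.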
HIGHLIGHTS.
We developed new measures of trust in three health care entities.
The scales demonstrated sound psychometric properties.
Each scale correlated with existing measures and perceived racism in health care.
Each scale was associated with delayed health care seeking.
These scales can support the rigorous measurement of trust in health care settings.
Footnotes
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Declaration of Competing Interests: None
CRediT (Contributor Roles Taxonomy) Author Statement
Jennifer Richmond, PhD, MSPH: Conceptualization, methodology, formal analysis, investigation, data curation, writing - original draft, writing - review & editing, visualization, project administration, funding acquisition
Marcella H. Boynton, PhD: Conceptualization, methodology, formal analysis, writing - review & editing
Sachiko Ozawa, PhD, MHS: Conceptualization, methodology, writing - review & editing
Kathryn E. Muessig, PhD: Conceptualization, methodology, writing - review & editing
Samuel Cykert, MD: Conceptualization, methodology, writing - review & editing
Kurt M. Ribisl, PhD: Conceptualization, methodology, writing - review & editing, supervision, project administration, funding acquisition
REFERENCES
- Adams LB, Richmond J, Corbie-Smith G, & Powell W (2017). Medical Mistrust and Colorectal Cancer Screening Among African Americans. J Community Health, 42, 1044–1061.
- Alsan M, & Wanamaker M (2016). Tuskegee and the Health of Black Men. National Bureau of Economic Research Working Paper Series, No. 22323.
- Armstrong K, McMurphy S, Dean LT, Micco E, Putt M, Halbert CH, et al. (2008). Differences in the patterns of health care system distrust between blacks and whites. J Gen Intern Med, 23, 827–833.
- Armstrong K, Putt M, Halbert CH, Grande D, Schwartz JS, Liao K, et al. (2013). Prior experiences of racial discrimination and racial differences in health care system distrust. Med Care, 51, 144–150.
- Bailey ZD, Krieger N, Agenor M, Graves J, Linos N, & Bassett MT (2017). Structural racism and health inequities in the USA: evidence and interventions. Lancet, 389, 1453–1463.
- Bazargan M, Cobb S, & Assari S (2021). Discrimination and Medical Mistrust in a Racially and Ethnically Diverse Sample of California Adults. Ann Fam Med, 19, 4–15.
- Benkert R, Cuevas A, Thompson HS, Dove-Meadows E, & Knuckles D (2019). Ubiquitous Yet Unclear: A Systematic Review of Medical Mistrust. Behav Med, 45, 86–101.
- Birkhauer J, Gaab J, Kossowsky J, Hasler S, Krummenacher P, Werner C, et al. (2017). Trust in the health care professional and health outcome: A meta-analysis. PLoS ONE, 12, e0170988.
- Blendon RJ, Benson JM, & Hero JO (2014). Public trust in physicians--U.S. medicine in international perspective. N Engl J Med, 371, 1570–1572.
- Boas TC, Christenson DP, & Glick DM (2018). Recruiting large online samples in the United States and India: Facebook, Mechanical Turk, and Qualtrics. Political Science Research and Methods, 1–19.
- Bova C, Fennie KP, Watrous E, Dieckhaus K, & Williams AB (2006). The health care relationship (HCR) trust scale: development and psychometric evaluation. Res Nurs Health, 29, 477–488.
- Brown T (2015). Confirmatory Factor Analysis for Applied Research, Second Edition. New York.
- Carver CS, Scheier MF, & Weintraub JK (1989). Assessing coping strategies: a theoretically based approach. J Pers Soc Psychol, 56, 267–283.
- Centers for Disease Control and Prevention. (2018). Behavioral Risk Factor Surveillance System Survey. Atlanta, Georgia.
- Chen FF (2007). Sensitivity of Goodness of Fit Indexes to Lack of Measurement Invariance. Structural Equation Modeling: A Multidisciplinary Journal, 14, 464–504.
- Daly M, Jones A, & Robinson E (2021). Public Trust and Willingness to Vaccinate Against COVID-19 in the US From October 14, 2020, to March 29, 2021. JAMA.
- DeVellis RF (2016). Scale development: Theory and applications: Sage publications.
- Ehrhart MG, Aarons GA, & Farahnak LR (2014). Assessing the organizational context for EBP implementation: the development and validity testing of the Implementation Climate Scale (ICS). Implementation Science, 9, 157.
- Feagin J, & Bennefield Z (2014). Systemic racism and U.S. health care. Soc Sci Med, 103, 7–14.
- Gamble VN (1997). Under the shadow of Tuskegee: African Americans and health care. American journal of public health, 87, 1773–1778.
- Goretzko D, Pham TTH, & Bühner M (2019). Exploratory factor analysis: Current use, methodological developments and recommendations for good practice. Current Psychology.
- Greene J, & Ramos C (2021). A Mixed Methods Examination of Health Care Provider Behaviors That Build Patients’ Trust. Patient Educ Couns, 104, 1222–1228.
- Griffith DM, Bergner EM, Fair AS, & Wilkins CH (2021). Using Mistrust, Distrust, and Low Trust Precisely in Medical Care and Medical Research Advances Health Equity. Am J Prev Med, 60, 442–445.
- Halbert CH, Armstrong K, Gandy OH Jr., & Shaker L (2006). Racial differences in trust in health care providers. Arch Intern Med, 166, 896–901.
- Hall MA, Camacho F, Dugan E, & Balkrishnan R (2002a). Trust in the medical profession: conceptual and measurement issues. Health Serv Res, 37, 1419–1439.
- Hall MA, Zheng B, Dugan E, Camacho F, Kidd KE, Mishra A, et al. (2002b). Measuring patients’ trust in their primary care providers. Med Care Res Rev, 59, 293–318.
- Hammond WP (2010). Psychosocial correlates of medical mistrust among African American men. Am J Community Psychol, 45, 87–106.
- Hillen MA, de Haes HC, & Smets EM (2011). Cancer patients’ trust in their physician-a review. Psychooncology, 20, 227–241.
- Hinkin TR (1998). A Brief Tutorial on the Development of Measures for Use in Survey Questionnaires. Organizational Research Methods, 1, 104–121.
- Hu L, & Bentler PM (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal, 6, 1–55.
- Kyriazos T (2018). Applied Psychometrics: The 3-Faced Construct Validation Method, a Routine for Evaluating a Factor Structure. Psychology, 9, 2044–2072.
- LaVeist TA, Isaac LA, & Williams KP (2009). Mistrust of health care organizations is associated with underutilization of health services. Health Serv Res, 44, 2093–2105.
- LaVeist TA, Nickerson KJ, & Bowie JV (2000). Attitudes about racism, medical mistrust, and satisfaction with care among African American and white cardiac patients. Med Care Res Rev, 57 Suppl 1, 146–161.
- Lee TH, McGlynn EA, & Safran DG (2019). A Framework for Increasing Trust Between Patients and the Organizations That Care for Them. JAMA, 321, 539–540.
- Ley P, & Florio T (1996). The use of readability formulas in health care. Psychology, Health & Medicine, 1, 7–28.
- McClair TL, Sripad P, Casseus A, Hossain S, Abuya T, & Gottert A (2021). The Client Empowerment in Community Health Systems Scale: Development and validation in three countries. Journal of global health, 11, 07010–07010.
- Müller E, Zill JM, Dirmaier J, Härter M, & Scholl I (2014). Assessment of Trust in Physician: A Systematic Review of Measures. PLoS ONE, 9, e106844.
- Musa D, Schulz R, Harris R, Silverman M, & Thomas SB (2009). Trust in the health care system and the use of preventive health services by older black and white adults. American journal of public health, 99, 1293–1299.
- Ozawa S, & Sripad P (2013). How do you measure trust in the health system? A systematic review of the literature. Soc Sci Med, 91, 10–14.
- Powell W, Richmond J, Mohottige D, Yen I, Joslyn A, & Corbie-Smith G (2019). Medical Mistrust, Racism, and Delays in Preventive Health Screening Among African-American Men. Behav Med, 45, 102–117.
- Putnick DL, & Bornstein MH (2016). Measurement invariance conventions and reporting: The state of the art and future directions for psychological research. Developmental Review, 41, 71–90.
- Raykov T, Goldammer P, Marcoulides GA, Li T, & Menold N (2018). Reliability of Scales With Second-Order Structure: Evaluation of Coefficient Alpha’s Population Slippage Using Latent Variable Modeling. Educational and psychological measurement, 78, 1123–1135.
- Rolfe A, Cash-Gibson L, Car J, Sheikh A, & McKinstry B (2014). Interventions for improving patients’ trust in doctors and groups of doctors. Cochrane Database Syst Rev, 3, CD004134.
- Rose A, Peters N, Shea JA, & Armstrong K (2004). Development and testing of the health care system distrust scale. J Gen Intern Med, 19, 57–63.
- Safran DG, Kosinski M, Tarlov AR, Rogers WH, Taira DH, Lieberman N, et al. (1998). The Primary Care Assessment Survey: tests of data quality and measurement performance. Med Care, 36, 728–739.
- Scharff DP, Mathews KJ, Jackson P, Hoffsuemmer J, Martin E, & Edwards D (2010). More than Tuskegee: Understanding Mistrust about Research Participation. J Health Care Poor Underserved, 21, 879–897.
- Shea JA, Micco E, Dean LT, McMurphy S, Schwartz JS, & Armstrong K (2008). Development of a revised Health Care System Distrust scale. J Gen Intern Med, 23, 727–732.
- Thom DH, Ribisl KM, Stewart AL, & Luke DA (1999). Further validation and reliability testing of the Trust in Physician Scale. The Stanford Trust Study Physicians. Med Care, 37, 510–517.
- Thompson HS, Manning M, Mitchell J, Kim S, Harper FWK, Cresswell S, et al. (2021). Factors Associated With Racial/Ethnic Group–Based Medical Mistrust and Perspectives on COVID-19 Vaccine Trial Participation and Vaccine Uptake in the US. JAMA network open, 4, e2111629–e2111629.
- Williamson LD, & Bigman CA (2018). A systematic review of medical mistrust measures. Patient Educ Couns, 101, 1786–1794.
- Willis GB, & Artino AR (2013). What Do Our Respondents Think We’re Asking? Using Cognitive Interviewing to Improve Medical Education Surveys. Journal of Graduate Medical Education, 5, 353–356.
- Yang TC, Matthews SA, & Hillemeier MM (2011). Effect of health care system distrust on breast and cervical cancer screening in Philadelphia, Pennsylvania. Am J Public Health, 101, 1297–1305.