Author manuscript; available in PMC: 2016 Dec 1.
Published in final edited form as: Clin Trials. 2015 Jul 7;12(6):618–626. doi: 10.1177/1740774515592881

Characteristics of clinical trials that require participants to be fluent in English

Brian L Egleston 1, Omar Pedraza 1, Yu-Ning Wong 1, Roland L Dunbrack Jr 1, Candace L Griffin 2, Eric A Ross 1, J Robert Beck 1
PMCID: PMC4643363  NIHMSID: NIHMS697302  PMID: 26152834

Abstract

Background/Aims

Diverse samples in clinical trials can make findings more generalizable. We sought to characterize the prevalence of clinical trials in the United States that required English fluency for participants to enroll in the trial.

Methods

We randomly chose over 10,000 clinical trial protocols registered with ClinicalTrials.gov and examined the inclusion and exclusion criteria of the trials. We examined the relationship of clinical trial characteristics with English fluency inclusion requirements. We merged the ClinicalTrials.gov data with U.S. Census and American Community Survey data to investigate the association of English language restrictions with ZIP-code level demographic characteristics of participating institutions. We used Chi-squared tests, t-tests, and logistic regression models for analyses.

Results

English fluency requirements have been increasing over time, from 1.7% of trials having such requirements before 2000 to 9.0% after 2010 (p<0.001 from Chi-squared test). Industry sponsored trials had low rates of English fluency requirements (1.8%) while behavioral trials had high rates (28.4%). Trials opening in the Northeast of the U.S. had the highest regional English requirement rates (10.7%) while trials opening in more than one region had the lowest (3.3%, p<0.001). Since 1995, trials opening in ZIP-codes with larger Hispanic populations were less likely to have English fluency requirements (OR=0.92 for each 10 percent increase in proportion of Hispanics, 95% CI 0.86–0.98, p=0.013). Trials opening in ZIP-codes with more residents self-identifying as Black/African American (OR=1.87, 95% CI 1.36–2.58, p<0.001 for restricted cubic spline term) or Asian (OR=1.16 for linear term, 95% CI 1.07–1.25, p<0.001) were more likely to have English fluency requirements. ZIP-codes with higher poverty rates had trials with more English language restrictions (OR=1.06 for a 10 percent poverty rate increase, 95% CI 1.001–1.11, p=0.045). There was a statistically significant interaction between year and intervention type, such that the increase in English fluency requirements was more common for some interventions than for others.

Conclusions

The proportion of clinical trials registered with ClinicalTrials.gov that have English fluency requirements for study inclusion has been increasing over time. English language restrictions are associated with a number of characteristics, including the demographic characteristics of communities in which the sponsoring institutions are located.

Keywords: English fluency requirements, Eligibility requirements, Health disparities research, Minority participation

Introduction

Recruiting a diverse sample of clinical trial participants can make the findings of a clinical trial more generalizable to the overall population.1,2 A primary objective of a clinical trial is often to test the efficacy of an intervention. Efficacy is generally defined in the clinical trial setting as “the true biological effect of a treatment.”3 In contrast, effectiveness is generally defined as “the effect of a treatment when widely used in practice.”3 A diverse population enhances inferences concerning treatment effectiveness in a wider population.

Increasing participation of underrepresented groups in clinical trials has also been advocated as a mechanism for reducing health disparities.4 If a diverse group of patients is recruited to large Phase III trials, then prespecified subgroup analyses might be able to identify whether treatment is particularly effective for certain categories of individuals. Such subgroup analyses might be particularly important for treatments that are not very effective on average, but very effective within a specific subpopulation.

One reason why investigators might be generally reluctant to include a diverse population is that diversity may reduce power to detect effects.5,6 Heterogeneity in treatment effects among subgroups in a clinical trial will increase variability in effect estimators.5 Thus, larger samples will need to be enrolled in a trial to overcome the increase in variability due to a more diverse sample. The need for a larger sample size in clinical trials has been noted, and some investigators have argued that “practical clinical trials” should be developed with a focus on broader issues of effectiveness rather than efficacy.2,7

Perceived implementation barriers may also influence decisions regarding the inclusion of non-English speakers to clinical trials. For example, there may be concerns about the costs of recruiting multi-lingual staff and translating study instruments.8 It has also been argued that informed consent documents cannot be translated into all possible languages, and hence it may be unethical to include patients for whom translation services are not available.9

Much research concerning barriers to participation in clinical trials has been qualitative in nature.10,11 In this paper, we quantify the proportion of studies that require participants to be fluent in English to enroll in the trial. We investigate the trial characteristics associated with such restrictions.12

Methods

Data for this project came from registered protocol data available on ClinicalTrials.gov. As stated on the website, the ClinicalTrials.gov database contains detailed information on over 100,000 clinical trials sponsored by the National Institutes of Health, other governmental agencies, and private industry. The National Library of Medicine of the National Institutes of Health (NIH) maintains the database. In 2004, members of the International Committee of Medical Journal Editors issued a statement that clinical trials had to be registered on public databases prior to enrolling patients for results to be published in major medical journals.13 As a result, registering clinical trials on websites such as ClinicalTrials.gov before patients are enrolled is now standard. The database contains a rich source of information about the specific eligibility criteria of clinical trials.

Using the geographic search tools of ClinicalTrials.gov, we found 68,188 trials located in the United States on January 31, 2013. Of these, we randomly sampled and reviewed inclusion and exclusion criteria of over 10,000 protocols. We used the R (R Foundation for Statistical Computing, Vienna, Austria) command sample() to randomly permute the order of the trials and then chose the first 10,000 for further examination. This project was part of a larger study funded by the National Cancer Institute to examine racial and English fluency exclusions in clinical trials. Preliminary data14 indicated that overall rates of exclusions of some racial groups might be as low as 1%. Hence, we chose to examine 10,000 studies so that we would have 90% power to detect odds ratios of 2.0 when comparing characteristics of trials with (expected number=100) and without exclusions (expected number=9,900). This assumed a 5% Type I error rate (2-sided) and a baseline rate of a clinical trial characteristic, such as the geographic region of the country (i.e. Northeast, Midwest, South, West, Multi-region), of 25% in trials that do not have exclusions. For example, if 25% of trials without exclusions were located in the United States Northeast, we could detect an association of region with English fluency exclusions if 40% of trials with exclusions were located in the United States Northeast, since (40%/60%)/(25%/75%) gives an OR of 2.0.
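The sampling step and the odds ratio arithmetic in the paragraph above can be sketched in Python. This is only an illustration: the analysis itself used R's sample(), and the trial identifiers, the seed, and the function names below are hypothetical.

```python
import random


def sample_trials(trial_ids, n, seed=0):
    """Randomly permute the trial IDs and keep the first n of them."""
    rng = random.Random(seed)  # seed is an arbitrary choice for reproducibility
    shuffled = list(trial_ids)
    rng.shuffle(shuffled)
    return shuffled[:n]


def odds_ratio(p_with, p_without):
    """Odds ratio comparing a characteristic's rate in two groups of trials."""
    return (p_with / (1 - p_with)) / (p_without / (1 - p_without))


# Hypothetical identifiers standing in for the 68,188 U.S. trials.
all_trials = [f"trial_{i}" for i in range(68188)]
sampled = sample_trials(all_trials, 10000)

# Worked example from the text: 40% vs. 25% located in the Northeast.
example_or = odds_ratio(0.40, 0.25)
print(round(example_or, 2))
```

Permuting the full list and taking a prefix is equivalent to simple random sampling without replacement, which is why no trial can appear twice in the sample.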

We coded a study as requiring participants to be fluent in English if the inclusion or exclusion criteria stated that participants were required to speak English or had similar terminology. Some examples of specific exclusion criteria were “unable to speak English,” and “not speaking English.” Similarly, specific inclusion examples required that participants “should be fluent in English,” “must be able to read written English,” and should be “literate in English.” Studies that allowed participants to speak either English or at least one other language were not categorized as requiring participants to speak English. For example, studies that required that participants speak either English or Spanish were not considered to require participants to speak English only.
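The coding rule above was applied by human reviewers. Purely as an illustration of its logic, a crude keyword screen might look like the following sketch. The language list, the sentence-splitting rule, and the function name are assumptions, and a screen like this would not replace manual review.

```python
import re

# Hypothetical list of alternative languages; the article does not provide one.
OTHER_LANGUAGES = (
    "spanish", "french", "chinese", "mandarin",
    "cantonese", "vietnamese", "korean", "russian",
)


def requires_english_only(criteria_text):
    """Flag criteria that mention English without offering another language.

    A sentence mentioning English together with another language
    (e.g. "English or Spanish") is treated as not English-only.
    """
    for sentence in re.split(r"[.;\n]", criteria_text.lower()):
        if "english" not in sentence:
            continue
        if not any(lang in sentence for lang in OTHER_LANGUAGES):
            return True
    return False


print(requires_english_only("Exclusion: unable to speak English"))
print(requires_english_only("Must speak English or Spanish"))
```

A keyword screen of this kind will misclassify unusual phrasings, which is one reason to pair any automated approach with the reviewer reconciliation described in the next paragraph.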

Three individuals reviewed studies (BLE, OP, and CG). To ensure standardization of the coding algorithm, the three reviewers initially coded overlapping sets of randomly selected trials for learning purposes. Differences in coding were discussed and reconciled. We used Cohen’s Kappa to assess inter-rater reliability. We used Chi-squared and t-tests to investigate whether those trials used to assess inter-rater reliability were different from those trials not used.
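For readers unfamiliar with Cohen's Kappa, the statistic compares observed agreement between two raters with the agreement expected by chance. A minimal sketch follows; the ratings are made-up values for illustration, not data from this study.

```python
def cohens_kappa(ratings_a, ratings_b):
    """Cohen's Kappa for two raters who coded the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    categories = set(ratings_a) | set(ratings_b)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal rates.
    expected = sum(
        (ratings_a.count(c) / n) * (ratings_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Made-up codes: 1 = requires English fluency, 0 = does not.
rater_a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
rater_b = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3))
```

Because English requirements are uncommon, raw percent agreement would be high even for careless coders; Kappa corrects for that by subtracting chance agreement.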

We examined the relationship of English fluency requirements with characteristics of the trials using Chi-squared and t-tests. Trial characteristics available from ClinicalTrials.gov fields were funding agency, study type (e.g. observational vs. interventional), intervention type (e.g. behavioral or drug), phase of trial, gender of participants, year study opened, and region of the country based on U.S. Census definitions. We did not include tabular cells representing missing data for the Chi-squared tests.
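As an illustration of the Chi-squared tests described above, the sketch below computes Pearson's statistic for the funding agency counts reported in Table 1. The analyses themselves were run in Stata; the function names are hypothetical, and the closed-form p-value used here applies only to even degrees of freedom.

```python
import math


def chi_squared_statistic(table):
    """Pearson Chi-squared statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi_squared_p_value_even_df(stat, df):
    """Upper-tail probability of a Chi-squared variable (even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = stat / 2
    term, series = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        series += term
    return math.exp(-half) * series


# Funding agency rows from Table 1: (no restriction, requires English).
funding = [[2959, 53], [1036, 67], [203, 31], [2695, 274], [2681, 312]]
stat, df = chi_squared_statistic(funding)
p_value = chi_squared_p_value_even_df(stat, df)
print(df, p_value < 0.001)
```

Rows for missing data are left out of the table, matching the statement above that missing cells were not used for the tests.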

For studies which included ZIP-codes of sponsoring or participating institutions, we identified ZIP-code level demographic characteristics from the United States Census15 and the American Community Survey.16 In particular, we examined whether there were general differences between studies that did and did not have English language requirements in the proportion of area residents that self-identified as Caucasian/White, African American/Black, Asian, Hispanic, spoke English only, spoke English less than very well, or had incomes below the poverty line. The 10 year Census data captured all of this information until 2010. In 2010, the Census stopped using the long form for the decennial census, and instead began relying on the American Community Survey to capture language ability and poverty characteristics of the U.S. population. For studies that opened from 1995–2004, we used Census 2000 data. For studies that opened in 2005 or later, we used Census 2010 data for race and ethnicity and the American Community Survey 2007–2011 data for English language and poverty data. For a small number of studies with missing data in the relevant time period, we used the available data in the other time period. For a trial that opened in more than one ZIP-code (e.g. a multi-center trial), we used the simple average of the multiple ZIP-code level characteristics for that trial in analyses. In analyses examining ZIP-code trends, we excluded the relatively small number of trials on ClinicalTrials.gov opened prior to 1995 (approximately 1% opened 1994 or earlier). We summarized the ZIP-code level percentage data using means, medians, and inter-quartile ranges. We also examined the relationship between racial ZIP-code characteristics using scatter plots and Spearman correlations.
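The data source assignment and the multi-site averaging described above can be sketched as follows. The ZIP codes and percentages are hypothetical, the function names are illustrative, and skipping ZIP codes that lack data is an assumption of this sketch rather than a rule stated in the text.

```python
from statistics import mean


def census_source(year_opened):
    """Return the data source the text assigns to a trial's opening year."""
    if year_opened < 1995:
        return None  # excluded from the ZIP-code analyses
    if year_opened <= 2004:
        return "Census 2000"
    return "Census 2010 and ACS 2007-2011"


def trial_level_characteristic(zip_codes, zip_values):
    """Simple (unweighted) average of a characteristic over a trial's ZIP codes.

    Assumption: ZIP codes with no linked demographic data are skipped.
    """
    values = [zip_values[z] for z in zip_codes if z in zip_values]
    return mean(values) if values else None


# Hypothetical percent of residents below the poverty level, by ZIP code.
poverty_by_zip = {"00001": 12.0, "00002": 20.0}
multi_site = trial_level_characteristic(["00001", "00002", "00003"], poverty_by_zip)
print(census_source(2008), multi_site)
```

An unweighted average treats every site equally regardless of how many participants it enrolls, which matches the "simple average" described in the text.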

We used univariable logistic regression models to further investigate linear and non-linear relationships of English fluency requirements with trial and ZIP-code characteristics. We examined models with ZIP-code characteristics entered as a linear term and via restricted cubic spline terms17 with three knots to investigate non-linear relationships. We used the models with spline terms to create figures that visually display the relationship of ZIP-code characteristics with the probability of having English fluency requirements. We used splines because they allow for a flexible fit, and do not restrict the relationship to be linear or quadratic. In essence, splines allow the data to speak for themselves while allowing some smoothing of the curve so that general trends are more readily apparent. For models in which we examined ZIP-code characteristics, we report odds ratios that indicate the increase in odds of a trial having English language exclusions for a 10 percentage point increase in the proportion of a ZIP-code reporting a characteristic (e.g. an increase from 10 to 20 percent in the proportion of residents in a ZIP-code self-identifying as Asian). We used this scale with ZIP-code level data to improve the interpretability of the odds ratios.
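With three knots, a restricted cubic spline adds a single non-linear term to the linear term. The sketch below uses one common parameterization of that term and shows how a per-point coefficient maps to an odds ratio per 10 points. The knot locations and function names are assumptions; the text does not report which knots were used.

```python
import math


def rcs_three_knot_term(x, knots):
    """Non-linear basis term of a restricted cubic spline with three knots.

    The term is zero below the first knot and linear above the last knot.
    Dividing by (t3 - t1)**2 is a common scaling convention.
    """
    t1, t2, t3 = knots

    def cube_plus(u):
        return max(u, 0.0) ** 3

    raw = (
        cube_plus(x - t1)
        - cube_plus(x - t2) * (t3 - t1) / (t3 - t2)
        + cube_plus(x - t3) * (t2 - t1) / (t3 - t2)
    )
    return raw / (t3 - t1) ** 2


def odds_ratio_per_increment(beta_per_point, increment=10):
    """Odds ratio for an `increment`-point change, given a per-point log odds."""
    return math.exp(beta_per_point * increment)


# Hypothetical knots for a ZIP-code percentage (not taken from the article).
knots = (5.0, 15.0, 40.0)
spline_values = [rcs_three_knot_term(x, knots) for x in (0.0, 10.0, 30.0, 60.0)]

# A per-point log odds of ln(1.16)/10 corresponds to OR=1.16 per 10 points.
or_per_10 = odds_ratio_per_increment(math.log(1.16) / 10)
print(round(or_per_10, 2))
```

The restriction to linearity beyond the outer knots is what keeps the fitted curve from swinging wildly at the extremes of the ZIP-code distributions, where data are sparse.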

We used likelihood ratio tests of nested logistic regression models to investigate whether the relationships of English language restrictions with trial characteristics varied over time (i.e. interaction hypotheses). In separate interaction models for each variable, we included as main effects terms: the continuous year variable, and either the linear ZIP-code demographic variable or the categorical dummy indicators for non-ZIP-code variables. We included as interaction terms the multiplication of year times the linear ZIP-code demographic variable or year times the one or more dummy indicators. We used a logistic regression model with year entered via restricted cubic spline (3 knots) and the knot terms multiplied by indicators of each intervention type to create a figure that visually depicts the change in exclusions over time by intervention type. To ensure a large enough sample size over time, we only used interventions with more than 80 associated trials in this analysis.
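The mechanics of a likelihood ratio test of nested logistic models can be illustrated with a simpler pair of models than the interaction models described above: an intercept-only model versus a model with one dummy indicator per category, for which the maximum likelihood estimates are just the pooled and category-specific proportions. The sketch applies this to the year-opened counts in Table 1 (excluding the missing category). The function names are hypothetical, and this is not the interaction analysis itself.

```python
import math


def binomial_log_likelihood(events, totals, probs):
    """Binomial log-likelihood, omitting constants that cancel in the ratio."""
    ll = 0.0
    for k, n, p in zip(events, totals, probs):
        if k > 0:
            ll += k * math.log(p)
        if n - k > 0:
            ll += (n - k) * math.log(1 - p)
    return ll


def likelihood_ratio_test(events, totals):
    """LR statistic comparing category-specific rates with one pooled rate."""
    pooled = sum(events) / sum(totals)
    ll_reduced = binomial_log_likelihood(events, totals, [pooled] * len(events))
    ll_full = binomial_log_likelihood(
        events, totals, [k / n for k, n in zip(events, totals)]
    )
    return 2 * (ll_full - ll_reduced), len(events) - 1


def chi_squared_p_value_even_df(stat, df):
    """Upper-tail Chi-squared probability (closed form, even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = stat / 2
    term, series = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        series += term
    return math.exp(-half) * series


# Year-opened rows from Table 1: trials requiring English, and row totals.
requires_english = [3, 7, 93, 345, 280]
row_totals = [108, 469, 1810, 4529, 3101]
lr_stat, lr_df = likelihood_ratio_test(requires_english, row_totals)
lr_p = chi_squared_p_value_even_df(lr_stat, lr_df)
print(lr_df, lr_p < 0.001)
```

In the interaction setting, the reduced model would hold the main effects and the full model would add the year-by-covariate product terms, with degrees of freedom equal to the number of added terms.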

In order to quantify the relationship of year on English fluency exclusions after adjusting for clinical trial characteristics and ZIP-code demographics listed above, we used a multiple logistic regression model in which we included year (as a continuous variable), the ZIP-code level variables via restricted cubic splines, and the other trial characteristics listed above as categorical dummy indicators. Because the ZIP-code level percentage variables were correlated with each other, we examined four stepwise logistic regression models with forward selection (p<0.05 by likelihood ratio test) as sensitivity analyses to determine the most salient ZIP-code characteristics: 1) a model that included only the English proficiency ZIP-code characteristics, 2) a model that included the four race/ethnicity ZIP-code characteristics, 3) a model that included all six English proficiency and race/ethnicity characteristics, and 4) a model that included the two English proficiency variables and the Hispanic and Asian ZIP-code proportion variables (two groups that might be more likely to speak languages other than English at home). We examined models with and without spline terms.

Analyses were conducted using STATA (Statacorp, College Station, TX). Since the study examined details of trial protocols, the project was determined not to consist of human subjects research by the Fox Chase Cancer Center Institutional Review Board. The study was funded by grants from the National Cancer Institute (R03CA167264 and P30 CA 06927); the NCI did not review the results of our work.

Results

A total of 10,361 trial protocols were examined out of 68,188 total. Of these, 10,311 had sufficient details to identify English-language exclusions. Many of the 50 with insufficient details had notes that information was redacted from the public ClinicalTrials.gov website. Of the 10,311, 737 (7.1%) required participants to speak English. The earliest trial in our dataset had a start date in 1957, but the next trial did not start until 1976; 98.9% with year available started after 1994.

We found that inter-rater agreement among the three coders was excellent. Cohen’s Kappa was 0.90 (standard error [SE] 0.14) for 54 studies reviewed by OP and BLE, 0.88 (SE 0.12) for 66 studies reviewed by CG and BLE, and 0.80 (SE 0.08) for 145 studies reviewed by CG and OP. When comparing studies used and not used to determine inter-rater reliability, we did not find any statistically significant differences for the variables of interest presented in Tables 1 or 2.

Table 1.

Characteristics of trials by presence of English fluency requirements. Missing cells were not used for hypothesis testing/p-values.

Columns: Characteristic; No English Language Restriction, n (row%); Requires English Fluency, n (row%); P-value
Number 9,574 737
Year Opened P<0.001
 Before 1995 105 (97.2%) 3 (2.8%)
 1995–1999 462 (98.5%) 7 (1.5%)
 2000–2004 1,717 (94.9%) 93 (5.1%)
 2005–2009 4,184 (92.4%) 345 (7.6%)
 2010 and later 2,821 (91.0%) 280 (9.0%)
 Missing 285 (96.9%) 9 (3.1%)
Funding Agency P<0.001
 Industry 2,959 (98.2%) 53 (1.8%)
 NIH 1,036 (93.9%) 67 (6.1%)
 U.S. Federal Government 203 (86.8%) 31 (13.2%)
 Combination of the above 2,695 (90.8%) 274 (9.2%)
 Other 2,681 (89.6%) 312 (10.4%)
Study Type P=0.006
 Expanded Access 12 (100%) 0 (0%)
 Interventional 7,909 (93.2%) 578 (6.8%)
 Observational 1,637 (91.1%) 159 (8.9%)
 Missing 16 (100%) 0 (0%)
Intervention P<0.001
 Behavioral 691 (71.6%) 274 (28.4%)
 Biological 734 (99.6%) 3 (0.4%)
 Device 626 (94.8%) 34 (5.2%)
 Dietary Supplement 170 (95.5%) 8 (4.5%)
 Drug 5,034 (96.4%) 186 (3.6%)
 Genetic 51 (98.1%) 1 (1.9%)
 Other 1,667 (90.0%) 186 (10.0%)
 Procedure 522 (92.2%) 44 (7.8%)
 Radiation 79 (98.8%) 1 (1.3%)
Phase P<0.001
 Phase 0 63 (86.3%) 10 (13.7%)
 Phase 1 1,393 (96.2%) 55 (3.8%)
 Phase 1 | Phase 2 477 (94.6%) 27 (5.4%)
 Phase 2 2,243 (96.1%) 92 (3.9%)
 Phase 2 | Phase 3 152 (91.6%) 14 (8.4%)
 Phase 3 1,227 (95.3%) 60 (4.7%)
 Phase 4 706 (92.8%) 55 (7.2%)
 Other 3,313 (88.7%) 424 (11.4%)
Gender P<0.001
 Both 8,268 (93.0%) 618 (7.0%)
 Female 892 (90.1%) 98 (9.9%)
 Male 393 (94.9%) 21 (5.1%)
 Missing 21 (100%) 0 (0%)
Region Opened P<0.001
 Midwest 1,324 (91.4%) 124 (8.6%)
 Northeast 1,677 (89.3%) 200 (10.7%)
 South 2,367 (92.8%) 184 (7.2%)
 West 1,214 (90.3%) 130 (9.7%)
 Multi-region 2,820 (96.7%) 95 (3.3%)
 Missing 172 (97.7%) 4 (2.3%)

Table 2.

Demographic characteristics of the ZIP-codes in which the trials open by presence of English fluency requirements. Trials with missing ZIP-code data and those opened before 1995 were excluded. SD=standard deviation, IQR=inter-quartile range. The statistics summarize the percents among ZIP-codes. For example, the means depict the average of the percent characteristics among ZIP-codes. Of those trials with at least one ZIP-code level characteristic missing, 193 (7.3%) of 2,650 trials required English fluency.

ZIP-code Level Characteristics No English Language Restrictions Requires English Fluency P-value
Percent responding that they speak English only n=7,225 n=559
 Mean (SD) 76.2% (14.0%) 76.4% (13.0%) p=0.83
 Median (IQR) 78.4% (70.4%–85.8%) 78.1% (69.9%–86.2%)
Percent responding that they speak English less than “very well” n=7,225 n=559
 Mean (SD) 17.3% (9.5%) 17.7% (9.5%) p=0.24
 Median (IQR) 16.1% (10.3%–22.2%) 17.5% (10.0%–24.0%)
Percent with incomes below the poverty level n=7,117 n=544
 Mean (SD) 21.9% (15.2%) 23.2% (15.6%) p=0.045
 Median (IQR) 18.1% (11.6%–29.0%) 19.7% (10.7%–31.1%)
Percent responding that they are White alone n=7,269 n=564
 Mean (SD) 57.0% (20.4%) 56.3% (20.9%) p=0.49
 Median (IQR) 59.6% (45.5%–71.0%) 58.4% (42.1%–73.1%)
Percent responding that they are Black/African American alone n=7,269 n=564
 Mean (SD) 17.1% (17.5%) 17.7% (19.5%) p=0.43
 Median (IQR) 11.6% (5.1%–22.2%) 10.0% (4.2%–22.8%)
Percent responding that they are Asian alone n=7,269 n=564
 Mean (SD) 9.6% (9.2%) 11.1% (11.0%) p=0.0002
 Median (IQR) 7.0% (3.9%–12.4%) 8.4% (3.8%–14.7%)
Percent responding that they are Hispanic alone n=7,269 n=564
 Mean (SD) 13.4% (14.4%) 11.9% (12.5%) p=0.012
 Median (IQR) 9.1% (4.6%–15.7%) 7.9% (4.2%–14.0%)

Table 1 presents the characteristics of the trials stratified by the presence of English language requirements for participation. We found that English language requirements were present among all types of studies, although some important differences emerged. The proportion of studies that require English fluency has been increasing over time. Before 2000, only 1.7% of studies required that participants be fluent in English, with the percentage increasing to 9.0% after 2010. Studies funded solely by industry were least likely to require participants to be fluent in English (1.8%) while studies funded by the NIH, other Federal agencies, or some combination of the three (industry/NIH/other Federal agencies) had higher rates of English language requirements. Studies funded by other sources had the largest absolute number of English language requirements.

Table 2 presents the ZIP-code level characteristics of trials that did and did not have English fluency requirements. Studies with English fluency requirements were more likely to be located in areas with higher poverty rates (p=0.045) and more Asians (p=0.0002), but less likely to be located in areas with more Hispanics on average (p=0.012), although the magnitudes of the differences were modest.

In terms of the intervention type, behavioral studies were the most likely to require English fluency with 28.4% having such requirements. However, there were still substantial numbers of trials in other categories that required English fluency. For example, 186 (3.6%) of drug trials had English fluency requirements as did 44 (7.8%) of procedure trials. Those trials that did not fall in the typical Phase 1 through Phase 4 categories were more likely to require English fluency. In terms of geography, studies in the Northeast were most likely to have English fluency requirements while studies in the South and those located in more than one region were least likely to have such requirements (see bottom of Table 1).

Figure 1 presents the estimated proportion of trials with English language requirements by ZIP-code demographics with respect to English speaking and poverty characteristics. There was not a statistically significant association between the proportion of a ZIP-code that speaks only English or English less than very well and the proportion of studies with English language requirements. In a simple logistic regression model, there was a relationship between the poverty rate of a ZIP-code and the odds of a clinical trial having English language requirements (OR=1.06 for a 10 percent poverty rate increase, 95% CI 1.001–1.11, p=0.045).

Figure 1.

Estimated proportion of studies with English fluency requirements by ZIP code-level English speaking ability and poverty characteristics. In univariable logistic regression models, the relationship of poverty rate with English language fluency requirements was statistically significant, while the proportions that speak English or speak English less than very well in a ZIP code were not statistically significant.

Figure 2 shows the relationship of racial and ethnic ZIP-code characteristics with the estimated probability of English fluency requirements from the univariable logistic models with spline terms. A U-shape pattern was observed for the proportion of individuals self-identifying as Black or African-American in a ZIP-code, with higher rates of English language requirements in ZIP-codes with the fewest or most African Americans (OR=0.72 for each ten percentage point increase, 95% Confidence Interval [CI] 0.60–0.87, p=0.001 for linear term, OR=1.87, 95% CI 1.36–2.58 p<0.001 for spline term). Studies that were located in ZIP-codes with the highest percentage of Asians were the most likely to require participants to be fluent in English; the association appeared to be relatively linear in nature (OR=1.16 for ten point increase, 95% CI 1.07–1.25, p<0.001 for linear term alone, p=0.92 for nonlinear spline relationship). Conversely, studies located in ZIP-codes with large percentages of Hispanics were the least likely to require participants to be fluent in English (OR=0.92, 95% CI 0.86–0.98, p=0.013 for 10 point increase linear term alone, p=0.11 for non-linear spline relationship).

Figure 2.

Estimated proportion of studies with English fluency requirements by ZIP code-level race and ethnicity characteristics. In univariable logistic regression models, the proportion reporting Asian race alone, Black race alone, or Hispanic ethnicity alone had statistically significant relationships. The proportion reporting White race alone was not statistically significant. The total proportion of the four racial groups in ZIP codes does not necessarily sum to one, as those reporting another race/ethnicity or more than one race were not included in these four groups.

Great variability in racial distributions among ZIP-codes makes it possible for the relationships among the demographic groups to differ so much (i.e. the relationship of one group is not simply the inverse of another group). In Supplemental Figure 1, we provide scatter plots showing the pairwise proportions within each ZIP-code reporting each of the four races and ethnicities alone. Beyond each group’s proportion bounding the proportion of other groups, as reflected in the Spearman correlations, there does not seem to be a strong relationship between the proportion reporting one race or ethnicity in a ZIP-code and the proportion reporting a specific other race or ethnicity. The totals of the four proportions do not sum to one since the four categories do not include those reporting another race or ethnicity or those reporting more than one race. On average, the four racial/ethnic groups represented 97.1% (SD 1.8%, range 51.9%–100%) of a ZIP-code’s residents.

When examining a full logistic regression model that included as covariates all of the variables included in Tables 1 and 2, the odds of a study having English fluency restrictions increased by 8% per year since 1995 (OR=1.08, 95% CI 1.04–1.12, p<0.001, Supplemental Table 1). As a sensitivity analysis, we re-estimated the model only among the trials open from 2005 on, and the relationship was similar (OR=1.07, 95% CI 1.02–1.12, p=0.007). We found that intervention type was the only variable that had a statistically significant interaction with year trial opened (p=0.009, Supplemental Table 2). In Figure 3, we present changes in restrictions over time from 1976 by intervention type. We omitted from the figure those trials with genetic and radiation interventions due to their small sample sizes and near zero rates (see Table 1).
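Because year entered this model as a continuous variable, a per-year odds ratio compounds multiplicatively over longer spans. A small sketch of that arithmetic follows; it assumes the constant per-year odds ratio implied by the model holds across the whole span, and the function name is hypothetical.

```python
def cumulative_odds_ratio(or_per_year, years):
    """Odds ratio implied over `years` years by a constant per-year odds ratio."""
    return or_per_year ** years


# Point estimate from the text (OR=1.08 per year), applied to 1995-2010.
or_15_years = cumulative_odds_ratio(1.08, 15)
print(round(or_15_years, 2))
```

Under that assumption, an 8% annual increase in odds corresponds to roughly a threefold increase in odds over 15 years. This is a statement about odds, not about the proportion of trials with requirements.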

Figure 3.

Proportion of studies with English language restrictions by intervention type and time period.

We found the following in the four logistic regression models with forward stepwise selection of the ZIP-code English proficiency and race/ethnicity characteristics: 1) English proficiency at the ZIP-code level was not associated with English language restrictions in clinical trials when the two variables were examined alone, 2) ZIP-codes with higher percentages of Black/African American and Asian individuals were associated with more English language restrictions when the four race/ethnicity variables were entered into the models, 3) the results concerning Black/African American and Asian percentages were replicated when all six race, ethnicity, and English proficiency variables were entered, and 4) higher percentages of Hispanic individuals were associated with fewer English language restrictions and higher percentages of individuals speaking English less than very well were associated with more restrictions when the English proficiency, Asian, and Hispanic ZIP-code percentage variables were included in the model. The inferences were the same when entering the covariates into the models via linear (i.e. untransformed terms) or restricted cubic spline terms. In general, these sensitivity findings confirmed our results.

Discussion

We found eligibility restrictions requiring participants to be fluent in English in a wide array of clinical trials. Of note is that the percentage of clinical trials with English fluency requirements has been increasing over time. Only 1.7% of studies listed on ClinicalTrials.gov required English fluency prior to 2000, but 9.0% had such a requirement after 2010. Our multiple logistic regression model results indicate that the odds of a trial having English language exclusions have been increasing since 1995 by 8% per year.

When stratifying by intervention type of the trial, behavioral trials had the largest percentage of trials that required English fluency. Still, such trials represented only 274 (37%) of the 737 trials that required English fluency. In terms of absolute numbers, drug trials were among the most common types of trials to have English requirements.

Also, there was some evidence that English language restrictions are related to the demographics of the communities in which trials open. Studies opening in ZIP-codes that had the lowest or highest percentages of minorities were most likely to require English fluency. However, there was variation in the relationship of community demographics with the probability that trials have English language requirements. Clinical trials in ZIP-codes with a greater percentage of Hispanics were less likely to have English language requirements while those in ZIP-codes with a larger percentage of Asians were more likely to have English language requirements. Studies in areas with higher poverty rates were more likely to have English fluency requirements.

While we did not adjust for multiple hypothesis testing in this work, many of the relationships were highly statistically significant (i.e. p<0.01) suggesting that they would still be statistically significant even if we had adjusted for multiple comparisons.

Our finding that the proportion of clinical trials with English language requirements has increased over time was against expectations; there have been a number of initiatives designed to promote diversity in clinical trials. Concerns about recruitment and diversity of participants in general led to the development of the 1994 NIH guidelines on the inclusion of minorities and women in clinical trials.18 Other agencies issued similar guidelines.19,20 However, such guidelines are not necessarily clear as to whether those who cannot speak English are covered by the policies. Further, minorities may still be underrepresented in clinical trials even in the post-guideline and pro-inclusion era.21,22 Most studies likely do not formalize the number of minorities to be recruited to a study, and those that try to set such goals often are unsuccessful in achieving desired recruitment.23

There are many reasons why investigators cannot recruit certain subpopulations to clinical trials, and the list of reasons seems to vary among minority groups.23 Investigator perceptions may contribute to the exclusion of non-English speakers. Studies have found that English speakers have a low comprehension of clinical trial terminology and procedures.12,24 Areas of potential confusion among clinical trial participants regardless of language include the nature and purpose of randomization10,11,12,24 and the ability of participants to voluntarily withdraw from a study.24 Even when patient protections are explained, some patients may worry that their safety is still at risk.10 It may be particularly challenging to address such concerns through the informed consent and study enrollment process when a patient does not speak English.12 To overcome some of these obstacles, the FDA has issued guidance regarding the oral translation of informed consent documents for clinical trials “if a non-English speaking subject is unexpectedly encountered.”25

Research concerning whether minorities are less inclined to enroll in clinical trials has had mixed findings. Researchers have found that Spanish speakers have expressed fear that they are not offered trials because of their minority status.26 Some Hispanic survey respondents have stated that language is a barrier that makes it difficult for them to enroll in a clinical trial.10,27,28 Even when translators are present, some may be concerned that there is loss of important information in the translation process.26 Asian survey respondents have also mentioned that language differences can be barriers to clinical trial participation.11,29 Still, a survey administered by the Department of Veteran Affairs did not find substantial differences in willingness to engage in clinical research among races even if actual participation rates might vary.30

It is possible that minority participation in clinical trials could be limited because of institutional barriers. There is some evidence that minorities in general are not included in clinical trials because they are not offered enrollment in numbers proportional to their population level representation.31 Explicit exclusion of non-English speakers could be one reason for the lack of representation of some ethnicities in the United States.

This work provides one of the most comprehensive studies of English language requirements present in clinical trials in the United States. A strength of the study is that we used a large database of information on publicly registered clinical trials. A limitation of the work is that we did not examine the actual protocols. It is possible that the inclusion and exclusion requirements in the actual trial protocols differed from the information posted on ClinicalTrials.gov. Another limitation was that we were missing ZIP-code level information on many of the trials. One reason was that study ZIP-code data was not listed for many trials registered on ClinicalTrials.gov. For some studies, we were unable to obtain demographic data because the reported ZIP-code was a business or institutional-associated ZIP-code with no census demographic data linked to it. However, the fact that the percentage of trials with English language requirements among those with missing ZIP-code level data was similar to the percentage among those with non-missing ZIP-code level data (approximately 7%) suggests that any missing data bias was not large.

In conclusion, we found that inclusion and exclusion requirements restricting clinical trial participation to those who are fluent in English have become more common over time. Trials across many different scientific fields have English language requirements. Future research can investigate the justifications for such language requirements and why the prevalence has increased over time.

Acknowledgments

This work was funded by the National Institutes of Health, National Cancer Institute grants R03CA167264 and P30CA006927.

Footnotes

The authors report no conflicts of interest.

References

  • 1. Elting LS, Cooksley C, Bekele BN, et al. Generalizability of cancer clinical trial results: prognostic differences between participants and nonparticipants. Cancer. 2006;106:2452–2458. doi: 10.1002/cncr.21907.
  • 2. Tunis SR, Stryer DB, Clancy CM. Practical clinical trials: Increasing the value of clinical research for decision making in clinical and health policy. JAMA. 2003;290:1624–1632. doi: 10.1001/jama.290.12.1624.
  • 3. Piantadosi S. Clinical trials: A methodologic perspective. New York: John Wiley & Sons; 1997.
  • 4. Christian MC, Trimble EL. Increasing participation of physicians and patients from underrepresented racial and ethnic groups in National Cancer Institute-sponsored clinical trials. Cancer Epidemiol Biomarkers Prev. 2003;12:277–283.
  • 5. Rosenbaum PR. Heterogeneity and causality: Unit heterogeneity and design sensitivity in observational studies. Am Stat. 2005;59:147–152.
  • 6. Maitournam A, Simon R. On the efficiency of targeted clinical trials. Stat Med. 2005;24:329–339. doi: 10.1002/sim.1975.
  • 7. Glasgow RE, Magid DJ, Beck A, et al. Practical clinical trials for translating research to practice. Med Care. 2005;43:551–557. doi: 10.1097/01.mlr.0000163645.41407.09.
  • 8. Frayne SM, Burns RB, Hardt EJ, et al. The exclusion of non-English-speaking persons from research. J Gen Intern Med. 1996;11:39–43. doi: 10.1007/BF02603484.
  • 9. Jiya M. The recruitment of non-English speaking subjects into human research. J Med Ethics. 1999;25:420–421. doi: 10.1136/jme.25.5.420-a.
  • 10. Evans KR, Lewis MJ, Hudson SV. The role of health literacy on African American and Hispanic/Latino perspectives on cancer clinical trials. J Cancer Educ. 2012;27:299–305. doi: 10.1007/s13187-011-0300-5.
  • 11. Tu SP, Chen H, Chen A, et al. Clinical trials: understanding and perceptions of female Chinese-American cancer patients. Cancer. 2005;104(12 Suppl):2999–3005. doi: 10.1002/cncr.21524.
  • 12. Stead M, Eadie D, Gordon D, et al. “Hello, hello--it’s English I speak!”: a qualitative exploration of patients’ understanding of the science of clinical trials. J Med Ethics. 2005;31:664–669. doi: 10.1136/jme.2004.011064.
  • 13. De Angelis CD, Drazen JM, Frizelle FA, et al. Is this clinical trial fully registered?--A statement from the International Committee of Medical Journal Editors. N Engl J Med. 2005;352:2436–2438. doi: 10.1056/NEJMe058127.
  • 14. Egleston BL, Dunbrack RL, Jr, Hall MJ. Clinical trials that explicitly exclude gay and lesbian patients. N Engl J Med. 2010;362:1054–1055. doi: 10.1056/NEJMc0912600.
  • 15. www.census.gov [accessed on 4 February 2015].
  • 16. http://www.census.gov/acs/www/ [accessed on 4 February 2015].
  • 17. Harrell FE Jr. Regression modeling strategies. Chapter 2. New York: Springer; 2001. pp. 11–40.
  • 18. AHRQ Policy on the inclusion of priority populations in research. Notice: NOT-HS-030-010. Released February 27, 2003.
  • 19. United States Government Accountability Office. Prescription drugs: FDA guidance and regulations related to data on elderly persons in clinical drug trials. Washington, D.C: Sep 28, 2007. Publication number GAO-07-47R.
  • 20. NIH policy and guidelines on the inclusion of women and minorities as subjects in clinical research – Amended, October, 2001. http://grants1.nih.gov/grants/funding/women_min/guidelines_amended_10_2001.htm [accessed on 4 February 2015].
  • 21. Murthy VH, Krumholz HM, Gross CP. Participation in cancer clinical trials: race-, sex-, and age-based disparities. JAMA. 2004;291:2720–2726. doi: 10.1001/jama.291.22.2720.
  • 22. Link MW, Mokdad AH, Stackhouse HF, et al. Race, ethnicity, and linguistic isolation as determinants of participation in public health surveillance surveys. Prev Chronic Dis. 2006;3:A09.
  • 23. Durant RW, Davis RB, St George DM, et al. Participation in research studies: factors associated with failing to meet minority recruitment goals. Ann Epidemiol. 2007;17:634–642. doi: 10.1016/j.annepidem.2007.02.003.
  • 24. Cameron P, Pond GR, Xu RY, et al. A comparison of patient knowledge of clinical trials and trialist priorities. Curr Oncol. 2013;20:e193–e205. doi: 10.3747/co.20.1323.
  • 25. U.S. Food and Drug Administration. A guide to informed consent-information sheet. Guidance for Institutional Review Boards and clinical investigators. http://www.fda.gov/regulatoryinformation/guidances/ucm126431.htm#nonenglish [accessed on 4 February 2015].
  • 26. Ellington L, Wahab S, Martin SS, et al. Factors that influence Spanish- and English-speaking participants’ decision to enroll in cancer randomized clinical trials. Psycho-Oncology. 2005;15:273–284. doi: 10.1002/pon.943.
  • 27. Calderon JL, Baker RS, Fabrega H, et al. An ethno-medical perspective on research participation: a qualitative pilot study. Med Gen Med. 2006;8:23.
  • 28. Ulrich A, Thompson B, Livaudais JC, et al. Issues in biomedical research: what do Hispanics think? Am J Health Behav. 2013;37:80–85. doi: 10.5993/AJHB.37.1.9.
  • 29. Lin JS, Finlay A, Tu A, et al. Understanding immigrant Chinese Americans’ participation in cancer screening and clinical trials. J Community Health. 2005;30:451–466. doi: 10.1007/s10900-005-7280-5.
  • 30. Kressin NR, Meterko M, Wilson NJ. Racial disparities in participation in biomedical research. J Natl Med Assoc. 2000;92:62–69.
  • 31. Wendler D, Kington R, Madans J, et al. Are racial and ethnic minorities less willing to participate in health research? PLoS Med. 2006;3:e19. doi: 10.1371/journal.pmed.0030019.
